DIY: Formatting strings without extra allocations

What's the goal?

Basically, to be able to write

    auto s = format("%0 %1", "Hello", "World");

and get Hello World as a result.

And only allocate memory once, enough to fit the final result.,

What's the point?

“Why not to use …

  • printf?” - it sucks, really. Non-type safe, without validation (unless your compiler built special support for it… strangely, all of them did… huh), needs to extract C-style string from C++ strings and string_views etc.

  • <iostream>?" - iostreams are slow. Sure, they support lots of stuff, including locales and what not, but they are slow. And allocate a lot of memory here and there. And it’s not friendly to localization - you need an ability to reorder arguments in that case.

  • …something like QString::arg?" - actually this article is inspired by Qt’s formatting - or rather by its imitation in our codebase. This being said, it’s also inefficient (which is what brought me to this post - the formatting stood up high in the CPU profile);

  • {fmt}? it’s going to be a part of standard soon?" Well yes, you caught me. {fmt} is great. Use it if you can. At my job we use UTF-16 built-in-house strings1, and {fmt} can’t handle those (AFAIK). But yes, one day…

Anyway there wouldn’t be a post if there was no problem to solve, right?

Off we go!

So, the function signature would look like something like this in pseudocode:

    str format(str fmt, ...);

which in C++ would look like2:

    template<typename... ArgsT>
    std::string format(const std::string_view& fmt, ArgsT&&... args);

The implementation will look like this (in pseudocode)

    template<typename... ArgsT>
    std::string format(const std::string_view& fmt, ArgsT&&... args)
    {
        // compute length of all the strings, including substitutions.
        size_t total_length = 0;
        
        for (auto marker: markers(fmt))
        {
            total_length = marker.string_before().size();
            if (marker.is_ok())
            {
                total_length += args[marker.arg_id].size();
            }
        }

        std::string result;
        result.reserve(total_length); // the only place we're going to allocate!

        // concatenate the elements
        for (auto marker: markers(fmt))
        {
            result += marker.string_before();
            if (marker.is_ok())
            {
                result += args[marker.arg_id];
            }
        }

        return result;
    }

The downside here is that we traverse the fmt twice. But otherwise seems to be efficient, no?

There’re gotchas though:

  • we need to be able to find the markers efficiently (and allow to still use %% as an escaped %);

  • to my knowledge, there’s no standard facility to access a value in a parameter pack by index (unless it’s known in compile time);

  • very likely, ArgsT... will contain a mix-n-match of string literals, C-strings, std::strings, std::string_views etc. We need to handle them uniformly. Ideally, we also need to handle them efficiently - e.g. string literals don’t need their length to be computed at run-time, since it’s known at compile-time.

% marks the spot

Let’s start with iteration over the markers. I’m too lazy to implement proper STL-style iterators to be used with for loop, so let’s do an approach with callbacks:


    struct marker_info {
        std::string_view literal;
        int arg_number;
        std::string_view rest;
    };

    template<typename Func>
    constexpr void foreach_marker(std::string_view fmt, Func func)
    {
        while(!fmt.empty())
        {
            auto [ literal, arg_number, rest ] = find_marker(fmt);
            func(literal, arg_number);
            fmt = rest;
        }
    }

Finding markers is a bit trickier but still pretty clean:

    constexpr marker_info find_marker(const std::string_view& format)
    {
        for (auto i = 0; i < format.size(); ++i)
        {
            if (format[i] == '%')
            {
                auto literal_end = i++; // `literal_end` points at '%', `i` at next char.
                auto c = format[i++];   // `c` is character after '%', `i` is immediiately after the marker
                if (c == '%')   // escaped '%'
                {
                    return {
                            format.substr(0, literal_end + 1), // literal including first '%' 
                            -1, // invalid argument id - nothing to substitute
                            format.substr(i) // everything after the marker
                        };
                }
                else if ('0' <= c && c <= '9')
                {
                    return {
                            format.substr(0, literal_end), // text before marker
                            c - '0', // index of an element in the parameter pack
                            format.substr(i)    // everything after the marker
                        };
                }
            }
        }
        // no marker is found, return the whole `format`, invalid arg.id and empty `rest`.
        return { format, -1, format.substr(0,0) };
    }

Here we cut some corners:

  • we’re lucky to have '%' as both escaped character and escaping character. Otherwise the code would be a bit more complex;

  • we only allow indices in 0..9, i.e. no more than 10 arguments. It’s trivial to extend it (just check if more characters ahead are digits), but I omitted that for simplicity sake.

Parameter pack as array? Really?

OK, now we need to access argument in the parameter pack by its index (in 0..9 range). How to do that?

If we knew the index at compile-time, we could use something like get<I>(make_tuple(args...)). But we need to use indices at run-time.

I’m still new to the C++ template magic, so I did it in a pretty straightforward and stupid way - finding n-th element by skipping n-1 elements (good thing, compilers these days are fast):


    template<int N>
    std::string_view nth(int) {
        return "";
    }    

    template<int N = 0, typename ArgT, typename... ArgsT>
    std::string_view nth(int i, ArgT&& arg, ArgsT&&... args)
    {
        if (i == N)
        {
            return to_string_view(std::forward<ArgT>(arg));
        }
        return nth<N+1, ArgsT...>(i, std::forward<ArgsT>(args)...);
    }

and to_string_view will be overloaded basing on the argument:

For string literal, compiler knows size already, so we can use it immediately

    template<size_t N>
    std::string_view to_string_view(const char (&s)[N])
    {
        // C-style arrays have terminating '\0' that we don't need.
        return {s, N-1};
    }

but then we need to distinguish it somehow from char *. The best way to force compiler into it is to make a template specification that is more complex than the one for literal (above). I stole this one from a string view implementation in the Android codebase:

    template<typename CharT, typename = std::enable_if_t<std::is_same_v<CharT, char>>>
    std::string_view to_string_view(const CharT* const &s)
    {
        return s;
    }

Yuck!…

FInally, we need ones for std::string and std::string_view. They need to be templates too, otherwise the implementations above will be ignored. The implementations are trivial though:

    template<typename = char>
    std::string_view to_string_view(const std::string& s){
        return s;
    }

    template<typename = char>
    std::string_view to_string_view(const std::string_view& s){
        return s;
    }

Note: we don’t really need the to_string_view functions above: we could easily implement nth as

    template<int N = 0, typename ArgT, typename... ArgsT>
    std::string_view nth(int i, ArgT&& arg, ArgsT&&... args)
    {
        if (i == N)
        {
            return arg;
        }
        return nth<N+1, ArgsT...>(i, std::forward<ArgsT>(args)...);
    }

The downside here would be that for string literals length would be computed at runtime via strlen or equivalent. Which may be OK in many cases. But

Final assembly

Having all the pieces in place, we can write final version of format:

template<typename... ArgsT>
std::string format(const std::string_view& fmt, ArgsT&&... args)
{
    using namespace details;
    // compute length of all the strings, including substitutions.
    size_t total_length = 0;
    foreach_marker(fmt, [&total_length, &args...](auto&& s, int arg_id) {
        total_length += s.size();
        if (arg_id >= 0)
        {
            total_length += nth(arg_id, std::forward<ArgsT>(args)...).size();
        }
    });

    std::string result;
    result.reserve(total_length);

    // concatenate the elements
    foreach_marker(fmt, [&result, &args...](auto&& s, int arg_id) {
        result.append(s);
        if (arg_id >= 0)
        {
            result.append(nth(arg_id, std::forward<ArgsT>(args)...));
        }
    });

    return result;
}

We can verify that the memory is allocated at most once by overriding new and delete operators (which I did: see the full code).

Homework

What a possible areas of improvement here? We could:

  • allowing more than 10 arguments (see above);

  • add validation - at the very least enforce that the maximum argument id in the markers is not greater than number of elements in the parameter pack;

  • making code work with char16_t and char32_t (nothing hard, again);

  • allow formatting custom types (would be similar to to_string_view overloads, although it’ll probably need to actually be to_string)

  • allow markers to have format specifiers (similar to how {fmt} does that).

But again, maybe you should use production-quality libraries instead 😉.


  1. implementation below will use ASCII/UTF-8 strings too, but it’s trivial to upgrade it to other encodings. ↩︎

  2. many people prefer to pass string_views by value. Unfortunately, Windows ABI doesn’t support passing a parameter via multiple registers, so I keep passing them by const reference, sorry. ↩︎