Crafting a versatile generic function using concepts and unexpected pitfalls of implicit ranges

30/12/2024 updated on 09/03/2025  │ Blog

Context

During the course of my work, I was brought to develop a feature where the library user should be able upload a file or folder to a foreign storage. This post goes over my train of thought from a barebone function to a richer interface, covering some common use-cases. For the purpose of this post, upload will be reworded as print, where the argument is output to the console. A minor change, but relatively similar interface. The default prototype looks like this:

1
void print(std::filesystem::path const&);

The type is passed as a const reference because it is a heavy stateful object (until path_view gets in) and I just want my function to bind the argument to either a L-value or R-value reference, without having to duplicate the function, or use template for the default case. Moreover, this choice enables to pass any object convertible to a path, including string.

1
2
3
4
5
// Usage
std::filesystem::path p;
print(p);
print(std::filesystem::path{"foo/bar"});
print("foo/bar");

Later, I wanted to enrich my API by enabling users to upload a sequence of paths, for example

1
2
3
// API extension example
std::vector<std::filesystem::path> paths;
print(paths);

Although,

  1. Not just std::vector, but any container
  2. Not just std::filesystem::path, but anything that can be converted to it

One could go with the fully unconstrained version:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
template<typename T>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);
}
// or also
void print(auto&& paths)
{
    for(auto const& path : paths)
        print(path);
}

But it is strongly discouraged. This form is easy to misuse with invalid inputs, for which compilers would give terrible error messages. One bakes in some assumptions which are not explicitly stated in the function declaration, and likely requires extensive additional documentation to specify its terms of use.

Furthermore, it is always better first to constrain as much as possible, and release as per need, to avoid user reliance on something that would need to be limited afterwards.

The constraint…

Every container in the standard library (and followed by everyone in general) provides a dependent type name indicating the type of its elements: value_type. It is a vital piece for the development of generic algorithms. We want our new function to be usable by any container (which follows the standard recommendations). Thus, here is a possible constraint on our template parameter:

1
std::is_same_v<typename T::value_type, std::filesystem::path>

However, T can be deduced to a reference and the like, which do not allow for evaluating dependent name.

❌ error: 'std::vector<std::filesystem::__cxx11::path>&' is not a class, struct, or union type

That’s why it is often necessary to peel the template parameters with the help of some type traits, e.g. std::decay_t or std::remove_cvref_t

1
std::is_same_v<typename std::remove_cvref_t<T>::value_type,std::filesystem::path>

…using SFINAE

Substitution failure is not an error (SFINAE) is was one of the generic technique commonly used to constrain an overload set. Tag dispatch is another main alternative, often more explicit but less concise (and adds clutter); it will be skipped for the sake of brevity. SFINAE is already widely described online so I will not linger on it12.

1
2
3
4
5
6
7
// SFINAE.A
template<typename T, std::enable_if_t<std::is_same_v<typename T::value_type,std::filesystem::path>* = nullptr>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);  
}

The position of the constraint can vary, but I usually prefer to keep it within the template argument list. Note that alternative versions:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
// SFINAE.B
template<typename T, typename = std::enable_if_t<std::is_same_v<typename T::value_type,std::filesystem::path>>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);  
}
// SINFAE.C
template<typename T, std::enable_if_t<std::is_same_v<typename T::value_type,std::filesystem::path>,bool> = true>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);  
}
// SINFAE.D
template<typename T>
auto print(T&& paths) -> std::enable_if_t<std::is_same_v<typename T::value_type,std::filesystem::path>,bool>
{
    for(auto const& path : paths)
        print(path);  
}

are all valid. Although SFINAE.B has the subtle disadvantage, but a major one, to prevent several definition containing enable_if because default type template parameters do not get mangled into the final function signature generated by the compiler. The compiler would then generate an error for the following examples:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
// SFINAE.B.1
template<typename T, typename = std::enable_if_t<std::is_same_v<typename T::value_type,std::filesystem::path>>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);  
}
// SFINAE.B.2
template<typename T, typename = std::enable_if_t<std::is_array_v<T>>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);  
}
1
error: redefinition of 'template<class T, class> void print(T&&)'

…using constant expression

SFINAE (and tag dispatch) were the main (and only?) way to handle such scenario until the introduction of if constexpr in C++17 (and concepts in C++20). This new option allows the compiler to select a conditional branch at compile-time.

1
2
3
4
5
6
7
8
9
template<typename T>
void print(T&& paths)
{
    if constexpr(std::is_same_v<typename T::value_type,std::filesystem::path>)
    {
    	for(auto const& path : paths)
            print(path);
    }
}

It is definitely more readable than SFINAE since it reads like the good ol’ code (if/else) we are used to read. The issue with this technique for me, although my opinion might change with further experience with it, is:

  1. The signature is unhelpful to the reader, because catch-all template, unless specified in the comments or going through the definition
  2. The definition becomes similar to a big switch whose size can become unwieldy and I prefer the clarity of multiple declarations
  3. Wrong branch selection might be harder to detect or too easy to occur

…using concepts

A new big thing in C++20 is the concepts, which is a clean and efficient way to impose constraints or describe interfaces when diving into generic programming. The constraints can be imposed either by defining concepts or by introducing ad-hoc constraints with the keyword requires. Note that the type trait is_same/is_same_v is symmetric to the concept same_as.

1
2
3
4
5
6
template<typename T> requires std::same_as<typename T::value_type,std::filesystem::path>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);
}

The satisfaction of a job done

Putting everything together gives a brief, yet powerful, interface.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#include <iostream>
#include <filesystem>
#include <vector>
#include <set>
#include <type_traits>

void print(std::filesystem::path const& path)
{
    std::cout << "\t" << path <<"\n";
}

template<typename T> requires std::same_as<typename std::remove_cvref_t<T>::value_type,std::filesystem::path>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);
}

int main()
{
    std::cout << "Path\n";
    auto path_a = std::filesystem::path("foo/bar");
    print(path_a);
    std::cout << "String\n";
    auto string_a = std::string("foo/bar");
    print(string_a);
    std::cout << "Vector\n";
    auto paths_a = std::vector<std::filesystem::path>{"foo/bar", "bar/foo"};
    print(paths_a);
    std::cout << "Set\n";
    auto paths_b = std::set<std::filesystem::path>{"foo/bar", "bar/foo"};
    print(paths_b);

    return 0;
}

Output

Path
	"foo/bar"
String
	"foo/bar"
Vector
	"foo/bar"
	"bar/foo"
Set
	"bar/foo"
	"foo/bar"

Yey ! Job done, let’s go home… or?

All is ephemeral

It came up quickly — ahem, during the first reality check — that several major use-cases were missing.

  1. Using the function with container of string-like objects
    1
    2
    
    std::vector<std::string> paths;
    print(paths);
    
  2. Using the function with std::ranges, for example, by prepending a path
    1
    2
    3
    
    std::vector<std::filesystem::path> paths;
    auto prepend_with = [](auto&& x) {return [&x](auto&&p){return x/p;};};
    print(paths | std::ranges::views::transform(prepend_with("blop")));
    
  3. Using the function with brace-enclosed initializer lists of std::filesystem::path or string-like values
    1
    2
    
    print({path1, path2});
    print({"foo/bar","bar/foo"});
    

Expanding the interface

Resolving the first issue is relatively simple. In essence, we need to accept any object that can be converted to a std::filesystem::path. Fortunately, the C++ standard provides several core concepts, which includes one with the exact name corresponding to our interpretation std::convertible_to.

1
2
3
4
5
6
7
template<typename T> 
requires std::convertible_to<typename std::remove_cvref_t<T>::value_type, std::filesystem::path>
void print(T&& paths)
{
    for(auto const& path : paths)
        print(path);
}

📝 Note that the concept std::constructible_from is also applicable, and which might be more favorable in some (corner?) cases. More on this topic: Is constructible? and std::convertible_to failing to recognize explicitly convertible types

Dealing with ranges elegantly resolves the case for containers too, because a range generalize iteration of containers, they just need to provide an iterator pair begin and end.

1
2
3
4
5
6
template<std::ranges::range R>
void print(R&& paths)
{
    for(auto const& path : paths)
        print(path);
}

However, it becomes again under-constrained because the type of an element of the range is unspecified. The container-specific dependent type value_type cannot be reused as such because the std::ranges library defines its own set of “dependent” types.

1
2
3
4
5
6
7
template<std::ranges::range R> 
requires std::convertible_to<std::ranges::range_value_t<R>,std::filesystem::path>
void print(R&& paths)
{
    for(auto const& path : paths)
        print(path);
}

Initializer lists, more commonly faced as anonymous brace-enclosed initializer lists, require a separate codepath. Due to their anonymous nature, they do not behave similarly as a concrete container and cannot be deduced directly through a templated function.

1
2
3
4
5
6
template<typename T>
void foo(T) {}

foo({0,1,2});
// What type should deduced for T with {0,1,2} ? 
// There is simply not enough information for the compiler to decide.

Brace-enclosed initializer lists are abstract entities that try to conform to the target type, which does not exist when using template arguments because the target type can be anything. Hence the need for a specific overload of the function.

1
2
3
4
5
6
7
template<typename T> 
requires std::convertible_to<std::remove_cvref_t<T>,std::filesystem::path>
void print(std::initializer_list<T> paths)
{
    for(auto const& path : paths)
        print(path);
}

I will not go into the topic of initializer_list nor concept subsumption or overload resolution in this post for the sake of keeping this piece as a blog post and not turning it into a book 😄

Let’s roll !

Running the following example on Compiler Explorer

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include <concepts>
#include <filesystem>
#include <initializer_list>
#include <iostream>
#include <ranges>

void print(std::filesystem::path const& path)
{
    std::cout << path.string() <<"\n";
}

template<std::ranges::range R> requires std::convertible_to<std::ranges::range_value_t<R>,std::filesystem::path>
void print(R&& paths)
{
    for(auto const& path : paths)
        print(path);
}

template<typename T> requires std::convertible_to<std::remove_cvref_t<T>,std::filesystem::path>
void print(std::initializer_list<T> paths)
{
    for(auto const& path : paths)
        print(path);
}

int main()
{
    print(path{"./foo/bar"});
}

Output

.
foo
bar

Huh?! Where is the nice printed one-liner ./foo/bar 😠

Tell me why 🎵

Instead of the desired output

./foo/bar

We got our path sliced like a salami

. foo bar

Sometimes programming is like medicine, let us proceed with the diagnosis.

Symptom: template overload priority

Consider the following function call:

1
print(fs::path{"./foo/bar"});// matches more closely print(fs::path&&) or print(path) rather than print(fs::path const&)

It involves a temporary std::filesystem::path, which is a pure rvalue. When the compiler sees the call print(fs::path{"./foo/bar"}), it chooses the most specialized template that can match the argument, or emit an “ambiguous” compiler error if several candidates match equivalently.

The print(fs::path const& path) specialization is not an exact match for either a mutable lvalue, nor an rvalue argument. Even if it is a function overload rather than a template specialization, there is no priority for such within the selection process.

The template version, even if constrained, can accept both lvalues and rvalues due to its parameter parameter being a forwarding reference (R&&). In our example, since our input is an rvalue, the compiler considers this version to be a better match for the rvalue argument and pick it instead of the concrete overload.

To resolve this issue and ensure that the right version is called, one can provide an overload specifically for rvalues:

1
2
3
4
void print(fs::path&& path)
{
    std::cout << path.string() << "\n";
}

With this overload, the compiler will now prefer the print(fs::path&& path) version when an rvalue is passed.

Note that a similar symptom would have been triggered if we were to use a forwarding reference within the range-for loop of the constrained template function.

1
2
3
4
5
6
7
8
template<range R> requires std::convertible_to<range_value_t<R>,path>
void print(R&& paths)
{
-    for(auto const&  path : paths)
-        print(path);
+    for(auto&& path : paths)
+        print(path);
}

Hereby requiring us to slap the final overload for mutable lvalue print(fs::path& path).

Unfortunately, it introduces quite some boilerplate, obfuscating the root cause of the problem we face.

Cause: path satisfies the concept range itself.

Concepts in C++20 are structural. Any concept created outside the developer’s control can apply to any type as long as the apparent constraints are fulfilled, whether or not they want it or mean it semantically.

📝 Even though C++ Core Guidelines recommend to have concepts modeling semantic notions rather than syntactic one, it is effectively hard to avoid.

The intent of concepts is to model semantic categories (Number, Range, RegularFunction) rather than syntactic restrictions (HasPlus, Array). According to ISO C++ core guideline T.20, “The ability to specify meaningful semantics is a defining characteristic of a true concept, as opposed to a syntactic constraint.” — https://en.cppreference.com/w/cpp/language/constraints

An example of something that looks like a range but is not:

1
2
3
4
5
6
7
struct this_is_not_a_range_and_yet
{
    auto begin() { return "this is only the beginning";};
    auto end()   { return "this is already the end";};
};

static_assert(std::ranges::range<this_is_not_a_range_and_yet> == false);

Compiler

error: static assertion failed
43 | static_assert(std::ranges::range<this_is_not_a_range_and_yet> == false);
      |                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
note: the comparison reduces to '(1 == 0)'

😐😐😐

When looking closer at the C++ standard, the range concept is defined in a simple way:

1
2
3
4
5
template< class T >
concept range = requires( T& t ) {
	ranges::begin(t); // equality-preserving for forward iterators
	ranges::end(t);
};

A typical range class only needs to provide two functions:

  1. A member function begin() whose return type models input_or_output_iterator.
  2. A member function end() whose return type models sentinel_for, where It is the return type of begin().

Alternatively, they can be non-member functions, to be found by argument-dependent lookup.

ISO/IEC 14882:2020 (C++20)

Every type that provides the couple begin() and end() member functions with some compatible return type (for example a T*) will be considered as a range. std::filesystem::path does provide those functions, which enables to traverse each node of the path, i.e. pieces located between path separator pairs. Therefore std::filesystem::path represents a range.

The better treatment, compared to the rvalue overload mentioned above, is to change the constraint itself to restrict further the space of acceptable types, by skipping the overload when the type for range R is convertible to a path:

1
2
3
4
5
6
7
8
template<std::ranges::range R> 
requires std::convertible_to<std::ranges::range_value_t<R>,std::filesystem::path>
         && !std::convertible_to<R,std::filesystem::path> //! addition to the previous version
void print(R&& paths)
{
    for(auto const& path : paths)
        print(path);
}

Final snippet

With all of this in mind or resolved, we can get to our final snippet 🎉

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#include <concepts>
#include <filesystem>
#include <initializer_list>
#include <iostream>
#include <ranges>

void print(std::filesystem::path const& path)
{
    std::cout << path.string() <<"\n";
}

template<std::ranges::range R> 
requires std::convertible_to<std::ranges::range_value_t<R>,std::filesystem::path>
         && !std::convertible_to<R,std::filesystem::path>
void print(R&& paths)
{
    for(auto const& path : paths)
        print(path);
}

template<typename T> 
requires std::convertible_to<std::remove_cvref_t<T>,std::filesystem::path>
void print(std::initializer_list<T> paths)
{
    for(auto const& path : paths)
        print(path);
}

Conclusion

All this thought process for producing a few line of code 😔 But see the power in those lines ! Though, I would have wished that std::filesystem::path was not implicitly a range, as well as std::string, for example by asking us to opt-in with a as_range() member or free function.