Design

This section elaborates on what a transducer is. It will help you understand not only how to use transducers, but also how to write your own transducers and processes.

What is a transducer?

https://public.sinusoid.es/talks/transducers-cppcon15/pic/rich2.jpg

“Transducers extract the essence of map and filter”

Rich Hickey, author of Clojure, on Transducers (video) (blog)

Transducers arise from the realization that the most general algorithm over sequences is what in C++ we call accumulate, but in other languages is often called reduce or fold. Accumulate applies a reducing function iteratively over a sequence. The reducing function takes a state and some input as arguments, and returns an updated state.
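For reference, a minimal sketch of such an accumulate might look like this (std::accumulate behaves equivalently for our purposes):

template <typename InputIt, typename State, typename ReducingFn>
State accumulate(InputIt first, InputIt last, State state, ReducingFn step)
{
    // apply the reducing function once per element,
    // threading the updated state through the loop
    for (; first != last; ++first)
        state = step(state, *first);
    return state;
}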

For example, we can define a reducing function output that takes an output iterator as state, some value as input, and when invoked, outputs the value through the iterator and returns the iterator advanced to the next position:

auto output = [](auto state, auto input) {
    *state = input;
    return ++state;
};

Using this function, we can implement std::copy in terms of accumulate:

template <typename InputIt, typename OutputIt>
OutputIt copy(InputIt first, InputIt last, OutputIt out)
{
    return accumulate(first, last, out, output);
}
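For example, assuming the usual standard headers, we can use it to print a vector:

auto v = std::vector<int>{1, 2, 3};
copy(v.begin(), v.end(), std::ostream_iterator<int>{std::cout, " "});
// prints: 1 2 3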

We can also use this structure to implement transform and filter:

template <typename InputIt, typename OutputIt, typename Mapping>
OutputIt transform(InputIt first, InputIt last, InputIt out,
                   Mapping mapping)
{
    return accumulate(first, last, out, [&](auto state, auto input) {
        return output(state, mapping(input));
    });
}

template <typename InputIt, typename OutputIt, typename Predicate>
OutputIt filter(InputIt first, InputIt last, InputIt out,
                Predicate pred)
{
    return accumulate(first, last, out, [&](auto state, auto input) {
        return pred(input)
            ? output(state, input)
            : state;
    });
}
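As a quick usage sketch, doubling every element of a vector with the transform above:

auto v = std::vector<int>{1, 2, 3};
auto r = std::vector<int>{};
transform(v.begin(), v.end(), std::back_inserter(r),
          [](int x) { return x * 2; });
// r == {2, 4, 6}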

Notice something? The two functions look almost identical. The calls to accumulate and output, which specify the input and output parts of the process, are the same; the meat of each algorithm is sandwiched in between.

We can extract that core by writing it as a function that takes a reducing function (like output) as an argument and, as a result, returns another reducing function that, for example, maps the input before passing it on to the next reducing function.

A transducer is a function that takes a reducing function, and returns another reducing function that wraps it.

template <typename Mapping>
auto map(Mapping mapping)
{
    return [=] (auto step) {
        return [=] (auto s, auto... ins) {
            return step(s, mapping(ins...));
        };
    };
}

template <typename Predicate>
auto filter(Predicate pred)
{
    return [=] (auto step) {
        return [=] (auto s, auto... ins) {
            return pred(ins...) ? step(s, ins...) : s;
        };
    };
}

Warning

This code could be improved by leveraging move semantics and perfect forwarding wherever we pass things to the underlying transducer. That, however, would add too much noise to the code examples, and these concepts are already hard enough to understand for those uninitiated in functional programming.

As an exercise, you can think about how you would use std::move and std::forward in these examples. You can check the implementation of these transducers in the library itself for some inspiration.

These two functions return a transducer, that is, a function that takes a reducing function called step as an argument, and returns another reducing function. We can revise the implementations of our standard algorithms using them:

template <typename InputIt, typename OutputIt, typename Mapping>
OutputIt transform(InputIt first, InputIt last, InputIt out,
                   Mapping mapping)
{
    return accumulate(first, last, out, map(mapping)(output));
}

template <typename InputIt, typename OutputIt, typename Predicate>
OutputIt filter(InputIt first, InputIt last, InputIt out,
                Predicate pred)
{
    return accumulate(first, last, out, filter(pred)(output));
}

What if you wanted to write a function that filters and transforms all in one go?

template <typename InputIt, typename OutputIt,
          typename Predicate, typename Mapping>
OutputIt filter_and_transform(
              InputIt first, InputIt last, InputIt out,
              Predicate pred, Mapping mapping)
{
    return accumulate(first, last, out,
                      filter(pred)(map(mapping)(output)));
}

Or using piping for function composition and extracting out the transducer:

template <typename InputIt, typename OutputIt,
          typename Predicate, typename Mapping>
OutputIt filter_and_transform(
              InputIt first, InputIt last, InputIt out,
              Predicate pred, Mapping mapping)
{
    auto xf = filter(pred) | map(mapping);
    return accumulate(first, last, out, xf(output));
}
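As a usage sketch, keeping the even numbers and squaring them:

auto v = std::vector<int>{1, 2, 3, 4};
auto r = std::vector<int>{};
filter_and_transform(v.begin(), v.end(), std::back_inserter(r),
                     [](int x) { return x % 2 == 0; },
                     [](int x) { return x * x; });
// r == {4, 16}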

Tip

The library provides a function comp for function composition, but also an operator| for a more natural reading syntax. You can also turn any function f into a pipeable one by using comp(f).
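For example, assuming pred and mapping as above, the following two compositions are equivalent:

auto xf1 = zug::comp(zug::filter(pred), zug::map(mapping));
auto xf2 = zug::filter(pred) | zug::map(mapping);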

Order of composition

The filter_and_transform function above performs filtering and mapping precisely in that order: it filters first, and only if the value passes the predicate is it mapped at all. Thus, composition of transducers reads from left to right.

This may be unintuitive if you consider that this is simply function composition, and function composition reads from right to left. However, remember that a transducer, as a function, is not a transformation over a sequence, but a transformation over an operation. The transducers are indeed applied from right to left, each one transforming the operation by wrapping it into a more complex one. It is like an onion: each transducer adds a new layer. When data is fed into the resulting operation, it enters through the outermost layer of the onion, that is, through the layer added by the last-applied transducer (the leftmost!), and penetrates the onion in this reverse order until it reaches the core: the rightmost transducer, and finally the original reducing function.

_images/composition-flat.svg
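For instance, in this sketch the filter runs first, so map only ever sees even numbers; swapping the two transducers would instead halve every element and then filter the halved values:

auto xf = zug::filter([](int x) { return x % 2 == 0; })
        | zug::map([](int x) { return x / 2; });
// feeding 1, 2, 3, 4 through xf produces 1, 2:
// filter lets only 2 and 4 through, which map then halves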

Dealing with state

Now you know how to write a simple transducer.

But what if you want to do something more advanced? For example, you may want to write a transducer like Python’s enumerate, that given a sequence of values, pairs each element with an index. You may be tempted to do it like this:

auto enumerate = [] (auto step)
{
    return [=, idx = 0] (auto s, auto... ins) mutable
    {
        return step(s, idx++, ins...);
    };
};

This works, but it has one problem: it hides a mutable variable in the reducer. The lambda needs to be marked mutable and is no longer a pure function, so we have to be careful about which instance we are using when.

But we already have functional, explicit state: the variable s. The problem is that we cannot make any assumptions about its type or manipulate it in any way; only somewhere down the chain, in the next reducing function step, can it be understood.

The solution is to instead wrap the incoming state, adding any additional data you need next to it. There is some nuance to it: the first time the transducer is invoked, the state you receive is unwrapped, but on subsequent invocations you will receive a wrapped state. The library provides some functions to deal with this:

  • wrap_state(s, d), returns a new wrapped state containing both the state s and the additional data d.
  • state_unwrap(s), returns the underlying state from the potentially wrapped state s.
  • state_data(s, [] { return ...; }), returns the data wrapped in the state s. If none exists, it produces an initial value using the supplied callback.

Using these tools we can now improve our implementation of enumerate:

auto enumerate = [] (auto step)
{
    return [=] (auto s, auto... ins) {
        auto idx  = state_data(s, [] { return 0; });
        auto next = step(state_unwrap(s), idx, ins...);
        return wrap_state(next, idx + 1);
    };
};
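As a hypothetical usage sketch, assuming zug::into as the reducing process (it knows how to deal with wrapped states) and zug::comp to make our raw lambda composable:

auto r = zug::into(
    std::vector<std::pair<int, char>>{},
    zug::comp(enumerate) | zug::map([](int i, char c) {
        return std::make_pair(i, c);
    }),
    std::string{"hi"});
// r == {{0, 'h'}, {1, 'i'}}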

Note

In Clojure, stateful transducers hide state within the transducer in the way we showed in the initial example. The reason they do not follow our approach is, I believe, that since Clojure is a dynamic language, this would add too much runtime overhead: one would need to constantly check whether we are dealing with wrapped or unwrapped data. In C++, all this happens at compile time, with no runtime overhead. Ironically, C++ happens to be more functional than Clojure in its transducers implementation 😉

Early termination

Some transducers stop producing output once a certain condition is met. One example is take, which lets at most n elements of the sequence through. How can we signal to the reducing process, from within the reducing function, that there is no more work to be done?

One option would be to throw an exception to be caught by the reducing process. But exceptions are an inefficient control-flow mechanism in C++. Instead, we may realize that termination, whether we are done or not, is a property of the state. Think about it: the take transducer annotates the state with some kind of count, and it is done when this count reaches zero. We say that a state that represents a finished computation is reduced. We can express this in the library like this:

struct take_tag {};

bool state_wrapper_data_is_reduced(take_tag, int n)
{
    return n <= 0;
}

auto take(int n)
{
    return comp([=](auto step) {
        return [=](auto s, auto... is) {
            return wrap_state<take_tag>(
                step(state_unwrap(s), is...),
                state_data(s, [=] { return n; }) - 1);
        };
    });
}

Note how we added a take_tag to identify our state, and how we introduced state_wrapper_data_is_reduced to specify whether the data associated with a particular state indicates termination.
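A hypothetical usage sketch, assuming a reducing process like zug::into that stops as soon as the state becomes reduced:

auto v = std::vector<int>{1, 2, 3, 4, 5};
auto r = zug::into(std::vector<int>{}, take(3), v);
// r == {1, 2, 3}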

Is accumulate a valid reducing process?

You may have noticed that we are progressively adding requirements to how the reducing process interacts with the reducing function. For one, the type of the state changes after the first iteration. Also, we now have to check whether the state is reduced (the library provides state_is_reduced for that) and stop the process in that case.

For this reason, std::accumulate may not work, in general, with the reducing functions produced by our transducers. The library provides its own function, reduce(), and the convenience transduce(), which work like accumulate but take all these state-management quirks into account.
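For instance, a sketch summing only the first two elements, assuming the signature transduce(xform, step, initial, range):

auto v   = std::vector<int>{1, 2, 3, 4};
auto sum = zug::transduce(zug::take(2), std::plus<>{}, 0, v);
// sum == 3: the process stops as soon as the state becomes reduced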

Conditional invocation

In the presence of state, we need to reconsider transducers that conditionally invoke the next reducing function, like filter. The problem is: if the next reducing function wraps the state with additional data, we may need to return something of a different type depending on whether we call the reducing function (we get a wrapped state) or do not (in which case we may or may not already have a wrapped state at hand).

The solution is to return something like a variant of a wrapped or an unwrapped state. To ease this, the library provides the functions call and skip, which take the same arguments (the next reducing function, the state, and the inputs) and return the same type, but the latter does not actually invoke the reducing function. This may be clearer in this improved implementation of filter:

template <typename Predicate>
auto filter(Predicate pred)
{
    return [=] (auto step) {
        return [=] (auto s, auto... ins) {
            return pred(ins...)
                ? call(step, s, ins...)
                : skip(step, s, ins...);
        };
    };
}

Inspecting the state

We have seen how we can use state_data, state_unwrap, and wrap_state to add our own state to the computation without needing special control flow for the first invocation of the reducing function.

However, sometimes we may indeed want to do something different when we receive an unwrapped state, signaling that we have not yet done any work. One example is dedupe, which removes consecutive duplicate elements from the sequence. On the first invocation, we know that the element is not a duplicate, because there was no element before it, so we can call the next reducing function unconditionally.

The library provides a function with_state(s, fn1, fn2) that takes a state s and calls fn1 if s is still unwrapped, or fn2 if it is already wrapped. We can use it to implement dedupe as follows:

auto dedupe = [](auto step)
{
    return [=](auto s, auto... is) {
        return with_state(s,
            [&](auto s) {
                auto next = std::make_tuple(is...);
                return wrap_state(step(state_unwrap(s), is...),
                                  next);
            },
            [&](auto s) {
                auto next = std::make_tuple(is...);
                return next == state_wrapper_data(s)
                     ? s
                     : wrap_state(step(state_unwrap(s), is...),
                                  next);
            });
    };
};
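A hypothetical usage sketch of the dedupe above, again relying on zug::into and zug::comp:

auto v = std::vector<int>{1, 1, 2, 2, 3, 1};
auto r = zug::into(std::vector<int>{}, zug::comp(dedupe), v);
// r == {1, 2, 3, 1}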

Variadics

Note how in the reducing functions above we always receive and pass the inputs as ins.... This is because transducers are variadic: an input may consist of none, one, or many values. This allows for some interesting use cases.

Generators

A transducer that takes no inputs can act as a generator: on every “pulse”, it receives no input but produces a value. For example, we can use a transducer to create a vector with 10 pseudo-random numbers like this:

auto v = into(std::vector<int>{}, zug::map(&std::rand) | zug::take(10));
assert(v.size() == 10);

Note that if we did not use take(10) there, the computation would never terminate. Alternatively, we may define an infinite lazy range of pseudo-random numbers using:

auto s = zug::sequence(zug::map(&std::rand));
std::copy(s.begin(), s.begin() + 10, ...);

Zipping

Conversely, we can use transducers to combine multiple sequences into one, for example by pairing every two elements. With ranges this can be slightly cumbersome, since every range must have one and only one value type, so the elements must be combined into a tuple. With transducers, the two elements can be passed together to the next reducing function, which can combine them however you want. For example, if we have two vectors of integers, we can produce a vector with the sum of every two elements like this:

auto v1 = std::vector<int>{ 1,  2,  3,  4};
auto v2 = std::vector<int>{10, 20, 30, 40};
auto xf = zug::map(std::plus<>{});
auto r  = zug::into(std::vector<int>{}, xf, v1, v2);
assert(r == (std::vector<int>{11, 22, 33, 44}));

Of course, if what you want is, in fact, to generate tuples, the library has you covered:

auto v1 = std::vector<int>{ 1,  2,  3,  4};
auto v2 = std::vector<int>{10, 20, 30, 40};
auto r  = zug::into(std::vector<std::tuple<int, int>>{},
                    zug::zip, v1, v2);
assert(r == (std::vector<std::tuple<int, int>>{
            {1, 10}, {2, 20}, {3, 30}, {4, 40}}));

Or maybe you have tuples, and want to operate on the elements:

auto v  = std::vector<std::tuple<int, int>>{
   {1, 10}, {2, 20}, {3, 30}, {4, 40}
};
auto xf = zug::unzip | zug::map(std::plus<>{});
auto r  = zug::into(std::vector<int>{}, xf, v);
assert(r == (std::vector<int>{11, 22, 33, 44}));

Performance

After being amazed, or maybe horrified, by these castles of lambda functions, you may be wondering: how on earth does this perform?

The short answer is: very well.

Most of the time, a combination of transducers can be as fast as a hand-rolled loop that performs the same transformation. At every step, a transducer is just a simple function call that is easy for the compiler to inline.

In fact, when transforming a range eagerly, it is normally more efficient than an equivalent C++20 range adaptor, since there is no intermediate chaining of polling iterators; instead, the data is passed directly through a chain of reducing functions. However, some things might be slightly slower: using zug::sequence to create a lazy iterator based on a transducer can often be slower than a range adaptor.

The comparison is somewhat apples to oranges, though, since these are different abstractions. While transducers can indeed be convenient for expressing transformations over containers, they can also express, with minimal overhead, transformations over push-based temporal sequences. This is something that is simply impossible with the standard library. Check out Lager for an application of this ability of transducers.

Contribute!

These conclusions are based on ad-hoc inspection of the generated assembly and on benchmarking done during the development of the library. Sadly, there is no systematic benchmark suite included with the library. Developing one would be a good learning exercise; contact us if you would like to contribute one!