C++ templates and Rcpp

C++ templates are really useful. Templates allow you to code a function able to accept arguments of different types that can’t necessarily be known until compile time. The R language is, however, written in C, and knows nothing of templates. Rcpp opens up to the R language the extensions offered by C++ over C, yet integrating templates within Rcpp code is not straightfoward. This blog entry will hopefully clarify the steps needed to use C++ templates in an Rcpp context.

As often in programming, employing templates in Rcpp is about finding the most efficient level of abstraction. Templates are one of the coolest ways to “abstract” C++ code – generally meaning abstracting away from specific variable types (or classes, structures, whatever …) to generic templated forms that accept multiple, or indeed any possible, types. Templates in rust just work - types are directly inferred, and any potential conflicts will be caught at compile time. rust is the gold standard in which template abstraction is as pain-free as possible. C++ templates are, in contrast, somewhat more painful, as a minimal generic template must be explicitly specified. This is often as simple as replacing some function definition, say:

int my_function (int my_integer_input)
{
    int result = my_integer_input;
    // do something with `result`
    return result;
}

with a templated version:

template <class T>
T my_function_t (T my_generic_input)
{
    T result = my_generic_input;
    // do something with `result`
    return result;
}

As it stands, my_function_t will accept inputs of any arbitrary kinds. (There are also ways to permit templated code to only accept objects of some pre-defined classes.) An Rcpp version of the first function might look like this:

// [[Rcpp::export]]
int my_rcpp_function (int my_integer_input)
{
    int result = my_integer_input;
    // do something with `result`
    return result;
}

The problem arises when you try to do something like this:

// [[Rcpp::export]]
template <class T>
int rcpp_template (T input)
{
    int result = Rcpp::as <int> (input);
    // do something with `result`
    return result;
}

and that takes you here:

RcppExports.cpp:46:36: error: use of undeclared identifier 'T'
    Rcpp::traits::input_parameter< T >::type input(inputSEXP);
                                                              ^
RcppExports.cpp:46:41: error: no type named 'type' in the global namespace
    Rcpp::traits::input_parameter< T >::type input(inputSEXP);
                                                                    ~~^
2 errors generated.

This provides highly informative error messages that clearly indicate that the cause is the ability to infer appropriate inputSEXP types (itself a necessity of R being written in C and so knowing nothing about inferred types or templates, as stated above). What we can nevertheless do here is replace our undefined type, T, with an equivalently undefined and generic SEXP (and let’s define our function while we’re at it to square the input; and also, if you’re wondering what all this SEXP stuff is, you could take a wee digression over to Miles McBain’s brief but illuminating ramblings on the topic )

// [[Rcpp::export]]
int rcpp_template (SEXP input)
{
    int result = Rcpp::as <int> (input);
    return result * result;
}

This can be then called from R, and will return an integer output. (The Rcpp::as <int> () is a wrapper for static_cast <int> (), which simply truncates decimals, so rcpp_template(1.9) will give 1.) What about generic return values? The next obvious step would be to try this:

// [[Rcpp::export]]
SEXP rcpp_template2 (SEXP input)
{
    return input * input;
}

This would obviously be rather dangerous if it actually worked, but we don’t need to worry because it fails with this:

error: invalid operands to binary expression ('SEXP' (aka 'SEXPREC *') and 'SEXP')
SEXP result = input * input;
                        ~~~~~ ^ ~~~~~
1 error generated.

The arrow points to the “operand”, indicating that this has defaulted to an attempt to implement bit-wise multiplication of two generic pointer objects (an SEXP object is nothing but a pointer to the underlying structure it points to, an SEXPREC object). So that all leaves us now knowing that the most we can do is to send generic inputs from R to C++ functions as SEXP parameters, and then coerce them with the magic of Rcpp::as(). This is of course also potentially dangerous:

rcpp_template (2.9) # = 4; okay
rcpp_template ("2.9")
# Error in rcpp_template(input) :
#   Not compatible with requested type: [type=character; target=integer].

A better level of abstraction

Remembering that R’s C interface only knows about SEXP (“S-EXPression”), and that all SEXP objects are mere pointers to C arrays, suggests something like the following code—which does not work:

#include <Rcpp.h>

template <class T>
T mysquare (T &x)
{
    for (size_t i = 0; i < x.size (); i++)
        x (i) = x (i) * x (i);
    return x;
}

// [[Rcpp::export]]
SEXP rcpp_mysquare (SEXP &x)
{
    return mysquare (x);
}

That code fails to compile because of “incomplete definition of type ‘SEXPREC’” (where a SEXPREC is a structure pointed to by an SEXP)—in other words, R has no way of inferring the type of data pointed to by the SEXP. The trick to getting this to compile, and thereby to using C++ templates via Rcpp, is to have an additional “type-selector” function that recognises and typecasts the input type as one of the six possible R types . We’re only interested in a couple of those here, representing the integer and real or floating-point types, which are respectively INTSXP and REALSXP. Recalling that there is no distinction between a single integer or numeric (floating-point) value and equivalent vectors of these, we can distinguish these two cases through casting via Rcpp::as to Rcpp equivalents of either integer or numeric vectors with the following additional code, representing our “type selector” function:

SEXP mysquare (SEXP &x)
{
    switch (TYPEOF (x))
    {
        case INTSXP: {
                         Rcpp::IntegerVector iv = Rcpp::as <Rcpp::IntegerVector> (x);
                         return mysquare (iv);
                     }
        case REALSXP: {
                         Rcpp::NumericVector nv = Rcpp::as <Rcpp::NumericVector> (x);
                         return mysquare (nv);
                     }
        default: { Rcpp::stop ("incompatible type");    }
    }
    return x; // this should never happen
}

This function takes a generic (SEXP) input and returns a generic output, yet deploys actual calls to the templated version of mysquare with specified (Rcpp) types, ensuring that the above templated function will always be able to infer the input type. The default Rcpp::stop ensures that types other than our desired two are not processed further, preventing for example attempts to calculate the square of "a". Inserting this “type-selector” code in the above code permits a generic SEXP-in / SEXP-out function (our rcpp_mysquare in the above code) to be deployed to specific types, and then simply passed to a generic C++ template function. Presuming this C++ code to be in a file src.cpp, the whole thing then works like this:

Rcpp::sourceCpp ("src.cpp") # source the file, placing the Rcpp::export-ed function in workspace
x <- 1:5
x <- rcpp_mysquare (x)
x

## [1]  1  4  9 16 25

class (x)

## [1] "integer"

storage.mode (x) <- "numeric"
x <- rcpp_mysquare (x)
x

## [1]   1  16  81 256 625

class (x)

## [1] "numeric"

An integer vector gives integer return values, and a numeric (floating-point) vector gives numeric return values. There you have it: templating through the magic of SEXP. Gratitude extended to Dirk Eddelbeuttel and David Cooley for advice and helpful pointers.

The final code

Just to make it clear, here’s the above code all placed in a single place:

#include <Rcpp.h>

template <class T>
T mysquare (T &x)
{
    for (size_t i = 0; i < x.size (); i++)
        x (i) = x (i) * x (i);
    return x;
}

SEXP mysquare (SEXP &x)
{
    switch (TYPEOF (x))
    {
        case INTSXP: {
                         Rcpp::IntegerVector iv = Rcpp::as <Rcpp::IntegerVector> (x);
                         return mysquare (iv);
                     }
        case REALSXP: {
                         Rcpp::NumericVector nv = Rcpp::as <Rcpp::NumericVector> (x);
                         return mysquare (nv);
                     }
        default: { Rcpp::stop ("error");    }
    }
    return x; // this never happens
}

// [[Rcpp::export]]
SEXP rcpp_mysquare (SEXP &x)
{
    return mysquare (x);
}

Update (31 July 2019)

Since writing that, I found this very clear and more extensive explanation in an Rcpp Gallery post. .

Originally posted: 07 May 19 Updated: 31 Jul 19