C++ templates are really useful. Templates allow you to code a function able to accept arguments of different types that can’t necessarily be known until compile time. The R language is, however, written in C, and knows nothing of templates. Rcpp opens up to the R language the extensions offered by C++ over C, yet integrating templates within Rcpp code is not straightfoward. This blog entry will hopefully clarify the steps needed to use C++ templates in an Rcpp context.
As often in programming, employing templates in Rcpp is about finding the most efficient level of abstraction. Templates are one of the coolest ways to “abstract” C++ code – generally meaning abstracting away from specific variable types (or classes, structures, whatever …) to generic templated forms that accept multiple, or indeed any possible, types. Templates in rust just work - types are directly inferred, and any potential conflicts will be caught at compile time. rust is the gold standard in which template abstraction is as pain-free as possible. C++ templates are, in contrast, somewhat more painful, as a minimal generic template must be explicitly specified. This is often as simple as replacing some function definition, say:
int my_function (int my_integer_input)
{
int result = my_integer_input;
// do something with `result`
return result;
}
with a templated version:
template <class T>
T my_function_t (T my_generic_input)
{
T result = my_generic_input;
// do something with `result`
return result;
}
As it stands, my_function_t
will accept inputs of any arbitrary kinds.
(There are also ways to permit templated code to only accept objects of
some pre-defined classes.) An Rcpp version of the first function might
look like this:
// [[Rcpp::export]]
int my_rcpp_function (int my_integer_input)
{
int result = my_integer_input;
// do something with `result`
return result;
}
The problem arises when you try to do something like this:
// [[Rcpp::export]]
template <class T>
int rcpp_template (T input)
{
int result = Rcpp::as <int> (input);
// do something with `result`
return result;
}
and that takes you here:
RcppExports.cpp:46:36: error: use of undeclared identifier 'T'
Rcpp::traits::input_parameter< T >::type input(inputSEXP);
^
RcppExports.cpp:46:41: error: no type named 'type' in the global namespace
Rcpp::traits::input_parameter< T >::type input(inputSEXP);
~~^
2 errors generated.
This provides highly informative error messages that clearly indicate
that the cause is the ability to infer appropriate inputSEXP
types
(itself a necessity of R being written in C and so knowing nothing
about inferred types or templates, as stated above). What we can
nevertheless do here is replace our undefined type, T
, with an
equivalently undefined and generic SEXP
(and let’s define our function
while we’re at it to square the input; and also, if you’re wondering
what all this SEXP
stuff is, you could take a wee digression over to
Miles McBain’s brief but illuminating ramblings on the topic )
// [[Rcpp::export]]
int rcpp_template (SEXP input)
{
int result = Rcpp::as <int> (input);
return result * result;
}
This can be then called from R, and will return an integer output.
(The Rcpp::as <int> ()
is a wrapper for static_cast <int> ()
, which
simply truncates decimals, so rcpp_template(1.9)
will give 1.) What
about generic return values? The next obvious step would be to try this:
// [[Rcpp::export]]
SEXP rcpp_template2 (SEXP input)
{
return input * input;
}
This would obviously be rather dangerous if it actually worked, but we don’t need to worry because it fails with this:
error: invalid operands to binary expression ('SEXP' (aka 'SEXPREC *') and 'SEXP')
SEXP result = input * input;
~~~~~ ^ ~~~~~
1 error generated.
The arrow points to the “operand”, indicating that this has defaulted to
an attempt to implement bit-wise multiplication of two generic pointer
objects (an SEXP
object is nothing but a pointer to the underlying
structure it points to, an SEXPREC
object). So that all leaves us now
knowing that the most we can do is to send generic inputs from R to
C++ functions as SEXP
parameters, and then coerce them with the magic
of Rcpp::as()
. This is of course also potentially dangerous:
rcpp_template (2.9) # = 4; okay
rcpp_template ("2.9")
# Error in rcpp_template(input) :
# Not compatible with requested type: [type=character; target=integer].
Remembering that
R’s C interface only knows about SEXP
(“S-EXPression”), and that
all SEXP
objects are mere pointers to C arrays, suggests something
like the following code—which does not work:
#include <Rcpp.h>
template <class T>
T mysquare (T &x)
{
for (size_t i = 0; i < x.size (); i++)
x (i) = x (i) * x (i);
return x;
}
// [[Rcpp::export]]
SEXP rcpp_mysquare (SEXP &x)
{
return mysquare (x);
}
That code fails to compile because of “incomplete definition of type
‘SEXPREC’” (where a SEXPREC
is a structure pointed to by an
SEXP
)—in other words, R has no way of inferring the type of data
pointed to by the SEXP
. The trick to getting this to compile, and
thereby to using C++ templates via Rcpp, is to have an additional
“type-selector” function that recognises and typecasts the input type
as one of the
six possible R types . We’re only interested in a couple of those
here, representing the integer and real or floating-point types, which
are respectively INTSXP
and REALSXP
. Recalling that there is no
distinction between a single integer or numeric (floating-point) value
and equivalent vectors of these, we can distinguish these two cases
through casting via Rcpp::as
to Rcpp
equivalents of either integer
or numeric vectors with the following additional code, representing our
“type selector” function:
SEXP mysquare (SEXP &x)
{
switch (TYPEOF (x))
{
case INTSXP: {
Rcpp::IntegerVector iv = Rcpp::as <Rcpp::IntegerVector> (x);
return mysquare (iv);
}
case REALSXP: {
Rcpp::NumericVector nv = Rcpp::as <Rcpp::NumericVector> (x);
return mysquare (nv);
}
default: { Rcpp::stop ("incompatible type"); }
}
return x; // this should never happen
}
This function takes a generic (SEXP
) input and returns a generic
output, yet deploys actual calls to the templated version of mysquare
with specified (Rcpp
) types, ensuring that the above templated
function will always be able to infer the input type. The default
Rcpp::stop
ensures that types other than our desired two are not
processed further, preventing for example attempts to calculate the
square of "a"
. Inserting this “type-selector” code in the above code
permits a generic SEXP
-in / SEXP
-out function (our rcpp_mysquare
in the above code) to be deployed to specific types, and then simply
passed to a generic C++ template function. Presuming this C++ code to be
in a file src.cpp
, the whole thing then works like this:
Rcpp::sourceCpp ("src.cpp") # source the file, placing the Rcpp::export-ed function in workspace
x <- 1:5
x <- rcpp_mysquare (x)
x
## [1] 1 4 9 16 25
class (x)
## [1] "integer"
storage.mode (x) <- "numeric"
x <- rcpp_mysquare (x)
x
## [1] 1 16 81 256 625
class (x)
## [1] "numeric"
An integer vector gives integer return values, and a numeric
(floating-point) vector gives numeric return values. There you have it:
templating through the magic of SEXP
. Gratitude extended to Dirk
Eddelbeuttel and David Cooley for advice and helpful pointers.
Just to make it clear, here’s the above code all placed in a single place:
#include <Rcpp.h>
template <class T>
T mysquare (T &x)
{
for (size_t i = 0; i < x.size (); i++)
x (i) = x (i) * x (i);
return x;
}
SEXP mysquare (SEXP &x)
{
switch (TYPEOF (x))
{
case INTSXP: {
Rcpp::IntegerVector iv = Rcpp::as <Rcpp::IntegerVector> (x);
return mysquare (iv);
}
case REALSXP: {
Rcpp::NumericVector nv = Rcpp::as <Rcpp::NumericVector> (x);
return mysquare (nv);
}
default: { Rcpp::stop ("error"); }
}
return x; // this never happens
}
// [[Rcpp::export]]
SEXP rcpp_mysquare (SEXP &x)
{
return mysquare (x);
}
Since writing that, I found this very clear and more extensive explanation in an Rcpp Gallery post. .
Copyright © 2019--22 mark padgham