Overview

The ast2ast package translates R functions into C++ functions, returning either an external pointer (XPtr) or an R function. This package is particularly useful for tasks requiring frequent function evaluations, such as solving ODE systems or optimization problems. Calling the translated C++ function through the external pointer can significantly improve performance, as shown in the benchmark below.

Benchmark

Supported objects:

Supported functions:

Type system in ast2ast

Overview

R is dynamically typed, while C++ is statically typed. When translating an R function, ast2ast must decide static C++ types for:

  1. the arguments of f, and
  2. the variables created inside f.

In ast2ast, every type is a combination of:

  • a base type: logical, integer (or int), double
  • a structure: scalar, vector / vec, matrix / mat

Typical types are therefore:

  • double (scalar double)
  • vec(double) / vector(double) (double vector)
  • mat(double) / matrix(double) (double matrix)

A key difference from R: ast2ast scalars are true scalars, not length-1 vectors. Scalars cannot be subset using [] or [[ ]].
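For contrast, the following plain R snippet shows the behavior that does not carry over. It runs in ordinary R only and does not use ast2ast; per the rule above, the subsetting line would not be valid for a variable that ast2ast types as a scalar.

```r
x <- 3.14

# In plain R a single number is a vector of length 1 ...
n <- length(x)

# ... so it can be subset like any other vector.
first <- x[[1L]]

print(n)
print(identical(first, x))
```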


Setting types for function arguments

Default behavior

If args_f = NULL, all arguments default to matrix(double), as this is the most convenient type for numeric code.


Using args_f

To control argument types, provide a helper function via args_f. It has the same argument list as f and assigns types using type():

f_args <- function(a, b, c) {
  a |> type(vec(double))
  b |> type(mat(double))
  c |> type(double)
}
f_cpp <- ast2ast::translate(f, args_f = f_args)

Borrowing, constness, and references

For arguments, you can additionally control how values are passed:

  • borrow_vec(...), borrow_mat(...): borrow memory (no copy)
  • const(): disallow modification
  • ref(): pass by reference (only valid when output = "XPtr")

Example:

f_args <- function(a, b, c) {
  a |> type(borrow_vec(double)) |> ref()            # mutable, passed by reference
  b |> type(borrow_mat(double)) |> ref() |> const() # read-only matrix reference
  c |> type(double) |> ref()                        # scalar reference (XPtr only)
}

Notes:

  • Borrowed arguments are useful for avoiding allocations and modifying inputs in place.
  • const() produces a compile-time error in C++ if mutation is attempted.
  • ref() is primarily intended for the external pointer interface.

Setting types for variables inside the function

Types inside f are often inferred automatically from:

  • the first assignment, and/or
  • the constructor used (numeric(), matrix(), integer(), etc.)

You can override inference using explicit annotations.


Immutability of types

Once a variable has a type, it cannot change its base type or structure. This follows C++ rules. For example, you cannot first treat a variable as a scalar double and later assign a double vector to it. If you need a different type or structure, create a new variable with a new name.
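A minimal sketch of this rule, assuming the literal 1.5 is inferred as a scalar double from the first assignment. The function is ordinary R and runs as written; the commented-out line is the kind of reassignment that would change the structure and is therefore expected to be rejected when the function is translated.

```r
f <- function() {
  a <- 1.5             # first assignment: a is treated as a scalar double
  a <- a * 2.0         # fine: still a scalar double
  # a <- c(1.0, 2.0)   # would change the structure from scalar to vector
  a_vec <- c(a, 2.0)   # use a new variable name for the vector instead
  return(a_vec)
}

print(f())
```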

Derivatives

The ast2ast package provides built-in support for automatic differentiation (AD) in both forward mode and reverse mode. Derivative support is enabled when translating a function via the derivative argument:

fcpp <- ast2ast::translate(f, derivative = "forward")
fcpp <- ast2ast::translate(f, derivative = "reverse")

Unlike many high-level AD frameworks, ast2ast intentionally exposes a low-level and explicit interface. Rather than automatically differentiating a function or providing a single jacobian() wrapper, users explicitly assemble derivative computations using a small set of primitive operations. This design keeps the behavior transparent, predictable, and close to the generated C++ code.

Forward mode

In forward mode, derivatives are propagated alongside values. Internally, each scalar carries both its value and its directional derivative (also called its dot value). The following functions are available:

  • seed(x, i): Activates the i-th component of x as the differentiation direction (sets its derivative to 1).
  • unseed(x, i): Resets the derivative state of the i-th component.
  • get_dot(y): Extracts the directional derivatives of y.

A typical pattern is to compute Jacobians column-by-column by looping over the input variables:

f <- function(y, x) {
  jac <- matrix(0.0, length(y), length(x))
  for (i in 1L:length(x)) {
    seed(x, i)

    y[[1L]] <- x[[1L]] * x[[2L]]
    y[[2L]] <- x[[1L]] + x[[2L]] * x[[2L]]

    d <- get_dot(y)
    jac[TRUE, i] <- d

    unseed(x, i)
  }
  return(jac)
}

fcpp_forward <- ast2ast::translate(f, derivative = "forward")

Forward mode is most efficient when the number of inputs is small relative to the number of outputs.
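To make the "value plus dot value" idea concrete, here is a small illustration in plain R that does not use ast2ast. The helpers dual, dual_mul, dual_add and forward_jacobian are hypothetical names for this sketch only; they mimic what seeding and dot extraction achieve for the two-output example above, and say nothing about how ast2ast implements it internally.

```r
# A dual number is a plain list: a value plus its directional derivative ("dot").
dual <- function(val, dot = 0) list(val = val, dot = dot)

# Product rule: (a * b)' = a' * b + a * b'
dual_mul <- function(a, b) dual(a$val * b$val, a$dot * b$val + a$val * b$dot)

# Sum rule: (a + b)' = a' + b'
dual_add <- function(a, b) dual(a$val + b$val, a$dot + b$dot)

forward_jacobian <- function(x) {
  n <- length(x)
  jac <- matrix(0.0, 2L, n)
  for (i in seq_len(n)) {
    # "Seed": only the i-th input gets derivative 1, all others stay at 0.
    xd <- lapply(seq_len(n), function(j) dual(x[[j]], as.numeric(j == i)))

    y1 <- dual_mul(xd[[1L]], xd[[2L]])
    y2 <- dual_add(xd[[1L]], dual_mul(xd[[2L]], xd[[2L]]))

    # The dot values of the outputs form column i of the Jacobian.
    jac[, i] <- c(y1$dot, y2$dot)
    # "Unseeding" is implicit here because xd is rebuilt in every iteration.
  }
  jac
}

jac <- forward_jacobian(c(2, 3))
print(jac)
```

Each pass through the loop yields one Jacobian column, which is why the cost of forward mode grows with the number of inputs.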

Reverse mode

In reverse mode, derivatives are accumulated by propagating sensitivities backward from the outputs to the inputs. This is particularly efficient when the number of outputs is small relative to the number of inputs.

Reverse mode provides the function:

  • deriv(y, x): Computes the Jacobian of y with respect to x.

Example:

f <- function(y, x) {
  y[[1L]] <- x[[1L]] * x[[2L]]
  y[[2L]] <- x[[1L]] + x[[2L]] * x[[2L]]
  jac <- deriv(y, x)
  return(jac)
}

fcpp_reverse <- ast2ast::translate(f, derivative = "reverse")

The call to deriv() must appear explicitly in your function body. No automatic differentiation is performed unless requested.
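Whichever mode is used, a computed Jacobian can be sanity-checked against central finite differences. The sketch below is plain R and does not call ast2ast; model and fd_jacobian are hypothetical helper names, and model simply restates the two equations from the examples as a function of x.

```r
# The example system written as a plain R function of x.
model <- function(x) {
  c(x[[1L]] * x[[2L]],
    x[[1L]] + x[[2L]] * x[[2L]])
}

# Central finite differences: column i approximates d model / d x[i].
fd_jacobian <- function(fun, x, h = 1e-6) {
  y0 <- fun(x)
  jac <- matrix(0.0, length(y0), length(x))
  for (i in seq_along(x)) {
    step <- numeric(length(x))
    step[i] <- h
    jac[, i] <- (fun(x + step) - fun(x - step)) / (2 * h)
  }
  jac
}

x0 <- c(2, 3)
jac_fd <- fd_jacobian(model, x0)

# Analytic Jacobian of the example: rows are outputs, columns are inputs.
jac_exact <- matrix(c(x0[2], 1, x0[1], 2 * x0[2]), nrow = 2L)

max_err <- max(abs(jac_fd - jac_exact))
print(max_err)
```

The same comparison can be applied to the matrix returned by a translated function, as long as it is evaluated at the same x.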

Design philosophy

Derivative computation in ast2ast is explicit by design. The full control flow—loops, seeding, unseeding, derivative extraction, and accumulation—is written directly in R and translated into C++.

This approach:

  • avoids hidden performance costs,
  • makes derivative logic easy to inspect and debug,
  • gives full control over memory and evaluation order,
  • maps naturally to high-performance C++ code.

Rather than hiding differentiation behind abstractions, ast2ast treats derivatives as first-class values that can be manipulated like any other object.

Interpolation

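To interpolate values, use the ‘cmr’ function. It takes three arguments: the point at which to evaluate, the vector of independent-variable points, and the vector of dependent-variable points.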

f <- function() {
  dep <- c(0, 1, 0.5, 2.5, 3.5, 4.5, 4)
  indep <- 1:7
  evalpoints <- c(
    0.5, 1, 1.5, 2, 2.5,
    3, 3.5, 4, 4.5, 5,
    5.5, 6, 6.5
  )
  for (i in evalpoints) {
    print(cmr(i, indep, dep))
  }
}
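The text does not say which interpolation scheme cmr uses. Assuming the name stands for a Catmull-Rom spline (an assumption, not something stated above), the idea can be sketched in plain R as a cubic Hermite interpolant with Catmull-Rom tangents. catmull_rom is a hypothetical helper and not ast2ast's implementation; the one-sided end tangents and the refusal to extrapolate are choices of this sketch, so its values may differ from those of cmr.

```r
catmull_rom <- function(x, xs, ys) {
  n <- length(xs)
  stopifnot(n >= 3L, length(ys) == n, x >= xs[1L], x <= xs[n])

  # Tangents: centered slopes at interior points, one-sided slopes at the ends.
  m <- numeric(n)
  k <- 2L:(n - 1L)
  m[k] <- (ys[k + 1L] - ys[k - 1L]) / (xs[k + 1L] - xs[k - 1L])
  m[1L] <- (ys[2L] - ys[1L]) / (xs[2L] - xs[1L])
  m[n] <- (ys[n] - ys[n - 1L]) / (xs[n] - xs[n - 1L])

  # Locate the segment [xs[i], xs[i + 1]] that contains x.
  i <- findInterval(x, xs, rightmost.closed = TRUE)
  h <- xs[i + 1L] - xs[i]
  t <- (x - xs[i]) / h

  # Cubic Hermite basis functions.
  h00 <- 2 * t^3 - 3 * t^2 + 1
  h10 <- t^3 - 2 * t^2 + t
  h01 <- -2 * t^3 + 3 * t^2
  h11 <- t^3 - t^2

  h00 * ys[i] + h10 * h * m[i] + h01 * ys[i + 1L] + h11 * h * m[i + 1L]
}

dep <- c(0, 1, 0.5, 2.5, 3.5, 4.5, 4)
indep <- 1:7

# At the given points the interpolant reproduces the data.
at_knots <- sapply(indep, catmull_rom, xs = indep, ys = dep)
print(at_knots)

# Between points it blends the neighboring values smoothly.
print(catmull_rom(1.5, indep, dep))
```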