Automatic Differentiation

Switching AD Modes

Turing supports two types of automatic differentiation (AD) in the back end during sampling. The current default AD mode is ForwardDiff, but Turing also supports Flux’s Tracker-based differentation.

To switch between ForwardDiff and Flux.Tracker, one can call function Turing.setadbackend(backend_sym), where backend_sym can be :forward_diff or :reverse_diff.

Compositional Sampling with Differing AD Modes

Turing supports intermixed automatic differentiation methods for different variable spaces. The snippet below shows using ForwardDiff to sample the mean (m) parameter, and using the Flux-based FluxTrackerAD autodiff for the variance (s) parameter:

using Turing

# Define a simple Normal model with unknown mean and variance.
@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

# Sample using Gibbs and varying autodiff backends.
c = sample(gdemo(1.5, 2),
  Gibbs(1000,
    HMC{Turing.ForwardDiffAD{1}}(2, 0.1, 5, :m),
    HMC{Turing.FluxTrackerAD}(2, 0.1, 5, :s)))

Generally, FluxTrackerAD is faster when sampling from variables of high dimensionality (greater than 20) and ForwardDiffAD is more efficient for lower-dimension variables. This functionality allows those who are performance sensistive to fine tune their automatic differentiation for their specific models.

If the differentation method is not specified in this way, Turing will default to using whatever the global AD backend is. Currently, this defaults to ForwardDiff.