Index
Bijectors.ADBijector
Bijectors.AbstractBijector
Bijectors.Bijector
Bijectors.Composed
Bijectors.CorrBijector
Bijectors.Coupling
Bijectors.Inverse
Bijectors.LeakyReLU
Bijectors.NamedBijector
Bijectors.NamedComposition
Bijectors.NamedCoupling
Bijectors.NamedInverse
Bijectors.PartitionMask
Bijectors.PartitionMask
Bijectors.Permute
Bijectors.Stacked
Bijectors._link_chol_lkj
Bijectors.bijector
Bijectors.combine
Bijectors.composel
Bijectors.composer
Bijectors.compute_r
Bijectors.couple
Bijectors.coupling
Bijectors.find_alpha
Bijectors.forward
Bijectors.forward
Bijectors.get_u_hat
Bijectors.isclosedform
Bijectors.logabsdetjac
Bijectors.logabsdetjac
Bijectors.logabsdetjacinv
Bijectors.logabsdetjacinv
Bijectors.logpdf_with_jac
Bijectors.partition
Bijectors.transformed
Functions
#
Bijectors._link_chol_lkj
— Method.
function _link_chol_lkj(w)
Link function for cholesky factor.
An alternative and maybe more efficient implementation was considered:
for i=2:K, j=(i+1):K
    z[i, j] = (w[i, j] / w[i-1, j]) * (z[i-1, j] / sqrt(1 - z[i-1, j]^2))
end
But this implementation does not work when w[i-1, j] = 0. Although that is a set of measure zero, it rules out initialization with the identity matrix.
For equivalence, the following explanation is given by @torfjelde:
For (i, j) in the loop below, we define
z₍ᵢ₋₁,ⱼ₎ = w₍ᵢ₋₁,ⱼ₎ * ∏ₖ₌₁ⁱ⁻² (1 / √(1 - z₍ₖ,ⱼ₎²))
and so
z₍ᵢ,ⱼ₎ = w₍ᵢ,ⱼ₎ * ∏ₖ₌₁ⁱ⁻¹ (1 / √(1 - z₍ₖ,ⱼ₎²))
= (w₍ᵢ,ⱼ₎ / √(1 - z₍ᵢ₋₁,ⱼ₎²)) * (∏ₖ₌₁ⁱ⁻² 1 / √(1 - z₍ₖ,ⱼ₎²))
= (w₍ᵢ,ⱼ₎ / √(1 - z₍ᵢ₋₁,ⱼ₎²)) * (w₍ᵢ₋₁,ⱼ₎ * ∏ₖ₌₁ⁱ⁻² 1 / √(1 - z₍ₖ,ⱼ₎²)) / w₍ᵢ₋₁,ⱼ₎
= (w₍ᵢ,ⱼ₎ / √(1 - z₍ᵢ₋₁,ⱼ₎²)) * (z₍ᵢ₋₁,ⱼ₎ / w₍ᵢ₋₁,ⱼ₎)
= (w₍ᵢ,ⱼ₎ / w₍ᵢ₋₁,ⱼ₎) * (z₍ᵢ₋₁,ⱼ₎ / √(1 - z₍ᵢ₋₁,ⱼ₎²))
which is the above implementation.
#
Bijectors.bijector
— Method.
bijector(d::Distribution)
Returns the constrained-to-unconstrained bijector for distribution d.
#
Bijectors.combine
— Method.
combine(m::PartitionMask, x_1, x_2, x_3)
Combines x_1, x_2, and x_3 into a single vector.
#
Bijectors.composel
— Method.
composel(ts::Bijector...)::Composed{<:Tuple}
Constructs Composed such that ts are applied left-to-right.
#
Bijectors.composer
— Method.
composer(ts::Bijector...)::Composed{<:Tuple}
Constructs Composed such that ts are applied right-to-left.
#
Bijectors.compute_r
— Method.
compute_r(y_minus_z0::AbstractVector{<:Real}, α, α_plus_β_hat)
Compute the unique solution $r$ to the equation
\[\|y_minus_z0\|_2 = r \left(1 + \frac{α_plus_β_hat - α}{α + r}\right)\]
subject to $r ≥ 0$ and $r ≠ α$.
Since $α > 0$ and $α_plus_β_hat > 0$, the solution is unique and given by
\[r = (\sqrt{(α_plus_β_hat - γ)^2 + 4 α γ} - (α_plus_β_hat - γ)) / 2,\]
where $γ = \|y_minus_z0\|_2$. For details see appendix A.2 of the reference.
References
D. Rezende, S. Mohamed (2015): Variational Inference with Normalizing Flows. arXiv:1505.05770
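The closed-form expression for $r$ can be checked numerically by substituting it back into the defining equation. The following is a minimal sketch in plain Julia; `compute_r_sketch` is a hypothetical helper that takes the norm γ directly rather than the vector `y_minus_z0`, and the numeric inputs are arbitrary illustrative values.

```julia
# Hypothetical helper: evaluates the closed-form root quoted above.
# γ plays the role of ‖y_minus_z0‖₂ (passed in directly here).
function compute_r_sketch(γ::Real, α::Real, α_plus_β_hat::Real)
    a = α_plus_β_hat - γ
    return (sqrt(a^2 + 4 * α * γ) - a) / 2
end

# Arbitrary illustrative values with α > 0 and α_plus_β_hat > 0.
α, α_plus_β_hat, γ = 0.5, 1.2, 2.0
r = compute_r_sketch(γ, α, α_plus_β_hat)

# Substitute r back into γ = r * (1 + (α_plus_β_hat - α) / (α + r)).
residual = r * (1 + (α_plus_β_hat - α) / (α + r)) - γ
```

If the formula is right, `residual` should vanish up to floating-point error and `r` should be non-negative.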
#
Bijectors.couple
— Method.
Returns the coupling law constructed from x.
#
Bijectors.coupling
— Method.
Returns the constructor of the coupling law.
#
Bijectors.find_alpha
— Method.
find_alpha(wt_y, wt_u_hat, b)
Compute an (approximate) real-valued solution $α̂$ to the equation
\[wt_y = α + wt_u_hat tanh(α + b)\]
The uniqueness of the solution is guaranteed since $wt_u_hat ≥ -1$. For details see appendix A.1 of the reference.
Initial bracket
For all $α$, we have
\[α - |wt_u_hat| - wt_y \leq α + wt_u_hat tanh(α + b) - wt_y \leq α + |wt_u_hat| - wt_y.\]
Thus
\[α̂ - |wt_u_hat| - wt_y \leq 0 \leq α̂ + |wt_u_hat| - wt_y,\]
which implies $α̂ ∈ [wt_y - |wt_u_hat|, wt_y + |wt_u_hat|]$.
References
D. Rezende, S. Mohamed (2015): Variational Inference with Normalizing Flows. arXiv:1505.05770
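The initial bracket makes a simple bracketing root search possible. The sketch below uses plain bisection, which is an assumption chosen for illustration and not necessarily the root finder the package uses; `find_alpha_sketch` and its `iters` keyword are hypothetical names.

```julia
# Hypothetical bisection-based solver for wt_y = α + wt_u_hat * tanh(α + b).
# Assumes wt_u_hat ≥ -1, so the left-hand function below is non-decreasing in α.
function find_alpha_sketch(wt_y::Real, wt_u_hat::Real, b::Real; iters::Int = 200)
    f(α) = α + wt_u_hat * tanh(α + b) - wt_y
    # Initial bracket from the inequalities above: f(lo) ≤ 0 ≤ f(hi).
    lo = wt_y - abs(wt_u_hat)
    hi = wt_y + abs(wt_u_hat)
    for _ in 1:iters
        mid = (lo + hi) / 2
        if f(mid) <= 0
            lo = mid
        else
            hi = mid
        end
    end
    return (lo + hi) / 2
end

# Arbitrary illustrative inputs.
α_hat = find_alpha_sketch(0.3, 1.5, -0.2)
residual = α_hat + 1.5 * tanh(α_hat - 0.2) - 0.3
```

Bisection is slower than derivative-based methods but cannot leave the bracket, which is why it is a convenient way to illustrate the bound.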
#
Bijectors.forward
— Method.
forward(b::Bijector, x)
Computes both the transform and logabsdetjac in one forward pass, and returns a named tuple (rv=b(x), logabsdetjac=logabsdetjac(b, x)).
This defaults to the call above, but the transform and the logabsdetjac often share intermediate computations. forward allows the user to take advantage of such efficiencies, if they exist.
#
Bijectors.forward
— Method.
forward(d::Distribution)
forward(d::Distribution, num_samples::Int)
Returns a NamedTuple with fields x, y, logabsdetjac and logpdf.
In the case where d isa TransformedDistribution, this means
- x = rand(d.dist)
- y = d.transform(x)
- logabsdetjac is the logabsdetjac of the “forward” transform.
- logpdf is the logpdf of y, not x
In the case where d isa Distribution, this means
- x = rand(d)
- y = x
- logabsdetjac = 0.0
- logpdf is logpdf of x
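To make the four fields concrete, here is a minimal sketch in plain Julia for the transformed case, assuming a standard normal base distribution and the transform `exp`. `forward_sketch` and `normal_logpdf` are hypothetical helpers written for this illustration, and the fixed value 0.3 stands in for a random draw.

```julia
# Standard normal log-density, written out to avoid extra dependencies.
normal_logpdf(x) = -0.5 * (log(2π) + x^2)

# Hypothetical stand-in for forward(d) with base N(0, 1) and transform exp.
function forward_sketch(x::Real)
    y = exp(x)
    logabsdetjac = x                            # log|d exp(x)/dx| = x
    logpdf = normal_logpdf(x) - logabsdetjac    # log-density of y, not of x
    return (x = x, y = y, logabsdetjac = logabsdetjac, logpdf = logpdf)
end

res = forward_sketch(0.3)  # 0.3 stands in for a draw from the base distribution

# Reference value: log-normal log-density evaluated at y.
lognormal_logpdf = -log(res.y) - 0.5 * log(2π) - 0.5 * log(res.y)^2
```

The `logpdf` field should agree with the log-normal density at `y`, since `exp` of a standard normal variable is log-normally distributed.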
#
Bijectors.get_u_hat
— Method.
get_u_hat(u::AbstractVector{<:Real}, w::AbstractVector{<:Real})
Return a tuple of vector $û$ that guarantees invertibility of the planar layer, and scalar $wᵀ û$.
Mathematical background
According to appendix A.1, vector $û$ defined by
\[û(w, u) = u + (\log(1 + \exp{(wᵀu)}) - 1 - wᵀu) \frac{w}{\|w\|²}\]
guarantees that the planar layer $f(z) = z + û tanh(wᵀz + b)$ is invertible for all $w, u ∈ ℝᵈ$ and $b ∈ ℝ$. We can rewrite $û$ as
\[û = u + (\log(1 + \exp{(-wᵀu)}) - 1) \frac{w}{\|w\|²}.\]
Additionally, we obtain
\[wᵀû = wᵀu + \log(1 + \exp{(-wᵀu)}) - 1 = \log(1 + \exp{(wᵀu)}) - 1.\]
References
D. Rezende, S. Mohamed (2015): Variational Inference with Normalizing Flows. arXiv:1505.05770
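The two identities above can be verified numerically. The following plain-Julia sketch uses a hypothetical helper `get_u_hat_sketch` that follows the rewritten formula; the vectors are arbitrary illustrative values.

```julia
using LinearAlgebra: dot

# Hypothetical helper following the rewritten formula for û.
function get_u_hat_sketch(u::AbstractVector{<:Real}, w::AbstractVector{<:Real})
    wt_u = dot(w, u)
    # log(1 + exp(-wᵀu)) - 1, using log1p for accuracy near zero
    scale = log1p(exp(-wt_u)) - 1
    u_hat = u .+ scale .* w ./ dot(w, w)
    # Closed form for wᵀû; strictly greater than -1 by construction
    wt_u_hat = log1p(exp(wt_u)) - 1
    return u_hat, wt_u_hat
end

u = [1.0, -2.0, 0.5]
w = [0.3, 0.7, -1.1]
u_hat, wt_u_hat = get_u_hat_sketch(u, w)
```

Computing `dot(w, u_hat)` directly should reproduce `wt_u_hat`, and the result always exceeds -1, which is the invertibility condition used by `find_alpha`.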
#
Bijectors.isclosedform
— Method.
isclosedform(b::Bijector)::Bool
isclosedform(b⁻¹::Inverse{<:Bijector})::Bool
Returns true or false depending on whether or not evaluation of b has a closed-form implementation.
Most bijectors have closed-form evaluations, but there are exceptions. For example, the inverse evaluation of PlanarLayer requires an iterative procedure.
#
Bijectors.logabsdetjac
— Method.
Computes the log absolute determinant of the Jacobian of the inverse transformation.
#
Bijectors.logabsdetjac
— Method.
logabsdetjac(b::Bijector, x)
logabsdetjac(ib::Inverse{<:Bijector}, y)
Computes log(abs(det(J(b(x))))), where J is the Jacobian of the transform. Similarly for the inverse transform.
The default implementation for Inverse{<:Bijector} is -logabsdetjac of the original Bijector.
#
Bijectors.logabsdetjacinv
— Method.
logabsdetjacinv(b::Bijector, y)
Just an alias for logabsdetjac(inv(b), y).
#
Bijectors.logabsdetjacinv
— Method.
logabsdetjacinv(td::UnivariateTransformed, y::Real)
logabsdetjacinv(td::MultivariateTransformed, y::AbstractVector{<:Real})
Computes the logabsdetjac of the inverse transformation, since rand(td) returns the transformed random variable.
#
Bijectors.logpdf_with_jac
— Method.
logpdf_with_jac(td::UnivariateTransformed, y::Real)
logpdf_with_jac(td::MvTransformed, y::AbstractVector{<:Real})
logpdf_with_jac(td::MatrixTransformed, y::AbstractMatrix{<:Real})
Makes use of the forward method to potentially reuse computation and returns a tuple (logpdf, logabsdetjac).
#
Bijectors.partition
— Method.
partition(m::PartitionMask, x)
Partitions x into 3 disjoint subvectors.
#
Bijectors.transformed
— Method.
transformed(d::Distribution)
transformed(d::Distribution, b::Bijector)
Couples distribution d with the bijector b by returning a TransformedDistribution.
If no bijector is provided, i.e. transformed(d) is called, then transformed(d, bijector(d)) is returned.
Types
#
Bijectors.ADBijector
— Type.
Abstract type for a Bijector{N} making use of automatic differentiation (AD) to implement jacobian and, by implication, logabsdetjac.
#
Bijectors.AbstractBijector
— Type.
Abstract type for a bijector.
#
Bijectors.Bijector
— Type.
Abstract type of bijectors with fixed dimensionality.
#
Bijectors.Composed
— Type.
Composed(ts::A)
∘(b1::Bijector{N}, b2::Bijector{N})::Composed{<:Tuple}
composel(ts::Bijector{N}...)::Composed{<:Tuple}
composer(ts::Bijector{N}...)::Composed{<:Tuple}
where A refers to either
- Tuple{Vararg{<:Bijector{N}}}: a tuple of bijectors of dimensionality N
- AbstractArray{<:Bijector{N}}: an array of bijectors of dimensionality N
A Bijector representing composition of bijectors. composel and composer result in a Composed for which application occurs from left-to-right and right-to-left, respectively.
Note that all the alternative ways of constructing a Composed return a Tuple of bijectors. This ensures type-stability of implementations of all related methods, e.g. inv.
If you want to use an Array as the container instead, you can do
Composed([b1, b2, ...])
In general this is not advised since you lose type-stability, but there might be cases where this is desired, e.g. if you have a very large number of bijectors to compose.
Examples
Simple example
Let’s consider a simple example of Exp:
julia> using Bijectors: Exp
julia> b = Exp()
Exp{0}()
julia> b ∘ b
Composed{Tuple{Exp{0},Exp{0}},0}((Exp{0}(), Exp{0}()))
julia> (b ∘ b)(1.0) == exp(exp(1.0)) # evaluation
true
julia> inv(b ∘ b)(exp(exp(1.0))) == 1.0 # inversion
true
julia> logabsdetjac(b ∘ b, 1.0) # determinant of jacobian
3.718281828459045
Notes
Order
It’s important to note that ∘ does what is expected mathematically, which means that the bijectors are applied to the input right-to-left, e.g. first applying b2 and then b1:
(b1 ∘ b2)(x) == b1(b2(x)) # => true
But in the Composed struct itself, we store the bijectors left-to-right, so that
cb1 = b1 ∘ b2 # => Composed.ts == (b2, b1)
cb2 = composel(b2, b1) # => Composed.ts == (b2, b1)
cb1(x) == cb2(x) == b1(b2(x)) # => true
Structure
∘ will “flatten” the composition structure, while composel and composer preserve the compositional structure. This is most easily seen by an example:
julia> b = Exp()
Exp{0}()
julia> cb1 = b ∘ b; cb2 = b ∘ b;
julia> (cb1 ∘ cb2).ts # <= different
(Exp{0}(), Exp{0}(), Exp{0}(), Exp{0}())
julia> (cb1 ∘ cb2).ts isa NTuple{4, Exp{0}}
true
julia> Bijectors.composer(cb1, cb2).ts
(Composed{Tuple{Exp{0},Exp{0}},0}((Exp{0}(), Exp{0}())), Composed{Tuple{Exp{0},Exp{0}},0}((Exp{0}(), Exp{0}())))
julia> Bijectors.composer(cb1, cb2).ts isa Tuple{Composed, Composed}
true
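The value 3.718281828459045 reported for `logabsdetjac(b ∘ b, 1.0)` in the simple example follows from the chain rule: the log-determinants of the individual steps add up, each evaluated at the input of its own step. The sketch below reproduces this in plain Julia without the package; `exp_map` and `compose_ladj` are hypothetical names used only for illustration.

```julia
# Each hypothetical "bijector" is a pair: the scalar map and its log|det J|.
exp_map = (f = exp, ladj = x -> x)   # d/dx exp(x) = exp(x), so log|J| = x

# Apply the maps left-to-right (mirroring how the doc says `ts` is stored),
# accumulating the log-determinant at the input of every step.
function compose_ladj(bs, x)
    total = zero(float(x))
    for b in bs
        total += b.ladj(x)
        x = b.f(x)
    end
    return (rv = x, logabsdetjac = total)
end

res = compose_ladj((exp_map, exp_map), 1.0)
# res.rv is exp(exp(1.0)); res.logabsdetjac is 1.0 + exp(1.0)
```

For two `exp` steps starting at 1.0, the first contributes 1.0 and the second contributes `exp(1.0)`, which sums to the value shown in the example.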
#
Bijectors.CorrBijector
— Type.
CorrBijector <: Bijector{2}
A bijector implementation of Stan’s parametrization method for correlation matrices: https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html
Basically, an unconstrained strictly upper triangular matrix y is transformed to a correlation matrix by the following readable but not very efficient procedure:
K = size(y, 1)
z = tanh.(y)
w = similar(z) # upper-triangular factor to be filled in
for j=1:K, i=1:K
    if i>j
        w[i,j] = 0
    elseif 1==i==j
        w[i,j] = 1
    elseif 1<i==j
        w[i,j] = prod(sqrt.(1 .- z[1:i-1, j].^2))
    elseif 1==i<j
        w[i,j] = z[i,j]
    elseif 1<i<j
        w[i,j] = z[i,j] * prod(sqrt.(1 .- z[1:i-1, j].^2))
    end
end
It is easy to see that every column is a unit vector, for example:
w3' w3 ==
w[1,3]^2 + w[2,3]^2 + w[3,3]^2 ==
z[1,3]^2 + (z[2,3] * sqrt(1 - z[1,3]^2))^2 + (sqrt(1 - z[1,3]^2) * sqrt(1 - z[2,3]^2))^2 ==
z[1,3]^2 + z[2,3]^2 * (1 - z[1,3]^2) + (1 - z[1,3]^2) * (1 - z[2,3]^2) ==
z[1,3]^2 + z[2,3]^2 - z[2,3]^2 * z[1,3]^2 + 1 - z[1,3]^2 - z[2,3]^2 + z[1,3]^2 * z[2,3]^2 ==
1
And the diagonal elements are positive, so w is a Cholesky factor of a positive definite matrix.
x = w' * w
Consider the block matrix representation of x:
x = [w1'; w2'; ... wn'] * [w1 w2 ... wn] ==
[w1'w1 w1'w2 ... w1'wn;
w2'w1 w2'w2 ... w2'wn;
...
]
The diagonal elements are given by wk'wk = 1, thus x is a correlation matrix.
Every step is invertible, so this is a bijection (bijector).
Note: The implementation doesn’t follow their “manageable expression” directly, because their equation seems wrong (7/30/2020). Instead, it directly follows the definition given above the “manageable expression”, which is also described in the doc above.
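The construction described above can be run end to end to confirm that the result has a unit diagonal and is symmetric. This is a minimal sketch in plain Julia that follows the readable form from the description rather than the package's actual implementation; `corr_from_unconstrained` is a hypothetical name and the input matrix is arbitrary.

```julia
using LinearAlgebra: diag

# Hypothetical helper: maps a strictly upper triangular y to x = w' * w.
function corr_from_unconstrained(y::AbstractMatrix{<:Real})
    K = size(y, 1)
    z = tanh.(y)
    w = zeros(float(eltype(y)), K, K)   # entries below the diagonal stay zero
    for j in 1:K, i in 1:j
        # Product over the rows above i in column j; an empty product is 1.
        scale = prod(sqrt.(1 .- z[1:i-1, j] .^ 2))
        w[i, j] = i == j ? scale : z[i, j] * scale
    end
    return w' * w
end

y = [0.0 0.4 -1.2;
     0.0 0.0  0.8;
     0.0 0.0  0.0]
x = corr_from_unconstrained(y)
```

Because every column of `w` is a unit vector, `diag(x)` should be all ones up to rounding, and the off-diagonal entries lie in (-1, 1).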
#
Bijectors.Coupling
— Type.
Coupling{F, M}(θ::F, mask::M)
Implements a coupling layer as defined in [1].
Examples
julia> m = PartitionMask(3, [1], [2]) # <= going to use x[2] to parameterize transform of x[1]
PartitionMask{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[1, 1] = 1.0,
[2, 1] = 1.0,
[3, 1] = 1.0)
julia> cl = Coupling(θ -> Shift(θ[1]), m) # <= will do `y[1:1] = x[1:1] + x[2:2]`;
julia> x = [1., 2., 3.];
julia> cl(x)
3-element Array{Float64,1}:
3.0
2.0
3.0
julia> inv(cl)(cl(x))
3-element Array{Float64,1}:
1.0
2.0
3.0
julia> coupling(cl) # get the `Bijector` map `θ -> b(⋅, θ)`
Shift
julia> couple(cl, x) # get the `Bijector` resulting from `x`
Shift{Array{Float64,1},1}([2.0])
References
[1] Kobyzev, I., Prince, S., & Brubaker, M. A. (2019). Normalizing flows: introduction and ideas. CoRR.
#
Bijectors.Inverse
— Type.
inv(b::Bijector)
Inverse(b::Bijector)
A Bijector representing the inverse transform of b.
#
Bijectors.LeakyReLU
— Type.
LeakyReLU{T, N}(α::T) <: Bijector{N}
Defines the invertible mapping
x ↦ x if x ≥ 0 else αx
where α > 0.
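The mapping is invertible because α > 0 preserves the sign of the input, so the branch taken can be recovered from the output. Below is a minimal scalar sketch in plain Julia; the function names are hypothetical and this is not the package's implementation.

```julia
# Forward map: identity on x ≥ 0, scaling by α on x < 0.
leaky(x::Real, α::Real) = x >= 0 ? x : α * x

# Inverse map: since α > 0, the sign of y tells us which branch was used.
leaky_inv(y::Real, α::Real) = y >= 0 ? y : y / α

# log|dy/dx|: slope 1 on the non-negative branch, slope α on the negative branch.
leaky_ladj(x::Real, α::Real) = x >= 0 ? zero(float(x)) : log(α)

α = 0.1
xs = [-2.0, 0.0, 3.5]
ys = leaky.(xs, α)
xs_back = leaky_inv.(ys, α)
ladjs = leaky_ladj.(xs, α)
```

The round trip `leaky_inv.(leaky.(xs, α), α)` should return the original inputs, and the log-determinant is `log(α)` only for negative inputs.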
#
Bijectors.NamedBijector
— Type.
NamedBijector <: AbstractNamedBijector
Wraps a NamedTuple of key -> Bijector pairs, implementing evaluation, inversion, etc.
Examples
julia> using Bijectors: NamedBijector, Scale, Exp
julia> b = NamedBijector((a = Scale(2.0), b = Exp()));
julia> x = (a = 1., b = 0., c = 42.);
julia> b(x)
(a = 2.0, b = 1.0, c = 42.0)
julia> (a = 2 * x.a, b = exp(x.b), c = x.c)
(a = 2.0, b = 1.0, c = 42.0)
#
Bijectors.NamedComposition
— Type.
NamedComposition <: AbstractNamedBijector
Wraps a tuple or array of AbstractNamedBijector and implements their composition.
This is very similar to Composed for Bijector, with the exception that we do not require the inputs to have the same “dimension”, which in this case refers to the symbols for the NamedTuple that this takes as input.
See also: Composed
#
Bijectors.NamedCoupling
— Type.
NamedCoupling{target, deps, F} <: AbstractNamedBijector
Implements a coupling layer for named bijectors.
Examples
julia> using Bijectors: NamedCoupling, Scale
julia> b = NamedCoupling(:b, (:a, :c), (a, c) -> Scale(a + c))
NamedCoupling{:b,(:a, :c),var"#3#4"}(var"#3#4"())
julia> x = (a = 1., b = 2., c = 3.);
julia> b(x)
(a = 1.0, b = 8.0, c = 3.0)
julia> (a = x.a, b = (x.a + x.c) * x.b, c = x.c)
(a = 1.0, b = 8.0, c = 3.0)
#
Bijectors.NamedInverse
— Type.
NamedInverse <: AbstractNamedBijector
Represents the inverse of an AbstractNamedBijector, similarly to Inverse for Bijector.
See also: Inverse
#
Bijectors.PartitionMask
— Type.
PartitionMask{A}(A_1::A, A_2::A, A_3::A) where {A}
This is used to partition and recombine a vector into 3 disjoint “subvectors”.
Implements
- partition(m::PartitionMask, x): partitions x into 3 disjoint “subvectors”
- combine(m::PartitionMask, x_1, x_2, x_3): combines 3 disjoint vectors into a single one
Note that PartitionMask is not a Bijector. It is indeed a bijection, but does not follow the Bijector interface.
Its main use is in Coupling, where we want to partition the input into 3 parts: one part to transform, one part to map into the parameter space of the transform applied to the first part, and a last part that is not used for anything.
Examples
julia> using Bijectors: PartitionMask, partition, combine
julia> m = PartitionMask(3, [1], [2]) # <= assumes input-length 3
PartitionMask{Bool,SparseArrays.SparseMatrixCSC{Bool,Int64}}(
[1, 1] = true,
[2, 1] = true,
[3, 1] = true)
julia> # Partition into 3 parts; the last part is inferred to be indices `[3, ]` from
# the fact that `[1]` and `[2]` do not make up all indices in `1:3`.
x1, x2, x3 = partition(m, [1., 2., 3.])
([1.0], [2.0], [3.0])
julia> # Recombines the partitions into a vector
combine(m, x1, x2, x3)
3-element Array{Float64,1}:
1.0
2.0
3.0
Note that the underlying SparseMatrix is using Bool as the element type. We can also specify this to be some other type using the sp_type keyword:
julia> m = PartitionMask{Float32}(3, [1], [2])
PartitionMask{Float32,SparseArrays.SparseMatrixCSC{Float32,Int64}}(
[1, 1] = 1.0,
[2, 1] = 1.0,
[3, 1] = 1.0)
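The partition and recombine behaviour can be illustrated without the package by working with index vectors directly. This is a sketch under the assumption that plain index sets are an acceptable stand-in for the sparse masks shown above; `partition_sketch` and `combine_sketch` are hypothetical names.

```julia
# Hypothetical index-based partition: the third index set is whatever
# idx1 and idx2 leave uncovered.
function partition_sketch(x::AbstractVector, idx1, idx2)
    idx3 = setdiff(eachindex(x), idx1, idx2)
    return x[idx1], x[idx2], x[idx3], idx3
end

# Hypothetical inverse: scatter the three parts back to their positions.
function combine_sketch(n::Int, idx1, idx2, idx3, x1, x2, x3)
    out = similar(x1, n)
    out[idx1] = x1
    out[idx2] = x2
    out[idx3] = x3
    return out
end

x = [1.0, 2.0, 3.0]
x1, x2, x3, idx3 = partition_sketch(x, [1], [2])
x_back = combine_sketch(length(x), [1], [2], idx3, x1, x2, x3)
```

As in the example above, the third part is inferred from the indices not named in the first two, and recombining returns the original vector.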
#
Bijectors.PartitionMask
— Method.
PartitionMask(n::Int, indices)
Assumes you want to split the vector, where indices refer to the parts of the vector you want to apply the bijector to.
#
Bijectors.Permute
— Type.
Permute{A} <: Bijector{1}
A bijector implementation of a permutation. The permutation is performed using a matrix of type A. There are a couple of different ways to construct Permute:
Permute([0 1; 1 0]) # will map [1, 2] => [2, 1]
Permute([2, 1]) # will map [1, 2] => [2, 1]
Permute(2, 2 => 1, 1 => 2) # will map [1, 2] => [2, 1]
Permute(2, [1, 2] => [2, 1]) # will map [1, 2] => [2, 1]
If this is not clear, the examples might be of help.
Examples
A simple example is permuting a vector of size 3.
julia> b1 = Permute([
0 1 0;
1 0 0;
0 0 1
])
Permute{Array{Int64,2}}([0 1 0; 1 0 0; 0 0 1])
julia> b2 = Permute([2, 1, 3]) # specify all elements at once
Permute{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[2, 1] = 1.0
[1, 2] = 1.0
[3, 3] = 1.0)
julia> b3 = Permute(3, 2 => 1, 1 => 2) # elementwise
Permute{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[2, 1] = 1.0
[1, 2] = 1.0
[3, 3] = 1.0)
julia> b4 = Permute(3, [1, 2] => [2, 1]) # blockwise
Permute{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[2, 1] = 1.0
[1, 2] = 1.0
[3, 3] = 1.0)
julia> b1.A == b2.A == b3.A == b4.A
true
julia> b1([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> b2([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> b3([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> b4([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> inv(b1)
Permute{LinearAlgebra.Transpose{Int64,Array{Int64,2}}}([0 1 0; 1 0 0; 0 0 1])
julia> inv(b1)(b1([1., 2., 3.]))
3-element Array{Float64,1}:
1.0
2.0
3.0
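The inverse shown above is a transpose because permutation matrices are orthogonal. The sketch below builds the matrix from the examples by hand in plain Julia; `perm_matrix` is a hypothetical helper, and the row/column convention it uses is an assumption that happens to agree with the examples because the swap of elements 1 and 2 is symmetric.

```julia
# Hypothetical helper: output slot i takes input element p[i].
function perm_matrix(p::AbstractVector{<:Integer})
    n = length(p)
    A = zeros(Int, n, n)
    for (i, j) in enumerate(p)
        A[i, j] = 1
    end
    return A
end

A = perm_matrix([2, 1, 3])
x = [1.0, 2.0, 3.0]
y = A * x          # apply the permutation
x_back = A' * y    # the transpose undoes it
```

Since the determinant of a permutation matrix is ±1, the log absolute determinant of the Jacobian is zero for any such bijector.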
#
Bijectors.Stacked
— Type.
Stacked(bs)
Stacked(bs, ranges)
stack(bs::Bijector{0}...) # where `0` means 0-dim `Bijector`
A Bijector which stacks bijectors together, which can then be applied to a vector where bs[i]::Bijector is applied to x[ranges[i]]::UnitRange{Int}.
Arguments
- bs can be either a Tuple or an AbstractArray of 0- and/or 1-dimensional bijectors
  - If bs is a Tuple, implementations are type-stable using generated functions
  - If bs is an AbstractArray, implementations are not type-stable and use iterative methods
- ranges needs to be an iterable consisting of UnitRange{Int}
  - length(bs) == length(ranges) needs to be true.
Examples
b1 = Logit(0.0, 1.0)
b2 = Identity{0}()
b = stack(b1, b2)
b([0.0, 1.0]) == [b1(0.0), 1.0] # => true
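The stacking idea, where each map acts on its own slice of the input, can be sketched with ordinary functions in plain Julia. `stacked_apply` and `logit` are hypothetical helpers for illustration, and the input 0.5 is used instead of 0.0 so that the logit stays finite.

```julia
# Logit on the unit interval, standing in for Logit(0.0, 1.0).
logit(x) = log(x / (1 - x))

# Hypothetical helper: apply fs[i] elementwise to x[ranges[i]].
function stacked_apply(fs, ranges, x::AbstractVector)
    @assert length(fs) == length(ranges)
    y = similar(x, float(eltype(x)))
    for (f, r) in zip(fs, ranges)
        y[r] = f.(x[r])
    end
    return y
end

y = stacked_apply((logit, identity), (1:1, 2:2), [0.5, 1.0])
```

Passing the functions as a tuple mirrors the type-stable option described in the arguments; an array of functions would also work here but with less precise types.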