Encoders & Decoders
AutoEncoderToolkit.jl provides a set of predefined encoders and decoders that can be used to define custom (variational) autoencoder architectures.
Encoders
The tree structure of the encoder types looks like this (🧱 represents concrete types):
AbstractEncoder
  AbstractDeterministicEncoder
  AbstractVariationalEncoder
    AbstractGaussianEncoder
      AbstractGaussianLinearEncoder
      AbstractGaussianLogEncoder
Encoder
AutoEncoderToolkit.Encoder — Type
struct Encoder{E<:Union{Flux.Chain,Flux.Dense}} <: AbstractDeterministicEncoder
Default encoder function for deterministic autoencoders. The encoder network is used to map the input data directly into the latent space representation.
Fields
- encoder::Union{Flux.Chain,Flux.Dense}: The primary neural network used to process input data and map it into a latent space representation.
Example
enc = Encoder(Flux.Chain(Dense(784, 400, relu), Dense(400, 20)))
AutoEncoderToolkit.Encoder — Method
(encoder::Encoder)(x)
Forward propagate the input x through the Encoder to obtain the encoded representation in the latent space.
Arguments
- x::Array: Input data to be encoded.
Returns
- z: Encoded representation of the input data in the latent space.
Description
This method allows for a direct call on an instance of Encoder with the input data x. It runs the input through the encoder network and outputs the encoded representation in the latent space.
Example
enc = Encoder(...)
z = enc(some_input)
Note
Ensure that the input x matches the expected dimensionality of the encoder's input layer.
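The dimensionality note can be illustrated without the package. The sketch below is plain Julia, not AutoEncoderToolkit's implementation: `dense` and `encode` are hypothetical stand-ins for a two-layer encoder, and the layer sizes are made up. It assumes the Flux convention that each column of the input is one sample.

```julia
# Hypothetical stand-in for a dense layer: W is (n_out × n_in), b has length n_out.
dense(W, b, f, x) = f.(W * x .+ b)

n_input, n_hidden, n_latent = 6, 4, 2
W1, b1 = randn(Float32, n_hidden, n_input), zeros(Float32, n_hidden)
W2, b2 = randn(Float32, n_latent, n_hidden), zeros(Float32, n_latent)

# Two-layer encoder: relu hidden layer followed by a linear latent layer.
encode(x) = dense(W2, b2, identity, dense(W1, b1, v -> max(v, 0f0), x))

x_batch = randn(Float32, n_input, 5)   # 5 samples, one per column
z_batch = encode(x_batch)              # size (n_latent, 5)

# An input whose first dimension differs from n_input raises a DimensionMismatch.
bad_input = randn(Float32, n_input + 1, 5)
mismatch = try
    encode(bad_input)
    false
catch err
    err isa DimensionMismatch
end
```

The first dimension of the input has to equal the input size of the first layer; the second dimension (the batch) is free.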
JointGaussianEncoder
AutoEncoderToolkit.JointGaussianEncoder — Type
struct JointGaussianEncoder <: AbstractGaussianLinearEncoder
Encoder function for variational autoencoders where the same encoder network is used to map to the latent space mean µ and standard deviation σ.
Fields
- encoder::Flux.Chain: The primary neural network used to process input data and map it into a latent space representation.
- µ::Flux.Dense: A dense layer mapping from the output of the encoder to the mean of the latent space.
- σ::Flux.Dense: A dense layer mapping from the output of the encoder to the standard deviation of the latent space.
Example
enc = JointGaussianEncoder(
Flux.Chain(Dense(784, 400, relu)), Flux.Dense(400, 20), Flux.Dense(400, 20)
)
AutoEncoderToolkit.JointGaussianEncoder — Method
(encoder::JointGaussianEncoder)(x::AbstractArray)
Forward propagate the input x through the JointGaussianEncoder to obtain the mean (µ) and standard deviation (σ) of the latent space.
Arguments
- x::AbstractArray: Input data to be encoded.
Returns
- A NamedTuple (µ=µ, σ=σ,) where:
  - µ: Mean of the latent space after passing the input through the encoder and subsequently through the µ layer.
  - σ: Standard deviation of the latent space after passing the input through the encoder and subsequently through the σ layer.
Description
This method allows for a direct call on an instance of JointGaussianEncoder with the input data x. It first runs the input through the encoder network, then maps the output of the last encoder layer to both the mean and standard deviation of the latent space.
Example
je = JointGaussianEncoder(...)
µ, σ = je(some_input)
Note
Ensure that the input x matches the expected dimensionality of the encoder's input layer.
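Because the call returns a NamedTuple, the result can be unpacked in several ways. The sketch below uses a made-up stand-in function, `fake_encoder`, instead of a real JointGaussianEncoder; only the shape of the return value `(µ=..., σ=...)` is taken from the description above.

```julia
# Hypothetical stand-in that mimics the documented return shape (µ=µ, σ=σ,).
# The values are invented for illustration only.
fake_encoder(x) = (µ = 0.5f0 .* x, σ = abs.(x) .+ 1f0)

x = Float32[-1.0, 2.0]
out = fake_encoder(x)

# Access by field name.
mean_by_name = out.µ
std_by_name = out.σ

# Positional destructuring relies on the field order (µ first, σ second).
µ, σ = out

# Property destructuring (Julia 1.7 or later) matches by name, not by position.
(; σ, µ) = out
```

Name-based access is the safer choice when the same code has to handle encoders that return `σ` and encoders that return `logσ`.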
JointGaussianLogEncoder
AutoEncoderToolkit.JointGaussianLogEncoder — Type
struct JointGaussianLogEncoder <: AbstractGaussianLogEncoder
Default encoder function for variational autoencoders where the same encoder network is used to map to the latent space mean µ and log standard deviation logσ.
Fields
- encoder::Flux.Chain: The primary neural network used to process input data and map it into a latent space representation.
- µ::Union{Flux.Dense,Flux.Chain}: A dense layer or a chain of layers mapping from the output of the encoder to the mean of the latent space.
- logσ::Union{Flux.Dense,Flux.Chain}: A dense layer or a chain of layers mapping from the output of the encoder to the log standard deviation of the latent space.
Example
enc = JointGaussianLogEncoder(
Flux.Chain(Dense(784, 400, relu)), Flux.Dense(400, 20), Flux.Dense(400, 20)
)
AutoEncoderToolkit.JointGaussianLogEncoder — Method
(encoder::JointGaussianLogEncoder)(x)
This method forward propagates the input x through the JointGaussianLogEncoder to compute the mean (µ) and log standard deviation (logσ) of the latent space.
Arguments
- x::Array{Float32}: The input data to be encoded.
Returns
- A NamedTuple (µ=µ, logσ=logσ,) where:
  - µ: The mean of the latent space. This is computed by passing the input through the encoder and subsequently through the µ layer.
  - logσ: The log standard deviation of the latent space. This is computed by passing the input through the encoder and subsequently through the logσ layer.
Description
This method allows for a direct call on an instance of JointGaussianLogEncoder with the input data x. It first processes the input through the encoder network, then maps the output of the last encoder layer to both the mean and log standard deviation of the latent space.
Example
je = JointGaussianLogEncoder(...)
µ, logσ = je(some_input)
Note
Ensure that the input x matches the expected dimensionality of the encoder's input layer.
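One reason to predict logσ rather than σ is that any real-valued output is valid: exponentiating always gives a positive standard deviation. The sketch below shows how the two outputs are typically combined to draw a latent sample. It is plain Julia with made-up values for µ and logσ, and it is an assumption about typical downstream use, not a description of the package's own sampling code.

```julia
using Random

# Assumed encoder outputs for a 2-dimensional latent space and 3 samples (columns).
µ = Float32[0.0 1.0 -1.0; 0.5 0.5 0.5]
logσ = Float32[0.0 -1.0 -2.0; 0.0 0.0 0.0]

# exp maps any real logσ to a strictly positive standard deviation.
σ = exp.(logσ)

# Reparameterized sample: z = µ + σ .* ε with ε drawn from a standard normal.
rng = MersenneTwister(42)
ε = randn(rng, Float32, size(µ))
z = µ .+ σ .* ε
```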
Decoders
The tree structure of the decoder types looks like this (🧱 represents concrete types):
AbstractDecoder
  AbstractDeterministicDecoder
  AbstractVariationalDecoder
    BernoulliDecoder 🧱
    CategoricalDecoder 🧱
    AbstractGaussianDecoder
      SimpleGaussianDecoder 🧱
      AbstractGaussianLinearDecoder
      AbstractGaussianLogDecoder
Decoder
AutoEncoderToolkit.Decoder — Type
struct Decoder{D<:Flux.Chain} <: AbstractDeterministicDecoder
Default decoder function for deterministic autoencoders. The decoder network is used to map the latent space representation directly back to the original data space.
Fields
- decoder::Flux.Chain: The primary neural network used to process the latent space representation and map it back to the data space.
Example
dec = Decoder(Flux.Chain(Dense(20, 400, relu), Dense(400, 784)))
AutoEncoderToolkit.Decoder — Method
(decoder::Decoder)(z::AbstractArray)
Forward propagate the encoded representation z through the Decoder to obtain the reconstructed input data.
Arguments
- z::AbstractArray: Encoded representation in the latent space.
Returns
- x_reconstructed: Reconstructed version of the original input data after decoding from the latent space.
Description
This method allows for a direct call on an instance of Decoder with the encoded data z. It runs the encoded representation through the decoder network and outputs the reconstructed version of the original input data.
Example
dec = Decoder(...)
x_reconstructed = dec(encoded_representation)
Note
Ensure that the input z matches the expected dimensionality of the decoder's input layer.
BernoulliDecoder
AutoEncoderToolkit.BernoulliDecoder — Type
BernoulliDecoder{D<:Flux.Chain} <: AbstractVariationalDecoder
A decoder structure for variational autoencoders (VAEs) that models the output data as a Bernoulli distribution. This is typically used when the outputs of the decoder are probabilities.
Fields
- decoder::Flux.Chain: The primary neural network used to process the latent space and map it to the output (or reconstructed) space.
Description
BernoulliDecoder represents a VAE decoder that models the output data as a Bernoulli distribution. It's commonly used when the outputs of the decoder are probabilities, such as in a binary classification task or when modeling binary data. Unlike a Gaussian decoder, there's no need for separate paths or operations on the mean or log standard deviation.
Note
Ensure the last layer of the decoder outputs a value between 0 and 1, as this is required for a Bernoulli distribution.
AutoEncoderToolkit.BernoulliDecoder — Method
(decoder::BernoulliDecoder)(z::AbstractArray)
Maps the given latent representation z through the BernoulliDecoder network to reconstruct the original input.
Arguments
- z::AbstractArray: The latent space representation to be decoded. This can be a vector or a matrix, where each column represents a separate sample from the latent space of a VAE.
Returns
- A NamedTuple (p=p,) where p is an array representing the output of the decoder, which should resemble the original input to the VAE (post encoding and sampling from the latent space).
Description
This function processes the latent space representation z using the neural network defined in the BernoulliDecoder struct. The aim is to decode or reconstruct the original input from this representation.
Note
Ensure that the latent space representation z matches the expected input dimensionality for the BernoulliDecoder.
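The requirement that the last layer outputs values between 0 and 1 follows from how p is used: it is the parameter of a Bernoulli distribution. The following plain-Julia sketch, with made-up numbers and a locally defined `sigmoid` (not the package's or Flux's function), shows a sigmoid output layer producing valid probabilities and how they would score binary data.

```julia
# Logistic sigmoid squashes any real number into (0, 1).
sigmoid(a) = 1 / (1 + exp(-a))

raw_output = [-2.0, 0.0, 3.0]   # made-up pre-activation values of the last layer
p = sigmoid.(raw_output)        # valid Bernoulli parameters

x = [0.0, 1.0, 1.0]             # binary data to score

# Bernoulli log-likelihood: sum of x*log(p) + (1 - x)*log(1 - p) over all entries.
loglik = sum(@. x * log(p) + (1 - x) * log(1 - p))
```

Without the squashing activation, `log(p)` or `log(1 - p)` could receive a negative argument and the likelihood would be undefined.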
CategoricalDecoder
AutoEncoderToolkit.CategoricalDecoder — Type
CategoricalDecoder{D<:Flux.Chain} <: AbstractVariationalDecoder
A decoder structure for variational autoencoders (VAEs) that models the output data as a categorical distribution. This is typically used when the outputs of the decoder are categorical variables encoded as one-hot vectors.
Fields
- decoder::Flux.Chain: The primary neural network used to process the latent space and map it to the output (or reconstructed) space.
Description
CategoricalDecoder represents a VAE decoder that models the output data as a categorical distribution. It's commonly used when the outputs of the decoder are categorical variables, such as multi-class one-hot encoded vectors. Unlike a Gaussian decoder, there's no need for separate paths or operations on the mean or log standard deviation.
Note
Ensure the last layer of the decoder outputs a probability distribution over the categories, as this is required for a categorical distribution. This can be done using a softmax activation function, for example.
AutoEncoderToolkit.CategoricalDecoder — Method
(decoder::CategoricalDecoder)(z::AbstractArray)
Maps the given latent representation z through the CategoricalDecoder network to reconstruct the original input.
Arguments
- z::AbstractArray: The latent space representation to be decoded. This can be a vector or a matrix, where each column represents a separate sample from the latent space of a VAE.
Returns
- A NamedTuple (p=p,) where p is an array representing the output of the decoder, which should resemble the original input to the VAE (post encoding and sampling from the latent space).
Description
This function processes the latent space representation z using the neural network defined in the CategoricalDecoder struct. The aim is to decode or reconstruct the original input from this representation.
Note
Ensure that the latent space representation z matches the expected input dimensionality for the CategoricalDecoder.
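The softmax mentioned in the type's note turns arbitrary real outputs into a probability distribution over categories. A minimal plain-Julia sketch follows; `softmax_cols` is a hypothetical helper written for this illustration (Flux ships its own `softmax`), and it assumes categories run along rows with one sample per column.

```julia
# Column-wise softmax; subtracting the column maximum avoids overflow in exp.
function softmax_cols(a::AbstractMatrix)
    shifted = a .- maximum(a; dims=1)
    e = exp.(shifted)
    return e ./ sum(e; dims=1)
end

logits = [1.0 0.0; 2.0 0.0; 3.0 0.0]   # 3 categories (rows), 2 samples (columns)
p = softmax_cols(logits)

# Each column is now a probability distribution over the 3 categories.
column_sums = sum(p; dims=1)
```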
SimpleGaussianDecoder
AutoEncoderToolkit.SimpleGaussianDecoder — Type
SimpleGaussianDecoder{D} <: AbstractGaussianDecoder
A straightforward decoder structure for variational autoencoders (VAEs) that contains only a single decoder network.
Fields
- decoder::Flux.Chain: The primary neural network used to process the latent space and map it to the output (or reconstructed) space.
Description
SimpleGaussianDecoder represents a basic VAE decoder without explicit components for the latent space's mean (µ) or log standard deviation (logσ). It's commonly used when the VAE's latent space distribution is implicitly defined, and there's no need for separate paths or operations on the mean or log standard deviation.
AutoEncoderToolkit.SimpleGaussianDecoder — Method
(decoder::SimpleGaussianDecoder)(z::AbstractVecOrMat)
Maps the given latent representation z through the SimpleGaussianDecoder network to reconstruct the original input.
Arguments
- z::AbstractArray: The latent space representation to be decoded. This can be a vector or a matrix, where each column represents a separate sample from the latent space of a VAE.
Returns
- A NamedTuple (µ=µ,) where µ is an array representing the output of the decoder, which should resemble the original input to the VAE (post encoding and sampling from the latent space).
Description
This function processes the latent space representation z using the neural network defined in the SimpleGaussianDecoder struct. The aim is to decode or reconstruct the original input from this representation.
Example
decoder = SimpleGaussianDecoder(...)
z = ... # some latent space representation
output = decoder(z)
Note
Ensure that the latent space representation z matches the expected input dimensionality for the SimpleGaussianDecoder.
JointGaussianDecoder
AutoEncoderToolkit.JointGaussianDecoder — Type
JointGaussianDecoder{D<:Flux.Chain,L<:Flux.Dense} <: AbstractGaussianLinearDecoder
An extended decoder structure for VAEs that incorporates separate layers for mapping from the latent space to both its mean (µ) and standard deviation (σ).
Fields
- decoder::Flux.Chain: The primary neural network used to process the latent space before determining its mean and standard deviation.
- µ::Flux.Dense: A dense layer that maps from the output of the decoder to the mean of the latent space.
- σ::Flux.Dense: A dense layer that maps from the output of the decoder to the standard deviation of the latent space.
Description
JointGaussianDecoder is tailored for VAE architectures where the same decoder network is used initially, and then splits into two separate paths for determining both the mean and standard deviation of the latent space.
AutoEncoderToolkit.JointGaussianDecoder — Method
(decoder::JointGaussianDecoder)(z::AbstractArray)
Maps the given latent representation z through the JointGaussianDecoder network to produce both the mean (µ) and standard deviation (σ).
Arguments
- z::AbstractArray: The latent space representation to be decoded. If array, the last dimension contains each of the latent space representations to be decoded.
Returns
- A NamedTuple (µ=µ, σ=σ,) where:
  - µ::AbstractArray: The mean representation obtained from the decoder.
  - σ::AbstractArray: The standard deviation representation obtained from the decoder.
Description
This function processes the latent space representation z using the primary neural network of the JointGaussianDecoder struct. It then separately maps the output of this network to the mean and standard deviation using the µ and σ dense layers, respectively.
Example
decoder = JointGaussianDecoder(...)
z = ... # some latent space representation
output = decoder(z)
Note
Ensure that the latent space representation z matches the expected input dimensionality for the JointGaussianDecoder.
JointGaussianLogDecoder
AutoEncoderToolkit.JointGaussianLogDecoder — Type
JointGaussianLogDecoder{D<:Flux.Chain,L<:Flux.Dense} <: AbstractGaussianLogDecoder
An extended decoder structure for VAEs that incorporates separate layers for mapping from the latent space to both its mean (µ) and log standard deviation (logσ).
Fields
- decoder::Flux.Chain: The primary neural network used to process the latent space before determining its mean and log standard deviation.
- µ::Flux.Dense: A dense layer that maps from the output of the decoder to the mean of the latent space.
- logσ::Flux.Dense: A dense layer that maps from the output of the decoder to the log standard deviation of the latent space.
Description
JointGaussianLogDecoder is tailored for VAE architectures where the same decoder network is used initially, and then splits into two separate paths for determining both the mean and log standard deviation of the latent space.
AutoEncoderToolkit.JointGaussianLogDecoder — Method
(decoder::JointGaussianLogDecoder)(z::AbstractArray)
Maps the given latent representation z through the JointGaussianLogDecoder network to produce both the mean (µ) and log standard deviation (logσ).
Arguments
- z::AbstractArray: The latent space representation to be decoded. If array, the last dimension contains each of the latent space representations.
Returns
- A NamedTuple (µ=µ, logσ=logσ,) where:
  - µ::Array: The mean representation obtained from the decoder.
  - logσ::Array: The log standard deviation representation obtained from the decoder.
Description
This function processes the latent space representation z using the primary neural network of the JointGaussianLogDecoder struct. It then separately maps the output of this network to the mean and log standard deviation using the µ and logσ dense layers, respectively.
Example
decoder = JointGaussianLogDecoder(...)
z = ... # some latent space representation
output = decoder(z)
Note
Ensure that the latent space representation z matches the expected input dimensionality for the JointGaussianLogDecoder.
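The pair (µ, logσ) returned by the decoder parameterizes a diagonal Gaussian over the data. As a rough illustration of how such outputs can be scored against an observation, the plain-Julia sketch below evaluates the Gaussian log-density; `gaussian_loglik` is a hypothetical helper written for this example and the numbers are made up, so it should not be read as the package's loss function.

```julia
# Diagonal Gaussian log-likelihood, summed over all entries:
# log N(x; µ, σ²) = -0.5*log(2π) - logσ - 0.5*((x - µ)/σ)^2
function gaussian_loglik(x, µ, logσ)
    σ = exp.(logσ)
    return sum(@. -0.5 * log(2π) - logσ - 0.5 * ((x - µ) / σ)^2)
end

x = [0.0, 1.0]
µ = [0.0, 1.0]       # decoder mean equal to the data
logσ = [0.0, 0.0]    # unit standard deviation

ll = gaussian_loglik(x, µ, logσ)

# Moving the mean away from the data lowers the log-likelihood.
ll_shifted = gaussian_loglik(x, µ .+ 1.0, logσ)
```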
SplitGaussianDecoder
AutoEncoderToolkit.SplitGaussianDecoder — Type
SplitGaussianDecoder{D<:Flux.Chain} <: AbstractGaussianLinearDecoder
A specialized decoder structure for VAEs that uses distinct neural networks for determining the mean (µ) and standard deviation (σ) of the latent space.
Fields
- decoder_µ::Flux.Chain: A neural network dedicated to processing the latent space and mapping it to its mean.
- decoder_σ::Flux.Chain: A neural network dedicated to processing the latent space and mapping it to its standard deviation.
Description
SplitGaussianDecoder is designed for VAE architectures where separate decoder networks are preferred for computing the mean and standard deviation, ensuring that each has its own distinct set of parameters and transformation logic.
AutoEncoderToolkit.SplitGaussianDecoder — Method
(decoder::SplitGaussianDecoder)(z::AbstractArray)
Maps the given latent representation z through the separate networks of the SplitGaussianDecoder to produce both the mean (µ) and standard deviation (σ).
Arguments
- z::AbstractArray: The latent space representation to be decoded. If array, the last dimension contains each of the latent space representations to be decoded.
Returns
- A NamedTuple (µ=µ, σ=σ,) where:
  - µ::AbstractArray: The mean representation obtained using the dedicated decoder_µ network.
  - σ::AbstractArray: The standard deviation representation obtained using the dedicated decoder_σ network.
Description
This function processes the latent space representation z through two distinct neural networks within the SplitGaussianDecoder struct. The decoder_µ network is used to produce the mean representation, while the decoder_σ network is used for the standard deviation.
Example
decoder = SplitGaussianDecoder(...)
z = ... # some latent space representation
output = decoder(z)
Note
Ensure that the latent space representation z matches the expected input dimensionality for both networks in the SplitGaussianDecoder.
SplitGaussianLogDecoder
AutoEncoderToolkit.SplitGaussianLogDecoder — Type
SplitGaussianLogDecoder{D<:Flux.Chain} <: AbstractGaussianLogDecoder
A specialized decoder structure for VAEs that uses distinct neural networks for determining the mean (µ) and log standard deviation (logσ) of the latent space.
Fields
- decoder_µ::Flux.Chain: A neural network dedicated to processing the latent space and mapping it to its mean.
- decoder_logσ::Flux.Chain: A neural network dedicated to processing the latent space and mapping it to its log standard deviation.
Description
SplitGaussianLogDecoder is designed for VAE architectures where separate decoder networks are preferred for computing the mean and log standard deviation, ensuring that each has its own distinct set of parameters and transformation logic.
AutoEncoderToolkit.SplitGaussianLogDecoder — Method
(decoder::SplitGaussianLogDecoder)(z::AbstractArray)
Maps the given latent representation z through the separate networks of the SplitGaussianLogDecoder to produce both the mean (µ) and log standard deviation (logσ).
Arguments
- z::AbstractArray: The latent space representation to be decoded. If array, the last dimension contains each of the latent space representations to be decoded.
Returns
- A NamedTuple (µ=µ, logσ=logσ,) where:
  - µ::AbstractArray: The mean representation obtained using the dedicated decoder_µ network.
  - logσ::AbstractArray: The log standard deviation representation obtained using the dedicated decoder_logσ network.
Description
This function processes the latent space representation z through two distinct neural networks within the SplitGaussianLogDecoder struct. The decoder_µ network is used to produce the mean representation, while the decoder_logσ network is used for the log standard deviation.
Example
decoder = SplitGaussianLogDecoder(...)
z = ... # some latent space representation
output = decoder(z)
Note
Ensure that the latent space representation z matches the expected input dimensionality for both networks in the SplitGaussianLogDecoder.
Default initializations
The package provides a set of functions to initialize encoder and decoder architectures. Although they give the user less flexibility, they can be useful for quick prototyping.
Encoder initializations
AutoEncoderToolkit.Encoder — Method
Encoder(n_input, n_latent, latent_activation, encoder_neurons,
        encoder_activation; init=Flux.glorot_uniform)
Construct and initialize an Encoder struct that defines an encoder network for a deterministic autoencoder.
Arguments
- n_input::Int: The dimensionality of the input data.
- n_latent::Int: The dimensionality of the latent space.
- encoder_neurons::Vector{<:Int}: A vector specifying the number of neurons in each layer of the encoder network.
- encoder_activation::Vector{<:Function}: Activation functions corresponding to each layer in encoder_neurons.
- latent_activation::Function: Activation function for the latent space layer.
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: The initialization function used for the neural network weights.
Returns
- An Encoder struct initialized based on the provided arguments.
Examples
encoder = Encoder(784, 20, tanh, [400], [relu])
Notes
The length of encoder_neurons should match the length of encoder_activation, ensuring that each layer in the encoder has a corresponding activation function.
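To make the role of the arguments concrete, the sketch below lists the layer sizes such a call would imply. It is plain Julia and rests on an assumption not stated above: that the network chains n_input, then each entry of encoder_neurons, then n_latent. `layer_sizes` is a hypothetical helper, not part of the package, and `identity` stands in for a real activation function.

```julia
# Hypothetical helper: list the (in => out) size of every dense layer and
# enforce the length requirement described in the note.
function layer_sizes(n_input, n_latent, encoder_neurons, encoder_activation)
    length(encoder_neurons) == length(encoder_activation) ||
        throw(ArgumentError("encoder_neurons and encoder_activation must have equal length"))
    dims = [n_input; encoder_neurons; n_latent]
    return [dims[i] => dims[i + 1] for i in 1:length(dims) - 1]
end

# Same sizes as the example above: one hidden layer of 400 units.
sizes = layer_sizes(784, 20, [400], [identity])
```

With two hidden layers but only one activation function, the helper throws an ArgumentError, which mirrors the kind of mismatch the note warns about.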
AutoEncoderToolkit.JointGaussianLogEncoder — Method
JointGaussianLogEncoder(n_input, n_latent, encoder_neurons, encoder_activation,
                        latent_activation; init=Flux.glorot_uniform)
Construct and initialize a JointGaussianLogEncoder struct that defines an encoder network for a variational autoencoder.
Arguments
- n_input::Int: The dimensionality of the input data.
- n_latent::Int: The dimensionality of the latent space.
- encoder_neurons::Vector{<:Int}: A vector specifying the number of neurons in each layer of the encoder network.
- encoder_activation::Vector{<:Function}: Activation functions corresponding to each layer in encoder_neurons.
- latent_activation::Function: Activation function for the latent space layers (both µ and logσ).
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: The initialization function used for the neural network weights.
Returns
- A JointGaussianLogEncoder struct initialized based on the provided arguments.
Examples
encoder = JointGaussianLogEncoder(784, 20, [400], [relu], tanh)
Notes
The length of encoder_neurons should match the length of encoder_activation, ensuring that each layer in the encoder has a corresponding activation function.
AutoEncoderToolkit.JointGaussianLogEncoder — Method
JointGaussianLogEncoder(n_input, n_latent, encoder_neurons, encoder_activation,
                        latent_activation; init=Flux.glorot_uniform)
Construct and initialize a JointGaussianLogEncoder struct that defines an encoder network for a variational autoencoder.
Arguments
- n_input::Int: The dimensionality of the input data.
- n_latent::Int: The dimensionality of the latent space.
- encoder_neurons::Vector{<:Int}: A vector specifying the number of neurons in each layer of the encoder network.
- encoder_activation::Vector{<:Function}: Activation functions corresponding to each layer in encoder_neurons.
- latent_activation::Vector{<:Function}: Activation functions for the latent space layers (both µ and logσ).
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: The initialization function used for the neural network weights.
Returns
- A JointGaussianLogEncoder struct initialized based on the provided arguments.
Examples
encoder = JointGaussianLogEncoder(784, 20, [400], [relu], [tanh, identity])
Notes
The length of encoder_neurons should match the length of encoder_activation, ensuring that each layer in the encoder has a corresponding activation function.
AutoEncoderToolkit.JointGaussianEncoder — Method
JointGaussianEncoder(n_input, n_latent, encoder_neurons, encoder_activation,
                     latent_activation; init=Flux.glorot_uniform)
Construct and initialize a JointGaussianEncoder struct that defines an encoder network for a variational autoencoder.
Arguments
- n_input::Int: The dimensionality of the input data.
- n_latent::Int: The dimensionality of the latent space.
- encoder_neurons::Vector{<:Int}: A vector specifying the number of neurons in each layer of the encoder network.
- encoder_activation::Vector{<:Function}: Activation functions corresponding to each layer in encoder_neurons.
- latent_activation::Vector{<:Function}: Activation functions for the latent space layers. This vector must contain the activation for both µ and σ.
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: The initialization function used for the neural network weights.
Returns
- A JointGaussianEncoder struct initialized based on the provided arguments.
Examples
encoder = JointGaussianEncoder(784, 20, [400], [relu], [tanh, softplus])
Notes
The length of encoder_neurons should match the length of encoder_activation, ensuring that each layer in the encoder has a corresponding activation function.
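The example passes softplus as the second latent activation. A plausible reading, assumed here rather than stated in the docstring, is that the second entry applies to the σ layer, where a positive output is required because σ is a standard deviation. The plain-Julia sketch below defines softplus locally (Flux provides its own) and checks that property on made-up inputs.

```julia
# softplus(a) = log(1 + exp(a)) is strictly positive for every real input.
softplus(a) = log1p(exp(a))

raw = [-5.0, 0.0, 5.0]   # made-up pre-activation values of the σ layer
σ = softplus.(raw)

# For comparison, tanh can return negative values, which are not valid
# standard deviations.
tanh_out = tanh.(raw)
```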
Decoder initializations
AutoEncoderToolkit.Decoder — Method
Decoder(n_input, n_latent, decoder_neurons, decoder_activation,
        output_activation; init=Flux.glorot_uniform)
Construct and initialize a Decoder struct that defines a decoder network for a deterministic autoencoder.
Arguments
- n_input::Int: The dimensionality of the output data (which typically matches the input data dimensionality of the autoencoder).
- n_latent::Int: The dimensionality of the latent space.
- decoder_neurons::Vector{<:Int}: A vector specifying the number of neurons in each layer of the decoder network.
- decoder_activation::Vector{<:Function}: Activation functions corresponding to each layer in decoder_neurons.
- output_activation::Function: Activation function for the final output layer.
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: The initialization function used for the neural network weights.
Returns
- A Decoder struct initialized based on the provided arguments.
Examples
decoder = Decoder(784, 20, [400], [relu], sigmoid)
Notes
The length of decoder_neurons should match the length of decoder_activation, ensuring that each layer in the decoder has a corresponding activation function.
AutoEncoderToolkit.SimpleGaussianDecoder — Method
SimpleGaussianDecoder(
    n_input, n_latent, decoder_neurons,
    decoder_activation, output_activation;
    init=Flux.glorot_uniform
)
Constructs and initializes a SimpleGaussianDecoder object designed for variational autoencoders (VAEs). This function sets up a straightforward decoder network that maps from a latent space to an output space.
Arguments
- n_input::Int: Dimensionality of the output data (or the data to be reconstructed).
- n_latent::Int: Dimensionality of the latent space.
- decoder_neurons::Vector{<:Int}: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.
- decoder_activation::Vector{<:Function}: Activation functions for each decoder layer, not including the final output layer.
- output_activation::Function: Activation function for the final output layer.
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: Initialization function for the network parameters.
Returns
A SimpleGaussianDecoder object with the specified architecture and initialized weights.
Description
This function constructs a SimpleGaussianDecoder object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.
The function ensures that an activation function is provided for each layer in decoder_neurons and checks for potential mismatches in length.
Example
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = sigmoid
decoder = SimpleGaussianDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
Note
Ensure that the lengths of decoder_neurons and decoder_activation match, excluding the output layer.
AutoEncoderToolkit.JointGaussianLogDecoder — Method
JointGaussianLogDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
                        output_activation; init=Flux.glorot_uniform)
Constructs and initializes a JointGaussianLogDecoder object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (µ) and log standard deviation (logσ).
Arguments
- n_input::Int: Dimensionality of the output data (or the data to be reconstructed).
- n_latent::Int: Dimensionality of the latent space.
- decoder_neurons::Vector{<:Int}: Vector of layer sizes for the primary decoder network, not including the input latent layer.
- decoder_activation::Vector{<:Function}: Activation functions for each primary decoder layer.
- output_activation::Function: Activation function for the mean (µ) and log standard deviation (logσ) layers.
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: Initialization function for the network parameters.
Returns
A JointGaussianLogDecoder object with the specified architecture and initialized weights.
Description
This function constructs a JointGaussianLogDecoder object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (µ) and log standard deviation (logσ).
Example
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = tanh
decoder = JointGaussianLogDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
Note
Ensure that the lengths of decoder_neurons and decoder_activation match.
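The architecture described above, a shared trunk followed by two heads, can be spelled out as layer sizes. The sketch below is plain Julia using the values from the example. It assumes that both heads map from the last trunk layer to n_input, which follows from the description but is not confirmed against the package source; all variable names are chosen for this illustration.

```julia
n_input, n_latent = 28 * 28, 64
decoder_neurons = [128, 256]

# Shared trunk: the latent space followed by the hidden layers.
trunk_dims = [n_latent; decoder_neurons]
trunk = [trunk_dims[i] => trunk_dims[i + 1] for i in 1:length(trunk_dims) - 1]

# Two heads of identical shape branch off the last trunk layer,
# one for the mean and one for the log standard deviation.
head_µ = last(decoder_neurons) => n_input
head_logσ = last(decoder_neurons) => n_input
```

Only the heads see output_activation; the trunk layers use the entries of decoder_activation, which is why those two vectors must have the same length.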
AutoEncoderToolkit.JointGaussianLogDecoder — Method
JointGaussianLogDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
                        output_activation; init=Flux.glorot_uniform)
Constructs and initializes a JointGaussianLogDecoder object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (µ) and log standard deviation (logσ).
Arguments
- n_input::Int: Dimensionality of the output data (or the data to be reconstructed).
- n_latent::Int: Dimensionality of the latent space.
- decoder_neurons::Vector{<:Int}: Vector of layer sizes for the primary decoder network, not including the input latent layer.
- decoder_activation::Vector{<:Function}: Activation functions for each primary decoder layer.
- output_activation::Vector{<:Function}: Activation functions for the mean (µ) and log standard deviation (logσ) layers.
Optional Keyword Arguments
- init::Function=Flux.glorot_uniform: Initialization function for the network parameters.
Returns
A JointGaussianLogDecoder object with the specified architecture and initialized weights.
Description
This function constructs a JointGaussianLogDecoder object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (µ) and log standard deviation (logσ).
Example
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = [tanh, identity]
decoder = JointGaussianLogDecoder(
    n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
Note
Ensure that the lengths of decoder_neurons and decoder_activation match.
AutoEncoderToolkit.JointGaussianDecoder
— MethodJointGaussianDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
Constructs and initializes a JointGaussianDecoder
object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (µ) and standard deviation (σ).
Arguments
n_input::Int
: Dimensionality of the output data (or the data to be reconstructed).
n_latent::Int
: Dimensionality of the latent space.
decoder_neurons::Vector{<:Int}
: Vector of layer sizes for the primary decoder network, not including the input latent layer.
decoder_activation::Vector{<:Function}
: Activation functions for each primary decoder layer.
output_activation::Function
: Activation function for the mean (µ) and standard deviation (σ) layers.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A JointGaussianDecoder
object with the specified architecture and initialized weights.
Description
This function constructs a JointGaussianDecoder
object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (µ
) and standard deviation (σ
).
Example
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = tanh
decoder = JointGaussianDecoder(
n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
Note
Ensure that the lengths of decoder_neurons and decoder_activation match.
AutoEncoderToolkit.JointGaussianDecoder
— MethodJointGaussianDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
Constructs and initializes a JointGaussianDecoder
object for variational autoencoders (VAEs). This function sets up a decoder network that first processes the latent space and then maps it separately to both its mean (µ
) and standard deviation (σ
).
Arguments
n_input::Int
: Dimensionality of the output data (or the data to be reconstructed).
n_latent::Int
: Dimensionality of the latent space.
decoder_neurons::Vector{<:Int}
: Vector of layer sizes for the primary decoder network, not including the input latent layer.
decoder_activation::Vector{<:Function}
: Activation functions for each primary decoder layer.
output_activation::Vector{<:Function}
: Activation functions for the mean (µ) and standard deviation (σ) layers, in that order.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A JointGaussianDecoder
object with the specified architecture and initialized weights.
Description
This function constructs a JointGaussianDecoder
object, setting up its primary decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space and goes through a sequence of middle layers if specified. After processing the latent space through the primary decoder, it then maps separately to both its mean (µ
) and standard deviation (σ
).
Example
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = [tanh, softplus]
decoder = JointGaussianDecoder(
    n_input, n_latent, decoder_neurons, decoder_activation, output_activation
)
Note
Ensure that the lengths of decoder_neurons and decoder_activation match.
AutoEncoderToolkit.SplitGaussianLogDecoder
— MethodSplitGaussianLogDecoder(n_input, n_latent, µ_neurons, µ_activation, logσ_neurons,
logσ_activation; init=Flux.glorot_uniform)
Constructs and initializes a SplitGaussianLogDecoder
object for variational autoencoders (VAEs). This function sets up two distinct decoder networks, one dedicated for determining the mean (µ
) and the other for the log standard deviation (logσ
) of the latent space.
Arguments
n_input::Int
: Dimensionality of the output data (or the data to be reconstructed).
n_latent::Int
: Dimensionality of the latent space.
µ_neurons::Vector{<:Int}
: Vector of layer sizes for the µ decoder network, not including the input latent layer.
µ_activation::Vector{<:Function}
: Activation functions for each µ decoder layer.
logσ_neurons::Vector{<:Int}
: Vector of layer sizes for the logσ decoder network, not including the input latent layer.
logσ_activation::Vector{<:Function}
: Activation functions for each logσ decoder layer.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A SplitGaussianLogDecoder
object with two distinct networks initialized with the specified architectures and weights.
Description
This function constructs a SplitGaussianLogDecoder
object, setting up two separate decoder networks based on the provided specifications. The first network, dedicated to determining the mean (µ
), and the second for the log standard deviation (logσ
), both begin with a dense layer mapping from the latent space and go through a sequence of middle layers if specified.
Example
n_input = 28*28
n_latent = 64
µ_neurons = [128, 256]
µ_activation = [relu, relu]
logσ_neurons = [128, 256]
logσ_activation = [relu, relu]
decoder = SplitGaussianLogDecoder(
    n_input, n_latent, µ_neurons, µ_activation, logσ_neurons, logσ_activation
)
Notes
- Ensure that the lengths of µ_neurons with µ_activation and logσ_neurons with logσ_activation match respectively.
- If µ_neurons[end] or logσ_neurons[end] do not match n_input, the function automatically changes this number to match the right dimensionality.
AutoEncoderToolkit.SplitGaussianDecoder
— MethodSplitGaussianDecoder(n_input, n_latent, µ_neurons, µ_activation, σ_neurons,
σ_activation; init=Flux.glorot_uniform)
Constructs and initializes a SplitGaussianDecoder
object for variational autoencoders (VAEs). This function sets up two distinct decoder networks, one dedicated for determining the mean (µ
) and the other for the standard deviation (σ
) of the latent space.
Arguments
n_input::Int
: Dimensionality of the output data (or the data to be reconstructed).
n_latent::Int
: Dimensionality of the latent space.
µ_neurons::Vector{<:Int}
: Vector of layer sizes for the µ decoder network, not including the input latent layer.
µ_activation::Vector{<:Function}
: Activation functions for each µ decoder layer.
σ_neurons::Vector{<:Int}
: Vector of layer sizes for the σ decoder network, not including the input latent layer.
σ_activation::Vector{<:Function}
: Activation functions for each σ decoder layer.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A SplitGaussianDecoder
object with two distinct networks initialized with the specified architectures and weights.
Description
This function constructs a SplitGaussianDecoder
object, setting up two separate decoder networks based on the provided specifications. The first network, dedicated to determining the mean (µ
), and the second for the standard deviation (σ
), both begin with a dense layer mapping from the latent space and go through a sequence of middle layers if specified.
Example
n_input = 28*28
n_latent = 64
µ_neurons = [128, 256]
µ_activation = [relu, relu]
σ_neurons = [128, 256]
σ_activation = [relu, softplus]
decoder = SplitGaussianDecoder(
    n_input, n_latent, µ_neurons, µ_activation, σ_neurons, σ_activation
)
Notes
- Ensure that the lengths of µ_neurons with µ_activation and σ_neurons with σ_activation match respectively.
- If µ_neurons[end] or σ_neurons[end] do not match n_input, the function automatically changes this number to match the right dimensionality.
- Ensure that σ_activation[end] maps to a positive value. Activation functions such as softplus are needed to guarantee the positivity of the standard deviation.
AutoEncoderToolkit.BernoulliDecoder
— Method BernoulliDecoder(n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform)
Constructs and initializes a BernoulliDecoder
object designed for variational autoencoders (VAEs). This function sets up a decoder network that maps from a latent space to an output space.
Arguments
n_input::Int
: Dimensionality of the output data (or the data to be reconstructed).
n_latent::Int
: Dimensionality of the latent space.
decoder_neurons::Vector{<:Int}
: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.
decoder_activation::Vector{<:Function}
: Activation functions for each decoder layer, not including the final output layer.
output_activation::Function
: Activation function for the final output layer.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A BernoulliDecoder
object with the specified architecture and initialized weights.
Description
This function constructs a BernoulliDecoder
object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.
The function ensures that there are appropriate activation functions provided for each layer in the decoder_neurons
and checks for potential mismatches in length.
Example
n_input = 28*28
n_latent = 64
decoder_neurons = [128, 256]
decoder_activation = [relu, relu]
output_activation = sigmoid
decoder = BernoulliDecoder(
n_input,
n_latent,
decoder_neurons,
decoder_activation,
output_activation
)
Note
Ensure that the lengths of decoder_neurons and decoder_activation match, excluding the output layer. Also, the output activation function should return values between 0 and 1, as the decoder models the output data as a Bernoulli distribution.
AutoEncoderToolkit.CategoricalDecoder
— Method CategoricalDecoder(
size_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform
)
Constructs and initializes a CategoricalDecoder
object designed for variational autoencoders (VAEs). This function sets up a decoder network that maps from a latent space to an output space.
Arguments
size_input::AbstractVector{<:Int}
: Dimensionality of the output data (or the data to be reconstructed) in the form of a vector where each element represents the size of a dimension.
n_latent::Int
: Dimensionality of the latent space.
decoder_neurons::Vector{<:Int}
: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.
decoder_activation::Vector{<:Function}
: Activation functions for each decoder layer, not including the final output layer.
output_activation::Function
: Activation function for the final output layer.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A CategoricalDecoder
object with the specified architecture and initialized weights.
Description
This function constructs a CategoricalDecoder
object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.
The function ensures that there are appropriate activation functions provided for each layer in the decoder_neurons
and checks for potential mismatches in length.
The output layer uses the identity function as its activation function, and the output is reshaped to match the dimensions specified in size_input
. The output_activation
function is then applied over the first dimension of the reshaped output.
Note
Ensure that the lengths of decoder_neurons and decoder_activation match, excluding the output layer. Also, the output activation function should return values that can be interpreted as probabilities, as the decoder models the output data as a categorical distribution.
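The reshape-then-activate step described above can be illustrated with a small dependency-free sketch. The function softmax_dim1 and the values of size_input and raw are hypothetical stand-ins chosen for illustration (a softmax is assumed as the output_activation); this is not the package's internal code, and it handles a single sample only.

```julia
# Hypothetical stand-in for a softmax applied over the first dimension,
# written in plain Julia so the sketch has no dependencies.
function softmax_dim1(a::AbstractArray)
    # Subtract the per-column maximum for numerical stability
    e = exp.(a .- maximum(a; dims=1))
    return e ./ sum(e; dims=1)
end

size_input = [3, 4]               # assumed: 3 categories at each of 4 positions
raw = randn(prod(size_input))     # stand-in for the identity-activated output layer
reshaped = reshape(raw, size_input...)
probs = softmax_dim1(reshaped)    # normalize over the first dimension
```

Each column of probs can then be read as a categorical distribution over the first dimension.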
AutoEncoderToolkit.CategoricalDecoder
— MethodCategoricalDecoder(
n_input, n_latent, decoder_neurons, decoder_activation,
output_activation; init=Flux.glorot_uniform
)
Constructs and initializes a CategoricalDecoder
object designed for variational autoencoders (VAEs). This function sets up a decoder network that maps from a latent space to an output space.
Arguments
n_input::Int
: Dimensionality of the output data (or the data to be reconstructed).
n_latent::Int
: Dimensionality of the latent space.
decoder_neurons::Vector{<:Int}
: Vector of layer sizes for the decoder network, not including the input latent layer and the final output layer.
decoder_activation::Vector{<:Function}
: Activation functions for each decoder layer, not including the final output layer.
output_activation::Function
: Activation function for the final output layer.
Optional Keyword Arguments
init::Function=Flux.glorot_uniform
: Initialization function for the network parameters.
Returns
A CategoricalDecoder
object with the specified architecture and initialized weights.
Description
This function constructs a CategoricalDecoder
object, setting up its decoder network based on the provided specifications. The architecture begins with a dense layer mapping from the latent space, goes through a sequence of middle layers if specified, and finally maps to the output space.
The function ensures that there are appropriate activation functions provided for each layer in the decoder_neurons
and checks for potential mismatches in length.
Note
Ensure that the lengths of decoder_neurons and decoder_activation match, excluding the output layer. Also, the output activation function should return values that can be interpreted as probabilities, as the decoder models the output data as a categorical distribution.
Probabilistic functions
Given the probability-centered design of AutoEncoderToolkit.jl
, each variational encoder and decoder has an associated probabilistic function used when computing the evidence lower bound (ELBO). The following functions are available:
AutoEncoderToolkit.encoder_logposterior
— Functionencoder_logposterior(
z::AbstractVector,
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple
)
Computes the log-posterior of the latent variable z
given the encoder output under a Gaussian distribution with mean and standard deviation given by the encoder.
Arguments
z::AbstractVector
: The latent variable for which the log-posterior is to be computed.
encoder::AbstractGaussianLogEncoder
: The encoder of the VAE, which is not used in the computation of the log-posterior. This argument is only used to know which method to call.
encoder_output::NamedTuple
: The output of the encoder, which includes the mean and log standard deviation of the Gaussian distribution.
Returns
logposterior::T
: The computed log-posterior of the latent variable z given the encoder output.
Description
The function computes the log-posterior of the latent variable z
given the encoder output under a Gaussian distribution. The mean and log standard deviation of the Gaussian distribution are extracted from the encoder_output
. The standard deviation is then computed by exponentiating the log standard deviation. The log-posterior is computed using the formula for the log-posterior of a Gaussian distribution.
Note
Ensure the dimensions of z
match the dimensionality of the latent space, i.e., of the mean and log standard deviation in encoder_output
.
encoder_logposterior(
z::AbstractMatrix,
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple
)
Computes the log-posterior of the latent variable z
given the encoder output under a Gaussian distribution with mean and standard deviation given by the encoder.
Arguments
z::AbstractMatrix
: The latent variable for which the log-posterior is to be computed. Each column of z represents a different data point.
encoder::AbstractGaussianLogEncoder
: The encoder of the VAE, which is not used in the computation of the log-posterior. This argument is only used to know which method to call.
encoder_output::NamedTuple
: The output of the encoder, which includes the mean and log standard deviation of the Gaussian distribution.
Returns
logposterior::Vector
: The computed log-posterior of the latent variable z given the encoder output. Each element of the vector corresponds to a different data point.
Description
The function computes the log-posterior of the latent variable z
given the encoder output under a Gaussian distribution. The mean and log standard deviation of the Gaussian distribution are extracted from the encoder_output
. The standard deviation is then computed by exponentiating the log standard deviation. The log-posterior is computed using the formula for the log-posterior of a Gaussian distribution.
Note
Ensure the dimensions of z
match the dimensionality of the latent space, i.e., of the mean and log standard deviation in encoder_output
.
encoder_logposterior(
z::AbstractVector,
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple,
index::Int
)
Computes the log-posterior of the latent variable z
for a single data point specified by index
given the encoder output under a Gaussian distribution with mean and standard deviation given by the encoder.
Arguments
z::AbstractVector
: The latent variable for which the log-posterior is to be computed.
encoder::AbstractGaussianLogEncoder
: The encoder of the VAE, which is not used in the computation of the log-posterior. This argument is only used to know which method to call.
encoder_output::NamedTuple
: The output of the encoder, which includes the mean and log standard deviation of the Gaussian distribution for multiple data points.
index::Int
: The index of the data point for which the log-posterior is to be computed.
Returns
logposterior::Float32
: The computed log-posterior of the latent variable z for the specified data point given the encoder output.
Description
The function computes the log-posterior of the latent variable z
for a single data point specified by index
given the encoder output under a Gaussian distribution. The mean and log standard deviation of the Gaussian distribution are extracted from the encoder_output
for the specified data point. The standard deviation is then computed by exponentiating the log standard deviation. The log-posterior is computed using the formula for the log-posterior of a Gaussian distribution.
Note
Ensure the dimensions of z
match the dimensionality of the latent space, i.e., of the mean and log standard deviation in encoder_output
. Also, ensure that index
is a valid index for the data points in encoder_output
.
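As a rough illustration of the computation described in these docstrings, the log-density of a diagonal Gaussian can be written directly from the mean and the log standard deviation. The name gaussian_logdensity is hypothetical and is not part of AutoEncoderToolkit.jl; this is a minimal sketch of the formula, not the package's implementation.

```julia
# Hypothetical helper (not part of AutoEncoderToolkit.jl): log-density of a
# diagonal Gaussian at z, parameterized by mean µ and log standard deviation logσ.
function gaussian_logdensity(
    z::AbstractVector, µ::AbstractVector, logσ::AbstractVector
)
    # Recover the standard deviation by exponentiating logσ
    σ = exp.(logσ)
    # -1/2 Σ_i [ log(2π) + 2 logσ_i + ((z_i - µ_i) / σ_i)^2 ]
    return -0.5 * sum(log(2π) .+ 2 .* logσ .+ ((z .- µ) ./ σ) .^ 2)
end

# Batch version: each column is one data point, so the sum runs over rows
# and the result holds one value per column.
function gaussian_logdensity(
    z::AbstractMatrix, µ::AbstractMatrix, logσ::AbstractMatrix
)
    σ = exp.(logσ)
    terms = log(2π) .+ 2 .* logσ .+ ((z .- µ) ./ σ) .^ 2
    return vec(-0.5 .* sum(terms; dims=1))
end

# Standard normal evaluated at the origin in two dimensions
lp = gaussian_logdensity(zeros(2), zeros(2), zeros(2))
lp_batch = gaussian_logdensity(zeros(2, 3), zeros(2, 3), zeros(2, 3))
```

Working with logσ rather than σ keeps the log-determinant term a plain sum and avoids taking the logarithm of a value that may underflow.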
AutoEncoderToolkit.encoder_kl
— Functionencoder_kl(
encoder::AbstractGaussianLogEncoder,
encoder_output::NamedTuple
)
Calculate the Kullback-Leibler (KL) divergence between the approximate posterior distribution and the prior distribution in a variational autoencoder with a Gaussian encoder.
The KL divergence for a Gaussian encoder with mean encoder_µ
and log standard deviation encoder_logσ
is computed against a standard Gaussian prior.
Arguments
encoder::AbstractGaussianLogEncoder
: Encoder network. This argument is not used in the computation of the KL divergence, but is included to allow for multiple encoder types to be used with the same function.
encoder_output::NamedTuple
: NamedTuple containing all the encoder outputs. It should have fields μ and logσ representing the mean and log standard deviation of the encoder's output.
Returns
kl_div::Union{Number, Vector}
: The KL divergence. If encoder_µ is a vector, kl_div is a scalar. If encoder_µ is a matrix, kl_div is a vector where each element corresponds to the KL divergence for one data point in the batch.
Note
- It is assumed that the mapping from data space to latent parameters (encoder_µ and encoder_logσ) has been performed prior to calling this function. The encoder argument is provided to indicate the type of encoder network used, but it is not used within the function itself.
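For reference, the KL divergence between a diagonal Gaussian and a standard Gaussian prior has a closed form in terms of the mean and log standard deviation. The sketch below writes it out under the hypothetical name gaussian_kl; it is an illustration of the standard formula and is not taken from the package source.

```julia
# Hypothetical helper: KL( N(µ, diag(σ²)) || N(0, I) ) expressed with logσ,
# i.e. 1/2 Σ_i [ σ_i² + µ_i² - 1 - 2 logσ_i ].
function gaussian_kl(µ::AbstractVector, logσ::AbstractVector)
    return 0.5 * sum(exp.(2 .* logσ) .+ µ .^ 2 .- 1 .- 2 .* logσ)
end

# Batch version: one KL value per column (data point)
function gaussian_kl(µ::AbstractMatrix, logσ::AbstractMatrix)
    terms = exp.(2 .* logσ) .+ µ .^ 2 .- 1 .- 2 .* logσ
    return vec(0.5 .* sum(terms; dims=1))
end

kl_zero = gaussian_kl(zeros(3), zeros(3))        # posterior equals the prior
kl_shift = gaussian_kl([1.0, 2.0], zeros(2))     # unit variance, shifted mean
kl_batch = gaussian_kl([1.0 0.0; 2.0 0.0], zeros(2, 2))
```

The vector method returns a scalar and the matrix method returns a vector, matching the return types described above.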
AutoEncoderToolkit.spherical_logprior
— Functionspherical_logprior(z::AbstractVector, σ::Real=1.0f0)
Computes the log-prior of the latent variable z
under a spherical Gaussian distribution with zero mean and standard deviation σ
.
Arguments
z::AbstractVector
: The latent variable for which the log-prior is to be computed.
σ::Real=1.0f0
: The standard deviation of the spherical Gaussian distribution. Defaults to 1.0f0.
Returns
logprior::T
: The computed log-prior of the latent variable z.
Description
The function computes the log-prior of the latent variable z
under a spherical Gaussian distribution with zero mean and standard deviation σ
. The log-prior is computed using the formula for the log-prior of a Gaussian distribution.
Note
Ensure the dimension of z
matches the expected dimensionality of the latent space.
spherical_logprior(z::AbstractMatrix, σ::Real=1.0f0)
Computes the log-prior of the latent variable z
under a spherical Gaussian distribution with zero mean and standard deviation σ
.
Arguments
z::AbstractMatrix
: The latent variable for which the log-prior is to be computed. Each column of z represents a different latent variable.
σ::Real=1.0f0
: The standard deviation of the spherical Gaussian distribution. Defaults to 1.0f0.
Returns
logprior::T
: The computed log-prior(s) of the latent variable z.
Description
The function computes the log-prior of the latent variable z
under a spherical Gaussian distribution with zero mean and standard deviation σ
. The log-prior is computed using the formula for the log-prior of a Gaussian distribution.
Note
Ensure the dimension of z
matches the expected dimensionality of the latent space.
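The spherical Gaussian log-prior reduces to a short expression in the squared norm of z. The function logprior_sketch below is a hypothetical re-derivation for illustration only, assuming a zero-mean Gaussian with the same standard deviation σ in every dimension; it is not the package's implementation.

```julia
# Hypothetical sketch: log-density of a zero-mean spherical Gaussian,
# log p(z) = -1/2 [ d log(2πσ²) + ‖z‖² / σ² ], with d = length(z).
function logprior_sketch(z::AbstractVector, σ::Real=1.0f0)
    d = length(z)
    return -0.5 * (d * log(2π * σ^2) + sum(abs2, z) / σ^2)
end

lp_origin = logprior_sketch(zeros(2))          # default σ = 1
lp_wide = logprior_sketch([1.0, 1.0], 2.0)     # broader prior, σ = 2
```

For a matrix input, the same expression can be applied to each column to obtain one log-prior per latent vector.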
Defining custom encoder and decoder types
We will omit all docstrings in the following examples for brevity. However, every struct and function in AutoEncoderToolkit.jl
is well-documented.
Suppose your task requires a custom encoder or decoder type. For example, imagine that a particular application needs a decoder whose output distribution is Poisson. In other words, the assumption is that each dimension $x_i$ of the input is a sample from a Poisson distribution with mean $\lambda_i$. On the decoder side, what the decoder returns is therefore a vector of these $\lambda$ parameters. We thus need to define a custom decoder type.
struct PoissonDecoder <: AbstractVariationalDecoder
    decoder::Flux.Chain
end # struct
With this struct defined, we need to define the forward-pass function for our custom PoissonDecoder
. All decoders in AutoEncoderToolkit.jl
return a NamedTuple
with the corresponding parameters of the distribution that defines them. In this case, the Poisson distribution is defined by a single parameter $\lambda$. Thus, we have a forward-pass of the form
function (decoder::PoissonDecoder)(z::AbstractArray)
    # Run the input through the decoder network
    return (λ=decoder.decoder(z),)
end # function
Next, we need to define the probabilistic function associated with this decoder. We know that the probability of observing $x_i$ given $\lambda_i$ is given by
\[P(x_i | \lambda_i) = \frac{\lambda_i^{x_i} e^{-\lambda_i}}{x_i!}. \tag{1}\]
If each $x_i$ is independent, then the probability of observing the entire input $x$ given the entire output $\lambda$ is given by the product of the individual probabilities, i.e.
\[P(x | \lambda) = \prod_i P(x_i | \lambda_i). \tag{2}\]
The log-likelihood of the data given the output of the decoder is then given by
\[\mathcal{L}(x, \lambda) = \log P(x | \lambda) = \sum_i \log P(x_i | \lambda_i), \tag{3}\]
which, by using the properties of the logarithm, can be written as
\[\mathcal{L}(x, \lambda) = \sum_i \left[ x_i \log \lambda_i - \lambda_i - \log(x_i!) \right]. \tag{4}\]
We can then define the probabilistic function associated with the PoissonDecoder
as
using SpecialFunctions: loggamma

function decoder_loglikelihood(
    x::AbstractArray,
    z::AbstractVector,
    decoder::PoissonDecoder,
    decoder_output::NamedTuple
)
    # Extract the rate parameter λ of the Poisson distribution
    λ = decoder_output.λ
    # Compute the log-likelihood following Eq. (4)
    loglikelihood = sum(x .* log.(λ) .- λ .- loggamma.(x .+ 1))
    return loglikelihood
end # function
where we use the loggamma
function from SpecialFunctions.jl
to compute the log of the factorial of x_i, since loggamma(x_i + 1) equals log(x_i!).
We only defined the decoder_loglikelihood
method for z::AbstractVector
. One should also include a method for z::AbstractMatrix
used when performing batch training.
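A batch version could look roughly like the sketch below. It is written as a standalone helper with the hypothetical name poisson_loglikelihood_batch so that it runs without the package; it assumes that each column of x and λ is one sample and that the batch method should return one log-likelihood per column, mirroring how encoder_logposterior treats matrix input. Wiring it into a decoder_loglikelihood method for z::AbstractMatrix is left to the reader.

```julia
using SpecialFunctions: loggamma

# Hypothetical helper: per-sample Poisson log-likelihood for a batch, where
# each column of x (data) and λ (decoder output) corresponds to one sample.
function poisson_loglikelihood_batch(x::AbstractMatrix, λ::AbstractMatrix)
    size(x) == size(λ) ||
        throw(DimensionMismatch("x and λ must have the same size"))
    # Eq. (4) elementwise, then sum over the feature dimension (rows)
    terms = x .* log.(λ) .- λ .- loggamma.(x .+ 1)
    return vec(sum(terms; dims=1))
end

# Two samples (columns) with two features (rows) each
x = [0.0 2.0; 1.0 3.0]
λ = [1.0 2.0; 1.0 3.0]
ll = poisson_loglikelihood_batch(x, λ)
```

Summing with dims=1 keeps the samples separate, so downstream code can average the per-sample values over the batch.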
With these two functions defined, our PoissonDecoder
is ready to be used with any of the different VAE flavors included in AutoEncoderToolkit.jl
!