model
In this section, we list the available models to be fit with either mcmc or vi. To see examples on how to implement these models, please check the examples tab.
Single-dataset single-environment
BarBay.model.fitness_normal — Functionfitness_normal(R̲̲::Matrix{Int64}, n̲ₜ::Vector{Int64}, n_neutral::Int, n_bc::Int; kwargs...)
Defines a model to estimate fitness effects in a competitive fitness experiment across growth-dilution cycles.
Model summary
- Prior on population mean fitness at time
t,π(sₜ)
sₜ ~ Normal(params=s_pop_prior)
- Prior on population log mean fitness associated error at time
t,π(logσₜ)
logσₜ ~ Normal(params=logσ_pop_prior)
- Prior on non-neutral relative fitness for barcoe
m,π(s⁽ᵐ⁾)
s⁽ᵐ⁾ ~ Normal(params=s_bc_prior)
- Prior on non-neutral log relative fitness associated error for barcode
mπ(logσ⁽ᵐ⁾)
logσ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)
- Prior on log Poisson distribtion parameters for barcode
mat timet,π(logλₜᵣ⁽ᵐ⁾)
logλₜ⁽ᵐ⁾ ~ Normal(params=logλ_prior)
- Probability of total number of barcodes read given the Poisson distribution parameters at time
tπ(nₜ | logλ̲ₜ)
nₜ ~ Poisson(∑ₘ exp(λₜ⁽ᵐ⁾))
- Barcode
jfrequency at timet(deterministic relationship from the Poisson parameters)
fₜ⁽ʲ⁾ = λₜ⁽ʲ⁾ / ∑ₖ λₜ⁽ᵏ⁾
- log frequency ratio for barcode
jat timet(deterministic relationship from barcode frequencies)
logγₜ⁽ʲ⁾ = log(f₍ₜ₊₁₎⁽ʲ⁾ / fₜ⁽ʲ⁾)
- Probability of number of reads at time
tfor all barcodes given the total number of reads and the barcode frequenciesπ(r̲ₜ | nₜ, f̲ₜ)
r̲ₜ ~ Multinomial(nₜ, f̲ₜ)
- Probability of neutral barcodes frequency ratio for barcode
nat timetπ(logγₜ⁽ⁿ⁾| sₜ, σₜ)
logγₜ⁽ⁿ⁾ ~ Normal(μ = -sₜ, σ = exp(logσₜ))
- Probability of non-neutral barcodes frequency ratio for barcode
mat timetπ(logγₜ⁽ᵐ⁾| s⁽ᵐ⁾, σ⁽ᵐ⁾, sₜ)
logγₜ⁽ᵐ⁾ ~ Normal(μ = s⁽ᵐ⁾ - sₜ, σ = exp(logσ⁽ᵐ⁾))
Arguments
R̲̲::Matrix{Int64}:T × Bmatrix–split into a vector of vectors for computational efficiency–whereTis the number of time points in the data set andBis the number of barcodes. Each column represents the barcode count trajectory for a single lineage.n̲ₜ::Vector{Int64}: Vector with the total number of barcode counts for each time point. NOTE: This vector must be equivalent to computingvec(sum(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:σ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness log-error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_bc_prior[1]= mean,logσ_bc_prior[2]= standard deviation, Matrix:logσ_bc_prior[:, 1]= mean,logσ_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness log-error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the log of the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(λ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points in the dataset.
Latent Variables
- Population mean fitness per timepoint.
- Mutant fitness effects.
- λ dispersion parameters per barcode and timepoint.
Notes
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Setting informative priors is recommended for stable convergence.
Single-dataset multi-environment
BarBay.model.multienv_fitness_normal — Functionmultienv_fitness_normal(R̲̲::Matrix{Int64}, n̲ₜ::Vector{Int64}, n_neutral::Int, n_bc::Int; kwargs...)
Defines a model to estimate fitness effects in a competitive fitness experiment with different environments across growth-dilution cycles.
Model summary
- Prior on population mean fitness at time
t,π(sₜ)
sₜ ~ Normal(params=s_pop_prior)
- Prior on population log mean fitness associated error at time
t,π(logσₜ)
logσₜ ~ Normal(params=logσ_pop_prior)
- Prior on non-neutral relative fitness for barcode
min environmentiπ(sᵢ⁽ᵐ⁾)
sᵢ⁽ᵐ⁾ ~ Normal(params=s_bc_prior)
- Prior on non-neutral log relative fitness associated error for barcode
min environmentiπ(logσᵢ⁽ᵐ⁾)
logσᵢ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)
- Prior on log Poisson distribtion parameters for barcode
mat timet,π(logλₜᵣ⁽ᵐ⁾)
logλₜ⁽ᵐ⁾ ~ Normal(params=logλ_prior)
- Probability of total number of barcodes read given the Poisson distribution parameters at time
tπ(nₜ | logλ̲ₜ)
nₜ ~ Poisson(∑ₘ exp(λₜ⁽ᵐ⁾))
- Barcode
jfrequency at timet(deterministic relationship from the Poisson parameters)
fₜ⁽ʲ⁾ = λₜ⁽ʲ⁾ / ∑ₖ λₜ⁽ᵏ⁾
- log frequency ratio for barcode
jat timet(deterministic relationship from barcode frequencies)
logγₜ⁽ʲ⁾ = log(f₍ₜ₊₁₎⁽ʲ⁾ / fₜ⁽ʲ⁾)
- Probability of number of reads at time
tfor all barcodes given the total number of reads and the barcode frequenciesπ(r̲ₜ | nₜ, f̲ₜ)
r̲ₜ ~ Multinomial(nₜ, f̲ₜ)
- Probability of neutral barcodes frequency ratio for barcode
nat timetπ(logγₜ⁽ⁿ⁾| sₜ, logσₜ)
logγₜ⁽ⁿ⁾ ~ Normal(μ = -sₜ, σ = exp(logσₜ))
- Probability of non-neutral barcodes frequency ratios
π(logγₜ⁽ᵐ⁾| s⁽ᵐ⁾, logσ⁽ᵐ⁾, sₜ). Note: This is done grouping by corresponding environment such that if timetis associated with environmenti, sᵢ⁽ᵐ⁾ is used as the fitness value.
logγₜ⁽ᵐ⁾ ~ Normal(μ = sᵢ⁽ᵐ⁾ - sₜ, σ = exp(σ⁽ᵐ⁾))
Arguments
R̲̲::Matrix{Int64}::T × Bmatrix–split into a vector of vectors for computational efficiency–whereTis the number of time points in the data set andBis the number of barcodes. Each column represents the barcode count trajectory for a single lineage.n̲ₜ::Vector{Int64}: Vector with the total number of barcode counts for each time point. NOTE: This vector must be equivalent to computingvec(sum(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Keyword Arguments
envs::Vector{<:Any}: List of environments for each time point in dataset. NOTE: The length must be equal to that ofn̲tto have one environment per time point.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:σ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness log-error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of adjacent time points in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_bc_prior[1]= mean,logσ_bc_prior[2]= standard deviation, Matrix:logσ_bc_prior[:, 1]= mean,logσ_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness log-error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of unique environments in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the log of the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(λ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points in the dataset.
Latent Variables
- Population mean fitness per timepoint
- Mutant fitness effects per environment
- λ dispersion parameters per barcode and timepoint
Notes
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Can estimate time-varying and environment-specific fitness effects.
- Setting informative priors is recommended for stable convergence.
Single-dataset hierarchical model on genotypes
BarBay.model.genotype_fitness_normal — Functiongenotype_fitness_normal(R̲̲::Vector{Matrix{Int64}}, n̲ₜ::Vector{Vector{Int64}}, n_neutral::Int, n_bc::Int; kwargs...)
Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment where multiple barcodes belong to a specific "genotype." This means that different barcodes are grouped together through a fitness hyperparameter where each barcode samples from the distribution of this hyperparameter.
Model summary
- Prior on population mean fitness at time
t,π(sₜ)
sₜ ~ Normal(params=s_pop_prior)
- Prior on population log mean fitness associated error
π(logσₜ)
logσₜ ~ Normal(params=logσ_pop_prior)
- Prior on non-neutral relative hyper-fitness for genotype
iπ(θᵢ)
θᵢ ~ Normal(params=s_bc_prior)
- prior on non-centered samples that allow local fitness to vary in the positive and negative direction for genotype
iπ(θ̃ᵢ⁽ᵐ⁾). Note, this is a standard normal with mean zero and standard deviation one.
θ̃ᵢ⁽ᵐ⁾ ~ Normal(μ = 0, σ = 1)
- prior on log deviations of local fitness of barcode
mfrom hyper-fitness for genotypeiπ(logτᵢ⁽ᵐ⁾)
logτᵢ⁽ᵐ⁾ ~ Normal(params=logτ_prior)
- local relative fitness for non-neutral barcode
mwith genotypei(deterministic relationship from hyper-priors)
sᵢ⁽ᵐ⁾ = θᵢ + θ̃ᵢ⁽ᵐ⁾ * exp(logτᵢ⁽ᵐ⁾)
- Prior on non-neutral log relative fitness associated error for non-neutral barcode
mwith genotypei,π(logσᵢ⁽ᵐ⁾)
logσᵢ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)
- Prior on log Poisson distribtion parameters for barcode
mat timetin replicater,π(logλₜᵣ⁽ᵐ⁾)
logλₜᵣ⁽ᵐ⁾ ~ Normal(params=logλ_prior)
- Probability of total number of barcodes read given the Poisson distribution parameters
π(nₜ | logλ̲ₜ)
nₜ ~ Poisson(∑ₖ exp(λₜ⁽ᵏ⁾))
- Barcode
jfrequency at timet(deterministic relationship from the Poisson parameters)
fₜ⁽ʲ⁾ = λₜ⁽ʲ⁾ / ∑ₖ λₜ⁽ᵏ⁾
- log frequency ratio for barcode
jat timet(deterministic relationship from barcode frequencies)
- log frequency ratio for barcode
logγₜ⁽ʲ⁾ = log(fₜ₊₁⁽ʲ⁾ / fₜ⁽ʲ⁾)
- Probability of number of reads at time
tfor all barcodes given the total number of reads and the barcode frequenciesπ(r̲ₜ | nₜ, f̲ₜ)
r̲ₜ ~ Multinomial(nₜ, f̲ₜ)
- Probability of neutral barcodes
nfrequency ratio at timetπ(logγₜ⁽ⁿ⁾| sₜ, logσₜ)
logγₜ⁽ⁿ⁾ ~ Normal(μ = -sₜ, σ = exp(logσₜ))
- Probability of non-neutral barcode
mwith genotypeifrequency ratio at timetπ(logγₜ⁽ᵐ⁾| sᵢ⁽ᵐ⁾, logσᵢ⁽ᵐ⁾, sₜ)
logγₜ⁽ᵐ⁾ ~ Normal(μ = sᵢ⁽ᵐ⁾ - sₜ, σ = exp(logσᵢ⁽ᵐ⁾))
Arguments
R̲̲::Vector{Matrix{Int64}}:: LengthRvector wthT × Bmatrices whereTis the number of time points in the data set,Bis the number of barcodes, andRis the number of experimental replicates. For each matrix in the vector, each column represents the barcode count trajectory for a single lineage.n̲ₜ::Vector{Vector{Int64}}: Vector of vectors with the total number of barcode counts for each time point on each replicate. NOTE: This vector must be equivalent to computingvec.(sum.(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Keyword Arguments
genotypes::Vector{Vector{<:Any}}: Vector with the list of genotypes for each non-neutral barcode.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:logσ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_bc_prior[1]= mean,logσ_bc_prior[2]= standard deviation, Matrix:logσ_bc_prior[:, 1]= mean,logσ_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.
Latent Variables
- Population mean fitness per timepoint.
- Genotype hyper-fitness effects.
- Non-centered samples for each of the non-neutral barcodes.
- Deviations from the hyper parameter value for each non-neutral barcode.
- λ dispersion parameters per barcode and timepoint.
Notes
- Models hyper-fitness effects as normally distributed.
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Setting informative priors is recommended for stable convergence.
Multi-replicate single-environment hierarchical model for experimental replicates
BarBay.model.replicate_fitness_normal — Functionreplicate_fitness_normal(R̲̲::Array{Int64,3}, n̲ₜ::Matrix{Int64}, n_neutral::Int, n_bc::Int; kwargs...)
Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment across growth-dilution cycles over multiple experimental replicates.
Model summary
- Prior on population mean fitness at time
tin replicater,π(sₜᵣ)
sₜᵣ ~ Normal(params=s_pop_prior)
- Prior on population log mean fitness associated error at time
tin replicater,π(logσₜ)
logσₜᵣ ~ Normal(params=logσ_pop_prior)
- Prior on non-neutral relative hyper-fitness for barcode
mπ(θ⁽ᵐ⁾)
θ⁽ᵐ⁾ ~ Normal(params=s_bc_prior)
- Prior on non-centered samples that allow local fitness to vary in the positive and negative direction for barcode
min experimental replicaterπ(θ̃ᵣ⁽ᵐ⁾). Note, this is a standard normal with mean zero and standard deviation one.
θ̃ᵣ⁽ᵐ⁾ ~ Normal(μ = 0, σ = 1)
- Prior on log deviations of local fitness from hyper-fitness for barcode
min replicaterπ(logτᵣ⁽ᵐ⁾)
logτᵣ⁽ᵐ⁾ ~ Normal(params=logτ_prior)
- Local relative fitness for non-neutral barcode
min replicater(deterministic relationship from hyper-priors)
sᵣ⁽ᵐ⁾ = θ⁽ᵐ⁾ + θ̃ᵣ⁽ᵐ⁾ * exp(logτᵣ⁽ᵐ⁾)
- Prior on non-neutral log relative fitness associated error for non-neutral barcode
min replcater,π(logσᵣ⁽ᵐ⁾)
logσᵣ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)
- Prior on log Poisson distribtion parameters for barcode
mat timetin replicater,π(logλₜᵣ⁽ᵐ⁾)
logλₜᵣ⁽ᵐ⁾ ~ Normal(params=logλ_prior)
- Probability of total number of barcodes read given the Poisson distribution parameters at time
tin replicaterπ(nₜᵣ | logλ̲ₜᵣ)
nₜᵣ ~ Poisson(∑ₘ exp(λₜᵣ⁽ᵐ⁾))
- Barcode
jfrequency at timetin replciater(deterministic relationship from the Poisson parameters)
fₜᵣ⁽ʲ⁾ = λₜᵣ⁽ʲ⁾ / ∑ₖ λₜᵣ⁽ᵏ⁾
- Log frequency ratio for barcode
jat timetin replicater(deterministic relationship from barcode frequencies)
logγₜᵣ⁽ʲ⁾ = log(f₍ₜ₊₁₎ᵣ⁽ʲ⁾ / fₜᵣ⁽ʲ⁾)
- Probability of number of reads at time
tfor all barcodes in replicatergiven the total number of reads and the barcode frequenciesπ(r̲ₜᵣ | nₜᵣ, f̲ₜᵣ)
r̲ₜᵣ ~ Multinomial(nₜᵣ, f̲ₜᵣ)
- Probability of neutral barcodes
nfrequency ratio at timetin replicater,π(logγₜᵣ⁽ⁿ⁾| sₜᵣ, logσₜᵣ)
logγₜᵣ⁽ⁿ⁾ ~ Normal(μ = -sₜᵣ, σ = exp(logσₜᵣ))
- Probability of non-neutral barcode
mfrequency ratio at timetin replicaterπ(logγₜᵣ⁽ᵐ⁾| sᵣ⁽ᵐ⁾, logσᵣ⁽ᵐ⁾, sₜᵣ)
logγₜ⁽ᵐ⁾ ~ Normal(μ = sᵣ⁽ᵐ⁾ - sₜᵣ, σ = exp(logσᵣ⁽ᵐ⁾))
Arguments
R̲̲::Array{Int64, 3}::T × B × RwhereTis the number of time points in the data set,Bis the number of barcodes, andRis the number of experimental replicates. For each slice in theRaxis, each column represents the barcode count trajectory for a single lineage.n̲ₜ::Matrix{Int64}: Matrix with the total number of barcode counts for each time point on each replicate. NOTE: This matrix must be equivalent to computingvec(sum(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:logσ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.
Latent Variables
- Population mean fitness per timepoint.
- Mutant hyper-fitness effects.
- Mutant fitness effects per experimental replicate.
- λ dispersion parameters per barcode and timepoint.
Notes
- Models hyper-fitness effects as normally distributed.
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Setting informative priors is recommended for stable convergence.
replicatefitnessnormal(R̲̲::Vector{Matrix{Int64}}, n̲ₜ::Vector{Vector{Int64}}, nneutral::Int, nbc::Int; kwargs...)
Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment across growth-dilution cycles over multiple experimental replicates.
Arguments
R̲̲::Vector{Matrix{Int64}}:: LengthRvector wthT × Bmatrices whereTis the number of time points in the data set,Bis the number of barcodes, andRis the number of experimental replicates. For each matrix in the vector, each column represents the barcode count trajectory for a single lineage.n̲ₜ::Vector{Vector{Int64}}: Vector of vectors with the total number of barcode counts for each time point on each replicate. NOTE: This vector must be equivalent to computingvec.(sum.(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:logσ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.
Latent Variables
- Population mean fitness per timepoint.
- Mutant hyper-fitness effects.
- Mutant fitness effects per experimental replicate.
- λ dispersion parameters per barcode and timepoint.
Notes
- Models hyper-fitness effects as normally distributed.
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Setting informative priors is recommended for stable convergence.
Multi-replicate multi-environment hierarchical model for experimental replicates
BarBay.model.multienv_replicate_fitness_normal — Functionmultienv_replicate_fitness_normal(R̲̲::Matrix{Int64}, n̲ₜ::Vector{Int64}, n_neutral::Int, n_bc::Int; kwargs...)
Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment with different environments across growth-dilution cycles over multiple experimental replicates.
Model summary
- Prior on population mean fitness at time
tin replicater,π(sₜᵣ)
sₜᵣ ~ Normal(params=s_pop_prior)
- Prior on population log mean fitness associated error at time
tin replicater,π(logσₜ)
logσₜᵣ ~ Normal(params=logσ_pop_prior)
- Prior on non-neutral relative hyper-fitness for barcode
min environmentiπ(θᵢ⁽ᵐ⁾)
θᵢ⁽ᵐ⁾ ~ Normal(params=s_bc_prior)
- Prior on non-centered samples that allow local fitness to vary in the positive and negative direction for barcode
min experimental replicaterin environmentiπ(θ̃ᵣᵢ⁽ᵐ⁾). Note, this is a standard normal with mean zero and standard deviation one.
θ̃ᵣᵢ⁽ᵐ⁾ ~ Normal(μ = 0, σ = 1)
- Prior on log deviations of local fitness from hyper-fitness for barcode
min replicaterin environmentiπ(logτᵣᵢ⁽ᵐ⁾)
logτᵣᵢ⁽ᵐ⁾ ~ Normal(params=logτ_prior)
- Local relative fitness for non-neutral barcode
min replicaterin environmenti(deterministic relationship from hyper-priors)
sᵣᵢ⁽ᵐ⁾ = θᵢ⁽ᵐ⁾ + θ̃ᵣᵢ⁽ᵐ⁾ * exp(logτᵣᵢ⁽ᵐ⁾)
- Prior on non-neutral log relative fitness associated error for non-neutral barcode
min replcaterin environmenti,π(logσᵣᵢ⁽ᵐ⁾)
logσᵣᵢ⁽ᵐ⁾ ~ Normal(params=logσ_bc_prior)
- Prior on log Poisson distribtion parameters for barcode
mat timetin replicater,π(logλₜᵣ⁽ᵐ⁾)
logλₜᵣ⁽ᵐ⁾ ~ Normal(params=logλ_prior)
- Probability of total number of barcodes read given the Poisson distribution parameters at time
tin replicaterπ(nₜᵣ | logλ̲ₜᵣ)
nₜᵣ ~ Poisson(∑ₘ exp(λₜᵣ⁽ᵐ⁾))
- Barcode
jfrequency at timetin replicater(deterministic relationship from the Poisson parameters)
fₜᵣ⁽ʲ⁾ = λₜᵣ⁽ʲ⁾ / ∑ₖ λₜᵣ⁽ᵏ⁾
- Log frequency ratio for barcode
jat timetin replicater(deterministic relationship from barcode frequencies)
logγₜᵣ⁽ʲ⁾ = log(f₍ₜ₊₁₎ᵣ⁽ʲ⁾ / fₜᵣ⁽ʲ⁾)
- Probability of number of reads at time
tfor all barcodes in replicatergiven the total number of reads and the barcode frequenciesπ(r̲ₜᵣ | nₜᵣ, f̲ₜᵣ)
r̲ₜᵣ ~ Multinomial(nₜᵣ, f̲ₜᵣ)
- Probability of neutral barcodes
nfrequency ratio at timetin replicater,π(logγₜᵣ⁽ⁿ⁾| sₜᵣ, logσₜᵣ)
logγₜᵣ⁽ⁿ⁾ ~ Normal(μ = -sₜᵣ, σ = exp(logσₜᵣ))
- Probability of non-neutral barcodes frequency ratio for barcode
min replicaterπ(logγₜᵣ⁽ᵐ⁾| sᵣ⁽ᵐ⁾, logσᵣ⁽ᵐ⁾, sₜᵣ). Note: This is done grouping by corresponding environment such that if timetis associated with environmenti, sᵣᵢ⁽ᵐ⁾ is used as the fitness value.
logγₜᵣ⁽ᵐ⁾ ~ Normal(μ = sᵣᵢ⁽ᵐ⁾ - sₜᵣ, σ = exp(logσᵣᵢ⁽ᵐ⁾))
Arguments
R̲̲::Array{Int64, 3}::T × B × RwhereTis the number of time points in the data set,Bis the number of barcodes, andRis the number of experimental replicates. For each slice in theRaxis, each column represents the barcode count trajectory for a single lineage.n̲ₜ::Matrix{Int64}: Matrix with the total number of barcode counts for each time point on each replicate. NOTE: This matrix must be equivalent to computingvec(sum(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Keyword Arguments
envs::Vector{<:Any}: List of environments for each time point in dataset. NOTE: The length must be equal to that of the number of rows inn̲tto have one environment per time point.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:logσ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_bc_prior[1]= mean,logσ_bc_prior[2]= standard deviation, Matrix:logσ_bc_prior[:, 1]= mean,logσ_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.
Latent Variables
- Population mean fitness per timepoint.
- Mutant hyper-fitness effects per environment.
- Non-centered samples for each of the experimental replicates.
- Deviations from the hyper parameter value for each experimental replicate.
- λ dispersion parameters per barcode and timepoint.
Notes
- All barcodes must be included in all replicates.
- Models hyper-fitness effects as normally distributed.
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Can estimate time-varying and environment-specific fitness effects.
- Setting informative priors is recommended for stable convergence.
multienvreplicatefitnessnormal(R̲̲::Vector{Matrix{Int64}}, n̲ₜ::Vector{Vector{Int64}}, nneutral::Int, n_bc::Int; kwargs...)
Defines a hierarchical model to estimate fitness effects in a competitive fitness experiment with different environments across growth-dilution cycles over multiple experimental replicates.
Arguments
R̲̲::Vector{Matrix{Int64}}:: LengthRvector wthT × Bmatrices whereTis the number of time points in the data set,Bis the number of barcodes, andRis the number of experimental replicates. For each matrix in the vector, each column represents the barcode count trajectory for a single lineage.n̲ₜ::Vector{Vector{Int64}}: Vector of vectors with the total number of barcode counts for each time point on each replicate. NOTE: This vector must be equivalent to computingvec.(sum.(R̲̲, dims=2)).n_neutral::Int: Number of neutral lineages in dataset.n_bc::Int: Number of mutant lineages in dataset.
Keyword Arguments
envs::Vector{Vector{<:Any}}: LengthRvector with the list of environments for each time point in each replicate. NOTE: The length must be equal to that of the number of rows inn̲ₜto have one environment per time point.
Optional Keyword Arguments
s_pop_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_pop_prior[1]= mean,s_pop_prior[2]= standard deviation, Matrix:s_pop_prior[:, 1]= mean,s_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness values. Iftypeof(s_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points in dataset.logσ_pop_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_pop_prior[1]= mean,logσ_pop_prior[2]= standard deviation, Matrix:logσ_pop_prior[:, 1]= mean,logσ_pop_prior[:, 2]= standard deviation) for a Normal prior on the population mean fitness error utilized in the log-likelihood function. Iftypeof(logσ_pop_prior) <: Matrix, there should be as many rows in the matrix as pairs of time adjacent time points × number of replicates in dataset.s_bc_prior::VecOrMat{Float64}=[0.0, 2.0]: Vector or Matrix with the corresponding parameters (Vector:s_bc_prior[1]= mean,s_bc_prior[2]= standard deviation, Matrix:s_bc_prior[:, 1]= mean,s_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness values. Iftypeof(s_bc_prior) <: Matrix, there should be as many rows in the matrix as number of mutant lineages × number of replicates in the dataset.logσ_bc_prior::VecOrMat{Float64}=[0.0, 1.0]: Vector or Matrix with the corresponding parameters (Vector:logσ_bc_prior[1]= mean,logσ_bc_prior[2]= standard deviation, Matrix:logσ_bc_prior[:, 1]= mean,logσ_bc_prior[:, 2]= standard deviation) for a Normal prior on the mutant fitness error utilized in the log-likelihood function. Iftypeof(logσ_bc_prior) <: Matrix, there should be as many rows in the matrix as mutant lineages × number of replicates in the dataset.logλ_prior::VecOrMat{Float64}=[3.0, 3.0]: Vector or Matrix with the corresponding parameters (Vector:logλ_prior[1]= mean,logλ_prior[2]= standard deviation, Matrix:logλ_prior[:, 1]= mean,logλ_prior[:, 2]= standard deviation) for a Normal prior on the λ parameter in the Poisson distribution. The λ parameter can be interpreted as the mean number of barcode counts since we assume any barcode countn⁽ᵇ⁾ ~ Poisson(λ⁽ᵇ⁾). Iftypeof(logλ_prior) <: Matrix, there should be as many rows in the matrix as number of barcodes × number of time points × number of replicates in the dataset.
Latent Variables
- Population mean fitness per timepoint.
- Mutant hyper-fitness effects per environment.
- Non-centered samples for each of the experimental replicates.
- Deviations from the hyper parameter value for each experimental replicate.
- λ dispersion parameters per barcode and timepoint.
Notes
- All barcodes must be included in all replicates.
- All environments must appear at least once on each replicate.
- Models hyper-fitness effects as normally distributed.
- Models fitness effects as normally distributed.
- Utilizes a Poisson observation model for barcode counts.
- Can estimate time-varying and environment-specific fitness effects.
- Setting informative priors is recommended for stable convergence.