stats
Statistics functions
- scribe.stats.compute_histogram_percentiles(samples, percentiles=[5, 25, 50, 75, 95], normalize=True, sample_axis=0)[source]
Compute percentiles of histogram frequencies across multiple samples.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1
percentiles (list-like, optional) – List of percentiles to compute (default: [5, 25, 50, 75, 95])
normalize (bool, optional) – Whether to normalize histograms (default: True)
sample_axis (int, optional) – Axis containing samples (default: 0)
- Returns:
bin_edges (array) – Array of bin edges (integers from min to max value + 1)
hist_percentiles (array) – Array of shape (len(percentiles), len(bin_edges)-1) containing the percentiles of histogram frequencies for each bin
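A minimal usage sketch; the Poisson-distributed toy data is purely illustrative, and only the signature documented above is assumed:

```python
from jax import random
from scribe.stats import compute_histogram_percentiles

# 200 sample draws, each containing 500 non-negative integer counts (toy data)
samples = random.poisson(random.PRNGKey(0), lam=5.0, shape=(200, 500))

bin_edges, hist_percentiles = compute_histogram_percentiles(
    samples, percentiles=[5, 50, 95]
)
# hist_percentiles has shape (3, len(bin_edges) - 1); row 1 holds the median
# normalized frequency of each bin across the 200 samples
print(bin_edges.shape, hist_percentiles.shape)
```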
- scribe.stats.compute_histogram_credible_regions(samples, credible_regions=[95, 68, 50], normalize=True, sample_axis=0, batch_size=1000, max_bin=None)[source]
Compute credible regions of histogram frequencies across multiple samples.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1
credible_regions (list-like, optional) – List of credible region percentages to compute (default: [95, 68, 50]). For example, 95 computes the 2.5th and 97.5th percentiles
normalize (bool, optional) – Whether to normalize histograms (default: True)
sample_axis (int, optional) – Axis containing samples (default: 0)
batch_size (int, optional) – Number of samples to process in each batch (default: 1000)
max_bin (int, optional) – Maximum number of bins to process (default: None)
- Returns:
Dictionary containing:
'bin_edges': array of bin edges
'regions': nested dictionary where each key is a credible region percentage and each value is a dictionary containing:
'lower': lower bound of the credible region
'upper': upper bound of the credible region
'median': median (50th percentile)
- Return type:
dict
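A short sketch of accessing the returned dictionary; the integer keys (e.g. 95) used to index `results["regions"]` follow the key layout documented above, and the toy data is illustrative only:

```python
from jax import random
from scribe.stats import compute_histogram_credible_regions

samples = random.poisson(random.PRNGKey(1), lam=5.0, shape=(200, 500))

results = compute_histogram_credible_regions(samples, credible_regions=[95, 68, 50])
edges = results["bin_edges"]
lower_95 = results["regions"][95]["lower"]    # per-bin 2.5th percentile
upper_95 = results["regions"][95]["upper"]    # per-bin 97.5th percentile
median_95 = results["regions"][95]["median"]  # per-bin median frequency
```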
- scribe.stats.compute_ecdf_percentiles(samples, percentiles=[5, 25, 50, 75, 95], sample_axis=0)[source]
Compute percentiles of ECDF values across multiple samples of integers.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1, containing raw data samples of positive integers
percentiles (list-like, optional) – List of percentiles to compute (default: [5, 25, 50, 75, 95])
sample_axis (int, optional) – Axis containing samples (default: 0)
- Returns:
bin_edges (array) – Array of integer points at which ECDFs were evaluated (from min to max)
ecdf_percentiles (array) – Array of shape (len(percentiles), len(bin_edges)) containing the percentiles of ECDF values at each integer point
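The ECDF analogue of the histogram example above; toy integer data, assuming only the documented signature:

```python
from jax import random
from scribe.stats import compute_ecdf_percentiles

samples = random.poisson(random.PRNGKey(2), lam=5.0, shape=(200, 500))

bin_edges, ecdf_percentiles = compute_ecdf_percentiles(samples, percentiles=[5, 50, 95])
# ecdf_percentiles[0] and ecdf_percentiles[2] bracket the ECDF value at each
# integer point in bin_edges; ecdf_percentiles[1] is the median ECDF
```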
- scribe.stats.compute_ecdf_credible_regions(samples, credible_regions=[95, 68, 50], sample_axis=0, batch_size=1000, max_bin=None)[source]
Compute credible regions of ECDF values across multiple samples.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1, containing raw data samples
credible_regions (list-like, optional) – List of credible region percentages to compute (default: [95, 68, 50]). For example, 95 computes the 2.5th and 97.5th percentiles
sample_axis (int, optional) – Axis containing samples (default: 0)
batch_size (int, optional) – Number of samples to process in each batch (default: 1000)
max_bin (int, optional) – Maximum value to include in ECDF evaluation (default: None)
- Returns:
Dictionary containing:
'bin_edges': array of points at which ECDFs were evaluated
'regions': nested dictionary where each key is a credible region percentage and each value is a dictionary containing:
'lower': lower bound of the credible region
'upper': upper bound of the credible region
'median': median (50th percentile)
- Return type:
dict
- scribe.stats.sample_dirichlet_from_parameters(parameter_samples, n_samples_dirichlet=1, rng_key=Array([0, 42], dtype=uint32))[source]
Samples from a Dirichlet distribution given an array of parameter samples.
- Parameters:
parameter_samples (array-like) – Array of shape (n_samples, n_variables) containing parameter samples to use as concentration parameters for Dirichlet distributions
n_samples_dirichlet (int, optional) – Number of samples to draw from each Dirichlet distribution (default: 1)
rng_key (random.PRNGKey) – JAX random number generator key. Defaults to random.PRNGKey(42)
- Returns:
- If n_samples_dirichlet=1:
Array of shape (n_samples, n_variables)
- If n_samples_dirichlet>1:
Array of shape (n_samples, n_variables, n_samples_dirichlet)
- Return type:
jnp.ndarray
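A sketch of the two output shapes; the Gamma-distributed toy concentration parameters are illustrative, and only the documented signature is assumed:

```python
from jax import random
from scribe.stats import sample_dirichlet_from_parameters

# 100 draws of concentration parameters for 5 variables (toy values)
parameter_samples = random.gamma(random.PRNGKey(3), 2.0, shape=(100, 5))

single = sample_dirichlet_from_parameters(parameter_samples)
multi = sample_dirichlet_from_parameters(parameter_samples, n_samples_dirichlet=10)
print(single.shape)   # (100, 5)
print(multi.shape)    # (100, 5, 10)
```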
- scribe.stats.fit_dirichlet_mle(samples, max_iter=1000, tol=1e-07, sample_axis=0)[source]
Fit a Dirichlet distribution to samples using Maximum Likelihood Estimation.
This implementation uses Newton’s method to find the concentration parameters that maximize the likelihood of the observed samples. The algorithm iteratively updates the concentration parameters using gradient and Hessian information until convergence.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_variables) by default, or (n_variables, n_samples) if sample_axis=1, containing Dirichlet samples. Each row/column should sum to 1.
max_iter (int, optional) – Maximum number of iterations for optimization (default: 1000)
tol (float, optional) – Tolerance for convergence in parameter updates (default: 1e-7)
sample_axis (int, optional) – Axis containing samples (default: 0)
- Returns:
Array of concentration parameters for the fitted Dirichlet distribution. Shape is (n_variables,).
- Return type:
jnp.ndarray
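A round-trip sketch: draw samples from a known Dirichlet and recover its concentration parameters. The toy parameter values are illustrative; only the documented signature is assumed:

```python
import jax.numpy as jnp
from jax import random
from scribe.stats import fit_dirichlet_mle

true_alpha = jnp.array([2.0, 5.0, 3.0])
samples = random.dirichlet(random.PRNGKey(4), true_alpha, shape=(5000,))  # (5000, 3)

alpha_hat = fit_dirichlet_mle(samples)
print(alpha_hat)   # should lie close to [2, 5, 3]
```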
- scribe.stats.digamma_inv(y, num_iters=5)[source]
Approximate the inverse of the digamma function using Newton iterations.
The digamma function ψ(x) is the derivative of the log of the gamma function. This function computes its inverse ψ⁻¹(y) using Newton’s method:
x_{n+1} = x_n - (ψ(x_n) - y) / ψ’(x_n)
where ψ’ is the trigamma function (polygamma of order 1).
- Parameters:
y (array-like) – The input value(s) (can be scalar or vector) representing ψ(x) values for which we want to find x.
num_iters (int, optional) – Number of Newton iterations (default: 5). More iterations increase accuracy but take longer to compute.
- Returns:
x – The approximate inverse digamma of y, such that ψ(x) ≈ y.
- Return type:
array-like
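A self-consistency sketch: digamma_inv should (approximately) invert the digamma function from jax.scipy.special:

```python
import jax.numpy as jnp
from jax.scipy.special import digamma
from scribe.stats import digamma_inv

y = digamma(jnp.array([0.5, 1.0, 2.0, 10.0]))
x = digamma_inv(y)                              # 5 Newton iterations by default
print(jnp.allclose(digamma(x), y, atol=1e-5))   # ψ(ψ⁻¹(y)) ≈ y
```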
- scribe.stats.fit_dirichlet_minka(samples, max_iter=1000, tol=1e-07, sample_axis=0)[source]
Fit a Dirichlet distribution to data using Minka’s fixed-point iteration.
- This function uses the relation:
ψ(α_j) - ψ(α₀) = ⟨ln x_j⟩ (with α₀ = ∑ₖ αₖ)
- so that the fixed point update is:
α_j ← ψ⁻¹( ψ(α₀) + ⟨ln x_j⟩ )
This method is generally more stable and faster than moment matching or maximum likelihood estimation via gradient descent.
- Parameters:
samples (array-like) – Data array with shape (n_samples, n_variables) by default (or transposed if sample_axis=1). Each row should sum to 1 (i.e., be a probability vector).
max_iter (int, optional) – Maximum number of iterations for the fixed-point algorithm.
tol (float, optional) – Tolerance for convergence; the algorithm stops when the maximum change in α falls below this value.
sample_axis (int, optional) – Axis containing samples (default: 0). Use 1 if data is (n_variables, n_samples).
- Returns:
Estimated concentration parameters (α) of shape (n_variables,).
- Return type:
jnp.ndarray
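Usage mirrors fit_dirichlet_mle; a comparison sketch on the same toy data, assuming only the documented signatures:

```python
import jax.numpy as jnp
from jax import random
from scribe.stats import fit_dirichlet_minka, fit_dirichlet_mle

true_alpha = jnp.array([2.0, 5.0, 3.0])
samples = random.dirichlet(random.PRNGKey(5), true_alpha, shape=(5000,))

print(fit_dirichlet_minka(samples))   # both estimates should lie near [2, 5, 3]
print(fit_dirichlet_mle(samples))
```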
- scribe.stats.kl_gamma(alpha1, beta1, alpha2, beta2)[source]
Compute Kullback-Leibler (KL) divergence between two Gamma distributions.
Calculates KL(P||Q) where P ~ Gamma(α₁, β₁) and Q ~ Gamma(α₂, β₂).
The KL divergence is given by:
KL(P||Q) = (α₁ - α₂)ψ(α₁) - ln[Γ(α₁)] + ln[Γ(α₂)] + α₂ln(β₁/β₂) + α₁(β₂/β₁ - 1)
where ψ(x) is the digamma function and Γ(x) is the gamma function.
- Parameters:
alpha1 (float or array-like) – Shape parameter α₁ of the first Gamma distribution P
beta1 (float or array-like) – Rate parameter β₁ of the first Gamma distribution P
alpha2 (float or array-like) – Shape parameter α₂ of the second Gamma distribution Q
beta2 (float or array-like) – Rate parameter β₂ of the second Gamma distribution Q
- Returns:
KL divergence between the two Gamma distributions
- Return type:
float or array-like
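A sanity-check sketch comparing kl_gamma against a direct transcription of the closed form above, and confirming KL(P||P) = 0 (toy scalar parameters):

```python
import jax.numpy as jnp
from jax.scipy.special import digamma, gammaln
from scribe.stats import kl_gamma

a1, b1, a2, b2 = 2.0, 1.0, 3.0, 0.5

# Direct transcription of the documented closed form
kl_manual = ((a1 - a2) * digamma(a1) - gammaln(a1) + gammaln(a2)
             + a2 * jnp.log(b1 / b2) + a1 * (b2 / b1 - 1.0))

print(kl_gamma(a1, b1, a2, b2), kl_manual)   # should agree
print(kl_gamma(a1, b1, a1, b1))              # 0 for identical distributions
```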
- scribe.stats.kl_beta(alpha1, beta1, alpha2, beta2)[source]
Compute Kullback-Leibler (KL) divergence between two Beta distributions.
Calculates KL(P||Q) where P ~ Beta(α₁, β₁) and Q ~ Beta(α₂, β₂).
The KL divergence is given by:
KL(P||Q) = ln[B(α₂,β₂)] - ln[B(α₁,β₁)] + (α₁-α₂)ψ(α₁) + (β₁-β₂)ψ(β₁) + (α₂-α₁+β₂-β₁)ψ(α₁+β₁)
where ψ(x) is the digamma function and B(x,y) is the beta function.
- Parameters:
alpha1 (float or array-like) – Shape parameter α₁ of the first Beta distribution P
beta1 (float or array-like) – Shape parameter β₁ of the first Beta distribution P
alpha2 (float or array-like) – Shape parameter α₂ of the second Beta distribution Q
beta2 (float or array-like) – Shape parameter β₂ of the second Beta distribution Q
- Returns:
KL divergence between the two Beta distributions
- Return type:
float or array-like
- scribe.stats.kl_lognormal(mu1, sigma1, mu2, sigma2)[source]
Compute Kullback-Leibler (KL) divergence between two log-normal distributions.
Calculates KL(P||Q) where P ~ LogNormal(μ₁, σ₁) and Q ~ LogNormal(μ₂, σ₂).
The KL divergence is given by:
KL(P||Q) = ln(σ₂/σ₁) + (σ₁² + (μ₁-μ₂)²)/(2σ₂²) - 1/2
- Parameters:
mu1 (float or array-like) – Location parameter μ₁ of the first log-normal distribution P
sigma1 (float or array-like) – Scale parameter σ₁ of the first log-normal distribution P
mu2 (float or array-like) – Location parameter μ₂ of the second log-normal distribution Q
sigma2 (float or array-like) – Scale parameter σ₂ of the second log-normal distribution Q
- Returns:
KL divergence between the two log-normal distributions
- Return type:
float or array-like
- scribe.stats.jensen_shannon_beta(alpha1, beta1, alpha2, beta2)[source]
Compute the Jensen-Shannon divergence between two Beta distributions.
The Jensen-Shannon divergence is a symmetrized and smoothed version of the Kullback-Leibler divergence, defined as:
JSD(P||Q) = 1/2 × KL(P||M) + 1/2 × KL(Q||M)
where M = 1/2 × (P + Q) is the average of the two distributions.
- For Beta distributions, we compute this by:
Finding the parameters of the mixture distribution M
Computing KL(P||M) and KL(Q||M)
Taking the average of these KL divergences
- Parameters:
alpha1 (float or array) – Alpha parameter (shape) of the first Beta distribution
beta1 (float or array) – Beta parameter (shape) of the first Beta distribution
alpha2 (float or array) – Alpha parameter (shape) of the second Beta distribution
beta2 (float or array) – Beta parameter (shape) of the second Beta distribution
- Returns:
Jensen-Shannon divergence between the two Beta distributions
- Return type:
float or array
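A quick symmetry sketch: the Jensen-Shannon divergence is symmetric in its arguments and vanishes for identical distributions. Toy parameters; only the documented signature is assumed:

```python
from scribe.stats import jensen_shannon_beta

a1, b1, a2, b2 = 2.0, 5.0, 4.0, 3.0
print(jensen_shannon_beta(a1, b1, a2, b2))
print(jensen_shannon_beta(a2, b2, a1, b1))   # same value: JSD(P||Q) = JSD(Q||P)
print(jensen_shannon_beta(a1, b1, a1, b1))   # 0 for identical distributions
```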
- scribe.stats.jensen_shannon_gamma(alpha1, beta1, alpha2, beta2)[source]
Compute the Jensen-Shannon divergence between two Gamma distributions.
The Jensen-Shannon divergence is a symmetrized and smoothed version of the Kullback-Leibler divergence, defined as:
JSD(P||Q) = 1/2 × KL(P||M) + 1/2 × KL(Q||M)
where M = 1/2 × (P + Q) is the average of the two distributions.
- For Gamma distributions, we compute this by:
Finding the parameters of the mixture distribution M
Computing KL(P||M) and KL(Q||M)
Taking the average of these KL divergences
- Parameters:
alpha1 (float or array) – Shape parameter α₁ of the first Gamma distribution
beta1 (float or array) – Rate parameter β₁ of the first Gamma distribution
alpha2 (float or array) – Shape parameter α₂ of the second Gamma distribution
beta2 (float or array) – Rate parameter β₂ of the second Gamma distribution
- Returns:
Jensen-Shannon divergence between the two Gamma distributions
- Return type:
float or array
- scribe.stats.jensen_shannon_lognormal(mu1, sigma1, mu2, sigma2)[source]
Compute the Jensen-Shannon divergence between two LogNormal distributions.
The Jensen-Shannon divergence is a symmetrized and smoothed version of the Kullback-Leibler divergence, defined as:
JSD(P||Q) = 1/2 × KL(P||M) + 1/2 × KL(Q||M)
where M = 1/2 × (P + Q) is the average of the two distributions.
- For LogNormal distributions, we compute this by:
Finding the parameters of the mixture distribution M
Computing KL(P||M) and KL(Q||M)
Taking the average of these KL divergences
- Parameters:
mu1 (float or array) – Location parameter of the first LogNormal distribution
sigma1 (float or array) – Scale parameter of the first LogNormal distribution
mu2 (float or array) – Location parameter of the second LogNormal distribution
sigma2 (float or array) – Scale parameter of the second LogNormal distribution
- Returns:
Jensen-Shannon divergence between the two LogNormal distributions
- Return type:
float or array
- scribe.stats.sq_hellinger_beta(alpha1, beta1, alpha2, beta2)[source]
Compute the squared Hellinger distance between two Beta distributions.
The squared Hellinger distance between two Beta distributions P and Q is given by:
H²(P,Q) = 1 - ∫ sqrt(P(x)Q(x)) dx
where P(x) and Q(x) are the probability density functions of P and Q, respectively.
For Beta distributions P ~ Beta(α₁,β₁) and Q ~ Beta(α₂,β₂), this has the closed form:
H²(P,Q) = 1 - B((α₁+α₂)/2, (β₁+β₂)/2) / sqrt(B(α₁,β₁) * B(α₂,β₂))
where B(x,y) is the beta function.
- Parameters:
alpha1 (float or array-like) – Shape parameter α₁ of the first Beta distribution P
beta1 (float or array-like) – Shape parameter β₁ of the first Beta distribution P
alpha2 (float or array-like) – Shape parameter α₂ of the second Beta distribution Q
beta2 (float or array-like) – Shape parameter β₂ of the second Beta distribution Q
- Returns:
Squared Hellinger distance between the two Beta distributions
- Return type:
float or array-like
- scribe.stats.hellinger_beta(alpha1, beta1, alpha2, beta2)[source]
Compute the Hellinger distance between two Beta distributions.
The Hellinger distance between two Beta distributions P and Q is given by:
H(P,Q) = sqrt(H²(P,Q))
where H²(P,Q) is the squared Hellinger distance.
For Beta distributions P ~ Beta(α₁,β₁) and Q ~ Beta(α₂,β₂), this has the closed form:
H(P,Q) = sqrt(1 - B((α₁+α₂)/2, (β₁+β₂)/2) / sqrt(B(α₁,β₁) * B(α₂,β₂)))
where B(x,y) is the beta function.
- Parameters:
alpha1 (float or array-like) – Shape parameter α₁ of the first Beta distribution P
beta1 (float or array-like) – Shape parameter β₁ of the first Beta distribution P
alpha2 (float or array-like) – Shape parameter α₂ of the second Beta distribution Q
beta2 (float or array-like) – Shape parameter β₂ of the second Beta distribution Q
- Returns:
Hellinger distance between the two Beta distributions
- Return type:
float or array-like
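A sketch of the relationship between the two Beta Hellinger helpers: hellinger_beta should equal the square root of sq_hellinger_beta (toy parameters only):

```python
import jax.numpy as jnp
from scribe.stats import sq_hellinger_beta, hellinger_beta

a1, b1, a2, b2 = 2.0, 5.0, 4.0, 3.0
print(jnp.allclose(hellinger_beta(a1, b1, a2, b2),
                   jnp.sqrt(sq_hellinger_beta(a1, b1, a2, b2))))   # True: H = sqrt(H²)
print(hellinger_beta(a1, b1, a1, b1))                              # 0 for identical distributions
```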
- scribe.stats.sq_hellinger_gamma(alpha1, beta1, alpha2, beta2)[source]
Compute the squared Hellinger distance between two Gamma distributions.
The squared Hellinger distance between two Gamma distributions P and Q is given by:
H²(P,Q) = 1 - ∫ sqrt(P(x)Q(x)) dx
where P(x) and Q(x) are the probability density functions of P and Q, respectively.
For Gamma distributions P ~ Gamma(α₁,β₁) and Q ~ Gamma(α₂,β₂), this has the closed form:
H²(P,Q) = 1 - Γ((α₁+α₂)/2) * ((β₁+β₂)/2)^(-(α₁+α₂)/2) * sqrt((β₁^α₁ * β₂^α₂)/(Γ(α₁) * Γ(α₂)))
where Γ(x) is the gamma function.
- Parameters:
alpha1 (float or array-like) – Shape parameter α₁ of the first Gamma distribution P
beta1 (float or array-like) – Rate parameter β₁ of the first Gamma distribution P
alpha2 (float or array-like) – Shape parameter α₂ of the second Gamma distribution Q
beta2 (float or array-like) – Rate parameter β₂ of the second Gamma distribution Q
- Returns:
Squared Hellinger distance between the two Gamma distributions
- Return type:
float or array-like
- scribe.stats.hellinger_gamma(alpha1, beta1, alpha2, beta2)[source]
Compute the Hellinger distance between two Gamma distributions.
The Hellinger distance between two Gamma distributions P and Q is given by:
H(P,Q) = sqrt(H²(P,Q))
where H²(P,Q) is the squared Hellinger distance.
For Gamma distributions P ~ Gamma(α₁,β₁) and Q ~ Gamma(α₂,β₂), this has the closed form:
H(P,Q) = sqrt(1 - Γ((α₁+α₂)/2) * ((β₁+β₂)/2)^(-(α₁+α₂)/2) * sqrt((β₁^α₁ * β₂^α₂)/(Γ(α₁) * Γ(α₂))))
where Γ(x) is the gamma function.
- Parameters:
alpha1 (float or array-like) – Shape parameter α₁ of the first Gamma distribution P
beta1 (float or array-like) – Rate parameter β₁ of the first Gamma distribution P
alpha2 (float or array-like) – Shape parameter α₂ of the second Gamma distribution Q
beta2 (float or array-like) – Rate parameter β₂ of the second Gamma distribution Q
- Returns:
Hellinger distance between the two Gamma distributions
- Return type:
float or array-like
- scribe.stats.sq_hellinger_lognormal(mu1, sigma1, mu2, sigma2)[source]
Compute the squared Hellinger distance between two log-normal distributions.
The squared Hellinger distance between two log-normal distributions P and Q is given by:
H²(P,Q) = 1 - sqrt(2σ₁σ₂/(σ₁² + σ₂²)) * exp(-(μ₁-μ₂)²/(4(σ₁² + σ₂²)))
- Parameters:
mu1 (float or array-like) – Location parameter μ₁ of the first log-normal distribution P
sigma1 (float or array-like) – Scale parameter σ₁ of the first log-normal distribution P
mu2 (float or array-like) – Location parameter μ₂ of the second log-normal distribution Q
sigma2 (float or array-like) – Scale parameter σ₂ of the second log-normal distribution Q
- Returns:
Squared Hellinger distance between the two log-normal distributions
- Return type:
float or array-like
- scribe.stats.hellinger_lognormal(mu1, sigma1, mu2, sigma2)[source]
Compute the Hellinger distance between two log-normal distributions.
The Hellinger distance is the square root of the squared Hellinger distance.
- Parameters:
mu1 (float or array-like) – Location parameter μ₁ of the first log-normal distribution P
sigma1 (float or array-like) – Scale parameter σ₁ of the first log-normal distribution P
mu2 (float or array-like) – Location parameter μ₂ of the second log-normal distribution Q
sigma2 (float or array-like) – Scale parameter σ₂ of the second log-normal distribution Q
- Returns:
Hellinger distance between the two log-normal distributions
- Return type:
float or array-like
- scribe.stats.beta_mode(alpha, beta)[source]
Calculate the mode for a Beta distribution.
- For Beta(α,β) distribution, the mode is:
(α-1)/(α+β-2) when α, β > 1
undefined when α ≤ 1 or β ≤ 1
- scribe.stats.gamma_mode(alpha, beta)[source]
Calculate the mode for a Gamma distribution.
- For Gamma(α,β) distribution, the mode is:
(α-1)/β when α > 1
0 when α ≤ 1
- scribe.stats.dirichlet_mode(alpha)[source]
Calculate the mode for a Dirichlet distribution.
For Dirichlet(α) distribution with concentration parameters α, the mode for each component is:
(αᵢ-1)/(∑ⱼαⱼ-K) when αᵢ > 1 for all i
0 when αᵢ ≤ 1 for any i
where K is the number of components.
- Parameters:
alpha (array-like) – Concentration parameters α of the Dirichlet distribution
- Returns:
Mode of the Dirichlet distribution
- Return type:
array-like
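A sketch of the mode helpers against their closed forms above (toy parameters; only the documented signatures are assumed):

```python
import jax.numpy as jnp
from scribe.stats import beta_mode, gamma_mode, dirichlet_mode

print(beta_mode(2.0, 5.0))    # (2-1)/(2+5-2) = 0.2
print(gamma_mode(3.0, 2.0))   # (3-1)/2 = 1.0

alpha = jnp.array([2.0, 5.0, 3.0])
print(dirichlet_mode(alpha))  # (αᵢ-1)/(∑ⱼαⱼ-K) with K=3 → [1/7, 4/7, 2/7]
```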