stats
Statistics functions and distributions for SCRIBE.
- scribe.stats.compute_histogram_percentiles(samples, percentiles=[5, 25, 50, 75, 95], normalize=True, sample_axis=0)[source]
Compute percentiles of histogram frequencies across multiple samples.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1
percentiles (list-like, optional) – List of percentiles to compute (default: [5, 25, 50, 75, 95])
normalize (bool, optional) – Whether to normalize histograms (default: True)
sample_axis (int, optional) – Axis containing samples (default: 0)
- Returns:
bin_edges (array) – Array of bin edges (integers from min to max value + 1)
hist_percentiles (array) – Array of shape (len(percentiles), len(bin_edges)-1) containing the percentiles of histogram frequencies for each bin
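The shape contract above can be illustrated with a plain NumPy sketch (not SCRIBE's implementation, which operates on JAX arrays): every sample is binned on shared integer edges from the global minimum to the maximum value + 1, frequencies are optionally normalized per sample, and percentiles are taken across samples for each bin.

```python
import numpy as np

def histogram_percentiles(samples, percentiles=(5, 25, 50, 75, 95), normalize=True):
    """Illustrative NumPy version of the computation described above."""
    samples = np.asarray(samples)
    # Shared integer bin edges from the global min to the max value + 1
    bin_edges = np.arange(samples.min(), samples.max() + 2)
    # One histogram per sample, all on the same bins
    hists = np.stack([np.histogram(s, bins=bin_edges)[0] for s in samples])
    if normalize:
        hists = hists / hists.sum(axis=1, keepdims=True)
    # Percentiles of the per-bin frequencies, taken across samples
    return bin_edges, np.percentile(hists, percentiles, axis=0)
```

The returned array has shape (len(percentiles), len(bin_edges) - 1), matching the documented contract.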
- scribe.stats.compute_histogram_credible_regions(samples, credible_regions=[95, 68, 50], normalize=True, sample_axis=0, batch_size=1000, max_bin=None)[source]
Compute credible regions of histogram frequencies across multiple samples.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1
credible_regions (list-like, optional) – List of credible region percentages to compute (default: [95, 68, 50]). For example, 95 computes the 2.5th and 97.5th percentiles
normalize (bool, optional) – Whether to normalize histograms (default: True)
sample_axis (int, optional) – Axis containing samples (default: 0)
batch_size (int, optional) – Number of samples to process in each batch (default: 1000)
max_bin (int, optional) – Maximum number of bins to process (default: None)
- Returns:
Dictionary containing:
‘bin_edges’: array of bin edges
‘regions’: nested dictionary keyed by credible region percentage, where each value is a dictionary containing:
’lower’: lower bound of the credible region
’upper’: upper bound of the credible region
’median’: median (50th percentile)
- Return type:
dict
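The mapping from a credible-region percentage to a percentile pair, and the nested shape of the ‘regions’ dictionary, can be sketched as follows (an illustrative NumPy stand-in, not the SCRIBE internals; `credible_region_bounds` and `freqs` are hypothetical names):

```python
import numpy as np

def credible_region_bounds(freqs, credible_regions=(95, 68, 50)):
    """Illustrative: per-bin credible bounds from (n_samples, n_bins) frequencies."""
    regions = {}
    for cr in credible_regions:
        lo = (100 - cr) / 2  # e.g. a 95% region uses the 2.5th percentile
        regions[cr] = {
            "lower": np.percentile(freqs, lo, axis=0),
            "upper": np.percentile(freqs, 100 - lo, axis=0),
            "median": np.percentile(freqs, 50, axis=0),
        }
    return regions
```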
- scribe.stats.compute_ecdf_percentiles(samples, percentiles=[5, 25, 50, 75, 95], sample_axis=0)[source]
Compute percentiles of ECDF values across multiple samples of integers.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1, containing raw data samples of positive integers
percentiles (list-like, optional) – List of percentiles to compute (default: [5, 25, 50, 75, 95])
sample_axis (int, optional) – Axis containing samples (default: 0)
- Returns:
bin_edges (array) – Array of integer points at which ECDFs were evaluated (from min to max)
ecdf_percentiles (array) – Array of shape (len(percentiles), len(bin_edges)) containing the percentiles of ECDF values at each integer point
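A minimal NumPy sketch of this computation (again not the SCRIBE implementation): each sample's ECDF is evaluated at every integer point between the global min and max, then percentiles are taken across samples at each point.

```python
import numpy as np

def ecdf_percentiles(samples, percentiles=(5, 25, 50, 75, 95)):
    """Illustrative: percentiles of per-sample ECDFs at each integer point."""
    samples = np.asarray(samples)
    points = np.arange(samples.min(), samples.max() + 1)
    # ECDF of each sample row at every integer point, via broadcasting:
    # (n_samples, 1, n_points) <= (1, n_eval, 1) -> mean over the data axis
    ecdfs = (samples[:, None, :] <= points[None, :, None]).mean(axis=-1)
    return points, np.percentile(ecdfs, percentiles, axis=0)
```

Note that every ECDF reaches exactly 1 at the largest evaluation point, so the output has shape (len(percentiles), len(points)) with values in [0, 1].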
- scribe.stats.compute_ecdf_credible_regions(samples, credible_regions=[95, 68, 50], sample_axis=0, batch_size=1000, max_bin=None)[source]
Compute credible regions of ECDF values across multiple samples.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_points) by default, or (n_points, n_samples) if sample_axis=1, containing raw data samples
credible_regions (list-like, optional) – List of credible region percentages to compute (default: [95, 68, 50]). For example, 95 computes the 2.5th and 97.5th percentiles
sample_axis (int, optional) – Axis containing samples (default: 0)
batch_size (int, optional) – Number of samples to process in each batch (default: 1000)
max_bin (int, optional) – Maximum value to include in ECDF evaluation (default: None)
- Returns:
Dictionary containing:
’bin_edges’: array of points at which ECDFs were evaluated
’regions’: nested dictionary keyed by credible region percentage, where each value is a dictionary containing:
’lower’: lower bound of the credible region
’upper’: upper bound of the credible region
’median’: median (50th percentile)
- Return type:
dict
- scribe.stats.sample_dirichlet_from_parameters(parameter_samples, n_samples_dirichlet=1, rng_key=None)[source]
Samples from a Dirichlet distribution given an array of parameter samples.
- Parameters:
parameter_samples (array-like) – Array of shape (n_samples, n_variables) containing parameter samples to use as concentration parameters for Dirichlet distributions
n_samples_dirichlet (int, optional) – Number of samples to draw from each Dirichlet distribution (default: 1)
rng_key (random.PRNGKey, optional) – JAX random number generator key. Defaults to random.PRNGKey(42) if None
- Returns:
- If n_samples_dirichlet=1:
Array of shape (n_samples, n_variables)
- If n_samples_dirichlet>1:
Array of shape (n_samples, n_variables, n_samples_dirichlet)
- Return type:
jnp.ndarray
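The n_samples_dirichlet=1 case can be mimicked with NumPy (a sketch of the documented behavior, not SCRIBE's JAX code): each row of the parameter array acts as the concentration vector for one Dirichlet draw.

```python
import numpy as np

rng = np.random.default_rng(0)
# 10 parameter draws, each a concentration vector for 3 variables
parameter_samples = rng.uniform(0.5, 5.0, size=(10, 3))
# One Dirichlet draw per parameter row (the n_samples_dirichlet=1 case),
# giving an output of shape (n_samples, n_variables)
draws = np.stack([rng.dirichlet(alpha) for alpha in parameter_samples])
```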
- scribe.stats.fit_dirichlet_mle(samples, max_iter=1000, tol=1e-07, sample_axis=0)[source]
Fit a Dirichlet distribution to samples using Maximum Likelihood Estimation.
This implementation uses Newton’s method to find the concentration parameters that maximize the likelihood of the observed samples. The algorithm iteratively updates the concentration parameters using gradient and Hessian information until convergence.
- Parameters:
samples (array-like) – Array of shape (n_samples, n_variables) by default, or (n_variables, n_samples) if sample_axis=1, containing Dirichlet samples. Each row/column should sum to 1.
max_iter (int, optional) – Maximum number of iterations for optimization (default: 1000)
tol (float, optional) – Tolerance for convergence in parameter updates (default: 1e-7)
sample_axis (int, optional) – Axis containing samples (default: 0)
- Returns:
Array of concentration parameters for the fitted Dirichlet distribution. Shape is (n_variables,).
- Return type:
jnp.ndarray
- scribe.stats.fit_dirichlet_minka(samples, max_iter=1000, tol=1e-07, sample_axis=0)[source]
Fit a Dirichlet distribution to data using Minka’s fixed-point iteration.
- This function uses the relation:
ψ(α_j) - ψ(α₀) = ⟨ln x_j⟩ (with α₀ = ∑ₖ αₖ)
- so that the fixed point update is:
α_j ← ψ⁻¹( ψ(α₀) + ⟨ln x_j⟩ )
This method is generally more stable and faster than moment matching or maximum likelihood estimation via gradient descent.
- Parameters:
samples (array-like) – Data array with shape (n_samples, n_variables) by default (or transposed if sample_axis=1). Each row should sum to 1 (i.e., be a probability vector).
max_iter (int, optional) – Maximum number of iterations for the fixed-point algorithm.
tol (float, optional) – Tolerance for convergence - algorithm stops when max change in α is below this.
sample_axis (int, optional) – Axis containing samples (default: 0). Use 1 if data is (n_variables, n_samples).
- Returns:
Estimated concentration parameters (α) of shape (n_variables,).
- Return type:
jnp.ndarray
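Minka's fixed-point update described above can be sketched in NumPy/SciPy (an illustrative re-implementation under the stated equations, not SCRIBE's JAX code; `inv_digamma` approximates ψ⁻¹ with Minka's initialization plus Newton refinement):

```python
import numpy as np
from scipy.special import digamma, polygamma

def inv_digamma(y, n_newton=5):
    # Minka's initialization for the inverse digamma, refined by Newton steps
    x = np.where(y >= -2.22, np.exp(y) + 0.5, -1.0 / (y - digamma(1.0)))
    for _ in range(n_newton):
        x = x - (digamma(x) - y) / polygamma(1, x)
    return x

def fit_dirichlet_minka_np(samples, max_iter=1000, tol=1e-7):
    """Illustrative NumPy version of the fixed-point iteration described above."""
    log_x_bar = np.log(samples).mean(axis=0)  # the sufficient statistic <ln x_j>
    alpha = np.ones(samples.shape[1])         # simple uniform initialization
    for _ in range(max_iter):
        # Fixed-point update: alpha_j <- psi^-1( psi(alpha_0) + <ln x_j> )
        new_alpha = inv_digamma(digamma(alpha.sum()) + log_x_bar)
        if np.max(np.abs(new_alpha - alpha)) < tol:
            return new_alpha
        alpha = new_alpha
    return alpha
```

On synthetic Dirichlet data the iteration typically converges in a handful of steps and recovers the generating concentrations to within sampling error.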
- class scribe.stats.BetaPrime(concentration1, concentration0, validate_args=None)[source]
Bases: Distribution
Beta Prime distribution (odds-of-Beta convention).
Convention
If p ~ Beta(α, β) and φ = (1 - p) / p (odds of “success” 1 - p), then φ ~ BetaPrime(α, β) in THIS CLASS.
Implementation detail
- Mathematically, φ has the standard Beta-prime with swapped parameters:
φ ~ BetaPrime_std(β, α).
This class accepts (α, β) at the call site and internally uses (β, α), so that your models can pass (α, β) unchanged. This is necessary because the NumPyro NegativeBinomial distribution expects the probs parameter to be the failure probability p, so that the odds ratio φ = (1 - p) / p is consistent with the parameterization of the BetaPrime distribution.
Density (with user parameters α, β)
f(φ; α, β) = φ^(β - 1) * (1 + φ)^(-(α + β)) / B(β, α), φ > 0
Note the Beta function arguments B(β, α).
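The stated convention can be checked by Monte Carlo against SciPy's standard Beta-prime parameterization (a sketch, independent of this class: `scipy.stats.betaprime(a, b)` is the standard form, so the swapped-parameter claim means φ should match `betaprime(β, α)`):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta = 5.0, 3.0             # (α, β) as passed to this class
p = rng.beta(alpha, beta, size=200_000)
phi = (1.0 - p) / p                # odds of 1 - p, as in the convention above
# Per the implementation detail, phi follows the STANDARD Beta-prime with
# swapped parameters, i.e. BetaPrime_std(beta, alpha)
ref = stats.betaprime(beta, alpha)
```

Summary statistics of `phi` should agree with `ref` up to Monte Carlo error.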
- param concentration1:
α (matches the Beta prior’s first shape)
- type concentration1:
jnp.ndarray
- param concentration0:
β (matches the Beta prior’s second shape)
- type concentration0:
jnp.ndarray
- arg_constraints: dict[str, Any] = {'concentration0': Positive(lower_bound=0.0), 'concentration1': Positive(lower_bound=0.0)}
- support = Positive(lower_bound=0.0)
- has_rsample = False
- sample(key, sample_shape=())[source]
Returns a sample from the distribution having shape given by sample_shape + batch_shape + event_shape. Note that when sample_shape is non-empty, leading dimensions (of size sample_shape) of the returned sample will be filled with iid draws from the distribution instance.
- Parameters:
key (jax.random.PRNGKey) – the rng_key key to be used for the distribution.
sample_shape (tuple) – the sample shape for the distribution.
- Returns:
an array of shape sample_shape + batch_shape + event_shape
- Return type:
jnp.ndarray
- log_prob(value)[source]
Evaluates the log probability density for a batch of samples given by value.
- Parameters:
value – A batch of samples from the distribution.
- Returns:
an array with shape value.shape[:-self.event_shape]
- Return type:
ArrayLike
- property mean
Mean of the distribution.
- property variance
Variance of the distribution.
- property mode
- property concentration1
Access to concentration1 parameter (α) for NumPyro compatibility.
- property concentration0
Access to concentration0 parameter (β) for NumPyro compatibility.
- class scribe.stats.LowRankLogisticNormal(loc, cov_factor, cov_diag, validate_args=None)[source]
Bases: Distribution
Low-rank Logistic-Normal distribution for compositional data.
This distribution models D-dimensional probability vectors (on the simplex) using a (D-1)-dimensional low-rank multivariate normal distribution in log-ratio space. It uses the Additive Log-Ratio (ALR) transformation.
Mathematical Definition
Let y ∈ ℝ^(D-1) ~ MVN(μ, Σ) where Σ = WW^T + diag(D) is low-rank. The ALR transformation maps y to the simplex Δ^D:
xᵢ = exp(yᵢ) / (1 + Σⱼ exp(yⱼ)) for i = 1, …, D-1
x_D = 1 / (1 + Σⱼ exp(yⱼ))
The inverse transformation (simplex to log-ratio space) is:
yᵢ = log(xᵢ / x_D) for i = 1, …, D-1
The Jacobian of the transformation has log-determinant:
log|det(J)| = -Σᵢ₌₁^D log(xᵢ)
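The ALR forward and inverse maps above can be sketched directly in NumPy (illustrative helper functions, not part of the class API):

```python
import numpy as np

def alr_forward(y):
    """Map y in R^(D-1) to the simplex (the last component is the reference)."""
    e = np.exp(y)
    denom = 1.0 + e.sum()
    return np.concatenate([e / denom, [1.0 / denom]])

def alr_inverse(x):
    """Map a simplex point back to log-ratio space: y_i = log(x_i / x_D)."""
    return np.log(x[:-1] / x[-1])

y = np.array([0.3, -1.2, 0.7, 0.0])  # D - 1 = 4, so D = 5
x = alr_forward(y)
```

The round trip `alr_inverse(alr_forward(y))` recovers `y`, which is exactly why this transformation admits a log_prob while softmax does not.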
Low-Rank Covariance Structure
The covariance matrix has the form:
Σ = WW^T + diag(D)
- where:
W is a (D-1) × rank factor matrix
D is a (D-1)-dimensional diagonal vector
Memory: O((D-1) × rank) vs O((D-1)²) for full covariance
This is critical for large D (e.g., 30K+ genes) where storing full covariance is prohibitive.
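The memory savings come from never materializing Σ: a draw from MVN(μ, WWᵀ + diag(d)) only needs matrix-vector products, as in this NumPy sketch (illustrative, with small sizes; SCRIBE delegates this to the low-rank MVN machinery):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, rank = 6, 2                        # illustrative sizes
W = rng.normal(size=(dim, rank)) * 0.3  # low-rank factor
d = np.full(dim, 0.5)                   # diagonal component
mu = np.zeros(dim)

# Draw y = mu + W @ eps1 + sqrt(d) * eps2 without forming the full covariance;
# by construction Cov(y) = W @ W.T + diag(d)
n = 50_000
eps1 = rng.normal(size=(rank, n))
eps2 = rng.normal(size=(dim, n))
y = mu[:, None] + W @ eps1 + np.sqrt(d)[:, None] * eps2
```

The empirical covariance of `y` matches WWᵀ + diag(d) up to sampling noise, at O(dim × rank) storage.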
Asymmetry and Reference Component
The ALR transformation treats the last component (x_D) as a reference. This means the distribution is NOT symmetric under permutation of components. If you need symmetry, use SoftmaxNormal instead (but note that SoftmaxNormal cannot compute log_prob).
- param loc:
Location parameter μ ∈ ℝ^(D-1) (mean in log-ratio space)
- type loc:
jnp.ndarray
- param cov_factor:
Low-rank factor matrix W of shape (D-1, rank)
- type cov_factor:
jnp.ndarray
- param cov_diag:
Diagonal component D of shape (D-1,)
- type cov_diag:
jnp.ndarray
- param validate_args:
Whether to validate input arguments
- type validate_args:
bool, optional
Examples
>>> from jax import random
>>> import jax.numpy as jnp
>>> # Create a low-rank logistic-normal for 5-dimensional simplex
>>> D = 5
>>> rank = 2
>>> loc = jnp.zeros(D - 1)
>>> cov_factor = jnp.ones((D - 1, rank)) * 0.1
>>> cov_diag = jnp.ones(D - 1) * 0.5
>>> dist = LowRankLogisticNormal(loc, cov_factor, cov_diag)
>>> # Sample from the distribution (returns D-dimensional simplex points)
>>> samples = dist.sample(random.PRNGKey(0), (100,))
>>> samples.shape
(100, 5)
>>> # Samples sum to 1
>>> jnp.allclose(samples.sum(axis=-1), 1.0)
True
>>> # Evaluate log probability
>>> log_p = dist.log_prob(samples[0])
References
Aitchison, J., & Shen, S. M. (1980). Logistic-normal distributions: Some properties and uses. Biometrika, 67(2), 261-272.
Aitchison, J. (1986). The Statistical Analysis of Compositional Data. Chapman & Hall.
See also
SoftmaxNormal
Symmetric alternative using softmax (no log_prob available)
- arg_constraints: dict[str, Any] = {'cov_diag': Positive(lower_bound=0.0), 'cov_factor': IndependentConstraint(Real(), 2), 'loc': RealVector(Real(), 1)}
- support = Simplex()
- has_rsample = False
- sample(key, sample_shape=())[source]
Sample from the distribution.
Returns samples on the D-dimensional simplex.
- class scribe.stats.SoftmaxNormal(loc, cov_factor, cov_diag, validate_args=None)[source]
Bases: Distribution
Softmax-Normal distribution for compositional data (symmetric).
This distribution models D-dimensional probability vectors (on the simplex) using a D-dimensional low-rank multivariate normal distribution with a softmax transformation. Unlike LowRankLogisticNormal (which uses ALR), this treats all components symmetrically.
Mathematical Definition
Let y ∈ ℝ^D ~ MVN(μ, Σ) where Σ = WW^T + diag(D) is low-rank. The softmax transformation maps y to the simplex Δ^D:
xᵢ = exp(yᵢ) / Σⱼ exp(yⱼ) for i = 1, …, D
Symmetry and Invariance
The softmax transformation is:
- Symmetric: All components treated equally (no reference component)
- Translation-invariant: softmax(y + c·1) = softmax(y) for any constant c
This translation invariance means the transformation is SINGULAR - you cannot uniquely invert it. Therefore, log_prob() is not available.
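The singularity is easy to see numerically (a minimal NumPy check, not using this class): shifting every coordinate by the same constant leaves the softmax output unchanged, so the map cannot be inverted uniquely.

```python
import numpy as np

def softmax(y):
    z = np.exp(y - y.max())  # subtract the max for numerical stability
    return z / z.sum()

y = np.array([0.5, -1.0, 2.0])
shifted = softmax(y + 3.7)   # adding any constant leaves the output unchanged
```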
Low-Rank Covariance Structure
The covariance matrix has the form:
Σ = WW^T + diag(D)
- where:
W is a D × rank factor matrix
D is a D-dimensional diagonal vector
Memory: O(D × rank) vs O(D²) for full covariance
When to Use
- Use SoftmaxNormal when:
You want symmetric treatment of all components
You only need sampling (not log_prob evaluation)
You’re summarizing/visualizing posterior distributions
- Use LowRankLogisticNormal when:
You need to evaluate log_prob() for observed data
You’re using the distribution as a likelihood in Bayesian inference
Asymmetry (reference component) is acceptable
- param loc:
Location parameter μ ∈ ℝ^D (mean in log-space)
- type loc:
jnp.ndarray
- param cov_factor:
Low-rank factor matrix W of shape (D, rank)
- type cov_factor:
jnp.ndarray
- param cov_diag:
Diagonal component D of shape (D,)
- type cov_diag:
jnp.ndarray
- param validate_args:
Whether to validate input arguments
- type validate_args:
bool, optional
Examples
>>> import jax
>>> from jax import random
>>> import jax.numpy as jnp
>>> # Create a softmax-normal for 5-dimensional simplex
>>> D = 5
>>> rank = 2
>>> loc = jnp.zeros(D)
>>> cov_factor = jnp.ones((D, rank)) * 0.1
>>> cov_diag = jnp.ones(D) * 0.5
>>> dist = SoftmaxNormal(loc, cov_factor, cov_diag)
>>> # Sample from the distribution
>>> samples = dist.sample(random.PRNGKey(0), (100,))
>>> samples.shape
(100, 5)
>>> # Samples sum to 1
>>> jnp.allclose(samples.sum(axis=-1), 1.0)
True
>>> # Access underlying log-space distribution
>>> log_samples = dist.base_dist.sample(random.PRNGKey(1), (100,))
>>> # Apply softmax manually
>>> manual_samples = jax.nn.softmax(log_samples, axis=-1)
See also
LowRankLogisticNormal
ALR-based alternative with log_prob() available
- arg_constraints: dict[str, Any] = {'cov_diag': Positive(lower_bound=0.0), 'cov_factor': IndependentConstraint(Real(), 2), 'loc': RealVector(Real(), 1)}
- support = Simplex()
- has_rsample = False
- sample(key, sample_shape=())[source]
Sample from the distribution.
Returns samples on the D-dimensional simplex.
- log_prob(value)[source]
Evaluate log probability density.
NOT IMPLEMENTED: The softmax transformation is singular (adding a constant to all log-space coordinates doesn’t change the output), so the Jacobian determinant is zero and log_prob is undefined.
- Parameters:
value (jnp.ndarray) – Points on the simplex
- Raises:
NotImplementedError – Always raised. Use LowRankLogisticNormal if you need log_prob(), or access base_dist.log_prob() for log-space density.