`scribe-visualize` CLI Guide¶

scribe-visualize is the post-inference visualization CLI for SCRIBE. It reads completed inference outputs (a result *.pkl file plus .hydra/config.yaml in the same run directory) and generates a suite of diagnostic plots, ranging from quick training summaries to detailed posterior predictive checks.
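
As a pre-flight check, the input requirement above can be sketched in shell. The helper name `is_scribe_run` is hypothetical and not part of SCRIBE; it only tests for the two inputs described above (at least one `*.pkl` file, and `.hydra/config.yaml` in the same directory) and says nothing about whether the files are valid.

```shell
# Hypothetical helper (not part of SCRIBE): succeeds when a directory
# holds at least one *.pkl file and a .hydra/config.yaml next to it.
is_scribe_run() {
    run_dir=$1
    [ -f "$run_dir/.hydra/config.yaml" ] || return 1
    for f in "$run_dir"/*.pkl; do
        # If the glob matched nothing, $f is the literal pattern and -f fails.
        [ -f "$f" ] && return 0
    done
    return 1
}

# Demo on a throwaway directory so the snippet runs without real outputs.
demo=$(mktemp -d)
mkdir -p "$demo/good/.hydra" "$demo/bad"
touch "$demo/good/scribe_results.pkl" "$demo/good/.hydra/config.yaml"
touch "$demo/bad/scribe_results.pkl"   # missing .hydra/config.yaml

for d in "$demo/good" "$demo/bad"; do
    if is_scribe_run "$d"; then
        echo "ready:   $d"
    else
        echo "skipped: $d"
    fi
done
```

A check like this can be useful before a large batch, since it separates incomplete run directories from finished ones without loading any results.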

Like scribe-infer, it supports recursive directory traversal, wildcard patterns, and SLURM submission for batch visualization of many runs at once.

Installation¶

The CLI uses the same Hydra dependency group as scribe-infer:

pip install 'scribe[hydra]'

Quick start¶

# Generate default plots (loss curve and ECDF) for a single run
scribe-visualize outputs/my_run

# Use an explicit result file path (custom filename supported)
scribe-visualize outputs/my_run/custom_results.pkl

# Generate all available plots
scribe-visualize outputs/my_run --all

# Recursively visualize every run under a directory
scribe-visualize outputs/ --recursive --all

# Recursively match custom result filenames
scribe-visualize outputs/ --recursive "*_results.pkl" --all

Available plots¶

scribe-visualize provides 13 plot types. Two are enabled by default (loss, ECDF); the rest are opt-in via flags or --all.

Default plots (on unless `--no-*` is passed)¶

Plot	Flag	Description
Loss history	`--no-loss` to disable	ELBO / loss curve over SVI steps; MCMC diagnostics for MCMC runs
ECDF	`--no-ecdf` to disable	Empirical CDF of observed counts for a sample of genes

Opt-in plots (off unless explicitly enabled)¶

Plot	Flag	Description	Requirements
Posterior Predictive Check (PPC)	`--ppc`	Grid of per-gene histograms comparing observed counts to posterior predictive samples	---
Biological PPC	`--bio-ppc`	NB(r, p) credible bands overlaid with denoised data histograms	---
UMAP	`--umap`	Joint UMAP embedding of observed and synthetic (PPC) data	---
Correlation heatmap	`--heatmap`	Posterior gene-gene correlation matrix	---
Mixture PPC	`--mixture-ppc`	Per-component posterior predictive check	Mixture model (`n_components >= 2`)
Mixture composition	`--mixture-composition`	MAP component assignment barplot	Mixture model
Annotation PPC	`--annotation-ppc`	Per-annotation-label posterior predictive	Mixture model + `annotation_key`
Capture anchor	`--capture-anchor`	Eta capture-anchor diagnostic (prior vs. posterior)	Biology-informed capture prior
p_capture scaling	`--p-capture-scaling`	Capture probability vs. library size	VCP model
Mean calibration	`--mean-calibration`	Log-log scatter of observed vs. predicted per-gene means	---
Mean pairwise	`--mean-pairwise`	Pairwise dataset-level mean comparison	Multi-dataset model

Use --all to enable every plot at once.
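
When only some opt-in plots apply to a run, the Requirements column can guide which flags to pass. The sketch below assembles a flag list from two hypothetical variables, `n_components` and `has_annotation_key`, which stand in for properties of the run; it does not read them from the run's configuration. The guide does not say what happens when a flag is requested for a run that does not meet its requirement, so the sketch simply avoids that case.

```shell
# Hypothetical run properties; in practice these would come from the
# run's own configuration, which this sketch does not parse.
n_components=3
has_annotation_key=no

# Opt-in plots with no listed requirements.
set -- --ppc --mean-calibration

# Mixture-only plots need n_components >= 2.
if [ "$n_components" -ge 2 ]; then
    set -- "$@" --mixture-ppc --mixture-composition
    # Annotation PPC additionally needs an annotation_key.
    if [ "$has_annotation_key" = yes ]; then
        set -- "$@" --annotation-ppc
    fi
fi

# Dry run: print the resulting command instead of executing it.
echo scribe-visualize outputs/my_run "$@"
```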

Plot gallery¶

Example plots coming soon

Each plot type will have an example figure here. Placeholders are included below for reference.

Loss history¶

Placeholder --- example loss curve will be added here.

ECDF¶

Placeholder --- example ECDF plot will be added here.

Posterior Predictive Check (PPC)¶

Placeholder --- example PPC grid will be added here.

Biological PPC¶

Placeholder --- example bio-PPC plot will be added here.

UMAP¶

Placeholder --- example UMAP overlay will be added here.

Correlation heatmap¶

Placeholder --- example correlation heatmap will be added here.

Mixture PPC¶

Placeholder --- example mixture PPC will be added here.

Mixture composition¶

Placeholder --- example component assignment barplot will be added here.

Annotation PPC¶

Placeholder --- example per-annotation PPC will be added here.

Capture anchor¶

Placeholder --- example capture-anchor diagnostic will be added here.

p_capture scaling¶

Placeholder --- example capture-probability-vs-library-size will be added here.

Mean calibration¶

Placeholder --- example observed-vs-predicted scatter will be added here.

Mean pairwise¶

Placeholder --- example pairwise dataset-level mean comparison will be added here.

Customization options¶

Many plots accept fine-tuning parameters via CLI flags:

Flag	Default	Description
`--format`	`png`	Output format: `png`, `pdf`, `svg`, `eps`
`--ecdf-genes`	`25`	Number of genes shown in the ECDF panel
`--ppc-rows`	`5`	Rows in the PPC grid
`--ppc-cols`	`5`	Columns in the PPC grid
`--ppc-samples`	`512`	Number of posterior predictive samples for PPC
`--umap-ppc-samples`	`50`	PPC samples for the UMAP overlay
`--overwrite`	off	Regenerate plots even if output files already exist

Additional fine-grained options (heatmap gene count, UMAP hyperparameters, capture-anchor scatter settings, etc.) can be configured in conf/viz/default.yaml, which is scaffolded by scribe-infer --initialize.
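
As an illustration, several of the flags in the table above can be combined in one call. The values below are arbitrary examples, and the panel count assumes that each cell of the PPC grid shows one gene, which follows from the "grid of per-gene histograms" description but is not spelled out in this guide.

```shell
# Grid dimensions for the PPC plot; values here are arbitrary examples.
rows=3
cols=4
panels=$((rows * cols))   # a 3 x 4 grid has 12 panels

# Assemble the command from flags listed in the customization table.
set -- scribe-visualize outputs/my_run \
    --ppc --ppc-rows "$rows" --ppc-cols "$cols" \
    --ppc-samples 256 --format pdf --overwrite

# Dry run: print the command for review. Running "$@" would launch it.
echo "PPC grid: $panels panels"
echo "$@"
```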

Directory processing¶

Single run¶

scribe-visualize outputs/my_run --ppc --umap

Recursive search¶

With --recursive, the CLI searches the target directory tree for matching result files. When no pattern is given, it looks for files named scribe_results.pkl:

scribe-visualize outputs/ --recursive --all

Recursive search with custom filename pattern¶

scribe-visualize outputs/ --recursive "*_results.pkl" --all
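
To preview which files a recursive pattern might pick up before launching a long batch, `find` offers a rough approximation. This sketch assumes the pattern is matched against file names the way `find -name` does, which this guide does not confirm; treat the listing as a sanity check, not as the CLI's exact selection. The directory and file names are made up for the demo.

```shell
# Build a small throwaway tree that mimics a few run directories.
root=$(mktemp -d)
mkdir -p "$root/run_a" "$root/run_b/nested"
touch "$root/run_a/scribe_results.pkl" \
      "$root/run_b/custom_results.pkl" \
      "$root/run_b/nested/notes.txt"

# Approximate the default search (no PATTERN given to --recursive).
default_hits=$(find "$root" -type f -name 'scribe_results.pkl' | sort)

# Approximate a custom pattern such as "*_results.pkl".
custom_hits=$(find "$root" -type f -name '*_results.pkl' | sort)

echo "default pattern:"; echo "$default_hits"
echo "custom pattern:";  echo "$custom_hits"
```

Note that `*_results.pkl` also matches the default `scribe_results.pkl`, so a custom pattern of this shape widens the selection and does not replace it.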

Wildcard patterns¶

Shell-style globs for selective processing (directories or explicit files):

# All ZINB runs
scribe-visualize "outputs/*/zinb*/*" --heatmap

# Unquoted expansion (shell expands first)
scribe-visualize outputs/bleo_study0*/zinbvcp/* --recursive --umap

# Explicit file glob
scribe-visualize "outputs/**/*_results.pkl" --ppc
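
The difference between quoted and unquoted patterns is ordinary shell behaviour and can be checked without SCRIBE installed. The sketch below uses a throwaway directory tree (the study and run names are made up for illustration) and counts how many arguments a command would receive in each case.

```shell
# Throwaway tree with hypothetical run names.
work=$(mktemp -d)
mkdir -p "$work/outputs/study01/zinb/run1" \
         "$work/outputs/study01/zinb/run2" \
         "$work/outputs/study02/zinbvcp/run1"

# Helper that reports how many arguments it was handed.
count_args() { echo "$#"; }

# Unquoted: the shell expands the glob, so one argument arrives per match.
unquoted=$(cd "$work" && count_args outputs/*/zinb*/*)

# Quoted: the pattern arrives as a single literal argument, and the
# receiving program has to expand it itself.
quoted=$(cd "$work" && count_args "outputs/*/zinb*/*")

echo "unquoted -> $unquoted argument(s)"
echo "quoted   -> $quoted argument(s)"
```

Quoting matters most for patterns the shell may not expand as intended. In bash, for example, `**` only recurses into subdirectories when the `globstar` option is enabled, so a quoted `"outputs/**/*_results.pkl"` leaves the expansion to the program that receives it.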

SLURM integration¶

Submit visualization as a batch job with the same profile system as scribe-infer:

# Interactive prompts for cluster resources
scribe-visualize --slurm outputs/ --recursive --all

# Reusable profile
scribe-visualize --slurm-profile default outputs/ --recursive --all

# Per-run overrides
scribe-visualize --slurm-profile default \
    --slurm-set partition=gpu \
    --slurm-set mem_gb=32 \
    outputs/ --recursive --all

The SLURM flags (--slurm, --slurm-profile, --slurm-set) follow the same conventions as scribe-infer.
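
Because --slurm-set is repeatable, a list of overrides can be expanded into one flag per entry. The following is a minimal sketch that only uses the two keys shown in this guide (`partition` and `mem_gb`); other keys may exist but are not assumed here. It prints the command as a dry run and does not submit a job.

```shell
# Overrides as key=value pairs; only keys shown in this guide are used.
overrides="partition=gpu mem_gb=32"

# Start from the base command, then append one --slurm-set per override.
set -- scribe-visualize --slurm-profile default
for kv in $overrides; do
    set -- "$@" --slurm-set "$kv"
done
set -- "$@" outputs/ --recursive --all

# Dry run: print the command instead of submitting a job.
echo "$@"
```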

CLI flags reference¶

Flag	Default	Description
`run_target`	(required)	One or more run targets (directories, result `.pkl` files, or glob patterns)
`--all`	off	Enable all plot types
`--no-loss`	off	Disable the loss curve
`--no-ecdf`	off	Disable the ECDF plot
`--ppc`	off	Enable PPC grid
`--bio-ppc`	off	Enable biological PPC
`--umap`	off	Enable UMAP overlay
`--heatmap`	off	Enable correlation heatmap
`--mixture-ppc`	off	Enable mixture PPC
`--mixture-composition`	off	Enable mixture composition barplot
`--annotation-ppc`	off	Enable per-annotation PPC
`--capture-anchor`	off	Enable capture-anchor diagnostic
`--p-capture-scaling`	off	Enable capture probability vs. library size
`--mean-calibration`	off	Enable mean calibration scatter
`--mean-pairwise`	off	Enable dataset-level mean comparison
`--recursive [PATTERN]`	off	Recursively search directories for result files; defaults to `scribe_results.pkl` when used without PATTERN
`--overwrite`	off	Regenerate existing plots
`--format`	`png`	Output format (`png`, `pdf`, `svg`, `eps`)
`--slurm`	off	Launch as SLURM batch job
`--slurm-profile`	---	Reusable SLURM profile name or path
`--slurm-set`	---	Per-run SLURM overrides (repeatable)

For inference CLI usage, see scribe-infer. For the Python visualization API, see the API Reference.
