Crossed hierarchical models & multi-donor differential expression¶

This tutorial fits a single crossed hierarchical scribe model to a donor × condition perturbation experiment and reads off the population-level treatment effect while explicitly accounting for donor-to-donor heterogeneity. The data are from Zhao et al. (2021): peripheral cells from seven donors, profiled at baseline (control) and after treatment with panobinostat, a pan-HDAC inhibitor.

The notebook is built around the multi-factor hierarchy. Mean expression is decomposed additively as

\[ \log \mu_g^{(\ell)} = \log \mu_g^{\mathrm{pop}} + \alpha_g[\mathrm{cond}(\ell)] + \beta_g[\mathrm{donor}(\ell)], \]

with a fixed, weakly-informative treatment effect \(\alpha_g\) (the contrast of interest) and a random, horseshoe-shrunk donor effect \(\beta_g\) (the heterogeneity we average over). It covers:

The model, written out — negative binomial with per-cell capture, plus the crossed additive hierarchy, and why the treatment factor is fixed while the donor factor is random.
Fit diagnostics — ELBO loss, a per-leaf mean-calibration grid laid out as condition × donor, and a posterior predictive check.
Reading the fitted structure — exposing the treatment effect and the donor deviations as posteriors (get_factor_effect), and recovering the paired donor groups (get_group / iter_groups).
Differential expression — the donor-averaged, paired CLR estimand (compare_groups), why CLR needs a stable reference (pooling near-silent genes into "other"), and the complementary compositional and biological views. The recovered hits reproduce the canonical panobinostat / HDAC-inhibitor signature (histones, metallothioneins, heat-shock and p53-stress up; MYC, cyclins, the myeloid program and MDM2 down) at high sign-confidence.

Pre-computed outputs

This notebook requires a GPU to run. All outputs shown below were pre-computed and exported to static HTML. To re-run it yourself, clone the repository and execute the notebook with marimo edit docs/tutorials/zhao_2021_hierarchical.py on a GPU-enabled machine.

Open notebook in full page