import%20marimo%0A%0A__generated_with%20%3D%20%220.23.2%22%0Aapp%20%3D%20marimo.App()%0A%0A%0A%40app.cell%0Adef%20_()%3A%0A%20%20%20%20import%20marimo%20as%20mo%0A%0A%20%20%20%20return%20(mo%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%20Modeling%20Assumptions%20for%20Single-Cell%20RNA-seq%20with%20%60scribe%60%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20In%20this%20tutorial%20we%20walk%20through%20a%20series%20of%20modeling%20and%20inference%20choices%20that%20%60scribe%60's%20Bayesian%20engine%20makes%20available%20for%20single-cell%20RNA-seq%20data.%20We%20use%20a%20public%20dataset%20from%2010x%20Genomics%20(%5BJurkat%20Cells%5D(https%3A%2F%2Fwww.10xgenomics.com%2Fdatasets%2Fjurkat-cells-1-standard-1-1-0))%2C%20described%20as%0A%0A%20%20%20%20-%20~3%2C200%20cells%20detected%0A%20%20%20%20-%20Sequenced%20on%20Illumina%20HiSeq%202500%20Rapid%20Run%20V2%20with%20~33%2C900%20reads%20per%20cell%0A%0A%20%20%20%20Because%20this%20is%20a%20monoculture%20(one%20cell%20type)%2C%20it%20lets%20us%20focus%20on%20the%20statistical%20modeling%20rather%20than%20biological%20heterogeneity.%20We%20will%20progressively%20build%20up%20from%20the%20simplest%20count%20model%20to%20structured%20variational%20families%2C%20checking%20at%20each%20step%20whether%20the%20added%20complexity%20improves%20the%20fit.%0A%0A%20%20%20%20Let%20us%20begin%20by%20importing%20the%20necessary%20packages.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_()%3A%0A%20%20%20%20%23%20Import%20basic%20packages%0A%20%20%20%20from%20pathlib%20import%20Path%0A%20%20%20%20import%20pickle%0A%0A%20%20%20%20%23%20Import%20functionality%20to%20clear%20GPU%20memory%0A%20%20%20%20import%20gc%0A%20%20%20%20from%20jax%20import%20clear_caches%0A%0A%20%20%20%20%23%20Import%20our%20main%20package%0A%20%20%20%20import%20scribe%0A%0A%20%20%20%20%23%20Import%20useful%20tools%0A%20%20%20%20import%20numpy%20as%20np%0A%0A%20%20%20%20%23%20Import%20plotting%20packages%0A%20%20%20%20import%20matplotlib.pyplot%20as%20plt%0A%20%20%20%20import%20seaborn%20as%20sns%0A%20%20%20%20import%20corner%0A%0A%20%20%20%20%23%20Set%20our%20plotting%20style%20(totally%20optional)%0A%20%20%20%20scribe.viz.matplotlib_style()%0A%20%20%20%20return%20Path%2C%20clear_caches%2C%20corner%2C%20gc%2C%20np%2C%20pickle%2C%20plt%2C%20scribe%2C%20sns%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Having%20loaded%20the%20packages%2C%20let%20us%20proceed%20to%20load%20the%20data.%0A%0A%20%20%20%20%3E%20**Note%3A**%20%60scribe%60%20borrows%20%60scanpy%60's%20loading%20data%20functionality%20for%20%60h5ad%60%20and%20%60mtx%60%20files.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(Path%2C%20scribe)%3A%0A%20%20%20%20%23%20Define%20data%20directory%0A%20%20%20%20data_dir%20%3D%20Path(%0A%20%20%20%20%20%20%20%20%22%2Fpath%2Fto%2Fjurkat_cells%2F%22%0A%20%20%20%20)%0A%0A%20%20%20%20%23%20Load%20the%20data%20%0A%20%20%20%20adata%20%3D%20scribe.data_loader.load_and_preprocess_anndata(%0A%20%20%20%20%20%20%20%20data_dir%2C%20return_jax%3DFalse%0A%20%20%20%20)%0A%0A%20%20%20%20adata%0A%20%20%20%20return%20adata%2C%20data_dir%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Before%20fitting%20any%20model%2C%20let%20us%20look%20at%20one%20basic%20summary%20of%20the%20data%3A%20the%20total%20UMI%20count%20per%20cell%20(library%20size).%20Most%20analysis%20pipelines%20treat%20this%20quantity%20as%20a%20nuisance%20and%20%22normalize%22%20it%20away%20by%20scaling%20every%20cell%20to%20an%20arbitrary%20target%20(often%2010%2C000%20UMIs).%20%60scribe%60%20takes%20a%20different%20approach%3A%20it%20explicitly%20models%20the%20per-cell%20capture%20probability%20that%20generates%20the%20library-size%20variation%20in%20the%20first%20place.%20So%20the%20empirical%20cumulative%20distribution%20function%20(ECDF)%20below%20is%20not%20just%20a%20quality-control%20plot%E2%80%94it%20previews%20the%20kind%20of%20cell-to-cell%20variability%20the%20model%20will%20need%20to%20account%20for.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20np%2C%20plt%2C%20sns)%3A%0A%20%20%20%20%23%20Total%20UMI%20per%20cell%3B%20.A1%20flattens%20a%20sparse-matrix%20row%20sum%20to%20a%201-D%20array%0A%20%20%20%20_library_sizes%20%3D%20np.asarray(adata.X.sum(axis%3D1)).ravel()%0A%0A%20%20%20%20_fig%2C%20_ax%20%3D%20plt.subplots(figsize%3D(5%2C%204))%0A%20%20%20%20sns.ecdfplot(_library_sizes%2C%20ax%3D_ax)%0A%20%20%20%20_ax.set_xlabel('total%20UMI%20count%20per%20cell')%0A%20%20%20%20_ax.set_ylabel('ECDF')%0A%20%20%20%20_ax.set_ylim(-0.01%2C%201.01)%0A%20%20%20%20plt.show()%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20The%20ECDF%20shows%20roughly%20a%20tenfold%20range%20in%20library%20size%E2%80%94about%20one%20order%20of%20magnitude%20of%20spread.%20In%20a%20Jurkat%20monoculture%2C%20that%20variation%20is%20much%20more%20plausibly%20technical%20(sequencing%20and%20capture%20depth)%20than%20biological.%20The%20rest%20of%20this%20notebook%20walks%20through%20how%20%60scribe%60%20treats%20that%20kind%20of%20depth%20variation%20inside%20the%20generative%20model%3A%20we%20start%20with%20the%20simplest%20count%20model%2C%20then%20progressively%20layer%20in%20the%20pieces%20that%20make%20depth%20explicit%20rather%20than%20washing%20it%20out%20with%20ad%20hoc%20scaling%20in%20preprocessing.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20Fitting%20the%20simplest%20%60scribe%60%20model%20(no%20explicit%20capture)%0A%0A%20%20%20%20We%20begin%20with%20the%20smallest%20model%20%60scribe%60%20implements.%20The%20motivation%20is%20not%20a%20vague%20%E2%80%9Coverdispersion%20%E2%87%92%20pick%20an%20NB%E2%80%9D%20heuristic.%20In%20the%20standard%20two-state%20(bursty)%20promoter%20model%2C%20the%20steady-state%20mRNA%20count%20for%20a%20gene%20follows%20a%20negative%20binomial%20distribution%20under%20well-characterized%20biophysical%20assumptions%20(see%2C%20e.g.%2C%20Raj%20*et%20al.*%2C%202006).%20That%20is%20the%20starting%20point%3A%20a%20concrete%20biophysical%20model%20whose%20algebra%20lands%20on%20a%20per-gene%20negative%20binomial%20for%20the%20counts%20we%20observe.%0A%0A%20%20%20%20Concretely%2C%20let%20%24m_g%24%20be%20the%20latent%20mRNA%20count%20for%20gene%20%24g%24%20in%20a%20cell.%20The%20generative%20story%20is%3A%0A%0A%20%20%20%20%24%24%0A%20%20%20%20m_g%20%5Csim%20%5Coperatorname%7BNegBinom%7D(r_g%2C%20p)%2C%0A%20%20%20%20%24%24%0A%0A%20%20%20%20where%20%24r_g%24%20is%20gene-specific%20(controlling%20how%20bursty%20or%20variable%20a%20gene%20is)%20and%20%24p%24%20is%20shared%20across%20genes%20(related%20to%20burst%20size).%20This%20is%20not%20the%20only%20parameterization%20%60scribe%60%20offers%E2%80%94you%20will%20see%20alternatives%20later%E2%80%94but%20it%20is%20the%20simplest%20place%20to%20begin.%0A%0A%20%20%20%20%60scribe%60%E2%80%99s%20most%20basic%20inference%20strategy%20uses%20a%20mean-field%20variational%20approximation.%20In%20practical%20terms%2C%20this%20means%20the%20approximate%20posterior%20treats%20each%20parameter%20as%20independently%20varying%3A%20it%20does%20not%20represent%20gene%E2%80%93gene%20correlations.%20The%20results%20are%20best%20interpreted%20as%20marginals%20(per-gene%20summaries%2C%20per-cell%20quantities%2C%20shared%20parameters)%20rather%20than%20as%20a%20full%20joint%20picture%20of%20co-expression.%20We%20will%20relax%20this%20independence%20assumption%20later%20in%20the%20tutorial.%0A%0A%20%20%20%20For%20this%20first%20fit%20we%20set%20%60variable_capture%3DFalse%60%2C%20meaning%20we%20are%20not%20yet%20adding%20a%20per-cell%20capture%2Fsequencing-depth%20layer.%20In%20effect%2C%20we%20treat%20the%20observed%20UMI%20counts%20%24u_g%24%20as%20if%20they%20were%20on%20the%20same%20scale%20as%20the%20latent%20transcript%20counts%20%24m_g%24%E2%80%94no%20thinning%20or%20rescaling%20step.%20This%20is%20obviously%20the%20wrong%20assumption%2C%20but%20it%20serves%20a%20pedagogical%20purpose%3A%20it%20lets%20us%20isolate%20what%20the%20NB%20piece%20is%20doing%20before%20we%20introduce%20the%20technical%20sampling%20story.%0A%0A%20%20%20%20If%20you%20are%20used%20to%20Scanpy%20or%20Seurat%3A%0A%0A%20%20%20%20-%20Most%20workflows%20fix%20depth%20first%20(normalize%2C%20scale%2C%20log-transform)%20and%20then%20cluster%20or%20run%20differential%20expression%20on%20the%20transformed%20matrix.%0A%20%20%20%20-%20Here%20we%20take%20the%20opposite%20order%3A%20start%20with%20the%20simplest%20count%20model%2C%20then%20add%20depth%20explicitly%20as%20a%20model%20ingredient%20via%20%60variable_capture%3DTrue%60.%0A%0A%20%20%20%20In%20short%2C%20%60variable_capture%3DFalse%60%20is%20a%20stepping%20stone%2C%20not%20a%20biological%20claim.%20It%20lets%20us%20see%20the%20NB%20layer%20in%20isolation%20before%20mixing%20in%20the%20capture%20model.%0A%0A%20%20%20%20We%20also%20pass%20%60early_stopping%3D%7B...%7D%60%20so%20training%20does%20not%20run%20indefinitely.%20The%20%60enabled%60%20flag%20turns%20the%20mechanism%20on%20or%20off%3B%20%60patience%60%20controls%20how%20many%20steps%20without%20meaningful%20improvement%20the%20optimizer%20tolerates%20before%20stopping.%20Under%20the%20hood%2C%20%60scribe%60%20tracks%20a%20moving%20average%20of%20the%20loss%20so%20a%20single%20noisy%20step%20does%20not%20trigger%20a%20premature%20stop.%20Additional%20knobs%20(%60min_delta%60%2C%20%60min_delta_pct%60%2C%20%60check_every%60%2C%20%60warmup%60%2C%20%60smoothing_window%60%2C%20%E2%80%A6)%20are%20available%20for%20finer%20control.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20data_dir%2C%20pickle%2C%20scribe)%3A%0A%20%20%20%20%23%20Define%20output%20directory%0A%20%20%20%20out_dir%20%3D%20data_dir%20%2F%20%22scribe_results%22%0A%20%20%20%20out_dir.mkdir(parents%3DTrue%2C%20exist_ok%3DTrue)%0A%0A%20%20%20%20%23%20Define%20parameterization%0A%20%20%20%20_parameterization%20%3D%20%22canonical%22%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path%20%3D%20out_dir%20%2F%20f%22scribe_results_nbdm_%7B_parameterization%7D.pkl%22%0A%0A%20%20%20%20if%20_out_path.exists()%3A%0A%20%20%20%20%20%20%20%20%23%20Load%20model%20from%20pkl%20file%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20results_nbdm%20%3D%20pickle.load(_f)%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%23%20Fit%20basic%20model%20to%20data%20with%20variable%20capture%20probability%20fit%20per%20cell%0A%20%20%20%20%20%20%20%20results_nbdm%20%3D%20scribe.fit(%0A%20%20%20%20%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20variable_capture%3DFalse%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20parameterization%3D_parameterization%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20early_stopping%3D%7B%22enabled%22%3A%20True%2C%20%22patience%22%3A%201000%7D%0A%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%23%20Save%20the%20fitted%20model%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22wb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20pickle.dump(results_nbdm%2C%20_f)%0A%0A%20%20%20%20results_nbdm%0A%20%20%20%20return%20out_dir%2C%20results_nbdm%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20To%20check%20whether%20training%20converged%2C%20we%20examine%20the%20%5BELBO%5D(https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FEvidence_lower_bound)%20(evidence%20lower%20bound)%20loss%20curve.%20The%20ELBO%20is%20the%20quantity%20variational%20inference%20optimizes%3A%20roughly%2C%20the%20negative%20log-evidence%20plus%20a%20gap%20measuring%20how%20well%20the%20approximate%20posterior%20matches%20the%20true%20one.%20For%20a%20sanity%20check%20we%20do%20not%20need%20to%20read%20it%20quantitatively%3A%20a%20clean%20run%20produces%20a%20curve%20that%20drops%20quickly%20at%20first%20and%20then%20flattens%20out.%20Spikes%2C%20slow%20drifts%20upward%2C%20or%20oscillations%20that%20never%20settle%20are%20warnings%20worth%20investigating.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(results_nbdm%2C%20scribe)%3A%0A%20%20%20%20%23%20Plot%20ELBO%20loss%0A%20%20%20%20scribe.viz.plot_loss(results_nbdm%2C%20figsize%3D(6%2C%203))%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20The%20training%20loss%20has%20converged%2C%20but%20that%20alone%20does%20not%20prove%20the%20model%20describes%20the%20*counts*%20well.%20A%20posterior%20predictive%20check%20(PPC)%20asks%20a%20more%20direct%20question%3A%20if%20we%20simulate%20new%20data%20from%20the%20fitted%20model%2C%20do%20those%20simulations%20look%20like%20the%20real%20measurements%3F%20Concretely%2C%20we%20draw%20synthetic%20count%20matrices%20from%20the%20posterior%20predictive%20distribution%20and%20compare%20their%20marginal%20distributions%20(gene%20by%20gene)%20to%20the%20observed%20histograms.%20When%20the%20model%20is%20adequate%2C%20the%20synthetic%20and%20real%20histograms%20overlap%3B%20when%20it%20is%20not%2C%20you%20will%20see%20systematic%20mismatches%E2%80%94shifted%20means%2C%20wrong%20dispersion%2C%20missing%20zeros%2C%20or%20heavy%20tails%E2%80%94even%20if%20the%20optimizer%20stopped%20happily.%0A%0A%20%20%20%20In%20%60scribe%60%2C%20%60scribe.viz.plot_ppc%60%20automates%20that%20comparison%20for%20a%20grid%20of%20genes.%20It%20stratifies%20genes%20by%20per-gene%20median%20expression%20using%20log-spaced%20bins%2C%20then%20fills%20a%20panel%20layout%20(%60n_genes%60%2C%20%60n_rows%60%2C%20and%20derived%20columns)%20so%20each%20row%20samples%20a%20different%20part%20of%20the%20dynamic%20range.%20For%20each%20gene%20it%20overlays%20the%20observed%20UMI%20histogram%20against%20the%20posterior%20predictive%20distribution%20using%20%60n_samples%60%20Monte%20Carlo%20draws%2C%20showing%20the%20model%E2%80%99s%20implied%20variability%20rather%20than%20a%20single%20predicted%20curve.%20The%20point%20is%20a%20quick%20visual%20sanity%20check%20that%20the%20generative%20model%20matches%20what%20you%20actually%20measured.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_nbdm%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_ppc(%0A%20%20%20%20%20%20%20%20results_nbdm%2C%0A%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20n_genes%3D16%2C%0A%20%20%20%20%20%20%20%20n_rows%3D4%2C%0A%20%20%20%20%20%20%20%20figsize%3D(8%2C%208)%2C%0A%20%20%20%20%20%20%20%20n_samples%3D512%2C%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Looking%20at%20the%20PPC%20panels%2C%20three%20patterns%20stand%20out%3A%0A%0A%20%20%20%20-%20*Systematic%20mismatch*%20%E2%80%94%20for%20several%20genes%20the%20observed%20UMI%20histogram%20does%20not%20sit%20where%20the%20model%E2%80%99s%20predictive%20band%20expects%20it.%0A%20%20%20%20-%20*Over-wide%20predictive%20bands*%20%E2%80%94%20even%20when%20the%20match%20is%20roughly%20acceptable%2C%20the%20shaded%20posterior%20predictive%20region%20looks%20unnecessarily%20diffuse%2C%20as%20if%20the%20model%20is%20hedging.%0A%20%20%20%20-%20*High-abundance%20housekeeping%20genes%20look%20odd*%20%E2%80%94%20very%20highly%20expressed%20genes%20like%20FAU%2C%20RPS15%2C%20and%20RPS2%20do%20not%20even%20show%20a%20PPC.%20Did%20the%20fit%20fail%20entirely%20for%20these%20genes%3F%0A%0A%20%20%20%20None%20of%20this%20contradicts%20a%20%E2%80%9Cnice-looking%E2%80%9D%20training%20loss.%20PPCs%20ask%20a%20different%20question%3A%20*does%20the%20fitted%20generative%20model%20actually%20reproduce%20the%20count-level%20patterns%20you%20measured%3F*%0A%0A%20%20%20%20In%20practice%2C%20the%20issues%20above%20stem%20from%20two%20interacting%20limitations%20of%20this%20first%20fit%3A%0A%0A%20%20%20%201.%20**Library%20size%20varies%20substantially%20across%20cells%2C%20even%20in%20a%20monoculture.**%20%20Our%20ECDF%20showed%20close%20to%20an%20order-of-magnitude%20spread%20in%20total%20UMIs.%20In%20a%20Jurkat-only%20dataset%2C%20that%20variation%20is%20most%20plausibly%20sequencing%20and%20capture%20depth%2C%20not%20a%20biological%20axis.%20When%20the%20model%20pretends%20every%20cell%20is%20on%20the%20same%20effective%20depth%20scale%20(%60variable_capture%3DFalse%60)%2C%20it%20must%20explain%20depth-driven%20shifts%20through%20the%20gene-level%20count%20model%20instead%E2%80%94much%20like%20running%20differential%20expression%20without%20size%20factors.%20You%20can%20still%20get%20a%20fit%2C%20but%20the%20marginal%20count%20distributions%20will%20look%20strained.%0A%0A%20%20%20%20%20%20%20%20The%20fix%20is%20analogous%20to%20per-cell%20normalization%2C%20but%20implemented%20as%20an%20explicit%20model%20ingredient%3A%20give%20each%20cell%20its%20own%20capture%20parameter%20(%60variable_capture%3DTrue%60)%20so%20depth%20is%20not%20silently%20absorbed%20into%20gene%20parameters.%0A%0A%20%20%20%202.%20**Negative%20Binomial%20parameters%20can%20be%20%E2%80%9Cslippery%E2%80%9D%20in%20canonical%20%24(r%2Cp)%24%20coordinates.**%20%20Many%20different%20%24(r%2Cp)%24%20pairs%20produce%20nearly%20the%20same%20mean%E2%80%93variance%20trade-off%2C%20creating%20a%20curved%20ridge%20of%20near-equivalent%20solutions.%20Because%20our%20guide%20is%20mean-field%20(treating%20parameters%20as%20independent)%2C%20the%20approximation%20smears%20posterior%20mass%20across%20that%20ridge%20rather%20than%20concentrating%20it%20along%20it.%20A%20common%20symptom%20is%20predictive%20bands%20that%20are%20wider%20than%20the%20data%20warrant.%0A%0A%20%20%20%20We%20will%20address%20depth%20and%20guide%20flexibility%20as%20we%20go%3A%20first%20add%20variable%20capture%2C%20then%20explore%20richer%20variational%20families%20that%20can%20represent%20parameter%20dependence.%0A%0A%20%20%20%20To%20make%20the%20%24(r%2Cp)%24%20coupling%20more%20concrete%20before%20we%20change%20the%20model%2C%20we%20pick%20a%20few%20genes%20from%20the%20PPC%20grid%2C%20draw%20posterior%20samples%20with%20%60get_posterior_matrix%60%2C%20and%20look%20at%20a%20corner%20plot%20of%20%24p%24%20versus%20the%20gene-specific%20%24r_g%24%20values.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20corner%2C%20results_nbdm)%3A%0A%20%20%20%20%23%20Build%20integer%20indices%20for%20a%20handful%20of%20example%20genes%0A%20%20%20%20gene_list%20%3D%20%5B%22MKKS%22%2C%20%22EIF5%22%2C%20%22RPS29%22%2C%20%22RPS2%22%5D%0A%20%20%20%20_name_to_idx%20%3D%20%7Bg%3A%20i%20for%20i%2C%20g%20in%20enumerate(adata.var_names)%7D%0A%20%20%20%20gene_index%20%3D%20%5B_name_to_idx%5Bg%5D%20for%20g%20in%20gene_list%20if%20g%20in%20_name_to_idx%5D%0A%0A%20%20%20%20_results_subset%20%3D%20results_nbdm%5Bgene_index%5D%0A%0A%20%20%20%20%23%20Export%20directly%20as%20a%20corner-ready%20matrix%20plus%20labels%2Fmetadata.%0A%20%20%20%20_samples_2d%2C%20_labels%2C%20_%20%3D%20_results_subset.get_posterior_matrix(%0A%20%20%20%20%20%20%20%20n_samples%3D5_000%2C%0A%20%20%20%20%20%20%20%20include%3D%5B%22p%22%2C%20%22r%22%5D%2C%0A%20%20%20%20%20%20%20%20exclude_deterministic%3DTrue%2C%0A%20%20%20%20%20%20%20%20store_samples%3DFalse%2C%0A%20%20%20%20%20%20%20%20convert_to_numpy%3DTrue%2C%0A%20%20%20%20)%0A%0A%20%20%20%20corner.corner(_samples_2d%2C%20labels%3D_labels)%0A%20%20%20%20return%20(gene_index%2C)%0A%0A%0A%40app.cell%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%23%20Reading%20the%20corner%20plot%0A%0A%20%20%20%20A%20corner%20plot%20(sometimes%20called%20a%20pair%20plot)%20is%20a%20compact%20way%20to%20look%20at%20uncertainty%20across%20several%20parameters%20at%20once.%0A%0A%20%20%20%20-%20Along%20the%20diagonal%20you%20see%20one-dimensional%20histograms%3A%20the%20marginal%20posterior%20distribution%20of%20each%20parameter%20(here%2C%20the%20shared%20%24p%24%20plus%20gene-specific%20%24r_g%24%20for%20a%20handful%20of%20genes).%0A%20%20%20%20-%20In%20the%20lower%20triangle%20you%20see%20two-dimensional%20scatter%2Fdensity%20panels%20for%20every%20pair%20of%20parameters%2C%20showing%20how%20their%20posterior%20samples%20relate%20to%20each%20other.%0A%0A%20%20%20%20Think%20of%20it%20as%20a%20multi-way%20version%20of%20%E2%80%9Cplot%20gene%20A%20vs.%20gene%20B%20and%20look%20for%20correlation%2C%E2%80%9D%20except%20the%20axes%20are%20model%20parameters%20and%20the%20cloud%20of%20points%20represents%20posterior%20uncertainty%2C%20not%20cells.%0A%0A%20%20%20%20%23%23%23%20What%20to%20look%20for%20in%20this%20particular%20figure%0A%0A%20%20%20%20We%20are%20focusing%20on%20the%20negative%20binomial%E2%80%99s%20canonical%20parameters%3A%20a%20shared%20success%20probability%20%24p%24%20and%20gene-specific%20dispersion%20parameters%20%24r_g%24.%20In%20principle%2C%20many%20%24(r_g%2Cp)%24%20combinations%20can%20predict%20similar%20counts%20(a%20curved%20ridge%20in%20the%20full%20posterior).%20When%20a%20mean-field%20variational%20approximation%20encounters%20that%20ridge%2C%20it%20cannot%20represent%20the%20coupling%20and%20instead%20inflates%20each%20marginal%20independently.%20The%20result%20is%20that%20the%202D%20panels%20between%20%24p%24%20and%20each%20%24r_g%24%20look%20like%20broad%20blobs%20rather%20than%20tight%2C%20tilted%20ellipses%E2%80%94extra%20marginal%20uncertainty%20that%20arises%20from%20the%20approximation%2C%20not%20from%20genuine%20parameter%20ambiguity.%0A%0A%20%20%20%20The%20practical%20punchline%3A%20PPC%20weirdness%20is%20often%20less%20about%20%E2%80%9Cthe%20optimizer%20failed%E2%80%9D%20and%20more%20about%20whether%20the%20model%20plus%20its%20approximation%20represent%20depth%20and%20NB%20geometry%20in%20a%20way%20that%20matches%20how%20counts%20were%20actually%20generated.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20Fitting%20a%20model%20with%20variable%20capture%0A%0A%20%20%20%20Our%20first%20fit%20essentially%20pretended%20that%20the%20UMI%20matrix%20is%20already%20on%20the%20%E2%80%9Cright%E2%80%9D%20scale%20for%20a%20single%20negative%20binomial%20story.%20But%20the%20ECDF%20of%20library%20size%20told%20us%20otherwise%3A%20in%20a%20Jurkat%20monoculture%2C%20an%20order-of-magnitude%20spread%20in%20total%20counts%20is%20far%20easier%20to%20attribute%20to%20technical%20depth%2Fcapture%20variability%20than%20to%20a%20hidden%20biological%20axis.%0A%0A%20%20%20%20In%20a%20standard%20pipeline%2C%20one%20forces%20cells%20onto%20a%20common%20scale%20with%20normalization%20(size%20factors%2C%20scaling%20to%2010K%20UMIs%2C%20etc.).%20In%20%60scribe%60%2C%20you%20can%20instead%20treat%20depth%20as%20a%20first-class%20parameter%3A%20each%20cell%20gets%20its%20own%20capture%20probability%20%24%5Cnu_c%20%5Cin%20(0%2C1)%24%E2%80%94a%20concrete%20way%20to%20say%20%E2%80%9Cthis%20cell%E2%80%99s%20mRNA%20molecules%20were%20converted%20into%20UMIs%20with%20this%20efficiency.%E2%80%9D%0A%0A%20%20%20%20The%20algebra%20(worked%20out%20in%20the%20paper%E2%80%99s%20Dirichlet%E2%80%93multinomial%20derivation)%20is%20pleasantly%20tidy.%20The%20story%20for%20one%20gene%20in%20one%20cell%20is%3A%0A%0A%20%20%20%201.%20**Latent%20biology%3A**%20a%20transcript%20count%20%24m_g%20%5Csim%20%5Cmathrm%7BNB%7D(r_g%2C%20p)%24%20(shared%20%24p%24%20across%20genes%20in%20this%20formulation).%0A%20%20%20%202.%20**Observation%3A**%20UMIs%20are%20a%20thinning%20of%20transcripts%2C%20%24u_g%20%5Cmid%20m_g%2C%20%5Cnu_c%20%5Csim%20%5Cmathrm%7BBinomial%7D(m_g%2C%20%5Cnu_c)%24.%0A%0A%20%20%20%20If%20you%20marginalize%20out%20the%20unobserved%20%24m_g%24%2C%20the%20observed%20UMIs%20are%20still%20negative%20binomial%E2%80%94same%20%24r_g%24%2C%20but%20with%20an%20effective%20success%20probability%20that%20combines%20biology%20(%24p%24)%20and%20technology%20(%24%5Cnu_c%24)%3A%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Chat%7Bp%7D_c%20%3D%20%5Cfrac%7Bp%7D%7B%5Cnu_c%20%2B%20p(1-%5Cnu_c)%7D.%0A%20%20%20%20%24%24%0A%0A%20%20%20%20In%20words%3A%20%24p%24%20and%20%24r_g%24%20control%20the%20shape%20of%20expression%20at%20the%20%E2%80%9Ctrue%20transcript%E2%80%9D%20level%2C%20while%20%24%5Cnu_c%24%20asks%20%E2%80%9Chow%20much%20of%20that%20signal%20survived%20the%20experiment%20for%20this%20cell%3F%E2%80%9D%20When%20%24%5Cnu_c%24%20is%20low%20(a%20shallowly%20sequenced%20cell)%2C%20the%20cell%20effectively%20observes%20a%20more%20thinned%20draw%3B%20the%20formula%20packages%20that%20thinning%20into%20a%20single%20cell-specific%20%24%5Chat%7Bp%7D_c%24%20so%20the%20likelihood%20stays%20a%20clean%20%24%5Cmathrm%7BNB%7D(r_g%2C%5Chat%7Bp%7D_c)%24%20for%20UMIs.%0A%0A%20%20%20%20Setting%20%60variable_capture%3DTrue%60%20is%20therefore%20not%20a%20magic%20extra%20knob%E2%80%94it%20is%20the%20principled%20analogue%20of%20accounting%20for%20sequencing%20depth%2C%20except%20the%20depth%20correction%20lives%20inside%20the%20generative%20model%20(with%20its%20own%20posterior%20uncertainty)%20instead%20of%20being%20a%20one-off%20preprocessing%20division.%0A%0A%20%20%20%20In%20the%20next%20cell%20we%20refit%20with%20this%20extension%20turned%20on%20(still%20in%20canonical%20%24(r%2Cp)%24%20coordinates%20for%20the%20moment)%2C%20and%20we%20will%20revisit%20PPCs%20to%20see%20whether%20the%20count-level%20story%20looks%20more%20coherent%20once%20cells%20are%20allowed%20to%20differ%20in%20%24%5Cnu_c%24.%0A%0A%20%20%20%20Before%20each%20heavier%20refit%2C%20we%20call%20JAX%E2%80%99s%20%60clear_caches()%60%20plus%20%60gc.collect()%60%20so%20GPU%20memory%20stays%20manageable%E2%80%94we%20repeat%20this%20housekeeping%20pattern%20whenever%20we%20swap%20models%20in%20this%20tutorial%20to%20avoid%20out-of-memory%20issues.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20clear_caches%2C%20gc%2C%20out_dir%2C%20pickle%2C%20scribe)%3A%0A%20%20%20%20%23%20Clear%20GPU%20memory%20from%20previous%20fits%0A%20%20%20%20clear_caches()%0A%20%20%20%20gc.collect()%0A%0A%20%20%20%20%23%20Define%20parameterization%0A%20%20%20%20_parameterization%20%3D%20%22canonical%22%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp_%7B_parameterization%7D.pkl%22%0A%0A%20%20%20%20if%20_out_path.exists()%3A%0A%20%20%20%20%20%20%20%20%23%20Load%20model%20from%20pkl%20file%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20results_nbvcp%20%3D%20pickle.load(_f)%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%23%20Fit%20basic%20model%20to%20data%20with%20variable%20capture%20probability%20fit%20per%20cell%0A%20%20%20%20%20%20%20%20results_nbvcp%20%3D%20scribe.fit(%0A%20%20%20%20%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20variable_capture%3DTrue%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20parameterization%3D_parameterization%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20early_stopping%3D%7B%22enabled%22%3A%20True%2C%20%22patience%22%3A%201000%7D%0A%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%23%20Save%20the%20fitted%20model%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22wb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20pickle.dump(results_nbvcp%2C%20_f)%0A%0A%20%20%20%20results_nbvcp%0A%20%20%20%20return%20(results_nbvcp%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Right%20after%20a%20variable-capture%20fit%2C%20it%20is%20worth%20checking%20that%20the%20model%20is%20using%20per-cell%20capture%20sensibly.%20%60scribe.viz.plot_p_capture_scaling%60%20scatters%20each%20cell%E2%80%99s%20estimated%20capture%20probability%20(MAP%20of%20%60p_capture%60)%20against%20its%20library%20size%20%24L_c%24%20(total%20UMIs)%2C%20so%20you%20can%20see%20whether%20shallow%20and%20deep%20cells%20receive%20sensible%20relative%20adjustments.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_nbvcp%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_p_capture_scaling(%0A%20%20%20%20%20%20%20%20results_nbvcp%2C%20counts%3Dadata%2C%20figsize%3D(4%2C4)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20The%20capture-scaling%20plot%20is%20a%20diagnostic%20for%20%E2%80%9Cdid%20variable%20capture%20do%20anything%3F%E2%80%9D%20Each%20point%20is%20a%20cell%2C%20with%20library%20size%20%24L_c%24%20on%20the%20%24x%24-axis%20and%20the%20fitted%20per-cell%20capture%20probability%20%24%5Cnu_c%24%20on%20the%20%24y%24-axis.%0A%0A%20%20%20%20The%20pattern%E2%80%94capture%20probability%20rising%20with%20depth%20at%20modest%20library%20sizes%2C%20then%20flattening%20near%201%20for%20the%20largest%20libraries%E2%80%94shows%20how%20the%20fitted%20model%20distributes%20the%20same%20depth%20variation%20we%20saw%20in%20the%20ECDF%3A%20shallow%20cells%20are%20interpreted%20as%20having%20lower%20effective%20capture%2C%20while%20the%20deepest%20cells%20are%20pushed%20toward%20near-complete%20capture.%0A%0A%20%20%20%20It%20is%20tempting%20to%20read%20the%20plateau%20as%20%E2%80%9Cabove%20some%20UMI%20threshold%2C%20everything%20is%20biology.%E2%80%9D%20Be%20careful%3A%20from%20a%20generative-model%20perspective%2C%20absolute%20capture%20is%20not%20something%20counts%20pin%20down%20by%20themselves.%20Observed%20library%20size%20mainly%20constrains%20a%20*product*%20involving%20%24%5Cnu_c%24%20and%20the%20cell%E2%80%99s%20total%20mRNA%20budget%3B%20separating%20%E2%80%9Chow%20much%20RNA%20was%20in%20the%20cell%E2%80%9D%20from%20%E2%80%9Chow%20efficiently%20it%20was%20converted%20into%20UMIs%E2%80%9D%20requires%20extra%20information.%20If%20you%20had%20an%20independent%20total-mRNA%20quantification%20(or%20another%20trustworthy%20anchor%20for%20total%20RNA%20content)%2C%20you%20could%20inject%20that%20as%20stronger%20prior%20knowledge%20and%20make%20the%20capture%20curve%20more%20interpretable%20in%20absolute%20terms.%20In%20routine%20scRNA-seq%20we%20usually%20lack%20that%20direct%20measurement%2C%20so%20this%20plot%20is%20best%20read%20as%20a%20sanity%20check%20that%20depth%20is%20being%20routed%20through%20a%20technical%20channel%20in%20a%20coherent%20way%2C%20not%20as%20a%20calibrated%20readout%20of%20physical%20capture%20efficiency.%0A%0A%20%20%20%20The%20reassuring%20structural%20point%E2%80%94for%20the%20kinds%20of%20gene-by-gene%20comparisons%20we%20often%20care%20about%E2%80%94is%20that%20when%20capture%20acts%20like%20random%20thinning%20of%20transcripts%20into%20UMIs%2C%20relative%20expression%20(gene%20A%20vs.%20gene%20B%20within%20a%20cell)%20can%20be%20much%20more%20stable%20than%20the%20absolute%20%E2%80%9Chow%20many%20molecules%E2%80%9D%20story.%20In%20the%20paper%20we%20show%20that%20misspecifying%20%24%5Cnu_c%24%20forces%20the%20model%20to%20compensate%20elsewhere%2C%20but%20many%20between-gene%20comparisons%20based%20on%20relative%20abundance%20remain%20unchanged.%20By%20contrast%2C%20summaries%20that%20speak%20literally%20about%20total%20RNA%20or%20other%20absolute%20biological%20scales%20can%20absorb%20capture%20error%20more%20directly%20and%20deserve%20extra%20caution.%0A%0A%20%20%20%20One%20more%20guardrail%3A%20a%20smooth%20capture-vs.-depth%20curve%20can%20still%20be%20misleading%20if%20the%20model%20is%20not%20actually%20predicting%20counts%20well.%20Do%20not%20treat%20this%20figure%20as%20authoritative%20when%20PPCs%20look%20systematically%20off.%20Trust%20the%20capture%20story%20only%20insofar%20as%20the%20generative%20model%20passes%20count-level%20sanity%20checks%E2%80%94which%20is%20why%20we%20revisit%20PPCs%20next.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Let%20us%20revisit%20posterior%20predictive%20checks%20now%20that%20variable%20capture%20is%20turned%20on.%20The%20hope%20is%20straightforward%3A%20much%20of%20what%20looked%20wrong%20in%20the%20first%20PPC%20grid%20was%20probably%20depth%2Fsampling%20variation%20masquerading%20as%20gene-level%20count%20behavior.%20By%20giving%20each%20cell%20its%20own%20capture%20parameter%20%24%5Cnu_c%24%2C%20the%20model%20can%20absorb%20that%20cell-to-cell%20library-size%20spread%20as%20technical%20rather%20than%20forcing%20the%20negative%20binomial%20gene%20parameters%20to%20explain%20everything.%0A%0A%20%20%20%20When%20you%20scan%20the%20new%20panels%2C%20do%20not%20look%20for%20perfection%E2%80%94look%20for%20systematic%20repair%3A%20predicted%20count%20distributions%20should%20track%20the%20observed%20histograms%20more%20closely%2C%20especially%20for%20genes%20where%20depth%20drives%20most%20of%20the%20marginal%20weirdness%2C%20and%20the%20predictive%20bands%20should%20not%20be%20artificially%20wide%20from%20confusing%20sampling%20depth%20with%20biological%20dispersion.%0A%0A%20%20%20%20Variable%20capture%20is%20not%20a%20universal%20solvent%2C%20however.%20If%20you%20still%20see%20strain%20in%20the%20PPCs%2C%20that%20is%20a%20hint%20that%20other%20parts%20of%20the%20modeling%2Finference%20story%20matter%E2%80%94most%20notably%20the%20%24(r%2Cp)%24%20geometry%20under%20a%20canonical%20parameterization%20and%20what%20a%20mean-field%20posterior%20can%20(and%20cannot)%20represent.%20We%20will%20chase%20those%20refinements%20next%3B%20for%20now%2C%20treat%20this%20PPC%20pass%20as%20the%20%E2%80%9Cdid%20depth%20help%3F%E2%80%9D%20checkpoint.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_nbvcp%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_ppc(%0A%20%20%20%20%20%20%20%20results_nbvcp%2C%0A%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20n_genes%3D16%2C%0A%20%20%20%20%20%20%20%20n_rows%3D4%2C%0A%20%20%20%20%20%20%20%20figsize%3D(8%2C%208)%2C%0A%20%20%20%20%20%20%20%20n_samples%3D512%2C%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Compared%20to%20the%20first%20fit%2C%20many%20genes%20stop%20looking%20systematically%20shifted%20relative%20to%20the%20model%E2%80%99s%20predictive%20band%2C%20because%20depth%20is%20no%20longer%20being%20smuggled%20into%20every%20gene-level%20parameter.%0A%0A%20%20%20%20But%20it%20is%20also%20common%E2%80%94exactly%20what%20we%20see%20here%E2%80%94that%20the%20predictive%20shaded%20region%20still%20looks%20%E2%80%9Ctoo%20fat.%E2%80%9D%20When%20that%20happens%2C%20it%20is%20a%20hint%20that%20capture%20was%20only%20part%20of%20the%20story.%20The%20other%20part%20is%20geometry%3A%20in%20canonical%20coordinates%20we%20explicitly%20sample%20%24r_g%24%20and%20%24p%24%2C%20and%20those%20coordinates%20carry%20a%20strong%2C%20curved%20dependence%20(the%20same%20%E2%80%9Cbanana%E2%80%9D%20posterior%20shape%20people%20discuss%20for%20negative%20binomials).%20A%20mean-field%20variational%20posterior%20tries%20to%20approximate%20that%20curved%20object%20with%20something%20rectangular%2C%20which%20often%20manifests%20as%20extra%20uncertainty%3A%20the%20model%20%E2%80%9Cknows%E2%80%9D%20it%20cannot%20be%20arbitrarily%20confident%2C%20so%20predictive%20simulations%20become%20unnecessarily%20diffuse%20even%20when%20the%20mean%20trend%20improves.%0A%0A%20%20%20%20Our%20next%20move%20is%20therefore%20not%20%E2%80%9Cchange%20the%20biology%E2%80%9D%E2%80%94it%20is%20to%20change%20the%20coordinates%20we%20use%20for%20inference%20while%20keeping%20the%20same%20generative%20story%20for%20UMIs.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20Same%20generative%20model%2C%20new%20coordinates%3A%20the%20mean%E2%80%93odds%20parameterization%0A%0A%20%20%20%20In%20%60scribe%60%2C%20the%20%60mean_odds%60%20parameterization%20swaps%20the%20primary%20free%20parameters%20to%3A%0A%0A%20%20%20%20-%20%24%5Cmu_g%24%3A%20a%20gene-specific%20mean%20parameter%20for%20the%20negative%20binomial%20(at%20the%20biological%20level%2C%20i.e.%2C%20the%20mean%20mRNA%20count%20%24%5Clangle%20m%20%5Crangle%24)%2C%20and%0A%20%20%20%20-%20%24%5Cphi%24%3A%20a%20positive%20odds-style%20parameter%20that%20sets%20the%20NB%20success%20probability%20through%0A%0A%20%20%20%20%24%24%0A%20%20%20%20p%20%3D%20%5Cfrac%7B1%7D%7B1%2B%5Cphi%7D%2C%0A%20%20%20%20%24%24%0A%0A%20%20%20%20with%20the%20dispersion%20reconstructed%20deterministically%20as%0A%0A%20%20%20%20%24%24%0A%20%20%20%20r_g%20%3D%20%5Cmu_g%5C%2C%5Cphi.%0A%20%20%20%20%24%24%0A%0A%20%20%20%20Two%20points%20make%20this%20a%20big%20deal%20in%20practice%3A%0A%0A%20%20%20%201.%20**A%20friendlier%20landscape%20for%20optimization.**%20A%20probability%20like%20%24p%24%20lives%20in%20a%20bounded%20interval%20and%20can%20spend%20time%20pressed%20against%20boundaries%20in%20tricky%20regimes.%20Working%20with%20%24%5Cphi%20%5Cin%20(0%2C%5Cinfty)%24%20avoids%20much%20of%20the%20numerical%20awkwardness%20that%20comes%20from%20navigating%20a%20near-hard%20wall%20at%20%24p%20%5Capprox%201%24.%0A%0A%20%20%20%202.%20**Correlation%20becomes%20partly%20built-in%20rather%20than%20fought.**%20In%20canonical%20form%2C%20many%20%24(r_g%2C%20p)%24%20pairs%20reproduce%20similar%20means%2C%20creating%20the%20ridge%20that%20mean-field%20approximations%20struggle%20with.%20In%20mean%E2%80%93odds%20coordinates%2C%20%24%5Cmu_g%24%20is%20the%20natural%20knob%20for%20%E2%80%9Chow%20highly%20expressed%20is%20this%20gene%2C%E2%80%9D%20and%20%24%5Cphi%24%20absorbs%20complementary%20mean%E2%80%93variance%20%2F%20odds%20degrees%20of%20freedom%2C%20with%20%24r_g%24%20no%20longer%20an%20independent%20floating%20parameter%E2%80%94it%20is%20computed%20from%20%24(%5Cmu_g%2C%20%5Cphi)%24.%20This%20does%20not%20magically%20eliminate%20all%20posterior%20dependence%20(the%20data%20can%20still%20couple%20genes)%2C%20but%20it%20is%20the%20same%20trick%20used%20throughout%20statistics%3A%20reparameterize%20so%20the%20parameters%20you%20fit%20are%20closer%20to%20orthogonal%20in%20practice%2C%20even%20when%20the%20underlying%20model%20is%20unchanged.%0A%0A%20%20%20%20The%20right%20mental%20model%20is%20cartographic%3A%20you%20are%20not%20changing%20the%20terrain%20(the%20NB%20sampling%20model%2C%20capture%20thinning%2C%20etc.)%3B%20you%20are%20changing%20the%20map%20projection%20the%20optimizer%20and%20variational%20approximation%20use%20to%20walk%20over%20it.%20Same%20model%2C%20different%20geometry%E2%80%94and%20often%20a%20noticeably%20easier%20path%20to%20tighter%2C%20more%20honest%20uncertainty%20in%20PPCs.%20The%20change%20is%20as%20simple%20as%20setting%20%60parameterization%3D%22mean_odds%22%60%20in%20%60scribe.fit%60.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20clear_caches%2C%20gc%2C%20out_dir%2C%20pickle%2C%20scribe)%3A%0A%20%20%20%20%23%20Clear%20GPU%20memory%20from%20previous%20fits%0A%20%20%20%20clear_caches()%0A%20%20%20%20gc.collect()%0A%0A%20%20%20%20%23%20Define%20parameterization%0A%20%20%20%20_parameterization%20%3D%20%22mean_odds%22%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp_%7B_parameterization%7D.pkl%22%0A%0A%20%20%20%20if%20_out_path.exists()%3A%0A%20%20%20%20%20%20%20%20%23%20Load%20model%20from%20pkl%20file%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20results_odds%20%3D%20pickle.load(_f)%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%23%20Fit%20basic%20model%20to%20data%20with%20variable%20capture%20probability%20fit%20per%20cell%0A%20%20%20%20%20%20%20%20results_odds%20%3D%20scribe.fit(%0A%20%20%20%20%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20variable_capture%3DTrue%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20parameterization%3D_parameterization%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20early_stopping%3D%7B%22enabled%22%3A%20True%2C%20%22patience%22%3A%201000%7D%0A%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%23%20Save%20the%20fitted%20model%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22wb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20pickle.dump(results_odds%2C%20_f)%0A%0A%20%20%20%20results_odds%0A%20%20%20%20return%20(results_odds%2C)%0A%0A%0A%40app.cell%0Adef%20_(results_odds%2C%20scribe)%3A%0A%20%20%20%20%23%20Plot%20ELBO%20loss%20for%20mean_odds%20fit%0A%20%20%20%20scribe.viz.plot_loss(results_odds%2C%20figsize%3D(6%2C%203))%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Let%20us%20look%20again%20at%20the%20%24p_%5Ctext%7Bcapture%7D%24%20scaling%20plot.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_odds%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_p_capture_scaling(%0A%20%20%20%20%20%20%20%20results_odds%2C%20counts%3Dadata%2C%20figsize%3D(4%2C4)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20For%20this%20fit%2C%20the%20model%20places%20the%20full-capture%20saturation%20point%20around%2030K%20UMI%20reads.%20As%20discussed%20earlier%2C%20the%20absolute%20scale%20of%20%24%5Cnu_c%24%20is%20not%20identifiable%20from%20counts%20alone%E2%80%94but%20for%20downstream%20analysis%20the%20degeneracy%20does%20not%20affect%20our%20conclusions.%0A%0A%20%20%20%20Let%20us%20look%20at%20the%20resulting%20PPCs%20for%20the%20same%20set%20of%20genes.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_odds%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_ppc(%0A%20%20%20%20%20%20%20%20results_odds%2C%0A%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20n_genes%3D16%2C%0A%20%20%20%20%20%20%20%20n_rows%3D4%2C%0A%20%20%20%20%20%20%20%20figsize%3D(8%2C%208)%2C%0A%20%20%20%20%20%20%20%20n_samples%3D512%2C%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20This%20is%20a%20substantial%20step%20forward%3A%20after%20switching%20to%20%60mean_odds%60%2C%20the%20PPC%20panels%20track%20the%20observed%20UMI%20histograms%20much%20more%20closely%2C%20and%20the%20predictive%20bands%20no%20longer%20look%20artificially%20inflated%20in%20the%20way%20we%20saw%20when%20the%20variational%20approximation%20was%20fighting%20canonical%20%24(r%2Cp)%24%20geometry.%0A%0A%20%20%20%20PPCs%20are%20useful%20for%20a%20handful%20of%20genes%2C%20but%20it%20is%20natural%20to%20wonder%20whether%20the%20improvement%20holds%20genome-wide.%20%60scribe.viz.plot_mean_calibration%60%20provides%20a%20compact%20global%20check%3A%20one%20point%20per%20gene%20comparing%20the%20empirical%20average%20UMI%20count%20to%20the%20model%E2%80%99s%20predicted%20average%20from%20the%20MAP%20parameters.%20For%20variable-capture%20fits%2C%20the%20prediction%20converts%20the%20latent%20NB%20%E2%80%9Cbiological%E2%80%9D%20mean%20into%20an%20observed-scale%20mean%20using%20the%20average%20capture%20%24%5Cbar%7B%5Cnu%7D%24%20across%20cells%2C%20so%20the%20comparison%20is%20like-for-like%20against%20what%20you%20actually%20measured.%20On%20a%20log%E2%80%93log%20plot%2C%20points%20hugging%20the%20identity%20line%20mean%20the%20model%E2%80%99s%20best%20single-parameter%20explanation%20of%20overall%20expression%20level%20matches%20the%20data%20across%20the%20dynamic%20range%3B%20systematic%20curvature%20or%20a%20drifting%20cloud%20flags%20global%20miscalibration%20that%20might%20be%20easy%20to%20miss%20in%20individual%20PPC%20panels.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_odds%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_mean_calibration(%0A%20%20%20%20%20%20%20%20results_odds%2C%20counts%3Dadata%2C%20figsize%3D(4%2C%204)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20With%20good%20PPCs%20in%20hand%2C%20we%20can%20look%20at%20the%20same%20interpretable%20NB%20coordinates%20biologists%20think%20in%E2%80%94even%20though%20the%20fit%20was%20done%20in%20%24(%5Cmu_g%2C%20%5Cphi)%24%20space.%20In%20%60scribe%60%2C%20%24p%24%20and%20%24r_g%24%20are%20deterministic%20functions%20of%20the%20sampled%20parameters%20(they%20inherit%20uncertainty%20from%20%24(%5Cmu%2C%20%5Cphi)%24%2C%20but%20they%20are%20not%20separate%20free%20knobs%20in%20this%20parameterization).%20The%20next%20corner%20plot%20is%20not%20contradicting%20the%20reparameterization%3A%20we%20are%20simply%20exporting%20the%20implied%20%24(p%2C%20r_g)%24%20samples%20to%20compare%20apples-to-apples%20with%20the%20earlier%20canonical-only%20exercise.%0A%0A%20%20%20%20If%20things%20went%20well%2C%20the%202D%20panels%20involving%20shared%20%24p%24%20and%20gene-specific%20%24r_g%24%20should%20look%20more%20structured%20and%20less%20like%20an%20arbitrary%20fat%20blob%20than%20under%20the%20canonical%20mean-field%20regime.%20Good%20PPCs%20mean%20the%20model%20has%20found%20a%20self-consistent%20explanation%20of%20the%20counts%3B%20when%20you%20map%20that%20explanation%20back%20to%20%24(r%2Cp)%24%20coordinates%2C%20you%20often%20see%20correlations%20closer%20to%20the%20mean%E2%80%93dispersion%20constraints%20implied%20by%20the%20NB%2C%20rather%20than%20the%20approximation%20smearing%20mass%20in%20a%20way%20that%20made%20predictions%20overly%20wide.%0A%0A%20%20%20%20Below%20we%20reuse%20the%20same%20illustrative%20genes%20and%20call%20%60get_posterior_matrix%60%20with%20%60include%3D%5B%22p%22%2C%22r%22%5D%60%20and%20%60exclude_deterministic%3DFalse%60%20so%20those%20derived%20canonical%20parameters%20are%20actually%20materialized%20for%20plotting.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(corner%2C%20gene_index%2C%20results_odds)%3A%0A%20%20%20%20_results_subset%20%3D%20results_odds%5Bgene_index%5D%0A%0A%20%20%20%20%23%20Export%20directly%20as%20a%20corner-ready%20matrix%20plus%20labels%2Fmetadata.%0A%20%20%20%20_samples_2d%2C%20_labels%2C%20_%20%3D%20_results_subset.get_posterior_matrix(%0A%20%20%20%20%20%20%20%20n_samples%3D5_000%2C%0A%20%20%20%20%20%20%20%20include%3D%5B%22p%22%2C%20%22r%22%5D%2C%0A%20%20%20%20%20%20%20%20exclude_deterministic%3DFalse%2C%0A%20%20%20%20%20%20%20%20store_samples%3DFalse%2C%0A%20%20%20%20%20%20%20%20convert_to_numpy%3DTrue%2C%0A%20%20%20%20)%0A%0A%20%20%20%20corner.corner(_samples_2d%2C%20labels%3D_labels)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20We%20see%20the%20expected%20coupling%20between%20%24r_g%24%20and%20the%20shared%20%24p%24%3A%20in%20a%20negative%20binomial%2C%20many%20%24(r_g%2C%20p)%24%20pairs%20reproduce%20a%20similar%20mean%20along%20a%20curved%20ridge%2C%20so%20a%20strong%20%24r_g%24%E2%80%93%24p%24%20association%20is%20exactly%20what%20the%20sampling%20model%20(not%20necessarily%20a%20surprise%20from%20the%20data)%20tells%20you%20to%20expect.%0A%0A%20%20%20%20A%20corner%20plot%20can%20over-interpret%20that%20as%20%E2%80%9Ceverything%20is%20correlated%20with%20everything%E2%80%9D%20for%20a%20few%20mechanical%20reasons.%20First%2C%20%24p%24%20is%20the%20same%20across%20genes%20in%20this%20fit%2C%20so%20the%20%24p%24%20axis%20is%20literally%20one%20shared%20draw%20replayed%20in%20every%20%24r_g%24%E2%80%93%24p%24%20panel%E2%80%94of%20course%20the%20clouds%20line%20up.%20Second%2C%20you%20are%20exporting%20coordinates%20%24(p%2C%20r_g)%24%20that%20are%20not%20meant%20to%20be%20orthogonal%20in%20the%20original%20parameterization%2C%20while%20the%20fit%20itself%20was%20done%20with%20a%20mean-field%20posterior%20that%20pretends%20each%20dimension%20has%20an%20independent%20margin.%20The%20combination%E2%80%94shared%20parameter%20%2B%20ridge%20geometry%20%2B%20mean-field%E2%80%94can%20make%20the%20off-diagonals%20look%20like%20extreme%20global%20correlation%20even%20when%20a%20large%20portion%20of%20the%20pattern%20is%20(i)%20structural%20sharing%20of%20%24p%24%20and%20(ii)%20a%20variational%2Fparameterization%20artifact%20rather%20than%20extra%20biological%20coupling.%0A%0A%20%20%20%20When%20we%20fit%20in%20mean%E2%80%93odds%20%24(%5Cmu_g%2C%20%5Cphi)%24%20coordinates%2C%20%24p%24%20and%20%24r_g%24%20become%20deterministic%20functions%20of%20%24(%5Cmu_g%2C%20%5Cphi)%24%3A%20%24p%24%20depends%20on%20%24%5Cphi%24%20alone%2C%20and%20%24r_g%20%3D%20%5Cmu_g%20%5Cphi%24.%20That%20does%20not%20erase%20dependence%20in%20the%20true%20posterior%2C%20but%20it%20changes%20what%20the%20guide%20is%20allowed%20to%20smear.%20To%20make%20the%20contrast%20obvious%2C%20the%20next%20panel%20plots%20%24(%5Cmu%2C%20%5Cphi)%24%E2%80%94where%20the%20fit%20actually%20lives%E2%80%94where%20we%20expect%20the%20spurious%20%E2%80%9Cglobal%20correlation%E2%80%9D%20to%20fall%20away%20and%20the%20posterior%20samples%20to%20look%20like%20relatively%20uncoupled%202D%20marginals.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(corner%2C%20gene_index%2C%20results_odds)%3A%0A%20%20%20%20_results_subset%20%3D%20results_odds%5Bgene_index%5D%0A%0A%20%20%20%20%23%20Export%20directly%20as%20a%20corner-ready%20matrix%20plus%20labels%2Fmetadata.%0A%20%20%20%20_samples_2d%2C%20_labels%2C%20_%20%3D%20_results_subset.get_posterior_matrix(%0A%20%20%20%20%20%20%20%20n_samples%3D5_000%2C%0A%20%20%20%20%20%20%20%20include%3D%5B%22phi%22%2C%20%22mu%22%5D%2C%0A%20%20%20%20%20%20%20%20exclude_deterministic%3DTrue%2C%0A%20%20%20%20%20%20%20%20store_samples%3DFalse%2C%0A%20%20%20%20%20%20%20%20convert_to_numpy%3DTrue%2C%0A%20%20%20%20)%0A%0A%20%20%20%20corner.corner(_samples_2d%2C%20labels%3D_labels)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20A%20joint%2C%20low-rank%20variational%20guide%20(still%20the%20same%20generative%20model)%0A%0A%20%20%20%20So%20far%2C%20variable%20capture%20helped%20the%20model%20treat%20depth%20more%20honestly%2C%20but%20the%20posterior%20approximation%20itself%20can%20still%20be%20a%20limiting%20factor.%20The%20default%20mean-field%20story%20is%20computationally%20convenient%2C%20but%20it%20has%20a%20structural%20blind%20spot%3A%20it%20pretends%20different%20parameter%20blocks%20vary%20independently%20even%20when%20the%20data%20naturally%20tie%20them%20together.%0A%0A%20%20%20%20Two%20kinds%20of%20coupling%20matter%20for%20negative%20binomial%20counts%3A%0A%0A%20%20%20%20-%20**Within%20a%20gene%3A**%20the%20NB%20still%20has%20a%20mean%E2%80%93variance%20%2F%20mean%E2%80%93dispersion%20structure.%20In%20canonical%20%24(r_g%2C%20p)%24%20space%20that%20shows%20up%20as%20the%20famous%20curved%20ridge.%20Mean%E2%80%93odds%20%24(%5Cmu_g%2C%20%5Cphi)%24%20already%20made%20mean-field%20life%20easier%20because%20%24%5Cmu_g%24%20is%20the%20natural%20%E2%80%9Clevel%E2%80%9D%20knob%20and%20%24%5Cphi%24%20carries%20complementary%20degrees%20of%20freedom%2C%20with%20%24(p%2C%20r_g)%24%20computed%20deterministically.%0A%20%20%20%20-%20**Across%20genes%3A**%20even%20with%20a%20shared%20success%20probability%2C%20every%20gene%E2%80%99s%20data%20tug%20on%20the%20same%20shared%20piece%20of%20the%20model%2C%20inducing%20indirect%20posterior%20coupling%20between%20genes.%20A%20factorized%20guide%20can%20smear%20that%20coupling%20into%20extra%20marginal%20uncertainty%2C%20which%20often%20looks%20like%20over-wide%20PPC%20bands.%0A%0A%20%20%20%20The%20paper%E2%80%99s%20joint%20low-rank%20guide%20is%20a%20pragmatic%20upgrade%20that%20is%20not%20tied%20to%20one%20parameterization%3A%20%60scribe%60%20can%20pair%20low-rank%20%2B%20joint%20structure%20with%20%60canonical%60%2C%20%60mean_prob%60%2C%20or%20%60mean_odds%60.%20Mathematically%2C%20the%20idea%20is%20to%20replace%20an%20overly%20factorized%20family%20with%20one%20multivariate%20Normal%20(in%20an%20unconstrained%2Freparameterized%20space)%20whose%20covariance%20is%20low%20rank%3A%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Cunderline%7B%5Cunderline%7B%5CSigma%7D%7D%20%5C%3B%5Capprox%5C%3B%20%5Cunderline%7B%5Cunderline%7BW%7D%7D%5C%2C%5Cunderline%7B%5Cunderline%7BW%7D%7D%5E%5Ctop%20%2B%20%5Cunderline%7B%5Cunderline%7BD%7D%7D%2C%0A%20%20%20%20%24%24%0A%0A%20%20%20%20with%20a%20small%20number%20of%20columns%20in%20%24%5Cunderline%7B%5Cunderline%7BW%7D%7D%24%20(here%20%60guide_rank%3D128%60)%E2%80%94analogous%20to%20keeping%20only%20the%20first%20few%20principal%20components%20to%20summarize%20coordinated%20variation.%20The%20guide%20can%20move%20parameters%20along%20a%20low-dimensional%20set%20of%20shared%20joint%20modes%20instead%20of%20forcing%20strict%20independence%2C%20while%20still%20avoiding%20the%20%24O(G%5E2)%24%20cost%20of%20a%20dense%20covariance%20across%20all%20genes.%0A%0A%20%20%20%20We%20run%20it%20under%20%60mean_odds%60%20because%20that%20parameterization%20was%20already%20empirically%20well-behaved%20at%20the%20mean-field%20stage.%20In%20code%2C%20three%20knobs%20implement%20the%20idea%3A%0A%0A%20%20%20%20-%20%60guide_rank%3D128%60%3A%20how%20many%20joint%20directions%20of%20coordinated%20posterior%20variability%20the%20guide%20can%20use.%0A%20%20%20%20-%20%60joint_params%3D%22biological%22%60%3A%20shorthand%20for%20the%20two%20sampled%20core%20parameters%20in%20the%20active%20parameterization.%20For%20%60mean_odds%60%2C%20that%20means%20%24%5Cunderline%7B%5Cmu%7D%24%20and%20%24%5Cphi%24%20are%20guided%20jointly%2C%20so%20the%20approximation%20can%20represent%20correlation%20between%20mean%20intensity%20and%20odds%2Fdispersion-like%20degrees%20of%20freedom.%0A%0A%20%20%20%20We%20also%20increase%20%60n_steps%3D500_000%60%20because%20richer%20guides%20often%20need%20more%20optimization%20time%2C%20and%20we%20turn%20off%20%60early_stopping%60%20to%20let%20the%20optimizer%20use%20all%20available%20steps.%20The%20loss%20curve%20may%20look%20flat%20for%20long%20stretches%2C%20but%20small%20incremental%20improvements%20still%20tighten%20the%20fit.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20clear_caches%2C%20gc%2C%20out_dir%2C%20pickle%2C%20scribe)%3A%0A%20%20%20%20%23%20Clear%20GPU%20memory%20from%20previous%20fits%0A%20%20%20%20clear_caches()%0A%20%20%20%20gc.collect()%0A%0A%20%20%20%20%23%20Define%20parameterization%0A%20%20%20%20_parameterization%20%3D%20%22mean_odds%22%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp-low-rank_%7B_parameterization%7D.pkl%22%0A%0A%20%20%20%20if%20_out_path.exists()%3A%0A%20%20%20%20%20%20%20%20%23%20Load%20model%20from%20pkl%20file%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20results_lowrank%20%3D%20pickle.load(_f)%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%23%20Fit%20basic%20model%20to%20data%20with%20variable%20capture%20probability%20fit%20per%20cell%0A%20%20%20%20%20%20%20%20results_lowrank%20%3D%20scribe.fit(%0A%20%20%20%20%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20variable_capture%3DTrue%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20parameterization%3D_parameterization%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20guide_rank%3D128%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20joint_params%3D%22biological%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20n_steps%3D500_000%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20early_stopping%3D%7B%22enabled%22%3A%20False%7D%0A%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%23%20Save%20the%20fitted%20model%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22wb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20pickle.dump(results_lowrank%2C%20_f)%0A%0A%20%20%20%20results_lowrank%0A%20%20%20%20return%20(results_lowrank%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Let%20us%20check%20the%20loss%20to%20make%20sure%20training%20converged.%0A%0A%20%20%20%20%3E%20**Note%3A**%20we%20are%20not%20showing%20the%20loss%20curve%20for%20every%20model%20to%20save%20space%2C%20but%20this%20should%20always%20be%20the%20first%20plot%20you%20make.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(results_lowrank%2C%20scribe)%3A%0A%20%20%20%20%23%20Plot%20ELBO%20loss%0A%20%20%20%20scribe.viz.plot_loss(results_lowrank%2C%20figsize%3D(6%2C%203))%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Now%20let%20us%20look%20at%20the%20PPC%20plots%20under%20the%20low-rank%20posterior.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_lowrank%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_ppc(%0A%20%20%20%20%20%20%20%20results_lowrank%2C%0A%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20n_genes%3D16%2C%0A%20%20%20%20%20%20%20%20n_rows%3D4%2C%0A%20%20%20%20%20%20%20%20figsize%3D(8%2C%208)%2C%0A%20%20%20%20%20%20%20%20n_samples%3D512%2C%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20After%20the%20joint%20low-rank%20fit%2C%20the%20PPCs%20for%20these%20genes%20look%20essentially%20the%20same%20as%20the%20mean-field%20%60mean_odds%60%20run%E2%80%94the%20marginal%20count%20predictions%20were%20already%20well%20calibrated.%20Let%20us%20check%20the%20global%20mean%20calibration%20as%20well.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_lowrank%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_mean_calibration(%0A%20%20%20%20%20%20%20%20results_lowrank%2C%20counts%3Dadata%2C%20figsize%3D(4%2C%204)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20So%20far%2C%20mean-field%20variational%20inference%20only%20gave%20us%20independent%20approximate%20marginals%E2%80%94there%20was%20no%20notion%20of%20%E2%80%9Cthis%20gene%E2%80%99s%20parameter%20co-moves%20with%20that%20gene%E2%80%99s%E2%80%9D%20inside%20the%20fit.%20A%20joint%20guide%20(here%2C%20low-rank%20structure%20across%20genes%20and%20parameters)%20is%20exactly%20where%20that%20becomes%20possible%3A%20the%20approximation%20can%20place%20non-negligible%20posterior%20correlation%20between%20gene-level%20parameters%20when%20the%20data%20support%20it.%0A%0A%20%20%20%20%60scribe.viz.plot_correlation_heatmap%60%20visualizes%20a%20correlation%20matrix%20of%20posterior%20samples%20(clustered%2C%20focusing%20on%20a%20subset%20of%20the%20most%20variable%20genes%20for%20readability).%20It%20is%20a%20practical%20first%20pass%20at%20spotting%20putative%20co-regulation%3A%20genes%20whose%20fitted%20parameters%20co-vary%20under%20the%20current%20model%20and%20guide.%20These%20patterns%20are%20worth%20cross-checking%20with%20external%20biology%2C%20but%20they%20provide%20a%20useful%20way%20to%20go%20from%20a%20%E2%80%9Cbetter%20guide%E2%80%9D%20to%20something%20that%20can%20suggest%20coordinated%20gene%20modules%20for%20follow-up.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_lowrank%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_correlation_heatmap(%0A%20%20%20%20%20%20%20%20results_lowrank%2C%20%0A%20%20%20%20%20%20%20%20counts%3Dadata%2C%20%0A%20%20%20%20%20%20%20%20n_genes%3D100%2C%0A%20%20%20%20%20%20%20%20n_samples%3D1_000%2C%0A%20%20%20%20%20%20%20%20figsize%3D(7%2C7)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(corner%2C%20gene_index%2C%20results_lowrank)%3A%0A%20%20%20%20_results_subset%20%3D%20results_lowrank%5Bgene_index%5D%0A%0A%20%20%20%20%23%20Export%20directly%20as%20a%20corner-ready%20matrix%20plus%20labels%2Fmetadata.%0A%20%20%20%20_samples_2d%2C%20_labels%2C%20_%20%3D%20_results_subset.get_posterior_matrix(%0A%20%20%20%20%20%20%20%20n_samples%3D5_000%2C%0A%20%20%20%20%20%20%20%20include%3D%5B%22p%22%2C%20%22r%22%5D%2C%0A%20%20%20%20%20%20%20%20exclude_deterministic%3DFalse%2C%0A%20%20%20%20%20%20%20%20store_samples%3DFalse%2C%0A%20%20%20%20%20%20%20%20convert_to_numpy%3DTrue%2C%0A%20%20%20%20)%0A%0A%20%20%20%20corner.corner(_samples_2d%2C%20labels%3D_labels)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%23%20Why%20is%20the%20%24p%24%20range%20so%20much%20narrower%20here%3F%0A%0A%20%20%20%20In%20the%20mean-field%20corner%20plot%20above%2C%20%24p%24%20wandered%20freely%20from%20roughly%200.24%20to%200.48%E2%80%94and%20yet%20the%20PPCs%20looked%20excellent.%20Here%2C%20the%20joint%20low-rank%20guide%20pins%20%24p%24%20down%20to%20a%20sliver%20around%200.314.%20At%20first%20glance%20that%20feels%20wrong%3A%20if%20so%20many%20%24p%24%20values%20gave%20good%20fits%20before%2C%20should%20a%20richer%20posterior%20not%20keep%20that%20flexibility%3F%0A%0A%20%20%20%20The%20short%20answer%20is%3A%20the%20mean-field%E2%80%99s%20wide%20%24p%24%20was%20the%20artifact%2C%20not%20this%20narrow%20one.%0A%0A%20%20%20%20A%20mean-field%20guide%20treats%20every%20parameter%20as%20independent.%20It%20*cannot*%20represent%20the%20fact%20that%20when%20%24p%24%20goes%20up%2C%20each%20%24r_g%24%20should%20go%20down%20in%20a%20coordinated%20way%20to%20keep%20the%20gene%20means%20stable.%20Faced%20with%20that%20limitation%2C%20the%20optimizer%20makes%20%24q(%5Cphi)%24%20wide%20and%20each%20%24q(%5Cmu_g)%24%20wide%20independently%2C%20so%20that%20at%20least%20some%20random%20draws%20land%20near%20the%20good-fit%20region%20by%20chance.%20The%20wide%20%24p%24%20range%20you%20saw%20is%20not%20the%20posterior%20saying%20%E2%80%9C%24p%24%20is%20genuinely%20uncertain%E2%80%9D%E2%80%94it%20is%20the%20guide%20saying%20%E2%80%9CI%20cannot%20model%20the%20coupling%2C%20so%20I%20will%20be%20vague%20about%20everything%20and%20hope%20for%20the%20best.%E2%80%9D%20The%20dramatic%20banana%20shapes%20in%20the%20off-diagonal%20panels%20literally%20show%20the%20ridge%20structure%20the%20guide%20is%20failing%20to%20capture.%0A%0A%20%20%20%20Think%20of%20it%20like%20giving%20directions%20without%20a%20map.%20You%20might%20say%20%E2%80%9Cthe%20restaurant%20is%20somewhere%20between%201st%20and%2010th%20Street%E2%80%9D%E2%80%94not%20because%20the%20address%20is%20uncertain%2C%20but%20because%20you%20cannot%20explain%20how%20to%20get%20there.%20Hand%20someone%20a%20map%20(model%20the%20coupling)%20and%20the%20directions%20collapse%20to%20%E2%80%9Cit%E2%80%99s%20at%205th%20and%20Main.%E2%80%9D%0A%0A%20%20%20%20The%20low-rank%20guide%20*can*%20model%20the%20coupling%3A%20each%20gene%E2%80%99s%20%24%5Cmu_g%24%20shifts%20in%20response%20to%20%24%5Cphi%24%20through%20a%20learned%20per-gene%20regression%20coefficient%20%24%5Calpha_g%24.%20Once%20that%20coupling%20is%20explicit%2C%20the%20marginal%20width%20of%20%24%5Cphi%24%20reflects%20the%20actual%20posterior%20uncertainty%20perpendicular%20to%20the%20ridge%E2%80%94how%20much%20%24%5Cphi%24%20can%20vary%20when%20all%20the%20%24%5Cmu_g%24%20values%20adjust%20optimally%20alongside%20it.%20With%20%24%5Csim%5C!30%7B%2C%7D000%24%20genes%20and%20%24%5Csim%5C!3%7B%2C%7D000%24%20cells%20all%20constraining%20a%20single%20shared%20scalar%2C%20that%20residual%20uncertainty%20is%20genuinely%20tiny.%20The%20narrow%20%24p%24%20is%20not%20a%20limitation%20of%20the%20guide%3B%20it%20is%20the%20guide%20being%20precise%20enough%20to%20show%20how%20tightly%20the%20data%20actually%20constrain%20this%20parameter.%0A%0A%20%20%20%20A%20sanity%20check%20for%20the%20intuition%3A%20the%20%24r_g%24%E2%80%93%24r_g%24%20panels%20also%20tightened%20dramatically.%20Under%20the%20mean-field%2C%20the%20wide%20off-diagonals%20between%20%24r_g%24%20values%20came%20almost%20entirely%20from%20replaying%20the%20same%20noisy%20%24%5Cphi%24%20draw%20across%20genes%E2%80%94shared-parameter%20bookkeeping%2C%20not%20genuine%20gene%E2%80%93gene%20biology.%20The%20joint%20guide%20absorbs%20that%20mechanical%20coupling%20into%20%24%5Calpha_g%24%20and%20leaves%20only%20the%20real%20residual%20structure.%0A%0A%20%20%20%20In%20summary%3A%20the%20mean-field%20inflates%20marginal%20uncertainty%20to%20compensate%20for%20missing%20coupling%3B%20the%20joint%20guide%20models%20the%20coupling%20directly%20and%20reveals%20that%20the%20true%20uncertainty%20is%20small.%20Both%20are%20internally%20consistent%20given%20their%20assumptions%2C%20but%20the%20joint%20guide%E2%80%99s%20assumptions%20are%20closer%20to%20reality.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20Relaxing%20the%20shared-%24p%24%20assumption%3A%20gene-specific%20%24p_g%24%0A%0A%20%20%20%20So%20far%2C%20every%20model%20we%20have%20fit%20assumes%20a%20single%20success%20probability%20%24p%24%20shared%20across%20all%20genes.%20That%20assumption%20is%20what%20makes%20the%20Dirichlet%E2%80%93multinomial%20factorization%20work%20cleanly%3A%20when%20every%20gene%20uses%20the%20same%20%24p%24%2C%20the%20success%20probability%20cancels%20out%20of%20the%20compositional%20distribution%20and%20differential%20expression%20can%20operate%20on%20%24%5Cunderline%7Br%7D%24%20alone.%0A%0A%20%20%20%20But%20what%20if%20the%20data%20do%20not%20support%20that%20assumption%3F%20Perhaps%20the%20negative%20binomial%E2%80%99s%20mean%E2%80%93variance%20relationship%20is%20not%20uniform%20across%20the%20transcriptome.%20%60scribe%60%20provides%20a%20principled%20way%20to%20check%3A%20replace%20the%20scalar%20%24p%24%20with%20a%20hierarchical%20gene-specific%20%24p_g%24%2C%20where%20each%20gene%20draws%20its%20own%20success%20probability%20from%20a%20learned%20population%20distribution%3A%0A%0A%20%20%20%20%24%24%0A%20%20%20%20%5Ctext%7Blogit%7D(p_g)%20%5Csim%20%5Cmathcal%7BN%7D(%5Cmu_p%2C%20%5Csigma_p)%2C%20%5Cquad%20g%20%3D%201%2C%20%5Cldots%2C%20G.%0A%20%20%20%20%24%24%0A%0A%20%20%20%20The%20hyperparameters%20%24%5Cmu_p%24%20and%20%24%5Csigma_p%24%20are%20inferred%20from%20the%20data.%20If%20the%20data%20are%20consistent%20with%20shared%20%24p%24%2C%20the%20posterior%20will%20shrink%20%24%5Csigma_p%24%20toward%20zero%20and%20all%20%24p_g%24%20will%20collapse%20to%20the%20same%20value%E2%80%94recovering%20the%20standard%20model%20automatically.%20If%20gene-to-gene%20variation%20is%20real%2C%20the%20model%20can%20accommodate%20it%20without%20forcing%20it.%0A%0A%20%20%20%20Crucially%2C%20%60scribe%60%20knows%20how%20to%20turn%20the%20inferred%20parameters%20into%20compositions%E2%80%94the%20fraction%20of%20the%20total%20transcriptome%20that%20each%20gene%20occupies.%20Under%20the%20standard%20shared-%24p%24%20model%2C%20compositions%20come%20from%20a%20Dirichlet%20distribution%20that%20depends%20only%20on%20%24%5Cunderline%7Br%7D%24.%20Under%20the%20gene-specific%20%24p_g%24%20model%2C%20the%20mapping%20is%20slightly%20more%20involved%20(each%20gene%E2%80%99s%20contribution%20is%20scaled%20by%20its%20own%20%24p_g%24)%2C%20but%20the%20math%20is%20exact%20and%20reduces%20to%20the%20standard%20Dirichlet%20when%20all%20%24p_g%24%20happen%20to%20be%20equal.%20This%20means%20differential%20expression%20analysis%20remains%20on%20solid%20mathematical%20footing%20even%20with%20the%20extra%20flexibility.%0A%0A%20%20%20%20In%20code%2C%20three%20new%20knobs%20appear%20compared%20to%20the%20previous%20low-rank%20model%3A%0A%0A%20%20%20%20-%20%60parameterization%3D%22canonical%22%60%3A%20we%20switch%20back%20to%20canonical%20%24(r_g%2C%20p_g)%24%20coordinates%20because%20%24p_g%24%20is%20now%20gene-specific%20and%20no%20longer%20a%20derived%20quantity%E2%80%94it%20is%20a%20first-class%20sampled%20parameter%20alongside%20%24r_g%24.%0A%20%20%20%20-%20%60dense_params%3D%22mean%22%60%3A%20tells%20the%20structured%20joint%20guide%20which%20parameters%20receive%20the%20full%20low-rank%20covariance%20treatment.%20Here%2C%20%60%22mean%22%60%20refers%20to%20%24r_g%24%3A%20it%20gets%20the%20rank-128%20%24%5Cunderline%7B%5Cunderline%7BW%7D%7D%5Cunderline%7B%5Cunderline%7BW%7D%7D%5E%5Ctop%20%2B%20%5Cunderline%7B%5Cunderline%7BD%7D%7D%24%20structure%20for%20gene%E2%80%93gene%20correlations.%20The%20%24p_g%24%20parameters%20receive%20a%20simpler%20diagonal%20posterior%20with%20a%20linear%20regression%20on%20%24r_g%24%E2%80%94enough%20to%20capture%20per-gene%20%24r_g%24%E2%80%93%24p_g%24%20coupling%20without%20consuming%20rank-%24k%24%20capacity%20on%20%24p_g%24%E2%80%93%24p_g%24%20correlations%20across%20genes.%0A%20%20%20%20-%20%60prob_prior%3D%22gaussian%22%60%3A%20activates%20the%20hierarchical%20gene-specific%20%24p_g%24%20model%2C%20replacing%20the%20default%20flat%20or%20fixed%20prior%20on%20a%20shared%20%24p%24%20with%20the%20%24%5Cmathcal%7BN%7D(%5Cmu_p%2C%20%5Csigma_p)%24%20hierarchy.%0A%0A%20%20%20%20The%20result%20is%20a%20model%20that%20is%20more%20flexible%20in%20its%20generative%20assumptions%20(each%20gene%20has%20its%20own%20%24p_g%24)%20and%20still%20compositionally%20valid%20(the%20Gamma-based%20normalization%20is%20exact).%20The%20tradeoff%E2%80%94which%20we%20will%20examine%20shortly%E2%80%94is%20that%20the%20extra%20degrees%20of%20freedom%20may%20widen%20posterior%20uncertainty%20on%20quantities%20that%20depend%20on%20%24p_g%24%2C%20including%20compositions.%20Whether%20that%20extra%20uncertainty%20is%20a%20feature%20(honest%20reflection%20of%20what%20the%20data%20can%20tell%20us)%20or%20a%20cost%20(estimating%20parameters%20the%20data%20did%20not%20need)%20is%20exactly%20the%20kind%20of%20question%20Bayesian%20model%20comparison%20can%20answer%20formally.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20clear_caches%2C%20gc%2C%20out_dir%2C%20pickle%2C%20scribe)%3A%0A%20%20%20%20%23%20Clear%20GPU%20memory%20from%20previous%20fits%0A%20%20%20%20clear_caches()%0A%20%20%20%20gc.collect()%0A%0A%20%20%20%20%23%20Define%20parameterization%0A%20%20%20%20_parameterization%20%3D%20%22canonical%22%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp-low-rank-prob_%7B_parameterization%7D.pkl%22%0A%0A%20%20%20%20if%20_out_path.exists()%3A%0A%20%20%20%20%20%20%20%20%23%20Load%20model%20from%20pkl%20file%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20results_hierprob%20%3D%20pickle.load(_f)%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%23%20Fit%20basic%20model%20to%20data%20with%20variable%20capture%20probability%20fit%20per%20cell%0A%20%20%20%20%20%20%20%20results_hierprob%20%3D%20scribe.fit(%0A%20%20%20%20%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20variable_capture%3DTrue%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20parameterization%3D_parameterization%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20guide_rank%3D128%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20joint_params%3D%22biological%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20dense_params%3D%22mean%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20prob_prior%3D%22gaussian%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20n_steps%3D100_000%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20early_stopping%3D%7B%22enabled%22%3A%20True%2C%20%22patience%22%3A%201000%7D%0A%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%23%20Save%20the%20fitted%20model%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22wb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20pickle.dump(results_hierprob%2C%20_f)%0A%0A%20%20%20%20results_hierprob%0A%20%20%20%20return%20(results_hierprob%2C)%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Let%20us%20look%20again%20at%20the%20capture-scaling%20plot.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_hierprob%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_p_capture_scaling(%0A%20%20%20%20%20%20%20%20results_hierprob%2C%20counts%3Dadata%2C%20figsize%3D(4%2C4)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20This%20time%2C%20the%20model%20places%20the%20fully%20captured%20saturation%20point%20around%2050K%20reads.%0A%0A%20%20%20%20Now%20let%20us%20look%20at%20the%20PPCs.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_hierprob%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_ppc(%0A%20%20%20%20%20%20%20%20results_hierprob%2C%0A%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20n_genes%3D16%2C%0A%20%20%20%20%20%20%20%20n_rows%3D4%2C%0A%20%20%20%20%20%20%20%20figsize%3D(8%2C%208)%2C%0A%20%20%20%20%20%20%20%20n_samples%3D512%2C%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20The%20PPCs%20for%20this%20run%20look%20excellent%E2%80%94and%20that%20is%20worth%20a%20pause%2C%20because%20we%20are%20no%20longer%20in%20the%20mean%E2%80%93odds%20%24(%5Cmu_g%2C%20%5Cphi)%24%20reparameterization%20that%20we%20leaned%20on%20earlier.%20What%20changed%20is%20not%20%E2%80%9Cforgetting%20the%20geometry%20problem%E2%80%9D%20but%20addressing%20the%20same%20object%20in%20a%20different%20way%3A%20the%20%24(r_g%2C%20p_g)%24%20ridge%20is%20still%20the%20hard%20part%20of%20a%20negative%20binomial%2C%20but%20here%20each%20gene%E2%80%99s%20%24(r_g%2C%20p_g)%24%20pair%20is%20allowed%20to%20move%20together.%20The%20hierarchical%20%24%5Ctext%7Blogit%7D(p_g)%24%20prior%20plus%20a%20guide%20that%20explicitly%20models%20intra-pair%20dependence%20supplies%20enough%20coupling%20that%20the%20variational%20family%20is%20no%20longer%20forced%20to%20smear%20mass%20along%20a%20fake%20axis%20in%20%24(r%2C%20p)%24.%20In%20other%20words%2C%20we%20traded%20the%20%24%5Cmu%20%2F%20%5Cphi%24%20reparameterization%20trick%20for%20gene-specific%20%24p_g%24%20and%20structured%20per-gene%20%24(r_g%2C%20p_g)%24%20dependence%2C%20and%20the%20PPCs%20suggest%20that%20is%20enough%20to%20recover%20the%20good%20predictive%20behavior.%0A%0A%20%20%20%20Let%20us%20check%20the%20global%20calibration%20overview.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_hierprob%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_mean_calibration(%0A%20%20%20%20%20%20%20%20results_hierprob%2C%20counts%3Dadata%2C%20figsize%3D(4%2C%204)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20With%20gene-specific%20%24p_g%24%2C%20a%20corner%20plot%20in%20%60include%3D%5B%22p%22%2C%22r%22%5D%60%20becomes%20a%20bigger%20grid%3A%20you%20are%20no%20longer%20overlaying%20many%20genes%20on%20top%20of%20a%20single%20shared%20%24p%24%2C%20but%20laying%20out%20each%20gene%E2%80%99s%20own%20%24(r_g%2C%20p_g)%24%20pair.%20For%20a%20given%20gene%20you%20should%20see%20strong%20intra-gene%20dependence%20between%20its%20%24r_g%24%20and%20its%20%24p_g%24%20(the%20same%20ridge%20geometry%20as%20before%2C%20but%20now%20per%20gene)%2C%20while%20off-diagonal%20blocks%20mixing%20%24p_g%24%20from%20one%20gene%20with%20%24r_%7Bg%E2%80%99%7D%24%20from%20another%20should%20look%20much%20more%20like%20independent%20blobs.%0A%0A%20%20%20%20The%20lack%20of%20obvious%20correlation%20between%20unmatched%20%24(p_g%2C%20r_%7Bg%E2%80%99%7D)%24%20pairs%20in%20this%20figure%20is%20not%20a%20claim%20that%20the%20model%20has%20forgotten%20gene%E2%80%93gene%20structure%20in%20general%20(the%20low-rank%20joint%20structure%20you%20used%20still%20lives%20in%20the%20fit).%20It%20is%20largely%20a%20statement%20about%20which%20genes%20we%20put%20in%20the%20panel%20for%20visibility%3A%20the%20particular%20set%20of%20names%20in%20%60gene_index%60%20can%20easily%20be%20weakly%20coupled%20to%20each%20other%20under%20the%20posterior%2C%20so%20the%20cross-gene%20corners%20look%20flat%20even%20when%20other%20gene%20sets%E2%80%94or%20a%20heatmap%20across%20many%20loci%E2%80%94would%20show%20shared%20axes%20of%20variation.%20In%20other%20words%2C%20this%20corner%20is%20a%20pairwise%20microscope%20on%20local%20%24(r_g%20%5Cleftrightarrow%20p_g)%24%20coupling%2C%20not%20a%20full%20summary%20of%20all%20cross-gene%20co-movement.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(corner%2C%20gene_index%2C%20results_hierprob)%3A%0A%20%20%20%20_results_subset%20%3D%20results_hierprob%5Bgene_index%5D%0A%0A%20%20%20%20%23%20Export%20directly%20as%20a%20corner-ready%20matrix%20plus%20labels%2Fmetadata.%0A%20%20%20%20_samples_2d%2C%20_labels%2C%20_%20%3D%20_results_subset.get_posterior_matrix(%0A%20%20%20%20%20%20%20%20n_samples%3D5_000%2C%0A%20%20%20%20%20%20%20%20include%3D%5B%22p%22%2C%20%22r%22%5D%2C%0A%20%20%20%20%20%20%20%20exclude_deterministic%3DTrue%2C%0A%20%20%20%20%20%20%20%20store_samples%3DFalse%2C%0A%20%20%20%20%20%20%20%20convert_to_numpy%3DTrue%2C%0A%20%20%20%20)%0A%0A%20%20%20%20corner.corner(_samples_2d%2C%20labels%3D_labels)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%23%20Do%20compositions%20actually%20change%20between%20models%3F%0A%0A%20%20%20%20Both%20models%E2%80%94shared%20%24p%24%20with%20%60mean_odds%60%20and%20gene-specific%20%24p_g%24%20with%20%60canonical%60%E2%80%94gave%20excellent%20calibration%20and%20PPCs.%20But%20the%20question%20we%20ultimately%20care%20about%20for%20downstream%20analysis%20is%3A%20**do%20the%20inferred%20compositions%20differ%3F**%0A%0A%20%20%20%20Compositions%20tell%20us%20what%20fraction%20of%20the%20transcriptome%20each%20gene%20occupies.%20Under%20the%20shared-%24p%24%20model%2C%20each%20gene's%20contribution%20is%20determined%20solely%20by%20its%20dispersion%20%24r_g%24%3A%20draw%20%24%5Cgamma_g%20%5Csim%20%5Ctext%7BGamma%7D(r_g%2C%201)%24%2C%20normalize%2C%20and%20the%20shared%20%24p%24%20drops%20out%20entirely.%20Under%20the%20gene-specific%20%24p_g%24%20model%2C%20each%20gene's%20contribution%20is%20additionally%20scaled%20by%20%24(1%20-%20p_g)%2Fp_g%24%20before%20normalizing%E2%80%94so%20genes%20with%20lower%20%24p_g%24%20get%20a%20proportionally%20larger%20share.%20If%20the%20%24p_g%24%20values%20are%20all%20nearly%20identical%20(as%20the%20hierarchical%20prior%20encourages)%2C%20the%20two%20models%20should%20agree.%0A%0A%20%20%20%20To%20test%20this%2C%20we%20draw%201%2C000%20compositional%20samples%20from%20each%20fitted%20model%20and%20compare%20the%20median%20composition%20of%20every%20gene%20between%20the%20two.%20The%20error%20bars%20show%20the%2095%25%20credible%20interval%20from%20each%20model's%20posterior%2C%20giving%20us%20a%20sense%20of%20how%20uncertain%20each%20model%20is%20about%20each%20gene's%20share%20of%20the%20transcriptome.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(clear_caches%2C%20gc%2C%20np%2C%20out_dir%2C%20pickle%2C%20plt)%3A%0A%20%20%20%20%23%20Clear%20GPU%20memory%20from%20previous%20fits%0A%20%20%20%20clear_caches()%0A%20%20%20%20gc.collect()%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path1%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp-low-rank_mean_odds.pkl%22%0A%20%20%20%20_out_path2%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp-low-rank-prob_canonical.pkl%22%0A%0A%20%20%20%20if%20_out_path1.exists()%3A%0A%20%20%20%20%20%20%20%20with%20open(_out_path1%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20_results_lowrank%20%3D%20pickle.load(_f)%0A%0A%20%20%20%20if%20_out_path2.exists()%3A%0A%20%20%20%20%20%20%20%20with%20open(_out_path2%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20_results_hierprob%20%3D%20pickle.load(_f)%0A%0A%20%20%20%20_N_SAMPLES%20%3D%201_000%0A%0A%20%20%20%20_simplex_lowrank%20%3D%20_results_lowrank.get_compositional_samples(%0A%20%20%20%20%20%20%20%20n_samples%3D_N_SAMPLES%2C%20store_samples%3DFalse%0A%20%20%20%20)%0A%20%20%20%20_simplex_hierprob%20%3D%20_results_hierprob.get_compositional_samples(%0A%20%20%20%20%20%20%20%20n_samples%3D_N_SAMPLES%2C%20store_samples%3DFalse%0A%20%20%20%20)%0A%0A%20%20%20%20_lower_quantile%20%3D%202.5%0A%20%20%20%20_upper_quantile%20%3D%2097.5%0A%0A%20%20%20%20_median_x%20%3D%20np.median(_simplex_lowrank%2C%20axis%3D0)%0A%20%20%20%20_median_y%20%3D%20np.median(_simplex_hierprob%2C%20axis%3D0)%0A%0A%20%20%20%20_low_x%20%3D%20np.percentile(_simplex_lowrank%2C%20_lower_quantile%2C%20axis%3D0)%0A%20%20%20%20_high_x%20%3D%20np.percentile(_simplex_lowrank%2C%20_upper_quantile%2C%20axis%3D0)%0A%20%20%20%20_low_y%20%3D%20np.percentile(_simplex_hierprob%2C%20_lower_quantile%2C%20axis%3D0)%0A%20%20%20%20_high_y%20%3D%20np.percentile(_simplex_hierprob%2C%20_upper_quantile%2C%20axis%3D0)%0A%0A%20%20%20%20_error_x%20%3D%20%5B%0A%20%20%20%20%20%20%20%20_median_x%20-%20_low_x%2C%0A%20%20%20%20%20%20%20%20_high_x%20-%20_median_x%2C%0A%20%20%20%20%5D%0A%20%20%20%20_error_y%20%3D%20%5B%0A%20%20%20%20%20%20%20%20_median_y%20-%20_low_y%2C%0A%20%20%20%20%20%20%20%20_high_y%20-%20_median_y%2C%0A%20%20%20%20%5D%0A%0A%20%20%20%20_fig%2C%20_ax%20%3D%20plt.subplots(1%2C%201%2C%20figsize%3D(4%2C%204))%0A%0A%20%20%20%20_min_val%20%3D%20min(_median_x.min()%2C%20_median_y.min())%0A%20%20%20%20_max_val%20%3D%20max(_median_x.max()%2C%20_median_y.max())%0A%20%20%20%20_ax.plot(%0A%20%20%20%20%20%20%20%20%5B_min_val%2C%20_max_val%5D%2C%0A%20%20%20%20%20%20%20%20%5B_min_val%2C%20_max_val%5D%2C%0A%20%20%20%20%20%20%20%20color%3D%22black%22%2C%0A%20%20%20%20%20%20%20%20linestyle%3D%22--%22%2C%0A%20%20%20%20)%0A%0A%20%20%20%20_ax.errorbar(%0A%20%20%20%20%20%20%20%20_median_x%2C%0A%20%20%20%20%20%20%20%20_median_y%2C%0A%20%20%20%20%20%20%20%20xerr%3D_error_x%2C%0A%20%20%20%20%20%20%20%20yerr%3D_error_y%2C%0A%20%20%20%20%20%20%20%20fmt%3D%22o%22%2C%0A%20%20%20%20%20%20%20%20ms%3D3%2C%0A%20%20%20%20%20%20%20%20elinewidth%3D1%2C%0A%20%20%20%20%20%20%20%20capsize%3D0%2C%0A%20%20%20%20%20%20%20%20alpha%3D0.7%2C%0A%20%20%20%20%20%20%20%20color%3D%22C0%22%2C%0A%20%20%20%20%20%20%20%20ecolor%3D%22gray%22%2C%0A%20%20%20%20)%0A%0A%20%20%20%20_ax.set_xlabel(%22composition%20low-rank%20model%5Cn(shared%20%24p%24)%22)%0A%20%20%20%20_ax.set_ylabel(%22composition%20low-rank%5Cn(gene-specific%20%24p_g%24)%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20The%20medians%20fall%20almost%20exactly%20on%20the%20identity%20line%3A%20the%20two%20models%20agree%20on%20what%20fraction%20each%20gene%20occupies.%20This%20is%20reassuring%E2%80%94the%20modeling%20decision%20(shared%20vs.%20gene-specific%20%24p%24)%20did%20not%20materially%20change%20the%20compositional%20point%20estimates%20that%20would%20feed%20into%20differential%20expression.%0A%0A%20%20%20%20The%20error%20bars%20tell%20a%20more%20interesting%20story.%20The%20gene-specific%20%24p_g%24%20model%20(vertical%20bars)%20has%20noticeably%20wider%20uncertainty%20than%20the%20shared-%24p%24%20model%20(horizontal%20bars)%2C%20especially%20for%20higher-expression%20genes.%20This%20makes%20sense%3A%0A%0A%20%20%20%20-%20**Shared%20%24p%24%3A**%20compositions%20depend%20only%20on%20%24%5Cunderline%7Br%7D%24%2C%20because%20the%20common%20%24p%24%20cancels%20in%20the%20normalization.%20Each%20%24r_g%24%20is%20well-constrained%20by%20thousands%20of%20cells%2C%20so%20compositional%20uncertainty%20is%20tight.%0A%20%20%20%20-%20**Gene-specific%20%24p_g%24%3A**%20compositions%20now%20depend%20on%20both%20%24r_g%24%20and%20%24p_g%24.%20The%20%24r_g%24%20are%20still%20well-constrained%2C%20but%20each%20%24p_g%24%20is%20informed%20by%20data%20from%20just%20one%20gene%20(plus%20the%20hierarchical%20prior%20pulling%20it%20toward%20the%20population%20mean).%20That%20per-gene%20%24p_g%24%20uncertainty%20propagates%20directly%20into%20the%20compositional%20fractions.%0A%0A%20%20%20%20Is%20the%20wider%20uncertainty%20a%20problem%3F%20Not%20necessarily%E2%80%94it%20may%20be%20the%20model%20being%20more%20honest%20about%20what%20the%20data%20can%20actually%20tell%20us.%20But%20for%20a%20monoculture%20like%20Jurkat%20cells%2C%20where%20there%20is%20no%20strong%20biological%20reason%20to%20expect%20gene-to-gene%20variation%20in%20%24p%24%2C%20the%20extra%20uncertainty%20is%20likely%20a%20statistical%20cost%20of%20estimating%20parameters%20the%20data%20did%20not%20need.%20Bayesian%20model%20comparison%20(e.g.%2C%20via%20marginal%20likelihood%20or%20leave-one-out%20cross-validation)%20can%20formalize%20this%20intuition%E2%80%94a%20topic%20we%20will%20revisit%20in%20a%20later%20tutorial.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20Beyond%20Gaussians%3A%20normalizing%20flow%20guides%0A%0A%20%20%20%20Every%20posterior%20approximation%20we%20have%20used%20so%20far%E2%80%94mean-field%2C%20low-rank%2C%20structured%20joint%E2%80%94works%20by%20optimizing%20a%20location-scale%20family%20in%20some%20(possibly%20transformed)%20space.%20In%20the%20unconstrained%20setting%20these%20are%20Gaussians%20passed%20through%20fixed%20bijections%20like%20%60exp%60%20or%20%60sigmoid%60%3B%20in%20the%20constrained%20setting%20they%20are%20Beta%20or%20LogNormal%20distributions.%20Either%20way%2C%20the%20shape%20of%20each%20marginal%20is%20predetermined%3A%20the%20optimizer%20can%20shift%20and%20stretch%20it%2C%20but%20it%20cannot%20bend%20it%20into%20something%20fundamentally%20different.%20If%20the%20true%20posterior%20has%20curved%20ridges%2C%20skewness%2C%20multimodality%2C%20or%20other%20geometry%20that%20these%20fixed%20families%20cannot%20represent%2C%20the%20approximation%20will%20always%20leave%20something%20on%20the%20table.%0A%0A%20%20%20%20Normalizing%20flows%20take%20a%20different%20approach%20entirely.%20A%20normalizing%20flow%20starts%20from%20a%20simple%20base%20distribution%20(a%20standard%20Normal)%20and%20pushes%20it%20through%20a%20chain%20of%20learned%20invertible%20transformations%E2%80%94small%20neural%20networks%20whose%20parameters%20are%20optimized%20alongside%20the%20ELBO.%20Each%20layer%20warps%20the%20density%2C%20and%20by%20stacking%20enough%20layers%20the%20flow%20can%20mold%20an%20initially%20spherical%20Gaussian%20into%20essentially%20any%20smooth%20distribution.%20If%20the%20true%20posterior%20has%20banana-shaped%20ridges%2C%20heavy%20tails%2C%20or%20asymmetric%20modes%2C%20the%20flow%20can%20learn%20to%20reproduce%20that%20geometry%20rather%20than%20forcing%20an%20elliptical%20approximation.%0A%0A%20%20%20%20%60scribe%60%20provides%20normalizing%20flow%20guides%20that%20work%20at%20the%20full%20gene-dimensional%20scale%20(tens%20of%20thousands%20of%20dimensions)%20through%20carefully%20stabilized%20coupling%20architectures.%20When%20combined%20with%20%60joint_params%60%2C%20the%20flow%20operates%20on%20the%20stacked%20parameter%20vector%E2%80%94here%2C%20the%20scalar%20%24p%24%20concatenated%20with%20all%20%24G%24%20gene-specific%20%24r_g%24%20values%E2%80%94so%20a%20single%20flow%20chain%20can%20learn%20nonlinear%20cross-parameter%20and%20cross-gene%20correlations%20simultaneously.%0A%0A%20%20%20%20The%20flexibility%20comes%20with%20its%20own%20tradeoffs.%20Each%20flow%20layer%20contains%20a%20conditioner%20neural%20network%20with%20trainable%20weights%2C%20so%20the%20parameter%20count%20is%20much%20larger%20than%20a%20low-rank%20guide.%20Training%20can%20be%20slower%20and%20more%20sensitive%20to%20hyperparameters%20(learning%20rate%2C%20number%20of%20layers%2C%20activation%20function).%20And%20because%20flows%20are%20universal%20approximators%20in%20principle%2C%20they%20can%20also%20be%20universal%20overfitters%20in%20practice%E2%80%94absorbing%20noise%20as%20fake%20structure%20if%20not%20regularized%20carefully.%0A%0A%20%20%20%20In%20code%2C%20the%20new%20arguments%20are%3A%0A%0A%20%20%20%20-%20%60guide_flow%3D%22affine_coupling%22%60%3A%20selects%20the%20flow%20architecture.%20Affine%20coupling%20layers%20split%20the%20input%20dimensions%20into%20two%20halves%3B%20one%20half%20stays%20fixed%20while%20a%20neural%20network%20predicts%20a%20shift%20and%20scale%20for%20the%20other%20half%2C%20then%20the%20roles%20swap.%20This%20is%20fast%20(both%20forward%20and%20inverse%20passes%20are%20parallel)%20and%20a%20good%20default.%0A%20%20%20%20-%20%60guide_flow_num_layers%3D8%60%3A%20how%20many%20coupling%20layers%20to%20stack.%20More%20layers%20mean%20more%20expressive%20warping%20but%20also%20more%20parameters%20and%20longer%20training.%20Eight%20is%20a%20reasonable%20starting%20point.%0A%20%20%20%20-%20%60guide_flow_activation%3D%22gelu%22%60%3A%20the%20activation%20function%20inside%20the%20conditioner%20networks.%20GELU%20(Gaussian%20Error%20Linear%20Unit)%20is%20a%20smooth%20alternative%20to%20ReLU%20that%20often%20trains%20more%20stably%20in%20flow%20architectures.%0A%20%20%20%20-%20%60joint_params%3D%22biological%22%60%3A%20as%20before%2C%20this%20tells%20%60scribe%60%20to%20model%20%24p%24%20and%20%24%5Cunderline%7Br%7D%24%20jointly.%20With%20a%20flow%20guide%2C%20both%20are%20concatenated%20into%20a%20single%20vector%20and%20processed%20by%20the%20same%20flow%20chain%2C%20so%20the%20flow%20can%20learn%20arbitrary%20(including%20nonlinear)%20coupling.%0A%0A%20%20%20%20We%20return%20to%20the%20canonical%20parameterization%20with%20a%20scalar%20%24p%24%20(no%20%60prob_prior%3D%22gaussian%22%60).%20The%20flow%20is%20expressive%20enough%20in%20principle%20to%20represent%20whatever%20coupling%20exists%20between%20%24p%24%20and%20%24%5Cunderline%7Br%7D%24%20without%20needing%20the%20mean%E2%80%93odds%20reparameterization%20or%20gene-specific%20%24p_g%24%20as%20structural%20crutches.%20Whether%20it%20actually%20converges%20to%20a%20better%20solution%20than%20the%20low-rank%20Gaussian%20is%20an%20empirical%20question.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20clear_caches%2C%20gc%2C%20out_dir%2C%20pickle%2C%20scribe)%3A%0A%20%20%20%20%23%20Clear%20GPU%20memory%20from%20previous%20fits%0A%20%20%20%20clear_caches()%0A%20%20%20%20gc.collect()%0A%0A%20%20%20%20%23%20Define%20parameterization%0A%20%20%20%20_parameterization%20%3D%20%22canonical%22%0A%0A%20%20%20%20%23%20Define%20output%20file%20path%0A%20%20%20%20_out_path%20%3D%20out_dir%20%2F%20f%22scribe_results_nbvcp-flow_%7B_parameterization%7D.pkl%22%0A%0A%20%20%20%20if%20_out_path.exists()%3A%0A%20%20%20%20%20%20%20%20%23%20Load%20model%20from%20pkl%20file%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22rb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20results_flow%20%3D%20pickle.load(_f)%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20%23%20Fit%20basic%20model%20to%20data%20with%20variable%20capture%20probability%20fit%20per%20cell%0A%20%20%20%20%20%20%20%20results_flow%20%3D%20scribe.fit(%0A%20%20%20%20%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20variable_capture%3DTrue%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20parameterization%3D_parameterization%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20joint_params%3D%22biological%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20guide_flow%3D%22affine_coupling%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20guide_flow_activation%3D%22gelu%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20guide_flow_num_layers%3D8%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20n_steps%3D200_000%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20early_stopping%3D%7B%22enabled%22%3A%20True%2C%20%22patience%22%3A%201000%7D%0A%20%20%20%20%20%20%20%20)%0A%20%20%20%20%20%20%20%20%23%20Save%20the%20fitted%20model%0A%20%20%20%20%20%20%20%20with%20open(_out_path%2C%20%22wb%22)%20as%20_f%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20pickle.dump(results_flow%2C%20_f)%0A%0A%20%20%20%20results_flow%0A%20%20%20%20return%20(results_flow%2C)%0A%0A%0A%40app.cell%0Adef%20_(results_flow%2C%20scribe)%3A%0A%20%20%20%20%23%20Plot%20ELBO%20loss%20for%20normalizing%20flow%20fit%0A%20%20%20%20scribe.viz.plot_loss(results_flow%2C%20figsize%3D(6%2C%203))%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Let%20us%20examine%20the%20corner%20plot%20for%20the%20same%20illustrative%20genes.%20Because%20the%20flow%20can%20learn%20nonlinear%20warping%20of%20the%20posterior%2C%20we%20might%20see%20shapes%20in%20the%202D%20panels%20that%20the%20low-rank%20Gaussian%20guide%20could%20not%20represent%3A%20asymmetric%20tails%2C%20curved%20ridges%20tracked%20more%20faithfully%2C%20or%20tighter%20concentration%20along%20the%20NB%20mean-dispersion%20constraint.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(corner%2C%20gene_index%2C%20results_flow)%3A%0A%20%20%20%20_results_subset%20%3D%20results_flow%5Bgene_index%5D%0A%0A%20%20%20%20%23%20Export%20directly%20as%20a%20corner-ready%20matrix%20plus%20labels%2Fmetadata.%0A%20%20%20%20_samples_2d%2C%20_labels%2C%20_%20%3D%20_results_subset.get_posterior_matrix(%0A%20%20%20%20%20%20%20%20n_samples%3D5_000%2C%0A%20%20%20%20%20%20%20%20include%3D%5B%22p%22%2C%20%22r%22%5D%2C%0A%20%20%20%20%20%20%20%20exclude_deterministic%3DTrue%2C%0A%20%20%20%20%20%20%20%20store_samples%3DFalse%2C%0A%20%20%20%20%20%20%20%20convert_to_numpy%3DTrue%2C%0A%20%20%20%20)%0A%0A%20%20%20%20corner.corner(_samples_2d%2C%20labels%3D_labels)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20A%20striking%20feature%20of%20the%20flow's%20corner%20plot%20is%20how%20much%20narrower%20the%20posteriors%20are%20compared%20to%20the%20low-rank%20Gaussian%20guide.%20The%20shared%20%24p%24%20is%20pinned%20to%20a%20sliver%20roughly%20a%20thousand%20times%20tighter%20than%20the%20low-rank%20result%2C%20and%20the%20%24r_g%24%20marginals%20have%20shrunk%20accordingly.%20Is%20the%20flow%20simply%20more%20precise%2C%20or%20is%20this%20a%20symptom%20of%20mode%20collapse%3F%0A%0A%20%20%20%20Both%20explanations%20are%20plausible%2C%20and%20good%20PPCs%20alone%20cannot%20distinguish%20them.%20The%20ELBO%20objective%2C%20%24%5Ctext%7BKL%7D(q%20%5C%7C%20p)%24%2C%20is%20mode-seeking%3A%20it%20penalizes%20the%20approximation%20for%20placing%20mass%20where%20the%20true%20posterior%20is%20low%2C%20but%20it%20does%20not%20penalize%20failing%20to%20cover%20regions%20where%20the%20true%20posterior%20is%20high.%20A%20flow%20with%20many%20trainable%20parameters%20can%20therefore%20%22snap%22%20to%20one%20narrow%20region%20of%20parameter%20space%20that%20scores%20well%20without%20being%20penalized%20for%20ignoring%20nearby%20mass.%20Because%20PPCs%20test%20the%20predictive%20distribution%E2%80%94which%20is%20dominated%20by%20the%20posterior%20mean%2C%20not%20the%20tails%E2%80%94overconfident%20posteriors%20can%20produce%20perfectly%20adequate%20count-level%20predictions.%0A%0A%20%20%20%20Two%20diagnostics%20can%20help%20resolve%20the%20ambiguity.%20First%2C%20compare%20the%20final%20ELBO%20values%3A%20if%20the%20flow%20achieves%20a%20materially%20better%20(less%20negative)%20ELBO%20than%20the%20low-rank%20guide%2C%20the%20tighter%20posterior%20is%20more%20likely%20a%20faithful%20approximation%3B%20if%20the%20ELBOs%20are%20comparable%2C%20the%20flow%20may%20have%20traded%20coverage%20for%20sharpness%20without%20gaining%20evidence.%20Second%2C%20run%20MCMC%20(which%20%60scribe%60%20supports)%20on%20a%20subset%20of%20genes%20and%20compare%20the%20marginal%20widths%E2%80%94that%20is%20the%20gold-standard%20check.%20Notice%20also%20that%20%24p%24%20settled%20at%20a%20completely%20different%20value%20here%20than%20in%20the%20low-rank%20%60mean_odds%60%20fit%3B%20both%20give%20good%20PPCs%20because%20the%20capture%20probability%20can%20compensate.%20Such%20flat%20directions%20in%20the%20likelihood%20are%20exactly%20where%20mode-seeking%20variational%20families%20are%20most%20prone%20to%20overconfidence.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20Now%20the%20PPCs%20for%20the%20normalizing%20flow%20fit.%20Because%20the%20flow%20operates%20in%20canonical%20%24(r%2C%20p)%24%20coordinates%20with%20joint%20structure%2C%20it%20should%20be%20able%20to%20capture%20the%20ridge%20geometry%20that%20gave%20mean-field%20canonical%20fits%20trouble%20earlier.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_flow%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_ppc(%0A%20%20%20%20%20%20%20%20results_flow%2C%0A%20%20%20%20%20%20%20%20adata%2C%0A%20%20%20%20%20%20%20%20n_genes%3D16%2C%0A%20%20%20%20%20%20%20%20n_rows%3D4%2C%0A%20%20%20%20%20%20%20%20figsize%3D(8%2C%208)%2C%0A%20%20%20%20%20%20%20%20n_samples%3D512%2C%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20_(adata%2C%20results_flow%2C%20scribe)%3A%0A%20%20%20%20scribe.viz.plot_mean_calibration(%0A%20%20%20%20%20%20%20%20results_flow%2C%20counts%3Dadata%2C%20figsize%3D(4%2C%204)%0A%20%20%20%20)%0A%20%20%20%20return%0A%0A%0A%40app.cell(hide_code%3DTrue)%0Adef%20_(mo)%3A%0A%20%20%20%20mo.md(r%22%22%22%0A%20%20%20%20%23%23%20Conclusion%0A%0A%20%20%20%20This%20tutorial%20walked%20through%20a%20progression%20of%20modeling%20and%20inference%20choices%20that%20%60scribe%60%20makes%20available%20for%20single-cell%20RNA-seq%20data%2C%20all%20applied%20to%20the%20same%20Jurkat%20monoculture%20dataset.%0A%0A%20%20%20%20%23%23%23%20What%20we%20explored%0A%0A%20%20%20%201.%20**The%20negative%20binomial%20as%20a%20biophysically%20motivated%20count%20model.**%20Rather%20than%20adopting%20the%20NB%20for%20vague%20%22overdispersion%22%20reasons%2C%20%60scribe%60%20starts%20from%20the%20two-state%20promoter%20model%20where%20the%20NB%20arises%20as%20a%20steady-state%20distribution.%20We%20fitted%20the%20simplest%20version%20of%20this%20model%20(shared%20%24p%24%2C%20no%20capture%20adjustment)%20and%20used%20posterior%20predictive%20checks%20to%20see%20where%20it%20fell%20short.%0A%0A%20%20%20%202.%20**Variable%20capture%20probability.**%20By%20giving%20each%20cell%20its%20own%20capture%20parameter%20%24%5Cnu_c%24%2C%20we%20moved%20depth%20correction%20from%20an%20ad%20hoc%20preprocessing%20step%20into%20the%20generative%20model%20itself.%20PPCs%20improved%20substantially%2C%20and%20the%20capture-scaling%20plot%20provided%20a%20sanity%20check%20on%20how%20depth%20is%20being%20allocated%20across%20cells.%0A%0A%20%20%20%203.%20**Reparameterization%20matters.**%20Switching%20from%20canonical%20%24(r%2C%20p)%24%20to%20mean-odds%20%24(%5Cmu%2C%20%5Cphi)%24%20coordinates%20did%20not%20change%20the%20generative%20model%2C%20but%20it%20changed%20the%20geometry%20the%20optimizer%20and%20variational%20guide%20navigate.%20The%20result%20was%20tighter%2C%20more%20honest%20posterior%20uncertainty%20and%20better%20PPCs%E2%80%94a%20reminder%20that%20inference%20is%20not%20just%20about%20the%20model%20but%20also%20about%20how%20you%20parameterize%20it.%0A%0A%20%20%20%204.%20**Joint%20low-rank%20guides.**%20By%20upgrading%20the%20variational%20family%20from%20mean-field%20to%20a%20low-rank%20covariance%20structure%2C%20we%20let%20the%20approximation%20represent%20cross-gene%20and%20cross-parameter%20correlations.%20The%20marginal%20PPCs%20stayed%20good%2C%20but%20the%20corner%20plots%20revealed%20that%20the%20mean-field%20had%20been%20inflating%20marginal%20uncertainty%20to%20compensate%20for%20missing%20coupling.%20The%20low-rank%20guide%20recovered%20the%20true%20(narrow)%20uncertainty%20on%20shared%20parameters%20like%20%24p%24%20and%20opened%20the%20door%20to%20gene-gene%20correlation%20heatmaps.%0A%0A%20%20%20%205.%20**Hierarchical%20gene-specific%20%24p_g%24.**%20Relaxing%20the%20shared-%24p%24%20assumption%20gave%20each%20gene%20its%20own%20success%20probability%2C%20regularized%20by%20a%20population-level%20prior.%20PPCs%20remained%20excellent%2C%20and%20the%20compositional%20point%20estimates%20barely%20changed%E2%80%94but%20the%20compositional%20uncertainty%20widened%2C%20reflecting%20the%20extra%20degrees%20of%20freedom.%20Whether%20that%20extra%20flexibility%20is%20worth%20the%20statistical%20cost%20is%20a%20question%20Bayesian%20model%20comparison%20can%20answer.%0A%0A%20%20%20%206.%20**Normalizing%20flow%20guides.**%20As%20the%20most%20flexible%20variational%20family%20in%20this%20tutorial%2C%20flows%20can%20learn%20nonlinear%20posterior%20geometry%20that%20Gaussian%20families%20cannot%20represent.%20We%20demonstrated%20that%20%60scribe%60%20can%20train%20flow-based%20guides%20at%20genome%20scale%20in%20canonical%20coordinates%2C%20providing%20another%20tool%20for%20cases%20where%20simpler%20approximations%20leave%20structure%20on%20the%20table.%0A%0A%20%20%20%20%23%23%23%20What%20we%20did%20not%20explore%0A%0A%20%20%20%20Several%20important%20capabilities%20of%20%60scribe%60%20were%20deliberately%20left%20for%20other%20tutorials%3A%0A%0A%20%20%20%20-%20**Differential%20expression%20analysis.**%20%60scribe%60's%20Bayesian%20DE%20framework%E2%80%94including%20centered%20and%20isometric%20log-ratio%20transformations%2C%20the%20local%20false%20sign%20rate%20(lfsr)%2C%20and%20the%20posterior%20expected%20false%20discovery%20proportion%20(PEFP)%E2%80%94is%20the%20subject%20of%20a%20dedicated%20tutorial%20using%20a%20mixed-population%20dataset.%0A%20%20%20%20-%20**Unsupervised%20cell-type%20discovery.**%20Mixture%20models%20and%20their%20interaction%20with%20the%20compositional%20framework%20are%20covered%20separately.%0A%20%20%20%20-%20**Biology-informed%20capture%20priors.**%20When%20external%20information%20about%20total%20mRNA%20content%20is%20available%2C%20%60scribe%60%20can%20use%20it%20to%20anchor%20the%20capture%20probability%20more%20tightly%E2%80%94resolving%20the%20identifiability%20issue%20we%20noted%20in%20the%20capture-scaling%20plots.%0A%20%20%20%20-%20**MCMC%20sampling.**%20All%20fits%20in%20this%20tutorial%20used%20stochastic%20variational%20inference%20(SVI).%20%60scribe%60%20also%20supports%20Hamiltonian%20Monte%20Carlo%20for%20gold-standard%20posterior%20exploration%2C%20at%20higher%20computational%20cost.%0A%20%20%20%20-%20**Formal%20Bayesian%20model%20comparison.**%20We%20compared%20models%20qualitatively%20(PPCs%2C%20calibration%2C%20compositional%20agreement)%3B%20quantitative%20comparison%20via%20marginal%20likelihoods%20or%20predictive%20metrics%20is%20a%20natural%20follow-up.%0A%0A%20%20%20%20%23%23%23%20The%20broader%20message%0A%0A%20%20%20%20The%20progression%20in%20this%20tutorial%E2%80%94from%20a%20bare%20count%20model%20to%20structured%20variational%20families%20with%20tens%20of%20thousands%20of%20parameters%E2%80%94illustrates%20a%20core%20design%20principle%20of%20%60scribe%60%3A%20modeling%20assumptions%20should%20be%20*explicit%2C%20composable%2C%20and%20testable*.%20Each%20ingredient%20(capture%2C%20parameterization%2C%20guide%20structure%2C%20hierarchical%20priors)%20can%20be%20turned%20on%20or%20off%20independently%2C%20and%20posterior%20predictive%20checks%20provide%20a%20consistent%20diagnostic%20language%20for%20asking%20whether%20each%20addition%20actually%20helps.%20The%20goal%20is%20not%20to%20find%20the%20one%20%22right%22%20model%2C%20but%20to%20let%20the%20analyst%20build%20up%20complexity%20incrementally%2C%20checking%20at%20each%20step%20whether%20the%20data%20justify%20the%20added%20flexibility.%0A%20%20%20%20%22%22%22)%0A%20%20%20%20return%0A%0A%0Aif%20__name__%20%3D%3D%20%22__main__%22%3A%0A%20%20%20%20app.run()%0A
831e7f8a2fabae9de02adb5efefe12bf