Demographic Inference with moments.LD ===================================== Packaged script: ``examples/moments_integration_demo.py`` Run it from the repo root (note the ``-e moments`` env flag -- ``moments`` lives in a separate pixi environment to keep the default install lightweight): .. code-block:: bash pixi install -e moments pixi run -e moments python examples/moments_integration_demo.py What it shows ------------- `moments.LD `_ (`Ragsdale & Gravel 2019 `_) infers demographic histories by matching observed two-locus LD statistics to expectations under a parameterized model. The bottleneck in practice is *parsing* the LD statistics from genotype data: for :math:`V` variants and :math:`P` populations the work is :math:`O(V^2 \binom{P+2}{2})`, which on whole-genome data is hours-to-days on a CPU. Inference itself, by contrast, takes seconds. ``pg_gpu.moments_ld.compute_ld_statistics`` is a drop-in replacement for ``moments.LD.Parsing.compute_ld_statistics``. Same arguments, same output dictionary structure -- only the parsing step runs on the GPU. Everything downstream (bootstrap variance-covariance, Demes / parametric inference, Godambe confidence intervals) uses ``moments`` unchanged. The validation tests run pg_gpu's parser against ``moments``'s on the same VCFs and confirm machine precision agreement (max relative error :math:`< 10^{-11}`). The packaged script is an end-to-end demonstration: it simulates 200 replicate 1 Mb regions under a three-population model with recent admixture using ``msprime``, parses each replicate with pg_gpu, fits the demographic model with ``moments.LD`` via the Demes inference engine, computes Godambe standard errors, and plots the fitted vs observed LD decay curves. Total wall time for the parsing step on an A100 is under five minutes -- on the same hardware, the CPU ``moments`` parser would take roughly a day. Recipe ------ The core change between a ``moments.LD`` workflow and a ``pg_gpu``-accelerated workflow is: .. code-block:: python from pg_gpu.moments_ld import compute_ld_statistics import moments.LD r_bins = [0, 1e-6, 2e-6, 5e-6, 1e-5, 2e-5, 5e-5, 1e-4] # Step 1: GPU-accelerated LD parsing (the only part that changes # vs. a pure-moments workflow). ``pop_file`` is the same kwarg # name moments.LD.Parsing.compute_ld_statistics uses; pg_gpu # also accepts ``pop_assignment=`` as an alias if you are coming # in from ``HaplotypeMatrix.from_zarr`` and prefer that spelling. ld_stats = {} for i, vcf in enumerate(vcf_files): ld_stats[i] = compute_ld_statistics( vcf, rec_map_file="genetic_map.txt", pop_file="pops.txt", pops=["pop0", "pop1"], r_bins=r_bins, ) # Step 2: Bootstrap means and variance-covariance (moments, unchanged) mv = moments.LD.Parsing.bootstrap_data(ld_stats) # Step 3: Demographic inference (moments, unchanged) demo_func = moments.LD.Demographics2D.split_mig p_guess = [0.1, 2, 0.075, 2, 10000] opt_params, LL = moments.LD.Inference.optimize_log_lbfgsb( p_guess, [mv["means"], mv["varcovs"]], [demo_func], rs=r_bins, ) # Step 4: Convert to physical units physical = moments.LD.Util.rescale_params( opt_params, ["nu", "nu", "T", "m", "Ne"] ) What it computes ---------------- For each pair of variants binned by recombination distance, ``compute_ld_statistics`` returns 15 two-locus LD statistics: - **DD** (3 stats): :math:`D^2` within and between populations. - **Dz** (6 stats): two-locus LD correlations involving three populations. - **pi2** (6 stats): two-locus joint statistics involving four populations. Plus 3 single-locus heterozygosity statistics (H_0_0, H_0_1, H_1_1). These are exactly the statistics that ``moments.LD`` ingests. Performance ----------- Parsing time across 3 replicate 1 Mb regions (10 diploid individuals per population): .. list-table:: :header-rows: 1 * - Populations - moments - pg_gpu - Speedup * - 2 - 190s - 0.9s - **214x** * - 3 - 638s - 2.9s - **219x** * - 4 - 1630s - 7.3s - **224x** The speedup compounds with the number of populations and the number of replicate regions, which is exactly the regime where demographic-LD parsing has historically been impractical on a single machine. Why it's useful as a template ----------------------------- If you already have a ``moments.LD`` workflow, swapping in pg_gpu parsing is a one-line change at the top of your script. If you don't, the packaged demo is the easiest way to see the whole pipeline end-to-end -- simulate, parse, infer, plot -- in one self-contained file (~250 lines) that you can fork and adapt to your own demographic model.