Demographic Inference with moments.LD

Packaged script: examples/moments_integration_demo.py

Run it from the repo root (note the -e moments env flag – moments lives in a separate pixi environment to keep the default install lightweight):

pixi install -e moments
pixi run -e moments python examples/moments_integration_demo.py

What it shows

moments.LD (Ragsdale & Gravel 2019) infers demographic histories by matching observed two-locus LD statistics to expectations under a parameterized model. The bottleneck in practice is parsing the LD statistics from genotype data: for \(V\) variants and \(P\) populations the work is \(O(V^2 \binom{P+2}{2})\), which on whole-genome data is hours-to-days on a CPU. Inference itself, by contrast, takes seconds.

pg_gpu.moments_ld.compute_ld_statistics is a drop-in replacement for moments.LD.Parsing.compute_ld_statistics. Same arguments, same output dictionary structure – only the parsing step runs on the GPU. Everything downstream (bootstrap variance-covariance, Demes / parametric inference, Godambe confidence intervals) uses moments unchanged. The validation tests run pg_gpu’s parser against moments’s on the same VCFs and confirm machine precision agreement (max relative error \(< 10^{-11}\)).

The packaged script is an end-to-end demonstration: it simulates 200 replicate 1 Mb regions under a three-population model with recent admixture using msprime, parses each replicate with pg_gpu, fits the demographic model with moments.LD via the Demes inference engine, computes Godambe standard errors, and plots the fitted vs observed LD decay curves. Total wall time for the parsing step on an A100 is under five minutes – on the same hardware, the CPU moments parser would take roughly a day.

Recipe

The core change between a moments.LD workflow and a pg_gpu-accelerated workflow is:

from pg_gpu.moments_ld import compute_ld_statistics
import moments.LD

r_bins = [0, 1e-6, 2e-6, 5e-6, 1e-5, 2e-5, 5e-5, 1e-4]

# Step 1: GPU-accelerated LD parsing (the only part that changes
# vs. a pure-moments workflow). ``pop_file`` is the same kwarg
# name moments.LD.Parsing.compute_ld_statistics uses; pg_gpu
# also accepts ``pop_assignment=`` as an alias if you are coming
# in from ``HaplotypeMatrix.from_zarr`` and prefer that spelling.
ld_stats = {}
for i, vcf in enumerate(vcf_files):
    ld_stats[i] = compute_ld_statistics(
        vcf,
        rec_map_file="genetic_map.txt",
        pop_file="pops.txt",
        pops=["pop0", "pop1"],
        r_bins=r_bins,
    )

# Step 2: Bootstrap means and variance-covariance (moments, unchanged)
mv = moments.LD.Parsing.bootstrap_data(ld_stats)

# Step 3: Demographic inference (moments, unchanged)
demo_func = moments.LD.Demographics2D.split_mig
p_guess = [0.1, 2, 0.075, 2, 10000]

opt_params, LL = moments.LD.Inference.optimize_log_lbfgsb(
    p_guess, [mv["means"], mv["varcovs"]], [demo_func], rs=r_bins,
)

# Step 4: Convert to physical units
physical = moments.LD.Util.rescale_params(
    opt_params, ["nu", "nu", "T", "m", "Ne"]
)

What it computes

For each pair of variants binned by recombination distance, compute_ld_statistics returns 15 two-locus LD statistics:

  • DD (3 stats): \(D^2\) within and between populations.

  • Dz (6 stats): two-locus LD correlations involving three populations.

  • pi2 (6 stats): two-locus joint statistics involving four populations.

Plus 3 single-locus heterozygosity statistics (H_0_0, H_0_1, H_1_1). These are exactly the statistics that moments.LD ingests.

Performance

Parsing time across 3 replicate 1 Mb regions (10 diploid individuals per population):

Populations

moments

pg_gpu

Speedup

2

190s

0.9s

214x

3

638s

2.9s

219x

4

1630s

7.3s

224x

The speedup compounds with the number of populations and the number of replicate regions, which is exactly the regime where demographic-LD parsing has historically been impractical on a single machine.

Why it’s useful as a template

If you already have a moments.LD workflow, swapping in pg_gpu parsing is a one-line change at the top of your script. If you don’t, the packaged demo is the easiest way to see the whole pipeline end-to-end – simulate, parse, infer, plot – in one self-contained file (~250 lines) that you can fork and adapt to your own demographic model.