Features
========

pg_gpu provides GPU-accelerated computation of population genetics statistics
using CuPy. All statistics return NumPy arrays and handle missing data
automatically. Below is a comprehensive catalog of every implemented statistic.

Diversity Statistics
--------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``pi``
     - Nucleotide diversity
     - Nei & Li (1979)
   * - ``theta_w``
     - Watterson's theta
     - Watterson (1975)
   * - ``tajimas_d``
     - Tajima's D neutrality test
     - Tajima (1989)
   * - ``fay_wus_h``
     - Fay & Wu's H (excess high-frequency derived alleles)
     - Fay & Wu (2000)
   * - ``normalized_fay_wus_h``
     - Normalized H (H*)
     - Zeng et al. (2006)
   * - ``theta_h``
     - Fay & Wu's theta_H
     - Fay & Wu (2000)
   * - ``theta_l``
     - Theta_L
     - Zeng et al. (2006)
   * - ``zeng_e``
     - Zeng's E neutrality test
     - Zeng et al. (2006)
   * - ``zeng_dh``
     - Zeng's DH joint test
     - Zeng et al. (2006)
   * - ``segregating_sites``
     - Count of segregating sites
     -
   * - ``singleton_count``
     - Count of singletons
     -
   * - ``haplotype_diversity``
     - Haplotype diversity (1 - sum of squared frequencies)
     -
   * - ``haplotype_count``
     - Number of distinct haplotypes
     -
   * - ``heterozygosity_expected``
     - Expected heterozygosity (gene diversity) per variant
     -
   * - ``heterozygosity_observed``
     - Observed heterozygosity per variant
     -
   * - ``inbreeding_coefficient``
     - Wright's F per variant
     - Wright (1951)
   * - ``allele_frequency_spectrum``
     - Allele frequency spectrum
     -
   * - ``max_daf``
     - Maximum derived allele frequency
     -
   * - ``daf_histogram``
     - Derived allele frequency histogram
     -
   * - ``diplotype_frequency_spectrum``
     - Diplotype (multi-locus genotype) frequency spectrum
     -
   * - ``diversity_stats``
     - All core diversity statistics in one call
     -

Divergence Statistics
---------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``fst_hudson``
     - Hudson's FST
     - Hudson et al. (1992)
   * - ``fst_weir_cockerham``
     - Weir & Cockerham's FST (method of moments)
     - Weir & Cockerham (1984)
   * - ``fst_nei``
     - Nei's GST
     - Nei (1973)
   * - ``dxy``
     - Absolute divergence (mean pairwise differences between pops)
     - Nei (1987)
   * - ``da``
     - Net divergence (Dxy minus mean within-pop pi)
     - Nei & Li (1979)
   * - ``pbs``
     - Population Branch Statistic
     - Yi et al. (2010)
   * - ``pairwise_fst``
     - Pairwise FST matrix for multiple populations
     -

Distance-Based Two-Population Statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``snn``
     - Nearest-neighbor statistic
     - Hudson (2000)
   * - ``dxy_min``
     - Minimum pairwise distance between populations
     - Geneva et al. (2015)
   * - ``gmin``
     - Gmin ratio (Dxy_min / Dxy_mean)
     - Geneva et al. (2015)
   * - ``dd``
     - Relative minimum divergence (dd1, dd2)
     - Schrider et al. (2018)
   * - ``dd_rank``
     - Rank of minimum between-pop distance in within-pop distribution
     - Schrider et al. (2018)
   * - ``zx``
     - ZnS ratio (within-pop LD / total LD)
     - Schrider et al. (2018)

Linkage Disequilibrium
----------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``r``
     - Pearson correlation between variant pairs
     -
   * - ``r_squared``
     - Squared correlation (r-squared) between variant pairs
     -
   * - ``dd`` (LD)
     - D-squared (two-locus LD statistic)
     - Ragsdale & Gravel (2019)
   * - ``dz``
     - Dz statistic (multi-population LD)
     - Ragsdale & Gravel (2019)
   * - ``pi2``
     - Two-locus nucleotide diversity
     - Ragsdale & Gravel (2019)
   * - ``zns``
     - Kelly's ZnS (mean pairwise LD); defaults to the unbiased
       :math:`\sigma_D^2` estimator on ``HaplotypeMatrix`` inputs,
       falls back to naive :math:`r^2` for pre-computed arrays.
     - Kelly (1997); Ragsdale & Gravel (2019)
   * - ``omega``
     - Kim & Nielsen's Omega (partitioned LD); same default policy
       as ``zns``.
     - Kim & Nielsen (2004); Ragsdale & Gravel (2019)
   * - ``mu_ld``
     - Haplotype pattern exclusivity (RAiSD LD component)
     - Alachiotis & Pavlidis (2018)

Selection Scans
---------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``ihs``
     - Integrated Haplotype Score
     - Voight et al. (2006)
   * - ``nsl``
     - Number of Segregating Sites by Length
     - Ferrer-Admetlla et al. (2014)
   * - ``xpehh``
     - Cross-population Extended Haplotype Homozygosity
     - Sabeti et al. (2007)
   * - ``xpnsl``
     - Cross-population nSL
     - Szpiech et al. (2021)
   * - ``garud_h``
     - Garud's H1, H12, H123, H2/H1
     - Garud et al. (2015)
   * - ``moving_garud_h``
     - Garud's H in moving windows
     - Garud et al. (2015)
   * - ``ehh_decay``
     - Extended Haplotype Homozygosity decay
     - Sabeti et al. (2002)

Site Frequency Spectrum
-----------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``sfs``
     - Unfolded SFS
     -
   * - ``sfs_folded``
     - Folded SFS (minor allele counts)
     -
   * - ``sfs_scaled``
     - Scaled unfolded SFS
     -
   * - ``sfs_folded_scaled``
     - Scaled folded SFS
     -
   * - ``joint_sfs``
     - Joint SFS (two populations)
     -
   * - ``joint_sfs_folded``
     - Folded joint SFS
     -
   * - ``joint_sfs_scaled``
     - Scaled joint SFS
     -
   * - ``joint_sfs_folded_scaled``
     - Scaled folded joint SFS
     -
   * - ``fold_sfs``
     - Fold an unfolded SFS
     -
   * - ``fold_joint_sfs``
     - Fold a joint SFS
     -

Admixture and F-Statistics
--------------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``patterson_f2``
     - F2 branch length between two populations
     - Patterson et al. (2012)
   * - ``patterson_f3``
     - F3 admixture test
     - Patterson et al. (2012)
   * - ``patterson_d``
     - Patterson's D (ABBA-BABA)
     - Patterson et al. (2012)
   * - ``moving_patterson_f3``
     - Windowed F3
     - Patterson et al. (2012)
   * - ``moving_patterson_d``
     - Windowed D
     - Patterson et al. (2012)
   * - ``average_patterson_f3``
     - F3 with block-jackknife standard error
     - Patterson et al. (2012)
   * - ``average_patterson_d``
     - D with block-jackknife standard error
     - Patterson et al. (2012)

Resampling (Block Jackknife and Bootstrap)
-------------------------------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``block_jackknife``
     - Delete-1 block jackknife standard error; supports unequal block sizes
     - Busing et al. (1999)
   * - ``block_bootstrap``
     - Block bootstrap standard error and replicate distribution
     - Efron & Tibshirani (1993)

Both operate on pre-binned per-block values and a user-supplied statistic,
so any scalar aggregate (genome-wide mean Tajima's D, per-population :math:`\pi`,
ratio-of-sums estimators like normed F3 / D) can get a calibrated
standard error / CI with a single call.

FrequencySpectrum (Power-User SFS Interface)
---------------------------------------------

The ``FrequencySpectrum`` class provides direct access to SFS-based estimation
for custom weight functions, SFS projection, and the general Achaz (2009)
variance framework.

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Method
     - Description
     - Reference
   * - ``FrequencySpectrum.theta``
     - Any theta estimator as weighted SFS dot product
     - Achaz (2009)
   * - ``FrequencySpectrum.neutrality_test``
     - Generalized neutrality test from any two theta estimators
     - Achaz (2009)
   * - ``FrequencySpectrum.project``
     - SFS projection via hypergeometric sampling
     - Gutenkunst et al. (2009)

Built-in estimators: ``pi``, ``watterson``, ``theta_h``, ``theta_l``,
``eta1``, ``eta1_star``, ``minus_eta1``, ``minus_eta1_star``. Custom weight
functions are also supported.

Dimensionality Reduction and Distance
--------------------------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``pca``
     - Principal Component Analysis (GPU-accelerated SVD)
     - Patterson et al. (2006)
   * - ``randomized_pca``
     - Randomized PCA (truncated SVD approximation)
     - Halko et al. (2011)
   * - ``pairwise_distance``
     - Pairwise genetic distance (Euclidean, cityblock, etc.)
     -
   * - ``pcoa``
     - Principal Coordinate Analysis (classical MDS)
     -
   * - ``local_pca``
     - Per-window PCA (lostruct); GPU-batched ``eigh`` over stacked per-window Gram matrices
     - Li & Ralph (2019)
   * - ``local_pca_jackknife``
     - Delete-1 block jackknife standard error of local PCs (batched)
     - Li & Ralph (2019)
   * - ``pc_dist``
     - Frobenius distance between per-window low-rank covariance reps
     - Li & Ralph (2019)
   * - ``corners``
     - Extreme-cluster selection in a 2D MDS embedding (Welzl MEC)
     - Li & Ralph (2019)

Relatedness and Kinship
-----------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``grm``
     - Genetic Relationship Matrix
     - Yang et al. (2011)
   * - ``ibs``
     - Pairwise Identity by State proportions
     -

Distance Distribution Statistics
---------------------------------

.. list-table::
   :header-rows: 1
   :widths: 25 45 30

   * - Function
     - Description
     - Reference
   * - ``pairwise_diffs``
     - Pairwise Hamming distances (haploid or diploid)
     -
   * - ``dist_var``
     - Variance of pairwise distance distribution
     - Schrider et al. (2018)
   * - ``dist_skew``
     - Skewness of pairwise distance distribution
     - Schrider et al. (2018)
   * - ``dist_kurt``
     - Excess kurtosis of pairwise distance distribution
     - Schrider et al. (2018)
   * - ``dist_moments``
     - Variance, skewness, and kurtosis in one call
     - Schrider et al. (2018)

Biobank-Scale Streaming
-----------------------

A *VCZ store* is a Zarr encoding of a VCF stored on disk: the genotype
matrix is split into compressed chunks, each chunk a small array of
samples by variants. ``pg_gpu`` reads VCZ stores; if your data
is in VCF you can convert it with the bio2zarr tools
(``vcf2zarr explode`` then ``vcf2zarr encode``). The streaming
codepath needs that VCZ layout because it relies on a fast
per-chunk decode, an operation that can't be done on a VCF. See
:doc:`tutorials/biobank_streaming` for the VCF→VCZ conversion
and a worked end-to-end example.

A VCZ store too large for the GPU (tens to hundreds of thousands
of haplotypes) opens via ``HaplotypeMatrix.from_zarr`` /
``GenotypeMatrix.from_zarr`` as a streaming view that walks the
chromosome chunk by chunk; every kernel listed below dispatches
on the streaming object the same way it would on a fully loaded
matrix -- the calling code is identical to the in-memory path.

What runs on a streaming matrix:

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Entry point
     - How it streams
   * - ``windowed_analysis``
     - Each chunk's windows are computed on the GPU as the chunk arrives, then rows are concatenated; window grids are aligned to the chunk grid so no window straddles a chunk boundary.
   * - ``sfs.sfs``, ``sfs.joint_sfs``, ``sfs.sfs_folded``, ``sfs.joint_sfs_folded``
     - Each chunk contributes its own per-site frequency-spectrum counts; the chromosome-wide answer is the sum across chunks.
   * - ``sfs.project_joint_sfs``
     - Same per-chunk accumulation, but the joint SFS is projected (via hypergeometric sampling) to a small target grid as it is built, so the full ``(n1+1, n2+1)`` histogram is never materialized.
   * - ``HaplotypeMatrix.compute_ld_statistics_gpu_single_pop`` / ``_two_pops``
     - Variant-pair statistics within ``max_bp_dist`` are summed per chunk into the bp bins. Pairs that fall on opposite sides of a chunk boundary would be missed by naive per-chunk sums, so the last ``max_bp_dist`` of one chunk is carried forward and paired with the start of the next, counted exactly once. Returns moments-LD ``DD``, ``Dz``, ``pi²`` (3 stats single-pop, 15 stats two-pop).
   * - ``relatedness.ibs``, ``relatedness.grm``
     - Streamed along the variant axis; the individual axis is tiled into row blocks. ``(n_ind, n_ind)`` accumulators live on host so the output can exceed GPU memory. ``grm`` is a two-pass operation (first to calculate chromosome-wide allele frequencies, second to accumulate a per-chunk outer product).
   * - ``StreamingHaplotypeMatrix.materialize(region, sample_subset)``
     - Pulls one sub-region (and optionally a subset of haplotypes) of the chromosome into GPU memory for kernels that need every variant simultaneously -- ``pairwise_r2``, Garud's H, or any custom recipe. 
   * - ``zarr_io.allel_zarr_to_vcz``
     - Streaming converter from scikit-allel layout to VCZ for stores that pre-date bio2zarr.

Fused Windowed Statistics
-------------------------

The ``windowed_analysis()`` function computes statistics across all genomic
windows in a single GPU pass via fused CUDA kernels.

.. list-table::
   :header-rows: 1
   :widths: 25 55

   * - Statistic
     - Description
   * - ``pi``
     - Nucleotide diversity per window
   * - ``theta_w``
     - Watterson's theta per window
   * - ``tajimas_d``
     - Tajima's D per window
   * - ``segregating_sites``
     - Segregating site count per window
   * - ``singletons``
     - Singleton count per window
   * - ``fst``
     - Hudson's FST per window
   * - ``fst_wc``
     - Weir-Cockerham FST per window
   * - ``dxy``
     - Absolute divergence per window
   * - ``da``
     - Net divergence per window
   * - ``garud_h1``, ``garud_h12``, ``garud_h123``, ``garud_h2h1``
     - Garud's H statistics per window
   * - ``mean_nsl``
     - Mean nSL per window
   * - ``local_pca``
     - Per-window local PCA (lostruct); returns a ``LocalPCAResult`` with eigvals, eigvecs, and window metadata