Coalescence simulations
These applications simulate a genomic region evolving through time under given mutation and recombination rates. Unlike the simulations with one locus and two alleles, which run forward in time, these simulations use a coalescent approach, which traces the ancestry of the sampled genomes backward in time.
msprime is the library used to run these simulations. Once a simulation is complete, the software generates a genotype matrix containing the genotype of each individual at each marker.
The different population genetic parameters are calculated from this matrix using the pyNei library.
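To make the backward-in-time idea concrete, the following is a toy sketch of the basic coalescent for a single non-recombining locus. It is not how msprime is implemented and it ignores mutation and recombination; the function name `coalescence_times` and the use of coalescent time units are assumptions made for illustration only.

```python
import random


def coalescence_times(n_lineages, seed=None):
    """Return the times of successive coalescence events for a sample.

    Time runs backward from the present and is measured in coalescent
    units (an assumption of this toy model, not an msprime setting).
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    k = n_lineages
    while k > 1:
        # With k lineages there are k * (k - 1) / 2 pairs that could merge,
        # so the waiting time to the next merger is exponential with that rate.
        t += rng.expovariate(k * (k - 1) / 2)
        times.append(t)
        k -= 1  # two lineages merge into one common ancestor
    return times


times = coalescence_times(5, seed=42)
print(times[-1])  # time to the most recent common ancestor of the sample
```

Only the ancestors of the sampled genomes are tracked, which is why this kind of simulation is usually much cheaper than simulating every individual forward in time.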
Expected heterozygosity
Expected heterozygosity, a measure of genetic diversity, is calculated using the GenAlEx formula for unbiased expected heterozygosity.
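As an illustration, the unbiased expected heterozygosity of a single marker can be sketched as below, assuming the usual form uHe = (2N / (2N - 1)) × (1 - Σ pᵢ²), where N is the number of diploid individuals and pᵢ is the frequency of the i-th allele. The function name and the genotype representation are hypothetical; this is not the pyNei API, and missing genotypes are not handled.

```python
from collections import Counter


def unbiased_expected_heterozygosity(genotypes):
    """Unbiased expected heterozygosity for one marker.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    n_copies = len(alleles)  # 2N gene copies
    if n_copies < 2:
        raise ValueError("at least one diploid individual is required")
    freqs = [count / n_copies for count in Counter(alleles).values()]
    expected_het = 1.0 - sum(p * p for p in freqs)
    # Small-sample correction: multiply by 2N / (2N - 1).
    return n_copies / (n_copies - 1) * expected_het


# Four individuals: allele 0 has frequency 6/8 and allele 1 has frequency 2/8.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1)]
print(unbiased_expected_heterozygosity(genotypes))  # (8 / 7) * 0.375 = 3 / 7
```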
Polymorphic markers
A variant (marker) is considered polymorphic when the frequency of its major allele is lower than 95%. The simulations report the total number of polymorphic variants and the proportion of polymorphic variants relative to the total number of variants.
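The 95% criterion can be sketched as follows. The helper names and the genotype representation (one list of allele pairs per marker) are assumptions made for this example and do not reflect the pyNei API.

```python
from collections import Counter


def major_allele_frequency(genotypes):
    """Frequency of the most abundant allele at one marker."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    return max(counts.values()) / sum(counts.values())


def polymorphism_summary(markers, threshold=0.95):
    """Return (number, proportion) of markers whose major allele
    frequency is lower than the threshold."""
    n_polymorphic = sum(
        1 for genotypes in markers
        if major_allele_frequency(genotypes) < threshold
    )
    return n_polymorphic, n_polymorphic / len(markers)


markers = [
    [(0, 0)] * 8 + [(0, 1)] * 2,   # major allele at 18/20: polymorphic
    [(0, 0)] * 10,                 # fixed: not polymorphic
    [(0, 0)] * 19 + [(0, 1)],      # major allele at 39/40: not polymorphic
    [(0, 0)] * 5 + [(1, 1)] * 5,   # major allele at 10/20: polymorphic
]
print(polymorphism_summary(markers))  # (2, 0.5)
```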
Allele Frequency Spectrum
The Allele Frequency Spectrum is the distribution of the frequencies of the major (most abundant) allele across markers. The chart therefore plots the number of polymorphic markers found at each major allele frequency.
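A spectrum of this kind amounts to a histogram of major allele frequencies. The sketch below assumes ten equal-width bins over the interval from 0 to 1; the binning actually used for the chart is not stated here, so the bin count and the function name are illustrative choices.

```python
def major_allele_spectrum(major_freqs, n_bins=10):
    """Count markers per major allele frequency bin.

    Bin i covers frequencies from i / n_bins up to (i + 1) / n_bins;
    a frequency of exactly 1.0 is placed in the last bin.
    """
    counts = [0] * n_bins
    for freq in major_freqs:
        index = min(int(freq * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Major allele frequencies of four hypothetical markers.
spectrum = major_allele_spectrum([0.55, 0.58, 0.72, 0.93])
print(spectrum)  # [0, 0, 0, 0, 0, 2, 0, 1, 0, 1]
```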
Principal Component Analysis (PCA)
The Principal Component Analysis is not calculated using all markers. Before the calculation, the non-polymorphic markers (major allele frequency of 95% or higher) and the highly associated markers (r² > 0.1) are filtered out. The genotypes of the remaining markers are coded in an array with 0 for the major allele homozygote, 2 for a homozygote of any minor allele, and 1 for the heterozygotes. Finally, this array is used to carry out the Principal Component Analysis.
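The genotype coding step described above can be sketched as follows. Only the 0/1/2 coding is shown; the marker filtering and the PCA itself are left out. The function name and data layout are assumptions, and when two alleles are equally common this sketch simply treats the first one encountered as the major allele.

```python
from collections import Counter


def code_marker(genotypes):
    """Code one marker: 0 = major allele homozygote, 1 = heterozygote,
    2 = homozygote for any minor allele."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    major = counts.most_common(1)[0][0]
    coded = []
    for first, second in genotypes:
        if first != second:
            coded.append(1)
        elif first == major:
            coded.append(0)
        else:
            coded.append(2)
    return coded


# Two markers genotyped in the same four individuals.
markers = [
    [(0, 0), (0, 1), (1, 1), (0, 0)],  # major allele is 0
    [(1, 1), (1, 1), (0, 1), (1, 1)],  # major allele is 1
]
coded = [code_marker(genotypes) for genotypes in markers]
# Transpose so that rows are individuals and columns are markers,
# the layout that most PCA routines expect.
matrix = [list(row) for row in zip(*coded)]
print(matrix)  # [[0, 0], [1, 0], [2, 1], [0, 0]]
```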
Population distances
Distances between populations are calculated using Jost’s estimator of differentiation (Dest), following the formulation given by GenAlEx. The estimator is built from the following quantities.
Observed heterozygosity (HO): the observed heterozygosity averaged across the populations in the collection, that is, the sum of HOs over all populations divided by k, where HOs is the observed heterozygosity in the s-th population and k is the number of populations.
Average within-population heterozygosity (HS): identical to the mean He, that is, the within-population expected heterozygosity averaged across populations.
Corrected HS: HS for a given locus is adjusted for small population size and inbreeding using the correction of Nei and Chesser [30]. The correction uses n̂, the harmonic mean population size for the k populations, and HO, the average observed within-population heterozygosity.
Total expected heterozygosity (HT): the expected heterozygosity calculated as if all populations were pooled (no subdivision).
Corrected HT: HT adjusted for small population size and inbreeding using the correction of Nei and Chesser, where n̂ is the harmonic mean population size over the k populations.
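The quantities listed above can be combined into Dest for a single locus as sketched below. The formulas are not spelled out in this section, so the sketch assumes the commonly cited forms: corrected HS = n̂ / (n̂ - 1) × (HS - HO / (2n̂)), corrected HT = HT + corrected HS / (k n̂) - HO / (2 k n̂), and Dest = k / (k - 1) × (corrected HT - corrected HS) / (1 - corrected HS). The function and argument names are hypothetical and are not the pyNei API.

```python
def jost_dest(allele_freqs, obs_het, sample_sizes):
    """Jost's Dest for one locus (sketch under the assumptions stated above).

    allele_freqs: one list of allele frequencies per population,
                  with alleles in the same order in every population.
    obs_het:      observed heterozygosity (HOs) of each population.
    sample_sizes: number of individuals sampled in each population.
    """
    k = len(allele_freqs)
    if k < 2:
        raise ValueError("at least two populations are required")
    n_harm = k / sum(1.0 / n for n in sample_sizes)  # harmonic mean size
    h_o = sum(obs_het) / k                           # mean observed heterozygosity
    # HS: mean of the within-population expected heterozygosities.
    h_s = sum(1.0 - sum(p * p for p in pop) for pop in allele_freqs) / k
    # HT: expected heterozygosity from allele frequencies averaged over populations.
    n_alleles = len(allele_freqs[0])
    mean_freqs = [sum(pop[i] for pop in allele_freqs) / k for i in range(n_alleles)]
    h_t = 1.0 - sum(p * p for p in mean_freqs)
    # Corrections for small population size and inbreeding.
    corrected_h_s = n_harm / (n_harm - 1) * (h_s - h_o / (2 * n_harm))
    corrected_h_t = h_t + corrected_h_s / (k * n_harm) - h_o / (2 * k * n_harm)
    return k / (k - 1) * (corrected_h_t - corrected_h_s) / (1 - corrected_h_s)


# Two populations of ten individuals each, one locus with two alleles.
dest = jost_dest(
    allele_freqs=[[0.8, 0.2], [0.4, 0.6]],
    obs_het=[0.3, 0.5],
    sample_sizes=[10, 10],
)
print(round(dest, 3))  # 0.238
```

For a multi-locus distance the per-locus values still have to be combined across loci; how that is done is not specified here, so it is left out of the sketch.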