In this vignette, you will calculate basic population genetic statistics from microsatellite data using R packages.
The data used for these analyses are contained in an R dataset, nancycats, a “genind” object, i.e. an object of a class defined by the package adegenet. It contains microsatellite genotypes of 237 cats from 17 colonies in Nancy (France). Each individual is genotyped at 9 microsatellite loci.
##
## // Number of individuals: 237
## // Group sizes: 10 22 12 23 15 11 14 10 9 11 20 14 13 17 11 12 13
## // Number of alleles per locus: 16 11 10 9 12 8 12 12 18
## // Number of alleles per group: 36 53 50 67 48 56 42 54 43 46 70 52 44 61 42 40 35
## // Percentage of missing data: 2.34 %
## // Observed heterozygosity: 0.67 0.67 0.68 0.71 0.63 0.57 0.65 0.62 0.45
## // Expected heterozygosity: 0.87 0.79 0.8 0.76 0.87 0.69 0.82 0.76 0.61
## /// GENIND OBJECT /////////
##
## // 237 individuals; 9 loci; 108 alleles; size: 150.5 Kb
##
## // Basic content
## @tab: 237 x 108 matrix of allele counts
## @loc.n.all: number of alleles per locus (range: 8-18)
## @loc.fac: locus factor for the 108 columns of @tab
## @all.names: list of allele names for each locus
## @ploidy: ploidy of each individual (range: 2-2)
## @type: codom
## @call: genind(tab = truenames(nancycats)$tab, pop = truenames(nancycats)$pop)
##
## // Optional content
## @pop: population of each individual (group size range: 9-23)
## @other: a list containing: xy
##
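div <- summary(nancycats)  # per-locus summary statistics, including Hobs and Hexp
plot(div$Hobs, xlab="Locus number", ylab="Observed heterozygosity",
     main="Observed heterozygosity per locus")
plot(div$Hobs, div$Hexp, xlab="Observed heterozygosity", ylab="Expected heterozygosity",
     main="Expected heterozygosity as a function of observed heterozygosity per locus")
bartlett.test(list(div$Hexp, div$Hobs))  # homogeneity of variances of Hexp and Hobs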
##
## Bartlett test of homogeneity of variances
##
## data: list(div$Hexp, div$Hobs)
## Bartlett's K-squared = 0.046962, df = 1, p-value = 0.8284
The summary command returns several pieces of information. The ones of interest here are the observed and expected heterozygosity per locus. Heterozygosity varies among loci, and the expected value is higher than the observed value at every locus.
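To make the two quantities concrete, here is a minimal sketch that computes them for a single locus using the usual definitions: observed heterozygosity as the proportion of heterozygous individuals, and expected heterozygosity as \(1 - \sum_i p_i^2\). The genotypes are hypothetical and are not taken from nancycats; no small-sample correction is applied.

```r
# Hypothetical genotypes at one diploid locus:
# one row per individual, one column per allele copy
geno <- matrix(c("A", "A",
                 "A", "B",
                 "B", "B",
                 "A", "C",
                 "A", "B",
                 "C", "C"), ncol = 2, byrow = TRUE)

# Observed heterozygosity: proportion of individuals carrying two different alleles
Ho <- mean(geno[, 1] != geno[, 2])

# Allele frequencies over all allele copies, then expected heterozygosity
p <- table(geno) / length(geno)
He <- 1 - sum(p^2)

print(c(Ho = Ho, He = He))
```

A deficit of heterozygotes shows up as Ho being smaller than He, as in this toy example.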
The Bartlett test gives no evidence that the variances of observed and expected heterozygosity differ (p-value = 0.83). Note that this test compares variances, not means.
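Because Bartlett's test only addresses variances, a comparison of the mean observed and expected heterozygosity needs a separate test. The sketch below uses a paired t-test, since both values are measured on the same nine loci. The choice of this test is an assumption made here for illustration. The values are the rounded per-locus heterozygosities printed by summary above, so results may differ slightly from those obtained with the unrounded div$Hobs and div$Hexp.

```r
# Rounded per-locus values copied from the summary output above
Hobs <- c(0.67, 0.67, 0.68, 0.71, 0.63, 0.57, 0.65, 0.62, 0.45)
Hexp <- c(0.87, 0.79, 0.80, 0.76, 0.87, 0.69, 0.82, 0.76, 0.61)

# Homogeneity of variances (same test as above, on rounded values)
bt <- bartlett.test(list(Hexp, Hobs))

# Paired comparison of means: each locus contributes one (Hexp, Hobs) pair
tt <- t.test(Hexp, Hobs, paired = TRUE)

print(bt$p.value)   # p-value for equal variances
print(tt$estimate)  # mean of the per-locus differences Hexp - Hobs
print(tt$p.value)   # p-value for a mean difference of zero
```

With unrounded data, the same calls would take div$Hexp and div$Hobs directly.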
The function basic.stats() provides the observed heterozygosity (\(H_o\)), mean gene diversities within population (\(H_s\)), \(F_{is}\), and \(F_{st}\). The function boot.ppfis() provides a confidence interval for \(F_{is}\) in each population. The function indpca() does a PCA on the centered matrix of individuals’ allele frequencies.
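As a rough illustration of how these quantities relate, the sketch below computes \(F_{is} = 1 - H_o/H_s\) and \(F_{st} = 1 - H_s/H_t\) for one biallelic locus in two populations. All input values are hypothetical, and the formulas are the uncorrected textbook versions; hierfstat applies sample-size corrections, so its estimates will not match this simplified calculation exactly.

```r
# Hypothetical allele frequencies at one locus in two equally sized populations
p_pop1 <- c(A = 0.8, B = 0.2)
p_pop2 <- c(A = 0.3, B = 0.7)

# Hypothetical observed heterozygosity, averaged over the two populations
Ho <- 0.30

# Hs: mean within-population gene diversity
Hs <- mean(c(1 - sum(p_pop1^2), 1 - sum(p_pop2^2)))

# Ht: gene diversity computed from the mean allele frequencies
p_bar <- (p_pop1 + p_pop2) / 2
Ht <- 1 - sum(p_bar^2)

Fis <- 1 - Ho / Hs  # heterozygote deficit within populations
Fst <- 1 - Hs / Ht  # differentiation among populations

print(round(c(Ho = Ho, Hs = Hs, Ht = Ht, Fis = Fis, Fst = Fst), 3))
```

A positive \(F_{is}\) indicates fewer heterozygotes than expected within populations, and a positive \(F_{st}\) indicates that allele frequencies differ among populations.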
nancycats.hfstat <- genind2hierfstat(nancycats)
basicstat <- basic.stats(nancycats, diploid = TRUE, digits = 2)
names(basicstat)
## [1] "n.ind.samp" "pop.freq" "Ho" "Hs" "Fis"
## [6] "perloc" "overall"
boot.ppfis(dat = nancycats.hfstat)
## $call
## boot.ppfis(dat = nancycats.hfstat)
##
## $fis.ci
## ll hl
## 1 -0.0413 0.2793
## 2 0.1223 0.2857
## 3 0.0561 0.2772
## 4 0.0978 0.2461
## 5 0.0552 0.2065
## 6 0.0448 0.2789
## 7 0.0642 0.2830
## 8 0.0645 0.3363
## 9 -0.2343 0.1041
## 10 -0.0714 0.2338
## 11 0.0833 0.2261
## 12 0.0670 0.2509
## 13 -0.1093 0.1310
## 14 0.0203 0.2314
## 15 -0.1597 0.1231
## 16 -0.0846 0.1556
## 17 -0.1309 0.1801
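One way to read the intervals above is to flag the populations whose interval for \(F_{is}\) does not contain zero. The sketch below does this with the bounds copied from the $fis.ci output, where ll and hl are taken to be the lower and upper limits. It assumes the intervals are used as printed; bootstrap intervals change slightly from run to run, so the flagged populations may differ in a new run.

```r
# Bounds copied from the $fis.ci output above (one row per population)
fis_ci <- data.frame(
  pop = 1:17,
  ll = c(-0.0413, 0.1223, 0.0561, 0.0978, 0.0552, 0.0448, 0.0642, 0.0645,
         -0.2343, -0.0714, 0.0833, 0.0670, -0.1093, 0.0203, -0.1597,
         -0.0846, -0.1309),
  hl = c(0.2793, 0.2857, 0.2772, 0.2461, 0.2065, 0.2789, 0.2830, 0.3363,
         0.1041, 0.2338, 0.2261, 0.2509, 0.1310, 0.2314, 0.1231,
         0.1556, 0.1801)
)

# An interval excludes zero when both bounds have the same sign
fis_ci$excludes_zero <- fis_ci$ll > 0 | fis_ci$hl < 0

# Populations with an interval entirely above or below zero
print(fis_ci$pop[fis_ci$excludes_zero])
```

In this output, every interval that excludes zero lies above it, which points to a heterozygote deficit in those populations.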
## chi^2 df Pr(chi^2 >) Pr.exact
## fca8 395.80006 120 0.000000e+00 0
## fca23 239.34221 55 0.000000e+00 0
## fca43 434.33397 45 0.000000e+00 0
## fca45 66.11849 36 1.622163e-03 0
## fca77 270.52066 66 0.000000e+00 0
## fca78 402.80002 28 0.000000e+00 0
## fca90 217.19836 66 0.000000e+00 0
## fca96 193.36764 66 1.965095e-14 0
## fca37 291.00731 153 1.209777e-10 0
For each locus, we get a significance test of the null hypothesis \(H_0\): the locus is in HW equilibrium in the population, against the alternative \(H_1\): the locus is not in HW equilibrium.
Every p-value is far below 0.05, so we reject \(H_0\) for all nine loci: none of them is in HW equilibrium.
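The df and Pr(chi^2 >) columns of the table can be checked by hand. For a locus with \(k\) alleles there are \(k(k+1)/2\) possible genotypes, and estimating the allele frequencies leaves \(k(k-1)/2\) degrees of freedom. The sketch below recomputes these columns from the chi-squared statistics and the allele counts printed earlier. The Bonferroni correction for nine tests is an addition made here for illustration and is not part of the output above.

```r
# Allele counts and chi-squared statistics copied from the outputs above
hw <- data.frame(
  locus = c("fca8", "fca23", "fca43", "fca45", "fca77",
            "fca78", "fca90", "fca96", "fca37"),
  n_alleles = c(16, 11, 10, 9, 12, 8, 12, 12, 18),
  chi2 = c(395.80006, 239.34221, 434.33397, 66.11849, 270.52066,
           402.80002, 217.19836, 193.36764, 291.00731)
)

# Degrees of freedom: number of genotype classes minus number of alleles
hw$df <- hw$n_alleles * (hw$n_alleles - 1) / 2

# Upper-tail probability of the chi-squared distribution
hw$p <- pchisq(hw$chi2, df = hw$df, lower.tail = FALSE)

# Bonferroni adjustment for testing nine loci (illustrative addition)
hw$p_bonf <- p.adjust(hw$p, method = "bonferroni")

print(hw)
```

All adjusted p-values stay below 0.05, so the conclusion does not depend on the correction for multiple testing.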
In this vignette, you learned how to explore the patterns of genetic diversity in one population. You also have an idea of how the dataset may depart from the null Wright-Fisher model.
You may now want to move on to looking into population differentiation.
The session information below is useful for reproducibility. Of particular importance are the versions of R and of the packages used to create this workflow. It is considered good practice to record this information with every analysis.
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 3.6.1 (2019-07-05)
## os Debian GNU/Linux 9 (stretch)
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Etc/UTC
## date 2019-09-16
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## ade4 * 1.7-13 2018-08-31 [1] CRAN (R 3.6.1)
## adegenet * 2.1.1 2018-02-02 [1] CRAN (R 3.6.1)
## ape * 5.3 2019-03-17 [1] CRAN (R 3.6.1)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
## backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.1)
## boot 1.3-22 2019-04-02 [2] CRAN (R 3.6.1)
## callr 3.3.1 2019-07-18 [1] CRAN (R 3.6.1)
## class 7.3-15 2019-01-01 [2] CRAN (R 3.6.1)
## classInt 0.4-1 2019-08-06 [1] CRAN (R 3.6.1)
## cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.1)
## cluster 2.1.0 2019-06-19 [2] CRAN (R 3.6.1)
## coda 0.19-3 2019-07-05 [1] CRAN (R 3.6.1)
## colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
## DBI 1.0.0 2018-05-02 [1] CRAN (R 3.6.1)
## deldir 0.1-23 2019-07-31 [1] CRAN (R 3.6.1)
## desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
## devtools 2.2.0 2019-09-07 [1] CRAN (R 3.6.1)
## digest 0.6.20 2019-07-04 [1] CRAN (R 3.6.1)
## dplyr 0.8.3 2019-07-04 [1] CRAN (R 3.6.1)
## DT 0.8 2019-08-07 [1] CRAN (R 3.6.1)
## e1071 1.7-2 2019-06-05 [1] CRAN (R 3.6.1)
## ellipsis 0.2.0.1 2019-07-02 [1] CRAN (R 3.6.1)
## evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
## expm 0.999-4 2019-03-21 [1] CRAN (R 3.6.1)
## fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1)
## gdata 2.18.0 2017-06-06 [1] CRAN (R 3.6.1)
## ggplot2 3.2.1 2019-08-10 [1] CRAN (R 3.6.1)
## glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
## gmodels 2.18.1 2018-06-25 [1] CRAN (R 3.6.1)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
## gtools 3.8.1 2018-06-26 [1] CRAN (R 3.6.1)
## hierfstat * 0.04-22 2015-12-04 [1] CRAN (R 3.6.1)
## htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.6.1)
## htmlwidgets 1.3 2018-09-30 [1] CRAN (R 3.6.1)
## httpuv 1.5.2 2019-09-11 [1] CRAN (R 3.6.1)
## igraph 1.2.4.1 2019-04-22 [1] CRAN (R 3.6.1)
## KernSmooth 2.23-15 2015-06-29 [2] CRAN (R 3.6.1)
## knitr 1.24 2019-08-08 [1] CRAN (R 3.6.1)
## later 0.8.0 2019-02-11 [1] CRAN (R 3.6.1)
## lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.1)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.1)
## LearnBayes 2.15.1 2018-03-18 [1] CRAN (R 3.6.1)
## magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
## MASS 7.3-51.4 2019-03-31 [2] CRAN (R 3.6.1)
## Matrix 1.2-17 2019-03-22 [2] CRAN (R 3.6.1)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
## mgcv 1.8-28 2019-03-21 [2] CRAN (R 3.6.1)
## mime 0.7 2019-06-11 [1] CRAN (R 3.6.1)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
## nlme 3.1-140 2019-05-12 [2] CRAN (R 3.6.1)
## pegas * 0.11 2018-07-09 [1] CRAN (R 3.6.1)
## permute 0.9-5 2019-03-12 [1] CRAN (R 3.6.1)
## pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.1)
## pkgbuild 1.0.5 2019-08-26 [1] CRAN (R 3.6.1)
## pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.6.1)
## pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1)
## plyr 1.8.4 2016-06-08 [1] CRAN (R 3.6.1)
## prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.1)
## processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.1)
## promises 1.0.1 2018-04-13 [1] CRAN (R 3.6.1)
## ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.1)
## purrr 0.3.2 2019-03-15 [1] CRAN (R 3.6.1)
## R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.1)
## Rcpp 1.0.2 2019-07-25 [1] CRAN (R 3.6.1)
## remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.1)
## reshape2 1.4.3 2017-12-11 [1] CRAN (R 3.6.1)
## rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.1)
## rmarkdown 1.15 2019-08-21 [1] CRAN (R 3.6.1)
## rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
## scales 1.0.0 2018-08-09 [1] CRAN (R 3.6.1)
## seqinr 3.6-1 2019-09-07 [1] CRAN (R 3.6.1)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
## sf 0.7-7 2019-07-24 [1] CRAN (R 3.6.1)
## shiny 1.3.2 2019-04-22 [1] CRAN (R 3.6.1)
## sp 1.3-1 2018-06-05 [1] CRAN (R 3.6.1)
## spData 0.3.0 2019-01-07 [1] CRAN (R 3.6.1)
## spdep 1.1-2 2019-04-05 [1] CRAN (R 3.6.1)
## stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.1)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
## testthat 2.2.1 2019-07-25 [1] CRAN (R 3.6.1)
## tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.1)
## tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.6.1)
## units 0.6-4 2019-08-22 [1] CRAN (R 3.6.1)
## usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.1)
## vegan 2.5-6 2019-09-01 [1] CRAN (R 3.6.1)
## withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1)
## xfun 0.9 2019-08-21 [1] CRAN (R 3.6.1)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.1)
## yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.1)
##
## [1] /usr/local/lib/R/site-library
## [2] /usr/local/lib/R/library