Comparison of mixed model based approaches for correcting for population substructure with application to extreme phenotype sampling

Onifade, Maryam; Roy-Gagnon, Marie-Helene; Parent, Marie-Élise ORCID: https://orcid.org/0000-0002-4196-3773 and Burkett, Kelly M (2022). Comparison of mixed model based approaches for correcting for population substructure with application to extreme phenotype sampling. BMC Genomics, vol. 23, no. 98, pp. 1-12. DOI: 10.1186/s12864-022-08297-y.

PDF - Published version
Available under a Creative Commons Attribution license.

Official URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC88152...

Abstract

BACKGROUND: Mixed models are used to correct for confounding due to population stratification and hidden relatedness in genome-wide association studies. This class of models includes linear mixed models and generalized linear mixed models. Existing mixed model approaches to correct for population substructure have been previously investigated with both continuous and case-control response variables. However, they have not been investigated in the context of extreme phenotype sampling (EPS), where genetic covariates are only collected on samples having extreme response variable values. In this work, we compare the performance of existing binary trait mixed model approaches (GMMAT, LEAP and CARAT) on EPS data. Since linear mixed models are commonly used even with binary traits, we also evaluate the performance of a popular linear mixed model implementation (GEMMA).

RESULTS: We used simulation studies to estimate the type I error rate and power of all approaches assuming a population with substructure. Our simulation results show that for a common candidate variant, both LEAP and GMMAT control the type I error rate while CARAT's rate remains inflated. We applied all methods to a real dataset from a Québec, Canada, case-control study that is known to have population substructure. We observe similar type I error control with the analysis on the Québec dataset. For rare variants, the false positive rate remains inflated even after correction with mixed model approaches. For methods that control the type I error rate, the estimated power is comparable.

CONCLUSIONS: The methods compared in this study differ in their type I error control. Therefore, when data are from an EPS study, care should be taken to ensure that the models underlying the methodology are suitable to the sampling strategy and to the minor allele frequency of the candidate SNPs.

Document type:	Article
Keywords:	Extreme phenotype sampling; Generalized linear mixed models; Genome-wide association study; Population stratification; Type 1 error; Linear Models; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide
Centre:	Centre INRS-Institut Armand Frappier
Date deposited:	23 June 2022 02:40
Last modified:	23 June 2022 02:40
URI:	https://espace.inrs.ca/id/eprint/12414
