RESEARCH ARTICLE


On the Informativeness of Dominant and Co-Dominant Genetic Markers for Bayesian Supervised Clustering



Gilles Guillot*,1, Alexandra Carpentier-Skandalis2
1 Department of Informatics and Mathematical Modelling, Technical University of Denmark, 2800, Lyngby, Copenhagen, Denmark.
2 Centre for Ecological and Evolutionary Synthesis, Department of Biology, University of Oslo, P.O. Box 1066 Blindern, 0316 Oslo, Norway


© 2011 Guillot et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Informatics and Mathematical Modelling, Technical University of Denmark, 2800, Lyngby, Copenhagen, Denmark; Tel: +4545253321; E-mail: gigu@imm.dtu.dk


Abstract

We study the accuracy of a Bayesian supervised method used to cluster individuals into genetically homogeneous groups on the basis of dominant or codominant molecular markers. We provide a formula relating an error criterion to the number of loci used and the number of clusters. This formula is exact and holds for an arbitrary number of clusters and markers. Our work suggests that studies based on dominant markers can achieve an accuracy similar to that of studies based on codominant markers if the number of markers used in the former is about 1.7 times larger than in the latter.

Keywords: Assignment method, multilocus genotype, SNP, AFLP, likelihood, Bayes estimator.