David M. Francis, The Ohio State University
Heather L. Merk, The Ohio State University
Deana Namuth-Covert, University of Nebraska-Lincoln
After a source, often a wild accession, has been identified as possessing a trait of interest to a breeding program, the next logical question for plant breeders is, "How can this trait be best incorporated into valuable breeding material?" The answer depends largely on the genetic nature of the trait. Phenotypic data combined with genotypic data from molecular markers can reveal associations between markers and traits, and these associations help determine the number and nature of the genes or quantitative trait loci (QTL) controlling a trait.
To detect associations between molecular markers and traits of interest, data analysis approaches include single marker analysis, simple interval mapping (SIM), multiple interval mapping (MIM), and composite interval mapping (CIM). Although these approaches are designed for QTL analysis, they are also typically employed whenever a trait's mode of genetic control is unknown. This article focuses on single marker analysis.
Single marker analysis can be conducted using a variety of statistical analyses, including t-tests, ANOVA, regression, maximum likelihood estimations, and log likelihood ratios. Because molecular marker genotypes can be classified into groups, they can be used as classifying variables for a t-test or ANOVA, or as explanatory variables for regression analysis. The null hypothesis tested is that genotypic classes do not differ in phenotype for a given molecular marker. In other words, single marker analysis tests whether phenotype values differ among genotypes at a given molecular marker. For example, do resistant and susceptible individuals have different genotypes at a given molecular marker? Significant differences suggest that the marker genotype and phenotype are associated.
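As an illustration of the ANOVA approach described above, the following minimal Python sketch computes a one-way ANOVA F statistic with marker genotype as the classifying variable. The function name `anova_f`, the genotype labels, and the disease scores are hypothetical and chosen only for illustration; they are not data from the bacterial spot example.

```python
from statistics import mean


def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps each marker genotype class to a list of phenotype values.
    """
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    k = len(groups)   # number of genotype classes
    n = len(values)   # total number of individuals

    # Variation among genotype class means
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation among individuals within each genotype class
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical disease scores for an F2 population scored at one marker
phenotypes_by_genotype = {
    "AA": [2.0, 3.0, 2.5, 3.5],
    "AB": [4.0, 5.0, 4.5, 5.5],
    "BB": [6.0, 7.0, 6.5, 7.5],
}

f_stat, df_b, df_w = anova_f(phenotypes_by_genotype)
print(f"F = {f_stat:.2f} with {df_b} and {df_w} degrees of freedom")
```

A large F value relative to the F distribution with the stated degrees of freedom is evidence against the null hypothesis that genotypic classes do not differ in phenotype. In practice, a statistics package (for example, `scipy.stats.f_oneway`) would typically be used to obtain the p-value.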
In the simplest case, linear equations can be developed to describe the relationship between a trait and each molecular marker using the following form:
Y = µ + f(marker) + error
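One possible form of f(marker), shown here only as an assumption, is an additive term in which the marker is coded as the number of copies of one allele (0, 1, or 2). The sketch below fits that model by ordinary least squares; the function name `fit_marker_regression` and the data are hypothetical.

```python
def fit_marker_regression(allele_counts, phenotypes):
    """Fit phenotype = intercept + slope * allele_count by least squares.

    Returns (intercept, slope, r_squared). Under this coding the intercept
    estimates the mean of the class carrying zero copies of the allele, and
    the slope estimates the additive effect of one allele substitution.
    """
    n = len(phenotypes)
    x_mean = sum(allele_counts) / n
    y_mean = sum(phenotypes) / n

    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(allele_counts, phenotypes))
    sxx = sum((x - x_mean) ** 2 for x in allele_counts)
    syy = sum((y - y_mean) ** 2 for y in phenotypes)

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Proportion of phenotypic variation explained by the marker
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical data: copies of allele B at one marker, and a trait score
allele_counts = [0, 0, 1, 1, 2, 2]
phenotypes = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

intercept, slope, r_squared = fit_marker_regression(allele_counts, phenotypes)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, R^2 = {r_squared:.3f}")
```

The R² value from such a fit is commonly interpreted as the proportion of phenotypic variation explained by the marker. Dominance could be modeled by adding a second term for heterozygotes, which is not shown here.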
Single marker analysis using ANOVA was used in the bacterial spot start-to-finish example to determine associations between phenotype for bacterial spot resistance in the field and genotype for F2 populations.
The Plant and Soil Sciences eLibrary provides a helpful animation that is complementary to this lesson. The animation guides users through single marker analysis using ANOVA in Microsoft Excel.
To perform single marker analysis, the plant breeder must first develop a population that is segregating for the trait of interest. When developing a population for QTL analysis, the population structure and size, as well as the number and type of molecular markers, must be considered. Population structure and size are briefly considered here.
When analyzing populations with balanced structure (e.g., backcross one [BC1], F2, and recombinant inbred line [RIL] populations), parametric statistical analyses (e.g., ANOVA) are appropriate, and they can be easily performed using genetic mapping software such as QTL Cartographer. As mentioned above, this type of analysis was used to determine associations between phenotype for bacterial spot resistance in the field and genotype for F2 populations in the bacterial spot start-to-finish example.
When analyzing populations with unbalanced structure (e.g., inbred backcross [IBC] populations like a BC2S5 population, which has an expected 7:1 genotypic ratio), non-parametric statistics such as the Kruskal–Wallis statistic may be appropriate. Unbalanced populations typically have a phenotype and/or genotype class that has too few individuals to make parametric statistics appropriate. The IBC population developed in the start-to-finish example provides an example.
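To make the non-parametric alternative concrete, the following sketch computes the Kruskal–Wallis H statistic from ranks, using average ranks and the standard correction for ties. The function name `kruskal_wallis_h`, the class labels, and the scores are hypothetical; the 14:2 split simply mimics an unbalanced 7:1 ratio.

```python
def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` maps each marker genotype class to a list of phenotype values.
    """
    pooled = sorted(v for g in groups.values() for v in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # number of tied observations
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups.values()
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction


# Hypothetical disease scores in an unbalanced population (14:2, i.e. 7:1)
scores_by_genotype = {
    "recurrent": [5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.2,
                  6.4, 6.6, 6.8, 7.0, 7.2, 7.4, 7.6],
    "donor": [2.0, 3.0],
}

h = kruskal_wallis_h(scores_by_genotype)
print(f"H = {h:.3f}")
```

H is usually compared to a chi-square distribution with (number of classes - 1) degrees of freedom, although that approximation is rough when a class contains very few individuals. A library routine such as `scipy.stats.kruskal` would normally be used to obtain the p-value.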
The ability to detect associations between molecular markers and bacterial spot resistance is in part dependent on population size. In general, the smaller the effect of the QTL, the larger the number of individuals required to detect it.
Detecting an additive QTL that explains 50% of the phenotypic variation of a trait in an F2 population requires a population size of at least 16. This assumes that the marker is completely linked to the trait, the significance level is 0.05, and the probability of missing a true association is 10% (i.e., 90% power). Under the same conditions, at least 206 individuals are required to detect an additive QTL that explains only 5% of the phenotypic variation.
The advantages of single marker analysis are based on Collard et al. (2005).
The disadvantages and limitations of single marker analysis are based on Collard et al. (2005).
These limitations may be overcome by using a large number of molecular markers spread throughout the genome.
Single marker analysis is a relatively simple method of QTL analysis that can be conducted to detect associations between molecular markers and traits of interest.
Ben Hui Liu provides a thorough explanation of QTL analysis in his text, Statistical Genomics.
Development of this page was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.