Authors:
David M. Francis, The Ohio State University
Heather L. Merk, The Ohio State University
Matthew Robbins, The Ohio State University
This page is a continuation of the Overview of Analysis of Variance page and is intended to help plant breeders consider the notions of fixed and random effects and the impact these can have on ANOVA in the context of plant breeding. Briefly, ANOVA is a statistical test that takes the total variation and assigns it to known causes, leaving a residual portion allocated to uncontrolled or unexplained variation, called the experimental error. Because variability is measured as sums of squared deviations from the mean of all observations, the variation assigned to the different controlled causes is additive. It is therefore important to define the statistical model completely; otherwise, the experimental error may be unnecessarily inflated (McIntosh, 1983).
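The additivity of sums of squares can be checked numerically. The following is a minimal sketch for a one-factor layout; the yield values and treatment names are made up for illustration and do not come from any trial.

```python
# Hypothetical yields for three treatments, four plots each (illustrative numbers only).
yields = {
    "A": [50.0, 52.0, 51.0, 53.0],
    "B": [55.0, 57.0, 56.0, 58.0],
    "C": [60.0, 61.0, 59.0, 62.0],
}

all_obs = [y for plots in yields.values() for y in plots]
grand_mean = sum(all_obs) / len(all_obs)

# Total variation: squared deviations of every observation from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)

# Variation assigned to the known cause (treatment):
# squared deviations of treatment means from the grand mean, weighted by plot count.
ss_treatment = sum(
    len(plots) * (sum(plots) / len(plots) - grand_mean) ** 2
    for plots in yields.values()
)

# Residual variation (experimental error):
# squared deviations of each observation from its own treatment mean.
ss_error = sum(
    (y - sum(plots) / len(plots)) ** 2
    for plots in yields.values()
    for y in plots
)

print(ss_total, ss_treatment, ss_error)
```

If a real source of variation (for example, blocks) were left out of the model, its sum of squares would fall into `ss_error`, which is the inflation of experimental error mentioned above.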
In the Overview of Analysis of Variance page, we considered the following linear model:
Y = m + f(treatment) + error
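where Y is the dependent variable, m is the overall mean, f(treatment) is the effect of the treatment, and error is the experimental error.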
Intuitively, we may think about the treatments as being under our control and as "fixed." Usually we are interested in comparing the dependent variable among the levels of the fixed effect. For example, we may want to evaluate whether there are differences in yield (the dependent variable) between field locations for some elite cultivars we've been developing. To conduct this experiment, we would select the cultivars we want to evaluate and find suitable locations for our trial. We could think of the cultivars and locations as being fixed; we purposely chose to study these particular cultivars and locations. In this case, we are only interested in the performance of the elite cultivars we're testing in the specific locations we're testing.
Random effects, in contrast to fixed effects, are typically used to account for variance in the dependent variable. Unlike fixed effects, we aren't looking to compare one level of a random effect to another. Instead, the levels of a random effect are treated as a random sample from an infinite population, and we want to make inferences that extend beyond the sample. In our example, we could also consider location as a random effect. The cultivars would still be fixed effects, but location would be random. If we felt our locations were representative of all possible locations, we could use the different locations to help us evaluate how well cultivars perform across locations as a whole, not just at the locations we've tested. The classification of effects as fixed or random determines the appropriate F-test.
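To make the distinction concrete, the cultivar-by-location example can be written as a linear model. This is a sketch assuming a randomized complete block design at each location; the subscripts and symbols are notation chosen here for illustration, not taken from the original page.

```latex
% Observation on treatment k in block j of location i (assumed notation)
Y_{ijk} = \mu + L_i + R(L)_{ij} + T_k + (TL)_{ik} + e_{ijk}

% Location treated as a random effect: levels sampled from a population
L_i \sim N(0, \sigma^2_L)

% Location treated as a fixed effect: deviations of the chosen levels sum to zero
\sum_i L_i = 0
```

Under the random-effect assumption the quantity of interest is the variance component (how much locations vary in general), whereas under the fixed-effect assumption it is the set of individual location effects.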
McIntosh (1983) provides a set of reference tables for use during experimental design and analysis. These tables are intended for field experiments conducted over two or more locations or years. Some of the tables are replicated below.
| Sources of variation | df | Mean squares | Expected mean squares^{1}: RL-RT | Expected mean squares^{1}: RL-FT | Expected mean squares^{1}: FL-FT |
|---|---|---|---|---|---|
| Locations (l) | l-1 | M_{1} | σ^{2}_{e} + rσ^{2}_{TL} + tσ^{2}_{R(L)} + rtσ^{2}_{L} | σ^{2}_{e} + tσ^{2}_{R(L)} + rtσ^{2}_{L} | σ^{2}_{e} + tσ^{2}_{R(L)} + rtσ^{2}_{L} |
| Blocks(Location) (r) | l(r-1) | M_{2} | σ^{2}_{e} + tσ^{2}_{R(L)} | σ^{2}_{e} + tσ^{2}_{R(L)} | σ^{2}_{e} + tσ^{2}_{R(L)} |
| Treatment (t) | t-1 | M_{3} | σ^{2}_{e} + rσ^{2}_{TL} + rlσ^{2}_{T} | σ^{2}_{e} + rσ^{2}_{TL} + rlσ^{2}_{T} | σ^{2}_{e} + rlσ^{2}_{T} |
| Location x treatment | (l-1)(t-1) | M_{4} | σ^{2}_{e} + rσ^{2}_{TL} | σ^{2}_{e} + rσ^{2}_{TL} | σ^{2}_{e} + rσ^{2}_{TL} |
| Pooled error | l(r-1)(t-1) | M_{5} | σ^{2}_{e} | σ^{2}_{e} | σ^{2}_{e} |

^{1} R = random, F = fixed, L = location, T = treatment
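One way to read the expected mean squares above: an F-test for an effect needs a denominator whose expectation matches the numerator's expectation except for the term being tested. The following worked equations are a sketch of that reasoning using only the entries in the table.

```latex
% Treatment, with random locations and fixed treatments (RL-FT):
E(M_3) = \sigma^2_e + r\sigma^2_{TL} + rl\,\sigma^2_T, \qquad
E(M_4) = \sigma^2_e + r\sigma^2_{TL}
% The two differ only by rl\sigma^2_T, so M_3 / M_4 tests the treatment effect.

% Treatment, with fixed locations and fixed treatments (FL-FT):
E(M_3) = \sigma^2_e + rl\,\sigma^2_T, \qquad
E(M_5) = \sigma^2_e
% The two differ only by rl\sigma^2_T, so M_3 / M_5 tests the treatment effect.

% Locations, with random locations and random treatments (RL-RT):
E(M_1) + E(M_5) = 2\sigma^2_e + r\sigma^2_{TL} + t\sigma^2_{R(L)} + rt\,\sigma^2_L
E(M_2) + E(M_4) = 2\sigma^2_e + r\sigma^2_{TL} + t\sigma^2_{R(L)}
% No single mean square matches E(M_1) without rt\sigma^2_L, so the sums are
% compared instead, giving the approximate ratio (M_1 + M_5) / (M_2 + M_4).
```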
| Sources of variation | Mean squares | F ratio^{1}: RL-RT | F ratio^{1}: RL-FT | F ratio^{1}: FL-FT |
|---|---|---|---|---|
| Locations (l) | M_{1} | (M_{1}+M_{5})/(M_{2}+M_{4}) | M_{1}/M_{2} | M_{1}/M_{2} |
| Blocks(Location) (r) | M_{2} | | | |
| Treatment (t) | M_{3} | M_{3}/M_{4} | M_{3}/M_{4} | M_{3}/M_{5} |
| Location x treatment | M_{4} | M_{4}/M_{5} | M_{4}/M_{5} | M_{4}/M_{5} |
| Pooled error | M_{5} | | | |

^{1} R = random, F = fixed, L = location, T = treatment
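The ratios in the table above can be collected in a small helper. This is a minimal sketch: the function name `f_ratios`, the model labels passed as strings, and the mean-square values in the usage example are all hypothetical and chosen for illustration.

```python
def f_ratios(m1, m2, m3, m4, m5, model):
    """Return F ratios for locations, treatment, and location x treatment.

    m1..m5 are the mean squares M1..M5 from the ANOVA table.
    model is one of "RL-RT", "RL-FT", "FL-FT"
    (R = random, F = fixed, L = location, T = treatment).
    """
    if model not in ("RL-RT", "RL-FT", "FL-FT"):
        raise ValueError(f"unknown model: {model!r}")

    # Locations: approximate ratio of summed mean squares when both
    # effects are random, otherwise tested against blocks within location.
    if model == "RL-RT":
        f_location = (m1 + m5) / (m2 + m4)
    else:
        f_location = m1 / m2

    # Treatment: tested against the interaction when locations are random,
    # against the pooled error when both effects are fixed.
    f_treatment = m3 / m5 if model == "FL-FT" else m3 / m4

    # Interaction: tested against the pooled error in every case.
    f_interaction = m4 / m5

    return {
        "locations": f_location,
        "treatment": f_treatment,
        "location x treatment": f_interaction,
    }


# Usage with made-up mean squares (not from a real experiment).
for model in ("RL-RT", "RL-FT", "FL-FT"):
    print(model, f_ratios(120.0, 10.0, 80.0, 20.0, 5.0, model))
```

With the same mean squares, the treatment F ratio changes depending on whether locations are declared random or fixed, which is why the classification has to be settled before the analysis.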
In a genetic/breeding experiment, treatments would likely be genotypes or varieties.
When designing experiments, plant breeders must consider the question they want to answer and, consequently, which statistical analyses are appropriate to answer it. With regard to ANOVA, two important points should be considered in this context.
Many statistics textbooks provide a good discussion of theory and applications of ANOVA. Two examples are listed below.
Development of this page was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.