Shawn Yarnes Ph.D., The Ohio State University
N = total sample size (number of experimental units within both treatments)
σ = assumed standard deviation of each treatment response (both treatments assumed equal)
Z(1-(α/2)) = related to the chosen significance criterion α; can be found in normal distribution tables, or calculated in Microsoft Excel using the formula =NORM.S.INV(1-(α/2))
Z(1-β) = related to the chosen power, or sensitivity, of the experiment; can be found in normal distribution tables, or calculated in Microsoft Excel using the formula =NORM.S.INV(1-β)
E = minimum detectable difference between treatment means
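For readers working outside Excel, the two Z values can be obtained from any standard normal quantile function. The following is a minimal Python sketch using only the standard library. The values α = 0.05 and power = 0.90 are illustrative assumptions, not values prescribed here, and the helper name z_quantile is hypothetical.

```python
from statistics import NormalDist


def z_quantile(p: float) -> float:
    """Standard normal quantile; the same quantity Excel's NORM.S.INV(p) returns."""
    return NormalDist().inv_cdf(p)


alpha = 0.05  # assumed significance criterion (illustrative)
power = 0.90  # assumed power, i.e. 1 - beta (illustrative)

z_alpha = z_quantile(1 - alpha / 2)  # Z(1-(α/2))
z_beta = z_quantile(power)           # Z(1-β)

print(f"Z(1-(α/2)) = {z_alpha:.3f}, Z(1-β) = {z_beta:.3f}")
```

With these assumed settings the two values come out near 1.96 and 1.28.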
To solve the equation for total sample size, you must first assign a priori values. The variables Z(1-(α/2)) and Z(1-β) are set based on acceptable confidence and power levels, generally determined by scientific discipline. The variable σ is the assumed standard deviation of the treatment response, based on prior knowledge from a pilot or preliminary study or from previous work, and E is the magnitude of the difference the investigator hopes to be able to statistically differentiate in the experiment. Once these four a priori values are set, the total sample size, N, can be calculated.
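The calculation described above can be sketched in Python. This sketch assumes the commonly used formula for comparing two treatment means with equal standard deviations, N = 4σ²(Z(1-(α/2)) + Z(1-β))² / E². The function name and the input values (α = 0.05, power = 0.90, σ = 2.0, E = 1.0) are illustrative assumptions.

```python
from statistics import NormalDist


def total_sample_size(sigma: float, diff: float,
                      alpha: float = 0.05, power: float = 0.90) -> float:
    """Total N across both treatments, assuming
    N = 4 * sigma^2 * (Z(1-alpha/2) + Z(1-beta))^2 / E^2."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # significance term
    z_beta = std_normal.inv_cdf(power)           # power term (power = 1 - beta)
    return 4 * sigma ** 2 * (z_alpha + z_beta) ** 2 / diff ** 2


# Illustrative a priori values (assumptions, not measured data)
n_total = total_sample_size(sigma=2.0, diff=1.0)
print(f"Total N = {n_total:.1f}, per treatment n = {n_total / 2:.1f}")
```

In practice N is rounded up to a whole number of experimental units and divided by two to give the per-treatment size n. Small differences from hand calculations can arise from rounding the Z values.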
Table 1: Preliminary Results (n = 5), Dry Weight (g)
In this example the investigator would need to have treatment sizes of n = 2,904 plants (n = N ÷ 2 ) to detect a 0.5g difference between treatments.
Consider that the investigator has learned that greenhouse space is limited to only 300 plants. The equation can be rearranged to determine the minimum detectable difference between treatment means for N=300.
Given the space constraints and the estimated standard deviation, the minimum difference that can be detected with 300 plants is 2.2g dry weight.
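The rearrangement can be checked with a short Python sketch. It assumes the same two-treatment formula solved for E, that is E = 2σ(Z(1-(α/2)) + Z(1-β)) / √N, with α = 0.05 and power = 0.90 as assumed settings and σ = 5.9 as the rounded pilot standard deviation. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_difference(total_n: float, sigma: float,
                              alpha: float = 0.05, power: float = 0.90) -> float:
    """Smallest difference between treatment means detectable with total_n units,
    assuming E = 2 * sigma * (Z(1-alpha/2) + Z(1-beta)) / sqrt(N)."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return 2 * sigma * z_sum / sqrt(total_n)


# Greenhouse limited to 300 plants in total; sigma from the pilot study (rounded)
e_min = min_detectable_difference(total_n=300, sigma=5.9)
print(f"Minimum detectable difference: {e_min:.1f} g")
```

Under these assumptions the result is about 2.2 g, in line with the value stated above.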
As sample size increases, the minimum difference that can be detected decreases, but at a diminishing rate. The most effective way to reduce the required sample size is to reduce error. The blue line represents the standard deviation of plant dry weights observed in the previously mentioned pilot study (σ = 5.9). Detecting a 1g difference in dry weight between the two treatments requires a total sample size of 1,454. The red line represents a smaller error measurement of σ = 2.00, which could perhaps be obtained by growing clonal plants instead of siblings. With less variation, the total sample size needed to detect a 1g difference in dry weight is only 168 plants.
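The relationship behind the two lines can be tabulated with a short Python sketch. It assumes the formula N = 4σ²(Z(1-(α/2)) + Z(1-β))² / E² with α = 0.05 and power = 0.90. These settings, the function name, and the list of differences are illustrative assumptions, so the printed values may differ slightly from the figures quoted above, which appear to rest on rounded inputs.

```python
from statistics import NormalDist


def total_sample_size(sigma: float, diff: float,
                      alpha: float = 0.05, power: float = 0.90) -> float:
    """Total N for two treatments, assuming
    N = 4 * sigma^2 * (Z(1-alpha/2) + Z(1-beta))^2 / E^2."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return 4 * sigma ** 2 * z_sum ** 2 / diff ** 2


# Compare the pilot-study error (sigma = 5.9) with the smaller error (sigma = 2.0)
for diff in (0.5, 1.0, 2.0, 4.0):
    n_pilot = total_sample_size(5.9, diff)
    n_clonal = total_sample_size(2.0, diff)
    print(f"E = {diff} g: N ≈ {n_pilot:.0f} (σ = 5.9), N ≈ {n_clonal:.0f} (σ = 2.0)")
```

Because N scales with σ², reducing σ from 5.9 to 2.0 cuts the required sample size by a factor of roughly 8.7 at every value of E, while halving E quadruples N.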
Many statistics textbooks provide detailed explanations of sample size estimation:
Kuehl, R.O. (2000) Design of Experiments: Statistical Principles of Research Design and Analysis, 2nd ed. Duxbury Press, Pacific Grove.
Related eXtension Plant Breeding and Genomics Resources:
Development of this page was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.