BIOL 458 BIOMETRY
Lab 7 – Multi-Factor ANOVA
PART 1: Introduction to Factorial ANOVA
Single
factor or
Factorial ANOVA allows us to simplify such an analysis by defining two factors (species and sex), each of which has two levels. In this example, we have established two sets of related treatments (species and sex), each with two levels (species 1 and species 2; male and female, respectively). Each individual observation must be classified according to the combination of factor levels in which it occurs (each animal weighed had both a species designation and a sex), and each observation must occur within only one combination of factor levels (no animal was a member of both species 1 and species 2, or both male and female).
Let's look at some possible factors and levels:
FACTOR          LEVELS
STATE
TREE SPECIES    Oak, Maple, Ash, Pine, etc.
TREATMENT       Burned, Un-Burned
DEPTH           0 -
Now that you have some idea of the difference between factors and levels, look at the list below and classify each item as either a factor or a level.
1) Height 2) Male 3) Species 4) Polluted 5) Island 6)
(A good question to ask if you think you've found a factor is: "What are its possible levels?" If you cannot think of any, it is probably not a factor.)
Now that we have a clear idea of what constitutes a factor and what constitutes a factor level (in the question above, all the odd-numbered items were factors and the even-numbered items were levels), it is time to learn how they are used. We will start with a simple example with 2 factors (A and B), each with 2 levels (1 and 2), where mi is the population mean for a particular combination of the levels of factors A and B (cell i).
                 Factor A
                 1      2
Factor B    1    m1     m2
            2    m3     m4
What hypotheses might we want to test?
Different treatments in a plant growth experiment could be defined according to the amounts of fertilizer added (Factor A). Fertilizer treatment 1 might consist of
There are times when it is desirable to examine the action of 2 or more qualitatively distinct factors simultaneously. To continue our example, suppose we also wish to examine the effect of a second factor, amount of light (Factor B). In Light treatment 1, plants might receive 12 hours a day of light, while in treatment 2 they receive 18 hours a day.
There are 3 sorts of questions we might want to ask from such a design:
1) Is there any difference in plant growth between fertilizer treatments averaged across light treatments (the main effect of factor A)?
2) Is there any difference in plant growth between light treatments averaged across fertilizer treatments (the main effect of factor B)?
3) Is the effect of light different among fertilizer treatments? Or, conversely, is the effectiveness of fertilizer different with different amounts of light (the interaction between factors A and B)?
MAIN EFFECT OF FACTOR A
Do the group means differ among the levels of factor A, averaging over the levels of factor B?
Ho: (m1 + m3)/2 = (m2 + m4)/2
More simply: Ho: m1 + m3 - m2 - m4 = 0. This is called the main effect of factor A.
MAIN EFFECT OF FACTOR B
Do the group means differ among the levels of factor B, averaging over the levels of factor A?
Similarly, the main effect of factor B would be: Ho: m1 + m2 - m3 - m4 = 0.
INTERACTION
Besides the main effects of the factors, one could also ask whether the effect of factor A differs among the levels of factor B. This kind of effect is known as the interaction of factor A and factor B. A significant interaction effect would indicate that the effects of factor A and factor B are not additive: the effect of factor A on the group means depends on the level of factor B.
Ho: (m1 - m2) - (m3 - m4) = 0
For example, the growth of plants in response to increasing amounts of water and fertilizer might show an interaction effect. Plants might be able to use more fertilizer and therefore grow more if provided more water, but not if water is limited. Hence, the effect of fertilizer on plant growth depends on the level of water they receive.
Here are some numerical examples:
(1)              Factor A
                 1      2
Factor B    1    5      10
            2    10     15
The numbers listed under each combination of factor levels represent the population means. The questions at hand are:
Is there a main effect of factor A?
Is there a main effect of factor B?
Is there an interaction between factor A and B?
The test for the main effect of factor A would be: Ho: (5+10) - (10+15) = 0.
Since 15 ≠ 25, there is a main effect of factor A. Similarly for factor B, Ho: (5+10) - (10+15) = 0, and 15 ≠ 25, so a main effect of factor B is also present.
The test for the interaction between factor A and B would be: Ho: (5+15) - (10+10) = 0. Since 20 = 20, there is no interaction. In other words, the difference between means among levels of factor A does not differ between levels of factor B (in this case the difference is always 5), or conversely, that the differences between the levels of B do not differ between levels of A.
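The three hypotheses can also be checked with a few lines of code. The sketch below is only an illustration: the function name factorial_contrasts is made up for this handout, and it assumes the cell layout of the table above (m1 and m2 in the first row, m3 and m4 in the second). The interaction is written as a difference of differences, which is algebraically the same as (m1 + m4) - (m2 + m3).

```python
def factorial_contrasts(m1, m2, m3, m4):
    """Return the three contrasts for a 2 x 2 table of population means.

    Assumed cell layout (same as the table in the text):
                 A=1   A=2
        B=1      m1    m2
        B=2      m3    m4
    """
    main_a = (m1 + m3) - (m2 + m4)       # column totals: A level 1 vs A level 2
    main_b = (m1 + m2) - (m3 + m4)       # row totals: B level 1 vs B level 2
    interaction = (m1 - m2) - (m3 - m4)  # does the A difference change with B?
    return {"A": main_a, "B": main_b, "AxB": interaction}


# Population means from example (1)
example_1 = factorial_contrasts(5, 10, 10, 15)
for effect, value in example_1.items():
    status = "present" if value != 0 else "absent"
    print(f"{effect}: contrast = {value}, effect {status}")
```

A contrast of zero means the effect is absent. The same function can be applied to the means in examples (2) and (3).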
(2) What effects are present in this example?
                 Factor A
                 1      2
Factor B    1    5      10
            2    15     10
(3) What effects are present in this example?
                 Factor A
                 1      2
Factor B    1    5      10
            2    10     5
For (2) and (3), follow the procedure used above to determine what main effects and interactions exist. Based on these examples, would you say that the main effects of factor A and B are additive?
Factorial ANOVA applies to the general case of 2 or more sets of related treatments. For example, one could have a 3-factor design in which light, water, and fertilizer are manipulated to determine their individual and interactive effects on plant growth. In this case we would have hypothesis tests on 3 main effects, 3 two-way interactions, and 1 three-way interaction. Experimental designs with more factors have a rapidly increasing number of main effects and interactions to estimate and test.
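To see how quickly the number of effects grows, one can enumerate every subset of the factors. This is a small sketch, not part of SPSS; the function name list_effects is hypothetical, and the factor names are taken from the example above.

```python
from itertools import combinations


def list_effects(factors):
    """Group every main effect and interaction by order (1 = main effect)."""
    effects = {}
    for order in range(1, len(factors) + 1):
        effects[order] = [" x ".join(combo) for combo in combinations(factors, order)]
    return effects


effects = list_effects(["light", "water", "fertilizer"])
for order, terms in effects.items():
    print(f"order {order}: {len(terms)} effect(s): {terms}")
```

For the 3-factor design this lists 3 main effects, 3 two-way interactions, and 1 three-way interaction. In general, a full factorial model with k factors has 2^k - 1 effects to test.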
PART 2: Balanced and Unbalanced Experimental Designs
In a One-way ANOVA, or in a Multi-Factor ANOVA with equal sample sizes under each combination of treatments, the main effects and interactions are orthogonal to (independent of) each other; such designs are termed balanced. Therefore, the sum-of-squares (SS) calculated for each treatment effect can be uniquely attributed to that treatment.
Experimental designs in which the different cells or treatment combinations have different numbers of observations are termed unbalanced. Experiments may be unbalanced by design, or may become unbalanced because subjects are lost from an otherwise balanced design during the course of the experiment. In unbalanced designs, the main effects and interactions are not independent of one another. For example, part of the variation attributable to factor A could also be attributed to factor B. The sources of variation in the data are said to be confounded. The consequence of such confounding is that the result of a test for a given main effect or interaction may depend on the magnitude of other main effects or interactions. Clearly, this is not a desirable property in a statistical test. Fortunately, ways have been developed to deal with the problem.
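A small numerical sketch may make the confounding concrete. The observations below are invented for illustration only: factor B shifts the response by 10, factor A has no effect at all, and the cell sizes are deliberately unequal. The function names are hypothetical.

```python
from statistics import mean

# Made-up data: B raises the response from 10 to 20, A does nothing.
# Cell sizes are unequal, so the design is unbalanced.
cells = {
    ("A1", "B1"): [10, 10, 10, 10],
    ("A1", "B2"): [20],
    ("A2", "B1"): [10],
    ("A2", "B2"): [20, 20, 20, 20],
}


def raw_mean(level):
    """Mean of all observations at one level of A, ignoring B."""
    return mean(y for (a, _), ys in cells.items() if a == level for y in ys)


def mean_of_cell_means(level):
    """Unweighted average of the cell means at one level of A."""
    return mean(mean(ys) for (a, _), ys in cells.items() if a == level)


for level in ("A1", "A2"):
    print(level, "raw mean:", raw_mean(level),
          "mean of cell means:", mean_of_cell_means(level))
```

The raw means of A1 and A2 differ (12 versus 18) even though A has no effect, because A1 was observed mostly at B1 and A2 mostly at B2. Averaging the cell means instead gives 15 for both levels. Part of the variation that appears to belong to A is really due to B.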
One approach to account for confounded sources of variation is to test each factor only after all variance that can be explained by other factors has been removed. This is referred to as the Unique SS or the Type III SS approach. The Type III SS uses only the variation that can be uniquely attributed to each effect. This is the approach that should be used in most situations.
In a second approach, called the Hierarchical approach or Type II SS, only the variation attributable to other main effects is removed prior to testing a main effect. When testing 2-way interactions, the variance attributable to main effects and other 2-way interactions is removed prior to testing. When testing higher order interactions (3-way and higher), variance attributable to main effects, lower order interactions, and interactions of the same order is removed prior to testing.
In the final method, called the Sequential or Type I SS approach, the first main effect is tested without removing any of the variance attributable to other factors; the second main effect is tested with the variance attributable to the first removed, and so forth until all effects have been tested.
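The order dependence of the Sequential (Type I) approach can be sketched in plain Python. This is an illustration under stated assumptions, not how SPSS computes its tables: the data are invented, the factors are coded -1/+1, the model contains main effects only, and the helper names (residual_ss, ss_for_a, expand) are hypothetical. The SS for factor A is computed as the drop in residual SS when A is added to the model, either first or after B.

```python
def residual_ss(columns, y):
    """Residual sum of squares after least-squares regression of y on columns."""
    k = len(columns)
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           + [sum(p * v for p, v in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    coef = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(c * column[n] for c, column in zip(coef, columns))
              for n in range(len(y))]
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def ss_for_a(rows):
    """SS for factor A when entered before B and when entered after B."""
    y = [float(r[2]) for r in rows]
    one = [1.0] * len(rows)
    a = [float(r[0]) for r in rows]  # -1/+1 code for factor A
    b = [float(r[1]) for r in rows]  # -1/+1 code for factor B
    a_first = residual_ss([one], y) - residual_ss([one, a], y)
    a_after_b = residual_ss([one, b], y) - residual_ss([one, b, a], y)
    return a_first, a_after_b


def expand(cells):
    """Turn {(a_code, b_code): [observations]} into (a, b, y) rows."""
    return [(a, b, obs) for (a, b), ys in cells.items() for obs in ys]


# Made-up data sets: equal cell sizes versus unequal cell sizes.
balanced = expand({(-1, -1): [9, 11], (-1, 1): [19, 21],
                   (1, -1): [12, 14], (1, 1): [22, 24]})
unbalanced = expand({(-1, -1): [10, 10, 10, 10], (-1, 1): [20],
                     (1, -1): [10], (1, 1): [20, 20, 20, 20]})

for name, rows in (("balanced", balanced), ("unbalanced", unbalanced)):
    first, after = ss_for_a(rows)
    print(f"{name}: SS(A) entered first = {first:.2f}, entered after B = {after:.2f}")
```

With equal cell sizes the SS for A is the same whichever order is used. With unequal cell sizes the two values differ sharply, which is why the order of the factors matters for a Type I analysis of an unbalanced design.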
In SPSS, one can perform Multi-factor ANOVA with the General Linear Model - Univariate command in the Analyze Menu. Click on the Model button to select the Type III SS, Type II SS, or Type I SS.
PART 3: Fixed and Random Effects Factors
Another complication arises in Multi-factor ANOVA as a result of the nature of the factor(s) included in a particular experiment. If, for a particular factor, one includes all the relevant levels of that factor in the experiment, then the factor is said to be a fixed effect factor. For example, for the factor sex, if both males and females are included in the experiment, then all the relevant levels have been included. Furthermore, for a factor such as water level, which could take on a wide range of values, if values are selected and included in an effort to represent the entire range of relevant levels, then such a factor is also considered a fixed effect factor.
On the other hand, if the specific levels of a factor that are included in an experiment represent a random sample from a very large number of possible levels, then such a factor is called a random effects factor. An example of a random effects factor might be species or location, if the particular species or locations included in an experiment are chosen at random from a large list of possible species and locations. In experiments that include random effects factors, the effects of the factors are not independent of the particular randomly selected subset of factor levels included in the experiment, so the estimation of treatment effects and the hypothesis tests differ from those in experimental designs with only fixed factors. However, for random effects factors one can make inferences to those levels of the factors not included in the experiment. In the example mentioned above, this would allow us to make inferences to species or sites that we did not sample, as long as they had been at equal risk of being included in the experiment.
Further Instructions for Lab 7
In SPSS, when performing any ANOVA that is more complicated than a One Way ANOVA, one selects the General Linear Model procedure from the Analyze Menu. From the submenu, choose Univariate whenever you are performing an ANOVA in which all factors are between-subject factors.
The window for the Univariate General Linear Model will have a list of your variables on the left side of the window, and a series of boxes into which you insert your single response variable and the names of each of the variables used to define your factors. As in an independent groups t-test and a single factor ANOVA, factor levels must be defined by integer codes inserted into the data file.
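If your raw data record factor levels as text labels, the integer codes can be produced with a short script before the data are entered in SPSS. This is only a sketch; the function name integer_codes is hypothetical, and the labels come from the TREATMENT example in Part 1.

```python
def integer_codes(labels):
    """Assign 1, 2, 3, ... to factor levels in order of first appearance."""
    codes = {}
    for label in labels:
        codes.setdefault(label, len(codes) + 1)
    return codes


treatment = ["Burned", "Un-Burned", "Burned", "Un-Burned"]
codes = integer_codes(treatment)
coded_column = [codes[label] for label in treatment]
print(codes)
print(coded_column)
```

Keep a record of which code stands for which level so that the output can be interpreted later.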
Since SPSS does not handle mixed models well, insert all your factor names into the box for Fixed Factors. This means that, if you have random factors in your ANOVA design, you will have to choose the correct mean square for the denominators of the F-ratios and calculate the F-ratios by hand. Ignore the Random Factors box, the Covariate box, and the WLS Weight box.
Of the other buttons, I routinely use the Model, Plots, and Options buttons. The Model button is where you can specify something other than a full factorial model, for example, if you did not want to estimate the interaction effects. This is not something we will do this semester. However, the Model button is also where you choose which type of sum-of-squares you want to use. The default is Type III SS, which is what I recommend as well. To perform a Type I or Type II analysis, you need to open this sub-window and make your selection.
Under the Plots sub-window, you can get SPSS to produce crude plots of the means of all the treatments. These are not great graphs, but they are useful for interpreting the treatment effects, particularly the interaction. For a 2-factor design, decide which factor will have its means arrayed across the x-axis and which will be displayed as separate lines. Then click the factor names over to the respective boxes. If you have a 3-factor design, then the levels of the 3rd factor need to be displayed in separate graphs. After clicking over the factor names you wish to graph, click the “Add” button. Click “Continue” to return to the next higher sub-window.
For the Options button, I usually request the descriptive statistics, which will contain the treatment means and their standard errors. Click over the main effects (V1 and V2) and interactions (V1*V2) to get those treatment means. Click “Continue” to return to the next higher sub-window.
Other buttons are not necessary for this lab, but feel free to check them out to see what they do.
Click “OK” to run the analysis.
Depending on which Options and Plots you selected, you may get a variety of output. The first table should just summarize the factor levels and their respective sample sizes, the second table should be a table of descriptive statistics (if you asked for it), and the third table will be the ANOVA table. The key part of the output is the ANOVA table, which will appear in a pivot table titled “Tests of Between-Subjects Effects.” This table will have the mean squares, degrees of freedom, F-ratios, and p-values for all hypothesis tests involving the between-subjects factors (in this case, all the factors). Remember, if you have a random factor in your design, then you can use the mean squares in the ANOVA table, but you must recalculate some of the F-ratios and their significance levels by using the appropriate mean square in the denominator of the F-ratios. SPSS only fits the full fixed effects model or the full random effects model (if you put your factors in the “Random Effects Box” when setting up the analysis). After the ANOVA table will appear a section called “Profile Plots,” which will contain any plots of treatment means you requested.
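Recalculating an F-ratio by hand is simple arithmetic once the correct denominator is known. The sketch below assumes one common textbook case that is not spelled out in this handout: a two-factor design with A fixed and B random (the restricted mixed model), in which A is tested over the A x B interaction and the other effects are tested over the error mean square. The mean squares and degrees of freedom are invented numbers standing in for values copied from an ANOVA table. Check your text for the denominators that apply to your own design.

```python
# Hypothetical mean squares and degrees of freedom copied from an ANOVA table.
ms = {"A": 120.0, "B": 45.0, "AxB": 30.0, "Error": 10.0}
df = {"A": 2, "B": 3, "AxB": 6, "Error": 24}

# Assumed design: A fixed, B random (restricted mixed model).
denominator = {"A": "AxB", "B": "Error", "AxB": "Error"}

f_ratios = {}
for effect, denom in denominator.items():
    f_ratios[effect] = ms[effect] / ms[denom]
    print(f"F({effect}) = MS_{effect} / MS_{denom} = {f_ratios[effect]:.2f} "
          f"on {df[effect]} and {df[denom]} df")
```

The significance level for each recalculated F-ratio must then be looked up with the numerator and denominator degrees of freedom shown.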
Interpretation of the output is relatively straightforward.
LAB 7 Assignment
A) A text data file for the plant growth experiment described above, called PLGR1 (plgr1.dat or the equivalent SPSS file), has been created. Each case has three variables: 1) the fertilizer treatment (1 through 3), 2) the light treatment (1 and 2), and 3) the dry weight of the plant after 10 weeks of growth. Use SPSS to analyze the data. Specify: a) the hypotheses tested, b) the results of the hypothesis tests, and c) your interpretation of the results with respect to the experiment.
B) Graph the means for each treatment combination in the experiment. Does the graphical analysis corroborate the statistical analysis?
C) Repeat the analysis using either the Type I SS or the Type II SS approach. Report on any differences from A).
D) Another researcher attempted to replicate the experiment. Unfortunately, during the final phase of the experiment a raccoon got into the plot and ate half the plants. The data from that experiment are in the file PLGR2 (plgr2.dat or the equivalent SPSS file). Analyze the results of this experiment using Type I, II, and III SS. Compare the results to each other.
E) Repeat D), reversing the order in which the factors appear in the Fixed Factors box. Compare the results to those of D). Which method for dealing with unbalanced designs would be best? Which requires the largest number of arbitrary decisions?