Nested Designs
I. Why
Sometimes it is difficult or impossible to cross all of the factors of interest in an experimental design. The experimenter may then choose to use a fractional factorial design or a Latin square. If each level of one factor is paired with only one level of another factor, then the first factor is said to be nested within the second factor.
II. Examples
A. The effect of school district policies on individual teachers. Teachers are nested within school districts: T(D).
District      A          B          C
Teacher     T1  T2     T3  T4     T5  T6
If this were a completely crossed design, each teacher would work in each district; i.e., all of the cells in the following table would be filled. As it is, there are observations in only 6 cells (marked x).
          District
        A     B     C
T1      x
T2      x
T3            x
T4            x
T5                  x
T6                  x
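The definition of nesting can be checked mechanically. The following is a minimal sketch, not part of the original notes: the function name `is_nested` and the list-of-pairs representation are assumptions made for illustration, and the crossed layout is a hypothetical contrast case.

```python
from collections import defaultdict

def is_nested(pairs):
    """Return True if every level of the first factor is paired with
    exactly one level of the second factor (first nested in second)."""
    seen = defaultdict(set)
    for inner, outer in pairs:
        seen[inner].add(outer)
    return all(len(outers) == 1 for outers in seen.values())

# Teacher/district pairings from example A: T(D)
nested_pairs = [("T1", "A"), ("T2", "A"), ("T3", "B"),
                ("T4", "B"), ("T5", "C"), ("T6", "C")]

# Hypothetical fully crossed layout: each teacher works in each district
crossed_pairs = [(t, d) for t in ("T1", "T2") for d in ("A", "B", "C")]

print(is_nested(nested_pairs))   # True
print(is_nested(crossed_pairs))  # False
```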
Note that this simple nested design is equivalent to a one-way ANOVA [S(A)].
Structural model: TR(DF).
Expected mean square table:
Source      df         E(MS)               Error
1. D        d-1        tσ²_D + σ²_T(D)     2
2. T(D)     d(t-1)     σ²_T(D)
B. The effect of teacher training in different school districts on students' accomplishments.
District        A             B             C
Teacher      T1    T2      T3    T4      T5    T6
             S1    S4      S7    S10     S13   S16
Students     S2    S5      S8    S11     S14   S17
             S3    S6      S9    S12     S15   S18
Structural Model: SR(TR(DF))
Cornfield-Tukey Table
              d       t       s
D             D_d     t       n
T(D)          1       D_t     n
S(T(D))       1       1       D_n
(D_i = 0 if factor i is fixed, 1 if it is random.)
Expected Mean Square Table
Source        df          E(MS)                               Error
1. D          d-1         ntσ²_D + nσ²_T(D) + σ²_S(T(D))      2
2. T(D)       d(t-1)      nσ²_T(D) + σ²_S(T(D))               3
3. S(T(D))    dt(n-1)     σ²_S(T(D))
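The coefficients in the expected mean square table can be reproduced from the Cornfield-Tukey table with a short script. This is a sketch of the usual rule (cover the columns of the source's own subscripts, then multiply the visible entries in every row that contains all of the source's subscripts), not code from the original notes. The function name, the dictionary layout, and the level counts d = 3, t = 2, n = 3 (read off the layout of example B) are assumptions made for illustration.

```python
def expected_mean_squares(factors, sources):
    """factors: {subscript: (number_of_levels, is_random)}
    sources: {label: (live_subscripts, nested_subscripts)}
    Returns {source: {variance_component: coefficient}}."""
    def entry(row, col):
        live, nested = sources[row]
        levels, is_random = factors[col]
        if col in live:
            return 1 if is_random else 0   # D_col: 1 if random, 0 if fixed
        if col in nested:
            return 1                       # bracketed subscript
        return levels                      # subscript absent from this row

    ems = {}
    for src, (live, nested) in sources.items():
        needed = set(live) | set(nested)
        terms = {}
        for row, (r_live, r_nested) in sources.items():
            # Only rows containing all of the source's subscripts contribute
            if not needed <= (set(r_live) | set(r_nested)):
                continue
            coef = 1
            for col in factors:
                if col not in live:        # the source's live columns are covered
                    coef *= entry(row, col)
            if coef:
                terms[row] = coef
        ems[src] = terms
    return ems

# Example B: districts fixed, teachers and students random (assumed level counts)
factors = {"d": (3, False), "t": (2, True), "s": (3, True)}
sources = {
    "D": ({"d"}, set()),
    "T(D)": ({"t"}, {"d"}),
    "S(T(D))": ({"s"}, {"t", "d"}),
}
ems = expected_mean_squares(factors, sources)
for source, terms in ems.items():
    print(source, terms)
```

With these level counts the D row comes out with coefficients nt = 6, n = 3, and 1, matching the pattern ntσ²_D + nσ²_T(D) + σ²_S(T(D)).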
III. Calculations for Nested Designs
General case with subjects nested within B nested within A: S(B(A)).
            A1                       ...    Ai      ...    Aa     Means
B1          y111 ... y11k ... y11n                                Y11.
B2          y121 ... y12k ... y12n                                Y12.
...                                         yijk                  Yij.
Bb
Means       Y1..                            Yi..                  Y...
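\[ SS_T = SS_A + SS_{B(A)} + SS_{S(B(A))} \]
\[ SS_{B(A)} = n \sum_{j(i)} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})^2 \quad \text{(summed over each cell } i,j \text{ with data)} \]
\[ SS_A = nb \sum_{i} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \]
\[ SS_{S(B(A))} = \sum_{i} \sum_{j(i)} \sum_{k} (y_{ijk} - \bar{Y}_{ij\cdot})^2 \]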
IV. Example: Single Nesting
A. BR(AF)
             A1    A2    Mean (Bj)
B1            1           1
B2            3           3
B3                  9     9
B4                  3     3
Mean (Ai)     2     6     4
SSA = 2*1*[(2-4)² + (6-4)²] = 16
SSB(A) = 1*[(1-2)² + (3-2)² + (9-6)² + (3-6)²] = 20
SST = [(1-4)² + (3-4)² + (9-4)² + (3-4)²] = 36
SSS(B(A)) = SST - SSA - SSB(A) = 0
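The single-nesting calculations can be verified with a few lines of code. The sketch below applies the Section III formulas to the example data. It assumes a balanced design, and the function name and nested-dictionary layout are illustrative choices rather than part of the original notes.

```python
def nested_sums_of_squares(data):
    """data[a][b] is the list of scores for level b of B nested in level a of A.
    Assumes a balanced design: same number of B levels per A, same n per cell."""
    cell_means = {a: {b: sum(ys) / len(ys) for b, ys in cells.items()}
                  for a, cells in data.items()}
    a_means = {a: sum(m.values()) / len(m) for a, m in cell_means.items()}
    grand = sum(a_means.values()) / len(a_means)
    first_a = next(iter(data.values()))
    b = len(first_a)                       # B levels within each A
    n = len(next(iter(first_a.values())))  # scores per cell

    ss_a = n * b * sum((m - grand) ** 2 for m in a_means.values())
    ss_b_in_a = n * sum((cell_means[a][j] - a_means[a]) ** 2
                        for a in data for j in data[a])
    ss_within = sum((y - cell_means[a][j]) ** 2
                    for a in data for j in data[a] for y in data[a][j])
    ss_total = sum((y - grand) ** 2
                   for a in data for j in data[a] for y in data[a][j])
    return {"A": ss_a, "B(A)": ss_b_in_a, "S(B(A))": ss_within, "Total": ss_total}

# Example data: B1, B2 nested in A1; B3, B4 nested in A2; one score per cell
example = {"A1": {"B1": [1], "B2": [3]}, "A2": {"B3": [9], "B4": [3]}}
ss = nested_sums_of_squares(example)
print(ss)  # {'A': 16.0, 'B(A)': 20.0, 'S(B(A))': 0.0, 'Total': 36.0}
```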
Note that if A and B were crossed, we would have:
             A1    A2    Mean (Bj)
B1            1     9     5
B2            3     3     3
Mean (Ai)     2     6     4
SSA = 2*1*[(2-4)² + (6-4)²] = 16
SSB = 2*1*[(5-4)² + (3-4)²] = 4
SSAB = [1*1 + 1*3 - 1*3 - 1*9]² / 4 = 16
SST = 36
SSB(A) = SSB + SSAB:  20 = 4 + 16
SSB(A) confounds the effect of B with the A x B interaction.
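The confounding can be confirmed numerically by treating the same four scores as a crossed 2 x 2 layout. This is a minimal sketch assuming one score per cell. The function name is hypothetical, and the interaction is computed from cell residuals rather than from the contrast used above, which gives the same value here.

```python
def crossed_sums_of_squares(table):
    """table[i][j]: the single score for row level B_i and column level A_j."""
    rows, cols = len(table), len(table[0])
    grand = sum(sum(r) for r in table) / (rows * cols)
    row_means = [sum(r) / cols for r in table]
    col_means = [sum(table[i][j] for i in range(rows)) / rows
                 for j in range(cols)]
    ss_a = rows * sum((m - grand) ** 2 for m in col_means)
    ss_b = cols * sum((m - grand) ** 2 for m in row_means)
    # Interaction: what is left in each cell after removing both main effects
    ss_ab = sum((table[i][j] - row_means[i] - col_means[j] + grand) ** 2
                for i in range(rows) for j in range(cols))
    return ss_a, ss_b, ss_ab

scores = [[1, 9],   # B1 under A1, A2
          [3, 3]]   # B2 under A1, A2
ss_a, ss_b, ss_ab = crossed_sums_of_squares(scores)
print(ss_a, ss_b, ss_ab)   # 16.0 4.0 16.0
print(ss_b + ss_ab)        # 20.0, the value of SSB(A) in the nested analysis
```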
V. Ignoring Grouping Variables
A. What if subjects are run in groups and each group is exposed to one of two experimental conditions?
1. True model is: SR(GR(TF))
Source        df          E(MS)                               Error Line
1. T          t-1         ngσ²_T + nσ²_G(T) + σ²_S(G(T))      2
2. G(T)       t(g-1)      nσ²_G(T) + σ²_S(G(T))               3
3. S(G(T))    gt(n-1)     σ²_S(G(T))
2. But if the data are analyzed as a one-way ANOVA, SR(TF):
Source     df         Apparent E(MS)       Error    True E(MS)
1. T       t-1        nσ²_T + σ²_S(T)      2        ngσ²_T + nσ²_G(T) + σ²_S(G(T))
2. S(T)    t(n-1)     σ²_S(T)                       σ²_S(G(T))
Therefore the F ratio will be biased upward by nσ²_G(T), the variance due to groups.
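The source of the bias can be made explicit by taking the ratio of the true expected mean squares when there is no treatment effect. This is only a sketch: it takes the "True E(MS)" column at face value, and a ratio of expectations is not the expectation of the F ratio itself.

```latex
\[
\frac{E(MS_T)}{E(MS_{S(T)})}
  = \frac{ng\,\sigma^2_T + n\,\sigma^2_{G(T)} + \sigma^2_{S(G(T))}}
         {\sigma^2_{S(G(T))}}
  \;\overset{\sigma^2_T = 0}{=}\;
  1 + \frac{n\,\sigma^2_{G(T)}}{\sigma^2_{S(G(T))}}
  \;>\; 1
  \quad \text{whenever } \sigma^2_{G(T)} > 0 .
\]
```

With the correct error term, G(T), the corresponding ratio is exactly 1 when σ²_T = 0, because the numerator and denominator then have the same expectation.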
VI. Example: Double Nesting
A. SR(GR(TF))
Groups           Treatments
             T1      T2      T3      Means
G1           20
             18                      17.33
             14
G2           19
             20                      19.67
             20
G3                   14
                     18              15.33
                     14
G4                   12
                     12              11.00
                      9
G5                           13
                             16      14.00
                             13
G6                            9
                              4       5.67
                              4
Means:       18.5    13.17   9.83    13.83
Calculations:
SST (treatments) = 3*2*[(18.5-13.83)² + (13.17-13.83)² + (9.83-13.83)²]
SSG(T) = 3*[(17.33-18.5)² + (19.67-18.5)² + (15.33-13.17)² + (11-13.17)² + (14-9.83)² + (5.67-9.83)²]
SSS(G(T)) = (20-17.33)² + (18-17.33)² + (14-17.33)² + ...
Source       SS         df     MS         F        p
T            229.47     2      114.75     2.45     ns
G(T)         140.42     3      46.85      9.58     <.01
S(G(T))      58.68      12     4.89
Total        428.57     17
If the grouping factor had been ignored:
Source     SS         df     MS         F        p
T          229.47     2      114.75     8.65     <.01
S(T)       199.1      15     13.27
Total      428.57     17
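Both analyses can be recomputed directly from the raw scores. The sketch below is not the original computation: variable names are illustrative, and it works from unrounded means, so its sums of squares differ from the tabled values in the first or second decimal place (the tabled values appear to come from rounded means). The F ratios agree to within about 0.01.

```python
# Scores from the double-nesting example: 3 subjects per group, 2 groups per treatment
data = {
    "T1": {"G1": [20, 18, 14], "G2": [19, 20, 20]},
    "T2": {"G3": [14, 18, 14], "G4": [12, 12, 9]},
    "T3": {"G5": [13, 16, 13], "G6": [9, 4, 4]},
}
t, g, n = 3, 2, 3

def mean(values):
    values = list(values)
    return sum(values) / len(values)

group_means = {tr: {gr: mean(ys) for gr, ys in groups.items()}
               for tr, groups in data.items()}
treat_means = {tr: mean(y for ys in groups.values() for y in ys)
               for tr, groups in data.items()}
grand = mean(y for groups in data.values() for ys in groups.values() for y in ys)

ss_t = n * g * sum((m - grand) ** 2 for m in treat_means.values())
ss_g = n * sum((group_means[tr][gr] - treat_means[tr]) ** 2
               for tr in data for gr in data[tr])
ss_s = sum((y - group_means[tr][gr]) ** 2
           for tr in data for gr in data[tr] for y in data[tr][gr])

df_t, df_g, df_s = t - 1, t * (g - 1), t * g * (n - 1)
ms_t, ms_g, ms_s = ss_t / df_t, ss_g / df_g, ss_s / df_s

# Nested analysis: T is tested against G(T), and G(T) against S(G(T))
f_t = ms_t / ms_g
f_g = ms_g / ms_s

# Grouping ignored: G(T) and S(G(T)) are lumped into a single S(T) error term
ms_st = (ss_g + ss_s) / (df_g + df_s)
f_t_ignoring_groups = ms_t / ms_st

print(round(f_t, 2), round(f_g, 2), round(f_t_ignoring_groups, 2))
```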
Ignoring the grouping factor can lead to spuriously significant results. Why? Plot the data.
B. Because the number of groups is likely to be small, the df for the denominator in a double nested design is likely to be very small; hence, one will need a large F (around 10 in the above example) to achieve significance. However, if the MS of S(G(T)) and the MS of G(T) are close, i.e., if the F ratio comparing G(T) to S(G(T)) is non-significant and less than 2, then one can pool the G(T) and S(G(T)) sums of squares and degrees of freedom and use the pooled MS in F ratios. This general rule is good for pooling interaction terms as well (cf. Green & Tukey (1960), Psychometrika).
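The pooling rule can be written as a small helper. This is a sketch under stated assumptions: the function name and arguments are hypothetical, and whether the G(T) test is significant is passed in by the caller, since looking up the F distribution is outside the scope of this example. The second call uses made-up sums of squares purely to show the pooled branch.

```python
def error_term_for_treatments(ss_g, df_g, ss_s, df_s, g_significant, f_cutoff=2.0):
    """Choose the error term for testing T.
    Pools G(T) and S(G(T)) only if the G(T) test is non-significant
    and its F ratio is below f_cutoff (2 in the rule above).
    Returns (error_ms, error_df, pooled)."""
    ms_g = ss_g / df_g
    ms_s = ss_s / df_s
    f_g = ms_g / ms_s
    if not g_significant and f_g < f_cutoff:
        return (ss_g + ss_s) / (df_g + df_s), df_g + df_s, True
    return ms_g, df_g, False

# Values from the double-nesting table: F for G(T) is 9.58 and significant,
# so G(T) stays the error term and nothing is pooled.
print(error_term_for_treatments(140.42, 3, 58.68, 12, g_significant=True))

# Hypothetical values where the rule allows pooling (F = 5 / 4 = 1.25)
print(error_term_for_treatments(15.0, 3, 48.0, 12, g_significant=False))
```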