Nested Designs

I. Why

Sometimes it is difficult or impossible to cross all of the factors of interest in an experimental design. The experimenter may then choose a fractional factorial design or a Latin square. Alternatively, one factor may be nested within another: if each level of one factor is paired with only one level of another factor, then the first factor is said to be nested within the second factor.

II. Examples

A. The effect on individual teachers of school district policies. Teachers are nested within school districts: T(D).

District    A         B         C
Teacher     T1  T2    T3  T4    T5  T6

If this were a completely crossed design, each teacher would work in each district; i.e., the cells in the following table would be filled. As it is, there are observations in only 6 (see x's) cells.

        District
        A    B    C
T1      x
T2      x
T3           x
T4           x
T5                x
T6                x

Note that this simple nested design is equivalent to a one-way ANOVA [S(A)].
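The definition of nesting can be checked mechanically from a list of (teacher, district) pairs. The following is only an illustrative sketch: the function name `is_nested` and the pair-list layout are assumptions, and the teacher-to-district assignment is read off the table above (two teachers per district).

```python
from collections import defaultdict

def is_nested(pairs):
    """Return True if every level of the first factor occurs with
    exactly one level of the second factor (first nested in second)."""
    seen = defaultdict(set)
    for inner, outer in pairs:
        seen[inner].add(outer)
    return all(len(outers) == 1 for outers in seen.values())

# Teachers within districts, T(D): each teacher works in one district only.
teacher_district = [("T1", "A"), ("T2", "A"), ("T3", "B"),
                    ("T4", "B"), ("T5", "C"), ("T6", "C")]

# A crossed layout would pair every teacher with every district.
crossed = [(t, d) for t in ("T1", "T2") for d in ("A", "B", "C")]

print(is_nested(teacher_district))  # True
print(is_nested(crossed))           # False
```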

Structural model: T^R(D^F). Expected mean square table:

Source      df        E(MS)               Error
1. D        d-1       tσ²_D + σ²_T(D)     2
2. T(D)     d(t-1)    σ²_T(D)

B. The effect of teacher training in different school districts on students' accomplishments.

District    A                 B                 C
Teacher     T1      T2        T3      T4        T5      T6
Students    S1      S4        S7      S10       S13     S16
            S2      S5        S8      S11       S14     S17
            S3      S6        S9      S12       S15     S18

Structural Model: S^R(T^R(D^F))

Cornfield-Tukey Table

	d	t	s
D	Dt	t	n
T(D)	1	Dt	n
S(T(D))	1	1	Dn

Expected Mean Square Table

Source         df         E(MS)                              Error
1. D           d-1        ntσ²_D + nσ²_T(D) + σ²_S(T(D))     2
2. T(D)        d(t-1)     nσ²_T(D) + σ²_S(T(D))              3
3. S(T(D))     dt(n-1)    σ²_S(T(D))
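The df column and the error-line column of this table can be sketched in a few lines. The helper names `nested_df`, `ERROR_LINE`, and `f_ratio` are assumptions made for illustration, and the mean squares in the usage lines are hypothetical numbers, not data from these notes.

```python
def nested_df(d, t, n):
    """Degrees of freedom for a balanced S(T(D)) design with d districts,
    t teachers per district, and n students per teacher."""
    return {
        "D": d - 1,
        "T(D)": d * (t - 1),
        "S(T(D))": d * t * (n - 1),
    }

# Error line for each testable source, as listed in the table.
ERROR_LINE = {"D": "T(D)", "T(D)": "S(T(D))"}

def f_ratio(ms, source):
    """F ratio for `source`: its mean square over that of its error line."""
    return ms[source] / ms[ERROR_LINE[source]]

df = nested_df(d=3, t=2, n=3)      # the 3-district, 6-teacher layout
total_df = sum(df.values())        # equals d*t*n - 1
print(df, total_df)

# Hypothetical mean squares, only to show which line is the denominator.
ms = {"D": 30.0, "T(D)": 10.0, "S(T(D))": 2.0}
print(f_ratio(ms, "D"), f_ratio(ms, "T(D)"))
```

Note that D is tested against T(D), not against the within-teacher term, which is why its denominator has only d(t-1) degrees of freedom.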

III. Calculations for Nested Designs

General case with subjects nested within B, which is nested within A: S(B(A)).

	A1	...	Ai	...	Aa	Means
B1	y_111 ... y_11k ... y_11n					Ȳ_11.
B2	y_121 ... y_12k ... y_12n					Ȳ_12.
...
Bj			y_ijk			Ȳ_ij.
...
Bb
Means	Ȳ_1..		Ȳ_i..			Ȳ_...

SS_T= SS_A + SS_B(A) + SS_S(B(A))

SS_B(A) = n Σ_j(i) (Ȳ_ij. - Ȳ_i..)², summed over each cell i,j with data

SS_A = nb Σ_i (Ȳ_i.. - Ȳ_...)²

SS_S(B(A)) = Σ_ijk (y_ijk - Ȳ_ij.)²
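These three sums of squares translate directly into code. The sketch below assumes a balanced design (the same number of B levels in every A level and the same n in every cell); the function name `nested_ss`, the nested-list layout, and the small data set are hypothetical.

```python
def _mean(xs):
    return sum(xs) / len(xs)

def nested_ss(data):
    """Sums of squares for a balanced S(B(A)) design.

    data[i][j] is the list of n scores for level j of B within level i of A.
    Returns (SS_A, SS_B(A), SS_S(B(A)), SS_total).
    """
    a = len(data)
    b = len(data[0])
    n = len(data[0][0])

    cell_means = [[_mean(cell) for cell in level] for level in data]
    a_means = [_mean(row) for row in cell_means]   # valid because balanced
    grand = _mean(a_means)

    ss_a = n * b * sum((m - grand) ** 2 for m in a_means)
    ss_b_a = n * sum((cell_means[i][j] - a_means[i]) ** 2
                     for i in range(a) for j in range(b))
    ss_s = sum((y - cell_means[i][j]) ** 2
               for i in range(a) for j in range(b) for y in data[i][j])
    ss_total = sum((y - grand) ** 2
                   for level in data for cell in level for y in cell)
    return ss_a, ss_b_a, ss_s, ss_total

# Hypothetical data: a = 2, b = 2 within each A, n = 2.
data = [[[1, 3], [5, 7]],
        [[2, 4], [6, 12]]]
ss_a, ss_b_a, ss_s, ss_total = nested_ss(data)
print(ss_a, ss_b_a, ss_s, ss_total)
```

The three components should add up to the total, which is a convenient check on any hand calculation.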

IV. Example: Single Nesting

A. B^R(A^F)

        A₁    A₂    B_j
B₁      1           1
B₂      3           3
B₃            9     9
B₄            3     3
A_i     2     6     4

SS_A= 2*1*[(2-4)² +(6-4)²] = 16

SS_B(A)= 1*[(1-2)² + (3-2)² + (9-6)² + (3-6)²] = 20

SS_T = [(1-4)² + (3-4)² + (9-4)² + (3-4)²] = 36

SS_S(B(A)) = SS_T- SS_A - SS_B(A) = 0
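The hand calculation can be reproduced with a short script. This is a minimal sketch: the dictionary layout and variable names are assumptions, with B₁ and B₂ placed in A₁ and B₃ and B₄ in A₂, one score per B level (n = 1).

```python
# One score per B level (n = 1); two B levels inside each A level (b = 2).
scores = {"A1": [1, 3], "A2": [9, 3]}
n = 1
b = 2

def mean(xs):
    return sum(xs) / len(xs)

a_means = {a: mean(ys) for a, ys in scores.items()}
grand = mean([y for ys in scores.values() for y in ys])

ss_a = n * b * sum((m - grand) ** 2 for m in a_means.values())
ss_b_in_a = n * sum((y - a_means[a]) ** 2
                    for a, ys in scores.items() for y in ys)
ss_total = sum((y - grand) ** 2 for ys in scores.values() for y in ys)
ss_s = ss_total - ss_a - ss_b_in_a   # nothing left over when n = 1

print(ss_a, ss_b_in_a, ss_total, ss_s)  # 16.0 20.0 36.0 0.0
```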

Note that if A and B were crossed, we would have:

        A₁    A₂    B_j
B₁      1     9     5
B₂      3     3     3
A_i     2     6     4

SS_A = 2*1*[(2-4)² + (6-4)²]= 16

SS_B = 2*1*[(5-4)² + (3-4)²]= 4

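SS_AB = [1*1 + 1*3 - 1*3 - 1*9]² / 4 = 64/4 = 16 (the squared interaction contrast divided by the sum of its squared coefficients, with n = 1)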

SS_T= 36

SS_B(A) = SS_B + SS_AB, i.e., 20 = 4 + 16.

SS_B(A) confounds the effect of B with the A × B interaction.
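The confounding can be verified numerically by analyzing the same four scores both ways. The sketch below assumes the 2 × 2 arrangement shown in the crossed table; the variable names are illustrative only.

```python
# Same four scores arranged as a 2 x 2 crossed table, one score per cell.
cells = {("A1", "B1"): 1, ("A2", "B1"): 9,
         ("A1", "B2"): 3, ("A2", "B2"): 3}
a_levels = ["A1", "A2"]
b_levels = ["B1", "B2"]

def mean(xs):
    return sum(xs) / len(xs)

grand = mean(list(cells.values()))
a_mean = {a: mean([cells[a, b] for b in b_levels]) for a in a_levels}
b_mean = {b: mean([cells[a, b] for a in a_levels]) for b in b_levels}

ss_a = len(b_levels) * sum((m - grand) ** 2 for m in a_mean.values())
ss_b = len(a_levels) * sum((m - grand) ** 2 for m in b_mean.values())
# Interaction: what remains in each cell after removing both main effects.
ss_ab = sum((y - a_mean[a] - b_mean[b] + grand) ** 2
            for (a, b), y in cells.items())

# Nested view: deviations of each cell from its own A mean.
ss_b_in_a = sum((y - a_mean[a]) ** 2 for (a, _), y in cells.items())

print(ss_a, ss_b, ss_ab, ss_b_in_a)  # 16.0 4.0 16.0 20.0
```

The nested term equals the sum of the crossed B and A × B terms, so a nested analysis cannot separate the two.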

V. Ignoring Grouping Variables

A. What if subjects are run in groups and each group is exposed to one of two experimental conditions?

1. True model is: S^R(G^R(T^F))

Source         df         E(MS)                              Error Line
1. T           t-1        ngσ²_T + nσ²_G(T) + σ²_S(G(T))     2
2. G(T)        t(g-1)     nσ²_G(T) + σ²_S(G(T))              3
3. S(G(T))     gt(n-1)    σ²_S(G(T))

2. But if the data are analyzed as a one-way ANOVA -- S^R(T^F)

Source      df        Apparent E(MS)        Error    True E(MS)
1. T        t-1       nσ²_T + σ²_S(T)       2        ngσ²_T + nσ²_G(T) + σ²_S(G(T))
2. S(T)     t(n-1)    σ²_S(T)                        σ²_S(G(T))

Therefore the F ratio for T will be biased upward by nσ²_G(T), the variance due to groups, which inflates the numerator without a matching term in the denominator.
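A small Monte Carlo run makes the bias concrete. This is a hedged sketch, not part of the original notes: the design sizes (t = 3, g = 2, n = 3), the standard deviations, the seed, and the function name `simulate_f` are all assumptions, and the critical values are the tabled .05 points of F(2, 15) and F(2, 3). Data are generated with no treatment effect, so every rejection is a Type I error.

```python
import random

def simulate_f(t=3, g=2, n=3, sd_group=2.0, sd_subject=1.0, rng=random):
    """Simulate one data set with NO treatment effect and return
    (F for T ignoring groups, F for T tested against G(T))."""
    data = [[[0.0] * n for _ in range(g)] for _ in range(t)]
    for i in range(t):
        for j in range(g):
            group_effect = rng.gauss(0.0, sd_group)   # shared by the group
            data[i][j] = [group_effect + rng.gauss(0.0, sd_subject)
                          for _ in range(n)]

    def mean(xs):
        return sum(xs) / len(xs)

    group_means = [[mean(grp) for grp in trt] for trt in data]
    trt_means = [mean(gm) for gm in group_means]
    grand = mean(trt_means)

    ss_t = n * g * sum((m - grand) ** 2 for m in trt_means)
    ss_g = n * sum((group_means[i][j] - trt_means[i]) ** 2
                   for i in range(t) for j in range(g))
    ss_s = sum((y - group_means[i][j]) ** 2
               for i in range(t) for j in range(g) for y in data[i][j])

    ms_t = ss_t / (t - 1)
    ms_g = ss_g / (t * (g - 1))
    ms_s_t = (ss_g + ss_s) / (t * (n * g - 1))   # one-way error term
    return ms_t / ms_s_t, ms_t / ms_g

rng = random.Random(0)
CRIT_ONE_WAY = 3.68   # F(.05; 2, 15)
CRIT_NESTED = 9.55    # F(.05; 2, 3)
reps = 2000
results = [simulate_f(rng=rng) for _ in range(reps)]
rate_ignored = sum(f1 > CRIT_ONE_WAY for f1, _ in results) / reps
rate_nested = sum(f2 > CRIT_NESTED for _, f2 in results) / reps
print(rate_ignored, rate_nested)
```

Under these assumed settings the one-way analysis should reject far more often than the nominal 5%, while the nested analysis should stay near it.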

VI. Example: Double Nesting

A. S^R(G^R(T^F))

Groups    Treatments
          T1            T2            T3            Means
G1        20 18 14                                  17.33
G2        19 20 20                                  19.67
G3                      14 18 14                    15.33
G4                      12 12  9                    11.00
G5                                    13 16 13      14.00
G6                                     9  4  4       5.67
Means:    18.5          13.17         9.83          13.83

Calculations:

SS_T = 3*2*[(18.5-13.83)² + (13.17-13.83)² + (9.83-13.83)²] = 229.47

SS_G(T) = 3*[(17.33-18.5)² + (19.67-18.5)² + (15.33-13.17)² + (11-13.17)² + (14-9.83)² + (5.67-9.83)²] = 140.42

SS_S(G(T)) = (20-17.33)² + (18-17.33)² + (14-17.33)² + ... = 58.68

Source       SS        df    MS        F       p
T            229.47    2     114.75    2.45    ns
G(T)         140.42    3     46.85     9.58    <.01
S(G(T))      58.68     12    4.89
Total        428.57    17

If grouping factor had been ignored:

Source       SS        df    MS        F       p
T            229.47    2     114.75    8.65    <.01
S(T)         199.1     15    13.27
Total        428.57    17

Ignoring the grouping factor can lead to spuriously significant results. Why? Plot the data.
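Both analyses can be reproduced with a short script. This is a sketch under stated assumptions: the third score in each group is inferred from the printed group means with n = 3 (for G1 the value 14 also appears in the SS_S(G(T)) formula), and the variable names are illustrative. Exact arithmetic gives sums of squares of roughly 229.33, 140.50, and 58.67, slightly different from the table above, which appears to have been computed from means rounded to two decimals.

```python
# scores[treatment] -> list of groups -> list of n = 3 subject scores.
# ASSUMPTION: the third score per group is reconstructed from the group mean.
scores = {
    "T1": [[20, 18, 14], [19, 20, 20]],   # G1, G2
    "T2": [[14, 18, 14], [12, 12, 9]],    # G3, G4
    "T3": [[13, 16, 13], [9, 4, 4]],      # G5, G6
}

def mean(xs):
    return sum(xs) / len(xs)

t = len(scores)             # treatments
g = len(scores["T1"])       # groups per treatment
n = len(scores["T1"][0])    # subjects per group

group_means = {trt: [mean(grp) for grp in groups]
               for trt, groups in scores.items()}
trt_means = {trt: mean(gms) for trt, gms in group_means.items()}
grand = mean(list(trt_means.values()))

ss_t = n * g * sum((m - grand) ** 2 for m in trt_means.values())
ss_g = n * sum((gm - trt_means[trt]) ** 2
               for trt, gms in group_means.items() for gm in gms)
ss_s = sum((y - mean(grp)) ** 2
           for groups in scores.values() for grp in groups for y in grp)

ms_t = ss_t / (t - 1)
ms_g = ss_g / (t * (g - 1))
ms_s = ss_s / (t * g * (n - 1))

f_t_nested = ms_t / ms_g    # T tested against G(T)
f_g = ms_g / ms_s           # G(T) tested against S(G(T))

# One-way analysis that ignores groups: G(T) and S(G(T)) are lumped together.
ms_s_t = (ss_g + ss_s) / (t * (n * g - 1))
f_t_ignored = ms_t / ms_s_t

print(round(ss_t, 2), round(ss_g, 2), round(ss_s, 2))
print(round(f_t_nested, 2), round(f_g, 2), round(f_t_ignored, 2))
```

The treatment mean square is the same in both analyses; only the denominator changes, and that alone moves the F ratio from about 2.4 to about 8.6.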

B. Because the number of groups is likely to be small, the df for the denominator in a double nested design is likely to be very small; hence, one will need a large F (around 10 in the above example) to achieve significance.

However, if the MS of S(G(T)) and the MS of G(T) are close, i.e., if the F ratio comparing G(T) to S(G(T)) is non-significant and less than 2, then one can pool the G(T) and S(G(T)) sums of squares and degrees of freedom and use the pooled MS in F ratios. This general rule is good for pooling interaction terms as well (cf. Green & Tukey, 1960, Psychometrika).
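The pooling rule can be written as a small decision function. This is a sketch only: the name `pooled_error`, its parameters, and the second set of sums of squares are hypothetical, and the significance decision is passed in by the caller rather than computed here.

```python
def pooled_error(ss_g, df_g, ss_s, df_s, g_significant, max_f=2.0):
    """Choose the error term for testing T under the pooling rule.

    Pools G(T) and S(G(T)) when F = MS_G(T) / MS_S(G(T)) is both
    non-significant and below max_f; otherwise keeps MS_G(T).
    Returns (ms_error, df_error).
    """
    ms_g = ss_g / df_g
    ms_s = ss_s / df_s
    f_groups = ms_g / ms_s
    if not g_significant and f_groups < max_f:
        return (ss_g + ss_s) / (df_g + df_s), df_g + df_s
    return ms_g, df_g

# Values from the double-nesting table: F for G(T) is 9.58, so no pooling.
ms_err, df_err = pooled_error(140.42, 3, 58.68, 12, g_significant=True)
print(ms_err, df_err)

# Hypothetical case where the group term is negligible (F = 1.2): pool.
ms_pool, df_pool = pooled_error(18.0, 3, 60.0, 12, g_significant=False)
print(ms_pool, df_pool)
```

Pooling raises the denominator df from 3 to 15 in the second case, which is the practical payoff of the rule.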