Nested Designs

 

I.  Why

 

Sometimes it is difficult or impossible to cross all of the factors of interest in an experimental design.  The experimenter may then choose to use a fractional factorial design or a Latin square.  Alternatively, one factor may be nested within another: if each level of one factor is paired with only one level of another factor, then the first factor is said to be nested within the second factor.

 

II.  Examples 

 

A.  The effect on individual teachers of school district policies.  Teachers are nested within school districts: T(D).

                                                            District

                              A                                 B                                   C             

Teacher            T1        T2                    T3        T4                    T5        T6

 

If this were a completely crossed design, each teacher would work in each district; i.e., the cells in the following table would be filled.  As it is, there are observations in only 6 (see x's) cells.

                                            District

                        A                     B                      C

T1                    x                     

T2                    x

T3                                            x

T4                                            x

T5                                                                    x

T6                                                                    x

 

Note that this simple nested design is equivalent to a one-way ANOVA [ S(A)].

 

Structural model: TR(DF).  Expected mean square table:

 

Source               df                   E(MS)                                   Error

1. D                 d-1                  $t\sigma^2_D + \sigma^2_{T(D)}$         2

2. T(D)              d(t-1)               $\sigma^2_{T(D)}$

 

B. The effect of teacher training in different school districts on students' accomplishments.

 

                                                            District

                              A                                  B                                   C            

Teacher            T1        T2                    T3        T4                    T5        T6

                        S1        S4                    S7        S10                  S13      S16

Students           S2        S5                    S8        S11                  S14      S17

                        S3        S6                    S9        S12                  S15      S18

                       

Structural Model:  SR(TR(DF))

 


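Cornfield-Tukey Table

                     d              t              s

D                    $D_d$          t              n

T(D)                 1              $D_t$          n

S(T(D))              1              1              $D_n$

(Each $D$ entry is 0 if the factor carrying that subscript is fixed and 1 if it is random.)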

 

Expected Mean Square Table

 

Source               df                   E(MS)                                                             Error

1. D                 d-1                  $nt\sigma^2_D + n\sigma^2_{T(D)} + \sigma^2_{S(T(D))}$            2

2. T(D)              d(t-1)               $n\sigma^2_{T(D)} + \sigma^2_{S(T(D))}$                           3

3. S(T(D))           dt(n-1)              $\sigma^2_{S(T(D))}$
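The bookkeeping behind this table can be sketched in a few lines of Python. This is only a minimal illustration of the Cornfield-Tukey rules for the SR(TR(DF)) model; the dictionary encoding and the function names are assumptions made for the sketch, and the level counts (3 districts, 2 teachers per district, 3 students per teacher) come from Example B.

```python
from math import prod

# Hypothetical encoding: subscript -> (number of levels, is the factor fixed?)
factors = {"d": (3, True), "t": (2, False), "s": (3, False)}

# Effect -> (live subscripts, subscripts inside parentheses)
effects = {
    "D": ({"d"}, set()),
    "T(D)": ({"t"}, {"d"}),
    "S(T(D))": ({"s"}, {"t", "d"}),
}

def table_entry(effect, subscript):
    """One cell of the Cornfield-Tukey table."""
    live, nested = effects[effect]
    levels, fixed = factors[subscript]
    if subscript in live:
        return 0 if fixed else 1   # the D entry: 0 for fixed, 1 for random
    if subscript in nested:
        return 1                   # subscript appears in parentheses
    return levels                  # otherwise the number of levels

def ems_coefficients(effect):
    """Coefficient of each variance component in E(MS) for `effect`."""
    live, nested = effects[effect]
    needed = live | nested
    coeffs = {}
    for row, (row_live, row_nested) in effects.items():
        # A row contributes only if it carries every subscript of the effect.
        if needed <= (row_live | row_nested):
            # Cover the effect's live columns and multiply what remains.
            coeffs[row] = prod(table_entry(row, c) for c in factors if c not in live)
    return coeffs

for name in effects:
    print(name, ems_coefficients(name))
```

With these level counts the coefficients for D are nt = 6, n = 3, and 1, matching line 1 of the expected mean square table.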

           

III. Calculations for Nested Designs

           

            General case with subjects nested within B, which is nested within A: S(B(A)).

                A1                          ...     Ai          ...     Aa          Means
B1              y111 ... y11k ... y11n                                              Y11.
B2              y121 ... y12k ... y12n                                              Y12.
.
Bj                                                  yijk                            Yij.
.
Bb
Means           Y1..                                Yi..                            Y...

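      SST = SSA + SSB(A) + SSS(B(A))

             SSB(A) = $n \sum_{j(i)} (\bar{Y}_{ij.} - \bar{Y}_{i..})^2$   for each cell i,j with data

             SSA = $nb \sum_{i} (\bar{Y}_{i..} - \bar{Y}_{...})^2$

             SSS(B(A)) = $\sum_{k} (y_{ijk} - \bar{Y}_{ij.})^2$   summed over every cell i,j with data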
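The three sums of squares can be computed directly from the raw scores. The following is a minimal sketch for a balanced S(B(A)) layout; the function name `nested_ss`, the nested-dictionary data format, and the small data set are all hypothetical and chosen only for illustration.

```python
def nested_ss(data):
    """Sums of squares for a balanced S(B(A)) design.

    `data` maps each A level to a dict of its own B levels,
    and each B level to a list of n scores.
    """
    all_scores = [y for groups in data.values() for ys in groups.values() for y in ys]
    grand = sum(all_scores) / len(all_scores)

    ss_a = ss_b_in_a = ss_s = 0.0
    for groups in data.values():
        a_scores = [y for ys in groups.values() for y in ys]
        a_mean = sum(a_scores) / len(a_scores)
        ss_a += len(a_scores) * (a_mean - grand) ** 2          # nb * (A mean - grand mean)^2
        for ys in groups.values():
            cell_mean = sum(ys) / len(ys)
            ss_b_in_a += len(ys) * (cell_mean - a_mean) ** 2   # n * (cell mean - A mean)^2
            ss_s += sum((y - cell_mean) ** 2 for y in ys)      # scores around their cell mean

    ss_total = sum((y - grand) ** 2 for y in all_scores)
    return {"A": ss_a, "B(A)": ss_b_in_a, "S(B(A))": ss_s, "Total": ss_total}

# Hypothetical data: 2 levels of A, 2 levels of B inside each, n = 2 scores per cell.
example = {
    "A1": {"B1": [1, 3], "B2": [5, 7]},
    "A2": {"B3": [2, 4], "B4": [10, 12]},
}
result = nested_ss(example)
print(result)
```

For this data set the three components (18, 80, and 8) add up to the total of 106, as the identity SST = SSA + SSB(A) + SSS(B(A)) requires.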

 


IV.  Example: Single Nesting

 

            A. BR(AF)

                            A1                           A2                           Mean (Bj)

                B1            1                                                              1

                B2            3                                                              3

                B3                                            9                              9

                B4                                            3                              3

                Mean (Ai)     2                              6                              4

 

            SSA = $2 \cdot 1 \cdot [(2-4)^2 + (6-4)^2] = 16$

            SSB(A) = $1 \cdot [(1-2)^2 + (3-2)^2 + (9-6)^2 + (3-6)^2] = 20$

            SST = $(1-4)^2 + (3-4)^2 + (9-4)^2 + (3-4)^2 = 36$

            SSS(B(A)) = SST - SSA - SSB(A) = 36 - 16 - 20 = 0

 

            Note that if A and B were crossed, we would have:

                            A1                           A2           Mean (Bj)

                B1            1                              9              5

                B2            3                              3              3

                Mean (Ai)     2                              6              4

 

            SSA = $2 \cdot 1 \cdot [(2-4)^2 + (6-4)^2] = 16$

            SSB = $2 \cdot 1 \cdot [(5-4)^2 + (3-4)^2] = 4$

            SSAB = $\frac{[1 \cdot 1 + 1 \cdot 3 - 1 \cdot 3 - 1 \cdot 9]^2}{4} = 16$

            SST = 36

 

            SSB(A) = SSB + SSAB:    20 = 4 + 16

           

            SSB(A) confounds the effect of B with the A x B interaction.
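The decomposition can be checked numerically. The sketch below uses the four cell values from the example (n = 1 per cell); the variable names are illustrative only.

```python
# Cell values from the example, indexed [B row][A column]; one score per cell.
cells = [[1, 9],
         [3, 3]]

a_means = [(cells[0][j] + cells[1][j]) / 2 for j in range(2)]   # column (A) means
b_means = [sum(row) / 2 for row in cells]                       # row (B) means
grand = sum(map(sum, cells)) / 4

# Crossed analysis: main effects plus interaction residuals.
ss_a = 2 * sum((m - grand) ** 2 for m in a_means)
ss_b = 2 * sum((m - grand) ** 2 for m in b_means)
ss_ab = sum((cells[i][j] - a_means[j] - b_means[i] + grand) ** 2
            for i in range(2) for j in range(2))

# Nested analysis: each score is compared only with its own A mean.
ss_b_in_a = sum((cells[i][j] - a_means[j]) ** 2
                for i in range(2) for j in range(2))

print(ss_a, ss_b, ss_ab, ss_b_in_a)
```

The nested term equals the sum of the crossed B and A x B terms, which is why a nested analysis cannot separate the two.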

 


V. Ignoring Grouping Variables

 

A.  What if subjects are run in groups and each group is exposed to one of two experimental conditions?

 

            1. True model is:  SR(GR(TF))

 

                        Source             df                   E(MS)                                                              Error Line

              1.       T                  t-1                  $ng\sigma^2_T + n\sigma^2_{G(T)} + \sigma^2_{S(G(T))}$             2

              2.       G(T)               t(g-1)               $n\sigma^2_{G(T)} + \sigma^2_{S(G(T))}$                            3

              3.       S(G(T))            gt(n-1)              $\sigma^2_{S(G(T))}$

 

            2. But if the data are analyzed as a one-way ANOVA  -- SR(TF)

 

            Source      df          Apparent E(MS)                          Error Line      True E(MS)

            1. T        t-1         $n\sigma^2_T + \sigma^2_{S(T)}$         2               $ng\sigma^2_T + n\sigma^2_{G(T)} + \sigma^2_{S(G(T))}$

            2. S(T)     t(n-1)      $\sigma^2_{S(T)}$                                       $\sigma^2_{S(G(T))}$

 

Therefore the F ratio for T will be biased upward by $n\sigma^2_{G(T)}$, the variance due to groups.
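One rough way to see the size of the bias is to take the True E(MS) column at face value and form the ratio of the two expectations under the null hypothesis $\sigma^2_T = 0$. This is only a guide, since the expected value of a ratio is not the ratio of expected values.

```latex
\frac{E(MS_T)}{E(MS_{S(T)})}
  = \frac{n\sigma^2_{G(T)} + \sigma^2_{S(G(T))}}{\sigma^2_{S(G(T))}}
  = 1 + \frac{n\,\sigma^2_{G(T)}}{\sigma^2_{S(G(T))}}
```

The ratio exceeds 1 whenever the groups differ, even when there is no treatment effect at all.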

 

VI.  Example:  Double Nesting

            A.  SR(GR(TF))

 

Groups                              Treatments

                T1              T2              T3              Means

G1              20
                18                                              17.33
                14

G2              19
                20                                              19.67
                20

G3                              14
                                18                              15.33
                                14

G4                              12
                                12                              11.00
                                 9

G5                                              13
                                                16              14.00
                                                13

G6                                               9
                                                 4               5.67
                                                 4

Means           18.5            13.17           9.83            13.83

 

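Calculations (here SST is the sum of squares for treatments, not the total):

 

            SST = $3 \cdot 2 \cdot [(18.5-13.83)^2 + (13.17-13.83)^2 + (9.83-13.83)^2]$

 

            SSG(T) = $3 \cdot [(17.33-18.5)^2 + (19.67-18.5)^2 + (15.33-13.17)^2 + (11-13.17)^2 + (14-9.83)^2 + (5.67-9.83)^2]$

 

            SSS(G(T)) = $(20-17.33)^2 + (18-17.33)^2 + (14-17.33)^2 + \dots$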

 

            Source               SS                 df                       MS                 F                      p       

            T                      229.47             2                      114.75             2.45                 ns

            G(T)                 140.42             3                      46.85               9.58                  <.01

            S(G(T))            58.68              12                    4.89

 

            Total                428.57             17

 

If the grouping factor had been ignored:

 

            Source               SS                 df                       MS                 F                      p       

            T                      229.47             2                      114.75             8.65                 <.01

            S(T)                 199.1               15                    13.27

           

            Total                428.57             17

 

Ignoring the grouping factor can lead to spurious significant results.  Why?  Plot the data.
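Both analyses can be reproduced from the raw scores with a short script. This is a sketch in plain Python with illustrative variable names. Because it works from unrounded means, its sums of squares differ from the tabled values in the first decimal place (for example, about 229.33 rather than 229.47 for T).

```python
# Scores from the data table: treatment -> group -> three subjects.
data = {
    "T1": {"G1": [20, 18, 14], "G2": [19, 20, 20]},
    "T2": {"G3": [14, 18, 14], "G4": [12, 12, 9]},
    "T3": {"G5": [13, 16, 13], "G6": [9, 4, 4]},
}

def mean(xs):
    return sum(xs) / len(xs)

scores = [y for groups in data.values() for ys in groups.values() for y in ys]
grand = mean(scores)

ss_t = ss_g = ss_s = 0.0
for groups in data.values():
    t_scores = [y for ys in groups.values() for y in ys]
    t_mean = mean(t_scores)
    ss_t += len(t_scores) * (t_mean - grand) ** 2      # treatments
    for ys in groups.values():
        g_mean = mean(ys)
        ss_g += len(ys) * (g_mean - t_mean) ** 2       # groups within treatments
        ss_s += sum((y - g_mean) ** 2 for y in ys)     # subjects within groups

t, g, n = 3, 2, 3
df_t, df_g, df_s = t - 1, t * (g - 1), t * g * (n - 1)

f_nested = (ss_t / df_t) / (ss_g / df_g)                      # T tested against G(T)
f_groups = (ss_g / df_g) / (ss_s / df_s)                      # G(T) tested against S(G(T))
f_ignored = (ss_t / df_t) / ((ss_g + ss_s) / (df_g + df_s))   # grouping ignored

print(round(f_nested, 2), round(f_groups, 2), round(f_ignored, 2))
```

The treatment F ratio more than triples when the group variance is folded into the error term, while its denominator degrees of freedom rise from 3 to 15.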

 

B.  Because the number of groups is likely to be small, the df for the denominator in a double nested design is likely to be very small; hence, one will need a large F (around 10 in the above example) to achieve significance. 

 

However, if the MS of S(G(T)) and the MS of G(T) are close, i.e., if the F ratio comparing G(T) to S(G(T)) is non-significant and less than 2, then one can pool the G(T) and S(G(T)) sums of squares and degrees of freedom and use the pooled MS in F ratios.  This general rule is good for pooling interaction terms as well (cf. Green & Tukey, 1960, Psychometrika).
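The pooling rule can be written as a small helper. This is a sketch only: the function name and signature are hypothetical, and the significance decision is passed in by the caller because the sketch does not compute p-values.

```python
def error_term_for_treatments(ss_g, df_g, ss_s, df_s, g_significant):
    """Return (MS, df) of the denominator for the treatment F ratio.

    g_significant is the caller's decision on whether the F test of
    G(T) against S(G(T)) was significant (assumed to be made elsewhere).
    """
    ms_g = ss_g / df_g
    ms_s = ss_s / df_s
    f_g = ms_g / ms_s
    if not g_significant and f_g < 2:
        # Pool sums of squares and degrees of freedom.
        return (ss_g + ss_s) / (df_g + df_s), df_g + df_s
    # Otherwise keep G(T) as the error term.
    return ms_g, df_g

# Values from the double-nesting example: F = 9.58 is significant, so no pooling.
print(error_term_for_treatments(140.42, 3, 58.68, 12, g_significant=True))

# Hypothetical values where the group effect is negligible (F = 1.25): pooled.
print(error_term_for_treatments(15.0, 3, 48.0, 12, g_significant=False))
```

In the example from Section VI the rule keeps G(T) as the error term, so the treatment test stays at 3 denominator degrees of freedom.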