Section 6a: Analysis of Fixed Effects in PROC MIXED

The CLASS, MODEL, CONTRAST, ESTIMATE, and LSMEANS statements form the foundation for estimating fixed effects ANOVA models with PROC MIXED. They function virtually the same as the corresponding statements in PROC GLM, though PROC MIXED offers a few more options and features. For example, the CONTRAST and ESTIMATE statements can test pairwise differences in fixed factor means or compute specific contrasts of interest (rather than the omnibus contrasts produced in the table of Type 3 tests; see Section 6b). Assuming you have only one observation from each subject or experimental unit, most results for testing hypotheses in fixed factor ANOVAs will be identical to those computed with PROC GLM.

    PROC MIXED DATA=indat NOitPrint;
      CLASS clss1 clss2;
      MODEL y = clss1 clss2 cov1 cov2 / ;
      < LSMEANS, ESTIMATE, CONTRAST and other statements entered here >;
    RUN;

Classification factors to be treated as fixed effects must appear on both the CLASS and the MODEL statements (in this example only fixed factors are considered). These variable names are placed on the MODEL statement following the equals sign, along with the desired main effects and interactions (if any) for the chosen model. Continuous variables to be treated as covariates (e.g., cov1 and cov2) are also entered to the right of the equals sign, but they do not appear on the CLASS statement.

You can enter all the main effects and interactions individually:

    MODEL y = clss1 clss2 clss1*clss2;

Or use the vertical bar notation with @# at the end, where the number # sets the highest order of interaction to generate: all main effects and all 2-way, 3-way, and so on up to #-way interactions are included.
For example, the | entered between the class variables, concluding with the @2 after the final variable, produces all main effects and two-factor interactions from three variables:

    PROC MIXED;
      CLASS clss1 clss2 clss3;
      MODEL y = clss1 | clss2 | clss3 @2;
      < other statements >;
    RUN;

For designs with both between- and within-subject factors, all fixed effects are placed on the MODEL statement in PROC MIXED (see subsequent chapters for examples). This is in contrast to PROC GLM, in which the between-subject effects are placed on the MODEL statement and the main within-subject effect on the REPEATED statement.

Why Doesn't PROC MIXED Produce Sums of Squares?

When you run an analysis with PROC MIXED, the sums of squares and mean squares are notably absent from the fixed effects ANOVA table. The default estimation method is REML, short for "Restricted Maximum Likelihood". Like Maximum Likelihood (ML), REML uses an iterative process to find the parameter values that maximize the likelihood function, so the procedure does not need to compute sums of squares. In fact, for the variety of complicated covariance structures that PROC MIXED supports, closed-form solutions aren't available.

However, to make the transition to PROC MIXED easier, for the basic designs you may want to first produce the familiar ANOVA tables with sums of squares and mean squares. Beginning with version 8.0, for fixed effect models PROC MIXED can compute moment estimates (like PROC GLM) when you specify METHOD=TYPE1, METHOD=TYPE2, or METHOD=TYPE3 on the PROC MIXED statement.

The omission of sums of squares from the ANOVA table is not a legitimate reason to prefer GLM over MIXED. Complex models that can be fit with likelihood-based estimators cannot be estimated with the moment estimators: the METHOD=TYPE option does not work when you include a RANDOM statement for random coefficient models or whenever you include a REPEATED statement.
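As an illustration of the METHOD= option described above, here is a minimal sketch of a fixed-effects-only run that requests Type 3 moment estimates. The dataset and variable names (indat, y, clss1, clss2) are placeholders reused from the earlier example, not a real dataset.

```sas
/* Sketch only: indat, y, clss1 and clss2 are placeholder names. */
/* METHOD=TYPE3 asks for the ANOVA-style table that includes     */
/* sums of squares and mean squares, instead of the default REML. */
PROC MIXED DATA=indat METHOD=TYPE3;
  CLASS clss1 clss2;
  MODEL y = clss1 clss2 clss1*clss2;
RUN;
```

Removing METHOD=TYPE3 returns the step to the default REML estimation, which makes it easy to compare the two sets of output on the same model.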
You may not see the same sensitivity to the order of variables on the MODEL statement that you would find with Type 1 (sequential) entry in regression or in simple, unbalanced ANOVA problems. Under ML or REML estimation, the table of fixed effects contains either Type 1 or Type 3 tests, selected with the HTYPE=<1 or 3> option on the MODEL statement. With balanced data, both options produce the same results. The Type 3 option computes an F-test as if each respective variable were the last one entered into the model.

The Least Squares Means and Tests for Pairwise Differences among the LSMEANS

A fixed effects ANOVA with PROC MIXED tests the significance of main effects and interactions. When a factor has 3 or more levels, you can also apply post-hoc tests to determine where the significant differences lie while holding the overall Type I error rate at your chosen alpha. A MEANS statement, as found in PROC GLM, is not available. Pairwise comparisons of means are instead computed with the LSMEANS statement, whose DIFF option is designed for this purpose (the analogous option for obtaining all pairwise p-values in PROC GLM was PDIFF on the LSMEANS statement).

    LSMEANS treat / diff cl adjust=tukey;

The cl option reports each pairwise difference between cell means with a 95% confidence interval in addition to a p-value.

With an interaction or covariate you may want to save the LSMEANS to an output file and plot them. Since this process involves the Output Delivery System (ODS), an example is given in Chapter 11.

LSMEANS and their differences are often computed with ODS statements in place, since the printed differences can get rather messy with interactions, with main effects that have a large number of levels, or with covariates.
The usual ODS syntax for working with the LSMEANS and their differences, for a model with two fixed effects and a covariate (cvr) evaluated at chosen covariate values, is:

    ODS OUTPUT LSMEANS=lsm(drop=tvalue df probt) DIFFS=dfs;
    ODS EXCLUDE LSMEANS DIFFS;
    PROC MIXED;
      CLASS treatment gender;
      MODEL y = treatment|gender|cvr@2 / solution;
      LSMEANS treatment*gender / diff at (cvr)=(10);
      LSMEANS treatment*gender / diff at means;
      LSMEANS treatment*gender / diff at (cvr)=(90);
    RUN;

    PROC PRINT DATA=lsm NOobs;
      VAR treatment gender cvr estimate stderr;
    RUN;

    PROC PRINT DATA=dfs NOobs;
      WHERE treatment EQ _treatment;
      VAR treatment cvr gender _gender estimate stderr df tvalue probt;
    RUN;

Testing Interactions

With significant two-factor interactions you may specify two LSMEANS statements as follows:

    lsmeans group*time / slice=group;
    lsmeans group*time / slice=time;

The slice option constructs contrasts of interest within the interaction. When you specify SLICE=time, you get a test of the group effect in each of the time periods. For a design with a control group, you would hope to find no group effect in the first time period. Specifying SLICE=group gives the time effect within each of the intervention groups. You would want to see no period effect within the control group.

Tests for main effects and interactions among the levels of fixed effects can be computed with the CONTRAST and ESTIMATE statements, the topic of the next section (6b). This approach is often used for contrasts that compare three or more levels of a factor, or any combination of means that is not a pairwise contrast.

Checking Residuals for Normality

Section 1 mentioned that residual plots should be examined for normality. With PROC MIXED you can save three types of residuals through options on the MODEL statement, and there are several ways to place the residuals and predicted values into a SAS dataset. The OUTPM= option of the MODEL statement names a dataset that holds the marginal predicted values computed for each observation of the explanatory data.
With the RESIDUAL option invoked, the residuals in this file are called marginal, and rmi, rmi_student, and rmi_pearson are added to the dataset with variable names Resid, StudentResid, and PearsonResid, respectively. The OUTP= option places conditional predicted values in the dataset, so the corresponding residuals are known as conditional. With the RESIDUAL option in place, the conditional residuals rci, rci_student, and rci_pearson are added to that dataset. With only fixed effects in the model, these two types of residuals are the same.

The terms marginal and conditional can be interpreted in the same way you think about 'marginal' when running 2x2 tables with PROC FREQ. A marginal value is analogous to an average of the possible conditional means, but it is not an unbiased estimator (because of the z'_i*gammahat component due to random effects; to understand this, see the section discussing mixed models with random effects). Marginal is connected to the levels of the fixed effects only. Conditional is connected to predicted values that are based on both the fixed effects and the random subject effect. More details can be found at: http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/mixed_sect39.htm#stat_mixed_mixeddetailres

Checking Residuals for Normality

Once the residuals have been computed and placed in a SAS dataset, a simple check of normality uses these statements from PROC UNIVARIATE:

    PROC UNIVARIATE DATA=prd NOprint;
      VAR resid studentresid pearsonresid;
      HISTOGRAM / NOframe normal midpoints=-2 to 2 by .25;
      * The normal option superimposes a normal distribution curve on the histogram.
        The midpoints= option specifies the range and bin width - first, run the step
        without it and then modify range and bin width and run it again;
      QQPLOT / Noframe normal;
      TITLE1 'Residual Plot';
    RUN;

The HISTOGRAM statement produces a histogram of the residuals with equal bin widths (which can easily be modified based on the range and density of the data). The normal option superimposes a normal curve over the histogram. To change the resolution of the histogram, modify the midpoints= option. First run the step without it to see the default (that is, place a ;* after the word normal on the HISTOGRAM statement, which turns the rest of the statement into a comment). Then, noting the minimum and maximum values, enter them as the midpoint boundaries on this statement, rounded to a multiple of the increment (in this case, 0.25 = (2-1)/4, that is, four bins per unit).

The QQPLOT statement draws a quantile-quantile plot, from which normality is judged by how closely the points follow an upward-sloping straight line. Outliers tend to fall well above or below the linear fit.
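The PROC UNIVARIATE step above reads a dataset named prd, which the text does not show being created. One way it could be produced, assuming a fixed-effects model like the one at the start of this section, is sketched below. The names indat, y, clss1, and clss2 are placeholders, and pairing OUTPM= with the dataset name prd is an assumption made to match the PROC UNIVARIATE step.

```sas
/* Sketch only: indat, y, clss1 and clss2 are placeholder names.   */
/* RESIDUAL adds Resid, StudentResid and PearsonResid to the       */
/* dataset named by OUTPM= (marginal predicted values).            */
/* The name prd is assumed so the PROC UNIVARIATE step can use it. */
PROC MIXED DATA=indat NOITPRINT;
  CLASS clss1 clss2;
  MODEL y = clss1 clss2 / RESIDUAL OUTPM=prd;
RUN;
```

For a model that also contains random effects, replacing OUTPM= with OUTP= would save the conditional residuals instead, as described earlier in this section.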