5. Normal Regression Models

PROC GENMOD can fit a general linear regression or analysis of covariance (ANCOVA) model. However, some of the options commonly associated with PROCs REG, GLM, and MIXED are not provided. For example, GENMOD cannot compute a random effects ANOVA or estimate variance components. As with other ANOVA procedures, one can compare means across the levels of a categorical independent variable. Beginning with version 7.0, an LSMEANS statement was added to PROC GENMOD. Comparisons of means can now be obtained with this statement by specifying the diff option (see example below).

The other procedures mentioned so far in this section are based on least squares, ML, or REML estimation methods. Since these procedures specialize in normal theory linear models, they are usually preferred to PROC GENMOD. However, for comparison, and as a lead-in to the interpretation of other types of models, it is helpful to illustrate linear regression with GENMOD. Note also that PROC GENMOD is based on maximum likelihood calculations. Because maximum likelihood inference relies on large-sample results, GENMOD should be used only with relatively large datasets.

    PROC GLM DATA=pr;
      CLASS location;
      MODEL weight = location | age / Solution ss3;
      LSMEANS location / pdiff;
      TITLE2 'ANCOVA results with PROC GLM';
    run;
    QUIT;

Results from PROC GLM:

    Source          DF    Type III SS    Mean Square    F Value    Pr > F
    location         1         1.7873         1.7873       2.26    0.1379
    age              1       405.2960       405.2960     512.43    <.0001
    age*location     1        32.7665        32.7665      41.43    <.0001

The ANOVA table:

                                  Sum of
    Source             DF        Squares    Mean Square    F Value    Pr > F
    Model               3    536.4489951    178.8163317     226.09    <.0001
    Error              61     48.2463649      0.7909240
    Corrected Total    64    584.6953600

    --> residual variance = 0.7909240

To compute the unbiased estimate of the variance (as with PROC GLM), add the dscale option to the MODEL statement.
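The effect of dscale can be checked by hand from the PROC GLM ANOVA table. The sketch below assumes n = 65 observations (Corrected Total DF of 64 plus one) and p = 4 estimated regression parameters (65 minus 61 error DF). It also assumes that, for the normal distribution, the deviance equals the error sum of squares and that GENMOD's default Scale is the maximum likelihood estimate of sigma. The ML values are computed here from the GLM output; they are not taken from a GENMOD run.

```latex
\begin{align*}
\hat\sigma^2_{\mathrm{ML}} &= \frac{SSE}{n} = \frac{48.2463649}{65} \approx 0.7423,
  & \hat\sigma_{\mathrm{ML}} &\approx 0.8615 \\
\hat\sigma^2_{\mathrm{dscale}} &= \frac{SSE}{n-p} = \frac{48.2463649}{61} \approx 0.7909,
  & \hat\sigma_{\mathrm{dscale}} &\approx 0.8893
\end{align*}
```

Under these assumptions, dividing by the error degrees of freedom rather than by n is what brings the GENMOD Scale into line with the square root of the GLM mean square error.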
    PROC GENMOD DATA=pr;
      CLASS location;
      MODEL weight = location | age / link=id dist=normal TYPE3 dscale;
      LSMEANS location / diff;
      TITLE2 'ANCOVA with PROC GENMOD, scaled by deviance';
    run;

                      Analysis of Parameter Estimates

                                      Standard    Wald 95% Confidence      Chi-
    Parameter          DF  Estimate      Error          Limits           Square
    Intercept           1   -1.2420     0.3488    -1.9257    -0.5584      12.68
    location      1     1    0.7592     0.5050    -0.2307     1.7491       2.26
    location      2     0    0.0000     0.0000     0.0000     0.0000        .
    age                 1    0.6549     0.0306     0.5950     0.7148     459.17
    age*location  1     1   -0.2900     0.0451    -0.3783    -0.2017      41.43
    age*location  2     0    0.0000     0.0000     0.0000     0.0000        .
    Scale               0    0.8893     0.0000     0.8893     0.8893
                             ^^^^^^

The value for Scale is 0.889339, so MSE = 0.889339 ** 2 = 0.7909, the unbiased estimate of the variance, which matches the value computed with PROC GLM.

Type 3 sums of squares, mean squares, and F values:

    Source          DF    Type III SS    Mean Square    F Value    Pr > F
    location         1      1.7873092      1.7873092       2.26    0.1379
    age              1    405.2959960    405.2959960     512.43    <.0001
    age*location     1     32.7664589     32.7664589      41.43    <.0001

The output from the GENMOD LSMEANS statement follows:

                          Least Squares Means

                                     Standard             Chi-
    Effect      location  Estimate      Error    DF     Square    Pr > ChiSq
    location    1           3.1946     0.1503     1     451.62        <.0001
    location    2           5.3576     0.1624     1     1088.7        <.0001

The table of LSMEANS and their differences looks much like the one produced by PROC MIXED. These values are computed at the mean of the covariate. One big difference from PROC MIXED is that the LSMEANS statement in PROC GENMOD does not have an AT option, which would let you compute LSMEANS at covariate values other than the mean. It is possible to compute these differences at any value of a covariate with ESTIMATE statements within GENMOD, but it is much easier to do these calculations with LSMEANS in PROC MIXED.

                    Differences of Least Squares Means

                                                 Standard            Chi-
    Effect      location  _location  Estimate       Error    DF    Square    Pr > ChiSq
    location    1         2           -2.1629      0.2213     1     95.55        <.0001
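The ESTIMATE approach mentioned above can be sketched as follows. This is a minimal illustration, not part of the original analysis: the age value of 10 is a hypothetical choice, and the coefficient layout assumes the CLASS levels of location are ordered 1, 2 as in the parameter estimates table. The difference between locations at a given age is the location contrast plus that age times the age*location contrast.

```sas
/* Sketch: difference between locations at a chosen covariate value.   */
/* age = 10 is a hypothetical value picked for illustration only.      */
PROC GENMOD DATA=pr;
  CLASS location;
  MODEL weight = location | age / link=id dist=normal TYPE3 dscale;
  /* (location 1 - location 2) evaluated at age = 10:                  */
  /*   location coefficients      1  -1                                */
  /*   age*location coefficients 10 -10  (age value times 1 -1)        */
  ESTIMATE 'location 1 - 2 at age=10' location 1 -1 age*location 10 -10;
  TITLE2 'Location difference at age = 10 via ESTIMATE';
run;

/* The same comparison in PROC MIXED, using the AT option of LSMEANS.  */
PROC MIXED DATA=pr;
  CLASS location;
  MODEL weight = location | age / solution;
  LSMEANS location / diff at age=10;
  TITLE2 'Location difference at age = 10 via LSMEANS AT';
run;
```

With the parameter estimates reported above, the difference at age a works out to 0.7592 - 0.2900*a, so at a = 10 either statement should return roughly -2.14, close to the -2.1629 obtained at the covariate mean.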