Section 13a: Correlation Structures in a GEE model

This section describes a few of the correlation structures available in PROC GENMOD.

Let R_i(alpha) be an n_i x n_i working correlation matrix that is fully specified by the vector of parameters alpha. The covariance matrix of Y_i is modeled as

     V_i = phi * A_i^(1/2) R_i(alpha) A_i^(1/2)

where A_i is an n_i x n_i diagonal matrix with v(mu_ij) as the jth diagonal element and phi is the dispersion parameter. If R_i(alpha) is the true correlation matrix of Y_i, then V_i is the true covariance matrix of Y_i.

The working correlation matrix R_i(alpha) is usually unknown and must be estimated. This is done within the iterative fitting process, using the current value of the parameter vector beta to compute appropriate functions of the Pearson residual:

     r_ij = (y_ij - mu_ij) / sqrt(v(mu_ij))

There are several specific choices for the form of the working matrix R_i(alpha) to model the correlations of the individual responses. The following descriptions present a few of the common choices for R supported by GENMOD and the resulting covariances.

Independence - type=ind

     R = R_0 = I

If you specify the working correlation as R_0 = I, the identity matrix, the GEE reduces to the independence (GLM) estimating equation. With the identity matrix, the pairwise correlations r_jj' do not enter the estimating equations. Since all off-diagonal correlations are fixed at zero, a working correlation matrix is not estimated for this situation.

Given that a dataset consists of repeated measurements within individuals, the simplest possible correlation structure is to assume (usually incorrectly) independence. This assumption is equivalent to saying that each observation collected from an individual is completely uncorrelated with every other observation measured on that individual; correlations are assumed to be 0 for all pair-wise combinations of the within-subject variables. If rho_jk is the correlation between observations j and k, then rho_jj = 1 and rho_jk = 0 for j ne k.
     _                                         _
     | sig^2   0       0       0       0       |
     |         sig^2   0       0       0       |
V =  |                 sig^2   0       0       |
     | symm.                   sig^2   0       |
     |                                 sig^2   |
     -                                         -

Exchangeable - type=exch or Compound Symmetry - type=cs

     Corr(Y_ij, Y_ik) = alpha,  j ne k

Exchangeable assumes non-zero, yet uniform, correlations for all pairs of within-subject variables. Every observation within an individual is equally correlated with every other observation from that individual. The common correlation alpha plays a role like that of the intraclass correlation (ICC).

     _                                             _
     | sig1+sig2                                   |
     | sig1       sig1+sig2                        |
V =  | sig1       sig1       sig1+sig2             |
     | sig1       sig1       sig1       sig1+sig2  |
     -                                             -

This choice of correlation structure may not be reasonable with multiple measurements collected over time, since the correlations most likely will diminish as the time lag between observations increases. Exchangeable assumes that r_12 = r_13 = ... = r_jj', which is analogous to applying the compound symmetry assumption of repeated measures with PROC GLM or PROC MIXED.

Auto-regressive - type=AR(1)

Autoregressive is a term derived from time series analysis, which assumes that observations are related to their own past values through a first, second, or higher order autoregressive (AR) process. An autoregressive correlation structure indicates that two observations taken close together in time (or space) within an individual tend to be more highly correlated than two observations taken far apart in time from the same individual. Formally, rho_jj = 1 and rho_jk (j ne k) decreases in value as the absolute difference between j and k gets larger. A first-order autoregressive correlation structure specifies that rho_jk = rho**|j-k|, where rho is the correlation when |j-k| = 1.
     Corr(Y_ij, Y_i,j+t) = a_t = alpha^t,  t = 1, 2, ..., n_i - j

alpha is estimated by alpha_hat, a function of the products of Pearson residuals at adjacent time points.

     _                               _
     | 1                             |
     | a_1   1                       |
R =  | a_2   a_1   1                 |
     | a_3   a_2   a_1   1           |
     | a_4   a_3   a_2   a_1   1     |
     -                               -

m-dependent - type=MDEP(m)

     Corr(Y_ij, Y_i,j+t) = a_t = alpha_t,  t = 1, 2, ..., m
                               = 0,        t > m

The matrix below shows the case m = 2 with five measurements per subject.

     _                               _
     | 1                             |
     | a_1   1                       |
R =  | a_2   a_1   1                 |
     | 0     a_2   a_1   1           |
     | 0     0     a_2   a_1   1     |
     -                               -

Unstructured - type=UN

     Corr(Y_ij, Y_ik) = a_jk = alpha_jk,  j ne k

The unstructured (type=un) matrix estimates t*(t-1)/2 correlations from the data, where t here denotes the number of repeated measurements per subject (10 correlations for the five measurements in the matrix below). Usually complete data are required for this approach. Unstructured assumes unconstrained pair-wise correlations, each estimated from the data; it is the most complex model and is applied to balanced datasets. No assumption is made about the relative magnitude of the correlation between any two pairs of observations. Formally, rho_jj = 1 and rho_jk (j ne k) is free to take any value between -1 and +1.

     _                                    _
     | 1                                  |
     | a_21   1                           |
R =  | a_31   a_32   1                    |
     | a_41   a_42   a_43   1             |
     | a_51   a_52   a_53   a_54   1      |
     -                                    -

User fixed

     R = R_0 = <specified matrix>

Let R_0 be a user-specified correlation matrix where corr(y_ij, y_ik) = r_jk and r_jk is the (j,k)th element of the matrix. Since the correlations are specified constants, a working correlation is not estimated; all correlation coefficients are fixed by the user. Formally, rho_jj = 1 and rho_jk can take any value between -1 and +1, but this value is fixed prior to the analysis rather than being estimated from the data.
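
To make the patterns concrete, the following is a minimal sketch in plain Python (not SAS, and not how GENMOD computes anything internally) that builds the independence, exchangeable, AR(1) and m-dependent working correlation matrices for a subject with n measurements. The function names working_correlation and covariance are hypothetical, the alpha values are made up for illustration, and the covariance helper assumes the usual GEE form V = phi * A^(1/2) R A^(1/2) with A = diag(v(mu_j)) and phi a dispersion parameter. Unstructured and user-fixed matrices are not built here because every entry is supplied directly, either by the estimation or by the user.

```python
def working_correlation(structure, n, alpha=None):
    """Return an n x n working correlation matrix as a list of lists.

    structure : "ind", "exch", "ar1" or "mdep" (hypothetical labels)
    alpha     : one correlation for "exch" and "ar1";
                a list [alpha_1, ..., alpha_m] for "mdep";
                ignored for "ind".
    """
    if structure not in ("ind", "exch", "ar1", "mdep"):
        raise ValueError(f"unknown structure: {structure!r}")

    def entry(j, k):
        lag = abs(j - k)
        if lag == 0:
            return 1.0                      # rho_jj = 1 in every structure
        if structure == "ind":
            return 0.0                      # all off-diagonals are zero
        if structure == "exch":
            return float(alpha)             # same correlation for every pair
        if structure == "ar1":
            return float(alpha) ** lag      # alpha^|j-k|
        # m-dependent: alpha_t for lags t <= m, zero beyond lag m
        return float(alpha[lag - 1]) if lag <= len(alpha) else 0.0

    return [[entry(j, k) for k in range(n)] for j in range(n)]


def covariance(R, variances, phi=1.0):
    """Assumed form V = phi * A^(1/2) R A^(1/2), A = diag(variances)."""
    sd = [v ** 0.5 for v in variances]
    n = len(R)
    return [[phi * sd[j] * R[j][k] * sd[k] for k in range(n)]
            for j in range(n)]


if __name__ == "__main__":
    # AR(1) with an illustrative alpha of 0.5 and four measurements
    for row in working_correlation("ar1", 4, alpha=0.5):
        print(row)
    # first row printed: [1.0, 0.5, 0.25, 0.125]
```

Calling working_correlation("mdep", 5, alpha=[0.4, 0.2]) gives the m = 2 banded pattern, with zeros at lags 3 and 4. Passing any of these matrices to covariance together with a list of variance-function values shows how the same correlation pattern yields different covariances when the variances differ across measurements.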