Psychology 613
Data Analysis III
Prof. Bertram Malle
Spring 2005


Lecture 3 (Apr 5)
Introduction to Matrix Algebra

  1. Examples of matrices: calendar, table in journal article, EXCEL file, and of course data files.

  2. Matrix operations are performed on the whole matrix. A matrix must therefore be complete, with r x c (or n x p) dimensions and all entries real numbers.

    A matrix usually has at least 2 x 2 dimensions; a vector has 1 x p (row vector) or n x 1 (column vector) dimensions; and a scalar has 1 x 1 dimensions and is therefore a single number.

    The order of a matrix is its size in terms of its r x c dimensionality.

  3. Square matrix: A matrix for which n = p.

  4. Identity matrix: A square matrix of any dimension with 1's in the main diagonal and 0's in all off-diagonals. Multiplying any matrix by an appropriate identity matrix leaves the matrix untouched. Thus,
    A*I = I*A = A.
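
    The identity property can be checked with a minimal sketch in plain Python. The helper names identity and matmul are illustrative assumptions, not part of the lecture; matrices are stored as lists of rows. Note that for a non-square A the two identity matrices differ in order, which is what "appropriate" means above.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2],
     [3, 4],
     [5, 6]]                      # 3 x 2

left = matmul(identity(3), A)     # I (3 x 3) * A (3 x 2)
right = matmul(A, identity(2))    # A (3 x 2) * I (2 x 2)

print(left == A, right == A)      # True True
```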

  5. Diagonal matrices: Any square matrix that has 0's in all its off-diagonals and (typically) non-zero elements in the main diagonal. The identity matrix is the special case in which every diagonal element is 1.

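  6. Matrix addition/subtraction requires two matrices, A and B, of equal order (= equal r x c) and is performed on the corresponding elements, aij and bij. Matrix addition is commutative (A + B = B + A) and associative ((A + B) + C = A + (B + C)), and multiplication distributes over it (e.g., k(A + B) = kA + kB). Subtraction is not commutative: A - B = -(B - A).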

  7. Scalar multiplication is easy. In k A, each matrix element, aij, is multiplied by the scalar k, and the resulting matrix has the same dimensionality r x c as the original one.
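
    Items 6 and 7 can be sketched in a few lines of plain Python. The function names add and scale are assumptions made for this illustration; in practice a numerical library would normally be used instead.

```python
def add(A, B):
    """Add two matrices of equal order element by element."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same order (r x c)")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(k, A):
    """Multiply every element of A by the scalar k."""
    return [[k * a for a in row] for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(add(A, B))                    # [[6, 8], [10, 12]]
print(scale(3, A))                  # [[3, 6], [9, 12]]
print(add(A, B) == add(B, A))       # addition is commutative: True
# Scalar multiplication distributes over addition: k(A + B) = kA + kB
print(scale(3, add(A, B)) == add(scale(3, A), scale(3, B)))   # True
```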

  8. Matrix multiplication is unusual. The new elements result from the sums of cross-multiplications of the original elements, according to an orderly procedure. The golden rule of matrix multiplication is: Before you multiply numbers, determine dimensionality! First, two matrices can only be multiplied if the number of columns of the first matrix equals the number of rows of the second matrix, because
         A  *  B  =  C
        nxp   pxr   nxr

    As you can see, the "inner" dimensions must be identical and get canceled out. These are the dimensions across which the summing is performed. This fact is best illustrated by the multiplication of two unit vectors, which yields a single number, the sum of the three element-by-element products:

    [1 1 1]  *  |1|
                |1| = [3]
                |1|
    
     1x3        3x1   1x1
    

    Matrix multiplication is associative and distributive, but not commutative. That is, normally A*B is not B*A; but A*(B*C) = (A*B)*C.
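
    The golden rule (check dimensionality first) and the properties just listed can be sketched as follows. This is a minimal illustration in plain Python; shape and matmul are hypothetical helper names, and the small matrices A, B, and C are made up for the example.

```python
def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(M), len(M[0])


def matmul(A, B):
    """Multiply A (n x p) by B (p x r); the inner dimensions must match."""
    (n, p), (p2, r) = shape(A), shape(B)
    if p != p2:
        raise ValueError(f"cannot multiply {n}x{p} by {p2}x{r}")
    # Each new element is a sum over the shared inner dimension k.
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(r)]
            for i in range(n)]


row = [[1, 1, 1]]              # 1 x 3
col = [[1], [1], [1]]          # 3 x 1
print(matmul(row, col))        # 1 x 1 result: [[3]]
print(shape(matmul(col, row))) # reversed order gives a 3 x 3 matrix: (3, 3)

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 3]]
print(matmul(A, B) == matmul(B, A))                        # False
print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))  # True
```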

    For another introduction to the very basics of matrix algebra, see this tutorial from the NIST/SEMATECH e-Handbook of Statistical Methods.

  9. Summing and weighting with vectors. Any matrix that is multiplied by an appropriately dimensioned unit vector (where all elements are 1's) is reduced by one dimension, and the new elements are sums of the previous ones. This unit vector multiplication performs summing operations on matrices. To sum across the Rows of a matrix (adding the rows together, which yields one total per column) you need to pRe-multiply by a unit row vector 1'; to sum across the cOlumns of a matrix (adding the columns together, which yields one total per row) you need to pOst-multiply by a unit column vector 1.
       1 1 1  *  1 2  
                 3 4  =     9 12 [sums across rows]
                 5 6        
    
         1'       A          1'A
        1x3      3x2         1x2
    
    
            1 2     1       3
            2 4  *  1  =    6
            3 6             9 [sums across columns]
    
             A    1         A1
            3x2  2x1        3x1
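
    The two summing examples above can be reproduced with a short sketch. The matmul helper and the variable names are assumptions for illustration only.

```python
def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


ones_row = [[1, 1, 1]]        # 1' is 1 x 3
ones_col = [[1], [1]]         # 1  is 2 x 1

A1 = [[1, 2],
      [3, 4],
      [5, 6]]                 # first example, 3 x 2
A2 = [[1, 2],
      [2, 4],
      [3, 6]]                 # second example, 3 x 2

print(matmul(ones_row, A1))   # pre-multiplication, 1 x 2: [[9, 12]]
print(matmul(A2, ones_col))   # post-multiplication, 3 x 1: [[3], [6], [9]]
```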
    
    By extension, we can perform weighted sums by using weighting vectors (whose elements are not all 1's) instead of unit vectors. Thus, any matrix that is postmultiplied by a weighting vector results in a new vector of numbers that are the weighted sums performed across the matrix's columns. Again, vector postmultiplication sums across columns.
    This procedure is used mainly to form new linear combinations among variables for each case.
            1 2     2       4
            2 4  *  1  =    8
            3 6            12
    
             A    w         Aw
            3x2  2x1       3x1
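
    As a hedged sketch, the same weighted combination can be written both as a matrix product and as an explicit per-case formula. The helper matmul and the name composite are illustrative assumptions.

```python
def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2],
     [2, 4],
     [3, 6]]                  # 3 cases x 2 variables
w = [[2], [1]]                # weights for variable 1 and variable 2

composite = matmul(A, w)      # one weighted sum per case (3 x 1)
print(composite)              # [[4], [8], [12]]

# The same linear combination written out case by case: 2*x1 + 1*x2
by_hand = [[2 * x1 + 1 * x2] for x1, x2 in A]
print(by_hand == composite)   # True
```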
    

    Also, any matrix that is premultiplied by a weighting vector results in a new vector of numbers that are weighted sums performed across the matrix's rows. Again, vector premultiplication sums across rows. This procedure is used to form statistics (sums, means, etc.) of variables.

       1 2 3  *  1 2  
                 3 4  =     22 28
                 5 6        
    
         w'       A          w'A
        1x3      3x2         1x2
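
    A minimal sketch of premultiplication by a weighting vector follows. It reproduces the example above and then, as an added illustration, uses weights of 1/n to obtain the mean of each variable. The names matmul, mean_weights, and means are assumptions; exact fractions from the standard library are used to avoid rounding.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2],
     [3, 4],
     [5, 6]]                            # 3 cases x 2 variables

w_row = [[1, 2, 3]]                     # w' is 1 x 3
print(matmul(w_row, A))                 # [[22, 28]]

n = len(A)
mean_weights = [[Fraction(1, n)] * n]   # every case weighted by 1/n
means = matmul(mean_weights, A)         # 1 x 2 vector of variable means
print([str(m) for m in means[0]])       # ['3', '4']
```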
    

  10. Rescaling with diagonal matrices.
    Multiplying a data matrix A by an appropriate diagonal matrix D "rescales" the A matrix. That is, A retains its original order but the specific elements in its columns (or rows) are altered by the diagonal elements in D.
     
            1 2      d1  0           d1  2d2
            2 4   *  0  d2   =      2d1  4d2
            3 6                     3d1  6d2
    
             A         D               AD
            3x2       2x2             3x2
    
    Any matrix A that is pOstmultiplied by a diagonal matrix D (as shown above) results in a matrix of A's dimensionality but with rescaled column entries. The rescaling "runs across" the cOlumns---that is, the diagonal elements, say, d1 and d2, rescale the first and the second columns, respectively.

    Any matrix A that is pRemultiplied by a diagonal matrix D' results in a matrix of A's dimensionality but with rescaled row entries. The rescaling "runs across" the rows---that is, the diagonal elements, say, d1, d2, and d3, rescale the first, second, and third rows, respectively:

             d1  0   0      1  2             d1  2d1
             0  d2   0  *   3  4  =         3d2  4d2
             0   0  d3      5  6            5d3  6d3
    
                 D'          A                D'A
                3x3         3x2               3x2
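
    Both kinds of rescaling can be sketched with concrete numbers in place of d1, d2, and d3. The diagonal values (10, 100 and 1, 10, 100) are arbitrary choices for illustration, and diag and matmul are hypothetical helper names.

```python
def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def diag(values):
    """Build a square diagonal matrix from a list of diagonal elements."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


A = [[1, 2],
     [2, 4],
     [3, 6]]                          # 3 x 2
# Post-multiplication rescales columns: column 1 by 10, column 2 by 100.
print(matmul(A, diag([10, 100])))     # [[10, 200], [20, 400], [30, 600]]

B = [[1, 2],
     [3, 4],
     [5, 6]]                          # 3 x 2
# Pre-multiplication rescales rows: row 1 by 1, row 2 by 10, row 3 by 100.
print(matmul(diag([1, 10, 100]), B))  # [[1, 2], [30, 40], [500, 600]]
```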
    

  11. Transposing means that the columns become rows and the rows become columns. The matrix's dimensionality changes from r x c to c x r. This is accomplished by taking the lower left corner of the matrix and folding it back and up toward the upper right corner while starting to pull that upper right corner down to the lower left corner. (For square matrices you can think of this as rotating the matrix in space around the main diagonal.) Here is a brief QuickTime demonstration of the transposing process.
            1 2 3     ->    1 4
            4 5 6           2 5
                            3 6
    
              A              A'
             2x3            3x2
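
    In code, transposing amounts to reading the columns off as rows. The sketch below assumes matrices stored as lists of rows; the function name transpose is an illustrative choice.

```python
def transpose(A):
    """Return A' by turning each column of A into a row."""
    return [list(col) for col in zip(*A)]


A = [[1, 2, 3],
     [4, 5, 6]]                      # 2 x 3

At = transpose(A)                    # 3 x 2
print(At)                            # [[1, 4], [2, 5], [3, 6]]
print(transpose(At) == A)            # transposing twice restores A: True
```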
    

  12. Self-multiplication.
    Any matrix A, with n x p, that is pre-multiplied with its own transpose, A', with p x n, results in a p x p sums-of-squares/cross-products matrix of the variables (p) summed across the cases (n):
             A' * A         = SS/CP of p's
            pxn  nxp             pxp
    
    If A contains mean-deviated scores (p variables in the columns), the SS/CP matrix A'A contains the familiar sums-of-squares from ANOVA in its main diagonal and the cross-products in its off-diagonals. By dividing the SS terms by n-1, we get the variance of each variable. By dividing the CP terms by n-1, we get the covariance of each pair of variables.
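
    The path from raw scores to a variance/covariance matrix can be sketched as below. The raw scores X are hypothetical data invented for this illustration (n = 4 cases, p = 2 variables), and the helper names are assumptions; exact fractions are used so the results are not affected by rounding.

```python
from fractions import Fraction


def transpose(A):
    """Return A' by turning each column of A into a row."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


X = [[2, 1], [4, 3], [6, 5], [8, 11]]     # hypothetical raw scores
n = len(X)

means = [Fraction(sum(col), n) for col in zip(*X)]
A = [[x - m for x, m in zip(row, means)] for row in X]   # mean-deviated scores

sscp = matmul(transpose(A), A)            # A'A, p x p
cov = [[entry / (n - 1) for entry in row] for row in sscp]

print([[str(v) for v in row] for row in sscp])  # [['20', '32'], ['32', '56']]
print([[str(v) for v in row] for row in cov])
# [['20/3', '32/3'], ['32/3', '56/3']]
```

    The diagonal of cov holds the variances and the off-diagonal holds the covariance, matching the division by n-1 described above.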

    Any matrix A, with n x p, that is post-multiplied with its own transpose, A', with p x n, results in an n x n sums-of-squares/cross-products matrix of the cases (n) summed across the variables (p), which is a rather infrequent procedure:

             A * A'         = SS/CP of n's
            nxp  pxn             nxn
    

  13. More complex combinations:
    1'A 1 = grand sum
    D'A D = rescaled for both rows and columns
    (AD)' AD = D' A'A D = self-multiplication after rescaling (e.g., variance/covariance matrix of standardized scores = correlation matrix; see next lecture)
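
    Two of these combinations can be checked numerically with a rough sketch. The deviation scores dev are hypothetical data (each column sums to zero), D is assumed to hold the reciprocals of the standard deviations, and the helper names are illustrative. Floating-point arithmetic is used, so results are compared approximately.

```python
import math


def transpose(A):
    """Return A' by turning each column of A into a row."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def diag(values):
    """Build a square diagonal matrix from a list of diagonal elements."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


# Grand sum: 1'A1
A = [[1, 2], [3, 4], [5, 6]]
grand_sum = matmul(matmul([[1, 1, 1]], A), [[1], [1]])
print(grand_sum)                                   # [[21]]

# Correlation matrix from hypothetical mean-deviated scores (n = 4, p = 2)
dev = [[-3, -4], [-1, -2], [1, 0], [3, 6]]
n = len(dev)
sscp = matmul(transpose(dev), dev)                 # A'A
sds = [math.sqrt(sscp[j][j] / (n - 1)) for j in range(len(sscp))]
D = diag([1 / s for s in sds])                     # rescales each column by 1/sd

Z = matmul(dev, D)                                 # standardized scores AD
R = [[v / (n - 1) for v in row] for row in matmul(transpose(Z), Z)]

# Same result via D' A'A D (D is diagonal, so D' = D)
R_alt = [[v / (n - 1) for v in row] for row in matmul(matmul(D, sscp), D)]

print([[round(v, 4) for v in row] for row in R])
```

    The diagonal of R is 1 (up to rounding) and the off-diagonal is the correlation between the two variables.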