CHAPTER 5.1: An Introduction to SAS Procedures

A procedure is a block of statements that begins with the keyword PROC followed by a short acronym or abbreviated name that describes what the procedure does. Procedures perform many tasks, including data processing (such as sorting data or creating user-written formats), many kinds of data analysis (e.g., summary statistics, frequencies, correlation, regression, and analysis of variance), and the presentation of data (e.g., tables and graphs). Before data can be submitted to a procedure, they must be placed in a SAS dataset.

A comprehensive list of the procedures available with SAS Version 9 is found at:
http://support.sas.com/onlinedoc/913/getDoc/en/allprods.hlp/a003135046.htm#a003145671
This page lists procedures from all types of SAS software in alphabetical order. You can learn more about the function and commands of each procedure by clicking on its name.

Another comprehensive list of SAS procedures that perform various types of statistical analysis is found at:
http://support.sas.com/techsup/faq/stat_key/a_j.html
http://support.sas.com/techsup/faq/stat_key/k_z.html
These two pages are sorted alphabetically and also give brief descriptions of how the procedures work.

Another helpful web page is:
http://www.ats.ucla.edu/stat/mult_pkg/whatstat/default.htm
It helps you pick the right procedure and says a little about the assumptions behind each one.

Another short list of procedures with brief descriptions is:
http://filebox.vt.edu/cc/sas/statproc.html

Before beginning data analysis, remember that preparing the data usually takes longer than the analysis, often much longer. And if the preparation is wrong, no type of analysis can fix it.
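As a minimal sketch of the requirement that data be in a SAS dataset before a procedure can use them, the step below reads a few lines of made-up data and then hands the dataset to two procedures. The dataset name plant_growth, its variables, and its values are hypothetical and are used only for illustration.

```sas
/* Hypothetical example data: names and values are illustrative only */
data plant_growth;
  input plot treatment $ height;
  label height = 'Plant height (cm)';
  datalines;
1 control 12.1
2 control 11.4
3 fert 15.2
4 fert 14.8
;
run;

/* Check what the dataset contains before analyzing it */
proc contents data=plant_growth;
run;

/* Basic descriptive statistics for one numeric variable */
proc means data=plant_growth n mean std min max;
  var height;
run;
```

Running PROC CONTENTS first is a cheap way to confirm that the variable names, types, and number of observations are what you expect before any analysis begins.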
Documentation for SAS Procedures

Many standard analytical procedures are described in the Base SAS® 9.1.3 Procedures Guide, which can be viewed online in html format at
http://support.sas.com/onlinedoc/913/docMainpage.jsp
or in pdf format at:
http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_913/base_proc_8417.pdf

Some of the most commonly applied task-oriented procedures include:

CONTENTS     check the contents of a SAS data set (e.g., number of observations, variable names, formats, labels, etc.)
CORR         compute various types of correlation coefficients, including Pearson, Spearman, and Kendall's tau, as well as coefficient alpha
FORMAT       write your own formats
FREQ         produce n-way frequency and crosstabulation tables for categorical data along with relevant statistical tests
MEANS        produce basic descriptive statistics; create output files of summary statistics
PLOT         plot one continuous or interval variable against another (two-dimensional scatter plots)
PRINT        print the observations in a SAS data set
SORT         sort observations in a SAS data set by key variables
SUMMARY      compute descriptive statistics and, if desired, output them to a new data set
TABULATE     construct tables of descriptive statistics
TRANSPOSE    restructure a data set so that selected variables become observations
UNIVARIATE   produce descriptive statistics (more detailed information than PROC MEANS, including boxplots, side-by-side boxplots, quantiles, normal probability plots, etc.)

Detailed instructions for more advanced statistical procedures are found in the SAS/STAT User's Guide, which can be found online at:
http://support.sas.com/onlinedoc/913/docMainpage.jsp
http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_91/stat_ug_7313.pdf

The first 15 chapters of this second web site give introductions and an overview of various classifications of statistical procedures (analysis of variance, categorical, multivariate, regression, surveys, etc.),
explain the four types of estimable functions, and provide an overview of the Output Delivery System (ODS).

The most commonly applied statistical procedures include:

ANOVA      Analysis of variance
GENMOD     Generalized linear models (e.g., binomial, Poisson, negative binomial, normal, and gamma regression)
GLM        General linear model for balanced or unbalanced data (linear regression, ANOVA, ANCOVA, multivariate)
MIXED      Linear models with fixed and random effects (split plot, repeated measures, random coefficients, variance components, longitudinal, crossover, among others)
NPAR1WAY   Nonparametric analysis of variance
REG        Multiple linear regression
TTEST      Two-sample and paired t-tests

The keyword PROC and the selected procedure name are written as the first statement that begins the block of procedure statements:

   PROC < procedure name > < options >;
      < insert other statements >
   RUN;

A few procedures such as PLOT, GLM, REG, and DATASETS require the statement QUIT; to be placed after the RUN; statement. The reasons for this were given in a previous chapter.

Two examples show basic applications of the PROC step:

   PROC PRINT DATA=all_data LABEL NOOBS N;
      VAR < variable list >;
   RUN;

   PROC FREQ DATA=test_data;
      TABLES < row variable > * < column variable > / < options >;
      WEIGHT count;
   RUN;

Note that each PROC statement is followed by one or more additional statements that belong to the procedure and perform specific tasks. Some statements, such as BY, VAR, and CLASS, have the same function in various PROCs, although some may be structured slightly differently depending on the context (e.g., the CLASS statement in PROCs GLM and MIXED vs. LOGISTIC). Each statement, including the PROC statement, has mandatory items, and most accept specific options. Even with only a few statements containing your selected options, SAS will run many complicated analytical tasks.
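The PROC step patterns described above can be sketched with a small, self-contained example. The dataset test_data is built here from made-up counts; the variable names exposure, outcome, and count and the CHISQ option are assumptions chosen for illustration, not part of the original examples.

```sas
/* Hypothetical cell counts for a 2 x 2 table (illustrative values only) */
data test_data;
  input exposure $ outcome $ count;
  label exposure = 'Exposure group'
        outcome  = 'Outcome status'
        count    = 'Cell count';
  datalines;
yes case 20
yes control 80
no case 10
no control 90
;
run;

/* LABEL prints variable labels as column headings, NOOBS drops the
   observation-number column, N reports the number of observations */
proc print data=test_data label noobs n;
  var exposure outcome count;
run;

/* WEIGHT tells PROC FREQ that each row stands for COUNT observations;
   CHISQ requests chi-square tests for the two-way table */
proc freq data=test_data;
  tables exposure*outcome / chisq;
  weight count;
run;

/* A run-group procedure stays active after RUN; and is ended with QUIT; */
proc datasets library=work;
  contents data=test_data;
run;
quit;
```

The same three-part shape (PROC statement, procedure-specific statements, RUN;) applies to each step; only PROC DATASETS needs the additional QUIT; here.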
Some SAS procedures, such as FORMAT and SORT (see Chapter 5.3) and DATASETS (Chapter 4.9), perform specific tasks with SAS datasets and generally do not print output; what these procedures did is summarized in the log window or file. These procedures are introduced in various sections under Chapter 5.3.

Most PROC steps print statistical results or data summaries to an output listing (such as the output window with PC SAS or *.lst files on DARKWING or GLADSTONE). Procedures that produce printed output are introduced in the various sections under Chapter 5.7.

Versions 8 and 9 of SAS have implemented the Output Delivery System (ODS), which is introduced in Chapter 11. It gives you much greater flexibility to choose which portions of the output to print and how to print them. You can also redirect printed output into SAS datasets, which can then be used in subsequent PROC or DATA steps as needed. The output format of most SAS procedures is fairly standard, although procedures can now suppress certain portions of the output; for this, see Chapter 11.

Most SAS procedures also allow printed output to be completely suppressed when the NOPRINT option is included on the PROC statement or on one of its other statements. ODS has this capability as well. To turn off printing of the results, type

   ODS LISTING CLOSE;

To resume printing, type

   ODS LISTING;

This feature is particularly helpful when you want to write output datasets from a SAS procedure yet do not want or need to see the printed listing.
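The two ways of suppressing printed output while still writing output datasets can be sketched as follows. The dataset scores and the output dataset names are hypothetical; OneWayFreqs is the ODS table name that PROC FREQ uses for a one-way frequency table.

```sas
/* Hypothetical example data (illustrative values only) */
data scores;
  input group $ score;
  datalines;
A 10
A 12
B 15
B 19
;
run;

/* Approach 1: NOPRINT suppresses the listing, and the OUTPUT statement
   still writes the group means to a dataset. NWAY keeps only the rows
   for the individual groups (no overall row). */
proc means data=scores noprint nway;
  class group;
  var score;
  output out=score_means mean=mean_score;
run;

/* Approach 2: close the listing destination, capture an ODS table in a
   dataset, then reopen the listing */
ods listing close;
ods output OneWayFreqs=group_counts;
proc freq data=scores;
  tables group;
run;
ods listing;

/* The captured results can be used like any other SAS dataset */
proc print data=score_means noobs;
run;
proc print data=group_counts noobs;
run;
```

NOPRINT is the simpler choice when the procedure has its own OUTPUT statement or OUT= option; the ODS approach is useful when the results you need appear only in the printed tables.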