SAS PROGRAMMING AND USAGE HINTS

Section 1: INTRODUCTION

SAS has many applications and is a relatively complete data delivery and analysis program. Although it is widely known for its statistical analysis capabilities, it also has many other functions, including database management, simulations, writing reports, and drawing graphs.

The SAS manuals are not the best learning resources for beginners. They describe the program's commands and procedures in great detail and provide many excellent examples. However, most beginners are better served by books and teaching aids designed as a step-by-step introduction. Once you understand the basic concepts underlying SAS, the manuals become much more comprehensible and helpful. As you gain practical experience, you will be better equipped to explore the many approaches to data management and analysis that SAS offers.

The purpose of this sequence of documents is to provide basic guidelines for running SAS and to demonstrate helpful hints and little-known features that will help you use it more effectively for data management and analysis.

Before you begin, you'll need some familiarity with the operating system on which you will run SAS: the two current choices are the PC (Windows 98, 2000, or XP Professional) and UNIX. Basic skills you need include how to edit, store, copy, delete, transfer, and retrieve data files in subdirectories. Previous experience with a programming language such as FORTRAN, C, or PASCAL can prove very helpful, though it is not required.

You also need to know how to use a text editor, such as pico on the UNIX system, to write and edit SAS program files. PC SAS contains a Windows-based enhanced text editor with features very similar to those of Microsoft Word. Another possibility, at least in principle, is to use a word processing program on a PC or Mac to write or modify program files.
With other text editing options available, this approach is not recommended. If you do use a word processor, always save command files in pure text format (e.g., 'save as text' on a PC); otherwise SAS will not be able to read them. You would also need to change the suffix from .txt to .sas so SAS can identify them as command files.

What is SAS?

Originally called "Statistical Analysis System", SAS is an integrated set of data management tools under continual development by the SAS Institute, located in Cary, North Carolina. The most recent version is 9.1.3, which far exceeds the original concept of a statistical analysis program. It runs on a variety of platforms, from PCs to S/390 Enterprise Servers. Its core software, called Base SAS, consists of a complete programming language (the DATA step and the macro language) and many procedures (PROCs) that perform a wide variety of analyses from relatively few commands. SAS is more appropriately described as a Data Delivery and Analysis System, so it is not necessarily correct to categorize it as a programming 'language'.

Many procedures are included in additional products that are licensed separately from Base SAS. These include modules for statistical analysis, spreadsheets, CBT, presentation graphics, project management, operations research, scheduling, linear programming, quality control, and econometrics, to name only a few.

Overview of SAS Capabilities - Why learn SAS?

SPSS for Windows and MINITAB are both excellent products for anyone who is learning statistics or doing simple data analyses; however, SAS is far better for real data management and analysis tasks. SAS is more comprehensive than many other programs, and the basics are not difficult to learn. It offers very sophisticated tools for data manipulation and for importing and exporting data, along with the most recent developments in statistical procedures. Learning SAS is a good investment of your time in preparation for future data analysis projects.
SAS is especially good when working with longitudinal data (data collected on subjects at two or more points in time). The most efficient way to store longitudinal data collected at well-defined points in time is in multivariate form: one record for each subject, with each value stored under a different variable name (a 'horizontal' layout). Recent additions to the list of procedures (such as PROCs MIXED and GENMOD) are more 'vertically oriented': they expect data stored in univariate format, with a new record for each time period for every subject. With SAS, it is quite simple to convert data from a horizontal to a vertical structure, which lets you store data the same way they were collected and still analyze them with these procedures.

Why learn SAS?

Statistical programs have their strong and weak points; however, the good features of SAS almost always outweigh any weaknesses. SAS performs virtually all standard data analysis tasks. It also has many useful features that leave its competition far behind, including:

* File manipulations, especially appending or merging two or more files
* Working with time series (manipulating data with date and time values)
* Processing survey data
* Mixed models (a combination of fixed and random effects)
* Repeated measures and multivariate analysis
* Categorical data analysis (a type of generalized linear model)
* Data processing tasks requiring programming steps
* Sequential data analysis (a series of 'steps' in which the output from one step is used in the next)
* Processing or manipulating very large data files
* Simulations and bootstrapping

If any of these activities are in your future academic work, SAS is definitely worth knowing!
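
As an illustration of the horizontal-to-vertical conversion described earlier in this section, here is a minimal sketch using a DATA step with an array. It is only one possible approach (PROC TRANSPOSE is another), and the data set names (horizontal, vertical), the variable names (id, score1-score3, time, score), and the data values are all hypothetical, invented for this example.

```sas
/* Hypothetical horizontal (multivariate) data:          */
/* one record per subject, one variable per time point.  */
data horizontal;
  input id score1 score2 score3;
  datalines;
1 10 12 15
2 8 9 11
;
run;

/* Convert to vertical (univariate) form:                */
/* one record per subject per time point.                */
data vertical;
  set horizontal;
  array s{3} score1-score3;   /* group the three measurements   */
  do time = 1 to 3;
    score = s{time};          /* copy the value for this period */
    output;                   /* write one record per period    */
  end;
  keep id time score;         /* drop the original wide columns */
run;

proc print data=vertical noobs;
run;
```

With the two hypothetical subjects above, the vertical data set contains six records (three per subject), each holding id, time, and score. Data in this form can be passed to vertically oriented procedures such as PROC MIXED.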