Two examples of excellent summaries for Homework 1

Summary 1

Descriptive statistics and frequency distributions were first examined to determine whether there were any problems with the raw data. Three anomalies were found. These anomalies were corrected based on a logical analysis of the constructs involved and the other data for that subject. After the anomalies were corrected, descriptive statistics and distribution-related plots were obtained. These analyses were used to examine modality, symmetry, and the presence or absence of outliers.

The four attribute-related variables (adjext, adjint, iamext, iamint) demonstrated extreme positive skewness. A large number of zeros were present in the data, which likely produced the positive skewness. Outliers were also present in some of the variables and further influenced the shape of the distributions. Transformations were then performed on these variables in an attempt to improve the shape of the distributions. Results were reported for both the negative inverse function and the square root function in an effort to improve normality. Although these transformations improved the skewness statistic, they did not substantially improve the actual shape of the distributions.
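The effect described in this paragraph can be illustrated with a minimal sketch. The data below are hypothetical counts, not the homework data, and the +1 offset inside the negative inverse is an assumption added so that the transformation is defined for zero counts. The skewness statistic used here is the simple moment-based version, which may differ slightly from the adjusted statistic a package such as SPSS reports.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical attribute counts with many zeros and a long right tail.
counts = [0, 0, 0, 0, 0, 1, 1, 2, 3, 9]

sqrt_counts = [math.sqrt(v) for v in counts]
# Assumption: add 1 before inverting so that zero counts do not divide by zero.
neg_inverse_counts = [-1.0 / (v + 1) for v in counts]

for label, data in [
    ("raw", counts),
    ("square root", sqrt_counts),
    ("negative inverse", neg_inverse_counts),
]:
    print(f"{label}: skewness = {skewness(data):.2f}")
```

In this toy example the skewness statistic shrinks under both transformations, yet half of the values remain tied at the minimum, so the pile-up at the low end of the distribution is still there.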

The other variables (extra, reser, livel, shy, talka, intro, outgo, quiet) in the data set appeared to have more normal and symmetric distributions based on the descriptive statistics and distribution plots. There did not appear to be any substantial skewness associated with any of these variables. Outliers were present that would affect individual distributions. Correlation matrices were obtained for the introverted variables and for the extroverted variables to determine the relationships among the related variables. Significant correlations were found among all the extroverted variables and among all the introverted variables. Two aggregate variables (combined introverted and combined extroverted) were created to reduce the data. Mean scores were used to compute these variables.
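The aggregation step can be sketched as follows. The variable names shy, quiet, and reser come from the data set, but the ratings are hypothetical and only three of the introverted variables are shown. The sketch checks the pairwise Pearson correlations first and then averages the items for each subject to form the combined score.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical ratings for five subjects (one list position per subject).
ratings = {
    "shy":   [1, 4, 2, 5, 3],
    "quiet": [2, 5, 2, 4, 3],
    "reser": [1, 5, 3, 4, 2],
}

# Pairwise correlations among the related items.
names = list(ratings)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson(ratings[first], ratings[second])
        print(f"r({first}, {second}) = {r:.2f}")

# Aggregate variable: the mean of the items for each subject.
combined_introverted = [sum(row) / len(row) for row in zip(*ratings.values())]
print(combined_introverted)
```

Averaging keeps the aggregate on the same scale as the original items, which makes the combined score easy to compare with the individual variables.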

The descriptive statistics and distributions of the two combined variables were then examined. The combined variables appear to have more normal and symmetric distributions than the individual variables. Also, outliers were not present in the combined distributions. Overall, the transformations and aggregations improved the symmetry of the data. Although the square root and negative inverse transformations improved the skewness statistic, I would be skeptical of how much they actually improved the data. The aggregation of the variables provided much better results, and I would definitely recommend using the aggregated variables in future analyses.

Summary 2

Prior to analysis, number of extraverted attributes checked, number of introverted attributes checked, number of extraverted attributes generated, number of introverted attributes generated, extraverted unipolar scale, reserved unipolar scale, lively unipolar scale, shy unipolar scale, talkative unipolar scale, introverted unipolar scale, outgoing unipolar scale, and quiet unipolar scale were examined through various SPSS programs for accuracy of data entry, missing values, normality, and outliers. The variables were examined for the 159 subjects.

Missing values were not altered for the exploratory data analyses. Data entry errors were corrected based on the information provided about the data set. Distributions for continuous variables were examined through normality plots, detrended plots, and boxplots. These plots indicated that some of the variables were moderately skewed; however, none of the distributions appeared to be dramatically skewed. The boxplots also indicated the presence of outliers, but none of these were true extreme outliers (more than 3 interquartile ranges beyond the edges of the box). Transformations were applied to skewed distributions using the moderate corrections of squaring the variable and taking the square root of the variable. For the two variables examined, neither transformation produced a distribution that approximated normality better than the original distribution.
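The distinction between ordinary and extreme outliers can be sketched with the usual boxplot fences. This is an illustration under stated assumptions, not the SPSS procedure itself: distances are measured from the edges of the box (the first and third quartiles), values more than 1.5 interquartile ranges out are flagged as outliers, values more than 3 out are flagged as extreme, and the quartiles come from Python's `statistics.quantiles`, which may differ slightly from the hinges SPSS uses. The scores are hypothetical.

```python
import statistics

def classify_outliers(values):
    """Split values into mild and extreme boxplot outliers using IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    mild, extreme = [], []
    for v in values:
        # Distance beyond the nearer edge of the box (negative if inside the box).
        distance = max(q1 - v, v - q3)
        if distance > 3 * iqr:
            extreme.append(v)
        elif distance > 1.5 * iqr:
            mild.append(v)
    return mild, extreme

# Hypothetical scale scores for eleven subjects.
scores = [3, 4, 4, 5, 5, 5, 6, 6, 7, 11, 20]
mild, extreme = classify_outliers(scores)
print("outliers:", mild)
print("extreme outliers:", extreme)
```

A variable whose boxplot shows only values in the first list has outliers but no extreme ones, which is the situation this summary reports.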

Crosstabulations and bivariate scatterplots were used to inspect relationships between variables. Bivariate scatterplots indicated the possibility of significant and strong correlations between continuous variables. Crosstabulation matrices indicated that introvert and extravert categorical variables appeared to be measuring different personality characteristics.

Computing a correlation matrix for all variables supported the possible correlations observed in the bivariate scatterplots. Based on high correlations and semantic similarity, two aggregate variables were produced by combining the reserved and quiet variables and the talkative and outgoing variables. In doing so, the resulting aggregate variables eliminated the problems with correlations between variables and also remedied the moderately skewed distributions of the original variables.