Answers to Practice Questions for Final:
(annotated with side comments in places)
Part III: Data Analysis
Problem 1: Popularity of candidates
a. Do the candidates differ in popularity?
b. Chi-square goodness of fit
c. The data is categorical (hence chi-square). We are looking to see if candidates DIFFER in popularity (hence goodness of fit)
d. Ha: Candidates differ in popularity: not all candidates are equally popular
Ho: All candidates are equally popular
e. Step 1: completed in d.
Step 2: sampling distribution will be Chi-square with C-1 df. 3-1 =2, so Chi-square (2).
Step 3: find cutoff. Select alpha (.05), critical value will be 5.992
[Note: if .01 chosen, cutoff will be 9.211. Be sure you look at the right column and row in Table A-4, depending on your significance level and df. You will be marked off if you don't indicate what your alpha is.]
Step 4: calculate statistic. Need the "expected" numbers if candidates are equally preferred, which will be the probability for each cell (100%/3 = 33.3%, or .333) times the total number of people surveyed (N = 100) = 33.3 for each cell. Then write down the formula for chi-square [see p. 235 in your book] and follow out the calculations: (43-33.3)^2/33.3 + (45-33.3)^2/33.3 + (12-33.3)^2/33.3 =
2.83 + 4.11 + 13.62 = 20.56.
[Note: I probably shouldn't have chosen n=100 for this problem, because in this particular case, the probability (as a percentage) equals the predicted number. If n = 200, putting 33.3 in the boxes would NOT work, because when you multiplied the probability times the N, you'd get 66.6. Make sure that FREQUENCIES (counts), not probabilities, are what appear in both your "observed" and your "expected" boxes.]
Step 5. Decide about Ho. 20.56 > 5.992 [if you chose .01, also > 9.211], so reject Ho.
Answer: Candidates are not all equally popular. Looking at the counts, it appears that Joe Bloggs is trailing the other two.
Note: When you find a result, look back at your observed and expected counts and make a comment about what looks like the main departure.
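Note: The Problem 1 arithmetic can be checked with a short script. What follows is only a minimal Python sketch, not part of the book's method: the variable names are made up for illustration, and the cutoffs are simply the values quoted in Step 3 rather than computed. Because the script keeps the expected count unrounded (33.33... instead of 33.3), it gives 20.54 rather than the hand-calculated 20.56; the decision is the same.

```python
# Observed counts from Step 4 of Problem 1 (the 12 is the trailing candidate).
observed = [43, 45, 12]

n = sum(observed)            # total number of people surveyed: 100
k = len(observed)            # number of categories (candidates): 3

# Under Ho (equal popularity) each cell's expected FREQUENCY is N * (1/k).
expected = [n / k] * k

# Chi-square statistic: sum over cells of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

# Cutoffs copied from Step 3 of the answer key (assumed, not computed here).
critical_05 = 5.992
critical_01 = 9.211

print(f"chi-square({df}) = {chi_square:.2f}")
print("reject Ho at .05:", chi_square > critical_05)
print("reject Ho at .01:", chi_square > critical_01)
```

Changing the observed list to counts that total 200 shows the point of the note in Step 4: the expected frequency becomes 66.67 per cell, not 33.3.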
Problem 2: Gender gap
a. Does the preference for Gore versus Bradley depend on gender?
b. Chi-square for independence
c. We are looking for a possible relation (association) between 2 categorical variables: gender and candidate
d. Ha: Preference for candidate will depend on gender
Ho: Preference for candidate is independent of gender
e. Step 1: completed in d.
Step 2: sampling distribution will be Chi-square with (C-1)(R-1) df. (2-1)(2-1) =1, so Chi-square (1).
Step 3: find cutoff. Select alpha (.05), critical value will be 3.841
[Note: if .01 chosen, cutoff will be 6.635. Be sure you look at the right column and row in Table A-4, depending on your significance level and df. You will be marked off if you don't indicate what your alpha is.]
Step 4: calculate statistic. Need the "expected" numbers if the variables are independent. Use E = (RC)/N to find the expected number for each cell. R = row total; C = column total; N is the total number of all people/objects ... in this case people. Write in the row and column totals for the "observed" matrix you already have. This will be 100 / 100 for men, women (the row totals), and 105 / 95 for Gore and Bradley (the column totals). The overall N is 200. [Note: you can check your math by making sure the two row totals and the two column totals both add up to the same N]
For men/Gore, the expected cell value will be (100)(105) / 200 = 52.5. Note: Because you know the row and column totals, you can figure all the other expected values once you have one. You can double check your math in figuring the expected frequencies by checking to ensure that you get the same row and column totals as you did for the "observed" matrix. The expected values will look like this:
|(Expected)||Gore||Bradley||(Row total)|
|Men||52.5||47.5||(adds to 100)|
|Women||52.5||47.5||(adds to 100)|
|(Adds to)||105||95||(Grand total is 200 both ways)|
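Note: The expected table can also be built mechanically from E = (RC)/N. Here is a minimal Python sketch of that step, offered only as a check on the hand work. The observed counts are the ones that appear in the Step 4 calculation (45 and 55 for men, 60 and 40 for women); the dictionary layout and variable names are assumptions chosen for illustration.

```python
import math

# Observed counts (rows: gender, columns: candidate), as used in Step 4.
observed = {
    "Men":   {"Gore": 45, "Bradley": 55},
    "Women": {"Gore": 60, "Bradley": 40},
}

# Row totals (R), column totals (C), and grand total (N).
row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for col, count in cells.items():
        col_totals[col] = col_totals.get(col, 0) + count
n = sum(row_totals.values())

# Expected frequency for each cell under independence: E = (R * C) / N.
expected = {
    row: {col: row_totals[row] * col_totals[col] / n for col in col_totals}
    for row in row_totals
}

# Double check: the expected table must reproduce the observed totals.
for row in row_totals:
    assert math.isclose(sum(expected[row].values()), row_totals[row])
for col in col_totals:
    assert math.isclose(sum(expected[r][col] for r in expected), col_totals[col])

for row, cells in expected.items():
    print(row, cells)
```

The two loops of assertions are the same double check described in the note: row totals and column totals of the expected table have to match those of the observed table.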
Write down the formula for chi-square [see p. 235 in your book] and follow out the calculations: (45-52.5)^2/52.5 + (55-47.5)^2/47.5 + (60-52.5)^2/52.5 + (40-47.5)^2/47.5 =
1.07 + 1.18 + 1.07 + 1.18 = 4.5.
Step 5. Decide about Ho. **If you used .05**, 4.5 > 3.841 so reject Ho.
**If you used .01**, 4.5 < 6.635, so retain Ho.
Answer: [.05] Gender and preference for candidate are related, at the .05 significance level. Looking at the observed cell counts, it looks like women lean more toward Gore, while men lean more toward Bradley.
[.01] Although it looks like there may be a tendency for women to lean toward Gore, and men toward Bradley, this is not significant at the .01 level.
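Note: Steps 4 and 5 of Problem 2 can be checked the same way. The following is a minimal Python sketch under stated assumptions: the names are illustrative, the observed counts are those used in the Step 4 calculation, and the cutoffs are copied from Step 3 rather than computed. Without intermediate rounding the statistic comes out to about 4.51 instead of 4.5, which does not change either decision.

```python
# Observed counts: rows are men, women; columns are Gore, Bradley.
observed = [
    [45, 55],
    [60, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square for independence: sum of (O - E)^2 / E with E = (R * C) / N.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_square += (o - e) ** 2 / e

# Degrees of freedom: (R - 1)(C - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Cutoffs for df = 1 as quoted in Step 3 (assumed, not computed here).
cutoffs = {0.05: 3.841, 0.01: 6.635}

decisions = {}
for alpha, cutoff in cutoffs.items():
    decisions[alpha] = "reject Ho" if chi_square > cutoff else "retain Ho"
    print(f"alpha = {alpha}: chi-square({df}) = {chi_square:.2f} "
          f"vs cutoff {cutoff} -> {decisions[alpha]}")
```

The script makes the point of Step 5 concrete: the same statistic leads to different decisions depending on the alpha you chose, which is why you have to state your alpha.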
Note: The focus here is on using what you know to "read" results as they might be presented in a research article. What you are given are two separate correlation matrices.
1. Lots of possibilities. Here are some:
For gambling boys, there is a significant positive relationship between fighting and three other variables: alcohol/drug use, theft rates, and anxiety ratings by their mothers. For non-gambling boys, none of these are significantly correlated, and the obtained correlations are either close to zero or slightly negative (for alcohol/drug use). [This would count as three differences]
For gambling boys, theft rates and anxiety are significantly and positively correlated. For non-gambling boys, the correlation is negative (although not significantly so). [A fourth difference]
2.a. Using information about fighting, we could make predictions about alcohol/drug use and theft: we would predict that boys who fight more are more likely to show the other anti-social behaviors. HOWEVER,
b. We can only make these predictions for boys who gamble. For non-gambling boys, no significant relationship is apparent between fighting and the other two types of anti-social behavior, and thus we can't make predictions for them if all we know about is fighting.