%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Homework # 9 KEY
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Question #4.50
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% part b
% Read pressure (X1), dipping angle (X2), and the response (Y).
[X1 X2 Y] = textread('Crudeoil.dat', '%f %f %f');
ssize = size(Y,1);
k = 2;
U = ones(ssize,1);
X1_Sq = X1.*X1;
X2_Sq = X2.*X2;
X1X2 = X1.*X2;
% Complete second-order model: solve the normal equations for beta.
X = [U X1 X2 X1_Sq X2_Sq X1X2];
beta = (X'*X)\(X'*Y);
SSE = (Y - X*beta)'*(Y - X*beta);
% hat_y = 26.1933 + 0.0477 x1 + 0.7601 x2 - 0.0069 (x2)^2 + 0.0001 x1 x2
% Plot Y against X1, one line per group of observations i, i+3, i+6.
Z = zeros(3,6);
for i = 1:3
    for j = 1:3
        Z(j,1+2*(i-1)) = Y(i + 3*(j-1));
        Z(j,2+2*(i-1)) = X1(i + 3*(j-1));
    end
    plot(Z(:,2+2*(i-1)), Z(:,1+2*(i-1)), 'k');
    hold on;
end
hold on;
%part c.
% Interaction model: intercept, x1, x2, and x1*x2.
V = [U X1 X2 X1X2];
betanew = (V'*V)\(V'*Y);
%part d
% Fitted interaction model from part c, plotted at dipping angles 0, 15, 30.
y = @(t,s) 54.5 + 0.0077.*t + 0.5541.*s + 0.0001.*t.*s;
% y = @(t,s) 26.1933 + 0.0477.*t + 0.7601.*s - 0.0069.*s.^2 + 0.0001.*t.*s;
t = 1000:500:2000;
plot(t, y(t, 0));
hold on;
plot(t, y(t, 15));
hold on;
plot(t, y(t, 30));
hold off;
%part e
% Global F test for the interaction model.
Ymean = mean(Y);
SSE1 = (Y - V*betanew)'*(Y - V*betanew);
SS_yy = (Y - Ymean)'*(Y - Ymean);
R2 = 1 - SSE1/SS_yy;
n = ssize;
k = 3;
F = (R2/k)/((1-R2)/(n-(k+1)));
alpha = 0.05;
df1 = k;
df2 = n-(k+1);
fcv = finv(1-alpha, df1, df2);
p_value = 1 - fcdf(F, df1, df2);
%part f
% Two-tailed t test on the interaction coefficient beta_3 (betanew(4)).
%format long
invC = inv(V'*V);
s2 = SSE1/(ssize - (k+1));
s = sqrt(s2);
t = betanew(4,1)/(s*sqrt(invC(4,4)));
alpha = 0.05;
df = ssize-(k+1);
tcv = tinv(1-alpha/2, df);
tp_value = 2*(1 - tcdf(abs(t), df));
clear;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Question #4.50
% Part a. hat{y} = beta_0 + beta_1 x1 + beta_2 x2 + beta_3 (x1)^2 + beta_4 (x2)^2
%         + beta_5 x1 x2.
%Part b. Since the lines are fairly parallel, there is no indication that
%        pressure and dipping angle strongly interact. Fitting the complete
%        second-order model in the two independent variables gives
%        hat{y} = 26.1933 + 0.0477 x1 + 0.7601 x2 - 0.0069 (x2)^2 + 0.0001 x1 x2
%Part c. hat{y} = 54.5 + 0.0077 x1 + 0.5541 x2 + 0.0001 x1 x2
%Part d.
%        The plots are very similar. It appears the model will provide an
%        adequate fit.
%Part e. To determine if the model is adequate, we test:
%        H_0: beta_1 = beta_2 = beta_3 = 0
%        H_a: at least one beta_i ~= 0, i = 1, 2, 3
%        Since the observed F statistic is 44.67 and the critical value
%        F(1-alpha, 3, 5) = 5.4095, H_0 is rejected. The model is adequate at
%        alpha = 0.05.
%Part f. To determine if there is interaction between pressure and dipping angle, we test:
%        H_0: beta_3 = 0 vs H_a: beta_3 ~= 0
%        Since the observed t statistic is 0.67772 and the critical t value
%        t(1-alpha/2, 5) = 2.57, H_0 is not rejected. There is insufficient
%        evidence of interaction between pressure and dipping angle
%        at alpha = 0.05.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Question #4.70
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Question #4.70
%Part a. If a variable is measured on a numerical scale, it is a
%        quantitative variable. So quantitative GMAT score, verbal GMAT score,
%        undergraduate GPA, and first-year graduate GPA are quantitative variables.
%        Student cohort is a qualitative variable with 3 categories.
%Part b. The quantitative GMAT score, verbal GMAT score, undergraduate GPA, and first-year
%        graduate GPA should all be positively correlated with the final GPA.
%Part c. Define x5 = 1 if the student entered the doctoral program in 1990
%        and 0 otherwise, and x6 = 1 if the student entered the doctoral
%        program in 1992 and 0 otherwise.
%Part d. E(y) = beta_0 + beta_1 x1 + beta_2 x2 + beta_3 x3 + beta_4 x4 +
%        beta_5 x5 + beta_6 x6.
%Part e.
%        beta_0 : the y-intercept for students entering in 1988.
%        beta_1 : y (the final GPA) increases by beta_1 for each additional
%                 unit of quantitative GMAT score, other variables held fixed.
%        beta_2 : y (the final GPA) increases by beta_2 for each additional
%                 unit of verbal GMAT score.
%        beta_3 : y (the final GPA) increases by beta_3 for each additional
%                 undergraduate GPA point.
%        beta_4 : y (the final GPA) increases by beta_4 for each additional
%                 first-year graduate GPA point.
%        beta_5 : difference in mean final GPA between the 1990 and 1988
%                 student cohorts.
%        beta_6 : difference in mean final GPA between the 1992 and 1988
%                 student cohorts.
% Part f. E(y) = beta_0 + beta_1 x1 + beta_2 x2 + beta_3 x3 + beta_4 x4 +
%         beta_5 x5 + beta_6 x6 + beta_7 x1 x5 + beta_8 x1 x6 + beta_9 x2 x5 +
%         beta_10 x2 x6 + beta_11 x3 x5 + beta_12 x3 x6 + beta_13 x4 x5 +
%         beta_14 x4 x6.
% Part g. For the 1988 cohort, x5 = x6 = 0, so the model is
%         E(y) = beta_0 + beta_1 x1 + beta_2 x2 + beta_3 x3 + beta_4 x4.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #4.72
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%part a
% Fitted quadratic prediction equation.
y = @(x) -2.51 + 0.55.*x - 0.01.*x.^2;
x = 10:32;
%plot(x,y(x),'k');
% part c
% Global F test computed from R^2.
R2 = 0.67;
n = 33;
k = 2;
F = (R2/k)/((1-R2)/(n-(k+1)));
alpha = 0.05;
df1 = k;
df2 = n-(k+1);
fcv = finv(1-alpha, df1, df2);
p_value = 1 - fcdf(F, df1, df2);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #4.72
%Part a. The graph of the prediction equation is a parabola opening downward.
%Part b. R_Sq = 0.67. 67% of the sample variation in daily growth rates is
%        explained by the quadratic model.
% Part c. To determine if the model is useful, we test:
%         H_0: beta_1 = beta_2 = 0
%         H_a: either beta_1 or beta_2 is not zero
%         Since the observed F test statistic is 30.4545 and the critical value
%         F(1-alpha, 2, 30) = 3.3158, H_0 is rejected and the model is useful at
%         alpha = 0.05.
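% A minimal sketch (not part of the original key) that re-checks two of the
% Question 4.72 answers with base MATLAB only, so no toolbox is needed.
% The names a, b, c, xv, and Fcheck are illustrative assumptions.
a = -0.01; b = 0.55; c = -2.51;       % coefficients of -2.51 + 0.55 x - 0.01 x^2
xv = -b/(2*a);                        % x-coordinate of the vertex; a < 0, so it is a maximum
R2 = 0.67; n = 33; k = 2;             % values quoted in Parts b and c
Fcheck = (R2/k)/((1-R2)/(n-(k+1)));   % global F statistic computed from R^2
fprintf('vertex at x = %.1f, F = %.4f\n', xv, Fcheck);
% The printed F can be compared with the 30.4545 quoted in Part c.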
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #4.76
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%part a
%For accountants
R2 = 0.114;
n = 169;
k = 2;
F = (R2/k)/((1-R2)/(n-(k+1)));
alpha = 0.05;
df1 = k;
df2 = n-(k+1);
fcv = finv(1-alpha, df1, df2);
p_value = 1 - fcdf(F, df1, df2);
clear;
%For truck drivers
R2 = 0.298;
n = 107;
k = 2;
F = (R2/k)/((1-R2)/(n-(k+1)));
alpha = 0.05;
df1 = k;
df2 = n-(k+1);
fcv = finv(1-alpha, df1, df2);
p_value = 1 - fcdf(F, df1, df2);
clear;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #4.76
%Part a. For accountants, to determine if the model is adequate, we test:
%        H_0: beta_1 = beta_2 = 0
%        H_a: at least one beta_i ~= 0, i = 1, 2
%        Since the observed F statistic is 10.679 and the critical value
%        F(1-alpha, 2, 166) = 3.05, H_0 is rejected. The model is adequate at
%        alpha = 0.05.
%        For truck drivers, to determine if the model is adequate, we test:
%        H_0: beta_1 = beta_2 = 0
%        H_a: at least one beta_i ~= 0, i = 1, 2
%        Since the observed F statistic is 22.07 and the critical value
%        F(1-alpha, 2, 104) = 3.08, H_0 is rejected. The model is adequate at
%        alpha = 0.05.
%Part b. For accountants, beta_1 = -1.40 is a shift parameter that moves the
%        parabola along the x-axis. It has no practical meaning on its own.
%        beta_2 = 1.13 is positive, which implies the curve opens upward,
%        so the mean turnover rate is high at both low and high
%        performance levels.
%        For truck drivers, beta_1 = -1.50 is a shift parameter that moves the
%        parabola along the x-axis. It has no practical meaning on its own.
%        beta_2 = 1.22 is positive, which implies the curve opens upward,
%        so the mean turnover rate is high at both low and high
%        performance levels.
%Part c. For accountants, to determine if there is evidence of upward curvature in the relation between
%        turnover and performance, we test:
%        H_0: beta_2 = 0 vs H_a: beta_2 > 0
%        Since the observed t statistic is 3.23 and the one-tailed critical value
%        t(1-alpha, 166) is approximately 1.65, H_0 is rejected.
%        There is sufficient evidence of upward curvature in the relation between
%        turnover and performance at alpha = 0.05.
%Part d. For truck drivers, to determine if there is evidence of upward curvature in the relation between
%        turnover and performance, we test:
%        H_0: beta_2 = 0 vs H_a: beta_2 > 0
%        Since the observed t statistic is 4.70 and the one-tailed critical value
%        t(1-alpha, 104) = 1.66, H_0 is rejected. There is sufficient
%        evidence of upward curvature in the relation between
%        turnover and performance at alpha = 0.05.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #4.80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%part b
R2 = 0.65;
n = 30;
k = 3;
F = (R2/k)/((1-R2)/(n-(k+1)));
alpha = 0.05;
df1 = k;
df2 = n-(k+1);
fcv = finv(1-alpha, df1, df2);
p_value = 1 - fcdf(F, df1, df2);
clear;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #4.80
%Part a. beta_0 = -105. This does not have a practical meaning.
%        beta_1 = 25. We estimate the difference between weekend and weekday mean
%        daily admissions to be 25, holding weather condition and
%        temperature constant.
%        beta_2 = 100. We estimate the difference between sunny and overcast mean
%        daily admissions to be 100, holding the day of the week and
%        temperature constant.
%        beta_3 = 10. We estimate mean daily admissions to increase by 10 for
%        every 1 degree increase in temperature, holding the day of the week
%        and weather condition constant.
%Part b. To determine if the model is adequate, we test:
%        H_0: beta_1 = beta_2 = beta_3 = 0
%        H_a: at least one beta_i ~= 0, i = 1, 2, 3
%        Since the observed F statistic is 16.095 and the critical value
%        F(1-alpha, 3, 26) = 2.98, H_0 is rejected. The model is adequate at
%        alpha = 0.05.
%Part c. To determine if the mean attendance increases on weekends, we test:
%        H_0: beta_1 = 0 vs H_a: beta_1 > 0
%        Since the observed t statistic is 2.50 and the one-tailed critical value
%        t(1-alpha, 26) = 1.706, H_0 is rejected.
%        There is sufficient evidence that the mean attendance increases on
%        weekends at alpha = 0.05.
%Part d. hat{y} = -105 + 25(0) + 100(1) + 10(95) = 945
%Part e. We are 90% confident that the daily attendance on a sunny weekday with a
%        predicted daily high of 95 degrees F will fall in the interval 645 to 1245.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #5.6
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #5.6
%Part a. i.   First order
%        ii.  Third order
%        iii. First order
%        iv.  Second order
%Part b. i.   E(y) = beta_0 + beta_1 x
%        ii.  E(y) = beta_0 + beta_1 x + beta_2 x^2 + beta_3 x^3
%        iii. E(y) = beta_0 + beta_1 x
%        iv.  E(y) = beta_0 + beta_1 x + beta_2 x^2
%Part c. i.   beta_1 > 0
%        ii.  beta_3 > 0
%        iii. beta_1 < 0
%        iv.  beta_2 < 0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #5.8
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Scatterplot of the tire data.
[X Y] = textread('Tires2.dat', '%f %f');
ssize = size(X,1)
plot(X,Y, 'x');
hold off;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #5.8
%Part a. See the graph.
%Part b.
%        For x = 30,31,32,33 only, we would suggest E(y) = beta_0 + beta_1 x
%        For x = 33,34,35,36 only, we would suggest E(y) = beta_0 + beta_1 x
%        For all data, we would suggest E(y) = beta_0 + beta_1 x +
%        beta_2 x^2
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Question #5.16
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PART A:
x = [30; 31; 32; 33; 34; 35; 36];
y = [29; 32; 36; 38; 37; 33; 26];
xbar = mean(x)
sx = sqrt(var(x))
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% xbar = 33
% sx = 2.1602
% This gives us: u = (x-33)/(2.1602)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PART B:
u = (x-xbar)/sx
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% u =
%    -1.3887
%    -0.9258
%    -0.4629
%     0
%     0.4629
%     0.9258
%     1.3887
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PART C:
% Correlation between x and x^2 before coding.
x2 = x.^2;
w = x2;
wbar = mean(w);
SSxw = sum((x-xbar).*(w-wbar));
SSxx = sum((x-xbar).^2);
SSww = sum((w-wbar).^2);
r = SSxw/(sqrt(SSxx*SSww))
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% r = 0.9997 (before coding)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PART D:
% Correlation between u and u^2 after coding.
ubar = mean(u);
u2 = u.^2;
z = u2
zbar = mean(z);
SSuz = sum((u-ubar).*(z-zbar));
SSuu = sum((u-ubar).^2);
SSzz = sum((z-zbar).^2);
r = SSuz/(sqrt(SSuu*SSzz))
%%%%%%%%%%%%%%%%%%%%%%
% r = 0 (after coding)
%%%%%%%%%%%%%%%%%%%%%%
% PART E:
% Least-squares fit of the quadratic model in the coded variable u.
one = ones(7,1);
U = [one, u, u2];
Uty = U'*y;
UtU = U'*U;
BETA = UtU\Uty
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% BETA =
%    37.5714
%    -0.4629
%    -5.3333
%
% Which gives us the model:
% E(y) = 37.5714 - 0.4629*u - 5.3333*u^2
% This will be a parabola opening downward.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear;
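% A minimal, self-contained sketch (not part of the original key) that
% re-derives the Question 5.16 results with built-in functions instead of
% explicit sums. The names xx, yy, uu, Rb, Ra, r_before, r_after, and
% b_check are illustrative assumptions; the data repeat the values from Part A.
xx = (30:36)';
yy = [29; 32; 36; 38; 37; 33; 26];
uu = (xx - mean(xx))/std(xx);            % coded variable u = (x - xbar)/sx
Rb = corrcoef(xx, xx.^2);                % 2-by-2 correlation matrix of x and x^2
r_before = Rb(1,2);
Ra = corrcoef(uu, uu.^2);                % same for u and u^2
r_after = Ra(1,2);
b_check = [ones(7,1), uu, uu.^2] \ yy;   % least-squares fit of the coded quadratic model
fprintf('r before coding = %.4f\n', r_before);
disp(b_check');
% Because x is equally spaced and symmetric about its mean, u and u^2 are
% uncorrelated, so r_after should be zero up to rounding error.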