Statistics for Business & Economics, 6 th edition
Chapter 13: Multiple Regression
13.1 Given the following estimated linear model: $\hat{y} = 10 + 3x_1 + 2x_2 + 4x_3$
a. $\hat{y} = 10 + 3(20) + 2(11) + 4(10) = 132$
b. $\hat{y} = 10 + 3(15) + 2(14) + 4(20) = 163$
c. $\hat{y} = 10 + 3(35) + 2(19) + 4(25) = 253$
d. $\hat{y} = 10 + 3(10) + 2(17) + 4(30) = 194$
13.2 Given the following estimated linear model: $\hat{y} = 10 + 5x_1 + 4x_2 + 2x_3$
a. $\hat{y} = 10 + 5(20) + 4(11) + 2(10) = 174$
b. $\hat{y} = 10 + 5(15) + 4(14) + 2(20) = 181$
c. $\hat{y} = 10 + 5(35) + 4(19) + 2(25) = 311$
d. $\hat{y} = 10 + 5(10) + 4(17) + 2(30) = 188$
13.3 Given the following estimated linear model: $\hat{y} = 10 + 2x_1 + 12x_2 + 8x_3$
a. $\hat{y} = 10 + 2(20) + 12(11) + 8(10) = 262$
b. $\hat{y} = 10 + 2(15) + 12(24) + 8(20) = 488$
c. $\hat{y} = 10 + 2(20) + 12(19) + 8(25) = 478$
d. $\hat{y} = 10 + 2(10) + 12(9) + 8(30) = 378$
13.4 Given the following estimated linear model: $\hat{y} = 10 + 2x_1 + 12x_2 + 8x_3$
a. $\hat{y}$ increases by 8
b. $\hat{y}$ increases by 8
c. $\hat{y}$ increases by 24
d. $\hat{y}$ increases by 8
13.5 Given the following estimated linear model: $\hat{y} = 10 - 2x_1 - 14x_2 + 6x_3$
a. $\hat{y}$ decreases by 8
b. $\hat{y}$ decreases by 6
c. $\hat{y}$ increases by 28
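The substitutions in 13.1 can be checked mechanically. The following is a minimal sketch, not part of the original solutions; the helper name `predict` is hypothetical, and the input values are those shown in parts a through d of 13.1.

```python
def predict(intercept, slopes, xs):
    """Evaluate y-hat = b0 + b1*x1 + ... + bK*xK for one observation."""
    return intercept + sum(b * x for b, x in zip(slopes, xs))

# Exercise 13.1: y-hat = 10 + 3*x1 + 2*x2 + 4*x3
slopes_13_1 = (3, 2, 4)
cases = [(20, 11, 10), (15, 14, 20), (35, 19, 25), (10, 17, 30)]
predictions = [predict(10, slopes_13_1, xs) for xs in cases]

for label, value in zip("abcd", predictions):
    print(f"{label}. y-hat = {value}")
```

Swapping in another intercept and slope tuple covers the other prediction exercises in the same way.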
13.6
The estimated regression slope coefficients are interpreted as follows:
$b_1 = .661$: All else equal, an increase in the plane's top speed by one mph will increase the expected number of hours in the design effort by an estimated .661 million, or 661 thousand, worker-hours.
$b_2 = .065$: All else equal, an increase in the plane's weight by one ton will increase the expected number of hours in the design effort by an estimated .065 million, or 65 thousand, worker-hours.
$b_3 = -.018$: All else equal, an increase in the percentage of parts in common with other models will result in a decrease in the expected number of hours in the design effort by an estimated .018 million, or 18 thousand, worker-hours.
13.7
The estimated regression slope coefficients are interpreted as follows:
$b_1 = .07$: All else equal, an increase of one unit in the change over the quarter in bond purchases by financial institutions results in an estimated .07 increase in the change over the quarter in the bond interest rate.
$b_2 = -.06$: All else equal, an increase of one unit in the change over the quarter in bond sales by financial institutions results in an estimated .06 decrease in the change over the quarter in the bond interest rate.
13.8
a. $b_1 = .052$: All else equal, an increase of one hundred dollars in weekly income results in an estimated .052 quarts per week increase in milk consumption. $b_2 = 1.14$: All else equal, an increase in family size by one person will result in an estimated increase in milk consumption of 1.14 quarts per week.
b. The intercept term $b_0$ of -.02 is the estimated milk consumption, in quarts per week, given that the family's weekly income is 0 dollars and there are 0 members in the family. This is likely extrapolating beyond the observed data series and is not a useful interpretation.
13.9
a. $b_1 = .653$: All else equal, a one unit increase in the average number of meals eaten per week will result in an estimated .653 pounds gained during freshman year.
$b_2 = -1.345$: All else equal, a one unit increase in the average number of hours of exercise per week will result in an estimated 1.345 pound weight loss.
$b_3 = .613$: All else equal, a one unit increase in the average number of beers consumed per week will result in an estimated .613 pound weight gain.
b. The intercept term $b_0$ of 7.3 is the estimated amount of weight gain during the freshman year given that the meals eaten is 0, hours of exercise is 0 and there are no beers consumed per week. This is likely extrapolating beyond the observed data series and is not a useful interpretation.
13.10 Compute the slope coefficients for the model: $\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i}$
Given that
$b_1 = \dfrac{s_y (r_{x_1 y} - r_{x_1 x_2} r_{x_2 y})}{s_{x_1}(1 - r^2_{x_1 x_2})}$, $b_2 = \dfrac{s_y (r_{x_2 y} - r_{x_1 x_2} r_{x_1 y})}{s_{x_2}(1 - r^2_{x_1 x_2})}$
a. $b_1 = \dfrac{400(.60 - (.50)(.70))}{200(1 - .50^2)} = .667$, $b_2 = \dfrac{400(.70 - (.50)(.60))}{200(1 - .50^2)} = 1.067$
b. $b_1 = \dfrac{400(-.60 - (-.50)(.70))}{200(1 - (-.50)^2)} = -.667$, $b_2 = \dfrac{400(.70 - (-.50)(-.60))}{200(1 - (-.50)^2)} = 1.067$
c. $b_1 = \dfrac{400(.40 - (.80)(.45))}{200(1 - (.80)^2)} = .222$, $b_2 = \dfrac{400(.45 - (.80)(.40))}{200(1 - (.80)^2)} = .722$
d. $b_1 = \dfrac{400(.60 - (-.60)(-.50))}{200(1 - (-.60)^2)} = .9375$, $b_2 = \dfrac{400(-.50 - (-.60)(.60))}{200(1 - (-.60)^2)} = -.4375$
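The two slope formulas used in 13.10 translate directly into code. The sketch below is illustrative only: the function name `slope_coefficients` is hypothetical, and it assumes $s_y = 400$ and $s_{x_1} = s_{x_2} = 200$, as the fractions in the worked answers suggest.

```python
def slope_coefficients(s_y, s_x1, s_x2, r_x1y, r_x2y, r_x1x2):
    """Slopes of y-hat = b0 + b1*x1 + b2*x2 from sample standard
    deviations and pairwise sample correlations."""
    shared = 1 - r_x1x2 ** 2  # goes to 0 as x1 and x2 become perfectly correlated
    b1 = s_y * (r_x1y - r_x1x2 * r_x2y) / (s_x1 * shared)
    b2 = s_y * (r_x2y - r_x1x2 * r_x1y) / (s_x2 * shared)
    return b1, b2

# Part b: r_x1y = -.60, r_x2y = .70, r_x1x2 = -.50
b1_b, b2_b = slope_coefficients(400, 200, 200, -0.60, 0.70, -0.50)
# Part d: r_x1y = .60, r_x2y = -.50, r_x1x2 = -.60
b1_d, b2_d = slope_coefficients(400, 200, 200, 0.60, -0.50, -0.60)

print(round(b1_b, 3), round(b2_b, 3))
print(round(b1_d, 4), round(b2_d, 4))
```

Setting `r_x1x2` to 0 reduces `b1` to `s_y * r_x1y / s_x1`, the bivariate slope, while a value of 1 or -1 raises `ZeroDivisionError`, which mirrors the undefined coefficient under perfect correlation.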
13.11
a. When the correlation between $X_1$ and $X_2$ is 0, the slope coefficient of the $X_1$ term simplifies to the slope coefficient of the bivariate regression. Start with equation 13.4: $b_1 = \dfrac{s_y (r_{x_1 y} - r_{x_1 x_2} r_{x_2 y})}{s_{x_1}(1 - r^2_{x_1 x_2})}$. Note that if the correlation between $X_1$ and $X_2$ is zero, then the second terms in both the numerator and denominator are zero and the formula algebraically reduces to $b_1 = \dfrac{s_y r_{x_1 y}}{s_{x_1}}$, which is the equivalent of the bivariate slope coefficient (see box on bottom of page 380).
b. When the correlation between $X_1$ and $X_2$ is 1, the factor $(1 - r^2_{x_1 x_2})$ in the denominator goes to 0 and the slope coefficient is undefined.
13.12
a. Electricity sales as a function of number of customers and price

Regression Analysis: salesmw2 versus priclec2, numcust2
The regression equation is
salesmw2 = - 647363 + 19895 priclec2 + 2.35 numcust2

Predictor        Coef   SE Coef       T      P
Constant      -647363    291734   -2.22  0.030
priclec2        19895     22515    0.88  0.380
numcust2       2.3530    0.2233   10.54  0.000

S = 66399   R-Sq = 79.2%   R-Sq(adj) = 78.5%

Analysis of Variance
Source          DF           SS           MS       F      P
Regression       2  1.02480E+12  5.12400E+11  116.22  0.000
Residual Error  61  2.68939E+11   4408828732
Total           63  1.29374E+12
All else equal, for every one unit increase in the price of electricity, we estimate that sales will increase by 19895 mwh. Note that this estimated coefficient is not significantly different from zero (p-value = .380). All else equal, for every additional residential customer who uses electricity in the heating of their home, we estimate that sales will increase by 2.353 mwh.
b. Electricity sales as a function of number of customers

Regression Analysis: salesmw2 versus numcust2
The regression equation is
salesmw2 = - 410202 + 2.20 numcust2

Predictor        Coef   SE Coef       T      P
Constant      -410202    114132   -3.59  0.001
numcust2       2.2027    0.1445   15.25  0.000

S = 66282   R-Sq = 78.9%   R-Sq(adj) = 78.6%

Analysis of Variance
Source          DF           SS           MS       F      P
Regression       1  1.02136E+12  1.02136E+12  232.48  0.000
Residual Error  62  2.72381E+11   4393240914
Total           63  1.29374E+12
An additional residential customer will add 2.2027 mwh to electricity sales. The two models have roughly equivalent explanatory power; therefore, adding price as a variable does not add a significant amount of explanatory power to the model. There appears to be a problem of high correlation between the independent variables of price and customers.
c.
Regression Analysis: salesmw2 versus priclec2, degrday2
The regression equation is
salesmw2 = 2312260 - 165275 priclec2 + 56.1 degrday2

Predictor        Coef   SE Coef       T      P
Constant      2312260    148794   15.54  0.000
priclec2      -165275     24809   -6.66  0.000
degrday2        56.06     60.37    0.93  0.357

S = 110725   R-Sq = 42.2%   R-Sq(adj) = 40.3%

Analysis of Variance
Source          DF           SS           MS      F      P
Regression       2  5.45875E+11  2.72938E+11  22.26  0.000
Residual Error  61  7.47863E+11  12260053296
Total           63  1.29374E+12
All else equal, an increase in the price of electricity will reduce electricity sales by 165,275 mwh. All else equal, an increase in the degree days (departure from normal weather) by one unit will increase electricity sales by 56.06 mwh. Note that the coefficient on the price variable is now negative, as expected, and it is significantly different from zero (p-value = .000).
d.
Regression Analysis: salesmw2 versus Yd872, degrday2
The regression equation is
salesmw2 = 293949 + 326 Yd872 + 58.4 degrday2

Predictor        Coef   SE Coef       T      P
Constant       293949     67939    4.33  0.000
Yd872          325.85     21.30   15.29  0.000
degrday2        58.36     35.79    1.63  0.108

S = 66187   R-Sq = 79.3%   R-Sq(adj) = 78.7%

Analysis of Variance
Source          DF           SS           MS       F      P
Regression       2  1.02652E+12  5.13259E+11  117.16  0.000
Residual Error  61  2.67221E+11   4380674677
Total           63  1.29374E+12
All else equal, an increase in personal disposable income by one unit will increase electricity sales by 325.85 mwh. All else equal, an increase in degree days by one unit will increase electricity sales by 58.36 mwh.
13.13
a. mpg as a function of horsepower and weight

Regression Analysis: milpgal versus horspwr, weight
The regression equation is
milpgal = 55.8 - 0.105 horspwr - 0.00661 weight
150 cases used 5 cases contain missing values

Predictor         Coef    SE Coef      T      P
Constant        55.769      1.448  38.51  0.000
horspwr       -0.10489    0.02233  -4.70  0.000
weight      -0.0066143  0.0009015  -7.34  0.000

S = 3.901   R-Sq = 72.3%   R-Sq(adj) = 72.0%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        2  5850.0  2925.0  192.23  0.000
Residual Error  147  2236.8    15.2
Total           149  8086.8
All else equal, a one unit increase in the horsepower of the engine will reduce fuel mileage by .10489 mpg. All else equal, an increase in the weight of the car by 100 pounds will reduce fuel mileage by .66143 mpg.
b. Add number of cylinders

Regression Analysis: milpgal versus horspwr, weight, cylinder
The regression equation is
milpgal = 55.9 - 0.117 horspwr - 0.00758 weight + 0.726 cylinder
150 cases used 5 cases contain missing values

Predictor        Coef   SE Coef      T      P
Constant       55.925     1.443  38.77  0.000
horspwr      -0.11744   0.02344  -5.01  0.000
weight      -0.007576  0.001066  -7.10  0.000
cylinder       0.7260    0.4362   1.66  0.098

S = 3.878   R-Sq = 72.9%   R-Sq(adj) = 72.3%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        3  5891.6  1963.9  130.62  0.000
Residual Error  146  2195.1    15.0
Total           149  8086.8
All else equal, one additional cylinder in the engine of the auto will increase fuel mileage by .726 mpg. Note that this is not significant at the .05 level (p-value = .098). Horsepower and weight still have the expected negative signs.
c. mpg as a function of weight, number of cylinders

Regression Analysis: milpgal versus weight, cylinder
The regression equation is
milpgal = 55.9 - 0.0104 weight + 0.121 cylinder
154 cases used 1 cases contain missing values

Predictor         Coef    SE Coef       T      P
Constant        55.914      1.525   36.65  0.000
weight      -0.0103680  0.0009779  -10.60  0.000
cylinder        0.1207     0.4311    0.28  0.780

S = 4.151   R-Sq = 68.8%   R-Sq(adj) = 68.3%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        2  5725.0  2862.5  166.13  0.000
Residual Error  151  2601.8    17.2
Total           153  8326.8
All else equal, an increase in the weight of the car by 100 pounds will reduce fuel mileage by 1.0368 mpg. All else equal, an increase in the number of cylinders in the engine will increase mpg by .1207 mpg. The explanatory power of the models has stayed relatively the same, with a slight drop in explanatory power for the latest regression model. Note that the coefficient on weight has stayed negative and significant (p-values of .000) for all of the regression models, although the value of the coefficient has changed. The number of cylinders is not significantly different from zero in either of the models where it was used as an independent variable. There is likely some correlation between cylinders and the weight of the car as well as between cylinders and the horsepower of the car.
d. mpg as a function of horsepower, weight, price

Regression Analysis: milpgal versus horspwr, weight, price
The regression equation is
milpgal = 54.4 - 0.0938 horspwr - 0.00735 weight + 0.000137 price
150 cases used 5 cases contain missing values

Predictor         Coef     SE Coef      T      P
Constant        54.369       1.454  37.40  0.000
horspwr       -0.09381     0.02177  -4.31  0.000
weight      -0.0073518   0.0008950  -8.21  0.000
price       0.00013721  0.00003950   3.47  0.001

S = 3.762   R-Sq = 74.5%   R-Sq(adj) = 73.9%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        3  6020.7  2006.9  141.82  0.000
Residual Error  146  2066.0    14.2
Total           149  8086.8
All else equal, an increase by one unit in the horsepower of the auto will reduce fuel mileage by .09381 mpg. All else equal, an increase by 100 pounds in the weight of the auto will reduce fuel mileage by .73518 mpg, and an increase in the price of the auto by one dollar will increase fuel mileage by .00013721 mpg.
e. Horsepower and weight remain significant negative independent variables throughout, whereas the number of cylinders has been insignificant. The size of the coefficients changes as the combination of independent variables changes. This is likely due to strong correlation that may exist between the independent variables.
13.14
a. Horsepower as a function of weight, cubic inches of displacement

Regression Analysis: horspwr versus weight, displace
The regression equation is
horspwr = 23.5 + 0.0154 weight + 0.157 displace
151 cases used 4 cases contain missing values

Predictor       Coef   SE Coef     T      P  VIF
Constant      23.496     7.341  3.20  0.002
weight      0.015432  0.004538  3.40  0.001  6.0
displace     0.15667   0.03746  4.18  0.000  6.0

S = 13.64   R-Sq = 69.2%   R-Sq(adj) = 68.8%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        2  61929  30964  166.33  0.000
Residual Error  148  27551    186
Total           150  89480
All else equal, a 100 pound increase in the weight of the car is associated with a 1.54 increase in horsepower of the auto. All else equal, a 10 cubic inch increase in the displacement of the engine is associated with a 1.57 increase in the horsepower of the auto.
b. Horsepower as a function of weight, displacement, number of cylinders

Regression Analysis: horspwr versus weight, displace, cylinder
The regression equation is
horspwr = 16.7 + 0.0163 weight + 0.105 displace + 2.57 cylinder
151 cases used 4 cases contain missing values

Predictor       Coef   SE Coef     T      P   VIF
Constant      16.703     9.449  1.77  0.079
weight      0.016261  0.004592  3.54  0.001   6.2
displace     0.10527   0.05859  1.80  0.074  14.8
cylinder       2.574     2.258  1.14  0.256   7.8

S = 13.63   R-Sq = 69.5%   R-Sq(adj) = 68.9%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        3  62170  20723  111.55  0.000
Residual Error  147  27310    186
Total           150  89480
All else equal, a 100 pound increase in the weight of the car is associated with a 1.63 increase in horsepower of the auto. All else equal, a 10 cubic inch increase in the displacement of the engine is associated with a 1.05 increase in the horsepower of the auto. All else equal, one additional cylinder in the engine is associated with a 2.57 increase in the horsepower of the auto. Note that adding the independent variable number of cylinders has not added to the explanatory power of the model. R square has increased marginally. Engine displacement is no longer significant at the .05 level (p-value of .074) and the estimated regression slope coefficient on the number of cylinders is not significantly different from zero. This is due to the strong correlation that exists between cubic inches of engine displacement and the number of cylinders.
c. Horsepower as a function of weight, displacement and fuel mileage

Regression Analysis: horspwr versus weight, displace, milpgal
The regression equation is
horspwr = 93.6 + 0.00203 weight + 0.165 displace - 1.24 milpgal
150 cases used 5 cases contain missing values

Predictor       Coef   SE Coef      T      P  VIF
Constant       93.57     15.33   6.11  0.000
weight      0.002031  0.004879   0.42  0.678  8.3
displace     0.16475   0.03475   4.74  0.000  6.1
milpgal      -1.2392    0.2474  -5.01  0.000  3.1

S = 12.55   R-Sq = 74.2%   R-Sq(adj) = 73.6%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        3  66042  22014  139.77  0.000
Residual Error  146  22994    157
Total           149  89036
All else equal, a 100 pound increase in the weight of the car is associated with a .203 increase in horsepower of the auto. All else equal, a 10 cubic inch increase in the displacement of the engine is associated with a 1.6475 increase in the horsepower of the auto. All else equal, an increase in the fuel mileage of the vehicle by 1 mile per gallon is associated with a reduction in horsepower of 1.2392. Note that the negative coefficient on fuel mileage indicates the trade-off that is expected between horsepower and fuel mileage. The displacement variable is significantly positive, as expected; however, the weight variable is no longer significant. Again, one would expect high correlation among the independent variables.
d. Horsepower as a function of weight, displacement, mpg and price

Regression Analysis: horspwr versus weight, displace, milpgal, price
The regression equation is
horspwr = 98.1 - 0.00032 weight + 0.175 displace - 1.32 milpgal + 0.000138 price
150 cases used 5 cases contain missing values

Predictor        Coef    SE Coef      T      P   VIF
Constant        98.14      16.05   6.11  0.000
weight      -0.000324   0.005462  -0.06  0.953  10.3
displace      0.17533    0.03647   4.81  0.000   6.8
milpgal       -1.3194     0.2613  -5.05  0.000   3.5
price       0.0001379  0.0001438   0.96  0.339   1.3

S = 12.55   R-Sq = 74.3%   R-Sq(adj) = 73.6%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        4  66187  16547  105.00  0.000
Residual Error  145  22849    158
Total           149  89036
Engine displacement has a significant positive impact on horsepower, fuel mileage is negatively related to horsepower and price is not significant.
e. Explanatory power has marginally increased from the first model to the last. The estimated coefficient on price is not significantly different from zero. Displacement and fuel mileage have the expected signs. The coefficient on weight has the wrong sign; however, it is not significantly different from zero (p-value of .953).
13.15
A regression analysis has produced the following Analysis of Variance table.
Given that $SST = SSR + SSE$, $s_e^2 = \dfrac{SSE}{n - k - 1}$, $R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$, $\bar{R}^2 = 1 - \dfrac{SSE/(n - k - 1)}{SST/(n - 1)}$
a. $SSE = 500$, $s_e^2 = \dfrac{500}{30 - 3 - 1} = 19.2308$, $s_e = 4.3853$
b. $SST = SSR + SSE = 4{,}500 + 500 = 5{,}000$
c. $R^2 = \dfrac{4500}{5000} = 1 - \dfrac{500}{5000} = .90$, $\bar{R}^2 = 1 - \dfrac{500/(26)}{5000/(29)} = .8885$
13.16
A regression analysis has produced the following Analysis of Variance table.
Given that $SST = SSR + SSE$, $s_e^2 = \dfrac{SSE}{n - k - 1}$, $R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$, $\bar{R}^2 = 1 - \dfrac{SSE/(n - k - 1)}{SST/(n - 1)}$
a. $SSE = 2500$, $s_e^2 = \dfrac{2500}{32 - 2 - 1} = 86.207$, $s_e = 9.2848$
b. $SST = SSR + SSE = 7{,}000 + 2{,}500 = 9{,}500$
c. $R^2 = \dfrac{7000}{9500} = 1 - \dfrac{2500}{9500} = .7368$, $\bar{R}^2 = 1 - \dfrac{2500/(29)}{9500/(31)} = .7187$
13.17
A regression analysis has produced the following Analysis of Variance table.
Given that $SST = SSR + SSE$, $s_e^2 = \dfrac{SSE}{n - k - 1}$, $R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$, $\bar{R}^2 = 1 - \dfrac{SSE/(n - k - 1)}{SST/(n - 1)}$
a. $SSE = 10{,}000$, $s_e^2 = \dfrac{10000}{50 - 4 - 1} = 222.222$, $s_e = 14.9071$
b. $SST = SSR + SSE = 40{,}000 + 10{,}000 = 50{,}000$
c. $R^2 = \dfrac{40{,}000}{50{,}000} = 1 - \dfrac{10{,}000}{50{,}000} = .80$, $\bar{R}^2 = 1 - \dfrac{10{,}000/(45)}{50{,}000/(49)} = .7822$
13.18
A regression analysis has produced the following Analysis of Variance table.
Given that $SST = SSR + SSE$, $s_e^2 = \dfrac{SSE}{n - k - 1}$, $R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$, $\bar{R}^2 = 1 - \dfrac{SSE/(n - k - 1)}{SST/(n - 1)}$
a. $SSE = 15{,}000$, $s_e^2 = \dfrac{15{,}000}{206 - 5 - 1} = 75.0$, $s_e = 8.660$
b. $SST = SSR + SSE = 80{,}000 + 15{,}000 = 95{,}000$
c. $R^2 = \dfrac{80{,}000}{95{,}000} = 1 - \dfrac{15{,}000}{95{,}000} = .8421$, $\bar{R}^2 = 1 - \dfrac{15{,}000/(200)}{95{,}000/(205)} = .8382$
13.19
a. $R^2 = \dfrac{3.549}{3.881} = .9145$; therefore, 91.45% of the variability in work-hours of design effort can be explained by the variation in the plane's top speed, weight and percentage number of parts in common with other models.
b. $SSE = 3.881 - 3.549 = .332$
c. $\bar{R}^2 = 1 - \dfrac{.332/(27 - 4)}{3.881/26} = .9033$
d. $R = \sqrt{.9145} = .9563$. This is the sample correlation between observed and predicted values of the design effort.
13.20
a. $R^2 = \dfrac{88.2}{162.1} = .5441$; therefore, 54.41% of the variability in milk consumption can be explained by the variations in weekly income and family size.
b. $\bar{R}^2 = 1 - \dfrac{73.9/(30 - 3)}{162.1/29} = .5103$
c. $R = \sqrt{.5441} = .7376$. This is the sample correlation between observed and predicted values of milk consumption.
13.21
a. $R^2 = \dfrac{79.2}{79.2 + 45.9} = .6331$; therefore, 63.31% of the variability in weight gain can be explained by the variations in the average number of meals eaten, number of hours exercised and number of beers consumed weekly.
b. $\bar{R}^2 = 1 - \dfrac{45.9/(25 - 4)}{125.1/24} = .5807$
c. $R = \sqrt{.6331} = .7957$. This is the sample correlation between observed and predicted values of weight gained.
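The goodness-of-fit quantities in 13.19 through 13.21 all follow from SST, SSR, the sample size and the number of predictors. The sketch below is an illustration only; the function name `fit_summary` is hypothetical, and the sample sizes and predictor counts are read off the denominators shown in the worked answers.

```python
import math

def fit_summary(sst, ssr, n, k):
    """Return (SSE, R^2, adjusted R^2, R) for a regression with k predictors."""
    sse = sst - ssr
    r2 = ssr / sst
    adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))
    return sse, r2, adj_r2, math.sqrt(r2)

# (SST, SSR, n, k) as used in the answers to 13.19, 13.20 and 13.21
exercises = {
    "13.19": (3.881, 3.549, 27, 3),
    "13.20": (162.1, 88.2, 30, 2),
    "13.21": (125.1, 79.2, 25, 3),
}
results = {name: fit_summary(*args) for name, args in exercises.items()}

for name, (sse, r2, adj_r2, r) in results.items():
    print(f"{name}: SSE={sse:.3f}  R2={r2:.4f}  adjR2={adj_r2:.4f}  R={r:.4f}")
```

The adjusted value is always below $R^2$ here because each mean square divides by its own degrees of freedom, which penalizes the added predictors.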
13.22
a.
Regression Analysis: Y profit versus X2 offices
The regression equation is
Y profit = 1.55 - 0.000120 X2 offices

Predictor         Coef     SE Coef      T      P
Constant        1.5460      0.1048  14.75  0.000
X2 offi    -0.00012033  0.00001434  -8.39  0.000

S = 0.07049   R-Sq = 75.4%   R-Sq(adj) = 74.3%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       1  0.34973  0.34973  70.38  0.000
Residual Error  23  0.11429  0.00497
Total           24  0.46402

b.
Regression Analysis: X1 revenue versus X2 offices
The regression equation is
X1 revenue = - 0.078 + 0.000543 X2 offices

Predictor        Coef     SE Coef      T      P
Constant      -0.0781      0.2975  -0.26  0.795
X2 offi    0.00054280  0.00004070  13.34  0.000

S = 0.2000   R-Sq = 88.5%   R-Sq(adj) = 88.1%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  7.1166  7.1166  177.84  0.000
Residual Error  23  0.9204  0.0400
Total           24  8.0370

c.
Regression Analysis: Y profit versus X1 revenue
The regression equation is
Y profit = 1.33 - 0.169 X1 revenue

Predictor      Coef  SE Coef      T      P
Constant     1.3262   0.1386   9.57  0.000
X1 reven   -0.16913  0.03559  -4.75  0.000

S = 0.1009   R-Sq = 49.5%   R-Sq(adj) = 47.4%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       1  0.22990  0.22990  22.59  0.000
Residual Error  23  0.23412  0.01018
Total           24  0.46402

d.
Regression Analysis: X2 offices versus X1 revenue
The regression equation is
X2 offices = 957 + 1631 X1 revenue

Predictor    Coef  SE Coef      T      P
Constant    956.9    476.5   2.01  0.057
X1 reven   1631.3    122.3  13.34  0.000

S = 346.8   R-Sq = 88.5%   R-Sq(adj) = 88.1%

Analysis of Variance
Source          DF        SS        MS       F      P
Regression       1  21388013  21388013  177.84  0.000
Residual Error  23   2766147    120267
Total           24  24154159
13.23 Given the following results, where the numbers in parentheses are the sample standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope coefficients: $b_j \pm t_{n-k-1,\alpha/2}\, s_{b_j}$
95% CI for $x_1$: $4.8 \pm 2.086(2.1)$; .4194 up to 9.1806
95% CI for $x_2$: $6.9 \pm 2.086(3.7)$; -.8182 up to 14.6182
95% CI for $x_3$: $-7.2 \pm 2.086(2.8)$; -13.0408 up to -1.3592
b. Test the hypothesis $H_0: \beta_j = 0$, $H_1: \beta_j > 0$
For $x_1$: $t = \dfrac{4.8}{2.1} = 2.286$; $t_{20,.05/.01} = 1.725, 2.528$. Therefore, reject $H_0$ at the 5% level but not at the 1% level.
For $x_2$: $t = \dfrac{6.9}{3.7} = 1.865$; $t_{20,.05/.01} = 1.725, 2.528$. Therefore, reject $H_0$ at the 5% level but not at the 1% level.
For $x_3$: $t = \dfrac{-7.2}{2.8} = -2.571$; $t_{20,.05/.01} = 1.725, 2.528$. Therefore, do not reject $H_0$ at either level.
13.24 Given the following results, where the numbers in parentheses are the sample standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope coefficients: $b_j \pm t_{n-k-1,\alpha/2}\, s_{b_j}$
95% CI for $x_1$: $6.8 \pm 2.042(3.1)$; .4698 up to 13.1302
95% CI for $x_2$: $6.9 \pm 2.042(3.7)$; -.6554 up to 14.4554
95% CI for $x_3$: $-7.2 \pm 2.042(3.2)$; -13.7344 up to -.6656
b. Test the hypothesis $H_0: \beta_j = 0$, $H_1: \beta_j > 0$
For $x_1$: $t = \dfrac{6.8}{3.1} = 2.194$; $t_{30,.05/.01} = 1.697, 2.457$. Therefore, reject $H_0$ at the 5% level but not at the 1% level.
For $x_2$: $t = \dfrac{6.9}{3.7} = 1.865$; $t_{30,.05/.01} = 1.697, 2.457$. Therefore, reject $H_0$ at the 5% level but not at the 1% level.
For $x_3$: $t = \dfrac{-7.2}{3.2} = -2.25$; $t_{30,.05/.01} = 1.697, 2.457$. Therefore, do not reject $H_0$ at the 5% level nor the 1% level.
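The interval and test calculations in 13.23 follow one pattern, which the sketch below illustrates. It is not part of the original solutions: the function names are hypothetical, and the critical values are typed in from a Student's t table for 20 degrees of freedom rather than computed.

```python
def confidence_interval(b, se, t_crit):
    """Two-sided interval b ± t_crit * se for a slope coefficient."""
    half_width = t_crit * se
    return b - half_width, b + half_width

def upper_tail_test(b, se, t_crit):
    """Return (t, reject) for H0: beta = 0 against H1: beta > 0."""
    t = b / se
    return t, t > t_crit

# Table values for 20 degrees of freedom (assumed inputs, not computed here)
T20_025, T20_05, T20_01 = 2.086, 1.725, 2.528

# Exercise 13.23: coefficient estimates with their standard errors
coefs = {"x1": (4.8, 2.1), "x2": (6.9, 3.7), "x3": (-7.2, 2.8)}

intervals = {k: confidence_interval(b, se, T20_025) for k, (b, se) in coefs.items()}
tests_5pct = {k: upper_tail_test(b, se, T20_05) for k, (b, se) in coefs.items()}
tests_1pct = {k: upper_tail_test(b, se, T20_01) for k, (b, se) in coefs.items()}

for name in coefs:
    low, high = intervals[name]
    t, reject5 = tests_5pct[name]
    print(f"{name}: CI=({low:.4f}, {high:.4f})  t={t:.3f}  "
          f"reject at 5%: {reject5}, at 1%: {tests_1pct[name][1]}")
```

A large negative t statistic, as for $x_3$, never rejects in this upper-tail test, even though the two-sided interval for $x_3$ excludes zero.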
13.25 Given the following results, where the numbers in parentheses are the sample standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope coefficients: $b_j \pm t_{n-k-1,\alpha/2}\, s_{b_j}$
95% CI for $x_1$: $34.8 \pm 2.000(12.1)$; 10.60 up to 59.0
95% CI for $x_2$: $56.9 \pm 2.000(23.7)$; 9.50 up to 104.30
95% CI for $x_3$: $-57.2 \pm 2.000(32.8)$; -122.80 up to 8.40
b. Test the hypothesis $H_0: \beta_j = 0$, $H_1: \beta_j > 0$
For $x_1$: $t = \dfrac{34.8}{12.1} = 2.876$; $t_{60,.05/.01} = 1.671, 2.390$. Therefore, reject $H_0$ at the 1% level.
For $x_2$: $t = \dfrac{56.9}{23.7} = 2.401$; $t_{60,.05/.01} = 1.671, 2.390$. Therefore, reject $H_0$ at the 1% level.
For $x_3$: $t = \dfrac{-57.2}{32.8} = -1.744$; $t_{60,.05/.01} = 1.671, 2.390$. Therefore, do not reject $H_0$ at either level.
13.26 Given the following results, where the numbers in parentheses are the sample standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope coefficients: $b_j \pm t_{n-k-1,\alpha/2}\, s_{b_j}$
95% CI for $x_1$: $17.8 \pm 2.042(7.1)$; 3.3018 up to 32.2982
95% CI for $x_2$: $26.9 \pm 2.042(13.7)$; -1.0754 up to 54.8754
95% CI for $x_3$: $-9.2 \pm 2.042(3.8)$; -16.9596 up to -1.4404
b. Test the hypothesis $H_0: \beta_j = 0$, $H_1: \beta_j > 0$
For $x_1$: $t = \dfrac{17.8}{7.1} = 2.507$; $t_{35,.05/.01} \approx 1.697, 2.457$. Therefore, reject $H_0$ at the 1% level.
For $x_2$: $t = \dfrac{26.9}{13.7} = 1.964$; $t_{35,.05/.01} \approx 1.697, 2.457$. Therefore, reject $H_0$ at the 5% level but not at the 1% level.
For $x_3$: $t = \dfrac{-9.2}{3.8} = -2.421$; $t_{35,.05/.01} \approx 1.697, 2.457$. Therefore, do not reject $H_0$ at either level.
13.27
a. $b_1 = .661$, $s_{b_1} = .099$, $n = 27$, $t_{23,.05/.025} = 1.714, 2.069$
90% CI: $.661 \pm 1.714(.099)$; .4913 up to .8307
95% CI: $.661 \pm 2.069(.099)$; .4562 up to .8658
b. $b_2 = .065$, $s_{b_2} = .032$, $t_{23,.025/.005} = 2.069, 2.807$
95% CI: $.065 \pm 2.069(.032)$; -.0012 up to .1312
99% CI: $.065 \pm 2.807(.032)$; -.0248 up to .1548
c. $H_0: \beta_2 = 0$, $H_1: \beta_2 \neq 0$
$t = \dfrac{.065}{.032} = 2.031$; $t_{23,.05/.025} = 1.714, 2.069$
Therefore, reject $H_0$ at the 10% level but not at the 5% level.
d. $H_0: \beta_1 = \beta_2 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2)$
$F = \dfrac{(3.311 - .332)/2}{.332/23} = 103.188$
Therefore, reject $H_0$ at the 1% level since $F = 103.188 > 5.66 = F_{2,23,.01}$.
13.28
a. $H_0: \beta_1 = 0$; $H_1: \beta_1 > 0$
$t = \dfrac{.052}{.023} = 2.26$; $t_{27,.025/.01} = 2.052, 2.473$
Therefore, reject $H_0$ at the 2.5% level but not at the 1% level.
b. $t_{27,.05/.025/.005} = 1.703, 2.052, 2.771$
90% CI: $1.14 \pm 1.703(.35)$; .5439 up to 1.7361
95% CI: $1.14 \pm 2.052(.35)$; .4218 up to 1.8582
99% CI: $1.14 \pm 2.771(.35)$; .1701 up to 2.1099
13.29
a. $H_0: \beta_2 = 0$; $H_1: \beta_2 < 0$
$t = \dfrac{-1.345}{.565} = -2.381$; $t_{21,.025/.01} = -2.080, -2.518$
Therefore, reject $H_0$ at the 2.5% level but not at the 1% level.
b. $H_0: \beta_3 = 0$; $H_1: \beta_3 > 0$
$t = \dfrac{.613}{.243} = 2.523$; $t_{21,.01/.005} = 2.518, 2.831$
Therefore, reject $H_0$ at the 1% level but not at the .5% level.
c. $t_{21,.05/.025/.005} = 1.721, 2.080, 2.831$
90% CI: $.653 \pm 1.721(.189)$; .3277 up to .9783
95% CI: $.653 \pm 2.080(.189)$; .2599 up to 1.0461
99% CI: $.653 \pm 2.831(.189)$; .1179 up to 1.1881
13.30
a. $H_0: \beta_3 = 0$, $H_1: \beta_3 \neq 0$
$t = \dfrac{-.000191}{.000446} = -.428$; $t_{16,.10} = -1.337$
Therefore, do not reject $H_0$ at the 20% level.
b. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2, 3)$
$F = \dfrac{16}{3} \cdot \dfrac{.71}{1 - .71} = 13.057$, $F_{3,16,.01} = 5.29$
Therefore, reject $H_0$ at the 1% level.
= 2.000+2.!!0 #0 ?: $.8$8 ± 2.000(1.80#)7 4.2!0 *' to 11.4#! # ?: $.8$8 ± 2.!!0(1.80#)7 3.0!!1 *' to 12.!8## .003!!! H 0 : β 2 = 07 H 1 : β 2 > 0 + t = = 2.$3 + t 8+.00 = 2.!!0
13.31 a. t 8+.02>.00
b.
13.32
.001344 &herefore+ reAe"t H 0 at the . level
a. All else being equal, an extra \$1 in mean per capita personal income leads to an expected extra \$.04 of net revenue per capita from the lottery.
b. $b_2 = .8772$, $s_{b_2} = .3107$, $n = 29$, $t_{24,.025} = 2.064$
95% CI: $.8772 \pm 2.064(.3107)$; .2359 up to 1.5185
c. $H_0: \beta_3 = 0$, $H_1: \beta_3 < 0$
$t = \dfrac{-365.01}{263.88} = -1.383$; $t_{24,.10/.05} = -1.318, -1.711$
Therefore, reject $H_0$ at the 10% level but not at the 5% level.
13.33
a. $b_3 = 7.653$, $s_{b_3} = 3.082$, $n = 19$, $t_{15,.025} = 2.131$
95% CI: $7.653 \pm 2.131(3.082)$; 1.0853 up to 14.2207
b. $H_0: \beta_2 = 0$, $H_1: \beta_2 > 0$, $t = \dfrac{19.729}{8.992} = 2.194$; $t_{15,.025/.01} = 2.131, 2.602$
Therefore, reject $H_0$ at the 2.5% level but not at the 1% level.
13.34
a. $n = 19$, $b_1 = .2$, $s_{b_1} = .0092$, $t_{16,.025} = 2.12$
95% CI: $.2 \pm 2.12(.0092)$; .1805 up to .2195
b. $H_0: \beta_2 = 0$, $H_1: \beta_2 < 0$, $t = \dfrac{-.1}{.084} = -1.19$; $t_{16,.10} = -1.337$
Therefore, do not reject $H_0$ at the 10% level.
13.35
a. $n = 14$, $b_1 = .101$, $s_{b_1} = .023$, $t_{10,.05} = 1.812$
90% CI: $.101 \pm 1.812(.023)$; .0593 up to .1427
b. $H_0: \beta_2 = 0$, $H_1: \beta_2 < 0$, $t = \dfrac{-.244}{.08} = -3.05$; $-t_{10,.01/.005} = -2.764, -3.169$
Therefore, reject $H_0$ at the 1% level but not at the .5% level.
c. $H_0: \beta_3 = 0$, $H_1: \beta_3 > 0$
$t = \dfrac{.057}{.00925} = 6.162$; $t_{10,.005} = 3.169$
Therefore, reject $H_0$ at the .5% level.
13.36
a. $n = 39$, $b = .0495$, $s_b = .01172$, $t_{30,.005} = 2.750$
99% CI: $.0495 \pm 2.750(.01172)$; .0173 up to .0817
b. $H_0: \beta_4 = 0$, $H_1: \beta_4 \neq 0$
$t = \dfrac{.48122}{.7794} = .617$; $t_{30,.10} = 1.31$
Therefore, do not reject $H_0$ at the 20% level.
c. $H_0: \beta_7 = 0$, $H_1: \beta_7 \neq 0$
$t = \dfrac{.00645}{.00306} = 2.108$; $t_{30,.025/.01} = 2.042, 2.457$
Therefore, reject $H_0$ at the 5% level but not at the 2% level.
13.37 Test the hypothesis that all three of the predictor variables are equal to zero, given the following Analysis of Variance tables.
a. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2, 3)$
$F = \dfrac{SSR/k}{SSE/(n - k - 1)} = \dfrac{MSR}{s_e^2} = \dfrac{4500/3}{500/26} = 78.0$, $F_{3,26,.05} = 2.98$
Therefore, reject $H_0$ at the 5% level.
b. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2, 3)$
$F = \dfrac{SSR/k}{SSE/(n - k - 1)} = \dfrac{MSR}{s_e^2} = \dfrac{9780/3}{2100/26} = 40.362$, $F_{3,26,.05} = 2.98$
Therefore, reject $H_0$ at the 5% level.
c. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2, 3)$
$F = \dfrac{SSR/k}{SSE/(n - k - 1)} = \dfrac{MSR}{s_e^2} = \dfrac{46{,}000/3}{25{,}000/26} = 15.9467$, $F_{3,26,.05} = 2.98$
Therefore, reject $H_0$ at the 5% level.
d. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2, 3)$
$F = \dfrac{SSR/k}{SSE/(n - k - 1)} = \dfrac{MSR}{s_e^2} = \dfrac{87{,}000/3}{48{,}000/26} = 15.7083$, $F_{3,26,.05} = 2.98$
Therefore, reject $H_0$ at the 5% level.
13.38
a. $SST = 3.881$, $SSR = 3.549$, $SSE = .332$
$H_0: \beta_1 = \beta_2 = \beta_3 = 0$, $H_1$: At least one $\beta_i \neq 0$ $(i = 1, 2, 3)$
$F = \dfrac{3.549/3}{.332/23} = 81.955$; $F_{3,23,.01} = 4.76$
Therefore, reject $H_0$ at the 1% level.
b. Analysis of Variance table:

Sources of    Sum of    Degrees of    Mean
variation     Squares   Freedom       Squares    F-Ratio
Regressor     3.549      3            1.183      81.955
Error          .332     23             .014435
Total         3.881     26
13.39 \(H_0: \beta_1 = \beta_2 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2)\)
\[ F = \frac{45}{2} \cdot \frac{.463 + (2/45)}{1 - .463} = 21.262, \quad F_{2,45,.01} = 5.18 \]
Therefore, reject \(H_0\) at the 1% level.
13.40 a. SST = 162.1, SSR = 88.2, SSE = 73.9
\(H_0: \beta_1 = \beta_2 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2)\)
\[ F = \frac{88.2/2}{73.9/27} = 16.113, \quad F_{2,27,.01} = 5.49 \]
Therefore, reject \(H_0\) at the 1% level.
b.
Sources of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-Ratio
Regressor              88.2             2                    44.10          16.113
Error                  73.9             27                   2.737
Total                  162.1            29
13.41 a. SST = 125.1, SSR = 79.2, SSE = 45.9
\(H_0: \beta_1 = \beta_2 = \beta_3 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3)\)
\[ F = \frac{79.2/3}{45.9/21} = 12.078, \quad F_{3,21,.01} = 4.87 \]
Therefore, reject \(H_0\) at the 1% level.
b.
Sources of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-Ratio
Regressor              79.2             3                    26.4           12.078
Error                  45.9             21                   2.185714
Total                  125.1            24
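The Analysis of Variance tables in 13.38, 13.40 and 13.41 can be rebuilt from SSR, SSE, the number of regressors and the sample size. The sketch below is illustrative only; the function name and dictionary layout are assumptions, not part of the text.

```python
def anova_table(ssr, sse, k, n):
    """Build the entries of a regression ANOVA table.

    ssr: regression sum of squares, sse: error sum of squares,
    k: number of independent variables, n: number of observations.
    """
    df_error = n - k - 1
    msr = ssr / k          # mean square for regression
    mse = sse / df_error   # mean square error, s_e^2
    return {
        "SST": ssr + sse,
        "df": (k, df_error, n - 1),
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
    }


# Values from 13.41: SSR = 79.2, SSE = 45.9, k = 3, total df = 24 so n = 25
table = anova_table(ssr=79.2, sse=45.9, k=3, n=25)
for name, value in table.items():
    print(name, value)
```

The computed F ratio is then compared with the tabulated critical value for (k, n − k − 1) degrees of freedom.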
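13.42 a. \(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3, 4)\)
The test can be based directly on the coefficient of determination since
\[ R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}, \quad \text{and hence} \quad \frac{R^2}{1 - R^2} = \frac{SSR/SST}{SSE/SST} = \frac{SSR}{SSE}, \quad \text{and} \quad F = \frac{n-K-1}{K} \cdot \frac{R^2}{1 - R^2} \]
\[ F = \frac{24}{4} \cdot \frac{.51}{1 - .51} = 6.2449, \quad F_{4,24,.01} = 4.22 \]
Therefore, reject \(H_0\) at the 1% level.
13.43 a. \(H_0: \beta_1 = \beta_2 = \beta_3 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3)\)
\[ R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}, \quad \text{and hence} \quad \frac{R^2}{1 - R^2} = \frac{SSR/SST}{SSE/SST} = \frac{SSR}{SSE}, \quad \text{and} \quad F = \frac{n-K-1}{K} \cdot \frac{R^2}{1 - R^2} \]
\[ F = \frac{15}{3} \cdot \frac{.84}{1 - .84} = 26.25, \quad F_{3,15,.01} = 5.42 \]
Therefore, reject \(H_0\) at the 1% level.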
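The F ratio used in 13.42 and 13.43 depends only on the coefficient of determination and the degrees of freedom. A minimal sketch, with a hypothetical function name:

```python
def f_from_r2(r2, k, df_error):
    """Overall F statistic from R^2: ((n - K - 1) / K) * R^2 / (1 - R^2)."""
    return (df_error / k) * r2 / (1.0 - r2)


print(round(f_from_r2(0.51, k=4, df_error=24), 4))  # 13.42
print(round(f_from_r2(0.84, k=3, df_error=15), 2))  # 13.43
```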
13.44 a. \(H_0: \beta_1 = \beta_2 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2)\)
\[ R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}, \quad \text{and hence} \quad \frac{R^2}{1 - R^2} = \frac{SSR/SST}{SSE/SST} = \frac{SSR}{SSE}, \quad \text{and} \quad F = \frac{n-K-1}{K} \cdot \frac{R^2}{1 - R^2} \]
\[ F = \frac{16}{2} \cdot \frac{.96 + (2/16)}{1 - .96} = 217, \quad F_{2,16,.01} = 6.23 \]
Therefore, reject \(H_0\) at the 1% level.
13.45 \(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3, 4, 5, 6, 7)\)
\[ F = \frac{31}{7} \cdot \frac{.3572 + (7/31)}{1 - .3572} = 4.017, \quad F_{7,30,.01} = 3.30 \]
Therefore, reject \(H_0\) at the 1% level.
13.46
\[ \frac{(SSE^* - SSE)/k_1}{SSE/(n-k-1)} = \frac{n-k-1}{k_1} \cdot \frac{(SSE^* - SSE)/SST}{SSE/SST} = \frac{n-k-1}{k_1} \cdot \frac{(1 - R^{*2}) - (1 - R^2)}{1 - R^2} = \frac{n-k-1}{k_1} \cdot \frac{R^2 - R^{*2}}{1 - R^2} \]
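The identity in 13.46 can be checked numerically. The sketch below computes the partial F statistic both from the error sums of squares and from the two coefficients of determination. The function names are hypothetical, the sums of squares are the ones used in 13.47, and the value of SST is an arbitrary assumption (it cancels in the ratio).

```python
def partial_f_from_sse(sse_restricted, sse_full, k1, df_error):
    """Partial F: ((SSE* - SSE) / k1) / (SSE / (n - k - 1))."""
    return ((sse_restricted - sse_full) / k1) / (sse_full / df_error)


def partial_f_from_r2(r2_restricted, r2_full, k1, df_error):
    """Same statistic written with R^2: ((n - k - 1) / k1) * (R^2 - R*^2) / (1 - R^2)."""
    return (df_error / k1) * (r2_full - r2_restricted) / (1.0 - r2_full)


sse_star, sse, k1, df_error = 88.2, 83.7, 1, 26
sst = 100.0  # hypothetical total sum of squares, used only to form the R^2 values

f_sse = partial_f_from_sse(sse_star, sse, k1, df_error)
f_r2 = partial_f_from_r2(1 - sse_star / sst, 1 - sse / sst, k1, df_error)
print(round(f_sse, 3), round(f_r2, 3))
```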
13.47 Let \(\beta_3\) be the coefficient on the number of preschool children in the household.
\(H_0: \beta_3 = 0,\ H_1: \beta_3 \neq 0\)
\[ F = \frac{(88.2 - 83.7)/1}{83.7/26} = 1.398, \quad F_{1,26,.05} = 4.23 \]
Therefore, do not reject \(H_0\) at the 5% level.
13.47.1 a.
\[ \bar{R}^2 = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)} = 1 - \frac{n-1}{n-k-1}\,(1 - R^2) = \frac{(n-1)R^2 - k}{n-k-1} \]
b. Since \(\bar{R}^2 = \frac{(n-1)R^2 - k}{n-k-1}\), then \(R^2 = \frac{(n-k-1)\bar{R}^2 + k}{n-1}\).
c.
\[ F = \frac{SSR/k}{SSE/(n-k-1)} = \frac{n-k-1}{k} \cdot \frac{SSR/SST}{SSE/SST} = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} \]
\[ = \frac{n-k-1}{k} \cdot \frac{[(n-k-1)\bar{R}^2 + k]/(n-1)}{[n - 1 - (n-k-1)\bar{R}^2 - k]/(n-1)} = \frac{n-k-1}{k} \cdot \frac{(n-k-1)\bar{R}^2 + k}{(n-k-1)(1 - \bar{R}^2)} = \frac{n-k-1}{k} \cdot \frac{\bar{R}^2 + k/(n-k-1)}{1 - \bar{R}^2} \]
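The three results above translate directly into code. This is a sketch under the stated formulas, with hypothetical function names; the numerical checks reuse figures that appear elsewhere in these solutions (an adjusted value of .96 with 16 error degrees of freedom and two regressors, and an adjusted value of .91 with n = 25 and two regressors).

```python
def adjusted_r2(r2, n, k):
    """Adjusted coefficient of determination from R^2."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)


def r2_from_adjusted(adj_r2, n, k):
    """Invert the adjustment: R^2 = ((n - k - 1) * adj + k) / (n - 1)."""
    return ((n - k - 1) * adj_r2 + k) / (n - 1)


def f_from_adjusted(adj_r2, n, k):
    """Overall F statistic written in terms of the adjusted R^2."""
    df_error = n - k - 1
    return (df_error / k) * (adj_r2 + k / df_error) / (1.0 - adj_r2)


r2 = r2_from_adjusted(0.91, n=25, k=2)
print(round(r2, 4))
print(round(adjusted_r2(r2, n=25, k=2), 4))
print(round(f_from_adjusted(0.96, n=19, k=2), 2))
```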
13.49 Given the estimated multiple regression equation \(\hat{y} = 6 + 5x_1 + 4x_2 + 7x_3 + 8x_4\)
a. \(\hat{y} = 6 + 5(10) + 4(23) + 7(9) + 8(12) = 307\)
b. \(\hat{y} = 6 + 5(23) + 4(18) + 7(10) + 8(11) = 351\)
c. \(\hat{y} = 6 + 5(10) + 4(23) + 7(9) + 8(12) = 307\)
d. \(\hat{y} = 6 + 5(-10) + 4(13) + 7(-8) + 8(-16) = -176\)
13.50 \(\hat{Y} = 7.35 + .653(20) - 1.345(10) + .613(6) = 10.638\) pounds
13.51 \(\hat{Y} = .578 + .052(6) + 1.14(4) = 5.45\) quarts of milk per week
13.52 \(\hat{Y} = 2.0 + .661(1) + .065(7) - .018(50) = 2.216\) million worker hours
13.53 a. All else equal, a one square foot increase in the lot size is expected to increase the selling price of the house by $1.468.
b. 98.43% of the variation in the selling price of homes can be explained by the variation in house size, lot size, number of bedrooms and number of bathrooms.
c. \(H_0: \beta_4 = 0,\ H_1: \beta_4 > 0\), \(t = \frac{2701.1}{1996.2} = 1.353\)
\(t_{15,.10/.05} = 1.341, 1.753\)
Therefore, reject \(H_0\) at the 10% level but not at the 5% level.
d. \(\hat{Y} = 1998.5 + 22.352(1250) + 1.4686(4700) + 6767.3(3) + 2701.1(1.5) = 61194.47\)
13.54 Compute values of \(y_i\) when \(x_i\) = 1, 2, 4, 6, 8, 10
x_i                            1      2         4         6          8          10
y_i = 4x_i^1.5                 4      11.3137   32        58.7878    90.5097    126.4911
y_i = 1 + 2x_i + 2x_i^2        5      13        41        85         145        221
13.55 Compute values of \(y_i\) when \(x_i\) = 1, 2, 4, 6, 8, 10
x_i                            1      2         4         6          8          10
y_i = 4x_i^1.8                 4      13.9288   48.5029   100.6311   168.8970   252.3829
y_i = 1 + 2x_i + 2x_i^2        5      13        41        85         145        221
13.56 Compute values of \(y_i\) when \(x_i\) = 1, 2, 4, 6, 8, 10
x_i                            1      2         4         6          8          10
y_i = 4x_i^1.5                 4      11.3137   32        58.7878    90.5097    126.4911
y_i = 1 + 2x_i + 1.7x_i^2      4.7    11.8      36.2      74.2       125.8      191
13.57 Compute values of \(y_i\) when \(x_i\) = 1, 2, 4, 6, 8, 10
x_i                            1      2         4         6          8          10
y_i = 3x_i^1.2                 3      6.8922    15.8341   25.7574    36.3772    47.5468
y_i = 1 + 5x_i - 1.5x_i^2      4.5    5         -3        -23        -55        -99
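The tables in 13.54 through 13.57 compare a power function with a quadratic over the same grid of x values. A short sketch that regenerates such a table; the helper names are hypothetical:

```python
def power_curve(x, a, b):
    """Power function y = a * x**b."""
    return a * x ** b


def quadratic(x, b0, b1, b2):
    """Quadratic function y = b0 + b1*x + b2*x**2."""
    return b0 + b1 * x + b2 * x ** 2


xs = [1, 2, 4, 6, 8, 10]

# 13.54: y = 4x^1.5 versus y = 1 + 2x + 2x^2
for x in xs:
    print(x, round(power_curve(x, 4, 1.5), 4), quadratic(x, 1, 2, 2))
```

Changing the arguments reproduces the other three tables, for example `power_curve(x, 3, 1.2)` and `quadratic(x, 1, 5, -1.5)` for 13.57.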
13.58 There are many possible answers. Relationships that can be approximated by a non-linear quadratic model include many supply functions, production functions and cost functions, including average cost versus the number of units produced.
13.59 To estimate the function with linear least squares, solve the equation \(\beta_1 + \beta_2 = 2\) for \(\beta_2\). Since \(\beta_2 = 2 - \beta_1\), plug into the equation and algebraically manipulate:
\[ Y = \beta_0 + \beta_1 X_1 + (2 - \beta_1) X_1^2 + \beta_3 X_2 \]
\[ Y = \beta_0 + \beta_1 X_1 + 2X_1^2 - \beta_1 X_1^2 + \beta_3 X_2 \]
\[ Y = \beta_0 + \beta_1 [X_1 - X_1^2] + 2X_1^2 + \beta_3 X_2 \]
\[ Y - 2X_1^2 = \beta_0 + \beta_1 [X_1 - X_1^2] + \beta_3 X_2 \]
Conduct the variable transformations and estimate the model using least squares.
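The transformation in 13.59 can be illustrated with a small pure-Python least squares fit. Everything in this sketch is an assumption made for illustration: the data are hypothetical and noise-free (generated with β0 = 1, β1 = 0.5, β2 = 1.5, β3 = 3, so that β1 + β2 = 2), and the `solve` and `ols` helpers are a bare-bones normal-equations solver, not a production routine.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data satisfying the restriction beta1 + beta2 = 2
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1 + 0.5 * a + 1.5 * a ** 2 + 3 * b for a, b in zip(x1, x2)]

# Transformed variables: regress Y - 2*X1^2 on (X1 - X1^2) and X2
y_star = [yi - 2 * a ** 2 for yi, a in zip(y, x1)]
z = [[a - a ** 2, b] for a, b in zip(x1, x2)]

b0, b1, b3 = ols(z, y_star)
b2 = 2 - b1  # recovered from the restriction
print(round(b0, 6), round(b1, 6), round(b2, 6), round(b3, 6))
```

With real data the fit would not be exact, but the restricted estimate of β2 would still be obtained as 2 minus the estimate of β1.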
13.60
a. All else equal, a 1% increase in annual consumption expenditures will be associated with a 1.1556% increase in expenditures on vacation travel. All else equal, a 1% increase in the size of the household will be associated with a .4408% decrease in expenditures on vacation travel.
b. 16.8% of the variation in vacation travel expenditures can be explained by the variations in the log of total consumption expenditures and log of the number of members in the household.
c. \(1.1556 \pm 1.96(.0546)\) = 1.0486 up to 1.2626
d. \(H_0: \beta_2 = 0,\ H_1: \beta_2 < 0\), \(t = \frac{-.4408}{.0490} = -8.996\)
Therefore, reject \(H_0\) at the 1% level.
13.61
a. A 1% increase in median income leads to an expected .68% increase in store size.
b. \(H_0: \beta_1 = 0,\ H_1: \beta_1 > 0\), \(t = \frac{.68}{.077} = 8.831\)
Therefore, reject \(H_0\) at the 1% level.
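In 13.60 and 13.61 the slope of a log-log model is read as an elasticity, so a 1% change in x goes with roughly a b% change in y. The sketch below compares that reading with the exact change implied by a constant-elasticity relationship. The function name is hypothetical, and the model form log(y) = a + b log(x) is an assumption about how those regressions were specified.

```python
def exact_pct_change(elasticity, pct_change_in_x):
    """Exact % change in y when log(y) = a + elasticity * log(x) and x changes by the given %."""
    return ((1 + pct_change_in_x / 100.0) ** elasticity - 1) * 100.0


for b in (1.1556, -0.4408, 0.68):
    print(b, round(exact_pct_change(b, 1), 4), round(exact_pct_change(b, 10), 4))
```

For a 1% change the exact figure is very close to the coefficient itself; for larger changes, such as 10%, the approximation is noticeably looser.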
13.62
a. All else equal, a 1% increase in the price of beef will be associated with a decrease of .529% in the tons of beef consumed annually in the U.S.
b. All else equal, a 1% increase in the price of pork will be associated with an increase of .217% in the tons of beef consumed annually in the U.S.
c. \(H_0: \beta_4 = 0,\ H_1: \beta_4 > 0\), \(t = \frac{.416}{.163} = 2.552\), \(t_{25,.01} = 2.485\). Therefore, reject \(H_0\) at the 1% level.
d. \(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3, 4)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{25}{4} \cdot \frac{.683}{1 - .683} = 13.466, \quad F_{4,25,.01} = 4.18 \]
Therefore, reject \(H_0\) at the 1% level.
e. If an important independent variable has been omitted, there may be specification bias. The regression coefficients produced for the misspecified model would be misleading.
13.63 Estimate a Cobb-Douglas production function with three independent variables:
\[ Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} X_3^{\beta_3} \varepsilon \]
where \(X_1\) = capital, \(X_2\) = labor and \(X_3\) = basic research.
Taking the log of both sides of the equation yields:
\[ \log(Y) = \log(\beta_0) + \beta_1 \log(X_1) + \beta_2 \log(X_2) + \beta_3 \log(X_3) + \log(\varepsilon) \]
Using this form, regress the log of Y on the logs of the three independent variables and obtain the estimated regression slope coefficients.
13.64
a. Coefficients for exponential models can be estimated by taking the logarithm of both sides of the multiple regression model to obtain an equation that is linear in the logarithms of the variables.
\[ \log(Y) = \log(\beta_0) + \beta_1 \log(X_1) + \beta_2 \log(X_2) + \beta_3 \log(X_3) + \beta_4 \log(X_4) + \log(\varepsilon) \]
Substituting in the restrictions on the coefficients, \(\beta_1 + \beta_2 = 1\) so \(\beta_2 = 1 - \beta_1\), and \(\beta_3 + \beta_4 = 1\) so \(\beta_4 = 1 - \beta_3\):
\[ \log(Y) = \log(\beta_0) + \beta_1 \log(X_1) + [1 - \beta_1] \log(X_2) + \beta_3 \log(X_3) + [1 - \beta_3] \log(X_4) + \log(\varepsilon) \]
Simplify algebraically and estimate the coefficients. The coefficient \(\beta_2\) can be found by subtracting \(\beta_1\) from 1.0. Likewise the coefficient \(\beta_4\) can be found by subtracting \(\beta_3\) from 1.0.
b. Constant elasticity for Y versus \(X_4\) is the regression slope coefficient on the \(X_4\) term of the logarithm model.
12.34
Linear model:
[Regression plot: Salary versus Experience, linear fit. S = 3117.89, R-Sq = 78.0%, R-Sq(adj) = 77.9%]
Quadratic model:
[Regression plot: Salary versus Experience, quadratic fit. S = 3027.17, R-Sq = 79.4%, R-Sq(adj) = 79.1%]
Cubic model:
[Regression plot: Salary versus Experience, cubic fit. S = 2982.43, R-Sq = 80.2%, R-Sq(adj) = 79.8%]
All three of the models appear to fit the data well. The cubic model appears to fit the data the best as the standard error of the estimate is lowest. In addition, explanatory power is marginally higher for the cubic model than the other models.
13.66 Results for: GermanImports.xls
Regression Analysis: LogYt versus LogX1t, LogX2t
The regression equation is
LogYt = - 4.07 + 1.36 LogX1t + 0.101 LogX2t

Predictor    Coef       SE Coef    T        P       VIF
Constant     -4.0709    0.3100     -13.13   0.000
LogX1t       1.35935    0.03005    45.23    0.000   4.9
LogX2t       0.10094    0.05715    1.77     0.088   4.9

S = 0.04758   R-Sq = 99.7%   R-Sq(adj) = 99.7%

Analysis of Variance
Source           DF   SS       MS       F         P
Regression       2    21.345   10.673   4715.32   0.000
Residual Error   28   0.063    0.002
Total            30   21.409

Source   DF   Seq SS
LogX1t   1    21.338
LogX2t   1    0.007

13.67 What is the model constant when the dummy variable equals 1?
a. \(\hat{y} = 7 + 8x_1\), \(b_0 = 7\)
b. \(\hat{y} = 12 + 6x_1\), \(b_0 = 12\)
c. \(\hat{y} = 7 + 12x_1\), \(b_0 = 7\)
13.68 What is the model constant when the dummy variable equals 1?
a. \(\hat{y} = .78 + 4.87x_1\)
b. \(\hat{y} = 1.1 + 9.1x_1\)
c. \(\hat{y} = 13.67 + 8.98x_1\)
13.69
The interpretation of the dummy variable is that we can conclude that for a given difference between the spot price in the current year and OPEC price in the previous year, the difference between the OPEC price in the current year and OPEC price in the previous year is $5.22 higher in 1974 during the oil embargo than in other years.
13.70
a. All else being equal, expected selling price is higher by $3,219 if the condo has a fireplace.
b. All else being equal, expected selling price is higher by $2,005 if the condo has brick siding.
c. 95% CI: \(3219 \pm 1.96(947)\) = $1,362.88 up to $5,075.12
d. \(H_0: \beta_5 = 0,\ H_1: \beta_5 > 0\), \(t = \frac{2005}{768} = 2.611\), \(t_{809,.005} = 2.576\)
Therefore, reject \(H_0\) at the .5% level.
13.71
a. All else being equal, the price-earnings ratio is higher by 1.23 for a regional company than a national company.
b. \(H_0: \beta_2 = 0,\ H_1: \beta_2 \neq 0\), \(t = \frac{1.23}{.496} = 2.48\), \(t_{29,.01/.005} = 2.462, 2.756\)
Therefore, reject \(H_0\) at the 2% level but not at the 1% level.
c. \(H_0: \beta_1 = \beta_2 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{29}{2} \cdot \frac{.37}{1 - .37} = 8.516, \quad F_{2,29,.05} = 3.33 \]
Therefore, reject \(H_0\) at the 5% level.
13.72 3.6% of the variation in overall performance in law school can be explained by the variation in undergraduate gpa, scores on the LSATs and whether the student's letters of recommendation are unusually strong. The overall model is significant since we can reject the null hypothesis that the model has no explanatory power in favor of the alternative hypothesis that the model has significant explanatory power. The individual regression coefficients that are significantly different from zero include the scores on the LSAT and whether the student's letters of recommendation were unusually strong. The coefficient on undergraduate gpa was not found to be significant at the 5% level.
13.73
a. All else equal, the annual salary of the attorney general who can be removed is $5,793 higher than if the attorney general cannot be removed.
b. All else equal, the annual salary of the attorney general of the state is $3,100 lower if the supreme court justices are elected on partisan ballots.
c. \(H_0: \beta_5 = 0,\ H_1: \beta_5 > 0\), \(t = \frac{5793}{2897} = 1.9996\), \(t_{43,.05/.025} = 1.68, 2.016\)
Therefore, reject \(H_0\) at the 5% level but not at the 2.5% level.
d. \(H_0: \beta_6 = 0,\ H_1: \beta_6 < 0\), \(t = \frac{-3100}{1761} = -1.76\), \(t_{43,.05/.025} = -1.68, -2.016\)
Therefore, reject \(H_0\) at the 5% level but not at the 2.5% level.
e. \(t_{43,.025} = 2.016\)
95% CI: \(547 \pm 2.016(124.3)\), 296.41 up to 797.59
13.74
a. All else equal, the average rating of a course is 6.21 units higher if a visiting lecturer is brought in than if otherwise.
b. \(H_0: \beta_4 = 0,\ H_1: \beta_4 > 0\), \(t = \frac{6.21}{3.59} = 1.73\), \(t_{20,.05} = 1.725\)
Therefore, reject \(H_0\) at the 5% level.
c. 56.9% of the variation in the average course rating can be explained by the variation in the percentage of time spent in group discussions, the dollars spent on preparing the course materials, the dollars spent on food and drinks, and whether a guest lecturer is brought in.
\(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3, 4)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{20}{4} \cdot \frac{.569}{1 - .569} = 6.6, \quad F_{4,20,.01} = 4.43 \]
Therefore, reject \(H_0\) at the 1% level.
d. \(t_{20,.025} = 2.086\)
95% CI: \(.52 \pm 2.086(.21)\), .0819 up to .9581
13.75 34.4% of the variation in a test on understanding college economics can be explained by which course was taken, the student's gpa, the teacher that taught the course, the gender of the student, the pre-test score, the number of credit hours completed and the age of the student. The regression model has significant explanatory power:
\(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3, 4, 5, 6, 7)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{342}{7} \cdot \frac{.344}{1 - .344} = 25.62 \]
13.76 Results for: Student Performance.xls
Regression Analysis: Y versus X1, X2, X3, X4, X5
The regression equation is
Y = 2.00 + 0.0099 X1 + 0.0763 X2 - 0.137 X3 + 0.064 X4 + 0.138 X5

Predictor   Coef       SE Coef   T       P       VIF
Constant    1.997      1.273     1.57    0.132
X1          0.00990    0.01654   0.60    0.556   1.3
X2          0.07629    0.05654   1.35    0.192   1.2
X3          -0.13652   0.06922   -1.97   0.062   1.1
X4          0.0636     0.2606    0.24    0.810   1.4
X5          0.13794    0.07521   1.83    0.081   1.1

S = 0.5416   R-Sq = 26.5%   R-Sq(adj) = 9.0%

Analysis of Variance
Source           DF   SS       MS       F      P
Regression       5    2.2165   0.4433   1.51   0.229
Residual Error   21   6.1598   0.2933
Total            26   8.3763
The model is not significant (p-value of the F-test = .229). The model only explains 26.5% of the variation in gpa with the hours spent studying, hours spent preparing for tests, hours spent in bars, whether or not students take notes or mark highlights when reading tests and the average number of credit hours taken per semester. The only independent variables that are marginally significant (10% level but not the 5% level) include number of hours spent in bars and average number of credit hours. The other independent variables are not significant at common levels of alpha.
13.77 a. Begin the analysis with the correlation matrix – identify important independent variables as well as correlations between the independent variables.
Correlations: Salary, Experience, yearsenior, Gender_1F

            Salary   Experien   yearseni
Experien    0.883
            0.000
yearseni    0.777    0.674
            0.000    0.000
Gender_1    -0.429   -0.378     -0.292
            0.000    0.000      0.000

Regression Analysis: Salary versus Experience, yearsenior, Gender_1F
The regression equation is
Salary = 22644 + 437 Experience + 415 yearsenior - 1443 Gender_1F

Predictor   Coef      SE Coef   T       P       VIF
Constant    22644.1   521.8     43.40   0.000
Experien    437.10    31.41     13.92   0.000   2.0
yearseni    414.71    55.31     7.50    0.000   1.8
Gender_1    -1443.2   519.8     -2.78   0.006   1.2

S = 2603   R-Sq = 84.9%   R-Sq(adj) = 84.6%

Analysis of Variance
Source           DF    SS           MS           F        P
Regression       3     5559163505   1853054502   273.54   0.000
Residual Error   146   989063178    6774405
Total            149   6548226683
84.9% of the variation in annual salary (in dollars) can be explained by the variation in the years of experience, the years of seniority and the gender of the employee. All of the variables are significant at the .01 level of significance (p-values of .000, .000 and .006 respectively). The F-test of the significance of the overall model shows that we reject \(H_0\) that all of the slope coefficients are jointly equal to zero in favor of \(H_1\) that at least one slope coefficient is not equal to zero. The F-test yielded a p-value of .000.
b. \(H_0: \beta_3 = 0,\ H_1: \beta_3 < 0\), \(t = \frac{-1443.2}{519.8} = -2.78\), \(t_{146,.01} = -2.326\)
Therefore, reject \(H_0\) at the 1% level and conclude that the annual salaries for females are statistically significantly lower than they are for males.
c. Add an interaction term and test for the significance of the slope coefficient on the interaction term.
13.78 Two variables are included as predictor variables. What is the effect on the estimated slope coefficients when these two variables have a correlation equal to
a. .78. A large correlation among the independent variables will lead to a high variance for the estimated slope coefficients, which will tend to have small Student's t statistics. Use the rule of thumb \(|r| > \frac{2}{\sqrt{n}}\) to determine if the correlation is 'large'.
b. .08. No correlation exists among the independent variables. No effect on the estimated slope coefficients.
c. .94. A large correlation among the independent variables will lead to a high variance for the estimated slope coefficients, which will tend to have small Student's t statistics.
d. .33. Use the rule of thumb \(|r| > \frac{2}{\sqrt{n}}\) to determine if the correlation is 'large'.
13.79 n = 34 and four independent variables. R = .23. Does this imply that this independent variable will have a very small Student's t statistic?
Correlation between the independent variable and the dependent variable is not necessarily evidence of a small Student's t statistic. A high correlation among the independent variables could result in a very small Student's t statistic as the correlation creates a high variance.
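The rule of thumb quoted in 13.78 compares the absolute correlation with 2/√n. Exercise 13.78 does not state a sample size, so the value n = 30 in this sketch is purely hypothetical, as is the function name.

```python
import math


def is_large_correlation(r, n):
    """Rule of thumb: a sample correlation is 'large' when |r| > 2 / sqrt(n)."""
    return abs(r) > 2 / math.sqrt(n)


n = 30  # hypothetical sample size, for illustration only
for r in (0.78, 0.08, 0.94, 0.33):
    print(r, is_large_correlation(r, n))
```

With a larger sample the threshold shrinks, so a correlation such as .33 may or may not count as large depending on n.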
13.80 n = 47 with three independent variables. One of the independent variables has a correlation of .9 with the dependent variable. Correlation between the independent variable and the dependent variable is not necessarily evidence of a small Student's t statistic. A high correlation among the independent variables could result in a very small Student's t statistic as the correlation creates a high variance.
13.81 n = 49 with two independent variables. One of the independent variables has a correlation of .6 with the dependent variable. Correlation between the independent variable and the dependent variable is not necessarily evidence of a small Student's t statistic. A high correlation among the independent variables could result in a very small Student's t statistic as the correlation creates a high variance.
13.82 through 13.84 Reports can be written by following the extended Case Study on the data file Cotton – see Section 13.9.
13.85 Regression Analysis: y_deathrate versus x1_totmiles, x2_avgspeed
The regression equation is
y_deathrate = - 2.97 - 0.00447 x1_totmiles + 0.219 x2_avgspeed

Predictor   Coef        SE Coef    T       P       VIF
Constant    -2.969      3.437      -0.86   0.416
x1_totmi    -0.004470   0.001549   -2.89   0.023   11.7
x2_avgsp    0.21879     0.08391    2.61    0.035   11.7

S = 0.1756   R-Sq = 55.1%   R-Sq(adj) = 42.3%

Analysis of Variance
Source           DF   SS        MS        F      P
Regression       2    0.26507   0.13254   4.30   0.061
Residual Error   7    0.21593   0.03085
Total            9    0.48100
55.1% of the variation in death rates can be explained by the variation in total miles traveled and in average travel speed. The overall model is significant at the 10% but not the 5% level since the p-value of the F-test is .061. All else equal, the average speed variable has the expected sign since as average speed increases, the death rate also increases. The total miles traveled variable is negative, which indicates that the more miles traveled, the lower the death rate. Both of the independent variables are significant at the 5% level (p-values of .023 and .035 respectively). There appears to be some correlation between the independent variables.
Due to the high correlation between the independent variables, an alternative model using a quadratic model is as follows:
Regression Analysis: y_deathrate versus x1_totmiles, x1_totsquared
The regression equation is
y_deathrate = - 6.54 + 0.0268 x1_totmiles - 0.000015 x1_totsquared

Predictor   Coef          SE Coef      T       P       VIF
Constant    -6.539        1.296        -5.04   0.001
x1_totmi    0.026800      0.002835     9.45    0.000   285.5
x1_totsq    -0.00001480   0.00000153   -9.68   0.000   285.5

S = 0.06499   R-Sq = 93.9%   R-Sq(adj) = 92.1%

Analysis of Variance
Source           DF   SS        MS        F       P
Regression       2    0.45143   0.22572   53.44   0.000
Residual Error   7    0.02957   0.00422
Total            9    0.48100

Source     DF   Seq SS
x1_totmi   1    0.05534
x1_totsq   1    0.39609
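The VIF column in the two outputs above can be linked back to the correlation between the regressors. The sketch below relies on the standard definition VIF = 1/(1 − R_j²), where R_j² comes from regressing one independent variable on the others; with only two regressors R_j² is the squared simple correlation between them. This relationship is general background rather than something stated in the solution, and the function names are hypothetical.

```python
import math


def vif(r2_j):
    """Variance inflation factor for a regressor whose auxiliary regression has R^2 = r2_j."""
    return 1.0 / (1.0 - r2_j)


def implied_correlation(vif_value):
    """Absolute correlation between two regressors implied by a reported VIF (two-regressor case)."""
    return math.sqrt(1.0 - 1.0 / vif_value)


print(round(implied_correlation(11.7), 4))   # total miles and average speed
print(round(implied_correlation(285.5), 4))  # total miles and its square
```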
13.86 Regression Analysis: y_FemaleLFPR versus x1_income, x2_yrsedu, ...
The regression equation is
y_FemaleLFPR = 0.2 + 0.000406 x1_income + 4.84 x2_yrsedu - 1.55 x3_femaleun

Predictor   Coef        SE Coef     T       P       VIF
Constant    0.16        34.91       0.00    0.996
x1_incom    0.0004060   0.0001736   2.34    0.024   1.2
x2_yrsed    4.842       2.813       1.72    0.092   1.5
x3_femal    -1.5543     0.3399      -4.57   0.000   1.3

S = 3.048   R-Sq = 54.3%   R-Sq(adj) = 51.4%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression       3    508.35   169.45   18.24   0.000
Residual Error   46   427.22   9.29
Total            49   935.57
13.87 Regression Analysis: y_money versus x1_pcincome, x2_ir
The regression equation is
y_money = - 1158 + 0.253 x1_pcincome - 19.6 x2_ir

Predictor   Coef      SE Coef   T       P       VIF
Constant    -1158.4   587.9     -1.97   0.080
x1_pcinc    0.25273   0.03453   7.32    0.000   1.3
x2_ir       -19.56    21.73     -0.90   0.391   1.3

S = 84.93   R-Sq = 89.8%   R-Sq(adj) = 87.5%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression       2    570857   285429   39.57   0.000
Residual Error   9    64914    7213
Total            11   635771

Source     DF   Seq SS
x1_pcinc   1    565012
x2_ir      1    5845

13.88 Regression Analysis: y_manufgrowt versus x1_aggrowth, x2_exportgro, ...
The regression equation is
y_manufgrowth = 2.15 + 0.493 x1_aggrowth + 0.270 x2_exportgrowth - 0.117 x3_inflation

Predictor   Coef       SE Coef   T       P       VIF
Constant    2.1505     0.9695    2.22    0.032
x1_aggro    0.4934     0.2020    2.44    0.019   1.0
x2_expor    0.26991    0.06494   4.16    0.000   1.0
x3_infla    -0.11709   0.05204   -2.25   0.030   1.0

S = 3.624   R-Sq = 39.3%   R-Sq(adj) = 35.1%

Analysis of Variance
Source           DF   SS       MS       F      P
Regression       3    373.98   124.66   9.49   0.000
Residual Error   44   577.97   13.14
Total            47   951.95

Source     DF   Seq SS
x1_aggro   1    80.47
x2_expor   1    227.02
x3_infla   1    66.50
13.89 The method of least squares regression yields estimators that are BLUE – Best Linear Unbiased Estimators. This result holds when the assumptions regarding the behavior of the error term are true. BLUE estimators are the most efficient (best) estimators out of the class of all linear unbiased estimators. The advent of computing power incorporating the method of least squares has dramatically increased its use.
13.90 The analysis of variance table identifies how the total variability of the dependent variable (SST) is split up between the portion of variability that is explained by the regression model (SSR) and the part that is unexplained (SSE). The Coefficient of Determination (\(R^2\)) is derived as the ratio of SSR to SST. The analysis of variance table also computes the F statistic for the test of the significance of the overall regression – whether all of the slope coefficients are jointly equal to zero. The associated p-value is also generally reported in this table.
13.91
a. False. If the regression model does not explain a large enough portion of the variability of the dependent variable, then the error sum of squares can be larger than the regression sum of squares.
b. False – the sum of several simple linear regressions will not equal a multiple regression since the assumption of 'all else equal' will be violated in the simple linear regressions. The multiple regression 'holds' all else equal in calculating the partial effect that a change in one of the independent variables has on the dependent variable.
c. True
d. False – While the regular coefficient of determination (\(R^2\)) cannot be negative, the adjusted coefficient of determination \(\bar{R}^2\) can become negative. If the independent variables added into a regression equation have very little explanatory power, the loss of degrees of freedom may more than offset the added explanatory power.
e. True
13.92 If one model contains more explanatory variables, then SST remains the same for both models but SSR will be higher for the model with more explanatory variables. Since SST = SSR1 + SSE1, which is equivalent to SSR2 + SSE2, and given that SSR2 > SSR1, then SSE1 > SSE2. Hence, the coefficient of determination will be higher with a greater number of explanatory variables, and the coefficient of determination must be interpreted in conjunction with whether or not the regression slope coefficients on the explanatory variables are significantly different from zero.
13.93 This is a classic example of what happens when there is a high degree of correlation between the independent variables. The overall model can be shown to have significant explanatory power and yet none of the slope coefficients on the independent variables are significantly different from zero. This is due to the effect that high correlation among the independent variables has on the variance of the estimated slope coefficient.
13.94
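Substituting \(a = \bar{y} - b_1\bar{x}_1 - b_2\bar{x}_2\):
\[
\begin{aligned}
\sum e_i &= \sum (y_i - a - b_1 x_{1i} - b_2 x_{2i}) \\
\sum e_i &= \sum (y_i - \bar{y} + b_1\bar{x}_1 + b_2\bar{x}_2 - b_1 x_{1i} - b_2 x_{2i}) \\
\sum e_i &= n\bar{y} - n\bar{y} + nb_1\bar{x}_1 + nb_2\bar{x}_2 - nb_1\bar{x}_1 - nb_2\bar{x}_2 \\
\sum e_i &= 0
\end{aligned}
\]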
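The result that least squares residuals sum to zero can be checked numerically. The sketch below fits a two-regressor model with the closed-form deviation formulas and the intercept \(a = \bar{y} - b_1\bar{x}_1 - b_2\bar{x}_2\). The data and the function name are hypothetical, chosen only for illustration.

```python
def fit_two_regressors(x1, x2, y):
    """Least squares fit of y = a + b1*x1 + b2*x2 using sums of squares and cross products."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2  # the intercept used in the derivation
    return a, b1, b2


# Hypothetical data
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

a, b1, b2 = fit_two_regressors(x1, x2, y)
residuals = [w - a - b1 * u - b2 * v for u, v, w in zip(x1, x2, y)]
print(abs(sum(residuals)) < 1e-9)
```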
13.95
a. All else equal, a unit change in population, industry size, measure of economic quality, measure of political quality, measure of environmental quality, measure of health and educational quality, and social life results in a respective 4.983, 2.198, 3.816, -.310, -.886, 3.215, and .08 increase in the new business starts in the industry.
b. \(R^2 = .766\). 76.6% of the variability in new business starts in the industry can be explained by the variability in the independent variables: population, industry size, economic, political, environmental, health and educational quality of life.
c. \(t_{62,.05} = 1.67\), therefore, the 90% CI: \(3.816 \pm 1.67(2.063)\) = .3708 up to 7.2612
d. \(H_0: \beta_5 = 0,\ H_1: \beta_5 \neq 0\), \(t = \frac{-.886}{3.055} = -.29\), \(t_{62,.025} = \pm 2.000\)
Therefore, do not reject \(H_0\) at the 5% level.
e. \(H_0: \beta_6 = 0,\ H_1: \beta_6 \neq 0\), \(t = \frac{3.215}{1.568} = 2.05\), \(t_{62,.025} = \pm 2.000\)
Therefore, reject \(H_0\) at the 5% level.
f. \(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3, 4, 5, 6, 7)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{62}{7} \cdot \frac{.766}{1 - .766} = 28.99 \]
\(F_{7,62,.01} = 2.79\). Therefore, reject \(H_0\) at the 1% level.
13.96
a. All else equal, an increase of one question results in a decrease of 1.8345 in expected percentage of responses received. All else equal, an increase of one word in length of the questionnaire results in a decrease of .016 in expected percentage of responses received.
b. 63.7% of the variability in the percentage of responses received can be explained by the variability in the number of questions asked and the number of words.
c. \(H_0: \beta_1 = \beta_2 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{27}{2} \cdot \frac{.637}{1 - .637} = 23.69 \]
\(F_{2,27,.01} = 5.49\). Therefore, reject \(H_0\) at the 1% level.
d. \(t_{27,.005} = 2.771\), 99% CI: \(-1.8345 \pm 2.771(.6349)\), -3.5938 up to -.0752
e. \(t = -1.78\), \(t_{27,.05/.025} = -1.703, -2.052\). Therefore, reject \(H_0\) at the 5% level but not at the 2.5% level.
13.97
a. All else equal, a 1% increase in course time spent in group discussion results in an expected increase of .3817 in the average rating of the course. All else equal, a dollar increase in money spent on the preparation of subject matter materials results in an expected increase of .5172 in the average rating by participants of the course. All else equal, a unit increase in expenditure on non-course related materials results in an expected increase of .0753 in the average rating of the course.
b. 57.9% of the variation in the average rating can be explained by the linear relationship with the percentage of class time spent on discussion, money spent on the preparation of subject matter materials and money spent on non-class related materials.
c. \(H_0: \beta_1 = \beta_2 = \beta_3 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{21}{3} \cdot \frac{.579}{1 - .579} = 9.627 \]
\(F_{3,21,.05} = 3.07\). Therefore, reject \(H_0\) at the 5% level.
d. \(t_{21,.05} = 1.721\), 90% CI: \(.3817 \pm 1.721(.2018)\), .0344 up to .7290
e. \(t = 2.64\), \(t_{21,.01/.005} = 2.518, 2.831\). Therefore, reject \(H_0\) at the 1% level but not at the .5% level.
f. \(t = 1.09\), \(t_{21,.05} = 1.721\). Therefore, do not reject \(H_0\) at the 10% level.
13.98 Regression Analysis: y_rating versus x1_expgrade, x2_Numstudents
The regression equation is
y_rating = - 0.200 + 1.41 x1_expgrade - 0.0158 x2_Numstudents

Predictor   Coef        SE Coef    T       P       VIF
Constant    -0.2001     0.6968     -0.29   0.777
x1_expgr    1.4117      0.1780     7.93    0.000   1.5
x2_Numst    -0.015791   0.003783   -4.17   0.001   1.5

S = 0.1866   R-Sq = 91.5%   R-Sq(adj) = 90.5%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression       2    6.3375   3.1687   90.99   0.000
Residual Error   17   0.5920   0.0348
Total            19   6.9295

13.99 \(H_0: \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 4, 5, 6, 7)\)
\[ F = \frac{n-k-1}{k_1} \cdot \frac{R^2 - R^{*2}}{1 - R^2} = \frac{55}{4} \cdot \frac{.467 - .242}{1 - .467} = 5.804, \quad F_{4,55,.01} = 3.68 \]
Therefore, reject \(H_0\) at the 1% level.
13.100
a. All else equal, each extra point in the student's expected score leads to an expected increase of .469 in the actual score.
b. \(t_{103,.025} = 1.98\), therefore, the 95% CI: \(3.369 \pm 1.98(.456)\) = 2.4661 up to 4.2719
c. \(H_0: \beta_3 = 0,\ H_1: \beta_3 \neq 0\), \(t = \frac{3.054}{1.457} = 2.096\), \(t_{103,.025} = 1.96\)
Therefore, reject \(H_0\) at the 5% level.
d. 68.6% of the variation in the exam scores is explained by their linear dependence on the student's expected score, hours per week spent working on the course and the student's grade point average.
e. \(H_0: \beta_1 = \beta_2 = \beta_3 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2, 3)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{103}{3} \cdot \frac{.686}{1 - .686} = 75.008, \quad F_{3,103,.01} = 3.95 \]
Reject \(H_0\) at any common level of alpha.
f. \(R = \sqrt{.686} = .8282\)
g. \(\hat{Y} = 2.178 + .469(80) + 3.369(8) + 3.054(3) = 75.812\)
13.101 a. \(t_{22,.005} = 2.819\), therefore, the 99% CI: \(.0974 \pm 2.819(.0215)\) = .0368 up to .1580
b. \(H_0: \beta_2 = 0,\ H_1: \beta_2 > 0\), \(t = \frac{.374}{.209} = 1.789\), \(t_{22,.05/.025} = 1.717, 2.074\)
Therefore, reject \(H_0\) at the 5% level but not at the 2.5% level.
c. \(R^2 = \frac{22(.91) + 2}{24} = .9175\)
d. \(H_0: \beta_1 = \beta_2 = 0,\ H_1:\) At least one \(\beta_i \neq 0\ (i = 1, 2)\)
\[ F = \frac{n-k-1}{k} \cdot \frac{R^2}{1 - R^2} = \frac{22}{2} \cdot \frac{.9175}{1 - .9175} = 122.33, \quad F_{2,22,.01} = 5.72 \]
Reject \(H_0\) at any common level of alpha.
e. \(R = \sqrt{.9175} = .9579\)
13.102 a. \(t_{2669,.05} = 1.645\), therefore, the 90% CI: \(480.04 \pm 1.645(224.9)\) = 110.0795 up to 850.0005
b. \(t_{2669,.005} = 2.576\), therefore, the 99% CI: \(1350.3 \pm 2.576(212.3)\) = 803.4152 up to 1897.1848
c. \(H_0: \beta_8 = 0,\ H_1: \beta_8 < 0\), \(t = \frac{-891.67}{180.87} = -4.9299\)
\(t_{2669,.005} = 2.576\), therefore, reject \(H_0\) at the .5% level.
d. \(H_0: \beta_9 = 0,\ H_1: \beta_9 > 0\), \(t = \frac{722.95}{110.98} = 6.5142\)
\(t_{2669,.005} = 2.576\), therefore, reject \(H_0\) at the .5% level.
e. 52.39% of the variability in minutes played in the season can be explained by the variability in all 9 variables.
f. \(R = \sqrt{.5239} = .7238\)
13.103 a. \(H_0: \beta_1 = 0,\ H_1: \beta_1 \neq 0\), \(t = \frac{-.052}{.019} = -2.737\), \(t_{60,.005} = 2.66\), therefore, reject \(H_0\) at the 1% level.
b. \(H_0: \beta_2 = 0,\ H_1: \beta_2 \neq 0\), \(t = \frac{-.005}{.042} = -.119\), \(t_{60,.10} = 1.296\), therefore, do not reject \(H_0\) at the 20% level.
c. 17% of the variation in the growth rate in GDP can be explained by the variations in real income per capita and the average tax rate, as a proportion of GNP.
d. \(R = \sqrt{.17} = .4123\)
13.104 A report can be written by following the Case Study and testing the significance of the model. See Section 13.9.
3
Statistics for Business & Economics, 6 th edition
13.105 a. Start with the correlation matrix:

Correlations: EconGPA, SATverb, SATmath, HSPct
            EconGPA   SATverb   SATmath
SATverb       0.478
              0.000
SATmath       0.427     0.353
              0.000     0.003
HSPct         0.362     0.201     0.497
              0.000     0.121     0.000

Regression Analysis: EconGPA versus SATverb, SATmath, HSPct
The regression equation is
EconGPA = 0.612 + 0.0239 SATverb + 0.0117 SATmath + 0.00530 HSPct
61 cases used 51 cases contain missing values
Predictor       Coef    SE Coef      T      P    VIF
Constant      0.6117     0.4713   1.30  0.200
SATverb     0.023929   0.007386   3.24  0.002    1.2
SATmath     0.011722   0.007887   1.49  0.143    1.5
HSPct       0.005303   0.004213   1.26  0.213    1.3
S = 0.4238   R-Sq = 32.9%   R-Sq(adj) = 29.4%
Analysis of Variance
Source           DF        SS       MS     F      P
Regression        3    5.0171   1.6724  9.31  0.000
Residual Error   57   10.2385   0.1796
Total            60   15.2556
Source    DF   Seq SS
SATverb    1   3.7516
SATmath    1   0.9809
HSPct      1   0.2846
The regression model indicates positive coefficients, as expected, for all three independent variables. The greater the high school rank, and the higher the SAT verbal and SAT math scores, the larger the Econ GPA. The high school rank variable has the smallest t-statistic and is removed from the model:

Regression Analysis: EconGPA versus SATverb, SATmath
The regression equation is
EconGPA = 0.755 + 0.0230 SATverb + 0.0174 SATmath
67 cases used 45 cases contain missing values
Predictor       Coef    SE Coef      T      P    VIF
Constant      0.7547     0.4375   1.72  0.089
SATverb     0.022951   0.006832   3.36  0.001    1.1
SATmath     0.017387   0.006558   2.65  0.010    1.1
S = 0.4196   R-Sq = 30.5%   R-Sq(adj) = 28.3%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        2    4.9488   2.4744  14.05  0.000
Residual Error   64   11.2693   0.1761
Total            66   16.2181
Source    DF   Seq SS
SATverb    1   3.7109
SATmath    1   1.2379
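With only two predictors in a model, each variance inflation factor in the VIF column reduces to 1/(1 - r²), where r is the simple correlation between the two predictors. The sketch below checks this against the SATverb and SATmath correlation of .353 from the correlation matrix; the helper name is illustrative.

```python
def vif_two_predictors(r: float) -> float:
    """VIF shared by both predictors in a two-predictor regression.

    r is the simple correlation between the two independent variables.
    """
    return 1.0 / (1.0 - r * r)

# Correlation between SATverb and SATmath from the correlation matrix.
vif_sat = vif_two_predictors(0.353)
print(round(vif_sat, 1))
```

A value this close to 1 is consistent with the reported VIF of 1.1 and with the remark that multicollinearity is not dominant in the two-variable SAT model.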
Both SAT variables are now statistically significant at the .05 level and appear to pick up separate influences on the dependent variable. The simple correlation coefficient between SAT math and SAT verbal is relatively low at .353. Thus, multicollinearity will not be dominant in this regression model. The final regression model, with conditional t-statistics in parentheses under the coefficients, is:

Ŷ = .755 + .023(SATverbal) + .0174(SATmath)
           (3.36)            (2.65)
S = .4196   R² = .305   n = 67

b. Start with the correlation matrix:

Correlations: EconGPA, Acteng, ACTmath, ACTss, ACTcomp, HSPct
            EconGPA    Acteng   ACTmath     ACTss   ACTcomp
Acteng        0.387
              0.001
ACTmath       0.338     0.368
              0.003     0.001
ACTss         0.442     0.448     0.439
              0.000     0.000     0.000
ACTcomp       0.474     0.650     0.765     0.812
              0.000     0.000     0.000     0.000
HSPct         0.362     0.173     0.290     0.224     0.230
              0.000     0.150     0.014     0.060     0.053

Regression Analysis: EconGPA versus Acteng, ACTmath, ...
The regression equation is
EconGPA = - 0.207 + 0.0266 Acteng - 0.0023 ACTmath + 0.0212 ACTss + 0.0384 ACTcomp + 0.0128 HSPct
71 cases used 41 cases contain missing values
Predictor       Coef    SE Coef      T      P    VIF
Constant     -0.2069     0.6564  -0.32  0.754
Acteng       0.02663    0.02838   0.94  0.352    2.2
ACTmath     -0.00229    0.03031  -0.08  0.940    4.2
ACTss        0.02118    0.02806   0.75  0.453    4.6
ACTcomp      0.03843    0.07287   0.53  0.600   12.7
HSPct       0.012817   0.005271   2.43  0.018    1.2
S = 0.5034   R-Sq = 31.4%   R-Sq(adj) = 26.1%
Analysis of Variance
Source           DF        SS       MS     F      P
Regression        5    7.5253   1.5051  5.94  0.000
Residual Error   65   16.4691   0.2534
Total            70   23.9945
Source    DF   Seq SS
Acteng     1   3.5362
ACTmath    1   1.0529
ACTss      1   1.4379
ACTcomp    1   0.0001
HSPct      1   1.4983
The regression shows that only high school rank is significant at the .05 level. We may suspect multicollinearity between the variables, particularly since there is a 'total' ACT score (ACT composite) as well as the components that make up the ACT composite. Since conditional significance is dependent on which other independent variables are included in the regression equation, drop one variable at a time. ACTmath has the lowest t-statistic and is removed:

Regression Analysis: EconGPA versus Acteng, ACTss, ACTcomp, HSPct
The regression equation is
EconGPA = - 0.195 + 0.0276 Acteng + 0.0224 ACTss + 0.0339 ACTcomp + 0.0127 HSPct
71 cases used 41 cases contain missing values
Predictor       Coef    SE Coef      T      P    VIF
Constant     -0.1946     0.6313  -0.31  0.759
Acteng       0.02756    0.02534   1.09  0.281    1.8
ACTss        0.02242    0.02255   0.99  0.324    3.0
ACTcomp      0.03391    0.04133   0.82  0.415    4.2
HSPct       0.012702   0.005009   2.54  0.014    1.1
S = 0.4996   R-Sq = 31.4%   R-Sq(adj) = 27.2%
Analysis of Variance
Source           DF        SS       MS     F      P
Regression        4    7.5239   1.8810  7.54  0.000
Residual Error   66   16.4706   0.2496
Total            70   23.9945
Source    DF   Seq SS
Acteng     1   3.5362
ACTss      1   2.1618
ACTcomp    1   0.2211
HSPct      1   1.6048
Again, high school rank is the only conditionally significant variable. ACTcomp has the lowest t-statistic and is removed:

Regression Analysis: EconGPA versus Acteng, ACTss, HSPct
The regression equation is
EconGPA = 0.049 + 0.0390 Acteng + 0.0364 ACTss + 0.0129 HSPct
71 cases used 41 cases contain missing values
Predictor       Coef    SE Coef      T      P    VIF
Constant      0.0487     0.5560   0.09  0.930
Acteng       0.03897    0.02114   1.84  0.070    1.3
ACTss        0.03643    0.01470   2.48  0.016    1.3
HSPct       0.012896   0.004991   2.58  0.012    1.1
S = 0.4983   R-Sq = 30.7%   R-Sq(adj) = 27.6%
Analysis of Variance
Source           DF        SS       MS     F      P
Regression        3    7.3558   2.4519  9.87  0.000
Residual Error   67   16.6386   0.2483
Total            70   23.9945
Source    DF   Seq SS
Acteng     1   3.5362
ACTss      1   2.1618
HSPct      1   1.6579
Now ACTss and high school rank are conditionally significant. ACTenglish has a t-statistic less than 2 and is removed:
Regression Analysis: EconGPA versus ACTss, HSPct
The regression equation is
EconGPA = 0.566 + 0.0479 ACTss + 0.0137 HSPct
71 cases used 41 cases contain missing values
Predictor       Coef    SE Coef      T      P    VIF
Constant      0.5665     0.4882   1.16  0.250
ACTss        0.04790    0.01355   3.53  0.001    1.1
HSPct       0.013665   0.005061   2.70  0.009    1.1
S = 0.5070   R-Sq = 27.1%   R-Sq(adj) = 25.0%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        2    6.5123   3.2562  12.67  0.000
Residual Error   68   17.4821   0.2571
Total            70   23.9945
Source    DF   Seq SS
ACTss      1   4.6377
HSPct      1   1.8746
Both of the independent variables are statistically significant at the .05 level and hence, the final regression model, with conditional t-statistics in parentheses under the coefficients, is:

Ŷ = .567 + .0479(ACTss) + .0137(HSPct)
           (3.53)         (2.70)
S = .5070   R² = .271   n = 71

c. The regression model with the SAT variables is the better predictor because the standard error of the estimate is smaller than for the ACT model (.4196 vs. .5070). The R² measure cannot be directly compared due to the sample size differences.

13.106
Correlations: Salary, age, Experien, yrs_asoc, yrs_full, Sex_1Fem, Market, C8
            Salary      age  Experien  yrs_asoc  yrs_full  Sex_1Fem    Market
age          0.749
             0.000
Experien     0.883    0.877
             0.000    0.000
yrs_asoc     0.698    0.712     0.803
             0.000    0.000     0.000
yrs_full     0.777    0.583     0.674     0.312
             0.000    0.000     0.000     0.000
Sex_1Fem    -0.429   -0.234    -0.378    -0.367    -0.292
             0.000    0.004     0.000     0.000     0.000
Market       0.026   -0.134    -0.150    -0.113    -0.017     0.062
             0.750    0.103     0.067     0.169     0.833     0.453
C8          -0.029   -0.189    -0.117    -0.073    -0.043    -0.094    -0.107
             0.721    0.020     0.155     0.373     0.598     0.254     0.192
The correlation matrix indicates that several of the independent variables are likely to be significant; however, multicollinearity is also a likely result. The regression model with all independent variables is:

Regression Analysis: Salary versus age, Experien, ...
The regression equation is
Salary = 23725 - 40.3 age + 357 Experien + 263 yrs_asoc + 493 yrs_full - 954 Sex_1Fem + 3427 Market + 1188 C8
Predictor      Coef   SE Coef       T      P    VIF
Constant      23725      1524   15.57  0.000
age          -40.29     44.98   -0.90  0.372    4.7
Experien     356.83     63.48    5.62  0.000   10.0
yrs_asoc     262.50     75.11    3.49  0.001    4.0
yrs_full     492.91     59.27    8.32  0.000    2.6
Sex_1Fem     -954.1     487.3   -1.96  0.052    1.3
Market       3427.2     754.1    4.54  0.000    1.1
C8           1188.4     597.5    1.99  0.049    1.1
S = 2332   R-Sq = 88.2%   R-Sq(adj) = 87.6%
Analysis of Variance
Source            DF          SS         MS       F      P
Regression         7  5776063882  825151983  151.74  0.000
Residual Error   142   772162801    5437766
Total            149  6548226683
Source      DF      Seq SS
age          1  3669210599
Experien     1  1459475287
yrs_asoc     1     1979334
yrs_full     1   500316356
Sex_1Fem     1    22707368
Market       1   100860164
Since age is insignificant and has the smallest t-statistic, it is a candidate for removal from the model. The conditional F test for age is:

$F_{X_2} = \frac{SSR_F - SSR_R}{s^2_{Y|X}} = \frac{5{,}776{,}064{,}000 - 5{,}771{,}700{,}736}{(2332)^2} = .80$

which is well below any common critical value of F. Thus, age is removed from the model. The remaining independent variables are all significant at the .05 level of significance and hence become the final regression model. Residual analysis to determine whether the assumption of linearity holds is then carried out on this reduced model.
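The conditional F statistic for age can be reproduced from the regression sums of squares of the full model and of the model without age, together with the full model's error mean square. The following is a sketch using the values printed in the two Minitab ANOVA tables; the function name and argument names are illustrative.

```python
def conditional_f(ssr_full: float, ssr_reduced: float,
                  mse_full: float, num_dropped: int = 1) -> float:
    """Partial F statistic for dropping num_dropped variables from a model."""
    return (ssr_full - ssr_reduced) / num_dropped / mse_full

# Regression SS with age (full), without age (reduced), and MSE of the full model.
f_age = conditional_f(5_776_063_882, 5_771_700_580, 5_437_766)
print(round(f_age, 2))
```

For a single dropped variable this statistic equals the square of that variable's t-statistic, which agrees with the reported t of -0.90 for age up to rounding.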
Regression Analysis: Salary versus Experien, yrs_asoc, ...
The regression equation is
Salary = 22455 + 324 Experien + 258 yrs_asoc + 491 yrs_full - 1043 Sex_1Fem + 3449 Market + 1274 C8
Predictor      Coef   SE Coef       T      P    VIF
Constant    22455.2     557.7   40.26  0.000
Experien     324.24     51.99    6.24  0.000    6.7
yrs_asoc     257.88     74.88    3.44  0.001    4.0
yrs_full     490.97     59.19    8.29  0.000    2.6
Sex_1Fem    -1043.4     476.7   -2.19  0.030    1.2
Market       3449.4     753.2    4.58  0.000    1.1
C8           1274.5     589.3    2.16  0.032    1.1
S = 2330   R-Sq = 88.1%   R-Sq(adj) = 87.6%
Analysis of Variance
Source            DF          SS         MS       F      P
Regression         6  5771700580  961950097  177.15  0.000
Residual Error   143   776526103    5430252
Total            149  6548226683

[Figures: plots of the residuals versus Experien, yrs_asoc, yrs_full, Sex_1Fem, Market and C8]
The residual plot for Experience shows a relatively strong quadratic relationship between Experience and Salary. Therefore, a new variable, taking into account the quadratic relationship, is generated and added to the model. None of the other residual plots shows strong evidence of non-linearity.

Regression Analysis: Salary versus Experien, ExperSquared, ...
The regression equation is
Salary = 18915 + 875 Experien - 15.9 ExperSquared + 222 yrs_asoc + 612 yrs_full - 650 Sex_1Fem + 3978 Market + 1042 C8
Predictor      Coef   SE Coef       T      P    VIF
Constant    18915.2     583.2   32.43  0.000
Experien     875.35     72.20   12.12  0.000   20.6
ExperSqu    -15.947     1.717   -9.29  0.000   16.2
yrs_asoc     221.58     59.40    3.73  0.000    4.0
yrs_full     612.10     48.63   12.59  0.000    2.8
Sex_1Fem     -650.1     379.6   -1.71  0.089    1.2
Market       3978.3     598.8    6.64  0.000    1.1
C8           1042.3     467.1    2.23  0.027    1.1
S = 1844   R-Sq = 92.6%   R-Sq(adj) = 92.3%
Analysis of Variance
Source            DF          SS         MS       F      P
Regression         7  6065189270  866455610  254.71  0.000
Residual Error   142   483037413    3401672
Total            149  6548226683
Source      DF      Seq SS
Experien     1  5109486518
ExperSqu     1    91663414
yrs_asoc     1    15948822
yrs_full     1   678958872
Sex_1Fem     1    12652358
Market       1   139540652
C8           1    16938635
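One way to read the quadratic term is through the marginal effect of experience, which under this specification is b1 + 2·b2·Experien rather than a constant. The sketch below applies that calculus identity to the fitted coefficients from the output above; the interpretation and the helper names are an illustration, not part of the original solution.

```python
def marginal_effect(b_linear: float, b_squared: float, x: float) -> float:
    """Change in predicted y per unit of x when the model contains x and x^2."""
    return b_linear + 2.0 * b_squared * x

def turning_point(b_linear: float, b_squared: float) -> float:
    """Value of x at which the marginal effect is zero."""
    return -b_linear / (2.0 * b_squared)

# Fitted coefficients on Experien and ExperSquared from the Minitab output.
B_EXPER, B_EXPER_SQ = 875.35, -15.947

effect_at_10 = marginal_effect(B_EXPER, B_EXPER_SQ, 10.0)
peak_years = turning_point(B_EXPER, B_EXPER_SQ)
print(round(effect_at_10, 2), round(peak_years, 2))
```

Under these estimates, all else equal, the salary gain from an additional year of experience shrinks as experience accumulates, which is the concave pattern the residual plot suggested.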
The squared term for experience is statistically significant; however, Sex_1Fem is no longer significant at the .05 level and hence is removed from the model:

Regression Analysis: Salary versus Experien, ExperSquared, ...
The regression equation is
Salary = 18538 + 888 Experien - 16.3 ExperSquared + 237 yrs_asoc + 624 yrs_full + 3982 Market + 1145 C8
Predictor      Coef   SE Coef       T      P    VIF
Constant    18537.8     543.6   34.10  0.000
Experien     887.85     72.32   12.28  0.000   20.4
ExperSqu    -16.275     1.718   -9.48  0.000   16.0
yrs_asoc     236.89     59.11    4.01  0.000    3.9
yrs_full     624.49     48.41   12.90  0.000    2.8
Market       3981.8     602.9    6.60  0.000    1.1
C8           1145.4     466.3    2.46  0.015    1.0
S = 1857   R-Sq = 92.5%   R-Sq(adj) = 92.2%
Analysis of Variance
Source            DF          SS          MS       F      P
Regression         6  6055213011  1009202168  292.72  0.000
Residual Error   143   493013673     3447648
Total            149  6548226683
This is the final model with all of the independent variables being conditionally significant, including the quadratic transformation of Experience. This would indicate that a non-linear relationship exists between experience and salary.

13.107
Correlations: hseval, Comper, Homper, Indper, sizehse, incom72
            hseval   Comper   Homper   Indper  sizehse
Comper      -0.335
             0.001
Homper       0.145   -0.499
             0.171    0.000
Indper      -0.086   -0.140   -0.564
             0.419    0.188    0.000
sizehse      0.542   -0.278    0.274   -0.245
             0.000    0.008    0.009    0.020
incom72      0.426   -0.198   -0.083    0.244    0.393
             0.000    0.062    0.438    0.020    0.000
The correlation matrix indicates that the size of the house, income and percent homeowners have a positive relationship with house value. There is a negative relationship between the percent industrial and percent commercial and house value.
Regression Analysis: hseval versus Comper, Homper, ...
The regression equation is
hseval = - 19.0 - 26.4 Comper - 12.1 Homper - 15.5 Indper + 7.22 sizehse + 0.00408 incom72
Predictor       Coef    SE Coef      T      P    VIF
Constant      -19.02      13.20  -1.44  0.153
Comper       -26.393      9.890  -2.67  0.009    2.2
Homper       -12.123      7.508  -1.61  0.110    3.0
Indper       -15.531      8.630  -1.80  0.075    2.6
sizehse        7.219      2.138   3.38  0.001    1.5
incom72     0.004081   0.001555   2.62  0.010    1.4
S = 3.949   R-Sq = 40.1%   R-Sq(adj) = 36.5%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        5    876.80   175.36  11.25  0.000
Residual Error   84   1309.83    15.59
Total            89   2186.63
All variables are conditionally significant with the exception of Indper and Homper. Since Homper has the smaller t-statistic, it is removed:

Regression Analysis: hseval versus Comper, Indper, sizehse, incom72
The regression equation is
hseval = - 30.9 - 15.2 Comper - 5.73 Indper + 7.44 sizehse + 0.00418 incom72
Predictor       Coef    SE Coef      T      P    VIF
Constant      -30.88      11.07  -2.79  0.007
Comper       -15.211      7.126  -2.13  0.036    1.1
Indper        -5.735      6.194  -0.93  0.357    1.3
sizehse        7.439      2.154   3.45  0.001    1.5
incom72     0.004175   0.001569   2.66  0.009    1.4
S = 3.986   R-Sq = 38.2%   R-Sq(adj) = 35.3%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        4    836.15   209.04  13.16  0.000
Residual Error   85   1350.48    15.89
Total            89   2186.63
Indper is not significant and is removed:

Regression Analysis: hseval versus Comper, sizehse, incom72
The regression equation is
hseval = - 34.2 - 13.9 Comper + 8.27 sizehse + 0.00364 incom72
Predictor       Coef    SE Coef      T      P    VIF
Constant      -34.24      10.44  -3.28  0.002
Comper       -13.881      6.974  -1.99  0.050    1.1
sizehse        8.270      1.957   4.23  0.000    1.2
incom72     0.003636   0.001456   2.50  0.014    1.2
S = 3.983   R-Sq = 37.6%   R-Sq(adj) = 35.4%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        3    822.53   274.18  17.29  0.000
Residual Error   86   1364.10    15.86
Total            89   2186.63
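The R-Sq and R-Sq(adj) figures that Minitab reports can be recovered from the ANOVA table alone, which is a useful check when comparing the models in this elimination sequence. A small sketch, using the sums of squares from the three-variable hseval model above (helper names are illustrative):

```python
def r_squared(ssr: float, sst: float) -> float:
    """Coefficient of determination from regression and total sums of squares."""
    return ssr / sst

def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """R^2 adjusted for k independent variables and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# hseval versus Comper, sizehse, incom72: SSR = 822.53, SST = 2186.63, n = 90, k = 3.
r2 = r_squared(822.53, 2186.63)
r2_adj = adjusted_r_squared(r2, n=90, k=3)
print(round(100 * r2, 1), round(100 * r2_adj, 1))
```

Because the adjustment penalizes extra predictors, R-Sq(adj) barely moves across the three hseval models even though R-Sq falls as variables are dropped.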
This becomes the final regression model. The selection of a community with the objective of having larger house values would include communities where the percent of commercial property is low, the median rooms per residence is high and the per capita income is high.

13.108
a. Correlation matrix:

Correlations: deaths, vehwt, impcars, lghttrks, carage
            deaths    vehwt  impcars  lghttrks
vehwt        0.244
             0.091
impcars     -0.284   -0.943
             0.048    0.000
lghttrks     0.726    0.157   -0.175
             0.000    0.282    0.228
carage      -0.422    0.123    0.011    -0.329
             0.003    0.400    0.943     0.021
Crash deaths are positively related to vehicle weight and percentage of light trucks and negatively related to percent imported cars and car age. Light trucks will have the strongest linear association of any independent variable followed by car age. Multicollinearity is likely to exist due to the strong correlation between impcars and vehicle weight.

b.
Regression Analysis: deaths versus vehwt, impcars, lghttrks, carage
The regression equation is
deaths = 2.60 + 0.000064 vehwt - 0.00121 impcars + 0.00833 lghttrks - 0.0395 carage
Predictor        Coef     SE Coef      T      P    VIF
Constant        2.597       1.247   2.08  0.043
vehwt       0.0000643   0.0001908   0.34  0.738   10.9
impcars     -0.001213    0.005249  -0.23  0.818   10.6
lghttrks     0.008332    0.001397   5.96  0.000    1.2
carage       -0.03946     0.01916  -2.06  0.045    1.4
S = 0.05334   R-Sq = 59.5%   R-Sq(adj) = 55.8%
Analysis of Variance
Source           DF         SS         MS      F      P
Regression        4   0.183634   0.045909  16.14  0.000
Residual Error   44   0.125174   0.002845
Total            48   0.308809
Light trucks is a significant positive variable. Since impcars has the smallest t-statistic, it is removed from the model:
Regression Analysis: deaths versus vehwt, lghttrks, carage
The regression equation is
deaths = 2.55 + 0.000106 vehwt + 0.00831 lghttrks - 0.0411 carage
Predictor         Coef      SE Coef      T      P    VIF
Constant         2.555        1.220   2.09  0.042
vehwt       0.00010622   0.00005901   1.80  0.079    1.1
lghttrks      0.008312     0.001380   6.02  0.000    1.2
carage        -0.04114      0.01754  -2.34  0.024    1.2
S = 0.05277   R-Sq = 59.4%   R-Sq(adj) = 56.7%
Analysis of Variance
Source           DF         SS         MS      F      P
Regression        3   0.183482   0.061161  21.96  0.000
Residual Error   45   0.125326   0.002785
Total            48   0.308809
Also, remove vehicle weight using the same argument:

Regression Analysis: deaths versus lghttrks, carage
The regression equation is
deaths = 2.51 + 0.00883 lghttrks - 0.0352 carage
Predictor       Coef    SE Coef      T      P    VIF
Constant       2.506      1.249   2.01  0.051
lghttrks    0.008835   0.001382   6.39  0.000    1.1
carage      -0.03522    0.01765  -2.00  0.052    1.1
S = 0.05404   R-Sq = 56.5%   R-Sq(adj) = 54.6%
Analysis of Variance
Source           DF         SS         MS      F      P
Regression        2   0.174458   0.087229  29.87  0.000
Residual Error   46   0.134351   0.002921
Total            48   0.308809
The model has light trucks and car age as the significant variables. Note that car age is marginally significant (p-value of .052) and hence could also be dropped from the model.

c. The regression modeling indicates that the percentage of light trucks is conditionally significant in all of the models and hence is an important predictor in the model. Car age and imported cars are marginally significant predictors when only light trucks is included in the model.

13.109
a. Correlation matrix:

Correlations: deaths, Purbanpop, Ruspeed, Prsurf
             deaths  Purbanpop  Ruspeed
Purbanpop         …
                  …
Ruspeed       0.305     -0.224
              0.033      0.121
Prsurf       -0.556      0.207   -0.232
              0.000      0.153    0.109
Descriptive Statistics: deaths, Purbanpop, Prsurf, Ruspeed
Variable      N    Mean  Median  TrMean   StDev  SE Mean
deaths       49  0.1746  0.1780  0.1675  0.0802   0.0115
Purbanpop    49  0.5890  0.6311  0.5992  0.2591   0.0370
Prsurf       49  0.7980  0.8630  0.8117  0.1928   0.0275
Ruspeed      49  58.186  58.400  58.222   1.683    0.240
Variable    Minimum  Maximum      Q1      Q3
deaths       0.0569   0.5505  0.1240  0.2050
Purbanpop    0.0000   0.9689  0.4085  0.8113
Prsurf       0.2721   1.0000  0.6563  0.9485
Ruspeed      53.500   62.200  57.050  59.150
The proportion of urban population and rural roads that are surfaced are negatively related to crash deaths. Average rural speed is positively related, but the relationship is not as strong as the proportion of urban population and surfaced roads. The simple correlation coefficients among the independent variables are relatively low and hence multicollinearity should not be dominant in this model. Note the relatively narrow range for average rural speed. This would indicate that there is not much variability in this independent variable.

b. Multiple regression

Regression Analysis: deaths versus Purbanpop, Prsurf, Ruspeed
The regression equation is
deaths = 0.141 - 0.149 Pur…
Predictor       Coef    SE Coef      T      P    VIF
Constant      0.1408     0.2998   0.47  0.641
Purbanpop   -0.14946    0.03192  -4.68  0.000    1.1
Prsurf      -0.18058    0.04299  -4.20  0.000    1.1
Ruspeed     0.004569   0.004942   0.92  0.360    1.1
R-Sq = 55.8%   R-Sq(adj) = 52.8%
Analysis of Variance
Source           DF         SS         MS      F      P
Regression        3   0.172207   0.057402  18.91  0.000
Residual Error   45   0.136602   0.003036
Total            48   0.308809
The model has conditionally significant variables for percent urban population and percent surfaced roads. Since average rural speed is not conditionally significant, it is dropped from the model:
Regression Analysis: deaths versus Purbanpop, Prsurf
The regression equation is
deaths = 0.416 - 0.155 Pur…
VIF  1.0  1.0
R-Sq = 54.9%   R-Sq(adj) = 53.0%
Analysis of Variance
Source           DF         SS         MS      F      P
Regression        2   0.169612   0.084806  28.03  0.000
Residual Error   46   0.139197   0.003026
Total            48   0.308809
This becomes the final model since both variables are conditionally significant.

c. Conclude that the proportion of urban population and the percent of rural roads that are surfaced are important independent variables in explaining crash deaths. All else equal, the higher the proportion of urban population, the lower the crash deaths. All else equal, increases in the proportion of rural roads that are surfaced will result in lower crash deaths. The average rural speed is not conditionally significant.

13.110 a. Correlation matrix and descriptive statistics

Correlations: hseval, sizehse, Taxhse, Comper, incom72, totexp
            hseval  sizehse   Taxhse   Comper  incom72
sizehse      0.542
             0.000
Taxhse       0.248    0.289
             0.019    0.006
Comper      -0.335   -0.278   -0.114
             0.001    0.008    0.285
incom72      0.426    0.393    0.261   -0.198
             0.000    0.000    0.013    0.062
totexp       0.261   -0.022    0.228    0.269    0.376
             0.013    0.834    0.030    0.010    0.000
The correlation matrix shows that multicollinearity is not likely to be a problem in this model since all of the correlations among the independent variables are relatively low.
Descriptive Statistics: hseval, sizehse, Taxhse, Comper, incom72, totexp
Variable     N     Mean   Median   TrMean    StDev  SE Mean
hseval      90   21.031   20.301   20.687    4.957    0.522
sizehse     90   5.4778   5.4000   5.4638   0.2407   0.0254
Taxhse      90   130.13   131.67   128.31    48.89     5.15
Comper      90  0.16211  0.15930  0.16206  0.06333  0.00668
incom72     90   3360.9   3283.0   3353.2    317.0     33.4
totexp      90  1488848  1089110  1295444  1265564   133402
Variable    Minimum  Maximum       Q1       Q3
hseval       13.300   35.976   17.665   24.046
sizehse      5.0000   6.2000   5.3000   5.6000
Taxhse        35.04   399.60    98.85   155.19
Comper      0.02805  0.28427  0.11388  0.20826
incom72      2739.0   4193.0   3114.3   3585.3
totexp       361290  7062330   808771  1570275
The range for applying the regression model (variable means +/- 2 standard deviations):
Hseval    21.03 +/- 2(4.957) = 11.11 to 30.94
Sizehse   5.48 +/- 2(.24) = 5.0 to 5.96
Taxhse    130.13 +/- 2(48.89) = 32.35 to 227.91
Comper    .16 +/- 2(.063) = .034 to .286
Incom72   3361 +/- 2(317) = 2727 to 3995
Totexp    1488848 +/- 2(1265564) = not a good approximation

b. Regression models:

Regression Analysis: hseval versus sizehse, Taxhse, ...
The regression equation is
hseval = - 31.1 + 9.10 sizehse - 0.00058 Taxhse - 22.2 Comper + 0.00120 incom72 + 0.000001 totexp
Predictor         Coef      SE Coef      T      P    VIF
Constant        -31.07        10.09  -3.08  0.003
sizehse          9.105        1.927   4.72  0.000    1.3
Taxhse       -0.000584     0.008910  -0.07  0.948    1.2
Comper         -22.197        7.108  -3.12  0.002    1.3
incom72       0.001200     0.001566   0.77  0.445    1.5
totexp      0.00000125   0.00000038   3.28  0.002    1.5
S = 3.785   R-Sq = 45.0%   R-Sq(adj) = 41.7%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        5    982.98   196.60  13.72  0.000
Residual Error   84   1203.65    14.33
Total            89   2186.63
Taxhse is not conditionally significant, nor is income; however, dropping one variable at a time, eliminate Taxhse first, then eliminate income:
Regression Analysis: hseval versus sizehse, Comper, totexp
The regression equation is
hseval = - 29.9 + 9.61 sizehse - 23.5 Comper + 0.000001 totexp
Predictor         Coef      SE Coef      T      P    VIF
Constant       -29.875        9.791  -3.05  0.003
sizehse          9.613        1.724   5.58  0.000    1.1
Comper         -23.482        6.801  -3.45  0.001    1.2
totexp      0.00000138   0.00000033   4.22  0.000    1.1
S = 3.754   R-Sq = 44.6%   R-Sq(adj) = 42.6%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        3    974.55   324.85  23.05  0.000
Residual Error   86   1212.08    14.09
Total            89   2186.63
This is the final regression model. All of the independent variables are conditionally significant. Both the size of house and total government expenditures enhance the market value of homes while the percent of commercial property tends to reduce market values of homes.

c. In the final regression model, the tax variable was not found to be conditionally significant and hence it is difficult to support the developer's claim.

13.111
a. Correlation matrix

Correlations: retsal84, Unemp84, perinc84
            retsal84  Unemp84
Unemp84       -0.370
               0.008
perinc84       0.633   -0.232
               0.000    0.101
There is a positive association between per capita income and retail sales. There is a negative association between unemployment and retail sales. High correlation among the independent variables does not appear to be a problem since the correlation between the independent variables is relatively low.

Descriptive Statistics: retsal84, perinc84, Unemp84
Variable     N   Mean  Median  TrMean  StDev  SE Mean
retsal84    51   5536    5336    5483    812      114
perinc84    51  12277   12314   12166   1851      259
Unemp84     51  7.335   7.000   7.196  2.216    0.310
Variable    Minimum  Maximum     Q1     Q3
retsal84       4250     8348   5059   6037
perinc84       8857    17148  10689  13218
Unemp84           …        …      …      …
Regression Analysis: retsal84 versus Unemp84, perinc84
The regression equation is
retsal84 = 3054 - 86.3 Unemp84 + 0.254 perinc84
Predictor      Coef   SE Coef      T      P    VIF
Constant     3054.3     724.4   4.22  0.000
Unemp84      -86.25     40.20  -2.15  0.037    1.1
perinc84    0.25368   0.04815   5.27  0.000    1.1
S = 612.9   R-Sq = 45.3%   R-Sq(adj) = 43.0%
Analysis of Variance
Source           DF         SS        MS      F      P
Regression        2   14931938   7465969  19.88  0.000
Residual Error   48   18029333    375611
Total            50   32961271
This is the final model since all of the independent variables are conditionally significant at the .05 level. The 95% confidence intervals for the regression slope coefficients:
β̂1 ± t(S_β̂1): -86.25 +/- 2.011(40.2) = -86.25 +/- 80.84
β̂2 ± t(S_β̂2): .254 +/- 2.011(.0481) = .254 +/- .0968

b. All things equal, the conditional effect of a $1,000 decrease in per capita income on retail sales would be to reduce retail sales by $254.

c. Adding state population as a predictor yields the following regression results:

Regression Analysis: retsal84 versus Unemp84, perinc84, Totpop84
The regression equation is
retsal84 = 2828 - 71.3 Unemp84 + 0.272 perinc84 - 0.0247 Totpop84
Predictor      Coef   SE Coef      T      P    VIF
Constant     2828.4     737.9   3.83  0.000
Unemp84      -71.33     41.40  -1.72  0.091    1.1
perinc84    0.27249   0.04977   5.47  0.000    1.1
Totpop84   -0.02473   0.01845  -1.34  0.187    1.1
S = 607.8   R-Sq = 47.3%   R-Sq(adj) = 44.0%
Analysis of Variance
Source           DF         SS        MS      F      P
Regression        3   15595748   5198583  14.07  0.000
Residual Error   47   17365523    369479
Total            50   32961271
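The 95% intervals for the slope coefficients in part a. follow the pattern estimate ± t × standard error. A minimal sketch of that calculation, taking the critical value 2.011 from the solution as given rather than computing it (the helper name is illustrative):

```python
def slope_interval(coef: float, se: float, t_crit: float) -> tuple[float, float, float]:
    """Return (half_width, lower, upper) for coef +/- t_crit * se."""
    half_width = t_crit * se
    return half_width, coef - half_width, coef + half_width

T_CRIT = 2.011  # critical t value used in the solution for 48 error d.f.

unemp_half, unemp_lo, unemp_hi = slope_interval(-86.25, 40.20, T_CRIT)
inc_half, inc_lo, inc_hi = slope_interval(0.25368, 0.04815, T_CRIT)

print(round(unemp_half, 2), round(inc_half, 4))
```

Neither interval contains zero, which matches the conditional significance of both variables at the .05 level.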
The population variable is not conditionally significant and adds little explanatory power; therefore, it will not improve the multiple regression model.

13.112 a. Correlations: -R, -&R, --+6, -32, 6&, B!( &0 2 0!
( '51' '''' '244 '''1 '854 '''' '934 '''' '9'7
B!(
&0
2
0!
'957 '''' '291 '''' '58' '''' '592
''77 '326 '287 '''' '285
'987 '''' '977
'973
''''
''''
''''
''''
''''
&he "orrelation matri- shows that both interest rates have a signifi"ant 'ositive im'a"t on residential investment. &he mone% s*''l%+ GM and government e-'endit*res also have a signifi"ant linear asso"iation with residential investment. 5ote the high "orrelation between the two interest rate variables+ whi"h+ as e-'e"ted+ wo*ld "reate signifi"ant 'roblems if both variables are in"l*ded in the regression model. en"e+ the interest rates will be develo'ed in two se'arate models. Regression Analysis: -R versus -&R, -32, 6&, The regression equation is ( = 7'' - 379 B!( - ''542 2 + ''932 0! - '165 166 cases use" 52 cases contain missing alues !re"ictor #oe$ %& #oe$ T ! / #onstant 7''' 2487 282 '''5 B!( -37871 '6276 -6'3 '''' 12 2 -''5421' '''921' -589 '''' 468 0! ''93223 '''7389 1262 '''' 581 -'16514 ''3747 -441 '''' 287 % = 2342 (-%q = 867) (-%q*a", = 863) nal.sis o$ /ariance %ource 0 %% % (egression 4 5737'' 143425 26142 (esi"ual &rror 161 88331 549 Total 165 662'3'
! ''''
&his will be the final model with 'rime rate as the interest rate variable sin"e all of the inde'endent variables are "onditionall% signifi"ant. 5ote the signifi"ant m*lti"ollinearit% that e-ists between the inde'endent variables. Regression Analysis: -R versus --+6, -32, 6&, The regression equation is ( = 55' - 276 &0 - ''558 2 + ''9'4 0! - '148 166 cases use" 52 cases contain missing alues !re"ictor #oe$ %& #oe$ T ! / #onstant 55'' 2626 2'9 ''38 &0 -2764' '6548 -422 '''' 12 2 -''5578 ''1''7 -554 '''' 5'7 0! ''9'4'2 '''7862 115' '''' 596 -'14752 ''3922 -376 '''' 285 % = 2461 (-%q = 853) (-%q*a", = 849) nal.sis o$ /ariance %ource 0 %% % (egression 4 564511 141128 233'' (esi"ual &rror 161 97519 6'6 Total 165 662'3'
! ''''
The model with the federal funds rate as the interest rate variable is also the final model with all of the independent variables conditionally significant. Again, high correlation among the independent variables will be a problem with this regression model.
b. 95% confidence intervals for the slope coefficients on the interest rate term:
Bank prime rate as the interest rate variable:
β̂1 ± t(S_β̂1): -3.7871 +/- 1.96(.6276) = -3.7871 +/- 1.23
Federal funds rate as the interest rate variable:
β̂1 ± t(S_β̂1): -2.764 +/- 1.96(.6548) = -2.764 +/- 1.2834

13.113 a.
Correlations: Infmrt82, Phys82, Perinc84, Perhosp
            Infmrt82   Phys82  Perinc84
Phys82         0.434
               0.001
Perinc84       0.094    0.614
               0.511    0.000
Perhosp        0.411    0.285     0.267
               0.003    0.042     0.058
The correlation matrix shows a positive association with Phys82 and Perhosp. These variables are the number of physicians per 100,000 population and the total per capita expenditures for hospitals. One would expect a negative association; therefore, examine the scatter diagram of infant mortality vs. Phys82:
[Figure: scatter diagram of Infmrt82 versus Phys82]
The graph shows an obvious outlier which, upon further investigation, is the District of Columbia. Due to the outlier status, this row is dropped from the analysis and the correlation matrix is recalculated:
Correlations: Infmrt82, Phys82, Perinc84, Perhosp
            Infmrt82   Phys82  Perinc84
Phys82        -0.147
               0.309
Perinc84      -0.192    0.574
               0.181    0.000
Perhosp        0.199   -0.065     0.140
               0.166    0.654     0.331
The physicians per 100,000 population now has the correct sign; however, none of the independent variables has a statistically significant linear association with the dependent variable. Per capita expenditures for hospitals has an unexpected positive sign; however, it is not conditionally significant. The multiple regression results are likely to yield low explanatory power with insignificant independent variables:

Regression Analysis: Infmrt82 versus Phys82, Perinc84, Perhosp
The regression equation is
Infmrt82 = 12.7 - 0.00017 Phys82 - 0.000206 Perinc84 + 6.30 Perhosp
Predictor        Coef     SE Coef      T      P    VIF
Constant       12.701       1.676   7.58  0.000
Phys82      -0.000167    0.006647  -0.03  0.980    1.5
Perinc84   -0.0002064   0.0001637  -1.26  0.214    1.6
Perhosp         6.297       3.958   1.59  0.118    1.1
S = 1.602   R-Sq = 8.9%   R-Sq(adj) = 3.0%
Analysis of Variance
Source           DF        SS      MS     F      P
Regression        3    11.546   3.849  1.50  0.227
Residual Error   46   118.029   2.566
Total            49   129.575
As expected, the model explains less than 9% of the variability in infant mortality. None of the independent variables are conditionally significant and high correlation among the independent variables does not appear to be a significant problem. The standard error of the estimate is very large (1.602) relative to the size of the infant mortality rates and hence the model would not be a good predictor. Sequentially dropping the independent variable with the lowest t-statistic confirms the conclusion that none of the independent variables is conditionally significant. The search is on for better independent variables.

b. The two variables to include are per capita spending on education (PerEduc) and per capita spending on public welfare (PerPbwel). Since the conditional significance of the independent variables is a function of other independent variables in the model, we will include the original set of variables:
Regression Analysis: !nfmrt82 versus &hys82, &erinc8(, """ The regression equation is n$mrt82 = 122 - '''122 !h.s82 +'''''15 !erinc84 + 887 !erhosp - 196 !er&"uc - 456 !er!
! '1''
The model shows low explanatory power and only one independent variable that is conditionally significant (Perhosp). Dropping sequentially the independent variable with the lowest t-statistic yields a model with no conditionally significant independent variables. This problem illustrates that in some applications, the variables that have been identified as theoretically important predictors do not meet the statistical test.

13.114 a.
Correlations: Salary, age, yrs_asoc, yrs_full, Sex_1Fem, Market, C8
            Salary      age  yrs_asoc  yrs_full  Sex_1Fem    Market
age          0.749
             0.000
yrs_asoc     0.698    0.712
             0.000    0.000
yrs_full     0.777    0.583     0.312
             0.000    0.000     0.000
Sex_1Fem    -0.429   -0.234    -0.367    -0.292
             0.000    0.004     0.000     0.000
Market       0.026   -0.134    -0.113    -0.017     0.062
             0.750    0.103     0.169     0.833     0.453
C8          -0.029   -0.189    -0.073    -0.043    -0.094    -0.107
             0.721    0.020     0.373     0.598     0.254     0.192
The correlation matrix indicates several independent variables that should provide good explanatory power in the regression model. We would expect that age, years at Associate professor and years at full professor are likely to be conditionally significant:
Regression Analysis: Salary versus age, yrs_asoc, ...
The regression equation is
Salary = 21107 + 105 age + 532 yrs_asoc + 690 yrs_full - 1312 Sex_1Fem + 2854 Market + 1101 C8
Predictor      Coef   SE Coef       T      P    VIF
Constant      21107      1599   13.20  0.000
age          104.59     40.62    2.58  0.011    3.1
yrs_asoc     532.27     63.66    8.36  0.000    2.4
yrs_full     689.93     52.66   13.10  0.000    1.7
Sex_1Fem    -1311.8     532.3   -2.46  0.015    1.3
Market       2853.9     823.3    3.47  0.001    1.0
C8           1101.0     658.1    1.67  0.097    1.1
S = 2569   R-Sq = 85.6%   R-Sq(adj) = 85.0%
Analysis of Variance
Source            DF          SS         MS       F      P
Regression         6  5604244075  934040679  141.49  0.000
Residual Error   143   943982608    6601277
Total            149  6548226683
Dropping the C8 variable yields:

Regression Analysis: Salary versus age, yrs_asoc, ...
The regression equation is
Salary = 21887 + 90.0 age + 539 yrs_asoc + 697 yrs_full - 1397 Sex_1Fem + 2662 Market
Predictor      Coef   SE Coef       T      P    VIF
Constant      21887      1539   14.22  0.000
age           90.02     39.92    2.26  0.026    3.0
yrs_asoc     539.48     63.91    8.44  0.000    2.4
yrs_full     697.35     52.80   13.21  0.000    1.7
Sex_1Fem    -1397.2     533.2   -2.62  0.010    1.2
Market       2662.3     820.3    3.25  0.001    1.0
S = 2585   R-Sq = 85.3%   R-Sq(adj) = 84.8%
Analysis of Variance
Source            DF          SS          MS       F      P
Regression         5  5585766862  1117153372  167.14  0.000
Residual Error   144   962459821     6683749
Total            149  6548226683
This is the final model. All of the independent variables are conditionally significant and the model explains a sizeable portion of the variability in salary.

b. To test the hypothesis that the rate of change in female salaries as a function of age is less than the rate of change in male salaries as a function of age, the dummy variable Sex_1Fem is used to see if the slope coefficient for age (X1) is different for males and females. The following model is used:

$Y = \beta_0 + (\beta_1 + \beta_6 X_4) X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5$
$\;\;\, = \beta_0 + \beta_1 X_1 + \beta_6 X_4 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5$
Create the variable X4X1 and then test for conditional significance in the regression model. If it proves to be a significant predictor of salaries then there is strong evidence to conclude that the rate of change in female salaries as a function of age is different than for males:

Regression Analysis: Salary versus age, femage, ...

The regression equation is
Salary = 22082 + 85.1 age + 11.7 femage + 543 yrs_asoc + 701 yrs_full - 1878 Sex_1Fem + 2673 Market

Predictor      Coef  SE Coef      T      P   VIF
Constant      22082     1877  11.77  0.000
age           85.07    48.36   1.76  0.081   4.4
femage        11.66    63.89   0.18  0.855  32.2
yrs_asoc     542.85    66.73   8.13  0.000   2.6
yrs_full     701.35    57.35  12.23  0.000   2.0
Sex_1Fem      -1878     2687  -0.70  0.486  31.5
Market       2672.8    825.1   3.24  0.001   1.0

S = 2594   R-Sq = 85.3%   R-Sq(adj) = 84.7%

Analysis of Variance
Source           DF          SS         MS       F      P
Regression        6  5585990999  930998500  138.36  0.000
Residual Error  143   962235684    6728921
Total           149  6548226683
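To make the interaction term concrete, the sketch below reads the estimated age slopes for each group off the fitted coefficients (age = 85.07, femage = 11.66, SE of femage = 63.89). It is an illustration only: the constant and function names are assumptions, and Sex_1Fem is taken to be coded 1 for females and 0 for males, as the variable name suggests.

```python
# Coefficients and standard error read from the femage regression output.
B_AGE = 85.07       # slope on age
B_FEMAGE = 11.66    # slope on femage = Sex_1Fem * age
SE_FEMAGE = 63.89   # standard error of the femage coefficient


def age_slope(sex_1fem):
    """Estimated change in salary per year of age for a given dummy value.

    Assumes Sex_1Fem = 1 for females and 0 for males.
    """
    return B_AGE + B_FEMAGE * sex_1fem


slope_male = age_slope(0)
slope_female = age_slope(1)
t_femage = B_FEMAGE / SE_FEMAGE   # t statistic for H0: beta_6 = 0

print(f"male slope   = {slope_male:.2f}")
print(f"female slope = {slope_female:.2f}")
print(f"t(femage)    = {t_femage:.2f}")
```

The difference between the two slopes is exactly the femage coefficient, so the test of that single coefficient is the test of equal slopes.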
The regression shows that the newly created variable femage is not conditionally significant. Thus, we cannot conclude that the rate of change in female salaries as a function of age differs from that of male salaries.

13.115

Regression Analysis: hseval versus sizehse, taxrate, incom72, Homper

The regression equation is
hseval = - 32.7 + 6.74 sizehse - 223 taxrate + 0.00464 incom72 + 11.2 Homper

Predictor       Coef   SE Coef      T      P  VIF
Constant     -32.694     8.972  -3.64  0.000
sizehse        6.740     1.880   3.58  0.001  1.4
taxrate      -222.96     45.39  -4.91  0.000  1.2
incom72     0.004642  0.001349   3.44  0.001  1.2
Homper        11.215     4.592   2.44  0.017  1.3

S = 3.610   R-Sq = 49.3%   R-Sq(adj) = 47.0%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       4  1079.08  269.77  20.70  0.000
Residual Error  85  1107.55   13.03
Total           89  2186.63
All of the independent variables are conditionally significant. Now add the percent of commercial property to the model to see if it is significant:
Regression Analysis: hseval versus sizehse, taxrate, ...

The regression equation is
hseval = - 31.6 + 6.76 sizehse - 218 taxrate + 0.00453 incom72 + 10.3 Homper - 2.18 Comper

Predictor       Coef   SE Coef      T      P  VIF
Constant     -31.615     9.839  -3.21  0.002
sizehse        6.757     1.892   3.57  0.001  1.4
taxrate      -217.63     49.58  -4.39  0.000  1.4
incom72     0.004534  0.001412   3.21  0.002  1.4
Homper        10.287     5.721   1.80  0.076  2.0
Comper        -2.182     7.940  -0.27  0.784  1.7

S = 3.630   R-Sq = 49.4%   R-Sq(adj) = 46.4%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  1080.07  216.01  16.40  0.000
Residual Error  84  1106.56   13.17
Total           89  2186.63
With a t-statistic of -.27 we have not found strong enough evidence to reject H0 that the slope coefficient on percent commercial property is zero. The conditional F test:

\[ F_{Comper} = \frac{SSR_F - SSR_R}{s^2_{Y|X}} = \frac{1080.07 - 1079.08}{(3.63)^2} = .075 \]

Thus, at any common level of alpha, do not reject H0 that the percent commercial property has no effect on house values.

Add percent industrial property to the base model:

Regression Analysis: hseval versus sizehse, taxrate, ...

The regression equation is
hseval = - 28.6 + 6.10 sizehse - 232 taxrate + 0.00521 incom72 + 8.68 Homper - 7.50 Indper

Predictor       Coef   SE Coef      T      P  VIF
Constant     -28.643     9.602  -2.98  0.004
sizehse        6.096     1.956   3.12  0.003  1.5
taxrate      -232.34     46.00  -5.05  0.000  1.2
incom72     0.005208  0.001431   3.64  0.000  1.4
Homper         8.681     5.070   1.71  0.091  1.6
Indper        -7.505     6.427  -1.17  0.246  1.7

S = 3.602   R-Sq = 50.2%   R-Sq(adj) = 47.2%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  1096.77  219.35  16.91  0.000
Residual Error  84  1089.86   12.97
Total           89  2186.63
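The conditional F statistics for the house value models can be reproduced from the printed regression sums of squares and the standard error S of each enlarged model. This is a minimal sketch under those assumptions; `conditional_f` and its parameter names are illustrative and do not come from the text.

```python
def conditional_f(ssr_full, ssr_reduced, s_full, n_added=1):
    """Conditional F for variables added to a base model.

    ssr_full    -- regression sum of squares with the added variable(s)
    ssr_reduced -- regression sum of squares of the base model
    s_full      -- standard error of the estimate (S) of the full model
    n_added     -- number of variables added (1 in this exercise)
    """
    return (ssr_full - ssr_reduced) / n_added / s_full ** 2


# Base model regression SS = 1079.08 (four predictors).
f_comper = conditional_f(1080.07, 1079.08, 3.630)   # add Comper
f_indper = conditional_f(1096.77, 1079.08, 3.602)   # add Indper

print(f"F(Comper) = {f_comper:.3f}")
print(f"F(Indper) = {f_indper:.2f}")
```

Both values are small relative to critical F values at common levels of alpha, which is the basis for not rejecting H0 in either case.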
Likewise, the coefficient on percent industrial property is not significantly different from zero. The conditional F test:

\[ F_{Indper} = \frac{RSS_5 - RSS_4}{s^2_{Y|X}} = \frac{1096.77 - 1079.08}{(3.602)^2} = 1.36 \]

Again this is lower than the critical value of F based on common levels of alpha; therefore, do not reject H0 that the percent industrial property has no effect on house values.

Tax rate models:

Regression Analysis: taxrate versus taxbase, expercap, Homper

The regression equation is
taxrate = - 0.0174 - 0.000000 ta
P 0.000
Since taxbase is not significant, it is dropped from the model:

Regression Analysis: taxrate versus expercap, Homper

The regression equation is
taxrate = - 0.0192 + 0.000158 expercap + 0.0448 Homper

Predictor         Coef     SE Coef      T      P  VIF
Constant     -0.019188    0.007511  -2.55  0.012
expercap    0.00015767  0.00003106   5.08  0.000  1.1
Homper        0.044777    0.008860   5.05  0.000  1.1

S = 0.007676   R-Sq = 31.4%   R-Sq(adj) = 29.8%

Analysis of Variance
Source          DF         SS         MS      F      P
Regression       2  0.0023414  0.0011707  19.87  0.000
Residual Error  87  0.0051257  0.0000589
Total           89  0.0074671
Both of the independent variables are significant. This becomes the base model, to which we now add percent commercial property and percent industrial property sequentially:

Regression Analysis: taxrate versus expercap, Homper, Comper

The regression equation is
taxrate = - 0.0413 + 0.000157 expercap + 0.0643 Homper + 0.0596 Comper

Predictor         Coef     SE Coef      T      P  VIF
Constant     -0.041343    0.008455  -4.89  0.000
expercap    0.00015660  0.00002819   5.55  0.000  1.1
Homper        0.064320    0.009172   7.01  0.000  1.4
Comper         0.05960     0.01346   4.43  0.000  1.3

S = 0.006966   R-Sq = 44.1%   R-Sq(adj) = 42.2%

Analysis of Variance
Source          DF         SS         MS      F      P
Regression       3  0.0032936  0.0010979  22.62  0.000
Residual Error  86  0.0041735  0.0000485
Total           89  0.0074671
Percent commercial property is conditionally significant and an important independent variable as shown by the conditional F-test:

\[ F_{Comper} = \frac{RSS_3 - RSS_2}{s^2_{Y|X}} = \frac{.003294 - .00234}{(.006966)^2} = 19.62 \]

With 1 degree of freedom in the numerator and (90-3-1) = 86 degrees of freedom in the denominator, the critical value of F at the .05 level is 3.95. Hence we would conclude that the percentage of commercial property has a statistically significant positive impact on tax rate. We now add industrial property to test the effect on tax rate:

Regression Analysis: taxrate versus expercap, Homper, Indper

The regression equation is
taxrate = - 0.0150 + 0.000156 expercap + 0.0398 Homper - 0.0105 Indper

Predictor         Coef     SE Coef      T      P  VIF
Constant     -0.015038    0.009047  -1.66  0.100
expercap    0.00015586  0.00003120   5.00  0.000  1.1
Homper         0.03982     0.01071   3.72  0.000  1.6
Indper        -0.01052     0.01273  -0.83  0.411  1.5

S = 0.007690   R-Sq = 31.9%   R-Sq(adj) = 29.5%

Analysis of Variance
Source          DF          SS          MS      F      P
Regression       3  0.00238178  0.00079393  13.43  0.000
Residual Error  86  0.00508533  0.00005913
Total           89  0.00746711
The percent industrial property is insignificant with a t-statistic of only -.83. The F-test confirms that the variable does not have a significant impact on tax rate:

\[ F_{Indper} = \frac{RSS_3 - RSS_2}{s^2_{Y|X}} = \frac{.002382 - .00234}{(.00769)^2} = .683 \]

With 1 degree of freedom in the numerator and (90-3-1) = 86 degrees of freedom in the denominator, the critical value of F at the .05 level is 3.95. Hence we would conclude that the percentage of industrial property has no statistically significant impact on tax rate.

In conclusion, we found no evidence to back three of the activists' claims and strong evidence to reject one of them. We concluded that commercial development will have no effect on house value, while it will actually increase tax rate. In addition, we concluded that industrial development will have no effect on house value or tax rate. It was important to include all of the other independent variables in the regression models because the conditional significance of any one variable is influenced by which other independent variables are in the regression model. Therefore, it is important to test if direct relationships can be 'explained' by the relationships with other predictor variables.
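When a single variable is added, the conditional F statistic equals the square of that variable's t statistic, so the two tax rate tests above can be checked directly from the coefficient tables. The sketch below assumes the printed coefficients and standard errors for Comper and Indper and uses the critical value 3.95 quoted in the text; the names `t_squared` and `F_CRIT_05` are illustrative.

```python
F_CRIT_05 = 3.95  # critical F(1, 86) at the .05 level, as quoted in the text


def t_squared(coef, se_coef):
    """Square of the t statistic, i.e. the conditional F for one added variable."""
    return (coef / se_coef) ** 2


# Coefficient and SE Coef from the two three-variable tax rate regressions.
f_comper = t_squared(0.05960, 0.01346)    # percent commercial property
f_indper = t_squared(-0.01052, 0.01273)   # percent industrial property

for name, f_value in (("Comper", f_comper), ("Indper", f_indper)):
    decision = "reject H0" if f_value > F_CRIT_05 else "do not reject H0"
    print(f"{name}: F = {f_value:.3f} -> {decision}")
```

Small differences from the printed 19.62 and .683 come from rounding in the reported coefficients and sums of squares.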
13.116
a. Correlation matrix:

Correlations: EconGPA, sex, Acteng, ACTmath, ACTss, ACTcomp, HSPct

          EconGPA      sex   Acteng  ACTmath    ACTss  ACTcomp
sex         0.187
            0.049
Acteng      0.387    0.270
            0.001    0.021
ACTmath     0.338   -0.170    0.368
            0.003    0.151    0.001
ACTss       0.442   -0.105    0.448    0.439
            0.000    0.375    0.000    0.000
ACTcomp     0.474   -0.084    0.650    0.765    0.812
            0.000    0.478    0.000    0.000    0.000
HSPct       0.362    0.216    0.173    0.290    0.224    0.230
            0.000    0.026    0.150    0.014    0.060    0.053

Cell contents: correlation, with its p-value beneath.
There exists a positive relationship between EconGPA and all of the independent variables, which is expected. Note that there is a high correlation between the composite ACT score and the individual components, which is, again, as expected. Thus, high correlation among the independent variables is likely to be a serious concern in this regression model.

Regression Analysis: EconGPA versus sex, Acteng, ...

The regression equation is
EconGPA = - 0.050 + 0.261 sex + 0.0099 Acteng + 0.0064 ACTmath + 0.0270 ACTss + 0.0419 ACTcomp + 0.00898 HSPct

71 cases used 41 cases contain missing values

Predictor       Coef   SE Coef      T      P   VIF
Constant     -0.0504    0.6554  -0.08  0.939
sex           0.2611    0.1607   1.62  0.109   1.5
Acteng       0.00991   0.02986   0.33  0.741   2.5
ACTmath      0.00643   0.03041   0.21  0.833   4.3
ACTss        0.02696   0.02794   0.96  0.338   4.7
ACTcomp      0.04188   0.07200   0.58  0.563  12.8
HSPct       0.008978  0.005716   1.57  0.121   1.4

S = 0.4971   R-Sq = 34.1%   R-Sq(adj) = 27.9%

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       6   8.1778  1.3630  5.52  0.000
Residual Error  64  15.8166  0.2471
Total           70  23.9945
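The VIF column quantifies the multicollinearity noted above. Since VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors, each printed VIF can be converted back into that R_j^2. The following sketch does so for the VIFs in this output; the cutoff of 10 is a common rule of thumb assumed here for illustration, not a threshold stated in the text, and the names are hypothetical.

```python
# VIF values read from the EconGPA regression output.
VIFS = {
    "sex": 1.5,
    "Acteng": 2.5,
    "ACTmath": 4.3,
    "ACTss": 4.7,
    "ACTcomp": 12.8,
    "HSPct": 1.4,
}

VIF_CUTOFF = 10.0  # assumed rule-of-thumb threshold, not from the text


def r_squared_from_vif(vif):
    """R^2 of a predictor regressed on the other predictors, implied by its VIF."""
    return 1.0 - 1.0 / vif


implied_r2 = {name: r_squared_from_vif(v) for name, v in VIFS.items()}
flagged = [name for name, v in VIFS.items() if v > VIF_CUTOFF]

for name, r2 in implied_r2.items():
    print(f"{name:8s} VIF = {VIFS[name]:5.1f}  implied R^2 = {r2:.3f}")
print("flagged:", flagged)
```

Under this reading, a VIF of 12.8 for ACTcomp means that roughly 92% of its variation is accounted for by the other predictors, which is consistent with it being a composite of the individual ACT scores.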
As expected, high correlation among the independent variables is affecting the results. A strategy of dropping the variable with the lowest t-statistic with each successive model causes the dropping of the following variables (in order): 1) ACTmath, 2) Acteng, 3) ACTss, 4) HSPct. The two variables that remain in the final model are gender and ACTcomp: