FINAL EXAM - SOLUTIONS BUSINESS ANALYTICS II – WINTER 2016
► grading ► farming ► sugar ► hospitals ► true or false ► autoparts ► yogurt
█ GRADING SCHEDULE: Do not distribute the exam or solutions to anybody (honor code)
► farming [40 points]
i. identify the constant as the value for December: 10 points
ii. set dummy for September to one and interpret coefficient: 10 points
iii. correct variables included in the regression: 4 points
     perform curvature and heteroskedasticity tests: 3 points
     set hypothesis and perform test or provide p-value: 3 points
iv. correct variables included in the regression: 4 points
    perform curvature and heteroskedasticity tests: 3 points
    choice of model: 3 points
► sugar consumption [10 points]
i. identify the coefficients that give the change in lndiabetes: 3 points
   correct interpretation of the change (percentage or otherwise): 2 points
ii. correctly identify the predicted lndiabetes (in log units): 3 points
    correctly transform from log units into a rate, taking just exp(): 2 points
► hospitals checklist procedure [10 points]
i. correct variables included in the regression: 5 points
ii. correctly identify the need for panel data regression: 3 points
    provide the correct syntax for the regression: 2 points
► true/false [15 points]
i. correct answer: False: 5 points
ii. correct answer: False: 5 points
iii. correct answer: True: 5 points
► autoparts [10 points]
i. correct variables included in the regression: 5 points
ii. perform curvature and heteroskedasticity tests: 2 points
iii. correct interpretation of coefficient for analysis: 3 points
► yogurt sales [15 points]
i. correctly identify the two channels: 3 points
   correct conclusion on the bias: 2 points
ii. correct argument towards using or not the fixed effects regression: 5 points
iii. correct choice of (a), (b): 2 points
     correct choice of not (c): 3 points
█ Farming [40 points] Part I.
i.
[10 points] The constant is the average weekly sales in December: 83,929.75 bottles.
ii.
[10 points] Setting the dummy for September equal to one and the rest of the dummies to zero, the average weekly total volume in September is 83,929.75 + 6,623.875 = 90,553.625 bottles.
Part II.
iii.
[10 points] The answer is no. We run the regression of miloprice on ivicaentry:
-------------------------------------------------------------------------------
   miloprice |      Coef.  Std. Err.        t    P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
  ivicaentry |   .2715052   .0818844     3.32    0.001     .1097517    .4332586
       _cons |   5.746374   .0530913   108.24    0.000     5.641498    5.851249
-------------------------------------------------------------------------------
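The p-value reported in a regression table like this one is two-sided, whereas the question of whether Milo's price dropped calls for the one-sided alternative that the coefficient is negative. A minimal sketch of the conversion, assuming a symmetric t distribution; the helper name `one_sided_p` is hypothetical and the inputs are the rounded values from the table:

```python
def one_sided_p(two_sided_p, t_stat, alternative):
    """Convert a two-sided p-value into a one-sided one.

    alternative is "less" (Ha: coefficient < 0) or "greater" (Ha: coefficient > 0).
    Assumes the test statistic has a symmetric distribution under the null.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_sided_p / 2
    # The estimate supports the alternative only if t has the matching sign.
    supports = t_stat < 0 if alternative == "less" else t_stat > 0
    return half if supports else 1 - half


# Rounded values from the table: t = 3.32, two-sided p = 0.001.
p_less = one_sided_p(0.001, 3.32, "less")
p_greater = one_sided_p(0.001, 3.32, "greater")
print(f"Ha: coefficient < 0 -> p = {p_less:.4f}")
print(f"Ha: coefficient > 0 -> p = {p_greater:.4f}")
```

Because the estimate is positive, the p-value for the "less than" alternative is close to one, so the data give no support to the claim that the price fell.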
The model looks linear (rvfplot) and homoskedastic (the Breusch-Pagan test gives p = 0.31 > 0.05). If Milo’s prices have dropped, then the true coefficient multiplying ivicaentry should be negative. The estimated coefficient is positive (0.2715); hence, the p-value for the hypothesis test that the coefficient is negative will be large (bigger than 0.5 > 0.05) and we cannot prove Milo’s average price was lower after Ivica’s entry. (One could get the exact p-value of 0.999 by using klincom ivicaentry and looking at the “Ha: <” alternative hypothesis.)
Note that just looking at the regression output’s reported p-value of 0.001 gives the wrong answer, because that p-value is for the test of whether Milo’s average price was different before and after Ivica’s entry, and in this case the difference is positive, not negative as we were trying to show.
Part III.
iv.
[10 points] Either a semi-log specification or a log-log regression will be appropriate in this case. The results for the two regressions are given below.
Regression results:
Semi-log (log-linear) specification:
---------------------------------------------------------------------------------
lnpancicvolume |      Coef.  Std. Err.        t    P>|t|     [95% Conf. Interval]
---------------+-----------------------------------------------------------------
   pancicprice |  -.4454663    .024758   -17.99    0.000    -.4943938   -.3965387
     miloprice |   .0827112   .0234977     3.52    0.001     .0362741    .1291482
     gbosprice |   .0487792   .0304608     1.60    0.111    -.0114184    .1089768
     aldiprice |   .1384748   .0331023     4.18    0.000     .0730569    .2038927
    ivicaentry |  -.7341845    .425805    -1.72    0.087    -1.575674    .1073055
    milo_entry |   .0220722   .0409245     0.54    0.590    -.0588042    .1029487
  pancic_entry |  -.0391247   .0366463    -1.07    0.287    -.1115463     .033297
    gbos_entry |   .0770227   .0767372     1.00    0.317     -.074628    .2286734
    aldi_entry |   .0684911   .0517738     1.32    0.188    -.0338261    .1708083
         _cons |   11.56035   .2020676    57.21    0.000     11.16102    11.95969
---------------------------------------------------------------------------------
Log-log specification:
---------------------------------------------------------------------------------
lnpancicvolume |      Coef.  Std. Err.        t    P>|t|     [95% Conf. Interval]
---------------+-----------------------------------------------------------------
 lnpancicprice |  -2.452484   .1383394   -17.73    0.000    -2.725875   -2.179093
   lnmiloprice |   .4863486   .1291666     3.77    0.000     .2310852     .741612
   lngbosprice |   .2593963   .1696241     1.53    0.128    -.0758205    .5946132
   lnaldiprice |   .5089998   .1234564     4.12    0.000     .2650211    .7529785
    ivicaentry |  -1.136476   .7709127    -1.47    0.143     -2.65998    .3870271
  lnmilo_entry |   .2348183   .2352649     1.00    0.320    -.2301199    .6997565
lnpancic_entry |  -.4077246   .2133808    -1.91    0.058    -.8294149    .0139656
  lngbos_entry |   .4142044   .4639228     0.89    0.373    -.5026152    1.331024
  lnaldi_entry |   .4268966   .2136244     2.00    0.048      .004725    .8490683
         _cons |   12.58364   .3377567    37.26    0.000     11.91615    13.25112
---------------------------------------------------------------------------------
Regression variables:
► The dependent variable is pancicvolume.
► It is not clear whether ivicaprice should be included or excluded, as it was not the focus. The answer above omits it, but it was equally acceptable to include it (note that since it was 0 before Ivica’s entry, including it is exactly the same as including the slope dummy ivicaentry·ivicaprice).
We need to include:
► miloprice, gbosprice, aldiprice and pancicprice, because their relationship with pancicvolume was the focus of our investigation.
► ivicaentry, as we know that our prediction is for the time after Ivica’s entry.
► slope dummies: milo_entry, gbos_entry, aldi_entry and pancic_entry (each equal to the corresponding price variable times ivicaentry), to allow these prices to affect Drago’s sales differently before and after Ivica’s entry.
After running a linear regression, we notice that the model is heteroskedastic (the Breusch-Pagan test gives p = 0.0046). We therefore run a semi-log or a log-log model. Either model looks linear (run rvfplot) and homoskedastic (Breusch-Pagan p = 0.70 for semi-log, p = 0.72 for log-log).
Remark: a similar conclusion would have been obtained if you had included ivicaprice as an additional variable: the hettest p-value is 0.0042, and either semi-log or log-log corrects for it.
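The variable list above can be sketched as a small data-preparation step. This is only an illustration under assumptions: the two observations are made up, the helper `add_entry_terms` is hypothetical, and the log slope dummies are assumed to be the log price times ivicaentry (the solution does not spell out how lnmilo_entry and the others were built).

```python
import math

PRICE_VARS = ("miloprice", "gbosprice", "aldiprice", "pancicprice")


def add_entry_terms(row):
    """Return a copy of one weekly observation with slope dummies and logs added."""
    out = dict(row)
    for var in PRICE_VARS:
        brand = var.replace("price", "")  # e.g. "miloprice" -> "milo"
        # Slope dummy for the linear and semi-log models: price x ivicaentry.
        out[f"{brand}_entry"] = row[var] * row["ivicaentry"]
        # Log price and its slope dummy for the log-log model (assumed construction).
        out[f"ln{var}"] = math.log(row[var])
        out[f"ln{brand}_entry"] = out[f"ln{var}"] * row["ivicaentry"]
    out["lnpancicvolume"] = math.log(row["pancicvolume"])
    return out


# Hypothetical observations, one before and one after Ivica's entry.
before = {"pancicvolume": 90000, "pancicprice": 5.0, "miloprice": 5.7,
          "gbosprice": 6.0, "aldiprice": 4.5, "ivicaentry": 0}
after = {"pancicvolume": 80000, "pancicprice": 5.2, "miloprice": 6.0,
         "gbosprice": 6.1, "aldiprice": 4.4, "ivicaentry": 1}

before_terms = add_entry_terms(before)
after_terms = add_entry_terms(after)
print(before_terms["milo_entry"], after_terms["milo_entry"])
```

Before entry every slope dummy is zero, so the price coefficients alone describe that period; after entry each price effect is the sum of the price coefficient and its slope-dummy coefficient.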
█ Sugar Consumption [10 points]
i.
[5 points] The estimated change in lndiabetes when lnsugar increases by 1 in countries where more than 30% of the population is classified as overweight (and thus obese = 1 and obese_lnsugar = lnsugar) is 0.1181605 + (–0.0385298) = 0.0796307. Thus when sugar consumption per capita increases by 1%, the diabetes rate increases by 0.0796307%; alternatively, the elasticity of the diabetes rate with respect to sugar consumption is 0.0796307.
ii.
[5 points] The predicted lndiabetes is (note that obese = 1 and obese_lnsugar = lnsugar):
0.6201178 + 0.1181605·ln(285) + 1.232242·1 + (–0.0385298)·1·ln(285) = 2.3024715
The predicted diabetes rate is therefore exp(2.3024715) = 9.9988641.
Remark: We do not use the correction factor here because we are estimating the prediction for one country, not the average across countries.
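The arithmetic in parts i and ii can be checked with a short script. This is a sketch using the coefficients quoted in the solution; the variable and function names are chosen here for illustration.

```python
import math

# Coefficients as quoted in the solution.
b_cons = 0.6201178
b_lnsugar = 0.1181605
b_obese = 1.232242
b_obese_lnsugar = -0.0385298


def predicted_lndiabetes(sugar, obese):
    """Predicted log diabetes rate for a given sugar level and obese dummy (0 or 1)."""
    lnsugar = math.log(sugar)
    return (b_cons + b_lnsugar * lnsugar
            + b_obese * obese + b_obese_lnsugar * obese * lnsugar)


# Part i: elasticity when obese = 1 is the sum of the two slope coefficients.
elasticity_obese = b_lnsugar + b_obese_lnsugar

# Part ii: prediction in log units, then back to a rate with exp() only
# (no correction factor, since this is a prediction for one country).
ln_pred = predicted_lndiabetes(285, 1)
rate_pred = math.exp(ln_pred)

print(f"elasticity (obese = 1): {elasticity_obese:.7f}")
print(f"predicted lndiabetes:   {ln_pred:.7f}")
print(f"predicted rate:         {rate_pred:.7f}")
```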
█ Hospitals Checklist Procedure [10 points]
i.
[5 points] The Stata command is: regress infectionrate checklist
We do not want to keep hospital-specific factors constant, as we are comparing two different hospitals. Hence, a standard regression rather than fixed effects is appropriate. We do not know whether nsr is the same in both hospitals, nor do we know the values of nsr for each hospital; hence, we must omit it. We also might suspect that the data are clustered, as we have panel data with multiple observations for each hospital. There is no need to include the cluster option here, though, as the next question runs the panel data analysis.
ii.
[5 points] The Stata command is: xi: regress infectionrate checklist nsr i.hospid
We want to keep all hospital-specific factors constant, as after introducing the checklist Northwestern Memorial will remain the same hospital in all other respects. Hence, we use a fixed effects model.
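To see why the two parts can give different answers, here is a minimal sketch on made-up data (not the exam data). It compares a pooled slope with a within-hospital slope obtained by demeaning, which gives the same slope as including hospital dummies. For brevity it uses checklist as the only regressor and leaves out nsr; all names and numbers are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a simple regression of y on x (with a constant)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def within_slope(groups, x, y):
    """Fixed effects slope: demean x and y inside each group, then run OLS."""
    x_dm, y_dm = [], []
    for g in sorted(set(groups)):
        idx = [i for i, name in enumerate(groups) if name == g]
        mx = sum(x[i] for i in idx) / len(idx)
        my = sum(y[i] for i in idx) / len(idx)
        x_dm += [x[i] - mx for i in idx]
        y_dm += [y[i] - my for i in idx]
    return ols_slope(x_dm, y_dm)


# Hypothetical panel: hospital A has a high baseline infection rate (10),
# hospital B a low one (4); the checklist lowers the rate by 2 in both.
hospid = ["A", "A", "A", "B", "B", "B"]
checklist = [0, 1, 1, 0, 0, 1]
infectionrate = [10, 8, 8, 4, 4, 2]

pooled = ols_slope(checklist, infectionrate)
fixed_effects = within_slope(hospid, checklist, infectionrate)
print(f"pooled slope:        {pooled:.3f}")
print(f"fixed effects slope: {fixed_effects:.3f}")
```

In this toy panel the pooled regression mixes the checklist effect with baseline differences between hospitals, while the within-hospital comparison recovers the effect for a hospital that stays the same in all other respects.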
█ True or False [15 points]
Part I. [5 points] You estimate a regression y = b0 + b1·x1 + b2·x2 and find that both x1 and x2 are statistically significant. Once the level of x2 is fixed, the change in y associated with a given change in x1 is equal to b1.
True
■ False
Part II. [5 points] Your regression is heteroskedastic and you do not correct the problem. The reported coefficients are not biased, only the standard errors are biased.
True
■ False
Part III. [5 points] Suppose you run two regressions on the same data and get the following output: Regression 1:
y = 0.00233 + 0.0178·x
Regression 2:
y = –0.0965 + 0.000376·x + 0.0215·z
Since b1* = 0.0178 > b1 = 0.000376, we have overestimation; thus it must be the case that b2·a1 > 0, where a1 is the correlation coefficient between x and z and b2 = 0.0215 > 0. It must be the case that a1 > 0.
■ True
False
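The sign argument in Part III can be checked numerically. The sketch below assumes the usual omitted variable identity b1* = b1 + b2·a1, with a1 read as the slope from regressing z on x (which has the same sign as the correlation between x and z). The coefficients are the rounded values from the two regressions, so the implied a1 is approximate.

```python
# Rounded coefficients from the two regressions in Part III.
b1_short = 0.0178    # coefficient on x when z is omitted (b1*)
b1_long = 0.000376   # coefficient on x when z is included (b1)
b2_long = 0.0215     # coefficient on z (b2)

# Omitted variable bias: b1* - b1 = b2 * a1.
bias = b1_short - b1_long
implied_a1 = bias / b2_long

print(f"bias (b1* - b1): {bias:.6f}")
print(f"implied a1:      {implied_a1:.4f}")
print("a1 > 0:", implied_a1 > 0)
```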
█ Autoparts [10 points]
Choice: location A. We run the following regression:
-------------------------------------------------------------------------------
       sales |      Coef.  Std. Err.        t    P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
      inc3mi |   .0287468    .010682     2.69    0.010     .0072574    .0502363
      pop3mi |   .0100729   .0012368     8.14    0.000     .0075848     .012561
      endcap |   2026.702    356.865     5.68    0.000     1308.782    2744.622
       _cons |  -883.1614   831.5062    -1.06    0.294    -2555.936    789.6133
-------------------------------------------------------------------------------
The model is linear (rvfplot) and homoskedastic (the Breusch-Pagan test result is p-value = 0.619). Moving from non-endcap to endcap, holding income and population fixed (since both locations have the same 3-mile radius), is estimated to increase the average sales by $2,026,702, which is more than the estimated $1.8 million gain we needed to justify location A. However, we need a formal test:
H0: Sales|endcap = 1 – Sales|endcap = 0 < 1,800
Since Sales|endcap = 1 – Sales|endcap = 0 = _b[endcap], the hypotheses are
H0: _b[endcap] < 1,800
Ha: _b[endcap] >= 1,800
We calculate the t-statistic = (2,026.702 – 1,800)/356.865 = 0.63525983 and then the right-tail p-value (according to the alternative) as ttail(47, 0.63525983) = 0.264. We cannot reject the null; thus we would suggest location B.
Remark: Notice that, given the information in the question, since the 3-mile radius is the same for the two locations, we know that inc3mi is the same for the two locations; therefore it must be included in the regression. Furthermore, we should not include a slope dummy endcap·pop3mi, because we are told that the impact of population on sales is the same whether or not the store is an endcap; thus we are looking at a difference in levels.
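The t-statistic and right-tail p-value above can be reproduced without Stata. The sketch below is one way to do it, assuming 47 degrees of freedom as in the ttail() call; the tail probability is obtained by integrating the Student t density with Simpson's rule rather than calling a statistics library, and the function names are chosen here for illustration.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density on [0, |t|] with Simpson's rule."""
    if t < 0:
        return 1 - t_upper_tail(-t, df, steps)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Values from the regression output (sales in thousands of dollars).
b_endcap = 2026.702
se_endcap = 356.865
threshold = 1800

t_stat = (b_endcap - threshold) / se_endcap
p_right = t_upper_tail(t_stat, 47)

print(f"t-statistic:        {t_stat:.8f}")
print(f"right-tail p-value: {p_right:.3f}")
```

A p-value well above 0.05 means the data do not show that the endcap gain exceeds 1,800, even though the point estimate does.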
█ Yogurt Sales [15 points]
i.
[5 points] Since (a) in a more affluent neighborhood you would probably see higher prices for almost all goods, including yogurt, and (b) in a more affluent neighborhood you would probably see higher sales of yogurt, both channels are positive and we can safely infer an overestimation effect of ovb.
ii.
[5 points] As long as the average income remains constant in each neighborhood, then yes, the fixed effects regression will help.
iii.
[5 points] The difference can be explained by (a) and (b).
(a) Reason: A change in the average income in some of the neighborhoods means that the main assumption of the fixed effects regression is violated.
(b) Reason: The relation between sales and price changed because of new competitors; thus the true coefficient of price has changed, with or without fixed effects.
Remark: Part (c) is the very definition of fixed effects, which means that the method does remove ovb for all variables having only between-group variation, i.e. fixed for each store across all periods 2001 – 2010 but different across stores.