Marketing Research (Chapter-16, 17, 18)
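1. Why does the value of eta² lie between 0 and 1?
Ans: The value of eta² lies between 0 and 1 because it is based on two measures of variance: within groups (SSe) and between groups (SSx). It is the share of the total variation in Y that is explained by the between-group variation, eta² = SSx / (SSx + SSe). Since neither sum of squares can be negative, SSx can never exceed the total. Thus, it compares the Y variance estimates based on between-group and within-group variation.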
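The bound on eta² can be checked numerically. The following is a minimal sketch, assuming three hypothetical groups of Y values (the data and the function name `eta_squared` are illustrative and not from the chapters).

```python
def eta_squared(groups):
    """Return (ss_between, ss_within, eta2) for a list of groups of Y values."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group variation (SSx): group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation (SSe): observations around their own group mean
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    # Assumes the data are not all identical, otherwise the total is zero
    eta2 = ss_between / (ss_between + ss_within)
    return ss_between, ss_within, eta2

# Hypothetical sales under three promotion levels (illustrative data only)
sales = [[10, 9, 10, 8], [8, 8, 7, 9], [5, 7, 6, 4]]
ss_x, ss_e, eta2 = eta_squared(sales)
print(ss_x, ss_e, eta2)
```

Because SSx is one part of the total SSx + SSe, the ratio cannot fall below 0 or rise above 1 whatever data are supplied.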
2. How does ANOVA differ from Regression in terms of data requirement?
Ans: ANOVA must have a metric (interval or ratio scale) dependent variable and one or more categorical (nonmetric) independent variables. Regression, on the other hand, must have a metric dependent variable and one or more metric independent variables.
3. "In one-way ANOVA factor levels and treatment levels are equal." Explain it.
Ans: Factors are categorical independent variables. A treatment is a particular combination of factor levels or categories. One-way ANOVA involves only one categorical variable, or a single factor. With a single factor there is nothing to combine, so each treatment is simply one level of that factor. Therefore, in one-way ANOVA the treatment levels are the same as the factor levels.
4. What are the differences between ANOVA and Regression?
Ans:
ANOVA:
   1. A statistical technique for examining the differences among means for two or more populations.
   2. ANOVA must have a metric (interval or ratio scale) dependent variable and one or more categorical (nonmetric) independent variables.
   3. It is used as a test of means for two or more populations.
Regression:
   1. A statistical procedure for analyzing associative relationships between a metric dependent variable and one or more independent variables.
   2. Regression must have a metric dependent variable and one or more metric independent variables.
   3. It is concerned with the nature and degree of association between variables and does not imply any causation.
5. What are the steps in conducting one-way ANOVA?
Ans: There are five steps in conducting one-way ANOVA:
i) Identify the dependent and independent variables
ii) Decompose the total variation
iii) Measure the effects
iv) Test the significance
v) Interpret the results
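The steps of one-way ANOVA can be sketched in a few lines of code. This is only an illustration under stated assumptions: the three groups are hypothetical data, the function name `one_way_anova` is made up for this example, and the final comparison of F with a critical value (which needs an F table or a statistics library) is left out.

```python
def one_way_anova(groups):
    """Decompose the variation in Y and return the usual one-way ANOVA quantities."""
    n_total = sum(len(g) for g in groups)
    c = len(groups)                                   # number of factor levels
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Step ii: decompose the total variation, SSy = SSx + SSerror
    ss_x = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ss_y = ss_x + ss_error
    # Step iii: measure the effect of X on Y
    eta2 = ss_x / ss_y
    # Step iv: F is the ratio of the two mean squares
    f_stat = (ss_x / (c - 1)) / (ss_error / (n_total - c))
    return {"SSx": ss_x, "SSerror": ss_error, "SSy": ss_y, "eta2": eta2, "F": f_stat}

# Step i: Y is metric, X is one categorical factor with three levels (hypothetical data)
groups = [[3, 4, 5], [6, 7, 8], [9, 10, 11]]
result = one_way_anova(groups)
print(result)  # F = 27.0, eta2 = 0.9
```

Step v, interpretation, would compare F with the critical value for (c - 1) and (N - c) degrees of freedom.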
6. What are the steps of conducting exploratory research or identifying the problems?
Ans: There are four steps:
i) Discussion with decision makers
ii) Interviews with experts
iii) Secondary data analysis
iv) Qualitative research
MBA 12th Batch Dept. of Marketing
7. What is the null hypothesis in case of ANOVA?
Ans: The null hypothesis is that all means are equal.
8. What is the null hypothesis in case of MANOVA?
Ans: The null hypothesis is that the vectors of means on multiple dependent variables are equal across groups.
9. What is the null hypothesis in case of regression analysis?
Ans: The null hypothesis is that there is no linear relationship between the independent variable, X, and the dependent variable, Y.
10. What is the null hypothesis in case of multiple regression analysis?
Ans: The null hypothesis is that the coefficient of multiple determination in the population, R²pop, is zero. H0: R²pop = 0.
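The hypothesis H0: R²pop = 0 is usually tested with an overall F statistic computed from the sample R². A minimal sketch follows; the function name `overall_f` and the numbers used (R² = 0.5, n = 23, k = 2) are assumptions made only for illustration.

```python
def overall_f(r2, n, k):
    """Overall F statistic for H0: R2pop = 0.

    r2: sample coefficient of multiple determination
    n:  number of observations
    k:  number of independent variables
    The statistic has k and (n - k - 1) degrees of freedom.
    """
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical values, for illustration only
f_value = overall_f(r2=0.5, n=23, k=2)
print(f_value)
```

A large F relative to the critical value leads to rejecting H0, that is, concluding that the independent variables together explain some variation in Y.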
11. What is the null hypothesis in case of discriminant analysis?
Ans: The null hypothesis is that the means of all discriminant functions in all groups are equal.
12. What is the null hypothesis in case of Logit model/Wald statistics?
Ans: The null hypothesis is that the logistic regression coefficient is zero, ai = 0.
13. What is correlation?
Ans: Correlation is a statistic summarizing the strength of association between two metric variables. It is an index used to determine whether a linear, or straight-line, relationship exists between X and Y.
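The product moment correlation described above can be computed directly from the deviations of X and Y from their means. The sketch below uses hypothetical data; the function name `pearson_r` is chosen for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Product moment correlation between two metric variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: Y tends to rise with X, but not perfectly
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
print(r, r ** 2)
```

Values of r near +1 or -1 indicate a strong straight-line relationship, while r = 0 indicates no linear relationship.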
14. Differentiate among Regression, Correlation, & Causality.
Ans: Regression is a statistical procedure for analyzing associative relationships between a metric dependent variable and one or more independent variables. It is concerned with the nature and degree of association between variables and does not imply any causation.
Correlation is a statistic summarizing the strength of association between two metric variables. It is an index used to determine whether a linear, or straight-line, relationship exists between X and Y.
Causality means the occurrence of X increases the probability of the occurrence of Y. Before assuming causality three conditions must be satisfied: i) concomitant variation, ii) time order of occurrence, and iii) elimination of other possible causal factors.
15. What are the differences between standardized coefficient and unstandardized coefficient?
Ans: The standardized coefficient, β, is the slope obtained when the data are standardized to have a mean of 0 and a variance of 1. The unstandardized coefficient, b, on the other hand, is calculated from the raw data. It is the slope of the regression line and indicates the change in Y when X is changed by 1 unit.
16. What is the meaning of rxy.pqr?
Ans: rxy.pqr is a third-order partial correlation coefficient. It means the partial correlation coefficient between X and Y, after controlling for the effect of p, q, & r.
17. What is indicated by R² = 0 in product moment correlation?
Ans: R² = 0 indicates that there is no linear relationship between X and Y.
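The idea of a partial correlation such as rxy.pqr is easiest to see in the first-order case, where only one variable Z is controlled. The sketch below uses the standard first-order formula; the three zero-order correlations are hypothetical values, and higher-order coefficients (such as the third-order one) would be obtained by applying the same formula repeatedly.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation between X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order (simple) correlations
r_xy, r_xz, r_yz = 0.5, 0.6, 0.7
r_xy_z = partial_corr(r_xy, r_xz, r_yz)
print(r_xy_z)  # smaller than r_xy here, since Z accounts for much of the association
```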
18. If r² = 0.87, how will the scatter diagram be plotted?
Ans: The dots will be very close to the straight line.
19. What is a correlation matrix? What is the use of a correlation matrix?
Ans: A correlation matrix indicates the simple correlation between each pair of variables. It is used to identify high correlations between predictor variables, that is, multicollinearity problems. This is useful when formulating a model.
20. When do the suppressor effect and the spurious effect appear in correlation?
Ans: When a partial correlation is larger than its respective zero-order correlation, a suppressor effect is involved. On the other hand, when the partial correlation is smaller than its respective zero-order correlation, a spurious effect is involved.
21. What is standardized coefficient/standardized regression coefficient?
Ans: This is the slope obtained by the regression of Y on X when the data are standardized; the intercept assumes a value of 0.
22. What is the use of multivariate correlation?
Ans: It is useful to examine the simple correlation between each pair of variables.
23. What is the method of least squares? What is the use of it?
Ans: The least squares procedure is a technique for fitting a straight line to a scatter diagram by minimizing the sum of the squares of the vertical distances of all the points from the line. It is used to find the best-fitting line in a scatter diagram.
24. What is the partial F-test? What is the partial F-test used for?
Ans: The significance of a partial regression coefficient, βi, of Xi may be tested using an incremental F statistic. The incremental F statistic is based on the increment in the explained sum of squares resulting from the addition of the independent variable Xi to the regression equation after all the other independent variables have been included.
The partial F-test is used to identify which population partial regression coefficients have a value different from 0 once the overall null hypothesis is rejected. It helps to determine which βi are nonzero.
25. What is the meaning of β?
Ans: β means the standardized regression coefficient; it is the slope obtained by the regression of Y on X when the data are standardized. It indicates the expected change in Y when X is changed by 1 unit, where a unit is one standard deviation because the data are standardized.
26. Why is R²pop = 0 equivalent to β1 = β2 = β3 = ... = βk = 0?
Ans: R² determines the strength of association that is stipulated by the regression equation. Thus, R²pop = 0 means there is no association between the independent variables and Y. On the other hand, βi, the partial regression coefficient, denotes the change in the predicted value, Ŷ, per unit change in Xi when the other independent variables are held constant. Thus, β1 = β2 = β3 = ... = βk = 0 means there is no change in the predicted value, Ŷ, per unit change in any Xi when the other independent variables are held constant. Therefore, R²pop = 0 is equivalent to β1 = β2 = β3 = ... = βk = 0.
27. What are the conditions of standardization?
Ans: The conditions of standardization are: i) a mean of zero, and ii) a standard deviation of 1.
28. Show the similarities and dissimilarities among ANOVA, Regression, and Discriminant analysis.
Ans:
                                        ANOVA         Regression    Discriminant/Logit Analysis
Similarities
  Number of dependent variables         One           One           One
  Number of independent variables       Multiple      Multiple      Multiple
Dissimilarities
  Nature of the dependent variables     Metric        Metric        Categorical/Binary
  Nature of the independent variables   Categorical   Metric        Metric
29. What are the properties of Z?
Ans: The properties of Z are: i) it has a mean of 0, i.e. µ = 0, and ii) a standard deviation of 1, i.e. σ = 1.
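These two properties of Z can be verified by standardizing any set of raw values. The sketch below assumes the sample standard deviation (divisor n - 1) is used; the data and the function name `standardize` are illustrative.

```python
from math import sqrt

def standardize(values):
    """Convert raw values to z-scores: subtract the mean, divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample standard deviation with divisor (n - 1)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical raw scores
raw = [12, 15, 9, 20, 14]
z = standardize(raw)

z_mean = sum(z) / len(z)
z_sd = sqrt(sum((v - z_mean) ** 2 for v in z) / (len(z) - 1))
print(z_mean, z_sd)  # approximately 0 and 1, up to rounding error
```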
30. What is discriminant function?
Ans: The linear combination of independent variables developed by discriminant analysis that will best discriminate between the categories of the dependent variable.
31. What is unstandardized coefficient?
Ans: The unstandardized coefficient, b, is calculated based on the raw data. It is the slope of the regression line and it indicates the change in Y when X is changed by 1 unit.
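The relationship between the unstandardized slope b and the standardized coefficient β can be illustrated for the bivariate case, where β = b × (sx / sy). The following is a sketch on hypothetical data; the variable names are chosen for this example only.

```python
from math import sqrt

def slope(x, y):
    """Unstandardized slope b of the regression of Y on X (raw data)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

def sample_sd(values):
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Hypothetical data: X in units, Y in a much larger scale
x = [1, 2, 3, 4, 5]
y = [20, 10, 40, 30, 50]

b = slope(x, y)                           # change in Y per 1-unit change in X
beta = b * sample_sd(x) / sample_sd(y)    # slope after standardizing X and Y
print(b, beta)
```

The value of b depends on the units of measurement, whereas β does not; in the bivariate case β equals the simple correlation between X and Y.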
32. Differentiate between two-group discriminant analysis and multiple discriminant analysis.
Ans: Two-group discriminant analysis is a discriminant analysis technique where the criterion variable has two categories. Multiple discriminant analysis is a discriminant analysis technique where the criterion variable involves three or more categories.
The main distinction is that, in the two-group case, it is possible to derive only one discriminant function. In multiple discriminant analysis, more than one function may be computed.
33. What is the model estimation in discriminant analysis? / How is the model estimated in case of discriminant analysis?
Ans: The coefficients, or weights (b), are estimated so that the groups differ as much as possible on the values of the discriminant function. This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum.
34. How is the model/parameter estimated in case of logit model?
Ans: In the binary logit model, the model is estimated by the maximum likelihood method. It chooses the parameter values that maximize the likelihood, or probability, of observing the actual data.
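Maximum likelihood estimation of a binary logit model can be sketched with a simple gradient ascent on the log-likelihood. This is only an illustration under stated assumptions: one predictor, hypothetical data, and a hand-written optimizer with an arbitrary step size; statistical packages use more refined algorithms.

```python
from math import exp, log

def logistic(z):
    """Logistic function: always returns a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def log_likelihood(a0, a1, xs, ys):
    """Log of the probability of observing the actual 0/1 outcomes."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(a0 + a1 * x)
        total += y * log(p) + (1 - y) * log(1 - p)
    return total

def fit_logit(xs, ys, rate=0.1, steps=2000):
    """Climb the log-likelihood surface by gradient ascent (illustrative settings)."""
    a0 = a1 = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - logistic(a0 + a1 * x) for x, y in zip(xs, ys)]
        a0 += rate * sum(errors) / n
        a1 += rate * sum(e * x for e, x in zip(errors, xs)) / n
    return a0, a1

# Hypothetical data: the outcome 1 becomes more likely as X increases
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

a0, a1 = fit_logit(xs, ys)
print(a0, a1, log_likelihood(a0, a1, xs, ys))
```

The estimated coefficients give a higher log-likelihood than the starting values a0 = a1 = 0, which is the sense in which they make the observed data more probable.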
35. How is the model/parameter estimated in case of regression analysis?
Ans: The regression model is fit by the least squares procedure. The least squares procedure determines the best-fitting line by minimizing the sum of the squares of the vertical distances of all the points from the line.
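The least squares procedure for a single predictor can be sketched as follows. The data are hypothetical, and the function names `least_squares` and `sse` are chosen for this example; the second function measures the sum of squared vertical distances that the procedure minimizes.

```python
def least_squares(x, y):
    """Return intercept a and slope b of the best-fitting line Y = a + bX."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b

def sse(a, b, x, y):
    """Sum of squared vertical distances of the points from the line a + bX."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical scatter of five points
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

a, b = least_squares(x, y)
print(a, b, sse(a, b, x, y))
# Any other line gives a larger sum of squared distances
print(sse(a + 0.1, b, x, y), sse(a, b - 0.1, x, y))
```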
36. What condition is implied in estimating the discriminant model/parameter?
Ans: The difference between the groups on the values of the discriminant function should be at a maximum.
37. What is classification matrix? What is hit ratio? What is the use of hit ratio?
Ans: The classification matrix contains the number of correctly classified and misclassified cases. The correctly classified cases appear on the diagonal, because the predicted and actual groups are the same. Hit ratio is the percentage of cases correctly classified by the discriminant analysis. Hit ratio is used for assessing the validity of the model estimated in discriminant analysis. It is also used to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance.
38. Distinguish between analysis sample and validation sample.
Ans:
Analysis sample:
   i) Analysis sample is the part of the total sample that is used for estimation of the discriminant function.
   ii) It is used for estimating the discriminating function.
Validation sample:
   i) Validation sample is that part of the total sample used to check the results of the estimation sample.
   ii) It is used for developing the classification matrix.
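The classification matrix and hit ratio can be illustrated with a short sketch. The actual and predicted group memberships below are hypothetical validation-sample results, and the function names are chosen for this example.

```python
def classification_matrix(actual, predicted, groups):
    """Count cases by actual group (rows) and predicted group (columns)."""
    matrix = {a: {p: 0 for p in groups} for a in groups}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def hit_ratio(matrix):
    """Proportion of cases on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[g][g] for g in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

# Hypothetical validation sample with two groups
actual    = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
predicted = [1, 1, 1, 2, 2, 2, 2, 2, 1, 2]

matrix = classification_matrix(actual, predicted, groups=[1, 2])
ratio = hit_ratio(matrix)
print(matrix, ratio)  # 8 of the 10 cases lie on the diagonal
```

The resulting ratio would then be compared with the percentage of correct classifications expected by chance.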
39. What condition is implied to estimate multiple discriminant model/parameter?
Ans: The coefficients, or weights (b), are estimated so that the groups differ as much as possible on the values of the discriminant function. This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum.
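The estimation criterion can be made concrete by computing the ratio of between-group to within-group sum of squares of the discriminant scores for different sets of weights. The sketch below does not estimate the weights; it only compares two candidate weight vectors on hypothetical two-variable data for two groups.

```python
def ss_ratio(weights, groups):
    """Ratio of between-group to within-group sum of squares of the scores D = b1*X1 + b2*X2."""
    scores = [[sum(w * v for w, v in zip(weights, case)) for case in g] for g in groups]
    all_scores = [d for g in scores for d in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in scores)
    ss_within = sum((d - sum(g) / len(g)) ** 2 for g in scores for d in g)
    return ss_between / ss_within

# Hypothetical cases (X1, X2) for two groups; the groups differ mainly on X1
group_a = [(1, 2), (2, 1), (2, 3)]
group_b = [(5, 2), (6, 3), (6, 1)]

ratio_x1 = ss_ratio((1, 0), [group_a, group_b])   # all weight on X1
ratio_x2 = ss_ratio((0, 1), [group_a, group_b])   # all weight on X2
print(ratio_x1, ratio_x2)
```

Weights that separate the groups well give a large ratio; discriminant analysis searches for the weights that make this ratio as large as possible.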
40. What are the assumptions of classical/bivariate regression model?
Ans:
i) The error term is normally distributed.
ii) The means of all these normal distributions of Y, given X, lie on a straight line with slope b.
iii) The mean of the error term is 0.
iv) The variance of the error term is constant. This variance does not depend on the values assumed by X.
v) The error terms are uncorrelated.
41. "In case of regression analysis error term is normally distributed." Is it applicable for logit model?
Ans: No, it is not applicable in case of the logit model, because the dependent variable takes only the values 0 and 1. Each error term in the binary logit model can therefore assume only two values, so it cannot be normally distributed.
42. How is model fit tested in case of logit model/regression analysis/discriminant analysis?
Ans: In the binary logit model, model fit is tested by the likelihood function and by Cox & Snell R square and Nagelkerke R square. In regression analysis, model fit is tested by the coefficient of determination, r² or R². In discriminant analysis, model fit is tested by determining the proportion of correct predictions (hit ratio).
43. If it is replaced by P then what will be the problem?
Ans: P will not be constrained to lie between 0 and 1; it is possible to obtain estimated values of P that are less than 0 or greater than 1. These values are conceptually and intuitively unappealing.
44. Differentiate between cluster analysis and factor analysis.
Ans:
Factor Analysis:
   i) Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization.
   ii) It is used to examine and represent the relationships among sets of many interrelated variables in terms of a few underlying factors.
Cluster Analysis:
   i) Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters.
   ii) It is concerned with classification of objects or cases. There is no a priori information about the group or cluster membership for any of the objects.