Analysis of Variance Approach to Regression Analysis
Rationale
The Analysis of Variance (ANOVA) is a statistical approach based on partitioning total observed variation into several components, with the aim of explaining the sources of that variation.
Total observed variation is often measured by the sum of the squared deviations of each observation from the mean.
In regression analysis, we presume that the observations on the response variable can be expressed as a (linear) function of the independent variable in the form
$$y_i = \beta_0 + \beta_1 x_i + e_i$$
Based on sample data, and assuming that such a relation is true, the line that best fits the observed values is obtained as
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
After fitting this regression line, we gather evidence on whether such a model really holds in describing the relationship.
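As a quick illustration (not part of the original notes; the data values below are hypothetical), the least squares estimates and fitted values can be computed directly:

```python
# Minimal sketch of fitting the least squares line y_hat = b0 + b1*x.
# The x and y arrays are made-up example data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least squares estimates of slope and intercept
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x      # fitted values
residuals = y - y_hat    # e_i = y_i - y_hat_i
print(b0, b1)
```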
For a given observation, the deviation of $y_i$ from the mean $\bar{y}$ can be decomposed around the fitted line $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$:
$$(y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$$
[Figure: the observed value $y_i$, the fitted value $\hat{y}_i$, and the mean $\bar{y}$ plotted against $x$, showing this decomposition of the total deviation.]
Total Deviation
$$(y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$$
$(y_i - \bar{y})$ : total deviation
$(y_i - \hat{y}_i)$ : deviation around the fitted regression line
$(\hat{y}_i - \bar{y})$ : deviation of the fitted regression value around the mean
Sum of Squares
Squaring both sides and summing over all observations gives
$$\sum (y_i - \bar{y})^2 = \sum [(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]^2 = \sum (y_i - \hat{y}_i)^2 + \sum (\hat{y}_i - \bar{y})^2 + 2 \sum (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})$$
The cross-product term equals 0, so
$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$
$\sum (y_i - \bar{y})^2$ : Total Sum of Squares (TSS)
$\sum (\hat{y}_i - \bar{y})^2$ : Sum of Squares due to the Regression of y on x (SSR)
$\sum (y_i - \hat{y}_i)^2$ : Sum of Squares Error (SSE)
TSS = SSR + SSE
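As a minimal numerical check (again with hypothetical data, not from the original notes), the decomposition can be verified directly:

```python
# Verify numerically that TSS = SSR + SSE for a fitted simple linear regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

TSS = np.sum((y - y.mean()) ** 2)      # total sum of squares
SSR = np.sum((y_hat - y.mean()) ** 2)  # sum of squares due to regression
SSE = np.sum((y - y_hat) ** 2)         # sum of squares error

print(np.isclose(TSS, SSR + SSE))      # True: the cross-product term vanishes
```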
Degrees of Freedom
The total degrees of freedom (associated with TSS) is n-1. One degree of freedom is lost because:
the deviations $(y_i - \bar{y})$ are subject to one constraint (they sum to zero); or, equivalently, the sample mean is used to estimate the population mean.
Degrees of freedom due to Error: n-2. Two degrees of freedom are lost because we are estimating two parameters, $\beta_0$ and $\beta_1$, in obtaining the fitted values $\hat{y}_i$.
Degrees of freedom due to Regression: 1.
Although there are n deviations $(\hat{y}_i - \bar{y})$, all fitted values $\hat{y}_i$ are calculated from the same regression line. Two degrees of freedom are associated with the regression line (one for each estimated parameter), but one is lost because the deviations $(\hat{y}_i - \bar{y})$ are subject to one constraint: they sum to zero.
Thus,
df Total = df Regression + df Error
Mean Squares
In a general ANOVA, the mean squares are obtained by dividing each SS by its corresponding df. That is,
MSTot = SSTot/(n-1)
MSR = SSR/1; MSE = SSE/(n-2)
Note: MSTot ≠ MSR + MSE; mean squares are not additive.
ANOVA Table
Results of the Analysis of Variance are summarized in an ANOVA table:
Source of Variation    df      SS      MS
Regression              1      SSR     MSR
Error                  n-2     SSE     MSE
Total                  n-1     TSS
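A sketch of how the same quantities fill out the ANOVA table numerically, using the same hypothetical data as above:

```python
# Assemble the ANOVA table (source, df, SS, MS) for simple linear regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(y)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

SSR = np.sum((y_hat - y.mean()) ** 2)
SSE = np.sum((y - y_hat) ** 2)
TSS = SSR + SSE

MSR = SSR / 1          # df Regression = 1
MSE = SSE / (n - 2)    # df Error = n - 2

print(f"{'Source':<12}{'df':>6}{'SS':>12}{'MS':>12}")
print(f"{'Regression':<12}{1:>6}{SSR:>12.4f}{MSR:>12.4f}")
print(f"{'Error':<12}{n - 2:>6}{SSE:>12.4f}{MSE:>12.4f}")
print(f"{'Total':<12}{n - 1:>6}{TSS:>12.4f}")
```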
Expected Mean Squares (EMS)
The EMS are useful quantities that:
Tell us what parametric function is being estimated by each MS [Method of Moments Estimator]
In some instances, this suggests how a test statistic can be defined to test specific hypotheses.
For the error mean square,
$$E[MSE] = \sigma^2$$
The mean of the sampling distribution of MSE is $\sigma^2$ whether or not X and Y are linearly related (i.e., whether or not $\beta_1 = 0$). This follows since
$$SSE/\sigma^2 \sim \chi^2(n-2), \quad \text{so} \quad E[SSE/\sigma^2] = n-2 \quad \text{and} \quad E[MSE] = E\left[\frac{SSE}{n-2}\right] = \sigma^2.$$
For the regression mean square,
$$E[MSR] = \sigma^2 + \beta_1^2 \sum (x_i - \bar{x})^2$$
The mean of the sampling distribution of MSR is also $\sigma^2$ when $\beta_1 = 0$; in this case, MSR and MSE will tend to be of the same magnitude. When $\beta_1 \neq 0$, MSR will tend to exceed MSE. Thus, a comparison of MSR and MSE may be used to determine whether or not $\beta_1 = 0$.
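A small Monte Carlo sketch (the simulation setup below is assumed, not taken from the notes) illustrating this behavior of MSR and MSE:

```python
# When beta1 = 0, MSR and MSE both average about sigma^2; when beta1 != 0,
# MSR exceeds MSE on average, by roughly beta1^2 * sum((x - xbar)^2).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
sigma = 2.0

def mean_squares(beta1, reps=5000):
    msr, mse = [], []
    for _ in range(reps):
        y = 1.0 + beta1 * x + rng.normal(0, sigma, size=x.size)
        b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0 = y.mean() - b1 * x.mean()
        y_hat = b0 + b1 * x
        msr.append(np.sum((y_hat - y.mean()) ** 2) / 1)
        mse.append(np.sum((y - y_hat) ** 2) / (x.size - 2))
    return np.mean(msr), np.mean(mse)

print(mean_squares(0.0))   # both averages close to sigma^2 = 4
print(mean_squares(0.5))   # average MSR well above average MSE
```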
Test of Hypothesis: H0: $\beta_1 = 0$ vs H1: $\beta_1 \neq 0$
From the EMS, it appears logical that this hypothesis can be tested by comparing MSR and MSE.
From statistical theory, and assuming normality of the error terms (Cochran’s Theorem):
Under H0 and the normality assumption, MSR and MSE are independent, with
$$SSR/\sigma^2 \sim \chi^2(1) \quad \text{and} \quad SSE/\sigma^2 \sim \chi^2(n-2).$$
Thus, a logical test statistic (the generalized likelihood ratio test, GLRT) is
$$F_c = \frac{MSR}{MSE} \sim F(1,\ n-2) \quad \text{under } H_0.$$
Reject H0 for large values of $F_c$, i.e., if $F_c > F_{\alpha}(1,\ n-2)$.
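A sketch of carrying out this F test on the hypothetical data used above, with scipy.stats providing the reference F(1, n-2) distribution:

```python
# F test of H0: beta1 = 0 in simple linear regression.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(y)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

MSR = np.sum((y_hat - y.mean()) ** 2) / 1
MSE = np.sum((y - y_hat) ** 2) / (n - 2)

Fc = MSR / MSE
p_value = stats.f.sf(Fc, 1, n - 2)       # P(F(1, n-2) > Fc)
critical = stats.f.ppf(0.95, 1, n - 2)   # F_{0.05}(1, n-2)

print(Fc, p_value, critical)             # reject H0 if Fc > critical
```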
General Linear Test Approach
This is another approach to testing hypotheses concerning the regression parameters (or functions of such parameters).
First, fit the Full Model. In the SLR case, $y_i = \beta_0 + \beta_1 x_i + e_i$.
Compute
$$SSE(F) = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - [\hat{\beta}_0 + \hat{\beta}_1 x_i])^2 = SSE$$
Next, fit the Reduced Model under H0 (i.e., assuming H0 is true). In the SLR case, H0: $\beta_1 = 0$, so we fit the (reduced) model $y_i = \beta_0 + e_i$.
In this case, the value of $\beta_0$ that minimizes $\sum e_i^2 = \sum (y_i - \beta_0)^2$ is $\hat{\beta}_0 = \bar{y}$.
Compute the SSE for the reduced model:
$$SSE(R) = \sum (y_i - \hat{\beta}_0)^2 = \sum (y_i - \bar{y})^2 = SST$$
Note that, in general, $SSE(F) \leq SSE(R)$: the more parameters employed in fitting the model, the better the fit.
Test statistic:
$$F^* = \frac{[SSE(R) - SSE(F)]\,/\,(df_R - df_F)}{SSE(F)\,/\,df_F} \sim F(df_R - df_F,\ df_F)$$
Note that in the case of the SLR, testing H0: $\beta_1 = 0$,
$$F^* = \frac{(SST - SSE)\,/\,[(n-1) - (n-2)]}{SSE\,/\,(n-2)} = \frac{SSR/1}{MSE} = \frac{MSR}{MSE}$$
Thus, the two tests are equivalent.
This approach can be extended to more complex tests.
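A sketch of the general linear test for H0: $\beta_1 = 0$ on the same hypothetical data, confirming that $F^*$ coincides with MSR/MSE:

```python
# General linear test: compare SSE of the full model with SSE of the
# intercept-only reduced model; F* equals MSR/MSE from the ANOVA approach.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(y)

# Full model: y = beta0 + beta1*x + e
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
SSE_F = np.sum((y - (b0 + b1 * x)) ** 2)
df_F = n - 2

# Reduced model under H0: y = beta0 + e, so beta0_hat = y_bar
SSE_R = np.sum((y - y.mean()) ** 2)   # equals SST
df_R = n - 1

F_star = ((SSE_R - SSE_F) / (df_R - df_F)) / (SSE_F / df_F)
p_value = stats.f.sf(F_star, df_R - df_F, df_F)

print(F_star, p_value)   # identical to MSR/MSE and its p-value
```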