Project Report
On
TESTS OF HYPOTHESIS
Submitted To: Simi Mam
Submitted By: Sumeet Singh, MBA (Gen)
Roll No. = 89, Sec = B
Contents
Sr. No.   Subject Covered
1   Introduction
2   Decision Errors
3   Decision Rules
4   One-Tailed and Two-Tailed Tests
5   A General Procedure for Conducting Hypothesis Tests
6   Common test statistics
7   Importance
8   Criticism
Introduction: A statistical hypothesis test is a method of making statistical decisions using experimental data. In statistics, a result is called statistically significant if it is unlikely to have occurred by chance. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first."
Hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency probability, these decisions are almost always made using null-hypothesis tests; that is, ones that answer the question: assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed? One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom.
Statistical hypothesis testing is a key technique of frequentist statistical inference, and is widely used, but also much criticized. The main alternative to statistical hypothesis testing is Bayesian inference.
The critical region of a hypothesis test is the set of all outcomes which, if they occur, will lead us to decide that there is a difference; that is, cause the null hypothesis to be rejected in favour of the alternative hypothesis. The critical region is usually denoted by C.
A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. The best way to determine whether a statistical hypothesis is true would be to examine the entire population. Since that is often impractical, researchers typically examine a random sample from the population. If sample data are consistent with the statistical hypothesis, the hypothesis is accepted; if not, it is rejected.
There are two types of statistical hypotheses.
1. Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance.
2. Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis that sample observations are influenced by some non-random cause.
For example, suppose we wanted to determine whether a coin was fair and balanced. A null hypothesis might be that half the flips would result in Heads and half, in Tails. The alternative hypothesis might be that the number of Heads and Tails would be very different. Symbolically, these hypotheses would be expressed as
H0: P = 0.5
Ha: P ≠ 0.5
Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails. Given this result, we would be inclined to reject the null hypothesis and accept the alternative hypothesis.
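The coin example can be checked numerically. The sketch below is one possible way to do it, not a method prescribed by this report: it computes an exact two-sided binomial P-value, treating as "at least as extreme" every outcome that is no more probable under H0 than the observed one. The function name and the tie tolerance are illustrative assumptions.

```python
from math import comb

def binomial_two_sided_p_value(heads: int, flips: int, p0: float = 0.5) -> float:
    """Exact two-sided P-value under H0: P = p0.

    Sums the probability of every outcome that is no more likely,
    under H0, than the outcome actually observed.
    """
    def pmf(k: int) -> float:
        return comb(flips, k) * p0**k * (1 - p0) ** (flips - k)

    observed = pmf(heads)
    # The small relative tolerance guards against floating-point ties (an assumption).
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

# 40 Heads in 50 flips, as in the example above.
p_value = binomial_two_sided_p_value(heads=40, flips=50)
print(f"P-value = {p_value:.2e}")
```

A P-value this far below common significance levels such as 0.05 or 0.01 agrees with the inclination to reject the null hypothesis of a fair coin.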
Decision Errors: Two types of errors can result from a hypothesis test.
1. Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha, and is often denoted by α.
2. Type II error. A Type II error occurs when the researcher accepts a null hypothesis that is false. The probability of committing a Type II error is called Beta, and is often denoted by β. The probability of not committing a Type II error is called the Power of the test.
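To make α, β and power concrete, the following sketch computes them exactly for a coin-flip test. The decision rule (reject H0 when Heads ≤ 17 or Heads ≥ 33 in 50 flips) and the alternative value P = 0.7 are hypothetical choices made only for illustration.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k Heads in n flips when P(Heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def rejection_probability(n: int, p: float, lower: int, upper: int) -> float:
    """Probability that the count of Heads lands in the rejection region
    (Heads <= lower or Heads >= upper) when the true P(Heads) is p."""
    return sum(binom_pmf(k, n, p) for k in range(n + 1) if k <= lower or k >= upper)

n = 50
lower, upper = 17, 33  # hypothetical decision rule

alpha = rejection_probability(n, 0.5, lower, upper)  # Type I error rate: H0 (P = 0.5) is true
power = rejection_probability(n, 0.7, lower, upper)  # chance of rejecting H0 when P = 0.7
beta = 1 - power                                     # Type II error rate when P = 0.7

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Note that β, and therefore power, depends on which alternative value of P is assumed to be true; α depends only on the null hypothesis and the decision rule.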
Decision Rules: The analysis plan includes decision rules for accepting or rejecting the null hypothesis. In practice, statisticians describe these decision rules in two ways - with reference to a P-value or with reference to a region of acceptance.
1. P-value. The strength of evidence in support of a null hypothesis is measured by the P-value. Suppose the test statistic is equal to S. The P-value is the probability of observing a test statistic as extreme as S, assuming the null hypothesis is true. If the P-value is less than the significance level, we reject the null hypothesis.
2. Region of acceptance. The region of acceptance is a range of values. If the test statistic falls within the region of acceptance, the null hypothesis is accepted. The region of acceptance is defined so that the chance of making a Type I error is equal to the significance level.
The set of values outside the region of acceptance is called the region of rejection. If the test statistic falls within the region of rejection, the null hypothesis is rejected. In such cases, we say that the hypothesis has been rejected at the α level of significance.
These approaches are equivalent. Some statistics texts use the P-value approach; others use the region of acceptance approach.
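The equivalence of the two decision rules can be illustrated with a two-tailed z-test. This is a minimal sketch that assumes the test statistic follows a standard normal distribution under the null hypothesis; the function name and the sample z values are hypothetical.

```python
from statistics import NormalDist

def z_test_both_ways(z: float, alpha: float = 0.05):
    """Apply both decision rules to the same two-tailed z statistic."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

    # P-value approach: probability of a statistic at least as extreme as z.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_p_value = p_value < alpha

    # Region of acceptance approach: accept H0 when -critical <= z <= critical.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    reject_by_region = abs(z) > critical

    return p_value, critical, reject_by_p_value, reject_by_region

for z in (0.5, 1.9, 2.1, -3.0):
    p_value, critical, by_p, by_region = z_test_both_ways(z)
    print(f"z = {z:5.2f}  P-value = {p_value:.4f}  critical = {critical:.3f}  "
          f"reject by P-value: {by_p}  reject by region: {by_region}")
```

For every value of z the two columns agree, because the region of acceptance is built so that its boundary corresponds to a P-value equal to the significance level.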
One-Tailed and Two-Tailed Tests:
1. One-Tailed Test: A test of a statistical hypothesis, where the region of rejection is on only one side of the sampling distribution, is called a one-tailed test.
For example, suppose the null hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of numbers located on the right side of the sampling distribution; that is, a set of numbers greater than 10.
2. Two-Tailed Test: A test of a statistical hypothesis, where the region of rejection is on both sides of the sampling distribution, is called a two-tailed test.
For example, suppose the null hypothesis states that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or greater than 10. The region of rejection would consist of a range of numbers located on both sides of the sampling distribution; that is, the region of rejection would consist partly of numbers that were less than 10 and partly of numbers that were greater than 10.
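The difference between the two kinds of test shows up in how the P-value is computed from the same test statistic. The sketch below uses hypothetical numbers (sample mean 10.9, known population standard deviation 3, sample size 36) and assumes a normal sampling distribution; none of these values come from this report.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data, chosen only for illustration.
sample_mean, mu0, sigma, n = 10.9, 10.0, 3.0, 36

z = (sample_mean - mu0) / (sigma / sqrt(n))
std_normal = NormalDist()

# One-tailed test (H0: mean <= 10, Ha: mean > 10): only the right tail counts.
right_tail_p = 1 - std_normal.cdf(z)

# Two-tailed test (H0: mean = 10, Ha: mean != 10): both tails count.
two_tail_p = 2 * (1 - std_normal.cdf(abs(z)))

print(f"z = {z:.2f}")
print(f"one-tailed P-value = {right_tail_p:.4f}")
print(f"two-tailed P-value = {two_tail_p:.4f}")
```

With these numbers the one-tailed P-value falls below 0.05 while the two-tailed P-value does not, so the choice between the two tests must follow from the hypotheses stated before the data are examined.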
A General Procedure for Conducting Hypothesis Tests: All hypothesis tests are conducted the same way. The researcher states a hypothesis to be tested, formulates an analysis plan, analyzes sample data according to the plan, and accepts or rejects the null hypothesis, based on results of the analysis.
1. State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
2. Formulate an analysis plan. The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.
   a. Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
   b. Test method. Typically, the test method involves a test statistic and a sampling distribution. Computed from sample data, the test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t-score, chi-square, etc. Given a test statistic and its sampling distribution, a researcher can assess probabilities associated with the test statistic. If the test statistic probability is less than the significance level, the null hypothesis is rejected.
3. Analyze sample data. Using sample data, perform computations called for in the analysis plan.
   a. Test statistic. When the null hypothesis involves a mean or proportion, use either of the following equations to compute the test statistic.
      Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
      Test statistic = (Statistic - Parameter) / (Standard error of statistic)
      where Parameter is the value appearing in the null hypothesis, and Statistic is the point estimate of Parameter. As part of the analysis, you may need to compute the standard deviation or standard error of the statistic. Previously, we presented common formulas for the standard deviation and standard error.
      When the parameter in the null hypothesis involves categorical data, you may use a chi-square statistic as the test statistic. Instructions for computing a chi-square test statistic are presented in the lesson on the chi-square goodness of fit test.
   b. P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic, assuming the null hypothesis is true.
4. Interpret the results. If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
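The four steps can be sketched end to end for a two-tailed one-sample z-test of a mean. Everything specific here is an assumption made for illustration: the sample values, the hypothesized mean of 10, the known population standard deviation of 0.6, and the significance level of 0.05. The sketch also assumes the population is normally distributed.

```python
from math import sqrt
from statistics import NormalDist, fmean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tailed test of H0: mean = mu0 against Ha: mean != mu0 (sigma assumed known)."""
    n = len(sample)
    statistic = fmean(sample)               # point estimate of the parameter
    standard_error = sigma / sqrt(n)        # standard error of the sample mean
    z = (statistic - mu0) / standard_error  # (Statistic - Parameter) / (Standard error)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"z": z, "p_value": p_value, "reject_null": p_value < alpha}

# Step 1: H0: mean = 10, Ha: mean != 10.
# Step 2: significance level 0.05, z-test with known sigma.
# Step 3: analyze the (hypothetical) sample data.
sample = [10.8, 11.2, 9.9, 10.5, 11.0, 10.7, 10.1, 11.4, 10.9]
result = one_sample_z_test(sample, mu0=10.0, sigma=0.6, alpha=0.05)

# Step 4: interpret the results.
print(result)
```

When σ is not known, the same structure applies with the sample standard deviation and a t distribution in place of the normal distribution.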
Common test statistics
In the table below, the symbols used are defined at the bottom of the table. Many other tests can be found in other articles.

Name: One-sample z-test
Formula: z = (x̄ − μ0) / (σ / √n)
Assumptions or notes: (Normal population or n > 30) and σ known. (z is the distance from the mean in relation to the standard deviation of the mean.) For non-normal distributions it is possible to calculate a minimum proportion of a population that falls within k standard deviations for any k (see: Chebyshev's inequality).

Name: Two-sample z-test
Formula: z = ((x̄1 − x̄2) − d0) / √(σ1²/n1 + σ2²/n2)
Assumptions or notes: Normal population and independent observations and σ1 and σ2 are known

Name: One-sample t-test
Formula: t = (x̄ − μ0) / (s / √n), df = n − 1
Assumptions or notes: (Normal population or n > 30) and σ unknown

Name: Paired t-test
Formula: t = (d̄ − d0) / (sd / √n), df = n − 1
Assumptions or notes: (Normal population of differences or n > 30) and σ unknown

Name: Two-sample pooled t-test, equal variances*
Formula: t = ((x̄1 − x̄2) − d0) / (sp √(1/n1 + 1/n2)), where sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) is the pooled variance, df = n1 + n2 − 2
Assumptions or notes: (Normal populations or n1 + n2 > 40) and independent observations and σ1 = σ2 and σ1 and σ2 unknown [5]

Name: Two-sample unpooled t-test, unequal variances*
Formula: t = ((x̄1 − x̄2) − d0) / √(s1²/n1 + s2²/n2), df = (s1²/n1 + s2²/n2)² / ((s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1)) or, more conservatively, df = min{n1,n2} − 1
Assumptions or notes: (Normal populations or n1 + n2 > 40) and independent observations and σ1 ≠ σ2 and σ1 and σ2 unknown [6]

Name: One proportion z-test
Formula: z = (p̂ − p0) / √(p0(1 − p0) / n)
Assumptions or notes: n p0 > 10 and n(1 − p0) > 10 and it is a SRS (Simple Random Sample).

Name: Two proportion z-test, pooled
Formula: z = (p1 − p2) / √(p̂(1 − p̂)(1/n1 + 1/n2)), where p̂ = (x1 + x2) / (n1 + n2)
Assumptions or notes: n1 p1 > 5 and n1(1 − p1) > 5 and n2 p2 > 5 and n2(1 − p2) > 5 and independent observations

Name: Two proportion z-test, unpooled
Formula: z = ((p1 − p2) − d0) / √(p1(1 − p1)/n1 + p2(1 − p2)/n2)
Assumptions or notes: n1 p1 > 5 and n1(1 − p1) > 5 and n2 p2 > 5 and n2(1 − p2) > 5 and independent observations

Name: One-sample chi-square test
Formula: χ² = Σ (observed − expected)² / expected
Assumptions or notes: One of the following
• All expected counts are at least 5
• All expected counts are > 1 and no more than 20% of expected counts are less than 5

*Two-sample F test for equality of variances: F = s1² / s2². Arrange so s1² > s2² and reject H0 for F > F(α/2, n1 − 1, n2 − 1) [7]

Symbols used:
• α = the probability of Type I error (rejecting a null hypothesis when it is in fact true)
• n = sample size
• n1 = sample 1 size
• n2 = sample 2 size
• x̄ = sample mean
• μ0 = hypothesized population mean
• μ1 = population 1 mean
• μ2 = population 2 mean
• σ = population standard deviation
• σ² = population variance
• s = sample standard deviation
• s² = sample variance
• s1 = sample 1 standard deviation
• s2 = sample 2 standard deviation
• t = t statistic
• df = degrees of freedom
• d̄ = sample mean of differences
• d0 = hypothesized population mean difference
• sd = standard deviation of differences
• p̂ = x/n = sample proportion, unless specified otherwise
• p0 = hypothesized population proportion
• p1 = proportion 1
• p2 = proportion 2
• min{n1,n2} = minimum of n1 and n2
• x1 = n1 p1
• x2 = n2 p2
• χ² = Chi-squared statistic
• F = F statistic
In general, the subscript 0 indicates a value taken from the null hypothesis, H0, which should be used as much as possible in constructing its test statistic.
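Two of the statistics named in the table can be computed with the Python standard library alone. This is a sketch under the standard textbook definitions of the one-sample t statistic and the chi-square goodness-of-fit statistic; the function names and the t-test sample are hypothetical, while the observed counts reuse the 40 Heads and 10 Tails from the coin example. Converting either statistic to a P-value requires the t or chi-square distribution, which is not covered here.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """One-sample t statistic and its degrees of freedom (sigma unknown)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))  # stdev uses the n - 1 divisor
    return t, n - 1

def chi_square_statistic(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (observed - expected)^2 / expected."""
    if any(e < 5 for e in expected):
        # Applies the stricter of the two expected-count conditions from the table.
        raise ValueError("expected counts should all be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical sample tested against a hypothesized mean of 5.
t, df = one_sample_t([5.1, 4.9, 5.6, 5.8, 6.0], mu0=5.0)
print(f"t = {t:.3f} with df = {df}")

# Coin example: 40 Heads and 10 Tails observed, 25 of each expected under H0.
chi2 = chi_square_statistic(observed=[40, 10], expected=[25, 25])
print(f"chi-square = {chi2:.1f}")
```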
Importance: Statistical hypothesis testing plays an important role in the whole of statistics and in statistical inference. For example, Lehmann (1992) in a review of the fundamental paper by Neyman and Pearson (1933) says: "Nevertheless, despite their shortcomings, the new paradigm formulated in the 1933 paper, and the many developments carried out within its framework continue to play a central role in both the theory and practice of statistics and can be expected to do so in the foreseeable future".
Criticism: Some statisticians have commented that pure "significance testing" has what is actually a rather strange goal of detecting the existence of a "real" difference between two populations. In practice a difference can almost always be found given a large enough sample; the typically more relevant goal of science is a determination of causal effect size. The amount and nature of the difference, in other words, is what should be studied. Many researchers also feel that hypothesis testing is something of a misnomer. In practice a single statistical test in a single study never "proves" anything.
Rejection of the null hypothesis at some effect size has no bearing on the practical significance at the observed effect size. A statistically significant finding may not be relevant in practice due to other, larger effects of more concern, whilst a true effect of practical significance may not appear statistically significant if the test lacks the power to detect it. Appropriate specification of both the hypothesis and the test of said hypothesis is therefore important to provide inference of practical utility.
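The point that a difference can almost always be found given a large enough sample can be illustrated with a small calculation. The sketch below is idealized: it assumes the observed difference equals a fixed, hypothetical true difference of 0.1 units against a population standard deviation of 10, and uses a two-tailed z-test. Only the sample size changes.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(effect: float, sigma: float, n: int) -> float:
    """Two-tailed z-test P-value when the observed difference from H0 equals `effect`."""
    z = effect / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

effect, sigma = 0.1, 10.0  # hypothetical: a difference of 1% of one standard deviation

for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}  P-value = {two_sided_p_value(effect, sigma, n):.4g}")
```

The effect size is identical in every row, yet the largest sample declares it statistically significant. This is why the amount and nature of the difference should be reported alongside the outcome of the test.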