Statistics formula sheet
Summarising data

Sample mean:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$.

Sample variance:
$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right\}$.

Sample covariance:
$g = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \right\}$.

Sample correlation:
$r = \frac{g}{s_x s_y}$.

Probability

Addition law:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Multiplication law:
$P(A \cap B) = P(A) P(B \mid A) = P(B) P(A \mid B)$.

Partition law: For a partition $B_1, B_2, \ldots, B_k$
$P(A) = \sum_{i=1}^{k} P(A \cap B_i) = \sum_{i=1}^{k} P(A \mid B_i) P(B_i)$.

Bayes' formula:
$P(B_i \mid A) = \frac{P(A \mid B_i) P(B_i)}{P(A)} = \frac{P(A \mid B_i) P(B_i)}{\sum_{j=1}^{k} P(A \mid B_j) P(B_j)}$.

Discrete distributions

Mean value:
$E(X) = \mu = \sum_{x_i \in S} x_i \, p(x_i)$.

Variance:
$\mathrm{Var}(X) = \sum_{x_i \in S} (x_i - \mu)^2 p(x_i) = \sum_{x_i \in S} x_i^2 \, p(x_i) - \mu^2$.

The binomial distribution:
$p(x) = \binom{n}{x} \theta^x (1 - \theta)^{n-x}$ for $x = 0, 1, \ldots, n$.
This has mean $n\theta$ and variance $n\theta(1 - \theta)$.

The Poisson distribution:
$p(x) = \frac{\lambda^x \exp(-\lambda)}{x!}$ for $x = 0, 1, 2, \ldots$.
This has mean $\lambda$ and variance $\lambda$.

Continuous distributions

Distribution function:
$F(y) = P(X \le y) = \int_{-\infty}^{y} f(x) \, dx$.

Density function:
$f(x) = \frac{d}{dx} F(x)$.

Evaluating probabilities:
$P(a < X \le b) = \int_{a}^{b} f(x) \, dx = F(b) - F(a)$.

Expected value:
$E(X) = \mu = \int_{-\infty}^{\infty} x f(x) \, dx$.

Variance:
$\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx = \int_{-\infty}^{\infty} x^2 f(x) \, dx - \mu^2$.

Hazard function:
$h(t) = \frac{f(t)}{1 - F(t)}$.

Normal density with mean $\mu$ and variance $\sigma^2$:
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right\}$ for $-\infty < x < \infty$.

Exponential density:
$f(t) = \lambda \exp(-\lambda t)$ for $t \ge 0$.
This has mean $\lambda^{-1}$ and variance $\lambda^{-2}$.

Weibull density:
$f(t) = \lambda \kappa t^{\kappa - 1} \exp(-\lambda t^{\kappa})$ for $t \ge 0$.

Test for population mean
Data: Single sample of measurements $x_1, \ldots, x_n$.
Hypothesis: $H: \mu = \mu_0$.
Method:
• Calculate $\bar{x}$, $s^2$, and $|t| = |\bar{x} - \mu_0| \sqrt{n} / s$.
• Obtain critical value from t-tables, df $= n - 1$.
• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.
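
The test for a population mean can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_sample_t`, the data values, and the choice of `mu0` are made up, and the critical value `c` is assumed to have been read from t-tables by hand (2.571 is the standard two-sided 5% value for 5 degrees of freedom).

```python
import math
import statistics


def one_sample_t(xs, mu0):
    """Return |t| = |xbar - mu0| * sqrt(n) / s for a single sample.

    Hypothetical helper, not part of the formula sheet.
    """
    n = len(xs)
    xbar = statistics.mean(xs)
    s = statistics.stdev(xs)  # sample standard deviation, divisor n - 1
    return abs(xbar - mu0) * math.sqrt(n) / s


# Made-up measurements, for illustration only.
data = [5.1, 4.9, 5.6, 5.8, 5.0, 5.4]
t_abs = one_sample_t(data, mu0=5.0)

# Critical value looked up in t-tables: df = n - 1 = 5, two-sided 5% level.
c = 2.571
reject_h = t_abs > c

print(f"|t| = {t_abs:.3f}, reject H: {reject_h}")
```

Passing `c` in from the tables, rather than computing it, mirrors the method on the sheet and keeps the sketch free of third-party libraries.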
Paired sample t-test
Data: Single sample of $n$ measurements $x_1, \ldots, x_n$ which are the pairwise differences between the two original sets of measurements.
Hypothesis: $H: \mu = 0$.
Method:
• Calculate $\bar{x}$, $s^2$ and $t = \bar{x} \sqrt{n} / s$.
• Obtain critical value from t-tables, df $= n - 1$.
• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Two sample t-test
Data: Two separate samples of measurements $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$.
Hypothesis: $H: \mu_x = \mu_y$.
Method:
• Calculate $\bar{x}$, $s_x^2$, $\bar{y}$, and $s_y^2$.
• Calculate $s^2 = \{ (n - 1) s_x^2 + (m - 1) s_y^2 \} / (n + m - 2)$.
• Calculate $t = \frac{\bar{x} - \bar{y}}{\sqrt{s^2 \left( \frac{1}{n} + \frac{1}{m} \right)}}$.
• Obtain critical value from t-tables, df $= n + m - 2$.
• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

CI for population mean
Data: Sample of measurements $x_1, \ldots, x_n$.
Method:
• Calculate $\bar{x}$, $s_x^2$.
• Look in t-tables, df $= n - 1$, column $p$. Let the tabulated value be $c$ say.
• $100(1 - p)\%$ confidence interval for $\mu$ is $\bar{x} \pm c \, s_x / \sqrt{n}$.

CI for difference in population means
Data: Separate samples $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$.
Method:
• Calculate $\bar{x}$, $s_x^2$, $\bar{y}$, $s_y^2$.
• Calculate $s^2 = \{ (n - 1) s_x^2 + (m - 1) s_y^2 \} / (n + m - 2)$.
• Look in t-tables, df $= n + m - 2$, column $p$. Let the tabulated value be $c$ say.
• $100(1 - p)\%$ confidence interval for the difference in population means, i.e. $\mu_x - \mu_y$, is $(\bar{x} - \bar{y}) \pm c \sqrt{s^2 \left( \frac{1}{n} + \frac{1}{m} \right)}$.

Regression and correlation
The linear regression model:
$y_i = \alpha + \beta x_i + z_i$.
Least squares estimates of $\alpha$ and $\beta$:
$\hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{(n - 1) s_x^2}$, and $\hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}$.

Confidence interval for β
Method:
• Calculate $\hat{\beta}$ as given previously.
• Calculate $s_\varepsilon^2 = s_y^2 - \hat{\beta}^2 s_x^2$.
• Calculate $SE(\hat{\beta}) = \sqrt{\frac{s_\varepsilon^2}{(n - 2) s_x^2}}$.
• Look in t-tables, df $= n - 2$, column $p$. Let the tabulated value be $c$.
• $100(1 - p)\%$ confidence interval for $\beta$ is $\hat{\beta} \pm c \, SE(\hat{\beta})$.

Test for ρ = 0
Hypothesis: $H: \rho = 0$.
• Calculate $t = r \left( \frac{n - 2}{1 - r^2} \right)^{1/2}$.
• Obtain critical value from t-tables, df $= n - 2$.
• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Approximate CI for proportion θ
$p \pm 1.96 \sqrt{\frac{p(1 - p)}{n - 1}}$
where $p$ is the observed proportion in the sample.
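
As a rough illustration of the approximate interval for a proportion, the following Python sketch evaluates $p \pm 1.96 \sqrt{p(1 - p)/(n - 1)}$. The function name `proportion_ci` and the counts (40 successes out of 101) are assumptions chosen for the example, not values from the sheet.

```python
import math


def proportion_ci(successes, n):
    """Approximate 95% CI: p +/- 1.96 * sqrt(p * (1 - p) / (n - 1)).

    Hypothetical helper; uses the n - 1 divisor given on the sheet.
    """
    p = successes / n  # observed proportion in the sample
    half_width = 1.96 * math.sqrt(p * (1 - p) / (n - 1))
    return p - half_width, p + half_width


# Made-up sample: 40 successes in 101 trials.
lo, hi = proportion_ci(40, 101)
print(f"approximate 95% CI for theta: ({lo:.3f}, {hi:.3f})")
```

The multiplier 1.96 fixes the confidence level at roughly 95%; a different level would need a different value from normal tables.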
Test for a proportion
Hypothesis: $H: \theta = \theta_0$.
• Test statistic $z = \frac{p - \theta_0}{\sqrt{\theta_0 (1 - \theta_0) / n}}$.
• Obtain critical value from normal tables.

Comparison of proportions
Hypothesis: $H: \theta_1 = \theta_2$.
• Calculate $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$.
• Calculate $z = \frac{p_1 - p_2}{\sqrt{p(1 - p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$.
• Obtain appropriate critical value from normal tables.
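
The comparison of proportions might be carried out as in the Python sketch below. The function name `compare_proportions`, its count-based arguments, and the sample counts are all assumptions for illustration; the critical value 1.96 is the usual two-sided 5% value from normal tables.

```python
import math


def compare_proportions(x1, n1, x2, n2):
    """Return the pooled proportion p and the statistic z.

    x1, x2 are hypothetical success counts in samples of size n1, n2.
    """
    p1, p2 = x1 / n1, x2 / n2
    p = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion
    z = (p1 - p2) / math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return p, z


# Made-up data: 45 of 100 in the first sample, 30 of 100 in the second.
p, z = compare_proportions(45, 100, 30, 100)

# Two-sided 5% critical value from normal tables.
reject_h = abs(z) > 1.96
print(f"pooled p = {p:.3f}, z = {z:.3f}, reject H: {reject_h}")
```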
Goodness of fit
Test statistic
$\chi^2 = \sum_{i=1}^{m} \frac{(o_i - e_i)^2}{e_i}$
where $m$ is the number of categories.
Hypothesis: $H: F = F_0$.
• Calculate the expected class frequencies under $F_0$.
• Calculate the $\chi^2$ test statistic given above.
• Determine the degrees of freedom, $\nu$ say.
• Obtain critical value from $\chi^2$ tables, df $= \nu$.
• Reject $H: F = F_0$ at the $100p\%$ level of significance if $\chi^2 > c$, where $c$ is the tabulated critical value.
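
The goodness-of-fit steps can be sketched in Python as follows. The function name `chi_square_statistic` and the die-roll counts are made up for illustration. The sketch also assumes that $F_0$ is fully specified with no estimated parameters, so that $\nu = m - 1$; the sheet itself leaves the choice of $\nu$ open. The critical value 11.07 is the standard 5% value for 5 degrees of freedom from $\chi^2$ tables.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over categories of (o_i - e_i)^2 / e_i.

    Hypothetical helper, not part of the formula sheet.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts from 120 rolls of a die; F0 is "the die is fair".
observed = [18, 22, 20, 25, 15, 20]
expected = [120 / 6] * 6  # expected class frequencies under F0

chi2 = chi_square_statistic(observed, expected)

# Assumption: no parameters estimated, so nu = m - 1 = 5.
nu = len(observed) - 1
c = 11.07  # 5% critical value for nu = 5, read from chi-squared tables
reject_h = chi2 > c
print(f"chi2 = {chi2:.2f}, df = {nu}, reject H: {reject_h}")
```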