Statistics Formula Sheet


Summarising data

Sample mean:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Sample variance:
$$s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right\}.$$

Sample covariance:
$$g = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \right\}.$$

Sample correlation:
$$r = \frac{g}{s_x s_y}.$$

Probability

Addition law:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B).$$

Multiplication law:
$$P(A \cap B) = P(A)P(B|A) = P(B)P(A|B).$$

Partition law: For a partition $B_1, B_2, \ldots, B_k$
$$P(A) = \sum_{i=1}^{k} P(A \cap B_i) = \sum_{i=1}^{k} P(A|B_i)P(B_i).$$

Bayes' formula:
$$P(B_i|A) = \frac{P(A|B_i)P(B_i)}{P(A)} = \frac{P(A|B_i)P(B_i)}{\sum_{j=1}^{k} P(A|B_j)P(B_j)}.$$

Discrete distributions

Mean value:
$$E(X) = \mu = \sum_{x_i \in S} x_i \, p(x_i).$$

Variance:
$$\mathrm{Var}(X) = \sum_{x_i \in S} (x_i - \mu)^2 p(x_i) = \sum_{x_i \in S} x_i^2 \, p(x_i) - \mu^2.$$

The binomial distribution:
$$p(x) = \binom{n}{x} \theta^x (1 - \theta)^{n-x} \quad \text{for } x = 0, 1, \ldots, n.$$
This has mean $n\theta$ and variance $n\theta(1 - \theta)$.

The Poisson distribution:
$$p(x) = \frac{\lambda^x \exp(-\lambda)}{x!} \quad \text{for } x = 0, 1, 2, \ldots.$$
This has mean $\lambda$ and variance $\lambda$.

Continuous distributions

Distribution function:
$$F(y) = P(X \le y) = \int_{-\infty}^{y} f(x)\,dx.$$

Density function:
$$f(x) = \frac{d}{dx} F(x).$$

Evaluating probabilities:
$$P(a < X \le b) = \int_{a}^{b} f(x)\,dx = F(b) - F(a).$$

Expected value:
$$E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx.$$

Variance:
$$\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2.$$

Hazard function:
$$h(t) = \frac{f(t)}{1 - F(t)}.$$

Normal density with mean $\mu$ and variance $\sigma^2$:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right\} \quad \text{for } x \in (-\infty, \infty).$$

Weibull density:
$$f(t) = \lambda \kappa t^{\kappa - 1} \exp(-\lambda t^{\kappa}) \quad \text{for } t \ge 0.$$

Exponential density:
$$f(t) = \lambda \exp(-\lambda t) \quad \text{for } t \ge 0.$$
This has mean $\lambda^{-1}$ and variance $\lambda^{-2}$.

Test for population mean

Data: Single sample of measurements $x_1, \ldots, x_n$.

Hypothesis: $H : \mu = \mu_0$.

Method:

• Calculate $\bar{x}$, $s^2$, and $t = |\bar{x} - \mu_0| \sqrt{n} / s$.

• Obtain critical value from t-tables, df = $n - 1$.

• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.
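The method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_sample_t` and the sample data are hypothetical, and the critical value `c` is assumed to have been read from t-tables by the user rather than computed.

```python
import math


def one_sample_t(xs, mu0):
    """Return (xbar, s2, t, df) for the test of H: mu = mu0.

    Hypothetical helper: follows the sheet's formulas, with the
    sample variance using the n - 1 divisor.
    """
    n = len(xs)
    xbar = sum(xs) / n                                # sample mean
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # sample variance
    t = abs(xbar - mu0) * math.sqrt(n) / math.sqrt(s2)
    return xbar, s2, t, n - 1


# Illustrative data (made up for this sketch).
xbar, s2, t, df = one_sample_t([1, 2, 3, 4, 5], mu0=2)

# Assumed critical value: two-sided 5% level, df = 4, taken from t-tables.
c = 2.776
reject = abs(t) > c
```

With these made-up numbers the statistic is about 1.41, which is below the tabulated value, so H would not be rejected at the 5% level.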

Paired sample t-test

Data: Single sample of $n$ measurements $x_1, \ldots, x_n$ which are the pairwise differences between the two original sets of measurements.

Hypothesis: $H : \mu = 0$.

Method:

• Calculate $\bar{x}$, $s^2$ and $t = \bar{x} \sqrt{n} / s$.

• Obtain critical value from t-tables, df = $n - 1$.

• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Two sample t-test

Data: Two separate samples of measurements $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$.

Hypothesis: $H : \mu_x = \mu_y$.

Method:

• Calculate $\bar{x}$, $s_x^2$, $\bar{y}$, and $s_y^2$.

• Calculate $s^2 = \left\{ (n - 1)s_x^2 + (m - 1)s_y^2 \right\} / (n + m - 2)$.

• Calculate
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{s^2 \left( \frac{1}{n} + \frac{1}{m} \right)}}.$$

• Obtain critical value from t-tables, df = $n + m - 2$.

• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

CI for population mean

Data: Sample of measurements $x_1, \ldots, x_n$.

Method:

• Calculate $\bar{x}$, $s_x^2$.

• Look in t-tables, df = $n - 1$, column $p$. Let the tabulated value be $c$ say.

• $100(1 - p)\%$ confidence interval for $\mu$ is $\bar{x} \pm c s_x / \sqrt{n}$.

CI for difference in population means

Data: Separate samples $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$.

Method:

• Calculate $\bar{x}$, $s_x^2$, $\bar{y}$, $s_y^2$.

• Calculate $s^2 = \left\{ (n - 1)s_x^2 + (m - 1)s_y^2 \right\} / (n + m - 2)$.

• Look in t-tables, df = $n + m - 2$, column $p$. Let the tabulated value be $c$ say.

• $100(1 - p)\%$ confidence interval for the difference in population means, i.e. $\mu_x - \mu_y$, is
$$(\bar{x} - \bar{y}) \pm c \sqrt{s^2 \left( \frac{1}{n} + \frac{1}{m} \right)}.$$

Regression and correlation

The linear regression model:
$$y_i = \alpha + \beta x_i + z_i.$$

Least squares estimates of $\alpha$ and $\beta$:
$$\hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{(n - 1)s_x^2}, \quad \text{and} \quad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}.$$

Confidence interval for β

• Calculate $\hat{\beta}$ as given previously.

• Calculate $s_\varepsilon^2 = s_y^2 - \hat{\beta}^2 s_x^2$.

• Calculate $SE(\hat{\beta}) = \sqrt{\dfrac{s_\varepsilon^2}{(n - 2)s_x^2}}$.

• Look in t-tables, df = $n - 2$, column $p$. Let the tabulated value be $c$.

• $100(1 - p)\%$ confidence interval for $\beta$ is $\hat{\beta} \pm c \, SE(\hat{\beta})$.

Test for ρ = 0

Hypothesis: $H : \rho = 0$.

• Calculate
$$t = r \left( \frac{n - 2}{1 - r^2} \right)^{1/2}.$$

• Obtain critical value from t-tables, df = $n - 2$.

• Reject $H$ at the $100p\%$ level of significance if $|t| > c$, where $c$ is the tabulated value corresponding to column $p$.

Approximate CI for proportion θ

$$p \pm 1.96 \sqrt{\frac{p(1 - p)}{n - 1}}$$
where $p$ is the observed proportion in the sample.
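As a rough illustration of the interval above, the following Python sketch evaluates it for a given count. The function name `approx_proportion_ci` and the example counts are hypothetical; the multiplier 1.96 and the $n - 1$ divisor are taken directly from the formula on this sheet.

```python
import math


def approx_proportion_ci(successes, n, z=1.96):
    """Approximate CI for a proportion, using the sheet's n - 1 divisor.

    Hypothetical helper; z = 1.96 is the multiplier given on the sheet.
    """
    p = successes / n                                   # observed proportion
    half_width = z * math.sqrt(p * (1 - p) / (n - 1))
    return p - half_width, p + half_width


# Illustrative counts (made up): 30 successes out of 101 trials.
lo, hi = approx_proportion_ci(30, 101)
```

The interval is symmetric about the observed proportion, so its midpoint recovers $p$ and half its length recovers the margin of error.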

Test for a proportion

Hypothesis: $H : \theta = \theta_0$.

• Test statistic
$$z = \frac{p - \theta_0}{\sqrt{\dfrac{\theta_0 (1 - \theta_0)}{n}}}.$$

• Obtain critical value from normal tables.

Comparison of proportions

Hypothesis: $H : \theta_1 = \theta_2$.

• Calculate
$$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}.$$

• Calculate
$$z = \frac{p_1 - p_2}{\sqrt{p(1 - p) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}.$$

• Obtain appropriate critical value from normal tables.
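Both proportion tests reduce to evaluating a z statistic, which might be sketched in Python as follows. The function names and the example figures are hypothetical, and the critical value is assumed to come from normal tables as the sheet instructs.

```python
import math


def one_proportion_z(p, theta0, n):
    """z statistic for H: theta = theta0 (hypothetical helper)."""
    return (p - theta0) / math.sqrt(theta0 * (1 - theta0) / n)


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for H: theta1 = theta2, using the pooled proportion."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Illustrative figures (made up for this sketch).
z_one = one_proportion_z(p=0.6, theta0=0.5, n=100)
z_two = two_proportion_z(p1=0.6, n1=100, p2=0.4, n2=100)
```

Here `z_one` works out to 0.1 / 0.05 = 2 and `z_two` to 0.2 / sqrt(0.005), roughly 2.83, each to be compared with the tabulated normal critical value.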

Goodness of fit

Test statistic
$$\chi^2 = \sum_{i=1}^{m} \frac{(o_i - e_i)^2}{e_i}$$
where $m$ is the number of categories.

Hypothesis: $H : F = F_0$.

• Calculate the expected class frequencies under $F_0$.

• Calculate the $\chi^2$ test statistic given above.

• Determine the degrees of freedom, $\nu$ say.

• Obtain critical value from $\chi^2$ tables, df = $\nu$.

• Reject $H : F = F_0$ at the $100p\%$ level of significance if $\chi^2 > c$, where $c$ is the tabulated critical value.
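The statistic above is a single sum over categories, as this short Python sketch shows. The function name and the counts are hypothetical. The example assumes $\nu = m - 1$ degrees of freedom, the usual choice when $F_0$ is fully specified with no estimated parameters; the sheet itself leaves $\nu$ to be determined case by case.

```python
def chi_square_statistic(observed, expected):
    """Sum of (o_i - e_i)^2 / e_i over the m categories (hypothetical helper)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Illustrative counts (made up): 100 observations over 5 equally likely classes.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
chi2 = chi_square_statistic(observed, expected)

# Assumed critical value: 5% level, df = 4, taken from chi-squared tables.
c = 9.488
reject = chi2 > c
```

For these counts the statistic is 58 / 20 = 2.9, well below the tabulated value, so H would not be rejected at the 5% level.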
