Mathematical Statistics with Applications Student’s Solutions Manual
Kandethody M. Ramachandran, Department of Mathematics and Statistics, University of South Florida, Tampa, FL
Chris P. Tsokos, Department of Mathematics and Statistics, University of South Florida, Tampa, FL
Academic Press is an imprint of Elsevier
Elsevier Academic Press, 30 Corporate Drive, Suite 400, Burlington, MA 01803, USA; 525 B Street, Suite 1900, San Diego, California 92101-4495, USA; 84 Theobald's Road, London WC1X 8RR, UK

Copyright © 2009, Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: [email protected]. You may also complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting "Customer Support" and then "Obtaining Permissions."

Library of Congress Cataloging-in-Publication Data: Applications submitted
ISBN 13: 978-0-08-096443-0

For all information on all Elsevier Academic Press publications visit our Web site at www.elsevierdirect.com

Typeset by: diacriTech, India
Contents
CHAPTER 1 Descriptive Statistics
CHAPTER 2 Basic Concepts from Probability Theory
CHAPTER 3 Additional Topics in Probability
CHAPTER 4 Sampling Distributions
CHAPTER 5 Point Estimation
CHAPTER 6 Interval Estimation
CHAPTER 7 Hypothesis Testing
CHAPTER 8 Linear Regression Models
CHAPTER 9 Design of Experiments
CHAPTER 10 Analysis of Variance
CHAPTER 11 Bayesian Estimation and Inference
CHAPTER 12 Nonparametric Tests
CHAPTER 13 Empirical Methods
CHAPTER 14 Some Issues in Statistical Applications: An Overview
Chapter 1 Descriptive Statistics

EXERCISES 1.2
1.2.1. Suggested solutions: for qualitative data we can have color, sex, race, ZIP code, and so on. For quantitative data we can have age, temperature, time, height, weight, and so on. For cross-sectional data we can have the school funding for each department in 2000. For time series data we can have the crude oil price from 1995 to 2008.
1.2.3. Suggested questions: 1. What type of data are these amounts? 2. Do the federal agencies receive the same amount of money? If not, why not? 3. Which federal agency should receive more money, and why? Suggested inferences: 1. The federal agencies receive different amounts of money. 2. The differences in funding between the agencies are quite large.
EXERCISES 1.3
1.3.1. For a stratified sample: suppose we decide to sample 100 college students from a population of 1000 (that is, 10% of the population). These 1000 students come from three different majors: Mathematics (200 students), Computer Science (400), and Social Science (400). We then choose 10% of each group (20 Math, 40 CS, and 40 SS students) by random sampling within each major.
For a cluster sample: suppose we decide to sample college students from a population of 2000. These 2000 students come from 20 different countries, and we choose 3 of the 20 countries by random sampling. Then we collect the individual information from everyone in each of the 3 chosen countries.
EXERCISES 1.4
1.4.1. By Minitab:
(a) Bar graph of the percent of road mileage, with categories Poor, Mediocre, Fair, Good, and Very good. [figure omitted]
(b) Pie chart of the percent of road mileage, with the same categories. [figure omitted]
1.4.3. (a) Bar graph of the percent of energy production, with categories Coal, Natural Gas, Nuclear Electric Power, Petroleum, and Renewable Energy. [figure omitted]
(b) Pareto chart. [figure omitted] The underlying values:
Category: Petroleum, Natural Gas, Coal, Nuclear Electric Power, Renewable Energy
Percentage: 0.40, 0.23, 0.22, 0.08, 0.07
Percent: 40.0, 23.0, 22.0, 8.0, 7.0
Cum %: 40.0, 63.0, 85.0, 93.0, 100.0
(c) Pie chart of the energy sources, with the same categories. [figure omitted]
1.4.5. (a) Bar graph of the grade counts, with categories A, B, C, D, and F. [figure omitted]
(b) Pie chart of the grades, with the same categories. [figure omitted]
1.4.7. (a) Pie chart of the categories Mining, Construction, Manufacturing, Transportation, Wholesale, Retail, Finance, and Services. [figure omitted]
(b) Bar graph of the same categories. [figure omitted]
1.4.9. Bar chart of the values for the years 1900, 1960, 1980, 1990, and 2000. [figure omitted]
1.4.11. (a) Bar graph of deaths by cause (the categories include heart disease, cancer, chronic conditions, diabetes, kidney disease, pneumonia, stroke, suicide, and accidents). [figure omitted]
(b) Pareto graph. [figure omitted] The underlying table, with causes in decreasing order (heart disease first, "Other" last):
Percentage: 268.0, 119.4, 58.5, 42.3, 35.1, 34.5, 23.9, 30.2
Percent: 38.7, 28.8, 8.5, 6.1, 5.1, 5.0, 3.5, 4.4
Cum %: 38.7, 67.6, 76.0, 82.1, 87.2, 92.2, 95.6, 100.0
1.4.13. Histogram of the data (values ranging from about 60 to 90). [figure omitted]
1.4.15. (a) Stem-and-leaf display (Stem-and-leaf of C1, N = 20, Leaf Unit = 10):
1   4  7
3   4  99
8   5  00011
10  5  22
10  5  4455
6   5  6667
2   5  9
1   6  0
(b) Histogram of C1 (values from about 480 to 600). [figure omitted]
(c) Pie chart of the individual values: 475, 493, 499, 502, 503, 506, 510, 517, 525, 526, 542, 546, 553, 558, 565, 568, 572, 595, 605. [figure omitted]
EXERCISES 1.5
1.5.1. Mean is 165.6667 and standard deviation is 63.15397.
1.5.3. The data are 3, 3, 5, 13; the standard deviation is 4.760952.
1.5.5. (a) The lower quartile is 80, the median is 95, the upper quartile is 115, and the interquartile range is 35. The lower limit for outliers is 27.5 and the upper limit is 167.5.
(b) Box plot. [figure omitted]
(c) Therefore there are no outliers.
1.5.7. Σ_{i=1}^l f_i(m_i − x̄) = Σ_{i=1}^l f_i m_i − x̄ Σ_{i=1}^l f_i = nx̄ − nx̄ = 0
1.5.9.
(a) Mean is 33.105, variance is 177.0430, and the range is 48.19.
(b) The lower quartile is 24.9225, the median is 32, and the upper quartile is 42.985. The interquartile range is 18.0625. The lower limit for outliers is −2.17125 and the upper limit is 70.07875; therefore there are no outliers.
(c) Box plot. [figure omitted]
(d) Histogram of y (values from 0 to 60). [figure omitted]
1.5.11. (a) Mean is 110 and standard deviation is 83.4847. (b) 68%, 95%, 99.7%.
1.5.13. (a) Mean is 3.7433, variance is 3.501, and standard deviation is 1.871323.
(b) Frequency table:
Class  Interval  Frequency  m_i  m_i f_i
1      0-1.6     4          0.8  3.2
2      1.7-3.3   10         2.5  25
3      3.4-5     9          4.2  37.8
4      5.1-6.7   5          5.9  29.5
5      6.8-8.4   2          7.6  15.2
(c) From the grouped data, the mean is 3.69, the variance is 3.62, and the standard deviation is 1.9. The results are similar to those from the ungrouped data.
1.5.15. L = 25, f_m = 139615, w = 4, F_b = 178859, n = 514661.
M = L + (w/f_m)(.5n − F_b) = 27.24822.
1.5.17. (a) Mean is 44.27, variance is 536.15, and standard deviation is 23.15.
(b) L = 40, f_m = 59, w = 19, F_b = 69, n = 180.
M = L + (w/f_m)(.5n − F_b) = 46.763.
EXERCISES 1.8
1.8.1. (a) Histogram of y (values from about 66 to 80). [figure omitted]
(b) Mean is 74.0625, median is 74, variance is 7.223892, and standard deviation is 2.68773.
(c) Box plot. [figure omitted] The lower limit for outliers is 66 and the upper limit is 82; therefore we have no outliers.
Chapter 2 Basic Concepts from Probability Theory

EXERCISES 2.2
2.2.1. (a) S = {(R, R, R), (R, R, L), (R, L, R), (L, R, R), (R, L, L), (L, R, L), (L, L, R), (L, L, L)}
(b) P = 7/8 (c) P = 4/8 (d) P = 3/8 (e) P = 2/8
2.2.3. (a) P = 5/36 (b) P = 5/36 (c) P = 4/8
2.2.5. (b) P = 3/8 (d) P = P(A ∪ B), where A ∪ B = {(H, H), (H, T), (T, H)}
2.2.7. (a) The sample space is S = {(1, 1), (1, 2), ..., (6, 6)}, the 36 ordered pairs of faces of two dice.
(b) P = 6/36 (c) P = 1/36
2.2.9. (a) The sample space is S = {N, N, N, S, S}, where N stands for normal and S stands for spoiled.
(b) P = (3/5) × (2/4) = 6/20
(c) "No more than one" means none or exactly one: P = (3/5) × (2/4) + 2 × (2/5) × (3/4) = .9
2.2.11. P = p + 2q
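The sampling-without-replacement answers in 2.2.9 can be cross-checked by brute-force enumeration. The sketch below (Python; the manual itself uses no code, so this is purely illustrative) lists all ordered draws of two components from three normal (N) and two spoiled (S):

```python
# Enumerate all ordered draws of 2 components from {N, N, N, S, S}.
from itertools import permutations

draws = list(permutations("NNNSS", 2))  # 5 * 4 = 20 equally likely ordered draws

p_both_normal = sum(d == ("N", "N") for d in draws) / len(draws)
p_at_most_one_spoiled = sum(d.count("S") <= 1 for d in draws) / len(draws)

print(p_both_normal)          # (3/5)(2/4) = 6/20 = 0.3
print(p_at_most_one_spoiled)  # 0.9, matching part (c)
```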
2.2.13. (a) Since A ⊂ B, we can write B = A ∪ (B ∩ A^c), a disjoint union, so by axiom 3, P(B) = P(A) + P(B ∩ A^c). Since P(B ∩ A^c) ≥ 0 by axiom 1, it follows that P(A) ≤ P(B).
(b) Let C = A ∩ B, A_a = A ∩ B^c, and B_a = B ∩ A^c, so that A = A_a ∪ C and B = B_a ∪ C. By axiom 3, P(A ∪ B) = P(A_a) + P(B_a) + P(C). Since P(A) = P(A_a) + P(C) and P(B) = P(B_a) + P(C), again by axiom 3, we have P(A_a) = P(A) − P(C) and P(B_a) = P(B) − P(C). Substituting these back gives P(A ∪ B) = P(A) + P(B) − P(C) = P(A) + P(B) − P(A ∩ B). If A ∩ B = ∅, then P(A ∩ B) = P(∅) = 0, and substituting completes the proof.
2.2.15. (a) From 2.2.13 we know P(A ∪ B) = P(A) + P(B) − P(A ∩ B), and from axiom 2 we know P(A ∪ B) ≤ 1. Hence P(A) + P(B) − P(A ∩ B) ≤ 1, which proves P(A) + P(B) − 1 ≤ P(A ∩ B).
(b) From 2.2.13, P(A_1 ∪ A_2) = P(A_1) + P(A_2) − P(A_1 ∩ A_2). From axiom 1, P(A_1 ∩ A_2) ≥ 0, so −P(A_1 ∩ A_2) ≤ 0 and therefore P(A_1 ∪ A_2) ≤ P(A_1) + P(A_2).
2.2.17. (a) P = .24 + .67 − .09 = .82
(b) P = 1 − .82 = .18
(c) P = 1 − .09 = .91
(d) P = 1 − .09 = .91
(e) P = 1 − .82 = .18
2.2.19. (a) P = .55 (b) P = .3 (c) P = .7
2.2.21. (a) P = (3/5) × (2/4) + (2/5) × (1/4) = .4
(b) P = (3/5) × (2/4) + (3/5) × (2/4) = .6
(c) P = 2 × (3/5) × (2/4) + (2/5) × (1/4) = .7
(d) P = (3/5) × (2/4) = .3
2.2.23. Without loss of generality, assume (A_n) is an increasing sequence, A_1 ⊂ A_2 ⊂ ... ⊂ A_n ⊂ .... Then A_1 ∪ A_2 ∪ ... ∪ A_n ∪ ... = ∪_{i=1}^∞ A_i = lim_{n→∞} A_n. Taking probabilities on both sides gives lim_{n→∞} P(A_n) = P(∪_{i=1}^∞ A_i) = P(lim_{n→∞} A_n).
EXERCISES 2.3
2.3.1. (a) 45 (b) 1 (c) 10 (d) 5400 (e) 2520
2.3.3. 1024
2.3.5. 53130
2.3.7. 155117520
2.3.9. 440
2.3.11. (a) p = .4313 × .44425 × .46798 × .5263 × 1 = .04719
(b) p = .0001189
(c) p = .4313 × .21419 × .10344 × 1 × 1 = .009557
2.3.13. 180
2.3.15. (a) p = 1 − [365 × 364 × ... × (365 − 20 + 1)]/365^20 = .4114
(b) p = 1 − .2936 = .7063
(c) If n = 23, then p = .4927
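The birthday-problem value in 2.3.15(a) is easy to reproduce numerically. A sketch (Python), under the usual assumption of 365 equally likely birthdays:

```python
# P(at least two of n people share a birthday) = 1 - (365/365)(364/365)...((365-n+1)/365)
def birthday(n):
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (365 - k) / 365
    return 1 - p_distinct

print(round(birthday(20), 4))  # 0.4114, matching part (a)
```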
2.3.17. p = 1 − .27778 − .16667 = .5556
2.3.19. (a) 7776 (b) 3.954 × 10^21 (c) 5.36447 × 10^28 (d) 3.022285 × 10^12
2.3.21. The question concerns a cell splitting to produce a child cell that carries half of the chromosomes (one from each of the 23 pairs). With this understanding:
(a) 2^23
(b) C(23, 9)/2^23 = .097416
EXERCISES 2.4
2.4.1. (a) .999 (b) 1/3
2.4.3. (a) P(A|B) + P(A^c|B) = P(A ∩ B)/P(B) + P(A^c ∩ B)/P(B) = P(B)/P(B) = 1
(b) (i) If P(A|B) + P(A|B^c) = 1, then P(A|B^c) = 1 − P(A|B) = P(A^c|B); that would mean A and B are symmetric in probability, but it is not always true.
(ii) If P(A|B) + P(A^c|B^c) = 1, then P(A^c|B^c) = 1 − P(A|B) = P(A^c|B). That means the conditional probability of A^c is the same given B and given B^c, which is what happens when A and B are independent. But that is not always true.
2.4.5. If A and B are independent, then P(A ∩ B) = P(A)P(B).
(i) P(A^c ∩ B) = P(B) − P(A ∩ B) = P(B) − P(A)P(B) = (1 − P(A))P(B) = P(A^c)P(B), so A^c and B are independent.
(ii) By (i), with the roles of A and B switched.
(iii) P(A^c ∩ B^c) = P(B^c) − P(A ∩ B^c) = P(B^c) − P(A)P(B^c) = (1 − P(A))P(B^c) = P(A^c)P(B^c), using (ii) for P(A ∩ B^c) = P(A)P(B^c).
2.4.7. P(E|F) = 1/13 = 4/52 = P(E), so E and F are independent.
2.4.9. .1948
2.4.11. (a) P = .031125 (b) P = .06
2.4.13. .8
2.4.15. (a) P(a dime is selected) = Σ_{i=2}^{12} P(a dime is selected | box i is selected) P(box i is selected)
= Σ_{i=2}^{12} (i/12) P(the sum of the dice = i)
= [2(1) + 3(2) + 4(3) + 5(4) + 6(5) + 7(6) + 8(5) + 9(4) + 10(3) + 11(2) + 12(1)]/[12(36)]
= .583333
(b) P(box 4 is selected | a penny is selected) = P(box 4 is selected and a penny is selected)/P(a penny is selected)
= (3/36)(8/12)/[1 − P(a dime is selected)] = .13333
2.4.17. P = .6575
2.4.19. P = .60976
2.4.21. (a) P(accident) = .25 × .086 + .257 × .044 + .347 × .056 + .146 × .098 = .066548
(b) P(group 4 | accident) = (.146 × .098)/.0665 = .215
2.4.23. P = .16667
2.4.25.
P(Working) = P(B, C) + P(A, B, notC) + P(A, notB, C) = [1 − P(notB) − P(notC) − P(notB, notC)] + P(A)P(B|notC)P(notC) + P(A) [P(notB) − P(notB, notC)] = [1 − 0.1 − 0.05 − 0.75 ∗ 0.05] + 0.85 ∗ 0.25 ∗ 0.05 + 0.85 ∗ [0.1 − 0.75 ∗ 0.05] = 0.8875 + 0.010625 + 0.053125 = 0.95
2.4.27. (a) P(same type of blood) = (18/40)(17/39) + (16/40)(15/39) + (4/40)(3/39) + (2/40)(1/39) = .358974
(b) P(type is O | type is B) = 18/39 = .4615
2.4.29. Let E denote the event that A ends up with all the money when he starts with i units.
Let F denote the event that A ends up with all the money when he starts with N − i units. If A starts with N − i, then B starts with i (N is the total money A and B have), so once we find P(E) we have P(F) = 1 − P(E). Let H denote the event that the first flip lands heads, and let p be the probability of heads. Then
P(E) = P(E|H)P(H) + P(E|H^c)P(H^c) = P(E|H)p + P(E|H^c)(1 − p).
Write P_i for the probability that A wins all the money starting with i units. Given that the first flip lands heads, A has i + 1 units and B has N − (i + 1). Since the successive flips are independent with common probability p of heads, from that point on A's probability of winning everything is the same as if the game were just starting with initial fortunes i + 1 and N − (i + 1). Therefore P(E|H) = P_{i+1} and, by the same reasoning for tails, P(E|H^c) = P_{i−1}. With q = 1 − p,
P_i = pP_{i+1} + qP_{i−1}, i = 1, 2, ..., N − 1,
with boundary conditions P_0 = 0 and P_N = 1. Since p + q = 1, (p + q)P_i = pP_{i+1} + qP_{i−1}, so
P_{i+1} − P_i = (q/p)(P_i − P_{i−1}), i = 1, 2, ..., N − 1.
Iterating,
P_2 − P_1 = (q/p)(P_1 − P_0) = (q/p)P_1
P_3 − P_2 = (q/p)(P_2 − P_1) = (q/p)^2 P_1
...
P_N − P_{N−1} = (q/p)(P_{N−1} − P_{N−2}) = (q/p)^{N−1} P_1.
If q/p = 1, then P_2 = 2P_1, P_3 = 3P_1, ..., P_N = NP_1 = 1, so P_1 = 1/N and P_i = i/N.
If q/p ≠ 1, adding all the equations gives
P_N − P_1 = P_1[(q/p) + (q/p)^2 + ... + (q/p)^{N−1}],
so 1 = P_1[1 − (q/p)^N]/[1 − q/p] and P_1 = [1 − q/p]/[1 − (q/p)^N].
Adding the first i − 1 equations gives
P_i − P_1 = P_1[(q/p) + ... + (q/p)^{i−1}] = P_1 (q/p)[1 − (q/p)^{i−1}]/[1 − q/p],
so P_i = P_1[1 − (q/p)^i]/[1 − q/p] = [1 − (q/p)^i]/[1 − (q/p)^N].
If we start with N − i, just replace p by q and i by N − i:
Q_i = [1 − (p/q)^{N−i}]/[1 − (p/q)^N] if p/q ≠ 1, and Q_i = (N − i)/N if p/q = 1.
In either case, P_i + Q_i = 1.
EXERCISES 2.5
2.5.1. (a) c = e^{−λ} (b) P = e^{−λ} (c) P = 1 − e^{−λ} − λe^{−λ} − (λ^2/2)e^{−λ}
2.5.3. F(x) = 0 for x < −5; .2 for −5 ≤ x < 0; .3 for 0 ≤ x < 3; .7 for 3 ≤ x < 6; and 1 for x ≥ 6.
2.5.5. p(x) = 0 for −∞ < x ≤ −1; .2 for −1 < x ≤ 3; .6 for 3 < x ≤ 9; and .2 for x ≥ 9.
2.5.7. (a) c = 1/9
(b) P = .7037
(c) F(x) = 0 for x < 0; x^3/27 for 0 ≤ x ≤ 3; and 1 for x > 3.
2.5.9. f(x) = 0 for x ≤ 0, and 2x/(1 + x)^2 for x > 0.
2.5.11. p = .7013 − .55809 = .1432; p = .7364.
2.5.13. f(t) = 0 for t < 0, and [β(t − γ)^{β−1}/α] e^{−(t−γ)^β/α} for t ≥ 0.
EXERCISES 2.6
2.6.1. m(t) = (1/6)(e^t + e^{2t} + e^{3t} + e^{4t} + e^{5t} + e^{6t}); Var(X) = 2.9167
2.6.3. (a) E(Y) = 3.6, E(Y^2) = 17.2, E(Y^3) = 95.3, Var(Y) = 4.24
(b) M_y(t) = .1e^{−t} + .05 + .25e^{2t} + .4e^{5t} + .2e^{6t}
2.6.5.
E(X) = Σ_x x P(X = x) = Σ_{n=1}^∞ 2^n P(X = 2^n) = Σ_{n=1}^∞ 2^n (1/2^n) = Σ_{n=1}^∞ 1 = ∞
2.6.7. a = 1/2, b = 1
2.6.9. (a) E(c) = Σ_x c f(x) = c Σ_x f(x) = c
(b) E(cg(X)) = Σ_x cg(x)f(x) = c Σ_x g(x)f(x) = cE(g(X))
(c) E(Σ_i g_i(X)) = Σ_x Σ_i g_i(x)f(x) = Σ_i Σ_x g_i(x)f(x) = Σ_i E(g_i(X))
(d) V(aX + b) = E[(aX + b) − E(aX + b)]^2 = E[aX + b − aE(X) − b]^2 = E[a^2(X − E(X))^2] = a^2 V(X). Plugging in b = 0 gives V(aX) = a^2 V(X).
2.6.11. E(X) = c × 1 + 0 = c; V(X) = E(X^2) − (E(X))^2 = c^2 − c^2 = 0. The CDF is F(x) = ε(x − c), where ε is the indicator (unit step) function.
2.6.13. M_x(t) = e^{λ(e^t − 1)}
E(X) = M'_x(0) = λe^t e^{λ(e^t − 1)} |_{t=0} = λ
E(X^2) = M''_x(0) = [λe^t e^{λ(e^t − 1)} + (λe^t)^2 e^{λ(e^t − 1)}] |_{t=0} = λ + λ^2
V(X) = E(X^2) − (E(X))^2 = λ + λ^2 − λ^2 = λ
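The Poisson mean and variance obtained from the mgf in 2.6.13 can be cross-checked directly from the pmf. A sketch (Python; λ = 2.5 is an arbitrary test value):

```python
import math

lam = 2.5
# Build p(x) = exp(-lam) * lam**x / x! iteratively to avoid huge factorials.
probs = [math.exp(-lam)]
for x in range(1, 120):             # the tail beyond x = 120 is negligible here
    probs.append(probs[-1] * lam / x)

mean = sum(x * p for x, p in enumerate(probs))
var = sum(x * x * p for x, p in enumerate(probs)) - mean**2
assert abs(mean - lam) < 1e-9 and abs(var - lam) < 1e-9   # E(X) = V(X) = lam
```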
2.6.15. (a) Here p(x) = p(1 − p)^x = pq^x, x = 0, 1, 2, ..., where q = 1 − p and x counts the failures before the first success (so the total number of trials is x + 1).
E(X) = Σ_{x=0}^∞ x p q^x = pq Σ_{x=0}^∞ x q^{x−1} = pq (d/dq) Σ_{x=0}^∞ q^x = pq (d/dq)[1/(1 − q)] = pq/(1 − q)^2 = q/p.
E(X^2) = Σ_{x=0}^∞ x^2 p q^x = pq (d/dq)[Σ_{x=0}^∞ x q^x] = pq (d/dq)[q/(1 − q)^2] = pq (1 + q)/(1 − q)^3 = q(1 + q)/p^2.
V(X) = E(X^2) − (E(X))^2 = q(1 + q)/p^2 − q^2/p^2 = q/p^2.
(b) M_x(t) = Σ_{x=0}^∞ e^{tx} p q^x = p Σ_{x=0}^∞ (qe^t)^x = p/(1 − qe^t) = p/(1 − (1 − p)e^t),
valid when qe^t < 1; taking logarithms, this is when t < −ln(1 − p).
2.6.17. E(X) = ∫_0^1 x (x^2/2) dx + ∫_1^2 x [(6x − 2x^2 − 3)/2] dx + ∫_2^3 x [(x − 3)^2/2] dx = 1/8 + 1 + 3/8 = 1.5
2.6.19. M_x(t) = ∫_0^∞ e^{tx} (1/2)e^{−x} dx + ∫_{−∞}^0 e^{tx} (1/2)e^{x} dx = −1/[(t + 1)(t − 1)] = 1/(1 − t^2), for |t| < 1.
2.6.21. M_y(t) = ∫_0^∞ e^{ty} αe^{−αy} dy = α ∫_0^∞ e^{(t−α)y} dy = α/(α − t), for t < α and 0 < α.
Since M_y(t) = M_x(t) = α/(α − t) and the mgf uniquely determines the distribution, X has the same distribution as Y, i.e.,
g(x) = αe^{−αx} for 0 < α and 0 < x, and 0 otherwise.
Chapter 3 Additional Topics in Probability

EXERCISES 3.2
3.2.1. (a) P(X = 7) = C(10, 7)(0.5)^7(0.5)^{10−7} = 120(0.5)^7(0.5)^3 = 0.117
(b) P(X ≤ 7) = 1 − P(8) − P(9) − P(10) = 1 − 0.044 − 0.010 − 0.001 = 0.945
(c) P(X > 0) = 1 − P(0) = 1 − 0.001 = 0.999
(d)
E(X) = 10(0.5) = 5 Var(X) = 10(0.5)(1 − 0.5) = 2.5
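The binomial values in 3.2.1 are quick to reproduce with math.comb; a sketch (Python):

```python
from math import comb

n, p = 10, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

print(round(pmf[7], 3))         # (a) 0.117
print(round(sum(pmf[:8]), 3))   # (b) P(X <= 7) = 0.945
print(round(1 - pmf[0], 3))     # (c) P(X > 0) = 0.999
print(n * p, n * p * (1 - p))   # (d) E(X) = 5.0, Var(X) = 2.5
```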
3.2.3. (a) P(Z > 1.645) = 0.05, so z0 = 1.645.
(b) P(Z < 1.175) = 0.88, so z0 = 1.175.
(c) P(Z < −1.28) = 0.10, so z0 = −1.28.
(d) P(Z > −1.645) = 0.95, so z0 = −1.645.
3.2.5. (a) P(X ≤ 20) = P(Z ≤ (20 − 10)/5) = P(Z ≤ 2) = 0.9772
(b) P(X > 5) = P(Z > (5 − 10)/5) = P(Z > −1) = 0.8413
(c) P(12 ≤ X ≤ 15) = P((12 − 10)/5 ≤ Z ≤ (15 − 10)/5) = P(0.4 ≤ Z ≤ 1) = P(Z ≤ 1) − P(Z < 0.4) = 0.8413 − 0.6554 = 0.1859
(d) P(|X − 12| ≤ 15) = P(−15 ≤ X − 12 ≤ 15) = P(−3 ≤ X ≤ 27) = P((−3 − 10)/5 ≤ Z ≤ (27 − 10)/5) = P(−2.6 ≤ Z ≤ 3.4) = P(Z ≤ 3.4) − P(Z < −2.6) = 0.9997 − 0.0047 = 0.9950
3.2.7. Let X = the number of people satisfied with their health coverage; then n = 15 and p = 0.7.
(a) P(X = 10) = C(15, 10)(0.7)^{10}(0.3)^5 = 3003(0.7)^{10}(0.3)^5 = 0.206. There is a 20.6% chance that exactly 10 people are satisfied with their health coverage.
(b) P(X ≤ 10) = 1 − P(X = 11) − P(X = 12) − P(X = 13) − P(X = 14) − P(X = 15) = 0.515. There is a 51.5% chance that no more than 10 people are satisfied with their health coverage.
(c) E(X) = np = 15(0.7) = 10.5
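The table look-ups in 3.2.5 can be reproduced with the error function, using Φ(z) = (1 + erf(z/√2))/2 for the standard normal cdf. A sketch (Python):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 10, 5
print(round(Phi((20 - mu) / sigma), 4))                           # (a) 0.9772
print(round(1 - Phi((5 - mu) / sigma), 4))                        # (b) 0.8413
print(round(Phi((15 - mu) / sigma) - Phi((12 - mu) / sigma), 4))  # (c) 0.1859
print(round(Phi((27 - mu) / sigma) - Phi((-3 - mu) / sigma), 4))  # (d) 0.995
```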
3.2.9. Let X = the number of defective tubes in a certain box of 400; then n = 400 and p = 3/100 = 0.03.
(a) P(X = r) = C(400, r)(0.03)^r (0.97)^{400−r}
(b) P(X ≥ k) = Σ_{i=k}^{400} C(400, i)(0.03)^i (0.97)^{400−i}
(c) P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.0000684
(d) Part (c) shows that the probability of at most one defective is 0.0000684, which is very small.
3.2.11. p(x) = e^{−λ}λ^x/x! ≥ 0, since λ > 0 and x ≥ 0. Also,
Σ_x p(x) = Σ_x e^{−λ}λ^x/x! = e^{−λ} Σ_x λ^x/x! = e^{−λ}(1 + λ/1! + λ^2/2! + ...) = e^{−λ} e^{λ} = 1,
where we apply the Taylor expansion of e^{λ}. This shows that p(x) ≥ 0 for all x and Σ_x p(x) = 1, so p(x) is a valid probability mass function.
3.2.13.
The probability density function is given by f(x) = 1/10 for 0 ≤ x ≤ 10 and 0 otherwise. Hence,
P(5 ≤ X ≤ 9) = ∫_5^9 (1/10) dx = 0.4.
Hence, there is a 40% chance that a piece chosen at random will be suitable for kitchen use.
3.2.15.
The probability density function is given by f(x) = 1/100 for 0 ≤ x ≤ 100 and 0 otherwise.
(a) P(60 ≤ X ≤ 80) = ∫_60^80 (1/100) dx = 0.2.
(b) P(X > 90) = ∫_90^100 (1/100) dx = 0.1.
(c) There is a 20% chance that the efficiency is between 60 and 80 units; there is a 10% chance that the efficiency is greater than 90 units.
3.2.17.
Let X = the failure time of the component; X follows an exponential distribution with rate 0.05, so the pdf of X is f(x) = 0.05e^{−0.05x}, x > 0. Hence,
R(10) = 1 − F(10) = 1 − ∫_0^10 0.05e^{−0.05x} dx = 1 − (1 − e^{−0.5}) = e^{−0.5} = 0.607.
3.2.19.
The uniform probability density function is given by f(x) = 1, 0 ≤ x ≤ 1. Hence,
P(0.5 ≤ X ≤ 0.65 | X ≤ 0.75) = P(0.5 ≤ X ≤ 0.65 and X ≤ 0.75)/P(X ≤ 0.75) = P(0.5 ≤ X ≤ 0.65)/P(X ≤ 0.75)
= (∫_0.5^0.65 1 dx)/(∫_0^0.75 1 dx) = 0.15/0.75 = 0.2.
3.2.21.
3.2.21.
First, find z0 such that P(Z > z0 ) = 0.15.
0
P(Z > 1.036) = 0.15, so z0 = 1.036. x0 = 72 + 1.036 · 6 = 78.22 The minimum score that a student has to get to an “A” grade is 78.22.
3.2.23. P(1.9 ≤ X ≤ 2.02) = P((1.9 − 1.96)/0.04 ≤ Z ≤ (2.02 − 1.96)/0.04) = P(−1.5 ≤ Z ≤ 1.5) = 0.866.
P(X < 1.9 or X > 2.02) = 1 − P(1.9 ≤ X ≤ 2.02) = 0.134, so 13.4% of the balls manufactured by the company are defective.
3.2.25. (a) P(X > 125) = P(Z > (125 − 115)/10) = P(Z > 1) = 0.16
(b) P(X < 95) = P(Z < (95 − 115)/10) = P(Z < −2) = 0.023
(c) First, find z0 such that P(Z < z0) = 0.95. P(Z < 1.645) = 0.95, so z0 = 1.645, and x0 = 115 + 1.645 × 10 = 131.45.
(d) There is a 16% chance that a child chosen at random has a systolic pressure greater than 125 mm Hg; there is a 2.3% chance of a systolic pressure less than 95 mm Hg; and 95% of this population have a systolic blood pressure below 131.45.
3.2.27.
First find z1, z2, and z3 such that P(Z > z1) = 0.2, P(Z > z2) = 0.5, and P(Z > z3) = 0.8. Using the standard normal table, z1 = −0.842, z2 = 0, and z3 = 0.842. Then y1 = 0 + (−0.842)(0.65) = −0.5473, so x1 = exp(y1) = 0.58; similarly, x2 = 1 and x3 = 1.73. For survival probabilities of 0.2, 0.5, and 0.8, the experimenter should choose doses 0.58, 1, and 1.73, respectively.
3.2.29.
(a) M_X(t) = E(e^{tX}) = ∫_0^∞ e^{tx} [1/(Γ(α)β^α)] x^{α−1} e^{−x/β} dx
= [1/(Γ(α)β^α)] ∫_0^∞ x^{α−1} exp(−[(1 − βt)/β] x) dx
= [1/(Γ(α)β^α)] ∫_0^∞ [βu/(1 − βt)]^{α−1} e^{−u} [β/(1 − βt)] du, letting u = [(1 − βt)/β] x with 1 − βt > 0,
= [1/(Γ(α)β^α)] [β/(1 − βt)]^α ∫_0^∞ u^{α−1} e^{−u} du
(note that the integrand is the kernel of the Gamma(α, 1) density)
= [1/(Γ(α)β^α)] [β/(1 − βt)]^α Γ(α) = (1 − βt)^{−α}, when t < 1/β.
(b) E(X) = M'_X(0) = −α(1 − βt)^{−α−1}(−β) |_{t=0} = αβ, and
E(X^2) = M''_X(0) = (d/dt)[αβ(1 − βt)^{−α−1}] |_{t=0} = α(α + 1)β^2 (1 − βt)^{−α−2} |_{t=0} = α(α + 1)β^2.
Then Var(X) = E(X^2) − E(X)^2 = α(α + 1)β^2 − (αβ)^2 = αβ^2.
3.2.31.
(a) First consider the product
Γ(α)Γ(β) = ∫_0^∞ u^{α−1} e^{−u} du ∫_0^∞ v^{β−1} e^{−v} dv
= (∫_0^∞ x^{2(α−1)} e^{−x^2} 2x dx)(∫_0^∞ y^{2(β−1)} e^{−y^2} 2y dy), by letting u = x^2 and v = y^2,
= (2 ∫_0^∞ |x|^{2α−1} e^{−x^2} dx)(2 ∫_0^∞ |y|^{2β−1} e^{−y^2} dy)
= (∫_{−∞}^∞ |x|^{2α−1} e^{−x^2} dx)(∫_{−∞}^∞ |y|^{2β−1} e^{−y^2} dy), noting that the integrands are even functions,
= ∫_{−∞}^∞ ∫_{−∞}^∞ |x|^{2α−1} |y|^{2β−1} e^{−(x^2 + y^2)} dx dy.
Transforming to polar coordinates with x = r cos θ and y = r sin θ,
Γ(α)Γ(β) = ∫_0^{2π} ∫_0^∞ |r cos θ|^{2α−1} |r sin θ|^{2β−1} e^{−r^2} r dr dθ
= (∫_0^∞ r^{2α+2β−2} e^{−r^2} r dr)(∫_0^{2π} |cos θ|^{2α−1} |sin θ|^{2β−1} dθ)
= ((1/2) ∫_0^∞ s^{α+β−1} e^{−s} ds)(4 ∫_0^{π/2} (cos θ)^{2α−1} (sin θ)^{2β−1} dθ), by letting s = r^2,
= Γ(α + β) × 2 ∫_0^{π/2} (cos θ)^{2α−1} (sin θ)^{2β−1} dθ
= Γ(α + β) × 2 ∫_0^1 t^{α−1/2} (1 − t)^{β−1/2} [1/(2√(t(1 − t)))] dt, by letting t = cos^2 θ,
= Γ(α + β) ∫_0^1 t^{α−1} (1 − t)^{β−1} dt = Γ(α + β) B(α, β).
Hence, we have shown that B(α, β) = Γ(α)Γ(β)/Γ(α + β).
(b) E(X) = [1/B(α, β)] ∫_0^1 x · x^{α−1}(1 − x)^{β−1} dx = [B(α + 1, β)/B(α, β)] ∫_0^1 x^{(α+1)−1}(1 − x)^{β−1}/B(α + 1, β) dx
= [B(α + 1, β)/B(α, β)] × 1 = [Γ(α + 1)Γ(β)/Γ(α + β + 1)] × [Γ(α + β)/(Γ(α)Γ(β))] = αΓ(α)Γ(α + β)/[(α + β)Γ(α + β)Γ(α)] = α/(α + β), and
E(X^2) = [1/B(α, β)] ∫_0^1 x^2 · x^{α−1}(1 − x)^{β−1} dx = B(α + 2, β)/B(α, β) = [Γ(α + 2)Γ(β)/Γ(α + β + 2)] × [Γ(α + β)/(Γ(α)Γ(β))] = α(α + 1)/[(α + β)(α + β + 1)].
Then Var(X) = E(X^2) − E(X)^2 = α(α + 1)/[(α + β)(α + β + 1)] − [α/(α + β)]^2 = αβ/[(α + β)^2(α + β + 1)].
3.2.33. In this case, the number of breakdowns per month can be assumed to have a Poisson distribution with mean 3.
(a) P(X = 1) = e^{−3} 3^1/1! = 0.1494. There is a 14.94% chance that there will be exactly one network breakdown during December.
(b) P(X ≥ 4) = 1 − P(0) − P(1) − P(2) − P(3) = 0.3528. There is a 35.28% chance that there will be at least 4 network breakdowns during December.
(c) P(X ≤ 7) = Σ_{x=0}^7 e^{−3} 3^x/x! = 0.9881. There is a 98.81% chance that there will be at most 7 network breakdowns during December.
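The Poisson probabilities in 3.2.33 can be reproduced directly from the pmf; a sketch (Python):

```python
from math import exp, factorial

def pois(x, lam=3):
    return exp(-lam) * lam**x / factorial(x)

print(round(pois(1), 4))                             # (a) 0.1494
print(round(1 - sum(pois(x) for x in range(4)), 4))  # (b) P(X >= 4) = 0.3528
print(round(sum(pois(x) for x in range(8)), 4))      # (c) P(X <= 7) = 0.9881
```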
3.2.35. (a) P(1 < X < 4) = ∫_1^4 [1/(Γ(2) × 1^2)] x^{2−1} e^{−x/1} dx = ∫_1^4 x e^{−x} dx
= [−x e^{−x}]_1^4 − ∫_1^4 (−e^{−x}) dx, by integration by parts,
= 0.6442.
The probability that an acid solution made by this procedure will satisfactorily etch a tray is 0.6442.
(b) P(1 < X < 4) = ∫_1^4 [1/(Γ(1) × 2)] x^{1−1} e^{−x/2} dx = ∫_1^4 (1/2) e^{−x/2} dx = [−e^{−x/2}]_1^4 = 0.4712.
The probability that an acid solution made by this procedure will satisfactorily etch a tray is 0.4712.
EXERCISES 3.3
3.3.1. (a) The joint probability function is
P(X = x, Y = y) = C(8, x) C(6, y) C(10, 4 − x − y)/C(24, 4), where 0 ≤ x ≤ 4, 0 ≤ y ≤ 4, and 0 ≤ x + y ≤ 4.
(b) P(X = 3, Y = 0) = C(8, 3) C(6, 0) C(10, 1)/C(24, 4) = 0.053.
(c) P(X < 3, Y = 1) = Σ_{x=0}^2 P(X = x, Y = 1) = Σ_{x=0}^2 C(8, x) C(6, 1) C(10, 3 − x)/C(24, 4) = 0.429.
(d)
x\y   0      1      2      3      4      Sum
0     0.020  0.068  0.064  0.019  0.001  0.172
1     0.090  0.203  0.113  0.015  0      0.421
2     0.119  0.158  0.040  0      0      0.317
3     0.053  0.032  0      0      0      0.085
4     0.007  0      0      0      0      0.007
Sum   0.289  0.461  0.217  0.034  0.001  1.00
3.3.3. ∫_{−1}^1 ∫_{−1}^1 f(x, y) dx dy = ∫_{−1}^1 ∫_{−1}^1 c(1 − x)(1 − y) dx dy = c (∫_{−1}^1 (1 − x) dx)(∫_{−1}^1 (1 − y) dy)
= c [x − x^2/2]_{−1}^1 [y − y^2/2]_{−1}^1 = 4c.
Thus, if c = 1/4, then ∫_{−1}^1 ∫_{−1}^1 f(x, y) dx dy = 1. We also see that f(x, y) ≥ 0 for all x and y. Hence, f(x, y) is a joint probability density function.
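The joint pmf in 3.3.1 and its table can be reproduced with math.comb; a sketch (Python):

```python
from math import comb

def pxy(x, y):
    # P(X = x, Y = y) for 4 items drawn from 8 + 6 + 10 = 24
    return comb(8, x) * comb(6, y) * comb(10, 4 - x - y) / comb(24, 4)

print(round(pxy(3, 0), 3))                         # (b) 0.053
print(round(sum(pxy(x, 1) for x in range(3)), 3))  # (c) 0.429

# The full table must sum to 1 over the feasible region x + y <= 4.
total = sum(pxy(x, y) for x in range(5) for y in range(5 - x))
assert abs(total - 1) < 1e-12
```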
3.3.7.
By definition, the marginal pdf of X is given by the row sums, and the marginal pdf of Y is obtained by the column sums. Hence, xi
−1
3
5
otherwise
fX (xi )
0.6
0.3
0.1
0
yi
−2
0
1
4
otherwise
fY (yi )
0.4
0.3
0.1
0.2
0
From Exercise 3.3.5 we can calculate the following. P(X = −1|Y = 0) =
3.3.9.
0.1 P(X = −1, Y = 0) = = 0.33. fY (0) 0.3
(a) The marginal of X is
f_X(x) = ∫_x^2 f(x, y) dy = ∫_x^2 (8/9)xy dy = (4/9)(4x − x^3), 1 ≤ x ≤ 2.
(b) P(1.5 < X < 1.75, Y > 1) = ∫_{1.5}^{1.75} (∫_x^2 (8/9)xy dy) dx = ∫_{1.5}^{1.75} (4/9)(4x − x^3) dx = (4/9)[2x^2 − x^4/4]_{1.5}^{1.75} = 0.2426.
3.3.11. Using the joint density in Exercise 3.3.9, the joint mgf of (X, Y) is
M_{(X,Y)}(t1, t2) = E(e^{t1 X + t2 Y}) = ∫_1^2 ∫_x^2 e^{t1 x + t2 y} (8/9)xy dy dx = (8/9) ∫_1^2 x e^{t1 x} (∫_x^2 e^{t2 y} y dy) dx
= (8/9) ∫_1^2 x e^{t1 x} [K − (x/t2) e^{t2 x} + (1/t2^2) e^{t2 x}] dx, where K = (e^{2t2}/t2^2)(2t2 − 1),
= (8/9) K ∫_1^2 x e^{t1 x} dx − [8/(9t2)] ∫_1^2 x^2 e^{(t1+t2)x} dx + [8/(9t2^2)] ∫_1^2 x e^{(t1+t2)x} dx.
Evaluating each integral by parts, e.g.,
∫_1^2 x e^{t1 x} dx = [(x/t1) e^{t1 x} − (1/t1^2) e^{t1 x}]_1^2 and
∫_1^2 x^2 e^{sx} dx = [(x^2/s) e^{sx} − (2x/s^2) e^{sx} + (2/s^3) e^{sx}]_1^2 with s = t1 + t2,
and simplifying, we obtain
M_{(X,Y)}(t1, t2) = [(t1 + 3t2 − t1^2 − 3t2^2 − 4t1t2 + t1^2 t2 + 2t1 t2^2 + t2^3)/(t2^2 (t1 + t2)^3)] e^{t1+t2}
+ [(2t2 − 1)(1 − t1)/(t1^2 t2^2)] e^{t1+2t2}
+ [(−t1 − 3t2 + 2t1^2 + 6t2^2 + 8t1t2 − 4t1^2 t2 − 8t1 t2^2 − 4t2^3)/(t2^2 (t1 + t2)^3)]
+ [(2t2 − 1)(2t1 − 1)/(t1^2 t2^2)] e^{2t1+2t2}
3.3.13. (a) f_X(x) = Σ_{y=1}^n f(x, y) = Σ_{y=1}^n [6xy/(n(n+1)(2n+1))]^2 = [36x^2/(n(n+1)(2n+1))^2] Σ_{y=1}^n y^2 = 6x^2/[n(n+1)(2n+1)], x = 1, 2, ..., n,
since Σ_{y=1}^n y^2 = n(n+1)(2n+1)/6. Similarly,
f_Y(y) = Σ_{x=1}^n f(x, y) = [36y^2/(n(n+1)(2n+1))^2] Σ_{x=1}^n x^2 = 6y^2/[n(n+1)(2n+1)], y = 1, 2, ..., n.
Given y = 1, 2, ..., n, we have
f(x|y) = f(x, y)/f_Y(y) = [36x^2 y^2/(n(n+1)(2n+1))^2]/[6y^2/(n(n+1)(2n+1))] = 6x^2/[n(n+1)(2n+1)], x = 1, 2, ..., n.
(b) Given x = 1, 2, ..., n, we have
f(y|x) = f(x, y)/f_X(x) = 6y^2/[n(n+1)(2n+1)], y = 1, 2, ..., n.
(a) E(XY ) = (b) E(X) = E(Y ) =
x,y
x,y
x,y
f (x, y) = fX (x)
xy · f (x, y) =
x · f (x, y) =
2 6xy 6y2 n(n + 1)(2n + 1) = , y = 1, 2, . . . , n. n(n + 1)(2n + 1) 6x2 n(n + 1)(2n + 1) 3
3
xy · f (x, y) =
x=1 y=1 3
3
x · f (x, y) =
5 , and 3
y · f (x, y) =
11 . 6
x=1 y=1
y · f (x, y) =
3
3
x=1 y=1
35 . 12
30 CHAPTER 3 Additional Topics in Probability
35 5 11 5 − · =− . 12 3 6 36 3
3 2
5 (c) Var(X) = [x − E(X)]2 · f (x, y) = x − 53 · f (x, y) = , and 9 x,y x=1 y=1 Then, Cov(X, Y ) = E(XY ) − E(X)E(Y ) =
Var(Y ) =
x,y
[y − E(Y )]2 · f (x, y) =
Then, ρXY = √ 3.3.17.
3
3
y− x=1 y=1
11 2 6
· f (x, y) =
23 . 36
Cov(X, Y ) −5/36 = √ = −0.233. (5/9)(23/36) Var(X)Var(Y )
3.3.17. Assume that a and c are nonzero. Then Cov(U, V) = Cov(aX + b, cY + d) = ac Cov(X, Y), Var(U) = Var(aX + b) = a^2 Var(X), and Var(V) = Var(cY + d) = c^2 Var(Y). Then
ρ_UV = Cov(U, V)/√(Var(U)Var(V)) = ac Cov(X, Y)/√(a^2 Var(X) c^2 Var(Y)) = [ac/|ac|] ρ_XY = ρ_XY if ac > 0, and −ρ_XY otherwise.
3.3.19.
We fist state the famous Cauchy–Schwarz inequality: |E(XY )| ≤ E(X2 )E(Y 2 ) and the equality holds if and only if there exists some constant α and β, not both zero, such that P(α |X|2 = β |Y |2 ) = 1. Now, consider
Cov(X, Y ) = 1 ⇔ |Cov(X, Y )| = Var(X)Var(Y ) |ρXY | = 1 ⇔ √ Var(X)Var(Y ) ⇔ |E(X − μX )(Y − μY )| = E(X − μX )2 E(Y − μY )2
By the Cauchy–Schwarz inequality we have P α |X − μX |2 = β |Y − μY |2 = 1
⇔ P(X − μX = K(Y − μY )) = 1 for some constant K ⇔ P(X = aY + b) = 1 for some constants a and b. 3.3.21.
(a) First, we compute the marginal densities:

f_X(x) = ∫_x^∞ f(x, y)dy = ∫_x^∞ e^{−y} dy = e^{−x}, x ≥ 0, and
f_Y(y) = ∫_0^y f(x, y)dx = ∫_0^y e^{−y} dx = ye^{−y}, y ≥ 0.

Student's Solutions Manual 31

For given y ≥ 0, we have the conditional density f(x|Y = y) = f(x, y)/f_Y(y) = e^{−y}/(ye^{−y}) = 1/y, 0 ≤ x ≤ y.

Then, (X|Y = y) follows Uniform(0, y). Thus, E(X|Y = y) = y/2.

(b) E(XY) = ∫∫ xy·f(x, y)dxdy = ∫_0^∞ ∫_0^y xye^{−y} dx dy = ∫_0^∞ (1/2)y³e^{−y} dy = 3,
E(X) = ∫ x·f_X(x)dx = ∫_0^∞ xe^{−x} dx = 1, and
E(Y) = ∫ y·f_Y(y)dy = ∫_0^∞ y²e^{−y} dy = 2.
Then, Cov(X, Y) = E(XY) − E(X)E(Y) = 3 − 1·2 = 1.
(c) To check for independence of X and Y: f_X(1)f_Y(1) = e^{−2} ≠ e^{−1} = f(1, 1).
Hence, X and Y are not independent.
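The integrals in (b) can be verified numerically; a rough midpoint-rule sketch over a truncated domain (the grid sizes and the cutoff y ≤ 40 are arbitrary choices that make the exponential tail negligible):

```python
from math import exp

# Numerically check E(XY) = 3, E(X) = 1, E(Y) = 2 for f(x, y) = e^{-y}, 0 < x < y.
def integrate(g, ny=2000, nx=100, ymax=40.0):
    total, dy = 0.0, ymax / ny
    for i in range(ny):
        y = (i + 0.5) * dy
        dx = y / nx
        for j in range(nx):
            x = (j + 0.5) * dx
            total += g(x, y) * exp(-y) * dx * dy
    return total

print(round(integrate(lambda x, y: x * y), 2))  # ≈ 3.0
print(round(integrate(lambda x, y: x), 2))      # ≈ 1.0
print(round(integrate(lambda x, y: y), 2))      # ≈ 2.0
```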
3.3.23.
Let σ² = Var(X) = Var(Y). Since X and Y are independent, we have Cov(X, Y) = E(XY) − E(X)E(Y) = E(X)E(Y) − E(X)E(Y) = 0. Then, Cov(X, aX + Y) = a Cov(X, X) + Cov(X, Y) = a Var(X) = aσ², and Var(aX + Y) = a² Var(X) + Var(Y) = (a² + 1)σ². Thus,

ρ_{X, aX+Y} = Cov(X, aX + Y)/√(Var(X)Var(aX + Y)) = aσ²/√(σ²(a² + 1)σ²) = a/√(a² + 1).
EXERCISES 3.4
3.4.1.
The pdf of X is f_X(x) = 1/a if 0 < x < a and zero otherwise. Then (assuming c > 0)

F_Y(y) = P(Y ≤ y) = P(cX + d ≤ y) = P(X ≤ (y − d)/c) = ∫_{−∞}^{(y−d)/c} f_X(x)dx
       = 1, if y ≥ ac + d; (y − d)/(ac), if d < y < ac + d; 0, otherwise.

Then, f_Y(y) = dF_Y(y)/dy = 1/(ac) if d < y < ac + d, and zero otherwise.

3.4.3.
Let U = XY and V = Y. Then X = U/V and Y = V, and the Jacobian is

J = det[ ∂x/∂u  ∂x/∂v ; ∂y/∂u  ∂y/∂v ] = det[ 1/v  −u/v² ; 0  1 ] = 1/v.

Then the joint pdf of U and V is given by

f_{U,V}(u, v) = f_{X,Y}(u/v, v)·|J| = (1/|v|) f_{X,Y}(u/v, v).

Then the pdf of U is given by

f_U(u) = ∫_{−∞}^{∞} f_{U,V}(u, v)dv = ∫_{−∞}^{∞} (1/|v|) f_{X,Y}(u/v, v)dv.
3.4.5.
The joint pdf of (X, Y) is

f_{X,Y}(x, y) = (1/(2πσ₁σ₂)) exp(−x²/(2σ₁²) − y²/(2σ₂²)), −∞ < x < ∞, −∞ < y < ∞; σ₁, σ₂ > 0.

We can easily show that the marginal densities are

f_X(x) = ∫_{−∞}^{∞} f(x, y)dy = (1/(√(2π)σ₁)) e^{−x²/(2σ₁²)}, −∞ < x < ∞, and
f_Y(y) = ∫_{−∞}^{∞} f(x, y)dx = (1/(√(2π)σ₂)) e^{−y²/(2σ₂²)}, −∞ < y < ∞.

This implies that X ∼ N(0, σ₁²) and Y ∼ N(0, σ₂²). Also, notice that f_{X,Y}(x, y) = f_X(x)f_Y(y) for all x and y; thus X and Y are independent. By the definition of the chi-square distribution and the independence of X and Y, we know that X²/σ₁² + Y²/σ₂² ∼ χ²(2). Therefore, the pdf of U = X²/σ₁² + Y²/σ₂² is

f_U(u) = (1/2)e^{−u/2}, u > 0.

3.4.7.
Let U = X + Y and V = Y. Then X = U − V and Y = V, and the Jacobian is

J = det[ ∂x/∂u  ∂x/∂v ; ∂y/∂u  ∂y/∂v ] = det[ 1  −1 ; 0  1 ] = 1.

Then the joint pdf of U and V is given by f_{U,V}(u, v) = f_{X,Y}(u − v, v)·|J| = f_{X,Y}(u − v, v).

Thus, the pdf of U is given by f_U(u) = ∫_{−∞}^{∞} f_{X,Y}(u − v, v)dv.
3.4.9.
(a) Here let g(x) = (x − μ)/σ, and hence g⁻¹(z) = σz + μ. Thus, (d/dz)g⁻¹(z) = σ. Also, f_X(x) = (1/(√(2π)σ)) e^{−(1/2)((x−μ)/σ)²}, −∞ < x < ∞. Therefore, the pdf of Z is

f_Z(z) = f_X(g⁻¹(z)) |(d/dz)g⁻¹(z)| = (1/√(2π)) e^{−z²/2}, −∞ < z < ∞,

which is the pdf of N(0, 1).

(b) The cdf of U is given by

F_U(u) = P(U ≤ u) = P((X − μ)²/σ² ≤ u) = P(−√u ≤ (X − μ)/σ ≤ √u) = P(−√u ≤ Z ≤ √u)
       = ∫_{−√u}^{√u} (1/√(2π)) e^{−z²/2} dz = 2∫_0^{√u} (1/√(2π)) e^{−z²/2} dz, since the integrand is an even function.

Hence, the pdf of U is

f_U(u) = (d/du)F_U(u) = (2/√(2π)) e^{−u/2}·(1/(2√u)) = (1/√(2π)) u^{−1/2} e^{−u/2}, u > 0,

and zero otherwise, which is the pdf of χ²(1).
3.4.11.
Since the support of the pdf of V is v > 0, g(v) = (1/2)mv² is a one-to-one function on the support. Hence, g⁻¹(y) = √(2y/m). Thus, (d/dy)g⁻¹(y) = 1/√(2my). Therefore, the pdf of E is given by

f(y) = f_V(g⁻¹(y)) |(d/dy)g⁻¹(y)| = c (2y/m) e^{−2βy/m}·(1/√(2my)) = (c√(2y)/√(m³)) e^{−2βy/m}, y > 0.

3.4.13.
Let U = √(X² + Y²) and V = tan⁻¹(Y/X). Here U is the radius and V the angle, so this is a polar transformation and hence one-to-one. Then X = U cos V and Y = U sin V, and

J = det[ ∂x/∂u  ∂x/∂v ; ∂y/∂u  ∂y/∂v ] = det[ cos v  −u sin v ; sin v  u cos v ] = u cos²v + u sin²v = u.

Then the joint pdf of U and V is given by

f_{U,V}(u, v) = f_{X,Y}(u cos v, u sin v)·|J| = (1/(2πσ²)) exp(−(u² cos²v + u² sin²v)/(2σ²))·u
             = (u/(2πσ²)) e^{−u²/(2σ²)}, u > 0, 0 ≤ v < 2π.

3.4.15.
The joint pdf of (X, Y) is f_{X,Y}(x, y) = f_X(x)f_Y(y) = (1/4)e^{−(x+y)/2}, x, y > 0.

Apply the result in Exercise 3.4.14 with β = 2. We have the joint pdf of U = (X − Y)/2 and V = Y as f_{U,V}(u, v) = (1/2)e^{−(u+v)}, v > −2u, v > 0. Thus, the pdf of U is given by

f_U(u) = ∫ f_{U,V}(u, v)dv
       = ∫_0^∞ (1/2)e^{−(u+v)} dv = (1/2)e^{−u}, if u ≥ 0,
       = ∫_{−2u}^∞ (1/2)e^{−(u+v)} dv = (1/2)e^{u}, if u < 0.
EXERCISES 3.5
3.5.1.
(a) Note that X follows Beta(5, 5); applying the result in Exercise 3.2.31 we have μ = 1/2 and σ² = 1/44. From Chebyshev's theorem,

P(μ − Kσ < X < μ + Kσ) ≥ 1 − 1/K².

Equating μ − Kσ to 0.2 and μ + Kσ to 0.8 with μ = 1/2 and σ = √(1/44) = 0.15, we obtain K = 2. Hence, P(0.2 < X < 0.8) ≥ 1 − 1/2² = 0.75.
(b) P(0.2 < X < 0.8) = ∫_{0.2}^{0.8} 630x⁴(1 − x)⁴ dx = 0.961.
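The exact probability in (b) can be reproduced by integrating the Beta(5, 5) density numerically; a small stdlib sketch (the grid size is an arbitrary choice):

```python
# Exact P(0.2 < X < 0.8) for X ~ Beta(5, 5), versus the Chebyshev lower bound 0.75.
def beta55_pdf(x):
    return 630 * x**4 * (1 - x)**4

n = 10_000
a, b = 0.2, 0.8
h = (b - a) / n
exact = sum(beta55_pdf(a + (i + 0.5) * h) for i in range(n)) * h  # midpoint rule

print(round(exact, 3))   # 0.961 — well above the Chebyshev bound of 0.75
```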
3.5.3.
σ² = E(X − μ)² = Σ_x (x − μ)² f(x)
   = Σ_{x ≤ μ−Kσ} (x − μ)² f(x) + Σ_{μ−Kσ < x < μ+Kσ} (x − μ)² f(x) + Σ_{x ≥ μ+Kσ} (x − μ)² f(x)
   ≥ Σ_{x ≤ μ−Kσ} (x − μ)² f(x) + Σ_{x ≥ μ+Kσ} (x − μ)² f(x).

Note that (x − μ)² ≥ K²σ² for x ≤ μ − Kσ or x ≥ μ + Kσ. Then the above inequality can be written as

σ² ≥ K²σ² [ Σ_{x ≤ μ−Kσ} f(x) + Σ_{x ≥ μ+Kσ} f(x) ]
   = K²σ² [P(X ≤ μ − Kσ) + P(X ≥ μ + Kσ)] = K²σ² P(|X − μ| ≥ Kσ).

This implies that P(|X − μ| ≥ Kσ) ≤ 1/K², or P(|X − μ| < Kσ) ≥ 1 − 1/K².
3.5.5.
Applying Chebyshev's theorem we have

P(|X_n/n − p| < 0.1) = P(|X_n − np| < 0.1n) ≥ 1 − E(X_n − np)²/(0.1n)²
   = 1 − Var(X_n)/(0.01n²) = 1 − np(1 − p)/(0.01n²)
   = 1 − 100 p(1 − p)/n ≥ 1 − (100/n)(1/4), since p(1 − p) ≤ 1/4.

This implies that we want to find n such that

1 − (100/n)(1/4) = 0.9 ⇒ 25/n = 0.1 ⇒ n = 250.

3.5.7.
Let X₁, . . . , X_n denote the tosses of the coin, with value 1 if a head occurs and 0 otherwise. Then X₁, . . . , X_n are independent variables following the Bernoulli distribution with p = 1/2. Thus, E(X_i) = 1/2 and Var(X_i) = 1/4. For any ε > 0, the law of large numbers gives

P(|S_n/n − 1/2| < ε) → 1 as n → ∞, i.e., S_n/n will be near 1/2 for large n.

If the coin is not fair, then the fraction of heads, S_n/n, will be near the true probability of getting a head for large n.

3.5.9.
Note that E(X_i) = 0 and Var(X_i) = i. Hence, X₁, . . . , X_n are not identically distributed, so the conditions of the law of large numbers stated in the text are not satisfied. There is a weaker version of the weak law of large numbers which requires only E(X_i) = μ and Var(X̄) → 0 as n → ∞. However, in this case

Var(X̄) = Var((1/n)Σ_{i=1}^{n} X_i) = (1/n²)Σ_{i=1}^{n} Var(X_i) = (1/n²)Σ_{i=1}^{n} i = n(n+1)/(2n²) → 1/2 as n → ∞.

Therefore, the conditions of the weaker version are not satisfied either.

3.5.11.
First note that E(X_i) = 2 and Var(X_i) = 2. From the CLT, (X̄₁₀₀ − 2)/(√2/√100) follows approximately N(0, 1). Hence, we have

P(X̄₁₀₀ > 2) = P((X̄₁₀₀ − 2)/(√2/√100) > (2 − 2)/(√2/√100)) ≈ P(Z > 0) = 0.5, where Z ∼ N(0, 1).

3.5.13.
First note that E(X_i) = 1/2 and Var(X_i) = 1/12. Then, by the CLT, Z_n = (X̄ − 1/2)/(√(1/12)/√n) = (S_n − n/2)/√(n/12) approximately follows N(0, 1) for large n.
3.5.15.
From Chebyshev's theorem, P(μ − Kσ < X < μ + Kσ) ≥ 1 − 1/K². Equating μ − Kσ to 104 and μ + Kσ to 140 with μ = 122 and σ = 2, we obtain K = 9. Hence,

P(104 < X < 140) ≥ 1 − 1/9² = 0.988.
3.5.17.
Let X_i = 1 if the i-th person in the sample is color blind and 0 otherwise. Then each X_i follows a Bernoulli distribution with estimated probability 0.02, so E(X_i) = 0.02 and Var(X_i) = (0.02)(0.98) = 0.0196. Let S_n = Σ_{i=1}^{n} X_i. We want P(S_n ≥ 1) = 0.99. By the CLT, (S_n/n − 0.02)/√(0.0196/n) follows approximately N(0, 1). Then,

0.99 = P(S_n ≥ 1) ≈ P(Z ≥ (1/n − 0.02)/√(0.0196/n)).

Using the normal table, (1/n − 0.02)/√(0.0196/n) = −2.33. Solving this equation, we have n = 359.05. Thus, the sample size must be at least 360.

3.5.19.
Let X₁, . . . , X₁₀₀ be iid with μ = 1 and σ² = 0.04, and let X̄ = (1/100)Σ_{i=1}^{100} X_i. By the CLT, (X̄ − 1)/√(0.04/100) follows approximately N(0, 1). Then,

P(0.99 ≤ X̄ ≤ 1) ≈ P((0.99 − 1)/√(0.04/100) ≤ Z ≤ (1 − 1)/√(0.04/100)) = P(−0.5 ≤ Z ≤ 0) = 0.1915.
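The sample-size equation in 3.5.17 can also be solved numerically; a bisection sketch, assuming the z = −2.33 table value used in that solution (the bracket endpoints are arbitrary):

```python
from math import sqrt

def g(n, p=0.02, z=-2.33):
    # Equation from 3.5.17: (1/n - p)/sqrt(p(1-p)/n) - z; decreasing in n.
    return (1 / n - p) / sqrt(p * (1 - p) / n) - z

lo, hi = 10.0, 2000.0
for _ in range(60):                  # bisect for the root of g(n) = 0
    mid = (lo + hi) / 2
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid

print(round(lo, 2))   # ≈ 359.05, so the sample size must be at least 360
```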
3.5.21.
Let X_i = 1 if the i-th dropper in the sample is defective and 0 otherwise. Then each X_i follows a Bernoulli distribution with estimated probability 500/10000 = 0.05, so E(X_i) = 0.05 and Var(X_i) = (0.05)(0.95) = 0.0475. Let S₁₂₅ = Σ_{i=1}^{125} X_i. From the CLT, (S₁₂₅ − 125·0.05)/√(0.0475·125) follows approximately N(0, 1). Hence, we have

P(S₁₂₅ ≤ 2) ≈ P(Z ≤ (2 − 125·0.05)/√(0.0475·125)) = P(Z ≤ −1.744) = 0.041.
Chapter 4
Sampling Distributions

EXERCISES 4.1
4.1.1.
(a) There are C(5, 3) = 10 equally likely possible samples of size 3 drawn without replacement, so each has probability 1/10:

Sample         X̄      M     S
(−2, −1, 0)   −1     −1    1
(−2, −1, 1)   −2/3   −1    √(7/3)
(−2, −1, 2)   −1/3   −1    √(13/3)
(−2, 0, 1)    −1/3    0    √(7/3)
(−2, 0, 2)     0      0    2
(−2, 1, 2)     1/3    1    √(13/3)
(−1, 0, 1)     0      0    1
(−1, 0, 2)     1/3    0    √(7/3)
(−1, 1, 2)     2/3    1    √(7/3)
(0, 1, 2)      1      1    1

(i)   X̄:     −1     −2/3   −1/3    0      1/3    2/3    1
      p(X̄):  1/10   1/10   2/10   2/10   2/10   1/10   1/10

(ii)  M:     −1     0      1
      p(M):  3/10   4/10   3/10

(iii) S:     1      √(7/3)  √(13/3)  2
      p(S):  3/10   4/10    2/10     1/10

38 CHAPTER 4 Sampling Distributions

(iv) E(X̄) = (−1)(1/10) + (−2/3)(1/10) + (−1/3)(2/10) + (0)(2/10) + (1/3)(2/10) + (2/3)(1/10) + (1)(1/10) = 0,
E(X̄²) = (−1)²(1/10) + (−2/3)²(1/10) + (−1/3)²(2/10) + (0)²(2/10) + (1/3)²(2/10) + (2/3)²(1/10) + (1)²(1/10) = 1/3,
Var(X̄) = E(X̄²) − [E(X̄)]² = 1/3 − 0² = 1/3.

(b) With replacement we can get 5³ = 125 samples of size 3.
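The sampling distribution in (i) and the moments in (iv) can be checked by enumerating all C(5, 3) = 10 samples; a short sketch with exact fractions:

```python
from itertools import combinations
from fractions import Fraction
from collections import Counter

population = [-2, -1, 0, 1, 2]
samples = list(combinations(population, 3))        # the 10 equally likely samples
means = Counter(Fraction(sum(s), 3) for s in samples)

print(len(samples))                                # 10
print(means[Fraction(-1, 3)])                      # 2 samples have mean -1/3
mean_of_means = sum(m * c for m, c in means.items()) / len(samples)
var_of_means = sum((m - mean_of_means) ** 2 * c for m, c in means.items()) / len(samples)
print(mean_of_means, var_of_means)                 # 0 1/3
```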
4.1.3.
Population: {1, 2, 3}, p(x) = 1/3 for x ∈ {1, 2, 3}.

(a) μ = (1/N)Σ_{i=1}^{N} c_i = 2, σ² = (1/N)Σ_{i=1}^{N} (c_i − μ)² = 2/3.

(b) Sample       X̄      Sample       X̄      Sample       X̄
(1, 1, 1)     1      (2, 1, 1)    4/3    (3, 1, 1)    5/3
(1, 1, 2)     4/3    (2, 1, 2)    5/3    (3, 1, 2)    2
(1, 1, 3)     5/3    (2, 1, 3)    2      (3, 1, 3)    7/3
(1, 2, 1)     4/3    (2, 2, 1)    5/3    (3, 2, 1)    2
(1, 2, 2)     5/3    (2, 2, 2)    2      (3, 2, 2)    7/3
(1, 2, 3)     2      (2, 2, 3)    7/3    (3, 2, 3)    8/3
(1, 3, 1)     5/3    (2, 3, 1)    2      (3, 3, 1)    7/3
(1, 3, 2)     2      (2, 3, 2)    7/3    (3, 3, 2)    8/3
(1, 3, 3)     7/3    (2, 3, 3)    8/3    (3, 3, 3)    3

X̄:     1      4/3    5/3    2      7/3    8/3    3
p(X̄):  1/27   1/9    2/9    7/27   2/9    1/9    1/27

(c) E(X̄) = Σ x̄·p(x̄) = 2, E(X̄²) = Σ x̄²·p(x̄) = 38/9, so Var(X̄) = 38/9 − 2² = 2/9.
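Part (b)'s 27 equally likely samples and the moments in (c) can be enumerated directly; a quick sketch:

```python
from itertools import product
from fractions import Fraction
from collections import Counter

samples = list(product([1, 2, 3], repeat=3))       # 27 equally likely samples
dist = Counter(Fraction(sum(s), 3) for s in samples)

print(len(samples))                  # 27
print(dist[Fraction(2)])             # 7 samples have mean 2
EX = sum(m * c for m, c in dist.items()) / 27
EX2 = sum(m * m * c for m, c in dist.items()) / 27
print(EX, EX2 - EX ** 2)             # 2 2/9  (matches sigma^2/n = (2/3)/3)
```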
4.1.5.
Since Σ_i (x_i − x̄)² = Σ_i x_i² − nx̄², we have E(S′²) = (1/n)[Σ_i E(X_i²) − nE(X̄²)]. Assuming sampling from a population with mean μ and variance σ², we have

E(S′²) = (1/n)[n(σ² + μ²) − n(σ²/n + μ²)] = σ² − σ²/n = ((n − 1)/n)σ² < σ² = E(S²).
4.1.7.
Let X be the weight of sugar, X ∼ N(μ = 5 lb, σ = 0.2 lb). Then X̄ = (1/n)Σ_{i=1}^{n} X_i is the mean weight, where n = 15. By Corollary 4.2.2, E(X̄) = μ and Var(X̄) = σ²/n. Then X̄ ∼ N(5, 0.2²/15), and (X̄ − 5)/√(0.2²/15) = Z ∼ N(0, 1²). Therefore, the probability requested is

P(−0.2 < X̄ − 5 < 0.2) = P(−√15 < Z < √15) = 0.9999.
4.1.9.
Let X be the height, X ∼ N(μ = 66, σ² = 2²), so for a sample of n = 26, (X̄ − 66)/(2/√26) = Z ∼ N(0, 1²). Then, P(X̄ > 70) = P(Z > 2√26) ≈ 0.
4.1.11.
Let X be the time, X ∼ N(μ = 95, σ² = 10²). Then (X − 95)/10 = Z ∼ N(0, 1²). Therefore, P(X < 85) = P(Z < (85 − 95)/10) = P(Z < −1) = 0.1587, so 15.87% of measurement times will fall below 85 seconds.
4.1.13.
According to the information, μ = 215 and σ = 35.
(a) If n = 55, we can assume X̄ ∼ N(μ, σ²/n); then P(X̄ > 230) = P(Z > (230 − 215)/(35/√55)) = 0.0007.
(b) If n = 200, P(X̄ > 230) = P(Z > (230 − 215)/(35/√200)) ≈ 0.
(c) If n = 35, P(X̄ > 230) = P(Z > (230 − 215)/(35/√35)) = 0.0056.
(d) Increasing the sample size decreases the probability.
4.1.15.
Let T be the temperature. Since n = 60, we assume T ∼ N(98.6, 0.95²). Then T̄ ∼ N(98.6, 0.95²/60). Therefore, P(T̄ ≥ 99.1) = P(Z ≥ (99.1 − 98.6)/(0.95/√60)) ≈ 0.
EXERCISES 4.2
4.2.1.
We have that Y ∼ χ²(15).
(a) From a table, P(Y ≤ 6.26) = 0.025. Then y₀ = 6.26.
(b) Choosing upper and lower tail areas of 0.025, and since P(Y ≤ 27.5) = 0.975, we have P(a < Y < b) = 0.95 with b = χ²₀.₉₇₅,₁₅ = 27.5 and a = χ²₀.₀₂₅,₁₅ = 6.26.
(c) P(Y ≥ 22.307) = 1 − P(Y < 22.307) = 0.10.
4.2.3.
Recall that χ²(v) ≡ Gamma(k = v/2, θ = 2). In our case T ∼ Gamma(1, 2); then T̄ = (1/n)Σ_{i=1}^{n} T_i ∼ Gamma(n, 2/n).
(a) With n = 3, T̄ ∼ Gamma(3, 2/3).
(b) P(T̄ > 2) = 0.4232.
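P(T̄ > 2) in (b) follows from the closed-form Erlang tail (valid for integer shape); a minimal sketch:

```python
from math import exp

# For integer shape k, the Gamma(k, θ) survival function has the closed form
# P(X > t) = e^{-t/θ} * sum_{j=0}^{k-1} (t/θ)^j / j!  (Erlang tail).
def gamma_sf(t, k, theta):
    r = t / theta
    term, total = 1.0, 0.0
    for j in range(k):
        total += term
        term *= r / (j + 1)
    return exp(-r) * total

print(round(gamma_sf(2, 3, 2 / 3), 4))   # 0.4232
```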
4.2.5.
Since X₁, X₂, . . . , X₅ are i.i.d. N(55, 223), Y = Σ_{i=1}^{5} (X_i − 55)²/223 ∼ χ²(5).
(a) Since Z = Y − n(X̄ − 55)²/223, and n(X̄ − 55)²/223 ∼ χ²(1), Z is chi-square distributed with 4 degrees of freedom and Y is chi-square distributed with 5 degrees of freedom.
(b) Yes.
(c) (i) P(0.62 ≤ Y ≤ 0.76) = 0.0075. (ii) P(0.77 ≤ Z ≤ 0.95) = 0.0251.
4.2.7.
Since the random sample comes from a normal distribution, (n − 1)S²/σ² ∼ χ²(n−1). Setting the upper and lower tail areas equal to 0.05 (though this is not the only choice), and using a chi-square table with n − 1 = 14 degrees of freedom, we have (n − 1)b/σ² = χ²₀.₉₅,₁₄ = 23.68 and (n − 1)a/σ² = χ²₀.₀₅,₁₄ = 6.57. Then, with σ = 1.41, b = 3.36 and a = 0.93.
4.2.9.
Since T ∼ t₈:
(a) P(T ≤ 2.896) = 0.99.
(b) P(T ≤ −1.860) = 0.05.
(c) Since the t-distribution is symmetric, we find a such that P(T > a) = 0.01/2. Then a = 3.355.

4.2.11.
According to the information, μ = 11.4, n = 20, ȳ = 11.5, and s = 2, so t = (ȳ − μ)/(s/√n) = 0.224. The degrees of freedom are n − 1 = 19, so the critical value is 1.328 at the α = 0.10 level. Then, the data tend to agree with the psychologist's claims.

4.2.13.
If X ∼ χ²(v), then X ∼ Gamma(α = v/2, β = 2), so E(X) = (v/2)(2) = v and Var(X) = (v/2)(2)² = 2v.

4.2.15.
If X₁, X₂, . . . , X_n is from N(μ, σ²), then, by Theorem 4.2.8, (n − 1)S²/σ² is from χ²(n−1); then, by Exercise 4.2.13, Var((n − 1)S²/σ²) = 2(n − 1). Since Var(aX) = a²Var(X), this gives (n − 1)²Var(S²)/σ⁴ = 2(n − 1). Simplifying after multiplying by σ⁴/(n − 1)², we obtain Var(S²) = 2σ⁴/(n − 1).
4.2.17.
If X and Y are independent random variables from an exponential distribution with common parameter θ = 1, then using Exercise 4.2.16 with n = 1, 2X ∼ χ²(2) and 2Y ∼ χ²(2). Then

X/Y = (2X/2)/(2Y/2) ∼ F(2, 2).
4.2.19.
If X ∼ F(9, 12):
(a) P(X ≤ 3.87) = 0.9838.
(b) P(X ≤ 0.196) = 0.01006.
(c) We want a and b with P(X < a) = P(X > b) = 0.025. From a table, b = F₀.₉₇₅(9, 12) = 3.4358. For a, note that 0.025 = P(X < a) = P(1/X > 1/a), where 1/X ∼ F(12, 9); hence 1/a = F₀.₉₇₅(12, 9) = 3.8682 and a = 1/3.8682 = 0.258518. Thus, a = 0.2585, b = 3.4358.
4.2.21.
If X ∼ F(n₁, n₂), the pdf is given by

f(x) = [Γ((n₁+n₂)/2)/(Γ(n₁/2)Γ(n₂/2))] (n₁/n₂)^{n₁/2} x^{n₁/2 − 1} (1 + (n₁/n₂)x)^{−(n₁+n₂)/2}, 0 < x < ∞,

and zero otherwise. Then

E(X) = [Γ((n₁+n₂)/2)/(Γ(n₁/2)Γ(n₂/2))] (n₁/n₂)^{n₁/2} ∫₀^∞ x^{n₁/2} (1 + (n₁/n₂)x)^{−(n₁+n₂)/2} dx.

Let y = 1 − (1 + (n₁/n₂)x)⁻¹; then x = (n₂/n₁)y(1 − y)⁻¹, dx = (n₂/n₁)(1 − y)⁻² dy, and y → 1 as x → ∞. Then

E(X) = [Γ((n₁+n₂)/2)/(Γ(n₁/2)Γ(n₂/2))] (n₂/n₁) ∫₀¹ y^{n₁/2} (1 − y)^{n₂/2 − 2} dy, which converges for n₂ > 2.

For α > 0, β > 0, ∫₀¹ y^{α−1}(1 − y)^{β−1} dy = Γ(α)Γ(β)/Γ(α + β), where Γ(α) = ∫₀^∞ x^{α−1}e^{−x} dx, with the property Γ(α) = (α − 1)Γ(α − 1). Then

E(X) = [Γ((n₁+n₂)/2)/(Γ(n₁/2)Γ(n₂/2))] (n₂/n₁)·Γ(n₁/2 + 1)Γ(n₂/2 − 1)/Γ((n₁+n₂)/2)
     = (n₂/n₁)·(n₁/2)Γ(n₁/2)Γ(n₂/2 − 1)/[Γ(n₁/2)(n₂/2 − 1)Γ(n₂/2 − 1)]
     = n₂/(n₂ − 2), n₂ > 2.

Similarly,

E(X²) = [Γ((n₁+n₂)/2)/(Γ(n₁/2)Γ(n₂/2))] (n₂/n₁)² ∫₀¹ y^{n₁/2 + 1} (1 − y)^{n₂/2 − 3} dy, which converges for n₂ > 4,
      = (n₂/n₁)²·(n₁/2 + 1)(n₁/2)Γ(n₁/2)Γ(n₂/2 − 2)/[Γ(n₁/2)(n₂/2 − 1)(n₂/2 − 2)Γ(n₂/2 − 2)]
      = n₂²(n₁ + 2)/[n₁(n₂ − 2)(n₂ − 4)], n₂ > 4.

Now, Var(X) = E(X²) − (E(X))². Therefore,

E(X) = n₂/(n₂ − 2), n₂ > 2, and Var(X) = n₂²(2n₁ + 2n₂ − 4)/[n₁(n₂ − 2)²(n₂ − 4)], n₂ > 4.
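The mean and variance formulas can be checked by numerically integrating the F pdf derived above, here for (n₁, n₂) = (9, 12); a midpoint-rule sketch (the step size and truncation point are arbitrary choices — the tail beyond x = 400 is negligible):

```python
from math import gamma

n1, n2 = 9, 12
c = gamma((n1 + n2) / 2) / (gamma(n1 / 2) * gamma(n2 / 2)) * (n1 / n2) ** (n1 / 2)

def f_pdf(x):
    return c * x ** (n1 / 2 - 1) * (1 + n1 * x / n2) ** (-(n1 + n2) / 2)

h = 0.002
m0 = m1 = m2 = 0.0
for i in range(200_000):             # integrate on (0, 400)
    x = (i + 0.5) * h
    p = f_pdf(x) * h
    m0 += p
    m1 += x * p
    m2 += x * x * p

print(round(m0, 3))           # ≈ 1.0   (the pdf integrates to one)
print(round(m1, 3))           # ≈ 1.2   = n2/(n2 - 2)
print(round(m2 - m1**2, 2))   # ≈ 0.76  = n2^2(2n1 + 2n2 - 4)/(n1(n2-2)^2(n2-4))
```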
4.2.23.
If X₁, X₂, . . . , X_{n₁} is a random sample from a normal population with mean μ₁ and variance σ², and Y₁, Y₂, . . . , Y_{n₂} is a random sample from an independent normal population with mean μ₂ and variance σ², then X̄ ∼ N(μ₁, σ²/n₁), Ȳ ∼ N(μ₂, σ²/n₂), (n₁ − 1)S₁²/σ² ∼ χ²(n₁−1), and (n₂ − 1)S₂²/σ² ∼ χ²(n₂−1).

Then X̄ − Ȳ ∼ N(μ₁ − μ₂, σ²/n₁ + σ²/n₂) and (n₁ − 1)S₁²/σ² + (n₂ − 1)S₂²/σ² ∼ χ²(n₁+n₂−2), so

[X̄ − Ȳ − (μ₁ − μ₂)]/√(σ²/n₁ + σ²/n₂) ∼ N(0, 1²).

Then, since the samples are independent, we have by definition that

{[X̄ − Ȳ − (μ₁ − μ₂)]/√(σ²/n₁ + σ²/n₂)} / √{[(n₁ − 1)S₁²/σ² + (n₂ − 1)S₂²/σ²]/(n₁ + n₂ − 2)} ∼ t(n₁+n₂−2).

This after simplification becomes:

[X̄ − Ȳ − (μ₁ − μ₂)] / √{[((n₁ − 1)S₁² + (n₂ − 1)S₂²)/(n₁ + n₂ − 2)](1/n₁ + 1/n₂)} ∼ t(n₁+n₂−2). Q.E.D.

4.2.25.
If X ∼ χ²(v) with v > 0, then the pdf of X is given by

f(x) = [1/(Γ(v/2)2^{v/2})] e^{−x/2} x^{v/2 − 1}, 0 < x < ∞, and zero otherwise.

Then, by definition of the MGF,

M_X(t) = [1/(Γ(v/2)2^{v/2})] ∫₀^∞ e^{x(t − 1/2)} x^{v/2 − 1} dx = [1/(1 − 2t)]^{v/2}·(1/Γ(v/2)) ∫₀^∞ e^{−w} w^{v/2 − 1} dw
       = (1 − 2t)^{−v/2}, t < 1/2,

since Γ(v/2) = ∫₀^∞ e^{−w} w^{v/2 − 1} dw. Then M′_X(t) = v(1 − 2t)^{−v/2 − 1}, so E(X) = M′_X(0) = v, and M″_X(t) = v(v + 2)(1 − 2t)^{−v/2 − 2}, so E(X²) = M″_X(0) = v² + 2v. Therefore, Var(X) = E(X²) − (E(X))² = 2v.
4.2.27.
Let X be a random variable with pdf

f(x) = 2/(π(1 + x²)), 0 < x < ∞, and zero otherwise.

Then X does not follow the t(1) distribution (i.e., X is not t(1) distributed), even though X² ∼ F(1, 1). Therefore, X² ∼ F(1, n) does not (necessarily) imply X ∼ t(n).
EXERCISES 4.3 4.3.1.
f (x) =
1 −x/10 ,x 10 e
> 0, then the cumulative distribution of τ1 is t Fτ1 (t) =
t 1 −x/10 e dx = −e−x/10 = 1 − e−t/10 . 0 10
0
Let Y represent the life length of the system, then Y = min(τ1 , τ2 ) and FY (y) = 1 − [1 − Fτi (y)]2 , then the pdf of Y is of the form fY (y) = 2fτi (y)[1 − Fτi (y)] and is given by ⎧ 1 ⎪ ⎨ e−y/5 , fy (y) = 5 ⎪ ⎩0,
0
4.3.3.
X₁, X₂ take values 0, 1; X₃ takes values 1, 2, 3; and Y₁ = min{X₁, X₂, X₃}. Since the values of X₁, X₂ are less than or equal to the values of X₃, Y₁ takes values 0, 1. Since the values of X₃ are greater than the values of X₁, X₂, Y₃ = max{X₁, X₂, X₃} takes values 1, 2, 3. Since Y₁ ≤ Y₂ ≤ Y₃, Y₂ takes values 0, 1.
4.3.5.
Let X₁, X₂, . . . , X_n be a random sample from an exponential distribution with mean θ; the common pdf is f(x) = (1/θ)e^{−x/θ}, x > 0. Using Theorem 4.3.2, the pdf of the k-th order statistic is given by

f_k(y) = f_{Y_k}(y) = [n!/((k − 1)!(n − k)!)] f(y)[F(y)]^{k−1}[1 − F(y)]^{n−k}, where F(y) = 1 − e^{−y/θ}.

Then, the pdf of Y₁ is

f₁(y) = nf(y)[1 − F(y)]^{n−1} = (n/θ)e^{−y/θ}(e^{−y/θ})^{n−1} = (n/θ)e^{−ny/θ},

which is the pdf of an exponential distribution with mean θ/n, and the pdf of Y_n is

f_n(y) = nf(y)[F(y)]^{n−1} = (n/θ)e^{−y/θ}(1 − e^{−y/θ})^{n−1}.
4.3.7.
X₁, . . . , X_n are i.i.d. with pdf f(x) = 1/2, 0 ≤ x ≤ 2, so F(x) = ∫₀^x (1/2)dt = x/2 for 0 ≤ x ≤ 2; that is,

F(x) = 1 for x > 2; x/2 for 0 ≤ x ≤ 2; 0 for x < 0.

Then, using Theorem 4.3.3, the joint pdf of Y₁ and Y_n is given by

f_{Y₁,Y_n}(x, y) = [n!/((1 − 1)!(n − 1 − 1)!(n − n)!)] (x/2)^{1−1}·(y/2 − x/2)^{n−2}·(1 − y/2)^{n−n}·(1/2)(1/2)
               = [n(n − 1)/2ⁿ](y − x)^{n−2}, if 0 ≤ x < y ≤ 2, and f_{Y₁,Y_n}(x, y) = 0 otherwise.

Now, let R = Y_n − Y₁ and Z = Y_n, and consider the functions r = y_n − y₁, z = y_n; their inverses are y₁ = z − r, y_n = z. The Jacobian of this one-to-one transformation is

J = det[ ∂y₁/∂r  ∂y₁/∂z ; ∂y_n/∂r  ∂y_n/∂z ] = det[ −1  1 ; 0  1 ] = −1.

Then the joint pdf of R and Z is

g(r, z) = |−1| f_{Y₁,Y_n}(z − r, z) = [n(n − 1)/2ⁿ] r^{n−2}, if 0 ≤ r ≤ z ≤ 2, and g(r, z) = 0 otherwise.

Then, the pdf of the range R = Y_n − Y₁ is

f_R(r) = ∫_r^2 g(r, z)dz = [n(n − 1)/2ⁿ](2 − r)r^{n−2}, if 0 ≤ r ≤ 2, and f_R(r) = 0 otherwise.
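Integrating r·f_R(r) over (0, 2) gives E(R) = 2(n − 1)/(n + 1); a Monte Carlo sketch checking this for n = 5 (the seed and replication count are arbitrary choices):

```python
import random

random.seed(42)
n, reps = 5, 100_000
total = 0.0
for _ in range(reps):
    xs = [random.uniform(0, 2) for _ in range(n)]
    total += max(xs) - min(xs)          # the sample range R = Y_n - Y_1

simulated = total / reps
theoretical = 2 * (n - 1) / (n + 1)     # E(R) from integrating r * f_R(r) on (0, 2)
print(round(theoretical, 3))                 # 1.333
print(abs(simulated - theoretical) < 0.01)   # True
```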
4.3.9.
X₁, . . . , X_n is a random sample from N(10, 4); we want P(Y_n > 10) = 1 − P(Y_n ≤ 10).

The cdf F_n(y) of Y_n is [F(y)]ⁿ, where F(y) is the cdf of X evaluated at y. Then

P(Y_n ≤ y) = F_n(y) = [F(y)]ⁿ = [P(X ≤ y)]ⁿ = [P(Z ≤ (y − 10)/2)]ⁿ,

so P(Y_n > y) = 1 − P(Y_n ≤ y) = 1 − [P(Z ≤ (y − 10)/2)]ⁿ.

Therefore, P(Y_n > 10) = 1 − [P(Z ≤ 0)]ⁿ = 1 − (0.5)ⁿ.

4.3.11.
X₁, . . . , X_n is a random sample from Beta(α = 2, β = 3). The joint pdf of Y₁ and Y_n, according to Theorem 4.3.3, is given by

f_{Y₁,Y_n}(x, y) = [n!/((1 − 1)!(n − 1 − 1)!(n − n)!)] [F(x)]^{1−1}[F(y) − F(x)]^{n−1−1}[1 − F(y)]^{n−n} f(x)f(y)
               = n(n − 1)[F(y) − F(x)]^{n−2} f(x)f(y), if x < y.

Since X_i ∼ Beta(α = 2, β = 3) for i = 1, 2, . . . , n, the pdf is

f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^{α−1}(1 − x)^{β−1}, x ∈ [0, 1],

and the cdf is F(x) = 0 for x ≤ 0; [Γ(α + β)/(Γ(α)Γ(β))] ∫₀^x t^{α−1}(1 − t)^{β−1} dt for 0 ≤ x ≤ 1; and 1 for x ≥ 1.

In our case,

f(x) = [Γ(5)/(Γ(2)Γ(3))] x^{2−1}(1 − x)^{3−1} = (4!/(1!2!)) x(1 − x)² = 12x(1 − x)², 0 ≤ x ≤ 1,

and

F(x) = ∫₀^x 12t(1 − t)² dt = 12(x²/2 − 2x³/3 + x⁴/4), 0 ≤ x ≤ 1.

Then, the joint pdf f_{Y₁,Y_n}(x, y), using Theorem 4.3.3, is

f_{Y₁,Y_n}(x, y) = n(n − 1)[12(y²/2 − 2y³/3 + y⁴/4) − 12(x²/2 − 2x³/3 + x⁴/4)]^{n−2}·12x(1 − x)²·12y(1 − y)²
               = 12ⁿ n(n − 1)[(y² − x²)/2 − 2(y³ − x³)/3 + (y⁴ − x⁴)/4]^{n−2} xy(1 − x)²(1 − y)²,

if 0 ≤ x ≤ y ≤ 1, and f_{Y₁,Y_n}(x, y) = 0 otherwise.
EXERCISES 4.4 4.4.1.
2 = 4, then μ = 8 and σ 2 = 4 X1 , X2 , . . . , Xn , where n =150, μ = 8, σ x x X − μx 2 z By Theorem 4.4.1: lim P Z ≤ = √1 ∞ e−μ 2 du 2k n→∞ σx 10−8 = 0.44 then P(7.5 < X < 10) = P 75−8 2
4.4.3.
Let T be the time spent by a customer coming to a certain gas station to fill up gas. Suppose T₁, T₂, . . . , T_n are independent random variables with μ_t = 3 minutes, σ_t² = 1.5, and n = 75. Then Y = Σ_{i=1}^{n} T_i is the total time spent by the n customers. Since 3 hours = 180 minutes,

P(Y < 180) = P(Ȳ < 180/n) = P(Ȳ < 2.4),

and by Theorem 4.4.1, Ȳ ∼ N(3, 1.5/75) approximately (for practical purposes n = 60 is large enough). Therefore,

P(Y < 180) = P(Z < (2.4 − 3)/√0.02) ≈ 0, where Z ∼ N(0, 1²).

There is a 0% chance that the total time spent by the customers is less than 3 hours.
4.4.5.
There were 1250 students who took the exam, with μ = 69%, σ = 5.4%, and n = 60 students (large enough). Then μ_X̄ = 69 and σ_X̄ = 5.4/√60 = 0.697137. Then

P(X̄ ≤ 75.08) = P(Z ≤ (75.08 − 69)/0.697137) ≈ 1.

There is almost a 100% chance that the average score is less than 75.08%.
4.4.7.
X_i ∼ N(μ₁, σ₁²) for i ∈ {1, 2, . . . , n}, so X̄ ∼ N(μ₁, σ₁²/n). Y_j ∼ N(μ₂, σ₂²) for j ∈ {1, 2, . . . , m}, so Ȳ ∼ N(μ₂, σ₂²/m). Therefore,

X̄ − Ȳ ∼ N(μ₁ − μ₂, σ₁²/n + σ₂²/m).
4.4.9.
X ∼ Binomial(n = 20, p = 0.2).

P(X ≤ 10) = Σ_{x=0}^{10} C(20, x)(0.2)^x(0.8)^{20−x} = 0.9994.

Using the normal approximation (with continuity correction):

P(X ≤ 10) ≈ P(Z ≤ (10 − np + 0.5)/√(np(1 − p))) = 0.99986.
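Both numbers can be reproduced with the standard library; a short sketch:

```python
from math import comb, erf, sqrt

n, p = 20, 0.2
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(11))

def phi(z):  # standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

approx = phi((10 - n * p + 0.5) / sqrt(n * p * (1 - p)))  # continuity correction

print(round(exact, 4))    # 0.9994
print(round(approx, 5))   # 0.99986
```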
4.4.11.
Suppose q = 6% of persons making reservations will not show up each day. The rental company takes reservations for n = 215 persons with 200 automobiles available, so p = 0.94 is the probability that a person making a reservation will show up. Let X be the number of persons making reservations who show up. Then X ∼ Binomial(n = 215, p = 0.94). The probability requested is

P(X ≤ 200) ≈ P(Z ≤ (200 − np + 0.5)/√(np(1 − p))) = 0.3228.
4.4.13.
SIDS occurs between the ages of 28 days and one year; the rate of death due to SIDS is 0.00103 per year. Take a random sample of 5000 infants between the ages of 28 days and one year, and let X be the number of SIDS-related deaths. With p = 0.00103 and n = 5000, X ∼ Binomial(n = 5000, p = 0.00103). The probability requested is

P(X > 10) ≈ P(Z > (10 − np − 0.5)/√(np(1 − p))) = 0.0274.
Chapter 5
Point Estimation

EXERCISES 5.2
5.2.1.
The pmf of a geometric distribution is f(x) = p(1 − p)^{x−1} for x = 1, 2, . . .; also, μ = 1/p.

(a) By the method of moments, E(X) = x̄ = 1/p, so p̂ = 1/x̄.

(b) For the given data, x̄ = Σx_i/n = (2 + 4 + · · · + 22 + 12)/18 = 16.11, so p̂ = 1/16.11 = 0.062.
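With the reported sample mean, the estimate is immediate; a one-line sketch (using the x̄ from the solution rather than the raw data, which are abbreviated in the text):

```python
# Method of moments for the geometric distribution: E(X) = 1/p, so p̂ = 1/x̄.
xbar = 16.11
p_hat = 1 / xbar
print(round(p_hat, 3))   # 0.062
```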
5.2.3.
X_i ∼ U(θ − 1, θ + 1). By definition,

f(x) = 1/((θ + 1) − (θ − 1)) = 1/2, if θ − 1 < x < θ + 1, and zero otherwise.

(a) By definition, E(X) = x̄ = ((θ − 1) + (θ + 1))/2 = θ. This implies the method of moments estimator for θ is x̄.

(b) θ̂ = x̄ = (11.72 + 12.81 + 12.09 + 13.47 + 12.37)/5 = 12.49.

5.2.5.
The pdf of the (shifted) exponential distribution is given as f(x) = e^{−(x−θ)} if x ≥ θ, and zero otherwise. By the method of moments, E(X) = 1 + θ = x̄, so θ̂ = x̄ − 1.

5.2.9.
Similar to the above.
50 CHAPTER 5 Point Estimation
5.2.11.
We have E(X) = μ = x̄ and Var(X) = σ² = (1/n)Σ(x_i − x̄)² = (1/n)Σx_i² − x̄². Thus the method of moments estimator for σ² is given by

T(X₁, . . . , X_n) = (1/n)ΣX_i² − (ΣX_i/n)².

5.2.13.
The method of moments estimator for μ is T(X₁, . . . , X_n) = X̄, since E(X) = μ. Since μ and σ² are both unknown,

σ̂² = (1/n)Σ(X_i − X̄)² = ((n − 1)/n)·(1/(n − 1))Σ(X_i − X̄)² = ((n − 1)/n)s².

Letting s′² = ((n − 1)/n)s², the method of moments estimator for σ² is s′² = (1/n)Σ_{i=1}^{n}(X_i − X̄)².
EXERCISES 5.3 5.3.1.
f(x) = C(n, x)p^x(1 − p)^{n−x}, so

ln L(p, x₁, x₂, . . . , x_n) = ln Π_{i=1}^{n} C(n, x_i) + (Σx_i) ln p + (Σ(n − x_i)) ln(1 − p),

and

∂/∂p ln L(p, X₁, X₂, . . . , X_n) = ΣX_i/p − Σ(n − X_i)/(1 − p).

For the maximum likelihood estimator of p, set ∂/∂p ln L = 0:

ΣX_i/p − Σ(n − X_i)/(1 − p) = 0 ⇒ (1 − p)ΣX_i − pΣ(n − X_i) = 0 ⇒ ΣX_i = p[ΣX_i + Σ(n − X_i)] = pn².

Hence the MLE of p is p̂ = ΣX_i/n² = X̄/n. By the invariance property, q̂ = 1 − p̂ is the MLE of q.
5.3.3.
f(x) = (1/θ)e^{−x/θ} implies L(θ) = (1/θⁿ)e^{−Σx_i/θ}, so ln L(θ) = −n ln θ − Σx_i/θ.

Taking the derivative with respect to θ and setting it equal to zero, we have

∂/∂θ ln L = −n/θ + Σx_i/θ² = 0 ⇒ −nθ + Σx_i = 0 ⇒ θ̂ = X̄.

From the given data, the MLE of θ is θ̂ = x̄ = (1 + 2 + · · · + 7 + 2)/14 = 6.07.

5.3.5.
Here the pdf of X is given by f(x) = (2x/α²)e^{−x²/α²} if x > 0, and zero otherwise. Then

L(α, X₁, . . . , X_n) = (2/α²)ⁿ Π_{i=1}^{n} X_i e^{−ΣX_i²/α²},

ln L = n ln 2 − 2n ln α + Σ_{i=1}^{n} ln X_i − ΣX_i²/α²,

∂/∂α ln L = −2n/α + 2ΣX_i²/α³.

Setting ∂/∂α ln L = 0 gives −nα² + ΣX_i² = 0, so α² = ΣX_i²/n and α̂ = √(ΣX_i²/n).
5.3.7.
f(x) = (α/β^α) x^{α−1} e^{−(x/β)^α} if x ≥ 0, and zero otherwise. Then

L(α, β, x) = (αⁿ/β^{nα}) (Π_{i=1}^{n} x_i)^{α−1} e^{−Σ(x_i/β)^α},

ln L(α, β, x) = n ln α − nα ln β + (α − 1)Σ ln x_i − Σ(x_i/β)^α,

∂/∂α ln L = n/α − n ln β + Σ ln x_i − Σ(x_i/β)^α ln(x_i/β),

∂/∂β ln L = −nα/β + αβ^{−α−1}Σx_i^α.

Setting ∂/∂β ln L = 0 and solving for β gives β^α = Σx_i^α/n. Substituting this into ∂/∂α ln L = 0 leaves a single equation in α with no closed-form solution. In this case, one can use numerical methods such as the Newton-Raphson method to solve for α, and then use this value to solve for β.
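The numerical step can be sketched by bisecting the profile score for α obtained after substituting β^α = Σx_i^α/n into ∂/∂α ln L = 0, namely g(α) = n/α + Σ ln x_i − nΣx_i^α ln x_i/Σx_i^α. The data below are simulated (hypothetical), only to exercise the solver:

```python
from math import log
import random

random.seed(1)
data = [random.weibullvariate(2.0, 1.5) for _ in range(500)]  # scale 2, shape 1.5
n = len(data)
sum_log = sum(log(x) for x in data)

def g(a):
    # Profile score in the shape parameter alpha; decreasing in a.
    s = sum(x ** a for x in data)
    s_log = sum(x ** a * log(x) for x in data)
    return n / a + sum_log - n * s_log / s

lo, hi = 0.1, 10.0
for _ in range(80):                 # bisection for g(alpha) = 0
    mid = (lo + hi) / 2
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid

alpha = (lo + hi) / 2
beta = (sum(x ** alpha for x in data) / n) ** (1 / alpha)
print(round(alpha, 2), round(beta, 2))   # close to the true shape 1.5 and scale 2.0
```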
5.3.9.
f(x) = [Γ(2θ)/Γ(θ)²][x(1 − x)]^{θ−1}, θ > 0. This implies

ln L(θ, x) = n[ln Γ(2θ) − ln Γ(θ)²] + (θ − 1)Σ_{i=1}^{n} ln[x_i(1 − x_i)],

∂/∂θ ln L(θ, x) = n[2Γ′(2θ)/Γ(2θ) − 2Γ′(θ)/Γ(θ)] + Σ_{i=1}^{n} ln[x_i(1 − x_i)].
5.3.13.
f(x) = 1/(3θ + 2) if 0 ≤ x ≤ 3θ + 2, and zero otherwise. This implies L(θ, x) = 1/(3θ + 2)ⁿ when 0 ≤ x_i ≤ 3θ + 2 for all i.

When 3θ + 2 ≥ max(X_i), the likelihood is 1/(3θ + 2)ⁿ, which is a positive, decreasing function of θ (for fixed n). However, for θ < (max(X_i) − 2)/3, the likelihood drops to 0, creating a discontinuity at the point (max(X_i) − 2)/3, so we cannot find the MLE by differentiation. The likelihood is maximized at the smallest admissible θ, which is determined by the largest order statistic: θ̂ = (max(X_i) − 2)/3 = (Y_n − 2)/3.

5.3.15.
Here X ∼ N(μ, σ²), so

ln L(μ, σ, x) = −(n/2) ln 2π − (n/2) ln σ² − Σ_{i=1}^{n} (x_i − μ)²/(2σ²),

∂/∂μ ln L = (1/σ²)Σ_{i=1}^{n} (X_i − μ), and ∂/∂σ² ln L = −n/(2σ²) + (1/(2σ⁴))Σ_{i=1}^{n} (X_i − μ)².

For the maximum likelihood estimates of μ and σ: Σ(X_i − μ) = 0 implies μ̂ = X̄. Similarly, for σ²,

−n/(2σ²) + (1/(2σ⁴))Σ_{i=1}^{n} (X_i − μ)² = 0 ⇒ σ̂ = √(Σ(X_i − X̄)²/n).
5.3.17.
f(x) = (1/θ)e^{−x/θ}, x > 0, so F(x) = 1 − e^{−x/θ} and the reliability is R(x) = 1 − F(x) = e^{−x/θ}. The likelihood is

L(x, θ) = Π_{i=1}^{n} (1/θ)e^{−X_i/θ},

which is maximized at θ̂ = X̄; by the invariance property of MLEs, the MLE of the reliability is R̂(x) = e^{−x/X̄}.
EXERCISES 5.4 5.4.1.
E(X̄) = (1/n)ΣE(X_i), where E(X) = ∫_θ^∞ xe^{−(x−θ)} dx = 1 + θ by integration by parts. Thus E(X̄) = (1/n)·n(1 + θ) = 1 + θ.

5.4.3.
The sample standard deviation is s = √[(1/(n−1))Σ(x_i − X̄)²]. Writing x_i − X̄ = (x_i − μ) + (μ − X̄),

E(s) = E√[(1/(n−1))Σ((x_i − μ) + (μ − X̄))²] = E√[(1/(n−1))(Σ(x_i − μ)² − n(X̄ − μ)²)].

Since the square root is a concave function, by Jensen's inequality

E(s) ≤ √[(1/(n−1))(E Σ(x_i − μ)² − nE(X̄ − μ)²)] = √[(1/(n−1))(nσ² − n·σ²/n)] = σ.
5.4.5.
Let Y = C₁X₁ + C₂X₂ + · · · + C_nX_n. For an unbiased estimator we need E(Y) = C₁E(X₁) + · · · + C_nE(X_n) = μ, that is, C₁μ + · · · + C_nμ = μ, which holds if and only if ΣC_i = 1. In particular, with C_i = 1/n for all i = 1, 2, . . . , n, we get (1/n)μ + · · · + (1/n)μ = (n/n)μ = μ, as claimed.
5.4.7.
X_i ∼ U(0, θ). Y_n = max(X₁, . . . , X_n), and θ̂ = Y_n is the MLE of θ.
(a) By the method of moments, E(X) = X̄ = (0 + θ)/2, which implies θ = 2X̄. Hence the method of moments estimator is θ̂₂ = 2X̄.
(b) E(θ̂) = E(Y_n) = E{max(X₁, . . . , X_n)} = nθ/(n + 1), so the MLE is biased. E(θ̂₂) = E(2X̄) = 2E(X̄) = 2(θ/2) = θ, so 2X̄ is an unbiased estimator of θ.
(c) E(θ̂₃) = E[((n + 1)/n)θ̂] = ((n + 1)/n)·nθ/(n + 1) = θ, so θ̂₃ = ((n + 1)/n)Y_n is an unbiased estimator of θ.

5.4.9.
Here, X_i ∼ N(μ, σ²), with

f(x) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²)).

We have E(μ̂) = E(X̄) = μ, so X̄ is an unbiased estimator of μ. By definition, the unbiased estimator μ̂ that minimizes the mean square error MSE(μ̂) = E(μ̂ − μ)² is called the minimum variance unbiased estimator (MVUE) of μ. For an unbiased estimator, MSE(μ̂) = Var(μ̂), so minimizing the MSE amounts to minimizing Var(μ̂); X̄ is the MVUE for μ.

5.4.11.
E(M) = E(X̄) = μ for a symmetric distribution; thus the sample median is an unbiased estimator of the population mean μ. However, Var(X̄) ≤ Var(M), so the sample mean is preferred.
5.4.13.
f_σ(x) = (1/(2σ)) exp(−|x|/σ), −∞ < x < ∞. The likelihood function is given by

f(x₁, x₂, . . . , x_n; σ) = (1/2ⁿ)(1/σⁿ) exp(−Σ|x_i|/σ).

Take g(Σ|x_i|, σ) = (1/σⁿ) exp(−Σ|x_i|/σ) and h(x₁, x₂, . . . , x_n) = 1/2ⁿ. By the factorization theorem, Σ|X_i| is sufficient for σ.
5.4.15.
(a) The likelihood function is given by

f(x₁, x₂, . . . , x_n; θ) = (1/θⁿ) exp(−Σx_i/θ), x_i > 0.

Take g(Σx_i, θ) = (1/θⁿ) exp(−Σx_i/θ) and h(x₁, . . . , x_n) = 1 if all x_i > 0 and h(x₁, . . . , x_n) = 0 otherwise. Therefore ΣX_i is sufficient for θ. Similarly, writing f(x₁, . . . , x_n; θ) = (1/θⁿ) exp(−nx̄/θ), X̄ is sufficient for θ.
(b) The sample mean is an unbiased estimator of the population mean: E(X̄) = θ.

5.4.17.
The likelihood function is given by

f(x₁, x₂, . . . , x_n; θ) = 1/θⁿ if −θ/2 ≤ x_i ≤ θ/2 for i = 1, 2, . . . , n, and 0 otherwise.

Let us write f_θ(x₁, . . . , x_n) = h(x₁, . . . , x_n) g(x₍₁₎, x₍ₙ₎), where h ≡ 1 and

g(min x_i, max x_i) = 1/θⁿ if −θ/2 ≤ x₍₁₎ and x₍ₙ₎ ≤ θ/2, and 0 otherwise.

Hence, (min X_i, max X_i), 1 ≤ i ≤ n, is sufficient for θ.

5.4.19.
The likelihood function is given by

f(x₁, x₂, . . . , x_n; μ) = (2π)^{−n/2} exp(−Σ_{i=1}^{n} (x_i − μ)²/2).

The above expression can be written as

f(x₁, x₂, . . . , x_n; μ) = exp(−(x₁² − 2μx₁ + μ²)/2)·(2π)^{−n/2} exp(−Σ_{i=2}^{n} (x_i − μ)²/2).

The factor involving x₂, . . . , x_n still depends on μ, so the likelihood cannot be factored as g(x₁, μ)h(x₂, . . . , x_n) with h free of μ. Hence, by the factorization theorem, X₁ is not a sufficient statistic for μ.
5.4.21.
The likelihood function is given by

f(x₁, . . . , x_n; θ) = θⁿ Π_{i=1}^{n} x_i^{θ−1} for 0 < x_i < 1, θ > 0, and 0 otherwise.

Let U = Π_{i=1}^{n} X_i. Then g(u, θ) = θⁿu^{θ−1} and h(x₁, . . . , x_n) = 1. Therefore U is sufficient for θ.

5.4.23.
The likelihood function is given by

f(x₁, . . . , x_n; α) = (2ⁿ/αⁿ) (Π_{i=1}^{n} x_i) e^{−Σx_i²/α}.

Let g(Σx_i², α) = (2ⁿ/αⁿ) e^{−Σx_i²/α} and h(x₁, . . . , x_n) = Π_{i=1}^{n} x_i. Hence, ΣX_i² is sufficient for the parameter α.
EXERCISES 5.5
5.5.1.
ln L(p, x) = ln Π_{i=1}^{n} C(n, x_i) + (Σx_i) ln p + (Σ_{i=1}^{n}(n − x_i)) ln(1 − p). For the MLE,

(1 − p)Σx_i − pΣ(n − x_i) = 0,

which implies p = Σx_i/n². Let Y_n = Σx_i; thus p̂ = Y_n/n², where E(Y_n/n²) = (1/n²)E(Y_n) = (1/n²)(n·np) = p, so p̂ is an unbiased estimator of p. Similarly,

Var(p̂) = Var(Y_n/n²) = (n/n⁴)np(1 − p) = p(1 − p)/n² → 0 as n → ∞.

Thus p̂ = Y_n/n² is consistent.

5.5.3.
E(X_i) = μ and E(X_i²) = σ² + μ². Then

E(s′²) = (1/n)Σ E(X_i − μ + μ − X̄)² = (1/n)[Σ E(X_i − μ)² − nE(X̄ − μ)²] = (1/n)(nσ² − σ²) = ((n − 1)/n)σ².

∴ s′² = ((n − 1)/n)S² is a biased estimator of σ², with Bias(s′²) = E(s′²) − σ² = −σ²/n → 0 as n → ∞. Since (n − 1)S²/σ² ∼ χ²(n−1), Var((n − 1)S²/σ²) = 2(n − 1), which implies ((n − 1)²/σ⁴)Var(S²) = 2(n − 1); finally, Var(S²) = 2σ⁴/(n − 1) → 0 as n → ∞. Thus s′² is a consistent estimator of σ².
5.5.5. 5.5.7.
Here E(x) = θ and var(x) = θ 2 (For exponential distribution). Now E(X = Eθ) and 2 Var(θ) = θn → 0 as n → ∞. ∴ X is an unbiased estimate of θ.
n Here, ln(α, X) = −n ln α + 1−α i=1 ln(Xi ). α
Differentiating above equation and equating to zero we get αˆ =
−
n i=1 ln Xi
n
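The correction S² = (n/(n−1))s² from 5.5.3 is exactly what numpy's `ddof` argument implements; a small check on hypothetical data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
n = len(x)

s2_biased = np.var(x)            # divides by n: the biased s^2 of 5.5.3
S2_unbiased = np.var(x, ddof=1)  # divides by n-1: the unbiased S^2

# S^2 = n/(n-1) * s^2, exactly as derived in 5.5.3
print(s2_biased, S2_unbiased, s2_biased * n / (n - 1))
```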
Student’s Solutions Manual 57
5.5.9.
Here, var(θ̂₁) = 1/(12n) and Var(θ̂₂) = 1/12. Thus the efficiency of θ̂₂ relative to θ̂₁ is e(θ̂₂, θ̂₁) = var(θ̂₁)/var(θ̂₂) = 1/n < 1 for n ≥ 2. Here the efficiency depends on n: θ̂₁ is more efficient than θ̂₂ for n ≥ 2.
5.5.11.
By definition, E(X) = θ and var(X) = θ². Here ln f(x) = −ln θ − x/θ, so ∂/∂θ ln f(x) = −1/θ + x/θ², and Var(∂/∂θ ln f(x)) = var(−1/θ + X/θ²) = var(X)/θ⁴ = 1/θ². Hence
1/[n Var(∂/∂θ ln f(x))] = θ²/n = var(X̄).
∴ X̄ attains the Cramér–Rao lower bound and is efficient for θ.
5.5.13.
Using the given pdf, ln f(x) = c − (x − μ)²/(2σ²), where c = −(1/2) ln(2πσ²). Thus
∂/∂μ ln f(x) = (x − μ)/σ² and ∂²/∂μ² ln f(x) = −1/σ².
This implies
E[(∂/∂μ ln f(x))²] = E[(x − μ)²]/σ⁴ = 1/σ² and E[−∂²/∂μ² ln f(x)] = 1/σ².
Hence,
E[(∂/∂μ ln f(x))²] = −E[∂²/∂μ² ln f(x)].
5.5.15.
Because each Xi has a uniform distribution on the interval (0.θ), μ = E(Xi ) = θ/2 and θ2 var(Yi ) = 12 . θ2 ∴ E(θˆ2 ) = θ and var(θˆ2 ) = 3n .
To find the variance of θˆ1 and θˆ3 we must have density of Xn . n−1
fn (X) = n[FX (x)]n − 1fX (x). That is fn (X) = nxθn for 0 ≤ x ≤ θ. This implies E(θˆ ) = n θ. Thus θˆ1 is not an unbiased estimate of θ. n+1
Similarly. E(θˆ3 ) = θ. θˆ3 is an unbiased estimator of θ. θ2 Now var θˆ1 = (n+1)n2 (n+2) and var θˆ3 = n(n+2) . ˆ ˆ (b) The efficiency of θ1 relative to θ2 is given by: (n + 1)3 (n + 1) e θˆ1 , θˆ2 = . 3n2 3
Hence θˆ1 is more efficient than θˆ1 if (n+1)3n(n+1) >1 2 Similarly, e θˆ2 , θˆ3 = n+2 > 1 if n > 1. 3 θˆ2 is efficient than θˆ3 if n > 1.
5.5.17.
It can be easily verified that E(θ̂₁) = θ, E(θ̂₂) = θ, and E(θ̂₃) = θ.
Similarly, Var(θ̂₁) = (31/81)σ², Var(θ̂₂) = [(6n − 17)/(25(n − 3))]σ², and Var(θ̂₃) = σ²/n.
The corresponding efficiencies are
e(θ̂₂, θ̂₁) = 775(n − 3)/[81(6n − 17)], e(θ̂₃, θ̂₁) = 31n/81, and e(θ̂₃, θ̂₂) = (6n − 17)n/[25(n − 3)].
5.5.21.
The ratio of the joint density functions at two sample points is given by
L(x₁, …, xₙ)/L(y₁, …, yₙ) = exp{ (−1/(2σ²)) [ Σx_i² − Σy_i² − 2μ(Σx_i − Σy_i) ] }.
For this ratio to be free of μ and σ², we must have Σx_i = Σy_i and Σx_i² = Σy_i².
Thus ΣX_i and ΣX_i² are jointly minimal sufficient statistics for μ and σ². Since X̄ is an unbiased estimator of μ, s² is an unbiased estimator of σ², and both are functions of the minimal sufficient statistics, X̄ and s² are MVUEs for μ and σ².
5.5.23.
Taking the ratio of the joint density functions at two sample points, we have
L(x₁, …, xₙ)/L(y₁, …, yₙ) = λ^{Σx_i − Σy_i} · ∏(y_i!)/∏(x_i!).
For the ratio to be free of λ we must have Σx_i − Σy_i = 0. Thus ΣX_i forms the minimal sufficient statistic for λ.
5.5.25.
L(x₁, …, xₙ)/L(y₁, …, yₙ) = e^{−(Σx_i − Σy_i)/β}.
For the ratio to be independent of β, we need Σx_i = Σy_i. Thus ΣX_i is minimal sufficient for β. Now X̄ = ΣX_i/n is an unbiased function of the minimal sufficient statistic, so it is the UMVUE by the Rao–Blackwell theorem.
5.5.27.
L(x₁, …, xₙ)/L(y₁, …, yₙ) = [∏(y_i)/∏(x_i)] e^{−(Σx_i² − Σy_i²)/β}.
For the ratio to be free of β, we must have Σx_i² − Σy_i² = 0. Therefore ΣX_i² is minimal sufficient for β, and an unbiased function of it is the MVUE for β.
Chapter 6
Interval Estimation

EXERCISES 6.1
6.1.1.
(a) We are 99% confident that the true value of the parameter lies in the confidence interval.
(b) The 99% confidence interval is wider.
(c) When σ² is unknown, we use the t-distribution for sample sizes n ≤ 30. If the distribution is binomial and the sample is large enough that np ≥ 5 and np(1 − p) ≥ 5, then we use the normal approximation.
(d) More information gives a narrower interval at the same confidence level, so the required sample size is inversely related to the width of the confidence interval.
6.1.3.
(a) P(−2.81 ≤ (x̄ − μ)/(σ/√n) ≤ 2.75) = k
P(−2.81 · σ/√n ≤ x̄ − μ ≤ 2.75 · σ/√n) = k
P(−2.81 · σ/√n − x̄ ≤ −μ ≤ 2.75 · σ/√n − x̄) = k
P(x̄ − 2.75 · σ/√n ≤ μ ≤ x̄ + 2.81 · σ/√n) = k
(b) The confidence interval for μ is (x̄ − 2.75 · σ/√n, x̄ + 2.81 · σ/√n).
(c) Confidence level = k.
6.1.5.
(a) Here x_i ~ N(μ, σ²), so (n − 1)s²/σ² ~ χ²₍ₙ₋₁₎.
Pivot = (n − 1)s²/σ², where only σ² is unknown.
P(χ²₁₋α/₂ < (n − 1)s²/σ² < χ²α/₂) = 1 − α
P(1/χ²α/₂ < σ²/[(n − 1)s²] < 1/χ²₁₋α/₂) = 1 − α
P((n − 1)s²/χ²α/₂ < σ² < (n − 1)s²/χ²₁₋α/₂) = 1 − α
So the (1 − α)·100% confidence interval for σ² is
(n − 1)s²/χ²α/₂ < σ² < (n − 1)s²/χ²₁₋α/₂.
(b) n = 21, x̄ = 44.3, s = 3.96, α = 0.1, α/2 = 0.05.
χ²₀.₉₅,₂₀ = 10.851 and χ²₀.₀₅,₂₀ = 31.410.
The 90% confidence interval is 20(3.96)²/31.410 < σ² < 20(3.96)²/10.851.
We are 90% confident that σ² lies in the interval (9.985, 28.903).
6.1.9.
(a) P(a ≤ (x̄ − μ)/(σ/√n) ≤ b) = 1 − α
P(−z_{α/2} < (x̄ − μ)/(σ/√n) < z_{α/2}) = 1 − α
P(−z_{α/2} · σ/√n < x̄ − μ < z_{α/2} · σ/√n) = 1 − α
P(x̄ − z_{α/2} · σ/√n < μ < x̄ + z_{α/2} · σ/√n) = 1 − α
So the confidence interval is (x̄ − z_{α/2} · σ/√n, x̄ + z_{α/2} · σ/√n).
(b) If σ² is unknown, we use the sample standard deviation s in its place and the t-distribution as the sampling distribution. The confidence interval is (x̄ − t_{α/2} · s/√n, x̄ + t_{α/2} · s/√n).
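The interval in 6.1.5(b) can be reproduced with `scipy.stats.chi2`; the only thing to get right is the quantile orientation (upper quantile in the lower bound's denominator):

```python
from scipy.stats import chi2

n, s, alpha = 21, 3.96, 0.10
lo = (n - 1) * s**2 / chi2.ppf(1 - alpha / 2, n - 1)  # divide by upper quantile
hi = (n - 1) * s**2 / chi2.ppf(alpha / 2, n - 1)      # divide by lower quantile
print(round(lo, 3), round(hi, 3))  # ≈ (9.985, 28.904)
```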
EXERCISES 6.2
6.2.1.
(a) n = 1200, p̂ = 0.35, z_{α/2} = z₀.₀₂₅ = 1.96.
The 95% confidence interval is p̂ ± z_{α/2}√(p̂(1 − p̂)/n) = 0.35 ± 1.96√(0.35(0.65)/1200) = (0.323, 0.377).
(b) p̂ = 0.6: 0.6 ± 1.96√(0.6(0.4)/1200) = (0.572, 0.628).
(c) p̂ = 0.15: 0.15 ± 1.96√(0.15(0.85)/1200) = (0.13, 0.17).
(d) We are 95% confident that the proportion of people who find political advertising to be true lies in (0.323, 0.377), that the proportion who find it untrue lies in (0.572, 0.628), and that the proportion who find falsehood in commercials lies in (0.13, 0.17).
6.2.3.
(a) Here f(x) = (1/(√(2π)σ)) exp{−(x − μ)²/(2σ²)}, so
L(x₁, …, xₙ; μ, σ²) = (2πσ²)^{−n/2} exp{−Σ(x_i − μ)²/(2σ²)}
ln L(x₁, …, xₙ; μ, σ²) = −(n/2) ln(2πσ²) − Σ(x_i − μ)²/(2σ²).
For the MLE, ∂/∂μ ln L = 0 gives Σ(x_i − μ) = 0, so μ̂ = x̄.
(b) The (1 − α)100% confidence interval for μ is given by
P(x̄ − z_{α/2} · σ/√n < μ < x̄ + z_{α/2} · σ/√n) = 1 − α.
With P(x̄ − 2 · σ/√n < μ < x̄ + 2 · σ/√n) = 0.954 we have 1 − α = 0.954, α = 0.046, α/2 = 0.023, and z₀.₀₂₃ = 2.0 from the z-table. Thus verified.
(c) P(x̄ − k · σ/√n < μ < x̄ + k · σ/√n) = 0.90, so α = 0.10, α/2 = 0.05, z₀.₀₅ = 1.645, and k = 1.645.
6.2.5.
n = 50, p̃ = 18/50 = 9/25.
np̃ = 18 > 5 and n(1 − p̃) = 32 > 5, so the normal approximation applies.
Here 1 − α = 0.98, α = 0.02, α/2 = 0.01, z₀.₀₁ = 2.325.
Thus the 98% confidence interval is
p̃ ± z_{α/2}√(p̃(1 − p̃)/n) = 9/25 ± 2.325√((9/25)(16/25)/50) = (0.202, 0.518).
6.2.7.
n = 50, x̄ = 11.4, σ = 4.5, 1 − α = 0.95, z_{α/2} = 1.96.
The 95% confidence interval is x̄ ± z_{α/2} · σ/√n = 11.4 ± 1.96(4.5/√50) = (10.153, 12.647).
6.2.9.
n = 400, p̂ = 0.3. np̂ = 120 > 5 and n(1 − p̂) = 280 > 5.
The 95% confidence interval is p̂ ± z_{α/2}√(p̂(1 − p̂)/n) = 0.3 ± 1.96√(0.3 · 0.7/400) = (0.255, 0.345).
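The large-sample proportion intervals in this section all use the same formula; a small helper (checked here against the 6.2.9 numbers) makes the pattern explicit:

```python
import math

def prop_ci(p_hat, n, z):
    """Normal-approximation CI: p_hat +/- z * sqrt(p_hat*(1-p_hat)/n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - z * se, p_hat + z * se)

lo, hi = prop_ci(0.3, 400, 1.96)
print(round(lo, 3), round(hi, 3))  # (0.255, 0.345)
```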
6.2.11.
The proportion of defectives is p̂ = 40/500 = 2/25.
1 − α = 0.9, α/2 = 0.05, z₀.₀₅ = 1.645.
The 90% confidence interval is
2/25 ± 1.645√((2/25)(23/25)/500) = (0.06, 0.1).
6.2.13.
x ~ N(μ, 16) and P(x̄ − 2 < μ < x̄ + 2) = 0.95.
From z_{α/2} · σ/√n = 2 we get n = (z_{α/2} · σ/2)² = (1.96 · 4/2)² = 15.37 ≈ 16.
6.2.15.
n = 425, p̂ = 0.45. np̂ = 425 · 0.45 > 5 and n(1 − p̂) = 425 · 0.55 > 5.
The 95% confidence interval is p̂ ± z_{α/2}√(p̂(1 − p̂)/n) = 0.45 ± 1.96√(0.45 · 0.55/425) = (0.403, 0.497).
For a 98% confidence interval, 1 − α = 0.98, α = 0.02, z_{α/2} = 2.335:
0.45 ± 2.335√(0.45 · 0.55/425) = (0.394, 0.506).
6.2.19.
p̂ = 52/60, 1 − α = 0.95, α/2 = 0.025, z_{α/2} = 1.96.
The 95% confidence interval is
p̂ ± z_{α/2}√(p̂(1 − p̂)/n) = 52/60 ± 1.96√((52/60)(8/60)/60) = (0.781, 0.953).
6.2.21.
σ = 35, E = 15, and E = z_{α/2} · σ/√n, so
n = (z_{α/2} · σ/E)² = (1.96 · 35/15)² = 20.92, rounded up to 21.
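The sample-size computations in 6.2.13 and 6.2.21 are the same formula with different inputs; a sketch:

```python
import math

def sample_size(z, sigma, E):
    """Smallest n with z * sigma / sqrt(n) <= E."""
    return math.ceil((z * sigma / E) ** 2)

print(sample_size(1.96, 35, 15))  # 21  (6.2.21)
print(sample_size(1.96, 4, 2))    # 16  (6.2.13)
```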
6.2.23.
x̄ = 12.07, s = 1.91, n = 35, 1 − α = 0.98, α/2 = 0.01, z_{α/2} = 2.335.
The 98% confidence interval for the mean is
x̄ ± z_{α/2} · s/√n = 12.07 ± 2.335(1.91/√35) = (11.32, 12.82).
EXERCISES 6.3
6.3.1.
(a) When the population standard deviation is not known and the sample size is small, we use the t-distribution.
(b) As the difference decreases, the required sample size n increases, which means we are closing in on the true value of the parameter θ.
(c) The data are normally distributed, and the values of x̄ and the sample standard deviation are known.
6.3.3.
x̄ = 20, s = 4, 1 − α = 0.95.
(a) x̄ ± t_{α/2,4} · s/√n = 20 ± t₀.₀₂₅,₄ (4/√5)
(b) 20 ± t₀.₀₂₅,₉ (4/√10)
(c) 20 ± t₀.₀₂₅,₁₉ (4/√20)
6.3.5.
x̄ = −2.22, s = 1.67, n = 26.
The 98% confidence interval for μ is
x̄ ± t_{α/2,25} · s/√n = −2.22 ± 2.485(1.67/√26) = (−3.03, −1.41).
6.3.7.
x̄ = 0.905, s = 0.005, 1 − α = 0.98, n = 10.
The 98% confidence interval for μ is 0.905 ± t₀.₀₁,₉ (0.005/√10).
6.3.9.
Similar to 6.3.8.
6.3.11.
x̄ = 410.93, s = 312.87, n = 15.
The 95% confidence interval for μ is
x̄ ± t_{α/2,14} · s/√n = 410.93 ± 2.145(312.87/√15) = (237.65, 584.21).
6.3.13.
x̄ = 3.12, s = √1.04, n = 17.
The 99% confidence interval for μ is
x̄ ± t_{α/2,16} · s/√n = 3.12 ± 2.921(√1.04/√17) = (2.40, 3.84).
6.3.15.
x̄ = 3.85, s = √4.55, n = 20.
The 98% confidence interval for μ is
x̄ ± t_{α/2,19} · s/√n = 3.85 ± 2.539(√4.55/√20) = (2.64, 5.06).
6.3.17.
x̄ = 148.18, s = √1.91, n = 10.
The 95% confidence interval for μ is
x̄ ± t_{α/2,9} · s/√n = 148.18 ± 2.262(√1.91/√10) = (147.19, 149.17).
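The t-intervals above can be reproduced with `scipy.stats.t`; here the 6.3.17 numbers are used as the check:

```python
import math
from scipy.stats import t

xbar, s, n = 148.18, math.sqrt(1.91), 10
half = t.ppf(0.975, n - 1) * s / math.sqrt(n)  # t_{0.025,9} * s / sqrt(n)
print(round(xbar - half, 2), round(xbar + half, 2))  # (147.19, 149.17)
```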
EXERCISES 6.4
6.4.1.
x̄ = −2.2, s = 1.42, 1 − α = 0.90, n = 20.
The 90% confidence interval for σ² is
((n − 1)s²/χ²_{α/2,19}, (n − 1)s²/χ²_{1−α/2,19}) = (19(1.42)²/32.85, 19(1.42)²/8.90655) = (1.1663, 4.3015).
6.4.3.
x̄ = 60.908, s² = 12.66, 1 − α = 0.99, n = 10.
The 99% confidence interval for σ² is
((n − 1)s²/χ²_{α/2,9}, (n − 1)s²/χ²_{1−α/2,9}) = (9 · 12.66/23.58, 9 · 12.66/1.73) = (4.8321, 65.8613).
6.4.5.
x̄ = 2.27, s² = 1.02, 1 − α = 0.99, n = 18.
The 99% confidence interval for σ² is
((n − 1)s²/χ²_{α/2,n−1}, (n − 1)s²/χ²_{1−α/2,n−1}) = (17 · 1.02/27.58, 17 · 1.02/5.69) = (0.6287, 3.0475).
6.4.9.
From Excel or by calculation, the sample variance is s² = 148.44, the sample mean is x̄ = 97.24, and n = 25.
The 99% confidence interval for the population variance is
((n − 1)s²/χ²_{α/2,24}, (n − 1)s²/χ²_{1−α/2,24}) = (24 · 148.44/36.41, 24 · 148.44/9.886) = (97.8456, 360.3642).
6.4.11.
x̄ = 13.95, s² = 495.085, 1 − α = 0.98, n = 25.
The 98% confidence interval for σ² is
((n − 1)s²/χ²_{α/2,24}, (n − 1)s²/χ²_{1−α/2,24}) = (24 · 495.085/42.97, 24 · 495.085/10.85) = (276.5194, 1095.1189).
EXERCISES 6.5 6.5.1.
For procedure I, x1 = 98.4, s12 = 235.6, n1 = 10 For procedure II, x2 = 95.4, s22 = 87.15, n2 = 10 α = 0.02 α/2 = 0.01 zα/2 = 2.985 98% confidence interval for difference of mean is ' x1 − x2 ± zα/2
s12 n1
+
s22 n2
= 98.4 − 95.4 ± 2.985 235.6 10 + = (−13.9580, 19.9580)
87.15 10
6.5.3.
x̄₁ = 16.0, s₁ = 5.6, n₁ = 42; x̄₂ = 10.6, s₂ = 7.9, n₂ = 45.
α = 0.01, α/2 = 0.005, z_{α/2} = 2.575.
The 99% confidence interval for the difference of means is
x̄₁ − x̄₂ ± z_{α/2}√(s₁²/n₁ + s₂²/n₂) = 16.0 − 10.6 ± 2.575√((5.6)²/42 + (7.9)²/45) = (1.6388, 9.1612).
6.5.5.
x̄₁ = 58,550, s₁ = 4,000, n₁ = 25; x̄₂ = 53,700, s₂ = 3,200, n₂ = 23.
Since σ₁² = σ₂² but unknown, we can use the pooled estimator
S_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [24(4,000)² + 22(3,200)²]/46, so S_p = 3639.398.
The 90% confidence interval is
x̄₁ − x̄₂ ± t_{α/2,(n₁+n₂−2)} · S_p√(1/n₁ + 1/n₂) = 58,550 − 53,700 ± 2.326 · 3639.398√(1/25 + 1/23) = (2404, 7296).
6.5.7.
x̄₁ = 28.4, s₁ = 4.1, n₁ = 40; x̄₂ = 25.6, s₂ = 4.5, n₂ = 32.
(a) The MLE of μ₁ − μ₂ is x̄₁ − x̄₂.
(b) The 99% confidence interval for μ₁ − μ₂ is
x̄₁ − x̄₂ ± z_{α/2}√(s₁²/n₁ + s₂²/n₂) = 28.4 − 25.6 ± 2.565√((4.1)²/40 + (4.5)²/32) = (0.1678, 5.4322).
6.5.9.
x̄₁ = 148,822, s₁ = 21,000, n₁ = 100; x̄₂ = 155,908, s₂ = 23,000, n₂ = 150.
1 − α = 0.98, α = 0.02, α/2 = 0.01, z_{α/2} = 2.335.
The 98% confidence interval for the difference of means is
x̄₁ − x̄₂ ± z_{α/2}√(s₁²/n₁ + s₂²/n₂) = 148,822 − 155,908 ± 2.335√((21,000)²/100 + (23,000)²/150) = (−13,664, −508).
6.5.11.
x̄₁ = 35.18, s₁² = 19.76, n₁ = 11; x̄₂ = 38.76, s₂² = 12.69, n₂ = 13.
1 − α = 0.9, α = 0.1.
The 90% confidence interval for σ₁²/σ₂² is given by
(s₁²/s₂² · 1/F_{n₁−1,n₂−1,α/2}, s₁²/s₂² · F_{n₂−1,n₁−1,α/2})
= (19.76/12.69 · 1/2.75, 19.76/12.69 · 2.91) = (0.5662, 4.5313).
6.5.13.
x̄₁ = 68.91, s₁² = 281.17, n₁ = 12; x̄₂ = 80.66, s₂² = 117.87, n₂ = 12.
1 − α = 0.95, α = 0.05, α/2 = 0.025.
(a) The 95% confidence interval for the difference of means is
x̄₁ − x̄₂ ± z_{α/2}√(s₁²/n₁ + s₂²/n₂) = 68.91 − 80.66 ± 1.96√(281.17/12 + 117.87/12) = (−23.0525, −0.4475).
(b) The 95% confidence interval for σ₁²/σ₂² is given by
(s₁²/s₂² · 1/F_{11,11,0.025}, s₁²/s₂² · F_{11,11,0.025})
= (281.17/117.87 · 1/3.48, 281.17/117.87 · 3.48) = (0.6855, 8.3055).
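The variance-ratio intervals of 6.5.11 and 6.5.13 can be checked with `scipy.stats.f`; the F-table values used above are rounded, so scipy's exact quantiles give slightly different last digits (6.5.11 numbers below):

```python
from scipy.stats import f

s1_sq, n1 = 19.76, 11
s2_sq, n2 = 12.69, 13
alpha = 0.10

r = s1_sq / s2_sq
lo = r / f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)  # divide by F_{10,12,0.05}
hi = r * f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)  # multiply by F_{12,10,0.05}
print(lo, hi)  # ≈ (0.566, 4.536)
```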
Chapter 7
Hypothesis Testing

EXERCISES 7.1
7.1.1.
(a) H₀: μ = μ₀ vs. H₁: μ > μ₀
(b) H₀: μ = μ₀ vs. H₁: μ > 1.2μ₀
7.1.3.
H₀: p = 0.5 vs. H₁: p > 0.5, n = 15.
(a) α = probability of a type I error = P(reject H₀ | H₀ is true)
= P(Y > 10 | p = 0.5) = 1 − P(Y ≤ 10 | p = 0.5)
= 1 − Σ_{y=0}^{10} C(15, y)(0.5)^y(0.5)^{15−y} = 1 − 0.941 = 0.059.
(b) β = P(accept H₀ | H₀ is false) = P(Y ≤ 9 | p = 0.7)
= Σ_{y=0}^{9} C(15, y)(0.7)^y(0.3)^{15−y} = 0.278.
(c) β = P(Y ≤ 9 | p = 0.6) = Σ_{y=0}^{9} C(15, y)(0.6)^y(0.4)^{15−y} = 0.597.
(d) For α = 0.01: set 0.01 = P(Y ≥ k | p = 0.5). From the binomial table, α = 0.01 falls between k = 2 and k = 3; however, for k = 3, α = 0.018, which exceeds 0.01. If we want to limit α to no more than 0.01, we take k = 2; that is, we reject H₀ if y ≥ 2.
For α = 0.05: set 0.05 = P(Y ≥ k | p = 0.5). From the binomial table, α = 0.05 falls between k = 3 and k = 4; however, for k = 4, α = 0.059, which exceeds 0.05. If we want to limit α to no more than 0.05, we take k = 3; that is, we reject H₀ if y ≥ 3.
(e) When α = 0.01, from part (d) the rejection region is {y ≥ 2}. For p = 0.7,
power = P(Y ≥ 2 | p = 0.7) = 1 − P(Y ≤ 1 | p = 0.7) = 1 − 0.000 = 1, so β ≈ 0.
7.1.5.
n = 25, σ = 4, H₀: μ = 10 vs. H₁: μ > 10.
(a) α = probability of a type I error = P(reject H₀ | H₀ is true)
= P(x̄ > 11.2 | μ = 10) = P((x̄ − μ)/(σ/√n) > (11.2 − 10)/(4/√25) | μ = 10)
= P(z > 1.5) = 0.0668.
(b) β = P(accept H₀ | H₀ is false) = P(x̄ ≤ 11.2 | μ = 11)
= P((x̄ − μ)/(σ/√n) ≤ (11.2 − 11)/(4/√25) | μ = 11) = P(z ≤ 0.25) = 0.5987.
(c) μ₀ = 10, μₐ = 11, z_α = z₀.₀₁ = 2.33, z_β = −0.525.
n = (z_α + z_β)²σ²/(μₐ − μ₀)² = (2.33 − 0.525)² · 4²/(11 − 10)² = 52.13, rounded up to 53.
7.1.9.
σ² = 16, H₀: μ = 25 vs. H₁: μ ≠ 25.
n = (z_{α/2} + z_β)²σ²/(μₐ − μ₀)² = (1.645 + 1.645)² · 16/(1)² = 173.19, rounded up to 174.
EXERCISES 7.2
7.2.1.
H₀: μ = μ₀ vs. H₁: μ = μ₁.
L(μ) = [1/((√(2π))ⁿσⁿ)] exp{−Σ(x_i − μ)²/(2σ²)}, so
L(μ₀)/L(μ₁) = exp{[Σ(x_i − μ₁)² − Σ(x_i − μ₀)²]/(2σ²)}
and
ln[L(μ₀)/L(μ₁)] = [Σx_i² − 2nx̄μ₁ + nμ₁² − (Σx_i² − 2nx̄μ₀ + nμ₀²)]/(2σ²)
= [2nx̄(μ₀ − μ₁) − n(μ₀ − μ₁)(μ₀ + μ₁)]/(2σ²)
= (μ₀ − μ₁)Σx_i/σ² − n(μ₀² − μ₁²)/(2σ²).
Therefore, the most powerful test rejects H₀ if
(μ₀ − μ₁)Σx_i/σ² − n(μ₀² − μ₁²)/(2σ²) ≤ ln k,
that is, if
(μ₀ − μ₁)Σx_i/σ² ≤ ln k + n(μ₀² − μ₁²)/(2σ²).
If μ₀ < μ₁, dividing by (μ₀ − μ₁) < 0 reverses the inequality, and the rejection region is Σx_i ≥ C, where
C = [σ²/(μ₀ − μ₁)][ln k + n(μ₀² − μ₁²)/(2σ²)].
If μ₀ > μ₁, the rejection region is Σx_i ≤ C.
7.2.5.
xi ≥ C
H0 : η = η 0 H1 : η < η 0 2y −y2 /η2 e x>0 η2 n 2 2 n 4 2yi L(η) = e− yi /η η2 i=1 n 2 2 n 4 2yi L(η0 ) = e− yi /η0 2 η0 i=1 n 2 2 n 4 2yi L(η1 ) = e− yi /η1 2 η1 i=1
f (y) =
n 4 2yi 2
η0
2 2 2 2 L(η0 ) i=1 = n n e yi /η1 − yi /η0 4 2yi L(η1 ) η21
i=1 2n
= 6
η1 η0
e
yi2 /η21 −
yi2 /η20
5
η1 0) ln L(η = 2n ln yi2 /η21 − yi2 /η20 ≤ ln k L(η1 ) η0 +
2 yi ≤ C 5 6 2 2 η0 η1 Where C = ln k − 2n ln ηη10 η2 −η2 0
7.2.7.
1
H₀: p = p₀ vs. H₁: p = p₁, where p₁ > p₀.
f(x; p) = pˣ(1 − p)^{1−x}, so
L(p) = p^{Σx_i}(1 − p)^{n−Σx_i},
and the most powerful test rejects H₀ when
L(p₀)/L(p₁) = (p₀/p₁)^{Σx_i} [(1 − p₀)/(1 − p₁)]^{n−Σx_i} ≤ k.
Taking natural logarithms,
Σx_i ln(p₀/p₁) + (n − Σx_i) ln[(1 − p₀)/(1 − p₁)] ≤ ln k
[ln(p₀/p₁) − ln((1 − p₀)/(1 − p₁))]Σx_i ≤ ln k − n ln[(1 − p₀)/(1 − p₁)].
Since p₁ > p₀, the coefficient of Σx_i is negative, so dividing reverses the inequality. To find the rejection region for a fixed value of α, write the region as
Σx_i ≥ C, where C = {ln k − n ln[(1 − p₀)/(1 − p₁)]} / {ln(p₀/p₁) − ln[(1 − p₀)/(1 − p₁)]}.
EXERCISES 7.3
7.3.1.
f(x) = (1/(√(2π)σ)) exp{−(x − μ)²/(2σ²)}, so
L(σ²) = [1/((√(2π))ⁿσⁿ)] exp{−Σ(x_i − μ)²/(2σ²)}.
Here Θ₀ = {σ₀²} and Θₐ = ℝ⁺ − {σ₀²}. Hence
L(Θ̂₀) = [1/((√(2π))ⁿσ₀ⁿ)] exp{−Σ(x_i − μ)²/(2σ₀²)}.
Since the only unknown parameter in the parameter space is σ², the maximum of the likelihood is achieved when σ² equals its maximum likelihood estimator
σ̂²_mle = (1/n)Σ(x_i − x̄)².
Thus
λ = (σ̂²_mle/σ₀²)^{n/2} exp{Σ(x_i − μ)²/(2σ̂²_mle) − Σ(x_i − μ)²/(2σ₀²)}
= [Σ(x_i − x̄)²/(nσ₀²)]^{n/2} exp{nΣ(x_i − μ)²/(2Σ(x_i − x̄)²) − Σ(x_i − μ)²/(2σ₀²)}.
The likelihood ratio test has rejection region λ ≤ k, which is equivalent to
(n/2) ln[Σ(x_i − x̄)²/(nσ₀²)] + nΣ(x_i − μ)²/(2Σ(x_i − x̄)²) − Σ(x_i − μ)²/(2σ₀²) ≤ ln k.
7.3.3.
f(x) = (1/(√(2π)σ)) exp{−(x − μ)²/(2σ²)}, so
L(σ₁²) = [1/((√(2π))ⁿσ₁ⁿ)] exp{−Σ(x_i − μ)²/(2σ₁²)} and
L(σ₂²) = [1/((√(2π))ⁿσ₂ⁿ)] exp{−Σ(y_i − μ)²/(2σ₂²)}.
Let
λ = L(σ₂²)/L(σ₁²) = (σ₁/σ₂)ⁿ exp{Σ(x_i − μ)²/(2σ₁²)... }; equivalently, the likelihood ratio test has rejection region λ ≤ k, i.e.,
n ln(σ₂/σ₁) + Σ(y_i − μ)²/(2σ₂²) − Σ(x_i − μ)²/(2σ₁²) ≤ ln k
Σ(y_i − μ)²/σ₂² − Σ(x_i − μ)²/σ₁² ≤ 2 ln k − 2n ln(σ₂/σ₁).
With σ̂₁² = (1/n)Σ(x_i − x̄)² and σ̂₂² = (1/n)Σ(y_i − ȳ)², the rejection region is
nΣ(y_i − μ₂)²/Σ(y_i − ȳ)² − nΣ(x_i − μ₁)²/Σ(x_i − x̄)² ≤ C.
7.3.5.
f(x) = (1/θ)e^{−x/θ} for x > 0, so
L(θ) = (1/θⁿ) exp{−Σx_i/θ},
L(θ₀) = (1/θ₀ⁿ) exp{−Σx_i/θ₀}, L(θ₁) = (1/θ₁ⁿ) exp{−Σx_i/θ₁},
and
L(θ₀)/L(θ₁) = (θ₁/θ₀)ⁿ exp{Σx_i/θ₁ − Σx_i/θ₀}.
We reject the null hypothesis if
n ln(θ₁/θ₀) + (1/θ₁ − 1/θ₀)Σx_i ≤ ln k,
i.e., [(θ₀ − θ₁)/(θ₀θ₁)]Σx_i ≤ ln k − n ln(θ₁/θ₀).
Thus we reject the null hypothesis if Σx_i ≤ m₁ or Σx_i ≥ m₂, where
m₁ = [ln k − n ln(θ₁/θ₀)] · θ₁θ₀/(θ₀ − θ₁) applies when θ₁ < θ₀, and m₂ is the analogous bound when θ₁ > θ₀.
EXERCISES 7.4
7.4.1.
n = 50, α = 0.02, x̄ = 62, s = 8.
H₀: μ ≥ 64 vs. H₁: μ < 64.
(a) The observed test statistic (n ≥ 30) is
z = (x̄ − μ₀)/(s/√n) = (62 − 64)/(8/√50) = −2/1.13 = −1.769.
(b) p-value = P(z < −1.769) = P(z > 1.769) = 0.0384.
(c) The smallest α at which H₀ would be rejected is 0.0384. Since the p-value 0.0384 > 0.02, we fail to reject the null hypothesis.
7.4.3.
H₀: μ = 0.45 vs. H₁: μ < 0.45.
Here x̄ = 22.8/78 = 0.2923, s² = 20.5/77 = 0.2666, and s = 0.16309.
(a) Test statistic: z = (x̄ − 0.45)/(s/√n) = (0.2923 − 0.45)/(0.16309/√78) = −0.85.
The rejection region is {z < −z₀.₀₁} = {z < −2.33}. Since z = −0.85 > −2.33, we fail to reject the null hypothesis at α = 0.01.
p-value = P(z < −0.85) = P(z > 0.85) = 0.1977.
(b) Here z = −0.85 and z_{α/2} = z₀.₀₀₅ = 2.58.
The rejection region is {z < −2.58 or z > 2.58}.
p-value = 2P(z > 0.85) = 0.3954.
(c) Assumptions: even though the population standard deviation is unknown, because of the large sample size a normal distribution is assumed.
7.4.5.
x̄ = $58,800, n = 15, s = $8,300.
(a) P(reject H₀ | H₀ is true) = 0. Since the probability of rejecting the null hypothesis equals zero, the null hypothesis is never rejected.
(b) α = 0.01, H₀: μ = 55,648 vs. H₁: μ > 55,648.
t = (x̄ − μ)/(s/√n) = (58,800 − 55,648)/(8,300/√15) = 3152/2143.05 = 1.47.
The rejection region is {t > t₀.₀₁,₁₄} with t₀.₀₁,₁₄ = 2.624. Since t = 1.47 < 2.624, we fail to reject the null hypothesis.
7.4.7.
H₀: p = 0.3 vs. H₁: p > 0.3.
p̂ = 550/1500 = 0.366.
z = (p̂ − p₀)/√(p₀q₀/n) = (0.366 − 0.3)/√(0.3 · 0.7/1500) = 0.066/0.0118 = 5.593.
The rejection region is {z > z₀.₀₁} = {z > 2.33}. Since 5.593 > 2.33, we reject H₀: yes, customers have a preference for the ivory color.
7.4.9.
(a) x̄ = 42.9, s = 6.3674, α = 0.1, n = 10.
H₀: μ = 44 vs. H₁: μ ≠ 44. The data are normally distributed.
z = (x̄ − μ)/(s/√n) = (42.9 − 44)/(6.3674/√10) = −1.1/2.013 = −0.546.
The rejection region for z is {|z| > z₀.₀₅}, where z₀.₀₅ = 1.645. Since −1.645 < −0.546, we fail to reject the null hypothesis.
(b) The 90% confidence interval for μ is
(x̄ − z_{α/2} · s/√n, x̄ + z_{α/2} · s/√n) = (42.9 − 1.645 · 2.013, 42.9 + 1.645 · 2.013) = (39.588, 46.211).
(c) From (a), we can see that it is reasonable to take μ = 44. The argument is supported by the confidence interval in (b), which contains 44.
7.4.11.
x̄ = 13.7, s = 1.655, n = 20.
H₀: μ = 14.6 vs. H₁: μ < 14.6.
t = (13.7 − 14.6)/(1.655/√10) = −0.9/0.523 = −1.7198.
The rejection region is {t < −2.33}. Since t = −1.7198 > −2.33, we fail to reject the null hypothesis; there is not sufficient statistical evidence to support the claim.
7.4.13.
x̄ = 32,277, s = 1,200, n = 100, α = 0.05.
H₀: μ = 30,692 vs. H₁: μ > 30,692.
z = (32,277 − 30,692)/(1,200/√100) = 13.20.
The rejection region is {z > z₀.₀₅} with z₀.₀₅ = 1.645. Hence we reject the null hypothesis: the expenditure per consumer increased from 1994 to 1995.
7.4.15.
H₀: μ = 1.129 vs. H₁: μ ≠ 1.129.
t = (1.24 − 1.129)/(0.01/√24) = 0.111/0.00204 = 54.41.
t₀.₀₀₅,₂₃ = 2.807. Since t = 54.41 > 2.807, we reject the null hypothesis. That is, the price of gas has changed recently.
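The one-sample t statistics in this section can be computed directly from the summary statistics; here the 7.4.15 numbers are used (scipy's `t.sf` gives the corresponding p-value):

```python
import math
from scipy.stats import t

xbar, mu0, s, n = 1.24, 1.129, 0.01, 24
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_value = 2 * t.sf(abs(t_stat), n - 1)  # two-sided p-value
print(round(t_stat, 1))  # 54.4
```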
EXERCISES 7.5
7.5.1.
H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ ≠ 0.
z = [(ȳ₁ − ȳ₂) − (μ₁ − μ₂)]/√(s₁²/n₁ + s₂²/n₂) = (74 − 71)/√(81/50 + 100/50) = 3/1.9026 = 1.5767.
The rejection region for z is {|z| > z₀.₀₂₅}, where z₀.₀₂₅ = 1.96. Since 1.5767 < 1.96, we fail to reject the null hypothesis. To see a significant difference we would need an α = 0.0582 level of significance.
7.5.3.
x̄₁ = 58,550, x̄₂ = 53,700, s₁ = 4000, s₂ = 3200, n₁ = 25, n₂ = 23.
H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ > 0.
S_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [24(4000)² + 22(3200)²]/(25 + 23 − 2) = 13,245,217
S_p = 3639.398.
t = [(x̄₁ − x̄₂) − 0]/[S_p√(1/n₁ + 1/n₂)] = (58,550 − 53,700)/[3639.398√(1/25 + 1/23)] = 4.61288.
The rejection region is {t > t₀.₀₅,₄₆} = {t > 1.679}. Since 4.61288 > 1.679, we reject the null hypothesis: there is significant evidence that males' salaries are higher than females'.
7.5.5.
x̄₁ = 105.9, x̄₂ = 100.5, s₁² = 0.21, s₂² = 0.19, n₁ = 80, n₂ = 100.
H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ ≠ 0.
Use t = [(x̄₁ − x̄₂) − 0]/√(s₁²/n₁ + s₂²/n₂) and a two-sided t-test.
7.5.7.
x̄₁ = 7.65, x̄₂ = 9.75, s₁ = 0.9312, s₂ = 0.852, n₁ = 10, n₂ = 10.
(a) H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ > 0.
S_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [9(0.9312)² + 9(0.852)²]/18
S_p = 0.892479.
t = [(x̄₁ − x̄₂) − 0]/[S_p√(1/10 + 1/10)] = (7.65 − 9.75)/[0.892479√(1/10 + 1/10)] = −5.276.
The rejection region is {t > t₀.₀₅,₁₈} = {t > 1.734}. Thus we fail to reject the null hypothesis.
(b) H₀: σ₁² = σ₂² vs. H₁: σ₁² ≠ σ₂².
Test statistic F = s₁²/s₂² = (0.9312)²/(0.852)² = 1.1945.
From the F-table, F₀.₀₂₅(9,9) = 4.03 and F₀.₉₇₅(9,9) = 1/4.03 = 0.248.
The rejection region is F > 4.03 or F < 0.248. Since 0.248 < 1.1945 < 4.03, we fail to reject the null hypothesis.
(c) H₀: μ_d = 0 vs. H₁: μ_d < 0.
d̄ = −2.1, S_d = 1.1670.
t = (d̄ − 0)/(S_d/√n) = −3.16.
From the t-table, −t₀.₀₅,₉ = −1.833. Since −3.16 < −1.833, we reject the null hypothesis; this implies that the downstream mean is less than the upstream mean.
7.5.9.
(a) x̄₁ = 2.04, x̄₂ = 3.55, s₁ = 0.551, s₂ = 0.6958, n₁ = 14, n₂ = 14.
H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ < 0.
Since the variances are equal but unknown,
S_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = [13(2.04)² + 13(3.55)²]/26 = 8.38
S_p = √8.38 = 2.89.
t = [(x̄₁ − x̄₂) − 0]/[S_p√(1/13 + 1/13)] = (2.04 − 3.55)/[2.89√(1/13 + 1/13)] = −1.13321.
The rejection region is {t < −t₀.₀₅,₁₃} = {t < −1.77}. Here −1.13321 > −1.77, so we fail to reject the null hypothesis.
(b) Test statistic F = s₁²/s₂² = 0.6281.
From the F-table, F₀.₀₂₅(13,13) = 3.11 and F₀.₉₇₅(13,13) = 1/3.11 = 0.321.
The rejection region is F > 3.11 or F < 0.321. Since 0.321 < 0.6281 < 3.11, we fail to reject the null hypothesis.
(c) H₀: μ_d = 0 vs. H₁: μ_d < 0.
d̄ = −1.5071, S_d = 0.7467.
t = (d̄ − 0)/(S_d/√n) = −1.5071/(0.7467/√14) = −7.552.
From the t-table, −t₀.₀₅,₁₃ = −1.771. Since −7.552 < −1.771, we reject the null hypothesis.
7.5.11.
x̄₁ = 106, x̄₂ = 109, s₁ = 10, s₂ = 7, n₁ = 17, n₂ = 14.
Test statistic F = s₁²/s₂² = 100/49 = 2.0408.
From the F-table, F₀.₀₁(10,7) = 2.585 and its reciprocal is 1/2.585 = 0.3868.
The rejection region is F > 2.585 or F < 0.3868. Since the observed value of the test statistic satisfies 0.3868 < 2.0408 < 2.585, we fail to reject the null hypothesis.
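The pooled two-sample t computation used in 7.5.3 and 7.5.7 can be wrapped in a helper; the 7.5.3 numbers are reproduced as the check:

```python
import math

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled-variance two-sample t statistic (equal-variance assumption)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    sp = math.sqrt(sp2)
    t = (x1 - x2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp, t

sp, t = pooled_t(58550, 4000, 25, 53700, 3200, 23)
print(round(sp, 1), round(t, 2))  # 3639.4 4.61
```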
EXERCISES 7.6 7.6.1.
c = 3, r = 3 (c − 1)(r − 1) = 4 2 = 9.48 χ0.05,4 Hence the rejection region is ϕ2 > 9.48 By using contingency table ϕ2 = 43.86 ϕ2 falls in the rejection region at α = 0.05, we reject the null hypothesis. That is collective bargaining is dependent on employee classification.
7.6.3.
O1 = 12, O2 = 14, O3 = 78, O4 = 40, O5 = 6 We now compute πi (i = 1, 2, 3, 4, 5) using continuity correction π1 = p(x ≤ 55) π2 = p z ≤ 65.5−70 4 π3 = p z ≤ 75.5−70 4 π4 = p z ≤ 85.5−70 4 π5 = p z ≤ 95.5−70 4 Taking above probability we need to find Ei and follow 6.6.5.
7.6.5.
E1 = 950 · 0.35, E2 = 950 · 0.15, E3 = 950 · 0.20, E4 = 950 · 0.30 O1 = 950 · 0.45, O2 = 950 · 0.25, O3 = 950 · 0.02, O4 = 950 · 0.28 ϕ2 = 834.7183 From Chi-Square table 2 = 7.81 χ0.05,3 Thus ϕ2 = 834.71 > 7.81, we reject the null hypothesis, at least one probabilities is different from hypothesized value.
Chapter 8
Linear Regression Models

EXERCISES 8.2
8.2.1.
(a) For the proof, see Example 8.2.1.
(b) SSE/σ² follows a central chi-square distribution with n − 2 degrees of freedom, so E(SSE/σ²) = n − 2. Since σ² is a constant, E(SSE) = (n − 2)σ².
8.2.3.
(a) The least-squares regression line is ŷ = −84.1674 + 5.0384x.
(b) [Scatter plot of y versus x with the fitted regression line.]
8.2.5.
(a) Check the proof in the derivation of β̂₀ and β̂₁.
(b) We know the line of best fit is ŷ = β̂₀ + β̂₁x; plugging in the point (x̄, ȳ) gives β̂₀ = ȳ − β̂₁x̄, which completes the proof.
8.2.7.
y = β₁x + ε.
SSE = Σ_{i=1}ⁿ e_i² = Σ_{i=1}ⁿ [y_i − β̂₁x_i]².
∂(SSE)/∂β₁ = −2Σ_{i=1}ⁿ [y_i − β̂₁x_i]x_i = −2Σ_{i=1}ⁿ [x_i y_i − β̂₁x_i²] = 0,
so β̂₁ = Σ_{i=1}ⁿ x_i y_i / Σ_{i=1}ⁿ x_i².
ŷ = −40.175 + 0.9984x.
8.2.9.
The least-squares regression line is ŷ = 0.62875 + 0.83994x.
[Scatter plot of y versus x with the fitted regression line.]
8.2.11.
The least-squares regression line is ŷ = 2.2752 + 0.00578x.
[Scatter plot of y versus x with the fitted regression line.]
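The no-intercept estimator derived in 8.2.7 is a one-liner; the data below are hypothetical, chosen to lie exactly on y = 2x so the slope is recovered exactly:

```python
def slope_through_origin(xs, ys):
    # beta1_hat = sum(x*y) / sum(x^2), from 8.2.7
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print(slope_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 2.0
```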
EXERCISES 8.3
8.3.1.
(a) The least-squares regression line is ŷ = 57.2383 − 0.4367x.
(b) [Scatter plot of y versus x with the fitted regression line.]
(c) The 95% confidence interval for β₀ is (40.5929, 73.8837); the 95% confidence interval for β₁ is (−0.6806, −0.1928).
8.3.3.
β̂₁ and ȳ are both normally distributed. In order to show these two are independent, we just need to show that their covariance is 0 (a property of the normal distribution).
Cov(β̂₁, ȳ) = E(β̂₁ · ȳ) − E(β̂₁) · E(ȳ)
= E[(S_xy/S_xx) · ȳ] − β₁ · E[(1/n)Σ_{i=1}ⁿ y_i]
= E[ Σ(x_i − x̄)(y_i − ȳ)/Σ(x_i − x̄)² · ȳ ] − β₁ · (β₀ + β₁x̄)
= β₁ · (β₀ + β₁x̄) − β₁β₀ − β₁²x̄
= 0.
8.3.5.
(a) The least-squares regression line is ŷ = −474.76 + 24.95x.
(b) [Scatter plot of y versus x with the fitted regression line.]
(c) The 95% confidence interval for β₀ is (−3418.974, 2469.454); the 95% confidence interval for β₁ is (−45.429, 95.329).
8.3.7.
By assumption and from the normal equations we know Σε_i = 0 and Σε_i x_i = 0, where ε_i = y_i − ŷ_i. Then
Σ(y_i − ȳ)² = Σ(y_i − ŷ_i + ŷ_i − ȳ)²
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2Σ(y_i − ŷ_i)(ŷ_i − ȳ)
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2Σε_i(ŷ_i − ȳ)
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2Σε_iŷ_i − 2ȳΣε_i
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2Σε_i(β̂₀ + β̂₁x_i) − 2ȳΣε_i
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2β̂₀Σε_i + 2β̂₁Σε_i x_i − 2ȳΣε_i
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)² + 2β̂₀ · 0 + 2β̂₁ · 0 − 2ȳ · 0
= Σ(y_i − ŷ_i)² + Σ(ŷ_i − ȳ)².
EXERCISES 8.4
8.4.1.
The 95% prediction interval for x = 92 is from 83.6195 to 111.7445. We can conclude with 95% confidence that the true value of Y at the point x = 92 will be somewhere between 83.6195 and 111.7445.
8.4.3.
The 95% prediction interval for x = 12 is from 84.20125 to 113.2464.
8.4.5.
The 95% prediction interval for x = 85 is from 128.7849 to 191.9912. We can conclude with 95% confidence that the true value of Y at the point x = 85 will be somewhere between 128.7849 and 191.9912. The assumption is that the linear regression remains valid beyond the observed domain, so that the predicted y makes sense.

EXERCISES 8.5
8.5.1.
(a) At the 95% confidence level, the test statistic z is 1.3853, which is less than the critical value. Therefore we do not reject H₀, which means X and Y are independent.
(b) The p-value is 0.08297713.
(c) The assumption is that (x, y) follows the bivariate normal distribution, and this test procedure is approximate.
8.5.3.
E(γ) = E(S_xy/√(S_xx S_yy)) ≠ σ_xy/√(σ_xx σ_yy) = ρ.
Therefore γ is not an unbiased estimator of the population correlation coefficient ρ.
8.5.5.
(a) At the 95% confidence level, the test statistic z is 1.237125, which is less than the critical value. Therefore we do not reject H₀, which means X and Y are independent.
(b) The p-value is 0.108.
(c) ŷ = 77.87 + 0.624x.
(d) The usefulness of this model is that we can use it to make predictions.
(e) The assumption is that (x, y) follows the bivariate normal distribution, and this test procedure is approximate.
EXERCISES 8.6
8.6.1.
(a) ŷ = Xβ̂, where ŷ = [ŷ₁, ŷ₂, ŷ₃, ŷ₄]ᵀ is 4×1, X = [1  X₁  X₂] is 4×3, and β̂ = [β̂₁, β̂₂, β̂₃]ᵀ is 3×1.
(b)
XᵀX =
[ 4    9   11 ]
[ 9   23   24 ]
[ 11  24   39 ]
(XᵀX)⁻¹ =
[  3.41489  −.92553  −.3936 ]
[ −.92553    .37234   .0319 ]
[ −.3936     .0319    .117  ]
XᵀY = [18, 41, 47]ᵀ
(c) β̂ = [5.02128, .10638, −.2766]ᵀ
(d) The estimate of the error variance is 2.14286.
8.6.3.
ŷ = −84.1674 + 5.0384x
XᵀX =
[   8    347 ]
[ 347  16807 ]
(XᵀX)⁻¹ =
[ 1.19648  −.0247    ]
[ −.0247    .0005695 ]
XᵀY = [1075, 55475]ᵀ
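The matrix computations in 8.6.1 follow the normal equations β̂ = (XᵀX)⁻¹XᵀY; a sketch on a small hypothetical design matrix (the data here are made up so that the fit is exact):

```python
import numpy as np

# Hypothetical data; the design matrix carries a column of ones for the intercept
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # exactly y = 1 + 2x

beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(beta_hat)  # ≈ [1. 2.]

# Error-variance estimate would be SSE / (n - p); here the fit is exact
resid = y - X @ beta_hat
sse = resid @ resid
```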
Chapter 9
Design of Experiments

EXERCISES 9.2
9.2.1.
The response is the amount of fat absorbed by hash-brown potatoes.
The factors are frying duration and type of fat. Frying duration is quantitative and continuous; type of fat is qualitative and discrete.
Treatments: there are 16 treatments:
2 min with animal fat I, 2 min with animal fat II, 2 min with vegetable fat I, and 2 min with vegetable fat II;
3 min with animal fat I, 3 min with animal fat II, 3 min with vegetable fat I, and 3 min with vegetable fat II;
4 min with animal fat I, 4 min with animal fat II, 4 min with vegetable fat I, and 4 min with vegetable fat II;
5 min with animal fat I, 5 min with animal fat II, 5 min with vegetable fat I, and 5 min with vegetable fat II.
9.2.3.
Procedure for random assignment:
1. Number the experimental units from 1 to 30.
2. Use a random number table or statistical software to get a list of numbers that is a random permutation of the numbers 1 to 30.
3. Give treatment 1 to the experimental units having the first 10 numbers in the list, treatment 2 to the next 10 numbers in the list, and treatment 3 to the last 10 numbers in the list.
Here the response is the growth of the rose bushes and the factor is the fertilizer. The random permutation is 24, 12, 30, 21, 8, 3, 20, 1, 11, 18, 13, 15, 28, 5, 25, 29, 4, 10, 14, 19, 26, 9, 2, 6, 22, 16, 23, 7, 27, 17, giving:
Brand A: 24, 12, 30, 21, 8, 3, 20, 1, 11, 18
Brand B: 13, 15, 28, 5, 25, 29, 4, 10, 14, 19
Brand C: 26, 9, 2, 6, 22, 16, 23, 7, 27, 17
9.2.5.
Procedure for a randomized complete block design with 3 replications:
1. Group the experimental units into 3 groups (called blocks), each containing 3×3 homogeneous experimental units.
2. In block 1, number the experimental units from 1 to 9 and generate a list of numbers that is a random permutation of the numbers 1 to 9.
3. In block 1, assign treatment 1 to the experimental units having the first 3 numbers in the list, treatment 2 to the next 3 numbers in the list, and so on until treatment 3 receives 3 experimental units.
4. Repeat steps 2 and 3 for the remaining blocks of experimental units.
Block G: 3(A) 1(A) 2(A) 6(B) 8(B) 7(B) 9(C) 5(C) 4(C)
Block R: 5(A) 4(A) 3(A) 9(B) 1(B) 6(B) 8(C) 7(C) 2(C)
Block J: 1(A) 2(A) 4(A) 7(B) 8(B) 5(B) 6(C) 3(C) 9(C)
9.2.7.
19, 37, 52, 42, 13, 34, 56, 48, 44, 43, 24, 12, 5, 32, 40, 23, 41, 11, 10, 6, 30, 26, 18, 8, 2, 29, 21, 36, 1, 54, 20, 39, 33, 27, 49, 16, 51, 15, 28, 47, 53, 35, 31, 3, 38, 25, 17, 55, 4, 50, 14, 22, 9, 46, 45, 7 Brand A B C D
19 40 1 31
37 23 54 3
52 41 20 38
42 11 39 25
13 10 33 17
34 6 27 55
Subject 56 48 30 26 49 16 4 50
44 18 51 14
43 8 15 22
24 2 28 9
12 29 47 46
5 21 53 45
32 36 35 7
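The blocked assignment of 9.2.5 can be generated the same way, permuting unit numbers within each block and dealing them out evenly. A sketch under stated assumptions: the function name, seed, and treatment labels are mine, not from the text.

```python
import random

def assign_rcbd(n_blocks=3, treatments=("A", "B", "C"), reps=3, seed=7):
    """Randomized complete block design: within each block of
    len(treatments)*reps homogeneous units, randomly permute the unit
    numbers and give each treatment to `reps` of them."""
    rng = random.Random(seed)
    design = []
    for _ in range(n_blocks):
        units = list(range(1, len(treatments) * reps + 1))
        rng.shuffle(units)                      # fresh permutation per block
        block = {}
        for i, t in enumerate(treatments):
            for u in units[i * reps:(i + 1) * reps]:
                block[u] = t
        design.append(block)
    return design

design = assign_rcbd()
```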
Student’s Solutions Manual 89
9.2.9.
9.2.9.
Start with the standard cyclic 5 × 5 Latin square (letters A–E denote the new materials; rows are days):

Day 1: A B C D E
Day 2: B C D E A
Day 3: C D E A B
Day 4: D E A B C
Day 5: E A B C D

Randomly rearrange the rows (the random order drawn was 2, 3, 5, 1, 4) and then randomize the columns. The final randomized square is:

            New material
Days     1   2   3   4   5
  1      D   A   E   B   C
  2      A   C   B   D   E
  3      B   D   C   E   A
  4      E   B   A   C   D
  5      C   E   D   A   B

9.2.11.
            Grid
         1   2   3   4
  1      D   A   B   C
  2      C   B   D   A
  3      A   D   C   B
  4      B   C   A   D
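Both Latin squares above can be produced by randomizing the standard cyclic square, since permuting whole rows and whole columns preserves the Latin property. A sketch; the function name and seed are illustrative choices.

```python
import random

def latin_square(treatments="ABCDE", seed=3):
    """Build the cyclic standard square (row i is the treatment list
    shifted by i) and randomize it by permuting whole rows and whole
    columns, each of which keeps every row/column a permutation."""
    n = len(treatments)
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    rows, cols = list(range(n)), list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    return [[square[r][c] for c in cols] for r in rows]

sq = latin_square()
```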
EXERCISES 9.3 9.3.1.
One-factor-at-a-time experiment to predict average profit: if we fix the proportion at 40% and increase the quality from ordinary to fine, the average profit decreases from 25,000 to 10,000. If we fix the proportion at 60% and increase the quality from ordinary to fine, the average profit decreases from 9,500 to 3,000.
Then, if we change the setting from (40%, ordinary) to (60%, fine), the average profit drops from 25,000 to 3,000.
9.3.3.
In a fractional factorial experiment, only a fraction of the possible treatments are actually used. A full factorial design is the ideal design, through which we could obtain information on all main effects and interactions, but because of the prohibitive size of such experiments they are often not practical to run. The total number of distinct treatments here would be 2 × 2 = 4. In a fractional factorial experiment, trials are conducted only at a well-balanced subset of the possible combinations of factor levels. This allows the experimenter to obtain information about all main effects and interactions while keeping the size of the experiment manageable. The experiment is carried out in a single systematic effort.
EXERCISES 9.4
9.4.1. σ1 = √4 = 2 and σ2 = √9 = 3. Then
n1 = [2/(2 + 3)]100 = 40 and n2 = [3/(2 + 3)]100 = 60.
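The allocation rule used here, sample sizes proportional to the stratum standard deviations, is easy to check numerically. A minimal sketch; the helper name is mine.

```python
import math

def neyman_allocation(total, sigmas):
    """Split a total sample size across strata in proportion to the
    stratum standard deviations (equal sampling costs assumed)."""
    s = sum(sigmas)
    return [total * sig / s for sig in sigmas]

n1, n2 = neyman_allocation(100, [math.sqrt(4), math.sqrt(9)])
```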
EXERCISES 9.5
9.5.1. L̄ = k[(x̄ − T)² + S²] = k[(14.15 − 14.5)² + (0.42)²] = 0.2989k.
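A hedged sketch of the loss computation; the helper name is mine, and the inputs are the values given in the exercise (with which the coefficient of k evaluates to about 0.2989).

```python
def expected_loss_over_k(ybar, s, target):
    """Coefficient of k in the expected loss E[L] = k[(ybar - T)^2 + S^2]."""
    return (ybar - target) ** 2 + s ** 2

coef = expected_loss_over_k(14.15, 0.42, 14.5)
```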
Chapter 10
Analysis of Variance
EXERCISES 10.2
10.2.1.
(a) We need to test H0 : μ1 = μ2 vs. H1 : μ1 ≠ μ2. From the random samples we obtain the needed estimates: n1 = n2 = 9, ȳ1 = 1.888889, ȳ2 = 2.777778, Σi Σj y²ij = 120, Σi Σj yij = 42,
Total SS = Σi Σj (yij − ȳ)² = 22, SSE = Σi Σj (yij − ȳi)² = 18.4444, SST = 3.55556,
where Total SS = SSE + SST. Then MST = SST/1 = 3.55556, MSE = SSE/(n1 + n2 − 2) = 1.152778, and F = MST/MSE = 3.084337.
With α = 0.05, Fα,(1, n1+n2−2) = 4.49. Since 3.0843 is not greater than 4.49, H0 is not rejected. There is not enough evidence to indicate that the means differ for the two populations.
(b) S² = S²p = MSE = 1.152778, ȳ1 = 1.888889, ȳ2 = 2.777778. Then the t-statistic is
T = (ȳ1 − ȳ2)/√(S²(1/n1 + 1/n2)) = −1.7562.
Now t0.025,16 = 2.12, and the rejection region is {|t| > 2.12}. Since |−1.7562| is not greater than 2.12, H0 is not rejected, which implies that there is no significant difference between the means of the two populations. Also t² = F, implying that in the two-sample case the t-test and the F-test lead to the same result.
10.2.3.
(a) At α = 0.01, we need to test H0 : μ1 = μ2 vs. H1 : μ1 ≠ μ2. We need the estimates ȳ1 = 48.84615, ȳ2 = 45.91667, Σi Σj y²ij = 56582, Σi Σj yij = 1186,
Total SS = Σi Σj (yij − ȳ)² = 318.16, SSE = Σi Σj (yij − ȳi)² = 264.609, SST = 53.551,
where Total SS = SSE + SST. Then MST = SST/1 = 53.551, MSE = SSE/(n1 + n2 − 2) = 11.50474, and F = MST/MSE = 4.654693. At α = 0.01, Fα,(1, n1+n2−2) = F0.01(1, 23) = 7.88. Since 4.6547 is not greater than 7.88, H0 is not rejected.
(b) Assumptions: the samples are assumed to be independent, from normal populations with respective means μ1, μ2 and equal but unknown variances.
(c) S² = S²p = MSE = 11.50474, ȳ1 = 48.84615, ȳ2 = 45.91667. Then the t-statistic is
t = (ȳ1 − ȳ2)/√(S²(1/n1 + 1/n2)) = 2.1574.
Now t0.005,23 = 2.807, and the rejection region is {|t| > 2.807}. Since 2.1574 is not greater than 2.807, H0 is not rejected, which implies that there is no significant difference between the mean time to relief for the two populations; t² = F again shows that in the two-sample case the t-test and the F-test lead to the same result.
10.2.5.
Let Xi ∼ N(μ1, σ²), i = 1, 2, ..., n1, and Yj ∼ N(μ2, σ²), j = 1, 2, ..., n2, be two sets of independent random variables. To test H0 : μ1 = μ2 vs. Ha : μ1 ≠ μ2, we reject H0 when
|X̄ − Ȳ| / √(S²p(1/n1 + 1/n2)) > t_(n1+n2−2, α/2).
Now, for the ANOVA, with grand mean μ̂ = (Σ xi + Σ yi)/(n1 + n2), we have
F = MST/MSE = {[n1(x̄ − μ̂)² + n2(ȳ − μ̂)²]/(2 − 1)} / {[Σ(xi − x̄)² + Σ(yi − ȳ)²]/(n1 + n2 − 2)}
= [n1 x̄² + n2 ȳ² − (n1 + n2)μ̂²] / S²p
= [n1 n2/(n1 + n2)][x̄² − 2 x̄ ȳ + ȳ²] / S²p, since μ̂ = (n1 x̄ + n2 ȳ)/(n1 + n2),
= (x̄ − ȳ)² / [(1/n1 + 1/n2) S²p].
Therefore F = (X̄ − Ȳ)² / [S²p(1/n1 + 1/n2)], and we reject H0 if (X̄ − Ȳ)² / [S²p(1/n1 + 1/n2)] > F_(1, n1+n2−2, α).
Since (t_(n1+n2−2, α/2))² ≡ F_(1, n1+n2−2, α), and (X̄ − Ȳ)² / [S²p(1/n1 + 1/n2)] > k ⇔ |X̄ − Ȳ| / √(S²p(1/n1 + 1/n2)) > k′ for appropriate values k and k′, the probabilities of these events are the same. Hence the two-sample t-test and the analysis of variance are equivalent for testing H0 : μ1 = μ2 vs. Ha : μ1 ≠ μ2.
Note: in the text's notation, xi is y1i, x̄ is ȳ1, yi is y2i, and ȳ is ȳ2.
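The equivalence just proved can be illustrated numerically: for any two samples, the one-way ANOVA F equals the square of the pooled two-sample t. The data below are hypothetical, used only to exercise the identity.

```python
from statistics import mean

def anova_f_and_t(x, y):
    """Two-sample one-way ANOVA F statistic and pooled two-sample t
    statistic for the same data; the exercise shows F = t^2."""
    n1, n2 = len(x), len(y)
    xbar, ybar = mean(x), mean(y)
    grand = (sum(x) + sum(y)) / (n1 + n2)
    sse = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    sst = n1 * (xbar - grand) ** 2 + n2 * (ybar - grand) ** 2
    mse = sse / (n1 + n2 - 2)                 # = pooled variance Sp^2
    f = (sst / 1) / mse
    t = (xbar - ybar) / (mse * (1 / n1 + 1 / n2)) ** 0.5
    return f, t

# hypothetical samples, only to illustrate F = t^2
f, t = anova_f_and_t([3.1, 2.8, 3.6, 3.3], [2.2, 2.9, 2.4, 2.7, 2.5])
```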
EXERCISES 10.3
10.3.1.
(a) Assume that the samples are from populations which are normally distributed with equal variances and means μ1, μ2, μ3. In our case n1 = n2 = n3 = 4, k = 3, N = Σ ni = 12, Ti = Σj yij, T̄i = Ti/ni, T1 = 1488, T2 = 1704, T3 = 1434, T̄1 = 372, T̄2 = 426, T̄3 = 358.5, ȳ = Σi Σj yij/N = 385.5, Σi Σj y²ij = 1835388, CM = Nȳ² = 1783323,
Total SS = Σi Σj (yij − ȳ)² = Σi Σj y²ij − CM = 52065,
SSE = Σi Σj (yij − T̄i)² = 41859,
SST = Σi ni(T̄i − ȳ)² = Σi T²i/ni − CM = 10206,
S² = MSE = SSE/(N − k) = 4651, MST = SST/(k − 1) = 5103, F = MST/MSE = 1.0972. At α = 0.05, Fα,(k−1, N−k) = 4.2565. Therefore the ANOVA table is:

Source        df   SS      MS     F-Statistic   p-Value
Treatments     2   10206   5103   1.0972        0.37462
Error          9   41859   4651
Total         11   52065

From the table, since the p-value is more than 0.05, we do not reject the null hypothesis H0 : μ1 = μ2 = μ3 at α = 0.05.
(b) Letting H0 be that the mean auto insurance premium paid per six months by all drivers insured with each of these companies is the same: based on the data, there is not enough evidence to conclude that these means differ, so the data are consistent with the mean premiums being the same.
10.3.3.
n1 = n2 = ··· = nk = n. For each i,
Σj (yij − ȳ)² = Σj [(yij − T̄i) + (T̄i − ȳ)]² = Σj (yij − T̄i)² + n(T̄i − ȳ)²,
because Σj (yij − T̄i) = Σj yij − n T̄i = Σj yij − Σj yij = 0, so the cross term vanishes. Therefore
Σi Σj (yij − ȳ)² = Σi Σj (yij − T̄i)² + n Σi (T̄i − ȳ)².
10.3.5.
(a) From Exercise 10.3.4 we know that SST = Σi ni(T̄i − ȳ)² (where T stands for "treatment") and SS Total = Σi Σj (yij − ȳ)². Then
SSE = SS Total − SST = Σi Σj (yij − ȳ)² − Σi ni(T̄i − ȳ)² = Σi Σj (yij − ȳ)² − Σi Σj (T̄i − ȳ)²,
and
Σj (yij − ȳ)² = Σj [(yij − T̄i) + (T̄i − ȳ)]² = Σj (yij − T̄i)² + Σj (T̄i − ȳ)²,
since Σj (yij − T̄i) = Σj yij − ni T̄i = Σj yij − Σj yij = 0. Then
Σi Σj (yij − ȳ)² = Σi Σj (yij − T̄i)² + Σi Σj (T̄i − ȳ)².
Therefore SSE = Σi Σj (yij − T̄i)².
(b) Since yij ∼ N(μi, σ²), Σj (yij − T̄i)²/σ² ∼ χ²_(ni−1), and since these sums are independent, SSE/σ² = Σi Σj (yij − T̄i)²/σ² follows a chi-square distribution with Σi (ni − 1), or N − k, degrees of freedom.
10.3.7.
n1 = n4 = 5, n2 = n3 = 6, k = 4, N = Σ ni = 22, Ti = Σj yij, T̄i = Ti/ni, T1 = 92, T2 = 69, T3 = 75, T4 = 94, T̄1 = 18.4, T̄2 = 11.5, T̄3 = 12.5, T̄4 = 18.8, ȳ = Σi Σj yij/N = 15, Σi Σj y²ij = 5338, CM = Nȳ² = 4950,
Total SS = Σi Σj (yij − ȳ)² = Σi Σj y²ij − CM = 388,
SSE = Σi Σj (yij − T̄i)² = 147,
SST = Σi ni(T̄i − ȳ)² = Σi T²i/ni − CM = 241,
S² = MSE = SSE/(N − k) = 8.16667, MST = SST/(k − 1) = 80.3333, F = MST/MSE = 9.8367.
(a) Therefore the ANOVA table is:

Source        df   SS    MS         F-Statistic   p-Value
Treatments     3   241   80.3333    9.8367        0.00046
Error         18   147   8.166667
Total         21   388
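The table entries can be reproduced from the group totals alone via the shortcut formulas used above. A sketch with this exercise's totals; the function name is mine.

```python
def anova_from_totals(totals, sizes, sum_sq):
    """One-way ANOVA pieces from the group totals Ti, group sizes ni, and
    the raw sum of squares sum_ij y_ij^2, using the shortcut formulas."""
    N = sum(sizes)
    cm = sum(totals) ** 2 / N                     # correction for the mean
    total_ss = sum_sq - cm
    sst = sum(T * T / n for T, n in zip(totals, sizes)) - cm
    sse = total_ss - sst
    k = len(totals)
    f = (sst / (k - 1)) / (sse / (N - k))
    return sst, sse, f

sst, sse, f = anova_from_totals([92, 69, 75, 94], [5, 6, 6, 5], 5338)
```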
Assumptions: The samples are randomly selected from the 4 populations in an independent manner. The populations are normally distributed with equal variances σ 2 and means μ1 , μ2 , μ3 , μ4 (b) Since F is greater than critical value at α = 0.05, there is sufficient evidence to indicate a difference between the mean number of customers served by the 4 employees. 10.3.9.
Assumptions: the samples are randomly selected from the populations in an independent manner, and the populations are assumed to be normally distributed with common variance.
n1 = n2 = n3 = n4 = 5, k = 4, N = Σ ni = 20, Ti = Σj yij, T̄i = Ti/ni, T1 = 34.3, T2 = 39.6, T3 = 45.9, T4 = 34.2, T̄1 = 6.86, T̄2 = 7.92, T̄3 = 9.18, T̄4 = 6.84, CM = Nȳ² = 1185.8,
Total SS = Σi Σj (yij − ȳ)² = Σi Σj y²ij − CM = 23.78,
SSE = Σi Σj (yij − T̄i)² = 5.36,
SST = Σi ni(T̄i − ȳ)² = Σi T²i/ni − CM = 18.42,
S² = MSE = SSE/(N − k) = 0.335, MST = SST/(k − 1) = 6.14, F = MST/MSE = 18.3284. At α = 0.01, Fα,(k−1, N−k) = 5.2922. Since F > 5.2922, the sample evidence supports the alternative hypothesis that the true rental and homeowner vacancy rates by area are indeed different, at the 0.01 level of significance.
10.3.11. n1 = n2 = n3 = 6, k = 3, N = Σ ni = 18, Ti = Σj yij, T̄i = Ti/ni, T1 = 1273, T2 = 1275, T3 = 1257, T̄1 = 212.16667, T̄2 = 212.5, T̄3 = 209.5, CM = Nȳ² = 804334.7222,
Total SS = Σi Σj (yij − ȳ)² = Σi Σj y²ij − CM = 5994.27778,
SSE = Σi Σj (yij − T̄i)² = 5961.8333,
SST = Σi ni(T̄i − ȳ)² = Σi T²i/ni − CM = 32.4444,
S² = MSE = SSE/(N − k) = 397.456, MST = SST/(k − 1) = 16.2222, F = MST/MSE = 0.0408. At α = 0.01, Fα,(k−1, N−k) = 6.3589. Since F < 6.3589, based on the data there is not enough evidence to support the alternative hypothesis that the true mean cholesterol levels for the races in the United States during 1978–1980 are different, at the 0.01 level of significance.
EXERCISES 10.4
10.4.1.
yij − ȳ = (yij − T̄i − B̄j + ȳ) + (T̄i − ȳ) + (B̄j − ȳ). Then
(yij − ȳ)² = (yij − T̄i − B̄j + ȳ)² + (T̄i − ȳ)² + (B̄j − ȳ)²
+ 2(yij − T̄i − B̄j + ȳ)(T̄i − ȳ) + 2(yij − T̄i − B̄j + ȳ)(B̄j − ȳ) + 2(T̄i − ȳ)(B̄j − ȳ).
Summing over j = 1, ..., b and i = 1, ..., k,
Σj Σi (yij − ȳ)² = Σj Σi (yij − T̄i − B̄j + ȳ)² + Σj Σi (T̄i − ȳ)² + Σj Σi (B̄j − ȳ)²
+ 2 Σj Σi (yij − T̄i − B̄j + ȳ)(T̄i − ȳ) + 2 Σj Σi (yij − T̄i − B̄j + ȳ)(B̄j − ȳ) + 2 Σj Σi (T̄i − ȳ)(B̄j − ȳ).
96 CHAPTER 10 Analysis of Variance
Now, Σj (yij − T̄i) = Σj yij − b T̄i = 0, since b T̄i = Σj yij; and Σj B̄j = Σj (1/k) Σi yij = (1/k) Σi Σj yij = b ȳ, so Σj (B̄j − ȳ) = Σj B̄j − b ȳ = 0. Hence
Σj (yij − T̄i − B̄j + ȳ) = Σj (yij − T̄i) − Σj (B̄j − ȳ) = 0.
Then each cross term vanishes:
Σj Σi (yij − T̄i − B̄j + ȳ)(T̄i − ȳ) = Σi (T̄i − ȳ)[Σj (yij − T̄i − B̄j + ȳ)] = 0,
Σj Σi (yij − T̄i − B̄j + ȳ)(B̄j − ȳ) = Σj (B̄j − ȳ)[Σi (yij − T̄i − B̄j + ȳ)] = 0,
Σj Σi (T̄i − ȳ)(B̄j − ȳ) = Σi (T̄i − ȳ)[Σj (B̄j − ȳ)] = 0.
Therefore
Σj Σi (yij − ȳ)² = Σj Σi (yij − T̄i − B̄j + ȳ)² + b Σi (T̄i − ȳ)² + k Σj (B̄j − ȳ)².
10.4.3.
Let W = Σi Σj (yij − μ − τi − βj)². Then ∂W/∂μ = −2 Σi Σj (yij − μ − τi − βj). If ∂W/∂μ = 0, then Σi Σj (yij − τi − βj) − Σi Σj μ = 0, and since by the restriction Σj βj = Σi τi = 0, the solution is given by μ̂ = ȳ.
Now, for any fixed i, ∂W/∂τi = −2 Σj (yij − μ − τi − βj). If ∂W/∂τi = 0, then Σj (yij − μ − τi − βj) = 0, and since Σj βj = 0, τ̂i = T̄i − μ̂; i.e., τ̂i = T̄i − ȳ for any i = 1, 2, ..., k.
For any fixed j, ∂W/∂βj = −2 Σi (yij − μ − τi − βj). If ∂W/∂βj = 0, then since Σi τi = 0 the solution is β̂j = B̄j − ȳ, j = 1, 2, ..., b.
10.4.5.
T1 = 602, T2 = 619, T3 = 427, T4 = 439; B1 = 386, B2 = 390, B3 = 414, B4 = 437, B5 = 460; b = 5, k = 4, n = bk = 20.
CM = (1/n)(Σj Bj)² = 217778.45, SSB = (1/k) Σj B²j − CM = 986.8, MSB = SSB/(b − 1) = 246.7, SST = (1/b) Σi T²i − CM = 6344.55, MST = SST/(k − 1) = 2114.85, Total SS = Σi Σj y²ij − CM = 7390.55, SSE = Total SS − SSB − SST = 59.2, MSE = SSE/(n − b − k + 1) = 4.9333.
To test whether the true income lower limits of the top 5 percent of U.S. households are the same for each race, F = MST/MSE = 428.69 and F0.05,3,12 = 3.49. Since the observed value F > 3.49, we reject the null hypothesis and conclude that there is a difference in the true income lower limits of the top 5 percent of U.S. households among the races.
To test whether the true income lower limits of the top 5 percent of U.S. households are the same for each year from 1994 to 1998, F = MSB/MSE = 50.00676 and F0.05,4,12 = 3.2592. Since the observed value F > 3.2592, we conclude that there is a difference in the true income lower limits of the top 5 percent of U.S. households among the years 1994–1998.
10.4.7.
T1 = 112, T2 = 110, T3 = 133, T4 = 157; B1 = 201, B2 = 165, B3 = 146; b = 3, k = 4, n = bk = 12.
CM = (1/n)(Σj Bj)² = 21845.33, SSB = (1/k) Σj B²j − CM = 390.1667, MSB = SSB/(b − 1) = 195.0833, SST = (1/b) Σi T²i − CM = 482, MST = SST/(k − 1) = 160.667, Total SS = Σi Σj y²ij − CM = 1234.667, SSE = Total SS − SSB − SST = 362.5, MSE = SSE/(n − b − k + 1) = 60.41667.
To test whether the true mean performance is the same for the different hours of sleep, F = MST/MSE = 2.6593 and F0.05,3,6 = 4.7571. Since the observed value F < 4.7571, there is evidence to conclude that there is no difference in the true mean performance for the different hours of sleep.
To test whether the true mean performance is the same for each category of the test, F = MSB/MSE = 3.2290 and F0.05,2,6 = 5.1433. Since the observed value F < 5.1433, there is evidence to conclude that there is no difference in the true mean performance for each category of the test.
EXERCISES 10.5
10.5.1.
(a) For simplicity of computation, we use SPSS. The following is the output (one-way ANOVA, Average Time):

Source           Sum of Squares   df   Mean Square   F      Sig.
Between Groups   .900              3   .300          .919   .461
Within Groups    3.919            12   .327
Total            4.818            15
(b) Since there is no significant difference, Tukey's method is not necessary. (c) Since F is smaller than the critical value, F0.05,3,12 = 3.4903, there is evidence to conclude there is no difference in the average time to process claim forms among the four processing facilities. Assumptions: the samples are randomly selected from the 4 populations in an independent manner. The populations are normally distributed with equal variances σ² and means μ1, μ2, μ3, μ4.
10.5.3.
(a) One-way ANOVA (Income):

Source           Sum of Squares   df   Mean Square   F        Sig.
Between Groups   6344.550          3   2114.850      32.350   .000
Within Groups    1046.000         16   65.375
Total            7390.550         19
Since F is greater than the critical value F0.05,3,16 = 3.24, based on the data provided there is evidence to conclude that there is a difference in the income lower limits of the top 5 percent of U.S. households among the races over the five years, at the 0.05 level of significance.
(b) Post hoc tests — multiple comparisons (Income, Tukey HSD), with 95% confidence intervals:

(I) race   (J) race   Mean Difference (I−J)   Std. Error   Sig.   Lower Bound   Upper Bound
1          2          −3.4                    5.11371      0.91   −18.0304      11.2304
1          3          35.00000*               5.11371      0      20.3696       49.6304
1          4          32.60000*               5.11371      0      17.9696       47.2304
2          3          38.40000*               5.11371      0      23.7696       53.0304
2          4          36.00000*               5.11371      0      21.3696       50.6304
3          4          −2.4                    5.11371      0.97   −17.0304      12.2304

(The remaining rows of the SPSS output are the mirror-image comparisons (J, I), with the differences and interval endpoints negated.)
(c)
μi − μj   T̄i − T̄j         Tukey Interval    Reject or N.R.   Conclusion
μ1 − μ2   120.4 − 123.8   (−18.03, 11.23)   N.R.             μ1 = μ2
μ1 − μ3   120.4 − 85.4    (20.37, 49.63)    R                μ1 ≠ μ3
μ1 − μ4   120.4 − 87.8    (17.97, 47.23)    R                μ1 ≠ μ4
μ2 − μ3   123.8 − 85.4    (23.77, 53.03)    R                μ2 ≠ μ3
μ2 − μ4   123.8 − 87.8    (21.37, 50.63)    R                μ2 ≠ μ4
μ3 − μ4   85.4 − 87.8     (−17.03, 12.23)   N.R.             μ3 = μ4
Assuming the samples are randomly selected in an independent manner and the populations are normally distributed with equal variances σ², then based on the 95% Tukey intervals, "All races" is similar to "White," and "Black" is similar to "Hispanic." All other true income lower limits differ between races.
EXERCISE 10.7
10.7.1.
One-way ANOVA (Cholesterol):

Source           Sum of Squares   df   Mean Square   F      Sig.
Between Groups   32.444            2   16.222        .041   .960
Within Groups    5961.833         15   397.456
Total            5994.278         17

Since F is smaller than the critical value F0.01,2,15 = 6.36, based on the data provided there is not enough evidence to conclude that there is a difference in the true cholesterol levels for the races in the United States during 1978–1980.
Chapter 11
Bayesian Estimation and Inference
EXERCISES 11.2
11.2.1.
By Bayes' rule,
P(the die is loaded | 3 consecutive fives)
= P(3 consecutive fives | loaded)P(loaded) / [P(3 consecutive fives | loaded)P(loaded) + P(3 consecutive fives | fair)P(fair)]
= (0.6)³(0.02) / [(0.6)³(0.02) + (1/6)³(0.98)] = 0.488.
11.2.3.
(a) f(p|x) ∝ π(p)f(x|p) ∝ p²{p^x (1 − p)^(n−x)} ∝ p^(x+2)(1 − p)^(n−x), which is the kernel of the β(x + 3, n − x + 1) density. Hence, the posterior distribution of p is β(x + 3, n − x + 1). (b) f(p|x) ∝ π(p)f(x|p) ∝ {p^(a−1)(1 − p)^(b−1)}{p^x (1 − p)^(n−x)} ∝ p^(x+a−1)(1 − p)^(n−x+b−1), which is the kernel of the β(x + a, n − x + b) density. Hence, the posterior distribution of p is β(x + a, n − x + b).
11.2.5.
(a) Note that
f(λ|x) ∝ π(λ)f(x|λ) ∝ e^(−μλ) Πi (λ e^(−λxi)) ∝ λ^n e^(−(μ + Σxi)λ),
which is the kernel of the Gamma(n + 1, μ + Σ_{i=1}^n xi) density. Hence, the posterior distribution of λ is Gamma(n + 1, μ + Σ xi).
(b) E(λ|x) = (n + 1)/(μ + Σ_{i=1}^n xi).
11.2.7.
(a) f(λ|x) ∝ π(λ)f(x|λ) ∝ e^(−λ) Πi (e^(−λ) λ^(xi)) ∝ λ^(Σ xi) e^(−(n+1)λ),
which is the kernel of the Gamma(Σ_{i=1}^n xi + 1, n + 1) density. Hence, the posterior distribution of λ is Gamma(Σ xi + 1, n + 1).
(b) E(λ|x) = (Σ_{i=1}^n xi + 1)/(n + 1).
11.2.9.
f(μ|x) ∝ π(μ)f(x|μ) ∝ exp(−μ²/(2·2²)) exp(−Σ(xi − μ)²/(2σ²))
∝ exp{−[Σ(xi − μ)²/(2σ²) + μ²/(2·2²)]}
∝ exp{−(1/2)[(n/σ² + 1/2²)μ² − 2(n x̄/σ²)μ]}
∝ exp{−(1/2)(n/σ² + 1/2²)[μ − (n x̄/σ²)/(n/σ² + 1/2²)]²},
which is the kernel of the normal density with mean (n x̄/σ²)/(1/2² + n/σ²) and variance 1/(1/2² + n/σ²). Hence, the posterior distribution of μ is
N( (n x̄/σ²)/(1/2² + n/σ²), 1/(1/2² + n/σ²) ).
EXERCISES 11.3
11.3.1.
(a) We have seen from Example 11.2.7 that the posterior distribution of μ given x1, x2, ..., xn is normally distributed with
Mean = (n x̄/σ²)/(1 + n/σ²) = (n x̄/9)/(1 + n/9) = n x̄/(9 + n), and Variance = 1/(1 + n/9) = 9/(9 + n).
Hence, a 95% credible interval for μ is
n x̄/(9 + n) ± z0.025 √(9/(9 + n)) = n x̄/(9 + n) ± 1.96 √(9/(9 + n)).
(b) First note that n = 20 and x̄ = [0.92 + 1.05 + ··· + (−4.78)]/20 = 0.8725. Using the results of part (a), a 95% credible interval for μ is
20(0.8725)/(9 + 20) ± 1.96 √(9/(9 + 20)) = (−0.490, 1.694).
11.3.3.
We have seen from Exercise 11.2.8 that the posterior distribution of μ is Gamma(Σ_{i=1}^{50} xi + α, 50 + β) = Gamma(12.1, 52), since α = 0.1, β = 2, and Σ_{i=1}^{50} xi = 12. Since (μ|x = 12) ∼ Gamma(12.1, 52), using the procedures summarized in Section 11.3 we can show that Pr(0.121 < μ < 0.381) = 0.95; thus (0.121, 0.381) is a 95% credible interval for μ.
11.3.5.
Let x = sodium intake in this ethnic group, and μ = the mean sodium intake for this group. Then x̄ ∼ N(μ, 300²) and μ ∼ N(2700, 250²). From Example 11.2.7, the posterior distribution of μ given x̄ = 3000 is normally distributed with
Mean = (2700/250² + 3000/300²)/(1/250² + 1/300²) = 2822.95, and Variance = 1/(1/250² + 1/300²) = 36885.25.
Hence, a 95% credible interval for μ is 2822.95 ± z0.025 √36885.25 = 2822.95 ± 1.96 √36885.25 = (2446.52, 3199.38).
11.3.7.
Let x = the number of calls received in five minutes. Then x ∼ Poi(5λ), and
π(λ|x = 25) ∝ π(λ)f(x = 25|λ) ∝ e^(−2λ){e^(−5λ)(5λ)^25} ∝ λ^25 e^(−7λ),
which is the kernel of the Gamma(25 + 1, 7) density. Hence, the posterior distribution of λ is Gamma(26, 7). Knowing the posterior distribution of λ, we can show that Pr(2.43 < λ < 5.27) = 0.95. Thus (2.43, 5.27) is a 95% credible interval for λ.
EXERCISES 11.4
11.4.1.
(a) Referring to Exercise 11.3.1, we know that x̄ = 0.8725 and n = 20, and the posterior distribution of μ is normal with
Mean = (n x̄/σ²)/(1/4 + n/σ²) = (20 · 0.8725/9)/(1/4 + 20/9) = 0.784, and Variance = 1/(1/4 + 20/9) = 0.4045.
We can now compute
α0 = P(μ ≤ 0 | x̄ = 0.784) = P((μ − 0.784)/√0.4045 ≤ (0 − 0.784)/√0.4045) = P(z ≤ −1.2327) = 0.109
and
α1 = P(μ > 0 | x̄ = 0.784) = 1 − α0 = 1 − 0.109 = 0.891.
Thus, α0/α1 = 0.109/0.891 = 0.122 < 1, and we reject H0.
(b) First compute
z = (x̄ − 0)/(σ/√n) = 0.8725/√(9/20) = 1.3.
Thus, z = 1.3 < z0.95 = 1.645, and we do not reject H0. In this case, we see that we obtain a different decision from the Bayesian approach than from the classical approach.
11.4.3.
We have seen from Exercise 11.3.3 that the posterior distribution of μ is Gamma(12.1, 52). Using any statistical software, we can compute
α0 = P(μ ≤ 0.1 | x = 12) = P(μ ≤ 0.1, with μ ∼ Gamma(12.1, 52)) = 0.0067,
and α1 = P(μ > 0.1 | x = 12) = 1 − α0 = 1 − 0.0067 = 0.9933. Thus, α0 < α1, and we reject H0.
11.4.5.
We have seen from Exercise 11.3.5 that the posterior distribution of μ is N(2822.95, 36885.25). We can now compute
α0 = P(μ ≤ 2400 | x̄ = 3000) = P((μ − 2822.95)/√36885.25 ≤ (2400 − 2822.95)/√36885.25) = P(z ≤ −2.202) = 0.014
and α1 = P(μ > 2400 | x̄ = 3000) = 1 − α0 = 1 − 0.014 = 0.986.
Thus, α0 < α1, and we reject H0.
EXERCISES 11.5
11.5.1. Expected return = 25P(H)P(H) + 15P(H)P(T) + 15P(T)P(H) + (−15)P(T)P(T) = 25(1/2)(1/2) + 15(1/2)(1/2) + 15(1/2)(1/2) − 15(1/2)(1/2) = 10.
Since the expected return is positive, in the long run we should win. Thus, we should play this game.
11.5.3.
According to the government forecast, we write P(G) = 0.7 to denote that the economy will expand with a 70% chance, and P(B) = 0.3 to denote that the economy will decline.
Expected earning = 300000P(G) + (−200000)P(B) = 300000(0.7) − 200000(0.3) = 150000.
Since the expected earning is greater than 50000, the optimal decision is to open a new office. Here we made the assumption that the government forecast is correct with 100% certainty.
11.5.5.
(a) Let G and B denote the true states of nature, and let G′ and B′ denote the weather person's predictions. Based on the record we have P(G′|G) = 6/8 = 3/4 and P(G′|B) = 3/7. Initially assume the prior P(G) = P(B) = 1/2. Using Bayes' theorem, we obtain
P(G|G′) = P(G′|G)P(G) / [P(G′|G)P(G) + P(G′|B)P(B)] = (3/4)(1/2) / [(3/4)(1/2) + (3/7)(1/2)] = 7/11
and
P(G|B′) = P(B′|G)P(G) / [P(B′|G)P(G) + P(B′|B)P(B)] = (1/4)(1/2) / [(1/4)(1/2) + (4/7)(1/2)] = 7/23.
Updated prior when the weather person predicts good weather: π(G) = P(G|G′) = 7/11; π(B) = 1 − π(G) = 4/11.
Updated prior when the weather person predicts bad weather: π(G) = P(G|B′) = 7/23; π(B) = 1 − π(G) = 16/23.
(b) When the weather person predicts good weather:
Expected gain if we insure = 125π(G) + 135π(B) = 125(7/11) + 135(4/11) = 128.64,
and
Expected gain if we do not insure = 200π(G) = 200(7/11) = 127.27.
Therefore our decision, given that the weather person predicts good weather, is to insure.
When the weather person predicts bad weather: Expected gain if we insure = 125π(G) + 135π(B) = 125(7/23) + 135(16/23) = 131.96,
and Expected gain if we do not insure = 200π(G) = 200(7/23) = 60.87.
Therefore our decision, given that the weather person predicts bad weather, is also to insure. 11.5.7.
By assuming a uniform prior we have π(θ1 ) = π(θ2 ) = π(θ3 ) = 1/3. The expected utility for decision d1 = 0 · π(θ1 ) + 10 · π(θ2 ) + 4 · π(θ3 ) = 0(1/3) + 10(1/3) + 4(1/3) = 4.67. The expected utility for decision d2 = (−2) · π(θ1 ) + 5 · π(θ2 ) + 1 · π(θ3 ) = (−2)(1/3) + 5(1/3) + (1/3) = 1.33
Therefore our decision is d1 . 11.5.9.
(a) The payoff (utility) table has g on the diagonal and −l off the diagonal:

States of Nature →
Decision Space ↓       p1    p2    ...   pk
Predicting p1 (d1)      g    −l    ...   −l
Predicting p2 (d2)     −l     g    ...   −l
...                    ...   ...   ...   ...
Predicting pk (dk)     −l    −l    ...    g

(b) The expected utility for decision di is given by
E(U|di) = Σ_{j=1}^k U(di, pj)π(pj) = g(1/k) + (k − 1)(−l)(1/k) = [g − (k − 1)l]/k.
Therefore the expected utility is the same for every decision di.
(c) When X = x1, compute the updated prior as
π(pi|X = x1) = P(X = x1|pi)π(pi) / Σj P(X = x1|pj)π(pj) = ai(1/k) / Σj aj(1/k) = ai / Σj aj,  i = 1, 2, ..., k.
The expected utility for decision di is then
Σ_{j=1}^k U(di, pj)π(pj|X = x1) = g · ai/Σj aj − l · Σ_{j≠i} aj/Σj aj = (g + l) ai/Σj aj − l,  i = 1, 2, ..., k.
Since this expression depends on i only through ai, our decision is di, where i is such that ai is the largest among a1, ..., ak.
When X = x2, compute the updated prior as
π(pi|X = x2) = P(X = x2|pi)π(pi) / Σj P(X = x2|pj)π(pj) = (1 − ai)(1/k) / Σj (1 − aj)(1/k) = (1 − ai) / Σj (1 − aj),  i = 1, 2, ..., k.
The expected utility for decision di is then
Σ_{j=1}^k U(di, pj)π(pj|X = x2) = g(1 − ai)/Σj(1 − aj) − l Σ_{j≠i}(1 − aj)/Σj(1 − aj) = (g + l)(1 − ai)/Σj(1 − aj) − l,  i = 1, 2, ..., k.
Since this expression depends on i only through (1 − ai), our decision is di, where i is such that ai is the smallest among a1, ..., ak.
Chapter 12
Nonparametric Tests
EXERCISES 12.2
12.2.1.
In this case n = 9, p = 0.5, X ∼ Bin(n, p). If P(X ≤ a) ≤ 0.025, then using the binomial table a = 1, and b = n + 1 − a = 9. Using the first and ninth values in the ordered list, an approximate 95% confidence interval is 2.7 < M < 8.5.
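The cutoff a can be found directly from the Bin(9, 1/2) cumulative probabilities rather than from a table. A minimal sketch; the helper name is mine.

```python
from math import comb

def cum_binom_half(n, a):
    """P(X <= a) for X ~ Bin(n, 1/2)."""
    return sum(comb(n, i) for i in range(a + 1)) / 2 ** n

# n = 9: largest a with P(X <= a) <= 0.025, then b = n + 1 - a
a = max(i for i in range(10) if cum_binom_half(9, i) <= 0.025)
b = 9 + 1 - a
```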
12.2.3.
(a) The normal probability plot generated by SPSS (a normal P–P plot of the observed vs. expected cumulative probabilities) shows that the normality assumption may not be satisfied.
[Figure: Normal P-P plot of VAR00001]
(b) Looking in the table for n = 10, p = 0.5, X ∼ Bin(n, p): if P(X ≤ a) ≤ 0.025, then using the binomial table a = 1, and b = n + 1 − a = 10. Then an approximate 95% confidence interval for the median is greater than 57.3 and less than 66.7. That is, we have at least a 95% chance that the true median air pollution index for the city falls in the interval (57.3, 66.7).
12.2.5.
Looking in the table for n = 6, p = 0.5, X ∼ Bin(n, p). If P(X ≤ a) = 0.005, using the Binomial table, a = 0.
12.2.7.
Looking in the table for n = 15, p = 0.5, X ∼ Bin(n, p): if P(X ≤ a) ≤ 0.005, then using the binomial table a = 2, and b = n + 1 − a = 14. That is, we have a 99% chance that the true median time required to prune an acre of grapes falls in the interval (4.2, 5.8).
12.2.9.
Looking in the table for n = 15, p = 0.5, X ∼ Bin(n, p): if P(X ≤ a) ≤ 0.025, then using the binomial table a = 3, and b = n + 1 − a = 13. Then an approximate 95% confidence interval for the median in-state tuition cost is 3683 < M < 5212. That is, we have a 95% chance that the median in-state tuition cost falls in (3683, 5212).
EXERCISES 12.3 12.3.1.
Let m0 = 7.75. We test H0 : M = m0 vs. Ha : M ≠ m0 at α = 0.01.
(i) Since there is a tie, n = 8. In this case n+ = 3 and N+ ∼ Bin(n, p = 1/2). Then P(N+ ≥ 3) = 0.85547, which is not less than α/2, so H0 is not rejected. Based on the sample, the median interest rate in the city is not significantly different from 7.75.
(ii) Eliminating the tie, the following table is obtained:

xi      zi = |xi − 7.75|   Sign   Rank
7.625   0.125              −      2
7.875   0.125              +      2
7.625   0.125              −      2
8       0.25               +      5
7.5     0.25               −      5
8       0.25               +      5
7.375   0.375              −      7
7.25    0.5                −      8

Thus W+ = 12, n = 8. H0 is not rejected, since the rejection region is W+ ≤ 3 or W+ ≥ 33; this is the same conclusion as the sign test.
12.3.3.
Let m0 = 1000. We test H0 : M = m0 vs. Ha : M > m0 at α = 0.05.
(i) Since there is no tie, n = 10. In this case n+ = 6 and N+ ∼ Bin(n, p = 1/2). Then P(N+ ≥ 6) = 0.37695, which is not less than α, so H0 is not rejected. Based on the sample, the median SAT score is not significantly greater than 1000.
(ii) The following table is obtained:

xi     zi = |xi − 1000|   Sign   Rank
986    14                 −      1
1065   65                 +      2
1089   89                 +      3
890    110                −      4
1128   128                +      5
1157   157                +      6
1224   224                +      7
765    235                −      8
1355   355                +      9
567    433                −      10

Thus W+ = 32, n = 10. Then H0 is rejected, since the rejection region is W+ ≥ 10. Based on the sample we conclude that the median SAT score is significantly greater than 1000.
12.3.5.
Let m0 = 250. We test H0 : M = m0 vs. Ha : M > m0 at α = 0.05.
(i) Since there is no tie, n = 20. Using the large-sample approximation, N+ ∼ N(np, np(1 − p)) with p = 1/2, so Z = (2N+ − n)/√n is approximately standard normal. In this case n+ = 11, so z = 0.447214. Since zα = 1.6448 and the rejection region is z ≥ zα, based on the sample we conclude that the median weight of NFL players is not significantly greater than 250 pounds.
(ii) We have:

xi    zi = |xi − 250|   Sign   Rank
254   4                 +      1.5
246   4                 −      1.5
259   9                 +      3
234   16                −      4
232   18                −      5
269   19                +      6
229   21                −      7
274   24                +      8
276   26                +      9
285   35                +      10.5
285   35                +      10.5
288   38                +      12
211   39                −      13
296   46                +      14
298   48                +      15
193   57                −      16
192   58                −      17
311   61                +      18.5
189   61                −      18.5
178   72                −      20

Thus W+ = 108, n = 20. Using the normal approximation, Z = (W+ − n(n + 1)/4)/√(n(n + 1)(2n + 1)/24), so z = 0.111998. Since zα = 1.6448 and the rejection region is z ≥ zα, we reach the same conclusion as the sign test.
12.3.7.
Using the difference "after" − "before," we test H0 : M = 0 vs. Ha : M < 0 at α = 0.05.
(i) We have:

Before   After   Difference   Sign
185      188     3            +
222      217     −5           −
235      229     −6           −
198      190     −8           −
224      226     2            +
197      185     −12          −
228      225     −3           −
234      231     −3           −

Then n+ = 2. Using the large-sample approximation, N+ ∼ N(np, np(1 − p)) with p = 1/2, so Z = (2N+ − n)/√n is approximately standard normal. In this case n = 8 and n+ = 2, so z = −1.4142. Since zα = −1.645 and the rejection region is z ≤ zα, based on the sample there is not enough evidence to conclude that the new diet reduces the systolic blood pressure of individuals over 40 years old.
(ii) We have:

xi    zi = |xi|   Sign   Rank
2     2           +      1
3     3           +      3
−3    3           −      3
−3    3           −      3
−5    5           −      5
−6    6           −      6
−8    8           −      7
−12   12          −      8

Thus W+ = 4, n = 8. Using the normal approximation, Z = (W+ − n(n + 1)/4)/√(n(n + 1)(2n + 1)/24), so z = −1.96. Since zα = −1.645 and z = −1.96 ≤ zα falls in the rejection region, H0 is rejected; here the Wilcoxon signed-rank test reaches a different conclusion from the sign test.
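The signed-rank statistics in this section can be computed with a small midrank routine. A sketch (the function name is mine), checked against the differences in (ii) above and against the SAT data of 12.3.3.

```python
def signed_rank_wplus(diffs):
    """Wilcoxon signed-rank statistic: rank |d_i| using midranks for
    ties (zeros dropped) and sum the ranks of positive differences."""
    d = [x for x in diffs if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):
        j = i
        while j + 1 < len(d) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        mid = (i + j) / 2 + 1            # midrank for the tie group
        for t in range(i, j + 1):
            ranks[order[t]] = mid
        i = j + 1
    return sum(r for r, x in zip(ranks, d) if x > 0)

w_plus = signed_rank_wplus([3, -5, -6, -8, 2, -12, -3, -3])
```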
EXERCISES 12.4
Assumptions: observations are randomly selected and n1 ≤ n2.
12.4.1.
We need to test H0 : m1 = m2 vs. Ha : m1 ≠ m2, where m1 is the median for the American conference and m2 is the median for the National conference. In this case n1 = n2 = 6. Combining the samples and keeping track of the source populations, the following table is obtained:

Value   Population   Rank
0.455   N            1
0.545   A            2.5
0.545   N            2.5
0.636   N            5
0.636   N            5
0.636   N            5
0.727   A            7.5
0.727   A            7.5
0.818   A            10
0.818   A            10
0.818   N            10
0.909   A            12

Then R = 28.5, and W = R − (1/2)n2(n2 + 1), so w = 28.5 − (1/2)(6)(6 + 1) = 7.5. For α = 0.05 the rejection region is W ≤ 28 or W ≥ 50. There is enough evidence to conclude that the two samples come from populations with different medians.
12.4.3.
If we select n2 numbers from {1, 2, ..., n1 + n2} at random without replacement, and Xi is the ith number selected, then
E(Σ_{i=1}^{n2} Xi) = n2(n1 + n2 + 1)/2 and Var(Σ_{i=1}^{n2} Xi) = n1 n2(n1 + n2 + 1)/12.
R is the sum of the ranks r(Xi) of the observations from population 2. Under the null hypothesis the combined observations all have the same distribution, so since each rank r(Xi) takes one of the values 1, 2, ..., n1 + n2, if (without loss of generality) all observations are distinct, then R = Σ_{i=1}^{n2} r(Xi) has
E(R) = n2(n1 + n2 + 1)/2 and Var(R) = n1 n2(n1 + n2 + 1)/12.
114 CHAPTER 12 Nonparametric Tests
We have W = R − (1/2)n2(n2 + 1), so

E(W) = E(R) − (1/2)n2(n2 + 1) = n2(n1 + n2 + 1)/2 − (1/2)n2(n2 + 1) = n1 n2/2

and

Var(W) = Var(R) = n2(n1 + n2 + 1)n1/12.

12.4.5.
Using the Wilcoxon rank-sum test: We need to test H0 : m1 = m2 vs. Ha : m1 < m2, where m1 is the median net conversion in female rats and m2 is the median net conversion in male rats. In this case n1 = 12, n2 = 14. Combining the samples and keeping track of the source, the next tables are obtained:

Value       5.1  5.5  6.5  7.2  7.5  9.5  9.8  9.8  10.4  11.2  11.6  12.8  13.1
Population  F    F    F    F    F    F    M    F    F     F     M     M     M
Rank        1    2    3    4    5    6    7.5  7.5  9     10    11    12    13

Value       13.5  13.8  13.8  14.2  14.5  15.1  15.8  15.9  16.0  16.0  16.7  16.9  17.3
Population  M     M     F     M     F     M     F     M     M     M     M     M     M
Rank        14    15.5  15.5  17    18    19    20    21    22.5  22.5  24    25    26

Then R = 250 and W = R − (1/2)n2(n2 + 1), so w = 250 − (1/2)(14)(14 + 1) = 145. We will use the Wilcoxon rank-sum test for large samples, with the statistic

Z = (W − n1 n2/2) / √(n1 n2 (n1 + n2 + 1)/12),

whose realized value is z = 3.1375. For α = 0.05, zα = −1.64485, and the rejection region is z < −1.64485. There is not enough evidence to conclude that the median net conversion of progesterone in male rats is larger than in female rats.
Using the median test: Testing the same hypothesis, the grand median is 13.3. The next tables can be obtained:
Sample       5.1  5.5  6.5  7.2  7.5  9.5  9.8  9.8  10.4
Population   F    F    F    F    F    F    M    F    F
Above/Below  B    B    B    B    B    B    B    B    B

Sample       11.2  11.6  12.8  13.1  13.5  13.8  13.8  14.2  14.5
Population   F     M     M     M     M     M     F     M     F
Above/Below  B     B     B     B     A     A     A     A     A
Sample       15.1  15.8  15.9  16  16  16.7  16.9  17.3
Population   M     F     M     M   M   M     M     M
Above/Below  A     A     A     A   A   A     A     A
Then the next table can be obtained:

          Sample 1  Sample 2  Total
Below     9         4         13
Above     3         10        13
Total     12        14        26

The total above is Na = 13, the total below is Nb = 13, and the sample sizes are n1 = 12, n2 = 14. The statistic for large samples is given by

Z = (N1a − E(N1a)) / √Var(N1a),

whose realized value is z = −2.31455, since N1a = 3 is the number of observations in sample 1 above the median, E(N1a) = Na n1/n = (13)(12)/26 = 6, and Var(N1a) = Na Nb n1 n2 / (n²(n − 1)) = 1.68, with n = 26. At α = 0.05, zα = −1.64485. Then the rejection region is z < −1.64485. Therefore, the same conclusion is reached.
12.4.7.
We need to test H0 : m1 = m2 vs. Ha : m1 > m2, where m1 is the median for sample II and m2 is the median for sample I. In this case the sample sizes are n1 = 8 and n2 = 12. Combining the samples and keeping track of the source, the next table can be obtained:

Value       4   6   7   8    8    10  11  12   12   13    13    13
Population  S1  S1  S1  S1   S1   S1  S1  S1   S1   S1    S2    S2
Rank        1   2   3   4.5  4.5  6   7   8.5  8.5  16.5  16.5  16.5

Value       14  15    15    16  17    17    18  19
Population  S1  S1    S2    S2  S2    S2    S2  S2
Rank        13  14.5  14.5  16  17.5  17.5  19  20
Then R = 89 and W = R − (1/2)n2(n2 + 1), so w = 89 − (1/2)(12)(12 + 1) = 11. For α = 0.01 the rejection region is W ≥ 115. There is not enough evidence to suggest that the median for sample I is less than the median for sample II.
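The rank-sum computations in these exercises can be sketched in code. This is an illustrative Python equivalent of the hand tabulation (the manual itself works by hand or in R):

```python
def rank_sum_w(sample1, sample2):
    """Wilcoxon rank-sum: R = sum of midranks of sample2 in the combined
    ordering, and W = R - n2(n2+1)/2, as used in Exercises 12.4."""
    combined = sorted([(v, 1) for v in sample1] + [(v, 2) for v in sample2])
    n = len(combined)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        for t in range(i, j):          # midranks for tied values
            ranks[t] = (i + 1 + j) / 2
        i = j
    r = sum(rk for rk, (_, grp) in zip(ranks, combined) if grp == 2)
    n2 = len(sample2)
    return r, r - n2 * (n2 + 1) / 2
```

With the conference data of Exercise 12.4.1 (American as sample 1, National as sample 2) this returns R = 28.5 and w = 7.5.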
EXERCISES 12.5
12.5.1.
We need to test H0 : M1 = M2 = M3 = M4 = 0 vs. Ha : not all Mi's equal 0, where Mi is the true median for group i, i = 1, 2, 3, 4. In this case n1 = n2 = n3 = n4 = 8, N = Σ_{i=1}^{4} ni = 32, r1 = 80, r2 = 116, r3 = 115, and r4 = 177. Then

H = (12/(N(N + 1))) Σ_{i=1}^{4} ri²/ni − 3(N + 1) = 7.832386.

(i) At α = 0.05, χ²_{α,k−1} = χ²_{0.05,3} = 7.8147. Since H > 7.8147, there is enough evidence to suggest that not all Mi's are equal to 0.
(ii) At α = 0.01, χ²_{α,k−1} = χ²_{0.01,3} = 11.3449. Since H < 11.3449, there is not enough evidence to suggest that not all Mi's are equal to 0.
12.5.3.
For k = 2,

H = (12/(N(N + 1))) Σ_{i=1}^{2} ri²/ni − 3(N + 1) = (12/(N(N + 1))) Σ_{i=1}^{2} (1/ni)(ri − ni(N + 1)/2)².

Then, since r1 + r2 = N(N + 1)/2, N = n1 + n2, and (−1)² = 1, it can be obtained that

H = (12/(N(N + 1))) (1/n1 + 1/n2)(r1 − n1(N + 1)/2)² = (12/(N(N + 1))) · (N/(n1 n2)) (r1 − n1(N + 1)/2)² = (12/(n1 n2 (N + 1))) (r1 − n1(N + 1)/2)².

Now, we reject H0 if and only if H > k, for a certain value k determined by the rejection rule of the Kruskal-Wallis test. And H > k ⇔ (r1 − n1(N + 1)/2)² > k′, for an appropriate k′ ⇔ r1 < c1 or r1 > c2, for some c1 and c2, which corresponds to the rejection rule of the two-sided Wilcoxon rank-sum test. Thus, the two tests are equivalent.
12.5.5.
We need to test H0 : the yields of corn for the fertilizers are equal vs. Ha : not all yields of corn for the fertilizers are equal. The corresponding rank table, with row sums ri, is:

None:           11    1     2.5   5     6     2.5   8.5   8.5   4       r1 = 49
Fertilizer I:   15.5  14    10    12    7     13    18    15.5          r2 = 105
Fertilizer II:  26    23    23    21    29.5  20    19    28    17      r3 = 206.5
Fertilizer III: 31.5  36    23    33    31.5  25    29.5  38    27      r4 = 274.5
Fertilizer IV:  44    41    38    43    40    34    42    38    35      r5 = 355

In this case, n1 = n3 = n4 = n5 = 9, n2 = 8, k = 5, N = Σ_{i=1}^{k} ni = 44, r1 = 49, r2 = 105, r3 = 206.5, r4 = 274.5, and r5 = 355. Then

H = (12/(N(N + 1))) Σ_{i=1}^{k} ri²/ni − 3(N + 1) = 39.29066.
At α = 0.01, χ²_{α,k−1} = 13.2767. Since H > 13.2767, there is enough evidence to suggest a difference in yields of corn from the different fertilizers.
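The H statistic used throughout these exercises can be sketched as follows (an illustrative Python equivalent of the hand computation):

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H = 12/(N(N+1)) * sum(r_i^2 / n_i) - 3(N+1),
    where r_i is the midrank sum of group i in the combined ordering."""
    combined = sorted((v, g) for g, grp in enumerate(groups) for v in grp)
    n = len(combined)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        for t in range(i, j):          # midranks for ties
            ranks[t] = (i + 1 + j) / 2
        i = j
    r = [0.0] * len(groups)
    for rk, (_, g) in zip(ranks, combined):
        r[g] += rk
    return (12 / (n * (n + 1))
            * sum(ri ** 2 / len(grp) for ri, grp in zip(r, groups))
            - 3 * (n + 1))
```

For example, two groups {1, 2, 3} and {4, 5, 6} give H = 27/7 ≈ 3.857.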
Chapter
13
Empirical Methods
The statistical software R is used for this chapter; all output and code given are in R. R is free statistical software and can be downloaded from the website www.r-project.org.
EXERCISES 13.2 13.2.1.
(a) Let θ = μ and θ̂ = the sample mean. Using the command "jackknife" in R, we obtain the results below:
$jack.se
[1] 12.66607
$jack.bias
[1] 0
$jack.values
[1] 287.1818 289.2727 276.3636 279.4545 287.4545 284.0909 284.8182 288.0909
[9] 286.7273 289.0000 287.0000 288.5455
Using the same notation defined in Section 13.2, the relation between our notation and the R results is as follows:

jack.values = (θ̂₋₁, θ̂₋₂, θ̂₋₃, . . . , θ̂₋ₙ),
jack.bias = (n − 1) [ (1/n) Σ_{k=1}^{n} θ̂₋ₖ − θ̂ ],
jack.se = standard error of the jackknife estimate = s*/√n.
Thus, the jackknife estimate can be obtained by

θ̂* = (1/n) Σ_{k=1}^{n} θ̂*ₖ = (1/n) Σ_{k=1}^{n} [n θ̂ − (n − 1) θ̂₋ₖ]
    = (1/n) Σ_{k=1}^{n} [θ̂ + (n − 1)θ̂ − (n − 1)θ̂₋ₖ]
    = θ̂ − (n − 1) [ (1/n) Σ_{k=1}^{n} θ̂₋ₖ − θ̂ ]
    = θ̂ − jack.bias.

Since θ̂ = the sample mean of the complete data = 285.67, we have the jackknife estimate of μ as θ̂* = θ̂ − jack.bias = 285.67 − 0 = 285.67.
(b) A 95% jackknife confidence interval for μ is θ̂* ± t_{α/2,n−1} · jack.se = 285.67 ± 2.201 · 12.666 = (257.789, 313.545).
(c) Comparing the 95% jackknife confidence interval with Example 6.3.3, where the confidence interval is (257.81, 313.59), we see that the two methods give very close confidence intervals for μ.
13.2.3.
Let θ = μ and θˆ = sample mean. From R we can obtain the following results: $jack.se [1] 1.050376 $jack.bias [1] 0
Since θˆ = the sample mean of the complete data = 61.22, we have the jackknife estimate of μ as θˆ ∗ = θˆ − jack.bias = 61.22 − 0 = 61.22. A 95% jackknife confidence interval for μ is θˆ ∗ ± tα/2,n−1 · jack.se = 61.22 ± 2.262 · 1.05 = (58.844, 63.596).
There is a 95% chance that the true mean falls in (58.844, 63.596). 13.2.5.
Let θ = σ 2 and θˆ = sample variance. From R we can obtain the following results: $jack.se [1] 2247.042 $jack.bias [1] 0
Since θˆ = the sample variance of the complete data = 9386, we have the jackknife estimate of σ 2 as θˆ ∗ = θˆ − jack.bias = 9386 − 0 = 9386. A 95% jackknife confidence interval for σ 2 is θˆ ∗ ± tα/2,n−1 · jack.se = 9386 ± 2.262 · 2247.042 = (4302.838, 14469.16).
There is a 95% chance that the true variance falls in (4302.838, 14469.16). Compare the 95% jackknife confidence interval with Example 6.4.2, where the confidence interval is (4442.3, 31299). Thus, in this case the jackknife confidence interval for σ 2 is much shorter than the classical one in Example 6.4.2. 13.2.7.
(a) Let θ = μ and θˆ = sample mean. From R we can obtain the following results: $jack.se [1] 0.1837461 $jack.bias [1] 0
Since θˆ = the sample mean of the complete data = 2.317, we have the jackknife estimate of μ as θˆ ∗ = θˆ − jack.bias = 2.317 − 0 = 2.317. A 95% jackknife confidence interval for μ is θˆ ∗ ± tα/2,n−1 · jack.se = 2.317 ± 2.201 · 0.184 = (1.912, 2.721).
(b) Let θ = σ 2 and θˆ = sample variance. From R we can obtain the following results: $jack.se [1] 0.1682317 $jack.bias [1] 0
Since θˆ = the sample variance of the complete data = 0.405, we have the jackknife estimate of σ 2 as θˆ ∗ = θˆ − jack.bias = 0.405 − 0 = 0.405. A 95% jackknife confidence interval for σ 2 is θˆ ∗ ± tα/2,n−1 · jack.se = 0.405 ± 2.201 · 0.168 = (0.035, 0.775).
(c) There is a 95% chance that the true mean falls in (1.912, 2.721); and there is a 95% chance that the true variance falls in (0.035, 0.775).
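The jackknife computations in these exercises use R's jackknife function; an illustrative Python equivalent (not the book's code) is:

```python
import math

def jackknife(data, stat):
    """Leave-one-out jackknife: bias-corrected estimate, bias, and se.
    `data` is a list; `stat` is any statistic of a list."""
    n = len(data)
    theta_hat = stat(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]  # jack.values
    mean_loo = sum(loo) / n
    bias = (n - 1) * (mean_loo - theta_hat)                  # jack.bias
    se = math.sqrt((n - 1) / n * sum((v - mean_loo) ** 2 for v in loo))
    return theta_hat - bias, bias, se                        # estimate, bias, jack.se
```

For the sample mean the jackknife bias is exactly 0 and the standard error reduces to the usual s/√n, which is why jack.bias = 0 in every part above.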
EXERCISES 13.3
Please note that the resampling procedure may produce different bootstrap samples every time, so your results might differ. You can perform the bootstrapping using any statistical software.
13.3.1.
Using the statistical software R we have created N = 8 bootstrap samples of size 20. Next we calculated the mean of each bootstrap sample, denoted by X̄₁*, . . . , X̄_N*. Then we have the following results:

the bootstrap mean X̄* = (1/N) Σ_{i=1}^{N} X̄ᵢ* = 5.875;

the standard error = √( (1/(N − 1)) Σ_{i=1}^{N} (X̄ᵢ* − X̄*)² ) = 1.137.
13.3.3.
Using statistical software R we have created N = 199 bootstrap samples of size 10. Then 0.025 × (199 + 1) = 5 and 0.975 × (199 + 1) = 195. Thus, respectively, the 0.025 and 0.975 quantiles of the sample means are the 5th and 195th values of ascending-ordered sample means from the bootstrap samples. Then we get the 95% bootstrap confidence interval for μ as (59.07, 63.16).
13.3.5.
(a) Using statistical software R we have created N = 199 bootstrap samples of size 6. Then 0.025 × (199 + 1) = 5 and 0.975 × (199 + 1) = 195. Thus, respectively, the 0.025 and 0.975 quantiles of the sample means are the 5th and 195th values of ascending-ordered sample means from the bootstrap samples. Then we get the 95% bootstrap confidence interval for μ as (150.667, 1149.5). (b) And, respectively, the 0.025 and 0.975 quantiles of the sample medians are the 5th and 195th values of ascending-ordered sample medians from the bootstrap samples. Then we get the 95% bootstrap confidence interval for the population median as (110, 1366.5).
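The percentile method used in Exercises 13.3.3 and 13.3.5 can be sketched as follows. This is illustrative Python, and a fresh resampling run will produce numbers different from the manual's R run:

```python
import random

def bootstrap_percentile_ci(data, stat, n_boot=199, alpha=0.05, seed=1):
    """Percentile bootstrap CI: the alpha/2 and 1-alpha/2 quantiles of the
    ordered bootstrap statistics, using the (n_boot + 1)-quantile convention
    from the text (e.g. the 5th and 195th of 199 ordered values)."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(stat([rng.choice(data) for _ in range(n)])
                   for _ in range(n_boot))
    lo = round(alpha / 2 * (n_boot + 1)) - 1        # 0-based index of 5th value
    hi = round((1 - alpha / 2) * (n_boot + 1)) - 1  # 0-based index of 195th value
    return stats[lo], stats[hi]
```

With n_boot = 199 and α = 0.05 this picks exactly the 5th and 195th ordered statistics, as in the text.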
EXERCISES 13.4
13.4.1.
(a) Since S ∼ N(0, θ²) and N ∼ N(0, σ²), Y = S + N ∼ N(0, σ² + θ²). Then the likelihood function of Y is

L(θ; y) = f_Y(y|θ) = (1/√(2π(σ² + θ²))) exp( −y²/(2(σ² + θ²)) ),

and

∂ ln L(θ; y)/∂θ = −θ/(σ² + θ²) + θy²/(σ² + θ²)².

Solving ∂ ln L(θ; y)/∂θ = 0, we obtain the MLE as

θ̂_MLE = ( max{0, y² − σ²} )^{1/2}.
(b) The complete likelihood for Y and S is

L_C(θ; y, s) = f_{Y,S}(y, s|θ) = f_{S,N}(s, y − s|θ)|J|, where J is the Jacobian of the transformation (s, n) → (s, y) and |J| = 1,
             = f_S(s|θ) f_N(y − s|θ) = (1/(2πθσ)) exp( −s²/(2θ²) − (y − s)²/(2σ²) ).

The conditional probability density of S given Y = y is

h(s|θ, y) = f_{Y,S}(y, s|θ)/f_Y(y|θ) ∝ exp( −s²/(2θ²) − (y − s)²/(2σ²) )
          ∝ exp( −(1/2)(1/θ² + 1/σ²)( s − (y/σ²)/(1/θ² + 1/σ²) )² ).

Thus

S|θ, y ∼ N( (y/σ²)/(1/θ² + 1/σ²), 1/(1/θ² + 1/σ²) ).

Fix θ₀ and consider
Q(θ|θ₀, y) = E_{θ₀}[ln L_C(θ; y, S)|θ₀, y]
           = E_{θ₀}[ constant − 2 ln θ − S²/(2θ²) − (y − S)²/(2σ²) ]
           = (constant of θ) − 2 ln θ − (1/(2θ²)) [ 1/(1/θ₀² + 1/σ²) + ( (y/σ²)/(1/θ₀² + 1/σ²) )² ] − · · · .

Solving ∂Q(θ|θ₀, y)/∂θ = 0, we have

θ̂ = { (1/2) [ 1/(1/θ₀² + 1/σ²) + ( (y/σ²)/(1/θ₀² + 1/σ²) )² ] }^{1/2}.

Then we obtain the EM algorithm as

θ̂₍ₖ₊₁₎ = { (1/2) [ 1/(1/θ̂₍ₖ₎² + 1/σ²) + ( (y/σ²)/(1/θ̂₍ₖ₎² + 1/σ²) )² ] }^{1/2}.

13.4.3.
Let n = n1 + n2 + n3 and θ = (p, q). Let x = (n1, n2, n3) be the observed data and z = (n11, n12, n21, n22, n3) the complete data, where n11 is the number of identical male pairs, n21 is the number of identical female pairs, and n12 and n22 are the numbers of non-identical pairs for males and females, respectively. Here the complete data z have a multinomial distribution with the likelihood given by

L(θ; z) = ( n choose n11, n12, n21, n22, n3 ) [p(1 − q)]^{n11} [(1 − p)(1 − q)²]^{n12} [pq]^{n21} [(1 − p)q²]^{n22} [2(1 − p)q(1 − q)]^{n3}.
Then ln L(θ; z) = (constant of θ) + (n11 + n21 ) ln p + (n12 + n22 + n3 ) ln(1 − p) + (n21 + 2n22 + n3 ) ln q + (n11 + 2n12 + n3 ) ln(1 − q)
For a multinomial distribution the expected value of each class is n multiplied by the probability of that class. Then, for θ̂^{(k)} = (p̂^{(k)}, q̂^{(k)}) the kth step estimate, using Bayes' rule we have

n11^{(k)} = E[n11|θ̂^{(k)}, x] = n1 · p̂^{(k)}(1 − q̂^{(k)}) / [ p̂^{(k)}(1 − q̂^{(k)}) + (1 − p̂^{(k)})(1 − q̂^{(k)})² ],
n12^{(k)} = E[n12|θ̂^{(k)}, x] = n1 − n11^{(k)},
n21^{(k)} = E[n21|θ̂^{(k)}, x] = n2 · p̂^{(k)} q̂^{(k)} / [ p̂^{(k)} q̂^{(k)} + (1 − p̂^{(k)})(q̂^{(k)})² ],
n22^{(k)} = E[n22|θ̂^{(k)}, x] = n2 − n21^{(k)}.
Then

Q(θ|θ̂^{(k)}, x) = E_{θ̂^{(k)}}[ln L(θ; Z)|θ̂^{(k)}, x]
= (constant of θ) + (n11^{(k)} + n21^{(k)}) ln p + (n12^{(k)} + n22^{(k)} + n3) ln(1 − p) + (n21^{(k)} + 2n22^{(k)} + n3) ln q + (n11^{(k)} + 2n12^{(k)} + n3) ln(1 − q).
Solving ∂Q/∂p = 0 and ∂Q/∂q = 0, we then obtain the EM algorithm as

p̂^{(k+1)} = (n11^{(k)} + n21^{(k)})/n,
q̂^{(k+1)} = (n21^{(k)} + 2n22^{(k)} + n3) / (n + n12^{(k)} + n22^{(k)} + n3).

13.4.5.
(a) The survival function is S(y) = Pr(Y > y) = 1 − Φ(y − θ), where Y ∼ N(θ, 1) and Φ is the cdf of N(0, 1). Then the likelihood of X and Y is

L(θ; x, y) = Π_{i=1}^{n1} (1/√(2π)) exp( −(xi − θ)²/2 ) · Π_{i=1}^{n2} [1 − Φ(yi − θ)],

and

ln L(θ; x, y) = constant − (1/2) Σ_{i=1}^{n1} (xi − θ)² + Σ_{i=1}^{n2} ln[1 − Φ(yi − θ)].
Solving ∂ ln L(θ; x, y)/∂θ = 0, the MLE θ̂_MLE is the solution of

n1( x̄ − θ̂_MLE ) + Σ_{i=1}^{n2} φ(yi − θ̂_MLE)/(1 − Φ(yi − θ̂_MLE)) = 0,

where φ is the pdf of N(0, 1).
(b) Let z = (x₁, . . . , x_{n1}, z₁, . . . , z_{n2}) be the complete data set. Then the likelihood is

L_C(θ; z) = (1/√(2π))^{n1+n2} exp( −(1/2) Σ_{i=1}^{n1} (xi − θ)² − (1/2) Σ_{i=1}^{n2} (zi − θ)² ),

and

ln L_C(θ; z) = constant − (1/2) Σ_{i=1}^{n1} (xi − θ)² − (1/2) Σ_{i=1}^{n2} (zi − θ)².

For i = 1, . . . , n2, the conditional pdf of Zi given X = x, Y = y and θ = θ₀ is

h(zi|θ₀, x, y) = f(zi, zi ≥ yi|θ₀)/f(zi ≥ yi|θ₀) = φ(zi − θ₀)/(1 − Φ(yi − θ₀)),   zi ≥ yi.
Then,

Q(θ|θ₀, x, y) = E_{θ₀}[ln L_C(θ; Z)|θ₀, x, y]
= constant − (1/2) Σ_{i=1}^{n1} (xi − θ)² − (1/2) Σ_{i=1}^{n2} E_{θ₀}[(Zi − θ)²]
= constant − (1/2) Σ_{i=1}^{n1} (xi − θ)² − (1/2) Σ_{i=1}^{n2} ∫_{yi}^{∞} (zi − θ)² h(zi|θ₀, x, y) dzi.

Solving ∂Q/∂θ = 0, after a lengthy computation we then obtain

θ = (n1/(n1 + n2)) x̄ + (n2/(n1 + n2)) θ₀ + (1/(n1 + n2)) Σ_{i=1}^{n2} φ(yi − θ₀)/(1 − Φ(yi − θ₀)).

Therefore, we obtain the EM algorithm as

θ̂^{(k+1)} = (n1/(n1 + n2)) x̄ + (n2/(n1 + n2)) θ̂^{(k)} + (1/(n1 + n2)) Σ_{i=1}^{n2} φ(yi − θ̂^{(k)})/(1 − Φ(yi − θ̂^{(k)})).
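The EM update just derived can be run numerically. The sketch below is illustrative Python (φ and Φ built from math.erf), not the book's code:

```python
import math

def em_censored_normal(x, y, theta0=0.0, iters=100):
    """EM iteration from part (b): x are exact N(theta, 1) observations,
    y are right-censoring points (for these we only know Z_i > y_i)."""
    phi = lambda t: math.exp(-t * t / 2) / math.sqrt(2 * math.pi)   # N(0,1) pdf
    cdf = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))          # N(0,1) cdf
    n1, n2 = len(x), len(y)
    xbar = sum(x) / n1
    th = theta0
    for _ in range(iters):
        th = (n1 * xbar + n2 * th
              + sum(phi(yi - th) / (1 - cdf(yi - th)) for yi in y)) / (n1 + n2)
    return th
```

At convergence the fixed point satisfies the MLE equation from part (a), n1(x̄ − θ) + Σ φ(yi − θ)/(1 − Φ(yi − θ)) = 0, so the EM limit is the MLE.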
EXERCISES 13.5
All the algorithms and generation of simple distributions in this section can be done with any statistical software; the statistical software R is used here.
13.5.1.
The 1st iteration: We have x0 = 6.
Step 1: Generate j from A = {aij}. Suppose the software generated j = 7.
Step 2: r = π(7)/π(6) = (e^{−3}3⁷/7!)/(e^{−3}3⁶/6!) = 0.4286.
Step 3: Generate u from U(0, 1). Suppose the software generated u = 0.5494. Since r < u, we reject the new state 7 and stay at state 6. Set x1 = x0 = 6.
The 2nd iteration: Start with x1 = 6.
Step 1: Generate j from A = {aij}. Suppose the software generated j = 5.
Step 2: r = π(5)/π(6) = (e^{−3}3⁵/5!)/(e^{−3}3⁶/6!) = 2.
Step 3: Since r > 1, set x2 = j = 5.
The 3rd iteration: Start with x2 = 5.
Step 1: Generate j from A = {aij}. Suppose the software generated j = 6.
Step 2: r = π(6)/π(5) = (e^{−3}3⁶/6!)/(e^{−3}3⁵/5!) = 0.5.
Step 3: Generate u from U(0, 1). Suppose the software generated u = 0.7594. Since r < u, set x3 = x2 = 5.
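These hand iterations can be automated. The sketch below is illustrative Python; since the nominating matrix A is not reproduced in the solution, a symmetric random-walk proposal j = i ± 1 is assumed here:

```python
import math, random

def mh_poisson(lam, steps, x0=6, seed=0):
    """Metropolis chain targeting Poisson(lam) on {0, 1, 2, ...}.
    Assumed proposal: j = x +/- 1 with probability 1/2 each; proposals
    below 0 have target probability 0 and are always rejected."""
    rng = random.Random(seed)
    pi = lambda k: math.exp(-lam) * lam ** k / math.factorial(k)
    x, chain = x0, []
    for _ in range(steps):
        j = x + rng.choice((-1, 1))
        if j >= 0:
            r = pi(j) / pi(x)
            if rng.random() <= r:      # accept with probability min(1, r)
                x = j
        chain.append(x)
    return chain
```

A long run should have sample mean and variance near λ = 3, the mean and variance of the Poisson(3) target.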
The first 3 iterations are given above. The reader can follow the same algorithm to obtain more sample points. Note that different results may appear because different values are generated each time.
13.5.3.
The Metropolis–Hastings algorithm is given below:
For t = 0, start with an arbitrary point, x^{(0)}.
Step 1: Generate y from the proposal density Γ([α], [α]/α).
Step 2: r = π(y)/π(x^{(t)}) = [ y^{α−1} exp(−y/β)/(Γ(α)β^α) ] / [ (x^{(t)})^{α−1} exp(−x^{(t)}/β)/(Γ(α)β^α) ] = (y/x^{(t)})^{α−1} exp( (x^{(t)} − y)/β ).
Step 3: Acceptance/rejection. Generate u from U(0, 1). If r ≥ u, set x^{(t+1)} = y (i.e., accept the proposed new state); else set x^{(t+1)} = x^{(t)} (i.e., reject the proposed new state).
Step 4: Set t = t + 1 and go to step 1.
13.5.5.
Use the nominating matrix

A = [ 1/2  1/2   0    0
      1/2   0   1/2   0
       0   1/2   0   1/2
       0    0   1/2  1/2 ].
The Metropolis–Hastings algorithm is given below:
For k = 0, start with an arbitrary point, xk = i.
Step 1: Generate j from the proposal matrix as follows: generate u1 from U(0, 1). For i = 0, if u1 ≥ 0.5, set j = 1; else set j = 0. For i = 1 or 2, if u1 ≥ 0.5, set j = i + 1; else set j = i − 1. For i = 3, if u1 ≥ 0.5, set j = 3; else set j = 2.
Step 2: Calculate r = π(j)/π(i) according to the target distribution π(x).
Step 3: Acceptance/rejection. Generate u2 from U(0, 1). If r ≥ u2, set xk+1 = j; else set xk+1 = xk.
Step 4: Set k = k + 1 and go to step 1.
13.5.7.
π is an exponential distribution with parameter θ, i.e., π(x) = (1/θ) exp(−x/θ), x > 0. Let the proposal density be q_x(y) ∝ exp( −(y − x)²/(2(0.5)²) ). The Metropolis–Hastings algorithm is given below:
For t = 0, start with an arbitrary point, x^{(0)} > 0.
Step 1: Generate y from the proposal density q_{x^{(t)}}(y) ∝ exp( −(y − x^{(t)})²/(2(0.5)²) ); that is, generate y from N(x^{(t)}, (0.5)²).
Step 2: r = [π(y) q_y(x^{(t)})]/[π(x^{(t)}) q_{x^{(t)}}(y)] = [ (1/θ) exp(−y/θ) exp(−(x^{(t)} − y)²/(2(0.5)²)) ] / [ (1/θ) exp(−x^{(t)}/θ) exp(−(y − x^{(t)})²/(2(0.5)²)) ] = exp( (x^{(t)} − y)/θ ). Let α = min{1, r}.
Step 3: Acceptance/rejection. Generate u from U(0, 1). If α ≥ u, set x^{(t+1)} = y; else set x^{(t+1)} = x^{(t)}.
Step 4: Set t = t + 1 and go to step 1.
13.5.9.
From Example 13.5.5 with n = 15, α = 1, and β = 2, recall that X|Y = y ∼ Binomial(n, y) = Binomial(15, y), and Y|X = x ∼ β(x + α, n − x + β) = β(x + 1, 17 − x).
For y0 = 1/3:
(i) Generate x0 from Binomial(15, 1/3). Suppose the software generated x0 = 5.
(ii) Generate y1 from β(x0 + 1, 17 − x0) = β(6, 12). Suppose the software generated y1 = 0.46 (rounded to the second digit). Then generate x1 from Binomial(15, 0.46), resulting in x1 = 5.
(iii) Generate y2 from β(x1 + 1, 17 − x1) = β(6, 12), resulting in y2 = 0.26. Then generate x2 from Binomial(15, 0.26), resulting in x2 = 5.
Thus, for y0 = 1/3 a particular realization of the Gibbs sampler for the first three iterations is (5, 0.33), (5, 0.46), and (5, 0.26).
For y0 = 1/2:
(i) Generate x0 from Binomial(15, 1/2). Suppose the software generated x0 = 10.
(ii) Generate y1 from β(x0 + 1, 17 − x0) = β(11, 7). Suppose the software generated y1 = 0.31 (rounded to the second digit). Then generate x1 from Binomial(15, 0.31), resulting in x1 = 5.
(iii) Generate y2 from β(x1 + 1, 17 − x1) = β(6, 12), resulting in y2 = 0.34. Then generate x2 from Binomial(15, 0.34), resulting in x2 = 5.
Thus, for y0 = 1/2 a particular realization of the Gibbs sampler for the first three iterations is (10, 0.5), (5, 0.31), and (5, 0.34).
For y0 = 2/3:
(i) Generate x0 from Binomial(15, 2/3). Suppose the software generated x0 = 11.
(ii) Generate y1 from β(x0 + 1, 17 − x0) = β(12, 6). Suppose the software generated y1 = 0.61 (rounded to the second digit). Then generate x1 from Binomial(15, 0.61), resulting in x1 = 9.
(iii) Generate y2 from β(x1 + 1, 17 − x1) = β(10, 8), resulting in y2 = 0.59. Then generate x2 from Binomial(15, 0.59), resulting in x2 = 7.
Thus, for y0 = 2/3 a particular realization of the Gibbs sampler for the first three iterations is (11, 0.66), (9, 0.61), and (7, 0.59).
From the three cases with different initial values, we see that when y0 = 2/3 the samples have larger x and y values than the samples with y0 = 1/3. Thus, the choice of initial values may influence the samples. However, this influence becomes negligible if the algorithm is run for a large number of iterations.
13.5.11.
If (X, Y) ∼ N( (μ1, μ2), ( σ1²  ρσ1σ2 ; ρσ1σ2  σ2² ) ), then the conditional distribution of X given Y = y is X|Y = y ∼ N( μ1 + ρ(σ1/σ2)(y − μ2), σ1²(1 − ρ²) ). Applying this result with standard normal marginals, we have X|Y = y ∼ N(ρ·y, 1 − ρ²) and Y|X = x ∼ N(ρ·x, 1 − ρ²).
The Gibbs sampler is given below: Start with an arbitrary point, y^{(0)}. Then obtain x^{(0)} by generating a random value from N(ρ·y^{(0)}, 1 − ρ²). For i = 1, . . . , n, repeat:
Step 1: Generate y^{(i)} from N(ρ·x^{(i−1)}, 1 − ρ²).
Step 2: Generate x^{(i)} from N(ρ·y^{(i)}, 1 − ρ²).
Step 3: Record the ith sample as (x^{(i)}, y^{(i)}). Set i = i + 1 and go to step 1.
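The two-step cycle above is only a few lines of code (an illustrative Python sketch):

```python
import math, random

def gibbs_bivariate_normal(rho, n, y0=0.0, seed=0):
    """Gibbs sampler for the standard bivariate normal with correlation rho:
    X|Y=y ~ N(rho*y, 1 - rho^2) and Y|X=x ~ N(rho*x, 1 - rho^2)."""
    rng = random.Random(seed)
    sd = math.sqrt(1 - rho ** 2)
    y = y0
    x = rng.gauss(rho * y, sd)
    samples = []
    for _ in range(n):
        y = rng.gauss(rho * x, sd)    # Step 1
        x = rng.gauss(rho * y, sd)    # Step 2
        samples.append((x, y))        # Step 3
    return samples
```

A long run should have sample means near 0 and a sample correlation near ρ.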
Chapter
14
Some Issues in Statistical Applications: An Overview

EXERCISES 14.2
14.2.1.
The following is a scatter plot of the data.
[Scatter plot: Percent Expense Ratio (x-axis, 0.5 to 3.0) versus Percent Return (y-axis, 5 to 30).]
The sample correlation coefficient is 0.3249, indicating a mild positive correlation.
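The sample correlation coefficient here and in the following exercises is r = Sxy/√(Sxx Syy); an illustrative sketch:

```python
def pearson_r(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5
```

For example, a perfectly linear pairing gives r = 1, and partially concordant data give intermediate values.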
14.2.3.
(a) The following is a scatter plot of the data.
[Scatter plot: Revenue (x-axis, 10000 to 50000) versus Expenditure (y-axis, 10000 to 40000).]
(b) r = 0.9918.
(c) The following is a Q-Q plot of revenue versus expenditure.
[Q-Q plot: Revenue quantiles (x-axis, 10000 to 50000) versus Expenditure quantiles (y-axis, 10000 to 40000).]
(d) From the scatter plot being close to a line and r = 0.9918, we see that there is a strong positive linear relationship between revenue and expenditure. From the Q-Q plot we see that the quantiles fall nearly along the 45-degree line. Thus, we may conjecture that the revenue and the expenditure have the same probability distribution.
14.2.5.
The following is the dot plot for this data.
[Dot plot: median house prices, values ranging roughly from 80 to 180.]
The above dot plot suggests the distribution of the median house prices is skewed towards the left, because most of the observations are to the left.
EXERCISES 14.3
14.3.1.
(a) The following table summarizes the z-score, the modified z-score, and the distribution-free z-score.

data      z score     dist-free z   modified z
1215.1    −0.09852    0.011804      −0.12339
1109.9    −0.31406    0.281755      −0.39335
1536.5    0.559969    0.812933      0.701342
1797.8    1.095325    1.483449      1.371858
1630.5    0.752558    1.054144      0.942553
939.7     −0.66277    0.718501      −0.83009
1219.7    −0.0891     0             −0.11159
519.9     −1.52286    1.79574       −1.90733
830       −0.88752    1             −1.11159
780.1     −0.98976    1.128047      −1.23964
1403.3    0.287066    0.471132      0.359541
1869.7    1.242635    1.66795       1.556359
2152.8    1.822656    2.394406      2.282815
1410      0.300793    0.488324      0.376733
532.8     −1.49643    1.762638      −1.87423
Since no z-scores or modified z-scores have absolute values greater than 3.5, and no distribution-free z-scores are greater than 5, we conclude that there are no obvious outliers.
(b) The following is the boxplot.
[Boxplot: motor vehicle theft rates, scale roughly 500 to 2000.]
(c) Outliers in this case may represent an extreme observation which has either a very high or a very low rate of motor vehicle thefts.
14.3.3.
(a) The following table summarizes the z-score, the modified z-score, and the distribution-free z-score.

data   z score     dist-free z   modified z
67     −0.09135    0.037037      −0.1358
63     −0.29066    0.333333      −0.4321
39     −1.48652    2.111111      −2.20987
80     0.55641     0.925926      0.827163
64     −0.24083    0.259259      −0.35802
95     1.303824    2.037037      1.938274
90     1.054686    1.666667      1.567904
93     1.204169    1.888889      1.790126
21     −2.38342    3.444444      −3.54321
36     −1.636      2.333333      −2.4321
44     −1.23738    1.740741      −1.8395
66     −0.14118    0.111111      −0.20987
100    1.552962    2.407407      2.308644
66     −0.14118    0.111111      −0.20987
72     0.157789    0.333333      0.23457
34     −1.73566    2.481481      −2.58024
78     0.456755    0.777778      0.679015
66     −0.14118    0.111111      −0.20987
68     −0.04152    0.037037      −0.06173
98     1.453307    2.259259      2.160496
74     0.257444    0.481481      0.382719
81     0.606237    1             0.901237
71     0.107961    0.259259      0.160496
100    1.552962    2.407407      2.308644
60     −0.44014    0.555556      −0.65432
50     −0.93842    1.296296      −1.39506
81     0.606237    1             0.901237
66     −0.14118    0.111111      −0.20987
90     1.054686    1.666667      1.567904
89     1.004858    1.592593      1.49383
86     0.855375    1.37037       1.271607
49     −0.98825    1.37037       −1.46913
77     0.406927    0.703704      0.604941
63     −0.29066    0.333333      −0.4321
58     −0.5398     0.703704      −0.80247
43     −1.28721    1.814815      −1.91358
Using the z-score test and the distribution-free test, there are no outliers. Using the modified z-score test, the observation 21 is a possible outlier.
(b) The following is the boxplot.
[Boxplot: exam scores, scale roughly 20 to 100.]
Hence, the observation 21 is identified as an outlier using the boxplot.
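The z-score and modified z-score used in part (a) can be sketched as follows. The modified z-score below is the standard Iglewicz-Hoaglin form, 0.6745(x − median)/MAD, which matches the 3.5 cutoff used above; the manual's exact "distribution-free" score formula is not reproduced here:

```python
import statistics

def z_scores(data):
    """Ordinary z-scores (x - xbar) / s."""
    m, s = statistics.mean(data), statistics.stdev(data)
    return [(x - m) / s for x in data]

def modified_z_scores(data):
    """Iglewicz-Hoaglin modified z-scores 0.6745 * (x - median) / MAD."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    return [0.6745 * (x - med) / mad for x in data]

def flag_outliers(data, scores, cutoff):
    return [x for x, z in zip(data, scores) if abs(z) > cutoff]
```

A lone extreme value such as 100 among small values is missed by the ordinary z-score (the outlier itself inflates s) but caught by the modified z-score, which is the same behavior seen with the observation 21 above.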
EXERCISES 14.4
14.4.1.
(a)
[Normal Q-Q plot: Theoretical Quantiles (x-axis, −2 to 2) versus Sample Quantiles (y-axis, 20 to 100).]
From the above normal probability plot we see that the data follow the straight line fairly well. Hence, the normality of the data is not rejected and no transformation is needed.
14.4.3.
(a) The following is the normal probability plot of the data. The graph clearly shows that the data do not follow a normal distribution.
[Normal Q-Q plot: Theoretical Quantiles (x-axis, −1.5 to 1.5) versus Sample Quantiles (y-axis, 5000 to 25000).]
(b) Take the transformation y = ln(x) and look at the normal probability plot of the transformed data below.
[Normal Q-Q plot of the transformed data: Theoretical Quantiles (x-axis, −1.5 to 1.5) versus Sample Quantiles (y-axis, 7.5 to 10.0).]
With the transformation, we can see that the transformed data fall much closer to the normal line.
14.4.5.
(a) The following is the normal probability plot of the data. The graph clearly shows that the data do not follow a normal distribution.
[Normal Q-Q plot: Theoretical Quantiles (x-axis, −1 to 1) versus Sample Quantiles (y-axis, 10000 to 40000).]
(b) Take the transformation y = ln(x) and look at the normal probability plot of the transformed data below.
[Normal Q-Q plot of the transformed data: Theoretical Quantiles (x-axis, −1 to 1) versus Sample Quantiles (y-axis, 8.0 to 10.5).]
With the transformation, we can see that the transformed data fall much closer to the normal line.
14.4.7.
(a) & (b) The following is the normal probability plot of the data. We see that the data follow the straight line except for one data point. This suggests that the data may follow normality but with a possible outlier.
[Normal Q-Q plot: Theoretical Quantiles (x-axis, −2 to 2) versus Sample Quantiles (y-axis, 20 to 50).]
(c) The following is the boxplot of the data.
[Boxplot: miles per gallon, scale roughly 20 to 50.]
Hence, the observation 52 is identified as a possible outlier using the boxplot. Further investigation is needed to check whether there was a measurement error in this case. Alternatively, this observation may suggest that a particular car is significantly better than the others in terms of mileage per gallon.
14.4.9.
Use the data from Exercise 14.2.1. Let X = the percent expense ratio and Y = the percent return. Then we can calculate

F = sX²/sY² = 0.0038.

Since F < F0.025(19, 19) = 0.3958, we reject the null hypothesis at level 0.05. Thus, we suggest that the variances of the two populations are not equal.
14.4.11.
Let X = the bonus for females and Y = the bonus for males. The assumption of the test is that the random samples of X and Y are from independent normal distributions. To test the homogeneity of variances of X and Y we calculate the ratio

F = sX²/sY² = 0.8044.

Since F0.025(7, 7) = 0.2002 < F < F0.975(7, 7) = 4.9949, we do not reject the null hypothesis at level 0.05. Thus, we suggest that the variances of the two populations are equal.
14.4.13.
Let X1, X2 and X3 be the scores of the students taught by the faculty, the teaching assistant and the adjunct, respectively. The assumption of the test is that the random samples of X1, X2 and X3 are from independent normal distributions. To test the homogeneity of variances of X1, X2 and X3 we first compute x̄1 = 81.6, x̄2 = 78.8 and x̄3 = 70.4. Letting yij = |xij − x̄i|, we then obtain the following yij values.

Deviation
Faculty              11.4  20.6  5.4   6.6   10.4
Teaching Assistant   9.2   11.2  2.8   3.2   20.8
Adjunct              15.6  14.4  2.6   19.6  23.4

The test statistic is

z = [ Σ_{i=1}^{k} ni(ȳi. − ȳ..)²/(k − 1) ] / [ Σ_{i=1}^{k} Σ_{j=1}^{ni} (yij − ȳi.)²/(n − k) ] = MST/MSE = 43.59/50.39 = 0.8651,

where n1 = 5, n2 = 5, n3 = 5, k = 3 and n = 15.
Since z < F0.95(2, 12) = 3.8853, at level 0.05 we do not reject the null hypothesis. That is, we suggest that the variances of the three populations are equal.
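The F-ratio above is a Levene-type statistic computed on the absolute deviations yij. An illustrative sketch, which takes the deviation table directly:

```python
def levene_f(y):
    """F = MST/MSE computed on groups of absolute deviations
    y_ij = |x_ij - xbar_i|, as in Exercise 14.4.13."""
    k = len(y)
    n = sum(len(g) for g in y)
    means = [sum(g) / len(g) for g in y]                      # ybar_i.
    grand = sum(sum(g) for g in y) / n                        # ybar_..
    mst = sum(len(g) * (m - grand) ** 2 for g, m in zip(y, means)) / (k - 1)
    mse = sum((v - m) ** 2 for g, m in zip(y, means) for v in g) / (n - k)
    return mst / mse
```

On the deviation table of Exercise 14.4.13 this returns approximately 0.8651, matching the hand computation.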
EXERCISES 14.5
14.5.1.
(a) The following is the dot plot of the data of Exercise 14.4.5.
[Dot plot: state expenditures, values ranging roughly from 10000 to 40000.]
(b) Mean = y¯ = 13373.53, median = 7145, and standard deviation = s = 11924.47.
(c) A 95% confidence interval for the mean is

ȳ ± t0.975(n − 1) · s/√n = 13373.53 ± 2.1448 · 11924.47/√15 = (6769.98, 19977.08).

(d) A 95% prediction interval is

ȳ ± t0.975(n − 1) · s √(1 + 1/n) = 13373.53 ± 2.1448 · 11924.47 · √(1 + 1/15) = (−13040.67, 39787.74).

Since state expenditure is nonnegative, we can take the 95% prediction interval as (0, 39787.74).
(e) There is a 95% chance that the true mean falls in (6769.98, 19977.08). There is a 95% chance that the next observation falls in (0, 39787.74). The assumption behind the confidence interval and prediction interval is that the data follow a normal distribution, or that the sample size is large enough to employ the central limit theorem.
14.5.3.
(a) Let X = the midterm score and Y = the final score. The following is the scatter plot of the data with the fitted regression line.
[Scatter plot: Midterm (x-axis, 20 to 100) versus Final (y-axis, 20 to 100), with fitted line.]
(b) The data do not show any particular pattern. No transformation is needed in this case.
(c) Fitting the data we obtain the linear regression model ŷ = 64.39 − 0.4048x. However, we have R² = 0.1345, meaning only 13.45% of the variation in y is explained by the variable x.
14.5.5.
(a) Let X = the in-state tuition and Y = the graduation rate. The following is the scatter plot of the data with the fitted regression line.
[Scatter plot: In-State Tuition (x-axis, 2000 to 7000) versus Graduation Rate (y-axis, 10 to 60), with fitted line.]
(b) Fitting the data by the least-squares method we obtain the linear regression model ŷ = 18.6887 − 0.0043x.
(c) We have R² = 0.1618, meaning only 16.18% of the variation in y is explained by the variable x. Thus, from the small R² and the scatter plot above we suggest that the least-squares line is not a good model and must be improved.
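The least-squares fits and R² values reported in these exercises can be sketched as follows (illustrative Python, not the manual's R code):

```python
def least_squares(x, y):
    """Fit yhat = b0 + b1*x by least squares; also return R^2 = Sxy^2/(Sxx*Syy)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1, sxy ** 2 / (sxx * syy)
```

For simple linear regression, R² equals the square of the sample correlation coefficient, which is why a small R² here goes together with a weak scatter-plot relationship.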
EXERCISES 14.6
14.6.1.
(a) The normal probability plot of the data is given below. From the normal plot we can see that the data significantly deviate from the normal line. Hence, we cannot assume the data are normally distributed, and a nonparametric test is more appropriate.
[Normal Q-Q plot: Theoretical Quantiles (x-axis, −2 to 2) versus Sample Quantiles (y-axis, 0.0 to 0.6).]
(b) Take the transformation y = ln(x) and look at the normal probability plot of the transformed data below.
[Normal Q-Q plot of the transformed data: Theoretical Quantiles (x-axis, −2 to 2) versus Sample Quantiles (y-axis, −4 to −1).]
We see that the transformed data do not deviate from the normal line by much. Thus, a parametric test can be used on the log-transformed data.
EXERCISES 14.7
14.7.1.
(a) Let X = total revenue and Y = pupils per teacher. The following is the dot plot of the data of pupils per teacher.
[Dot plot: pupils per teacher, values ranging from about 14 to 20.]
The descriptive statistics of the pupils-per-teacher data are given below.

n     Mean     Std     Min    Q1      Median  Q3      Max
16    16.6625  2.0063  14.2   14.975  16.25   17.525  20.2
(b) The boxplot of the pupils-per-teacher data is given below. From the boxplot we see that no outlier exists.
[Boxplot: pupils per teacher, scale 14 to 20.]
The following is the normal probability plot of the pupils-per-teacher data. The data are not normal.
[Normal Q-Q plot: Theoretical Quantiles (x-axis, −2 to 2) versus Sample Quantiles (y-axis, 14 to 20).]
The following gives the normal plot after taking the transformation z = y^(−2), which shows that the transformed data become approximately normal.
[Normal Q-Q plot of the transformed data: Theoretical Quantiles (x-axis, −2 to 2) versus Sample Quantiles (y-axis, 0.0025 to 0.0045).]
(c) A 95% confidence interval for the mean of the pupils per teacher is

ȳ ± t0.975(n − 1) · s/√n = 16.6625 ± 2.1315 · 2.0063/√16 = (15.59, 17.73).

(d) The following is the scatter plot of total revenue vs. pupils per teacher with the fitted regression line.
[Scatter plot: Total Revenue (x-axis, 5.0e+06 to 2.5e+07) versus Pupils per Teacher (y-axis, 14 to 20), with fitted line.]
(e) Fitting the data we obtain the linear regression model ŷ = 17.06 − 5.512·10⁻⁸ x. However, we have R² = 0.0324, meaning only 3.24% of the variation in y is explained by the variable x. Thus, the regression model is not a good representation of the relationship between total revenue and pupils per teacher.
14.7.3.
Let X = the in-state tuition and Y = the graduation rate. The following is the scatter plot of graduation rate vs. in-state tuition with the fitted regression line.
[Scatter plot: In-State Tuition (x-axis, 2000 to 7000) versus Graduation Rate (y-axis, 10 to 60), with fitted line.]
Fitting the data we then obtain the linear regression model ŷ = 18.6887 − 0.0043x with R² = 0.1618. To run residual model diagnostics we look at the following three plots.
[Residuals versus observation order: Order (x-axis, 2 to 14) versus Residual (y-axis, −20 to 20).]
[Residuals versus fitted values: Fitted Value (x-axis, 30 to 50) versus Residual (y-axis, −20 to 20).]
[Normal Q-Q plot of the standardized residuals: Theoretical Quantiles (x-axis, −1 to 1) versus Standardized Residuals (y-axis, −20 to 20).]
There is nothing unusual about the residual plots. Therefore, the basic assumptions on the errors in regression analysis (independence, normality, and homogeneity of variances) have been checked, and there seems to be no reason to reject them.