COMP 233 Winter 2016 PROBABILITY AND STATISTICS FOR COMPUTER SCIENCE Assignment 2 Solutions
1. (a) Suppose 1 in 1000 persons has a certain disease. A test detects the disease in 95% of diseased persons. The test also "detects" the disease in 1% of healthy persons. With what probability does a positive test diagnose the disease?
(b) Machines $M_1$, $M_2$, $M_3$ produce these proportions of an article: $M_1$: 5%, $M_2$: 25%, $M_3$: 70%. The probability that the machines produce defective articles is $M_1$: 5%, $M_2$: 4%, $M_3$: 2%. What is the probability that a random article was made by machine $M_1$, given that it is defective?
(c) A machine $M$ consists of three independent parts, $M_1$, $M_2$, and $M_3$. Suppose that $M_1$ functions properly with probability $\frac{9}{10}$, $M_2$ functions properly with probability $\frac{9}{10}$, $M_3$ functions properly with probability $\frac{8}{10}$, and that the machine $M$ functions if and only if its three parts function. What is the probability for the machine $M$ to malfunction?
SOLUTION:
(a) Given $P(D) = 0.001$, $P(+|D) = 0.95$, $P(+|D^c) = 0.01$. By Bayes' formula:
$$P(D|+) = \frac{P(+|D) \cdot P(D)}{P(+|D) \cdot P(D) + P(+|D^c) \cdot P(D^c)} = \frac{0.95 \cdot 0.001}{0.95 \cdot 0.001 + 0.01 \cdot 0.999} \approx 8.68\%.$$
(b) We are given that $P(M_1) = 0.05$, $P(M_2) = 0.25$, $P(M_3) = 0.70$, and $P(D|M_1) = 0.05$, $P(D|M_2) = 0.04$, $P(D|M_3) = 0.02$. Thus
$$P(M_1|D) = \frac{P(D|M_1) \cdot P(M_1)}{P(D|M_1) \cdot P(M_1) + P(D|M_2) \cdot P(M_2) + P(D|M_3) \cdot P(M_3)} = \frac{0.05 \cdot 0.05}{0.05 \cdot 0.05 + 0.04 \cdot 0.25 + 0.02 \cdot 0.70} \approx 9.4\%.$$
(c) The machine functions with probability $\frac{9}{10} \cdot \frac{9}{10} \cdot \frac{8}{10} = 64.8\%$, and hence malfunctions with probability $1 - 0.648 = 35.2\%$.
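The three answers to Problem 1 can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original solution; the function name `posterior` and the variable names are illustrative choices.

```python
from fractions import Fraction


def posterior(priors, likelihoods, k):
    """Return P(H_k | E) from priors P(H_i) and likelihoods P(E | H_i)."""
    # The law of total probability gives the denominator P(E).
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[k] * likelihoods[k] / evidence


# Part (a): hypotheses are (diseased, healthy); the evidence is a positive test.
disease = posterior(
    [Fraction(1, 1000), Fraction(999, 1000)],
    [Fraction(95, 100), Fraction(1, 100)],
    0,
)

# Part (b): hypotheses are machines M1, M2, M3; the evidence is a defect.
machine1 = posterior(
    [Fraction(5, 100), Fraction(25, 100), Fraction(70, 100)],
    [Fraction(5, 100), Fraction(4, 100), Fraction(2, 100)],
    0,
)

# Part (c): the machine works only if all three independent parts work.
malfunction = 1 - Fraction(9, 10) * Fraction(9, 10) * Fraction(8, 10)

print(float(disease), float(machine1), float(malfunction))
```

Working with `Fraction` avoids rounding, so the printed decimals can be compared directly with the percentages quoted in the solution.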
2. Three balls are selected at random from a bag containing 3 red, 3 green, and 4 blue balls. Define the random variables $R$ = the number of red balls drawn, and $G$ = the number of green balls drawn. List the values of
(a) the joint probability mass function $p_{R,G}(r, g)$.
(b) the marginal probability mass functions $p_R(r)$ and $p_G(g)$.
(c) the joint distribution function $F_{R,G}(r, g)$.
(d) the marginal distribution functions $F_R(r)$ and $F_G(g)$.
SOLUTION:
(a), (b)

p_{R,G}(r, g)   g = 0     g = 1     g = 2     g = 3     p_R(r)
r = 0           4/120     18/120    12/120    1/120     35/120
r = 1           18/120    36/120    9/120     0         63/120
r = 2           12/120    9/120     0         0         21/120
r = 3           1/120     0         0         0         1/120
p_G(g)          35/120    63/120    21/120    1/120     1

(c), (d)

F_{R,G}(r, g)   g = 0     g = 1     g = 2     g = 3     F_R(r)
r = 0           4/120     22/120    34/120    35/120    35/120
r = 1           22/120    76/120    97/120    98/120    98/120
r = 2           34/120    97/120    118/120   119/120   119/120
r = 3           35/120    98/120    119/120   1         1
F_G(g)          35/120    98/120    119/120   1         1
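The tables for Problem 2 can be regenerated by counting selections. The sketch below assumes the counting formula $p_{R,G}(r, g) = \binom{3}{r}\binom{3}{g}\binom{4}{3-r-g} / \binom{10}{3}$, which the solution does not state explicitly but which reproduces its entries; all function and variable names are illustrative.

```python
from fractions import Fraction
from math import comb

RED, GREEN, BLUE, DRAWN = 3, 3, 4, 3
TOTAL = comb(RED + GREEN + BLUE, DRAWN)  # 120 equally likely selections


def joint_pmf(r, g):
    """P(R = r, G = g): choose r red, g green, and the rest blue."""
    b = DRAWN - r - g
    if b < 0:
        return Fraction(0)
    return Fraction(comb(RED, r) * comb(GREEN, g) * comb(BLUE, b), TOTAL)


values = range(DRAWN + 1)
# Marginals are row sums and column sums of the joint pmf.
p_R = {r: sum(joint_pmf(r, g) for g in values) for r in values}
p_G = {g: sum(joint_pmf(r, g) for r in values) for g in values}


def joint_cdf(r, g):
    """F(r, g) = P(R <= r, G <= g): sum the pmf over all i <= r, j <= g."""
    return sum(joint_pmf(i, j) for i in range(r + 1) for j in range(g + 1))


for r in values:
    # Print numerators over 120 so rows can be compared with the tables.
    print([int(joint_pmf(r, g) * TOTAL) for g in values],
          [int(joint_cdf(r, g) * TOTAL) for g in values])
```

The marginal distribution functions are the last column and last row of the joint table, for example `joint_cdf(r, 3)` for $F_R(r)$.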
3. For the preceding problem, also determine
(a) The conditional probability mass functions $p_{R|G}$ and $p_{G|R}$. Are $R$ and $G$ independent?
(b) $E[R]$ and $E[G]$
(c) $Var(R)$ and $Var(G)$
(d) $cov(R, G)$
SOLUTION:
(a)

p_{R|G}(r|g)    g = 0     g = 1     g = 2     g = 3
r = 0           4/35      18/63     12/21     1
r = 1           18/35     36/63     9/21      0
r = 2           12/35     9/63      0         0
r = 3           1/35      0         0         0

p_{G|R}(g|r)    g = 0     g = 1     g = 2     g = 3
r = 0           4/35      18/35     12/35     1/35
r = 1           18/63     36/63     9/63      0
r = 2           12/21     9/21      0         0
r = 3           1         0         0         0

$R$ and $G$ are not independent: for example, $p_{R,G}(3, 3) = 0$ while $p_R(3) \cdot p_G(3) \neq 0$.
(b) $E[R] = E[G] = \frac{35}{120} \cdot 0 + \frac{63}{120} \cdot 1 + \frac{21}{120} \cdot 2 + \frac{1}{120} \cdot 3 = \frac{108}{120}$.
(c) $E[R^2] = E[G^2] = \frac{35}{120} \cdot 0 + \frac{63}{120} \cdot 1 + \frac{21}{120} \cdot 4 + \frac{1}{120} \cdot 9 = \frac{156}{120}$.
$Var(R) = E[R^2] - (E[R])^2 = 0.49 = Var(G)$.
(d) $E[RG] = \frac{36}{120} \cdot 1 + \frac{9}{120} \cdot 2 + \frac{9}{120} \cdot 2 = \frac{72}{120}$.
$cov(R, G) = E[RG] - E[R] \cdot E[G] = \frac{72}{120} - (\frac{108}{120})^2 = -0.21$.
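The moments in Problem 3 are weighted sums over the joint probability mass function of Problem 2, so they can be verified mechanically. This is a sketch under the same counting assumption as the tables above; the helper `expect` is a hypothetical name introduced only for this check.

```python
from fractions import Fraction
from math import comb


def joint_pmf(r, g):
    """P(R = r, G = g) for 3 balls drawn from 3 red, 3 green, 4 blue."""
    b = 3 - r - g
    if b < 0:
        return Fraction(0)
    return Fraction(comb(3, r) * comb(3, g) * comb(4, b), comb(10, 3))


support = [(r, g) for r in range(4) for g in range(4)]


def expect(h):
    """E[h(R, G)] as a weighted sum over the joint pmf."""
    return sum(h(r, g) * joint_pmf(r, g) for r, g in support)


mean_R = expect(lambda r, g: r)
mean_G = expect(lambda r, g: g)
var_R = expect(lambda r, g: r * r) - mean_R ** 2
cov_RG = expect(lambda r, g: r * g) - mean_R * mean_G

# Independence would require p(r, g) = p_R(r) * p_G(g) at every point.
p_R = {r: sum(joint_pmf(r, g) for g in range(4)) for r in range(4)}
p_G = {g: sum(joint_pmf(r, g) for r in range(4)) for g in range(4)}
independent = all(joint_pmf(r, g) == p_R[r] * p_G[g] for r, g in support)

print(mean_R, var_R, cov_RG, independent)
```

The negative covariance matches intuition: with only three draws, every red ball drawn leaves one fewer slot for a green ball.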
4. (a) A trial consists of tossing two dice. The result is counted as successful if the sum of the outcomes is 12. What is the probability that the number of successes in 36 such trials is greater than one? What is this probability if we approximate it using the Poisson random variable?
(b) Customers arrive at a counter at the rate of 20 per hour. Assume the arrivals have a Poisson distribution. What is the probability that more than two customers arrive in a period of 5 minutes?
SOLUTION:
(a) The probability of success is $p = 1/36$. Let the random variable $X$ measure the number of successes in 36 trials. We have
$$P(X = 0) = \binom{36}{0} p^0 (1 - p)^{36} = (35/36)^{36} \quad \text{and} \quad P(X = 1) = \binom{36}{1} p^1 (1 - p)^{35} = (35/36)^{35}.$$
Thus
$$P(X > 1) = 1 - P(X = 0) - P(X = 1) = 1 - (35/36)^{36} - (35/36)^{35} \approx 26.422\%.$$
Using the Poisson approximation to the Binomial we have, with $\lambda = np = 36/36 = 1$,
$$P(X = 0) = e^{-1} 1^0/0! = e^{-1} \quad \text{and} \quad P(X = 1) = e^{-1} 1^1/1! = e^{-1},$$
so that
$$P(X > 1) = 1 - P(X = 0) - P(X = 1) = 1 - 2e^{-1} \approx 26.424\%.$$
(b) Here $\lambda = \frac{5}{60} \cdot 20 \approx 1.667$, and
$$P(X > 2) = 1 - P(X = 0) - P(X = 1) - P(X = 2) = 1 - e^{-\lambda}\frac{\lambda^0}{0!} - e^{-\lambda}\frac{\lambda^1}{1!} - e^{-\lambda}\frac{\lambda^2}{2!} \approx 23.4\%.$$
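To see how close the Poisson approximation in Problem 4 is, both probabilities can be evaluated numerically. The following is a small sketch using only the standard library; the helper names `binomial_pmf` and `poisson_pmf` are illustrative.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


# Part (a): n = 36 trials, success probability p = 1/36.
n, p = 36, 1 / 36
exact = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)
approx = 1 - poisson_pmf(0, n * p) - poisson_pmf(1, n * p)

# Part (b): a rate of 20 per hour over a 5-minute window.
lam = 20 * 5 / 60
more_than_two = 1 - sum(poisson_pmf(k, lam) for k in range(3))

print(f"binomial: {exact:.5f}  poisson: {approx:.5f}  part (b): {more_than_two:.4f}")
```

The two values in part (a) differ only in the fifth decimal place, which is why the approximation is considered good even for a moderate number of trials when $p$ is small.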
5. Approximately 20,000 marriages took place in Québec last year. Assuming that each person's birthday is equally likely to be any of the 365 days of the year, estimate the probability that for one or more of these couples:
(a) both partners were born on April 1;
(b) both partners celebrated their birthday on the same day of the year.
(Hint: The Poisson random variable can be used.)
SOLUTION:
(a) The probability that a person was born on April 1 is $\frac{1}{365}$. Thus by the multiplication rule, the probability that both partners were born on April 1 is $1/365^2$. So, on average, the number of couples out of the 20,000 in which both partners were born on April 1 is $20{,}000/365^2 \approx 0.15$. The number of such couples can be assumed to follow a Poisson distribution with mean $\lambda = 0.15$. Thus
$$P\{N_\lambda \ge 1\} = 1 - P\{N_\lambda = 0\} = 1 - e^{-0.15} \approx 0.14.$$
(b) The probability that both partners were born on a specific day is again $1/365^2$. The probability that both partners were born on the same day of the year is $365/365^2 = \frac{1}{365}$. Thus, on average, the number of couples out of the 20,000 in which both partners were born on the same day is $20{,}000/365 \approx 54.8$. The number of such couples can be assumed to follow a Poisson distribution with mean $\lambda \approx 54.8$. Thus
$$P\{N_\lambda \ge 1\} = 1 - P\{N_\lambda = 0\} = 1 - e^{-54.8} \approx 1.$$
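The Poisson estimates in Problem 5 can be compared with the exact probability of at least one success in 20,000 trials. This sketch assumes the couples are independent of one another, which the solution uses implicitly; the function name `at_least_one` is hypothetical. It uses the unrounded mean $20{,}000/365^2$ rather than 0.15.

```python
from math import exp

COUPLES = 20_000
DAYS = 365


def at_least_one(p_single, n):
    """Return (exact, Poisson) values of P(at least one success in n trials)."""
    exact = 1 - (1 - p_single) ** n     # n independent trials
    poisson = 1 - exp(-n * p_single)    # Poisson with mean n * p_single
    return exact, poisson


both_april_1 = at_least_one(1 / DAYS**2, COUPLES)  # part (a)
same_day = at_least_one(1 / DAYS, COUPLES)         # part (b)

print(both_april_1)
print(same_day)
```

For part (a) the exact and Poisson values agree to several decimal places, and for part (b) both are indistinguishable from 1 in floating point.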
6. For the random variable $X$ with density function
$$f(x) = \begin{cases} 4x, & 0 < x \le \frac{1}{2} \\ 4 - 4x, & \frac{1}{2} < x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
(a) Determine the distribution function $F(x)$, and draw the graphs of $f(x)$ and of $F(x)$.
(b) Determine $P(\frac{1}{3} < X \le \frac{1}{2})$.
(c) Determine $E[X]$.
(d) Determine $Var(X)$, and $\sigma(X)$.
SOLUTION:
(a) For $x \in [0, \frac{1}{2}]$: $F(x) = \int_0^x 4t\,dt = 2t^2 \big|_0^x = 2x^2$.
For $x \in (\frac{1}{2}, 1]$: $F(x) = \frac{1}{2} + \int_{1/2}^x (4 - 4t)\,dt = \frac{1}{2} + (4t - 2t^2)\big|_{1/2}^x = \frac{1}{2} + (4x - 2x^2) - (2 - \frac{1}{2}) = 4x - 2x^2 - 1$.
In addition, $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x > 1$.
(b) $P(\frac{1}{3} < X \le \frac{1}{2}) = \frac{5}{18}$.
(c) $E[X] = \frac{1}{2}$.
(d) $Var(X) = \frac{1}{24}$, $\sigma(X) = \frac{\sqrt{6}}{12}$.
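The answers to (b), (c) and (d) of Problem 6 are stated without intermediate steps. One way to recover them from $f$ and $F$ is sketched below; the route through $E[X^2]$ is an assumption about how the values were obtained, not something the solution shows.

```latex
\begin{aligned}
P\left(\tfrac13 < X \le \tfrac12\right) &= F\left(\tfrac12\right) - F\left(\tfrac13\right) = \tfrac12 - \tfrac29 = \tfrac{5}{18}, \\
E[X] &= \int_0^{1/2} 4x^2\,dx + \int_{1/2}^{1} (4x - 4x^2)\,dx = \tfrac16 + \tfrac13 = \tfrac12, \\
E[X^2] &= \int_0^{1/2} 4x^3\,dx + \int_{1/2}^{1} (4x^2 - 4x^3)\,dx = \tfrac{1}{16} + \tfrac{11}{48} = \tfrac{7}{24}, \\
Var(X) &= E[X^2] - (E[X])^2 = \tfrac{7}{24} - \tfrac14 = \tfrac{1}{24}, \qquad
\sigma(X) = \sqrt{\tfrac{1}{24}} = \tfrac{\sqrt{6}}{12}.
\end{aligned}
```

The value $E[X] = \frac{1}{2}$ also follows without integration, since the density is symmetric about $x = \frac{1}{2}$.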
7. For the random variables $X$, $Y$ with joint density function
$$f(x, y) = \begin{cases} cxy^2(1 - x)(1 - y^2), & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
(a) For what value of $c$ is this a joint density function?
(b) Using this value of $c$, compute the density functions of $X$ and $Y$.
(c) What is the value of $Cov(X, Y)$?
(d) Determine $P\{X > Y\}$.
SOLUTION:
(a)
$$\begin{aligned}
\int_0^1 \int_0^1 f(x, y)\,dy\,dx &= \int_0^1 \int_0^1 cxy^2(1 - x)(1 - y^2)\,dy\,dx \\
&= c \cdot \int_0^1 x(1 - x) \int_0^1 y^2(1 - y^2)\,dy\,dx \\
&= c \cdot \int_0^1 x(1 - x)\,dx \cdot \int_0^1 y^2(1 - y^2)\,dy \\
&= c \cdot \left(\frac{x^2}{2} - \frac{x^3}{3}\right)\Big|_0^1 \cdot \left(\frac{y^3}{3} - \frac{y^5}{5}\right)\Big|_0^1 \\
&= c \cdot \frac{1}{6} \cdot \frac{2}{15} = \frac{c}{45}.
\end{aligned}$$
Hence $c = 45$.
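(b)
$$\begin{aligned}
f_X(x) &= \int_0^1 f(x, y)\,dy = \int_0^1 45xy^2(1 - x)(1 - y^2)\,dy \\
&= 45x(1 - x) \int_0^1 y^2(1 - y^2)\,dy \\
&= 45x(1 - x) \cdot \frac{2}{15} = 6x(1 - x),
\end{aligned}$$
$$\begin{aligned}
f_Y(y) &= \int_0^1 f(x, y)\,dx = \int_0^1 45xy^2(1 - x)(1 - y^2)\,dx \\
&= 45y^2(1 - y^2) \int_0^1 x(1 - x)\,dx \\
&= 45y^2(1 - y^2) \cdot \frac{1}{6} = \frac{15}{2} y^2(1 - y^2),
\end{aligned}$$
both for arguments in $[0, 1]$, and 0 otherwise.
(c) $X$ and $Y$ are independent since $f(x, y) = f_X(x) f_Y(y)$. Thus $Cov(X, Y) = 0$.
(d)
$$\begin{aligned}
P\{X > Y\} &= \int_0^1 \int_0^x f(x, y)\,dy\,dx = 45 \int_0^1 \int_0^x xy^2(1 - x)(1 - y^2)\,dy\,dx \\
&= 45 \int_0^1 x(1 - x) \int_0^x (y^2 - y^4)\,dy\,dx \\
&= 45 \int_0^1 x(1 - x) \left(\frac{y^3}{3} - \frac{y^5}{5}\right)\Big|_0^x dx = 45 \int_0^1 x(1 - x)\left(\frac{x^3}{3} - \frac{x^5}{5}\right) dx \\
&= 45 \int_0^1 \left(\frac{x^4}{3} - \frac{x^6}{5} - \frac{x^5}{3} + \frac{x^7}{5}\right) dx \\
&= 45 \left(\frac{x^5}{15} - \frac{x^7}{35} - \frac{x^6}{18} + \frac{x^8}{40}\right)\Big|_0^1 \\
&= 45 \left(\frac{1}{15} - \frac{1}{35} - \frac{1}{18} + \frac{1}{40}\right) = \frac{19}{56} \approx 34\%.
\end{aligned}$$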
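The constant $c$ and the probability $P\{X > Y\}$ from Problem 7 can be sanity-checked numerically. The sketch below uses a midpoint rule on a grid over the unit square; the grid size `N` and all names are illustrative choices, and the result is an approximation rather than an exact value.

```python
def shape(x, y):
    """Joint density without the constant c: x y^2 (1 - x)(1 - y^2)."""
    return x * y**2 * (1 - x) * (1 - y**2)


N = 200                      # grid cells per axis (illustrative choice)
h = 1 / N
mids = [(i + 0.5) * h for i in range(N)]

# Midpoint rule over the unit square: the integral of shape should be 1/c.
mass = sum(shape(x, y) for x in mids for y in mids) * h * h
c = 1 / mass

# P{X > Y}: cells strictly below the diagonal count fully; diagonal cells
# are cut in half by the line y = x, so they get weight 1/2.
prob = 0.0
for i, x in enumerate(mids):
    for j, y in enumerate(mids[: i + 1]):
        weight = 0.5 if i == j else 1.0
        prob += weight * c * shape(x, y) * h * h

print(round(c, 2), round(prob, 4), round(19 / 56, 4))
```

Giving the diagonal cells half weight keeps the discretisation error of order $h^2$; ignoring them would bias the estimate downward by roughly $1/N$.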
8. The side measurement of a die manufactured by a company is a random number $X$ that is uniformly distributed between 1 and 1.25 cm. (You may assume the die is a perfect cube.)
(a) Determine the distribution function of the volume of the die.
(b) What is the probability that the volume of a randomly selected die manufactured by this company is greater than 1.424?
SOLUTION:
(a) Let $X$ denote the side-length of the die. We know that $X$ is a number chosen randomly from the interval $[1, 1.25]$. The distribution function of the volume $Y = X^3$ is $F(y) = P(Y \le y)$. Thus we have to determine $F(y) = P(X^3 \le y)$ for $y \in \mathbb{R}$. We know that $X^3 \le y$ if and only if $X \le y^{1/3}$, so $F(y) = P(X \le y^{1/3})$. Since $X$ is randomly chosen from $[1, 1.25]$, if $y < 1$, then $P(X \le y^{1/3}) = 0$. For $y$ such that $1 \le y^{1/3} \le 1.25$ we have
$$P(X \le y^{1/3}) = P(X \in [1, y^{1/3}]) = \frac{y^{1/3} - 1}{1.25 - 1} = 4(y^{1/3} - 1).$$
Finally, if $y^{1/3} > 1.25$, then $P(X \le y^{1/3}) = 1$. Hence we obtain
$$F(y) = \begin{cases} 0, & y < 1 \\ 4(y^{1/3} - 1), & 1 \le y \le (1.25)^3 \\ 1, & y > (1.25)^3. \end{cases}$$
(b) The probability $P(Y > 1.424)$ can be calculated as
$$P(Y > 1.424) = 1 - P(Y \le 1.424) = 1 - F(1.424) = 1 - 4((1.424)^{1/3} - 1) \approx 0.5.$$
9. From past experience, a professor knows that the test score of students taking a final examination is a random variable with mean 65.
(a) Give an upper bound on the probability that a student's test score will exceed 75.
(b) Suppose in addition the professor knows that the variance of a student's test score is equal to 30. What can be said about the probability that a student will score between 55 and 75?
(c) How many students would have to take the examination so as to ensure, with probability at least 0.8, that the class average would be within 5 of 65?
SOLUTION:
(a) An upper bound on the probability that a student's test score will exceed 75 is $65/75$, namely, by Markov's inequality:
$$P\{X \ge 75\} \le E[X]/75 = 65/75.$$
(b) The probability that a student will score between 55 and 75 is greater than or equal to 0.70 by Chebyshev's inequality:
$$P\{|X - 65| \ge 10\} \le \frac{\sigma^2}{10^2} = 30/100,$$
or
$$P\{|X - 65| < 10\} \ge 1 - 0.30 = 0.70.$$
(c) With $\bar{X} = \frac{X_1 + X_2 + \ldots + X_n}{n}$, and knowing that, for independent scores,
$$Var(\bar{X}) = Var\left(\frac{X_1 + X_2 + \ldots + X_n}{n}\right) = Var\left(\frac{X_1}{n}\right) + Var\left(\frac{X_2}{n}\right) + \ldots + Var\left(\frac{X_n}{n}\right) = \frac{1}{n^2} Var(X_1) + \frac{1}{n^2} Var(X_2) + \ldots + \frac{1}{n^2} Var(X_n) = \frac{30}{n},$$
then from Chebyshev's inequality,
$$P\{|\bar{X} - 65| \ge 5\} \le \frac{Var(\bar{X})}{5^2} = \frac{30/n}{25} = \frac{6}{5n},$$
which is equal to 0.2 when $n = 6$. So $n = 6$ would suffice.
10. A stick of length 1 is split at a randomly selected point $X$, i.e., $X$ is uniformly distributed in the interval $[0, 1]$. Determine the expected length of the piece that contains the point $1/3$.
SOLUTION:
Let the function $L(X)$ denote the length of the piece that contains the point $1/3$:
$$L(x) = \begin{cases} 1 - x, & \text{if } x < 1/3, \\ x, & \text{if } x \ge 1/3. \end{cases}$$
Since the density function of $X$ is $f(x) = 1$ if $x \in [0, 1]$ and $f(x) = 0$ otherwise, we have
$$E[L(X)] = \int_0^1 L(x) f(x)\,dx = \int_0^1 L(x)\,dx = \int_0^{1/3} (1 - x)\,dx + \int_{1/3}^1 x\,dx = \left(x - \frac{x^2}{2}\right)\Big|_0^{1/3} + \frac{x^2}{2}\Big|_{1/3}^1 = \frac{13}{18}.$$
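The expected length in Problem 10 is the integral of $L(x)$ over $[0, 1]$, so it can be checked with a simple midpoint rule. This is a minimal sketch; the function name `piece_length` and the number of cells `N` are illustrative. `N` is chosen as a multiple of 3 so that the break at $x = 1/3$ falls on a cell boundary.

```python
def piece_length(x, point=1 / 3):
    """Length of the piece of the unit stick containing `point` when cut at x."""
    return 1 - x if x < point else x


N = 3000  # number of midpoint cells; a multiple of 3 (illustrative choice)
expected = sum(piece_length((i + 0.5) / N) for i in range(N)) / N

print(expected, 13 / 18)
```

Because $L$ is linear on each side of $1/3$, the midpoint rule is exact on every cell here, so the two printed values agree up to floating-point rounding.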