INTRODUCTION TO PROBABILITY
3. INTRODUCTION TO PROBABILITY

OBJECTIVES
On completion of this unit students should be able to:
1. Calculate and interpret simple probabilities using counting and tree diagrams.
2. Calculate probabilities of mutually exclusive events, independent events and complementary events.
3. From a crosstabulation, calculate conditional probabilities.
4. Use the probability distribution of a discrete random variable to calculate probabilities and to calculate and interpret the expected value and standard deviation of the random variable.
3.1 Prelude
Roger Federer is to play Andy Roddick. Who will win? If you select Federer it is unlikely that you will be totally certain of his victory; rather, you would say that he would probably win, or that he had a better (or higher) chance than Roddick of winning. The topic of probability aims to quantify this sense of "probably", "more likely" or "higher chance". If we say that 100% probability means that a player is certain to win and 0% means that they are certain to lose, then 50% would mean that both players have the same chance. If you select Federer to win, what percentage probability would you allocate?

60%   only slightly more likely to win
80%   quite likely to win, only a small chance of losing
95%   almost certain to win
There are many other situations where the outcome is uncertain but where a probability can be assigned. Many of these are subjective, so that different people will give different estimates of the probability. Examples include sporting contests, whether it will rain tomorrow, and share prices rising or falling.

In other situations the probability can be calculated exactly by simple counting methods and appropriate assumptions. For example, in tossing a fair coin the probability of a head is 50% or 0.5. This is because there are only two equally likely outcomes. We would say that the chance is 1 out of 2, or 1/2 = 0.5. This example suggests a simple definition of probability in the case where all outcomes are equally likely:

Pr(event) = (number of ways the event occurs) / (total number of possible outcomes)
Example 3-1
In one roll of a fair 6-sided die (singular of dice) what is:
(a) Pr(6)
(b) Pr(even number)
(c) Pr(number less than 5)
(d) Pr(2 and 4)?

Solution
(a) Pr(6) = 1/6, as there is only one 6 and six faces in total.
(b) Pr(even number) = 3/6 = 0.5, as there are 3 even-numbered faces.
(c) Pr(number < 5) = 4/6 = 2/3, as there are four numbers less than 5.
(d) Pr(2 and 4) = 0/6 = 0, as 2 and 4 cannot occur together.
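The counting definition can be checked with a few lines of Python. This is only a sketch: the names FACES and pr are illustrative choices, and exact fractions are used so the answers match the hand calculation for a fair die.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
FACES = {1, 2, 3, 4, 5, 6}

def pr(event):
    """Probability of an event (a set of faces), assuming equally likely faces."""
    return Fraction(len(event & FACES), len(FACES))

pr_six = pr({6})
pr_even = pr({f for f in FACES if f % 2 == 0})
pr_less_than_5 = pr({f for f in FACES if f < 5})
pr_2_and_4 = pr({2} & {4})  # one roll cannot show both 2 and 4: empty event

print(pr_six, pr_even, pr_less_than_5, pr_2_and_4)
```

Representing an event as a set of faces makes "and" a set intersection, which is why part (d) reduces to the empty set.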
Note that, as in the above example, probabilities are numbers between 0 and 1 (inclusive). In everyday language we may also express them as percentages.

3.2 Tree Diagrams
As events become more complicated we need some special methods to help us count the number of different events. A simple visual method is the tree diagram. For tossing a fair coin the diagram looks like this:
The total number of events equals the number of branches at the end of the (horizontal!) tree. For a fair coin this is 2.

Example 3-2
Draw a tree diagram for the random event of rolling a fair die.

Solution
[Tree diagram with six branches, one for each face 1 to 6.]

Tree diagrams are especially useful when two or more events take place together or successively.

Example 3-3
Two fair coins are tossed. How many outcomes are there? How many of these have 1 head and 1 tail?

Solution
There are four outcomes, namely: {H,H}, {H,T}, {T,H} and {T,T}. Of these, two have one head and one tail. Hence Pr(1 head and 1 tail) = 2/4 = 0.5, as the four events are equally likely.
Note that in this example the sum of the four probabilities, each equal to one quarter (= 0.25), is 1. This is always true for the complete collection of outcomes that make up one random experiment.

Basic properties of probabilities:
(i) probabilities are greater than or equal to zero and less than or equal to 1, i.e. 0 ≤ p_i ≤ 1
(ii) the sum of all possible probabilities is one, i.e. ∑ p_i = 1.
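A tree diagram for two coins can also be enumerated by program. The sketch below assumes two fair coins and uses itertools.product to list every branch; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Each coin shows H or T; product() walks every branch of the two-coin tree.
outcomes = list(product("HT", repeat=2))

# Equally likely outcomes, so each gets probability 1 / (number of outcomes).
prob = {outcome: Fraction(1, len(outcomes)) for outcome in outcomes}

# Event: exactly one head (and therefore one tail).
one_head_one_tail = [o for o in outcomes if o.count("H") == 1]
pr_one_each = sum(prob[o] for o in one_head_one_tail)

print(len(outcomes))       # number of branches at the end of the tree
print(pr_one_each)         # Pr(1 head and 1 tail)
print(sum(prob.values()))  # the probabilities of all outcomes sum to 1
```

Changing repeat=2 to a larger number lists the outcomes for more coins, which quickly becomes tedious to draw by hand.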
3.3 Probabilities from Data
Another source of exact probabilities is frequency tables.

Example 3-4
One hundred people are surveyed to identify their bank. From the following table what is the probability that a person selected at random from the 100 people is a customer of the Comm bank?

Bank      No. of customers
Bonza           20
Comm            26
Natty           34
Melb            20
Total          100

Solution
Pr(Comm) = 26/100 = 0.26, as each person has the same chance of selection in random sampling. Hence each person is equally likely to be selected.
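The same calculation can be sketched in Python by turning the frequency table into relative frequencies. The dictionary name customers is an illustrative choice; the counts are those of Example 3-4.

```python
from fractions import Fraction

# Frequencies from the bank survey in Example 3-4.
customers = {"Bonza": 20, "Comm": 26, "Natty": 34, "Melb": 20}
total = sum(customers.values())

# Relative frequency = probability of drawing a customer of that bank
# when every person is equally likely to be selected.
pr = {bank: Fraction(count, total) for bank, count in customers.items()}

print(float(pr["Comm"]))
print(sum(pr.values()) == 1)
```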
3.4 Relationships between Events
There are several ways in which two events can be related. The ideas will be illustrated using the simple random experiments of coins and dice and also tabulations of data. These relationships apply to any type of probabilities including subjective and exact probabilities.

3.4.1 The Complement of an Event
The complement of an event is the collection of all the other possible events. So in one toss of a fair coin the complement of a head is a tail; for one roll of a fair die the complement of the even numbers {2, 4, 6} is the set of odd numbers {1, 3, 5}. Since the sum of the probabilities of all events equals 1 we have the connection that Pr(event) + Pr(complement) = 1.
Example 3-5
For the bank customer data in Example 3-4 find the complement of: (a) {Bonza} and (b) {Bonza, Melb}, and the probabilities of these complements.

Solution
(a) The complement of {Bonza} is {Comm, Natty, Melb}.
    Since Pr({Bonza}) + Pr({Comm, Natty, Melb}) = 1,
    then Pr({Comm, Natty, Melb}) = 1 – Pr({Bonza})
                                 = 1 – 20/100
                                 = 0.8
(b) The complement of {Bonza, Melb} is {Comm, Natty}. Hence:
    Pr({Comm, Natty}) = 1 – Pr({Bonza, Melb})
                      = 1 – 40/100
                      = 0.6

3.4.2 Mutually Exclusive Events
Two events are mutually exclusive if they have no outcomes in common. This is equivalent to saying that if one event occurs it is impossible for the other event to occur (at the same time). For example in rolling a fair die these pairs of events are mutually exclusive:
• {2} and {4} (cf. Example 3-1(d)).
• {even numbers} and {odd numbers}.

However the following two events are not mutually exclusive: {1, 2, 3, 4} and {2, 4, 6}, as they have both 2 and 4 in common. This means that if, with one roll of the die, we get, say, a 4 then both events have occurred.

In Australian Rules football scoring has two mutually exclusive outcomes: a goal and a behind. (A behind is a near miss of the goal on either side.)

Example 3-6
Which of the following are mutually exclusive pairs of events?
(a) Rolling a 7 or an 8 with a pair of dice.
(b) Living in Geelong and working in Melbourne.
(c) Being under 21 years of age and being Prime Minister of Australia.
(d) Being a parliamentarian and being a farmer.
(e) Drawing a red card or drawing an Ace out of an ordinary deck of playing cards.
(f) Forming a five-person committee that contains two women or that contains three women.

Solution
(a), (c) and (f) are mutually exclusive pairs of events.

If two events are mutually exclusive then Pr(either one or the other occurs) = Pr(one) + Pr(other). If we denote two events using the letters A and B then, if they are mutually exclusive,

Pr(A or B) = Pr(A) + Pr(B)
Example 3-7
For one roll of a fair die, what is the probability of a 2 or a 4?

Solution
We can calculate Pr(2 or 4) as 2/6 by the counting method, as there are only two numbers in the event. But since {2} and {4} are mutually exclusive and Pr(2) = 1/6 = Pr(4), then

Pr(2 or 4) = 1/6 + 1/6 = 2/6

In the bank data of Example 3-4, as each of the 100 customers belongs to only one bank, the total frequency is 100 and belonging to a bank gives four mutually exclusive events. For Example 3-5(b) we can calculate the probabilities in this way:

Pr({Bonza or Melb}) = 20/100 + 20/100 = 0.4
and
Pr({Comm or Natty}) = 26/100 + 34/100 = 0.6
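As a rough check of the addition rule, the sketch below treats die events as Python sets. The helper names pr and mutually_exclusive are illustrative; the point is that adding probabilities matches direct counting only when the events share no outcomes.

```python
from fractions import Fraction

FACES = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def pr(event):
    """Counting probability for equally likely faces."""
    return Fraction(len(event & FACES), len(FACES))

def mutually_exclusive(a, b):
    """Two events are mutually exclusive when they share no outcomes."""
    return a.isdisjoint(b)

two, four = {2}, {4}
# Addition rule agrees with direct counting for mutually exclusive events.
print(mutually_exclusive(two, four), pr(two | four), pr(two) + pr(four))

# {1, 2, 3, 4} and {2, 4, 6} overlap, so simply adding double counts 2 and 4.
a, b = {1, 2, 3, 4}, {2, 4, 6}
print(mutually_exclusive(a, b), pr(a | b), pr(a) + pr(b))
```

In the second case the naive sum exceeds 1, which is a quick signal that the events were not mutually exclusive.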
Consider the following table. One hundred people were asked which charities they supported.

Charity              Frequency
Salvation Army           44
Smith Family             35
Heart Foundation         22
Deaf and Blind           24
Guide Dogs               28
Total                   153

As the total is more than 100 some people are supporting more than one charity. Hence this is not a set of mutually exclusive events and it would be incorrect to say, for example,

Pr({Salvation Army or Smith Family}) = Pr({Salvation Army}) + Pr({Smith Family}) = 44/153 + 35/153

as some people are counted twice in this calculation.
3.4.3 Independent Events
This is a quite different type of relationship from mutual exclusivity, as the property of independence is conferred by the physical method of performing the events. The basic method for guaranteeing independence is to select the items at random. This means that the selection of any one particular item has no influence on what other items will be selected. Some simple examples of random events that are independent are:

• Tossing a fair coin several times. Each toss is (physically) independent of previous tosses as these cannot affect the outcome of any later toss. So if the first toss is a head, on the second toss heads and tails are still equally likely.

• Rolling two fair dice together. Whatever comes up on one face cannot affect the other. Of course the two dice may collide but this can only lead to an accidental change in the numbers. There is no systematic effect such as if one is an even number then the other must be an odd number.

• Two weeks' Tattslotto numbers. The numbered balls are thoroughly mixed and each one is selected at random. The numbers selected in one week have no effect on which numbers are selected in subsequent weeks.

• Using a computer program that generates random numbers to select individuals for a survey. This method is used to randomly select telephone numbers for opinion polls.
Example 3-8
Which of the following pairs of events are independent and which are not independent?
(a) Being a girl and having red hair.
(b) Having a flat tyre and being late for work.
(c) Living in Hawthorn and being over 2 metres tall.
(d) A student studying hard and passing an exam.

Solution
(a) and (c) are independent. (b) and (d) are (probably) dependent.
Example 3-9
In ten tosses of a fair coin we get 10 straight heads. On the next toss is tails more likely than heads?

Solution
No! As each toss is independent of all other tosses, heads and tails are equally likely at every toss, regardless of what has happened before.
If two events are independent then the probability that both happen together can be calculated from their individual probabilities. Call two events A and B. If A and B are independent events then:

Pr(A and B) = Pr(A) × Pr(B)

Example 3-10
Two fair coins are tossed together. What is the probability that both coins show a head?

Solution
Since Pr(head) = 0.5 for each coin, and the coins are tossed independently of each other, then:

Pr(2 heads) = 0.5 × 0.5 = 0.25.

This agrees with, and supplements, the tree diagram analysis in Example 3-3.
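The multiplication rule can be verified by brute-force counting. The sketch below assumes two fair dice rolled together; the two events (first die even, second die greater than 4) are illustrative choices that do not come from the text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice together.
outcomes = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_even(o):
    """Event A (illustrative): the first die shows an even number."""
    return o[0] % 2 == 0

def second_over_4(o):
    """Event B (illustrative): the second die shows more than 4."""
    return o[1] > 4

pr_a = pr(first_even)
pr_b = pr(second_over_4)
pr_both = pr(lambda o: first_even(o) and second_over_4(o))

# For independent events the last two printed values should agree.
print(pr_a, pr_b, pr_both, pr_a * pr_b)
```

Because each event depends on a different die, counting the joint outcomes gives the same answer as multiplying the two separate probabilities.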
3.4.4 Conditional Probability
If two events are not independent then they are dependent. In this case the probability of one event occurring is affected by whether or not the other related event has occurred. Related events are often displayed in crosstabulations where two variables are measured for a set of individuals.
Example 3-11
A Bondi Beach nightclub has the following data on the age and marital status of 140 customers:

                           Age
Marital Status     Under 30    30 and over
Single                77           28
Married               14           21

(a) Convert this table to show probabilities.
(b) Comment on the age of customers attending the club.
(c) Comment on the marital status of customers attending the club.
(d) What is the probability of finding a customer who is single and under the age of 30?
(e) If a customer is under 30, what is the probability that he or she is single?

Solution
In this table we have recorded the age and marital status of 140 people. Each cell gives a different combination of age and marital status. For example there are 14 people aged under 30 who are married.

(a) We divide each frequency by 140 to convert the frequencies to probabilities.

                           Age
Marital Status     Under 30    30 and over
Single               0.55         0.20
Married              0.10         0.15

    Note that the sum of the four probabilities is 1.

(b) The majority of customers, 65%, are aged under 30.

(c) The majority of customers, 75%, are single.

(d) Pr(single and under 30) = 0.55

(e) This question asks for a conditional probability. The condition is that the customer is under 30. This means that we must go back to the original table and only consider the under 30 column of the age variable.
                     Age
Marital Status     Under 30
Single                77
Married               14

There are 91 customers under 30. Hence the (conditional) probability that a customer is single, given that they are under 30, is 77/91 = 0.846. Since the unconditional probability that a customer is single is 0.75 (from (c)), this shows that being single and being under 30 are not independent, as these two probabilities are not equal.
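The joint, marginal and conditional probabilities in this example can be reproduced from the frequency table. The nested dictionary layout below is just one possible representation and the variable names are illustrative.

```python
# Frequencies from Example 3-11: counts[status][age_group].
counts = {
    "Single":  {"Under 30": 77, "30 and over": 28},
    "Married": {"Under 30": 14, "30 and over": 21},
}
total = sum(sum(row.values()) for row in counts.values())

# Joint probability: single AND under 30, out of all customers.
pr_single_and_under_30 = counts["Single"]["Under 30"] / total

# Marginal probabilities come from row and column totals.
pr_single = sum(counts["Single"].values()) / total
under_30_total = sum(row["Under 30"] for row in counts.values())
pr_under_30 = under_30_total / total

# Conditional probability: restrict attention to the under-30 column.
pr_single_given_under_30 = counts["Single"]["Under 30"] / under_30_total

print(round(pr_single_and_under_30, 2), round(pr_single, 2), round(pr_under_30, 2))
print(round(pr_single_given_under_30, 3))
```

The only difference between the joint and the conditional probability is the denominator: all 140 customers in the first case, only the 91 customers under 30 in the second.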
3.5 Two General Examples

Example 3-12
A company’s employees have been classified according to age and salary as shown below:

                              Salary
Age          < $25,000   $25,000 to $45,000   > $45,000   TOTAL
< 30             32               3                0         35
30 to 45         10              18               21         49
> 45              1              10                5         16
TOTAL            43              31               26        100

One employee is selected at random and two events are defined as follows:
A: The employee is under 30
B: The employee’s salary is under $25,000.

Express each of the following probabilities in words and find their numerical values:
(a) Pr(A)
(b) Pr(B)
(c) Pr(A and B)
(d) Pr(A or B)
(e) Pr(A occurs given that B occurs)
(f) Pr(B occurs given that A occurs)
(g) Find the probability that the employee’s salary is at least $25,000 given that the employee is at least 30 years of age.
(h) Are events A and B mutually exclusive?
(i) Are events A and B independent events?

Solution
(a) Pr(A) = 35/100 = 0.35
    Pr(employee is under 30) is 0.35

(b) Pr(B) = 43/100 = 0.43
    Pr(employee’s salary is under $25,000) is 0.43

(c) Pr(A and B) = 32/100 = 0.32
    Pr(employee is under 30 and salary is under $25,000) is 0.32

(d) Pr(A or B) = (32 + 3 + 0 + 10 + 1)/100 = 46/100 = 0.46
    Pr(employee is under 30 or salary is under $25,000) is 0.46

(e) Pr(A occurs given that B occurs) = 32/43 = 0.744
    Pr(an employee with salary under $25,000 is also under 30) is 0.744

(f) Pr(B occurs given that A occurs) = 32/35 = 0.914
    Pr(an employee under 30 also has a salary under $25,000) is 0.914

(g) Pr(salary at least $25,000 given age at least 30) = (18 + 21 + 10 + 5)/(10 + 18 + 21 + 1 + 10 + 5) = 54/65 = 0.831

(h) A and B are not mutually exclusive as an employee can be both.

(i) From (a) and (b), Pr(A) × Pr(B) = 0.35 × 0.43 = 0.151, which is different from (c). Hence A and B are not independent.
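The whole of this solution can be reproduced from the crosstabulation. The following is a sketch only: the short category labels and the helper names count, in_a and in_b are illustrative, not part of the original example.

```python
# Crosstabulation from Example 3-12: table[age][salary] = number of employees.
AGES = ["<30", "30-45", ">45"]
SALARIES = ["<25k", "25k-45k", ">45k"]
table = {
    "<30":   {"<25k": 32, "25k-45k": 3,  ">45k": 0},
    "30-45": {"<25k": 10, "25k-45k": 18, ">45k": 21},
    ">45":   {"<25k": 1,  "25k-45k": 10, ">45k": 5},
}
n = sum(table[a][s] for a in AGES for s in SALARIES)

def count(pred):
    """Number of employees whose (age, salary) cell satisfies pred."""
    return sum(table[a][s] for a in AGES for s in SALARIES if pred(a, s))

def in_a(age, salary):
    """Event A: the employee is under 30."""
    return age == "<30"

def in_b(age, salary):
    """Event B: the employee's salary is under $25,000."""
    return salary == "<25k"

n_a_and_b = count(lambda a, s: in_a(a, s) and in_b(a, s))
pr_a = count(in_a) / n
pr_b = count(in_b) / n
pr_a_and_b = n_a_and_b / n
pr_a_or_b = count(lambda a, s: in_a(a, s) or in_b(a, s)) / n
pr_a_given_b = n_a_and_b / count(in_b)
pr_b_given_a = n_a_and_b / count(in_a)
# Part (g): salary at least $25,000 given age at least 30.
pr_g = (count(lambda a, s: not in_a(a, s) and not in_b(a, s))
        / count(lambda a, s: not in_a(a, s)))

print(pr_a, pr_b, pr_a_and_b, pr_a_or_b)
print(round(pr_a_given_b, 3), round(pr_b_given_a, 3), round(pr_g, 3))
# Independence would require Pr(A and B) to equal Pr(A) * Pr(B).
print(abs(pr_a_and_b - pr_a * pr_b) < 1e-9)
```

Counting cells with an "or" test avoids the double counting that would occur if Pr(A) and Pr(B) were simply added.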
Example 3-13
Note: The solutions to each part of this question will be given immediately after the question part.

Bendigo Power Generation Pty Ltd (BPG) is currently starting work on a project designed to increase the generating capacity of one of its plants in the Bendigo District of Central Victoria. The project is divided into two sequential stages: stage 1 (design) and stage 2 (construction). While each stage will be scheduled and controlled as closely as possible, management cannot predict beforehand the exact elapsed time for each stage of the project. An analysis of similar construction projects over the past 3 years has shown completion times for the design stage of 2, 3, or 4 months and completion times for the construction stage of 6, 7, or 8 months.

(a) Draw a tree diagram for the BPG project, including a listing of all possible outcomes.

(b) Based on their experience and judgment, management concluded that the experimental outcomes were not equally likely. By conducting a study of the completion time for similar projects undertaken by BPG over the past three years management found the data below. Using this data, calculate the probabilities associated with each sample point and add these probabilities to the tree diagram.

Completion time (months)
Stage 1    Stage 2    Sample Point    No. of past projects having these completion times
   2          6          (2,6)            6
   2          7          (2,7)            6
   2          8          (2,8)            2
   3          6          (3,6)            4
   3          7          (3,7)            8
   3          8          (3,8)            2
   4          6          (4,6)            2
   4          7          (4,7)            4
   4          8          (4,8)            6
                                       TOTAL = 40 projects

These possible outcomes can also be presented in a table:

                            Stage 1 Design
Stage 2 Construction      2        3        4
         6               0.15     0.10     0.05
         7               0.15     0.20     0.10
         8               0.05     0.05     0.15
(c) Due to the critical need for additional power, management has set a goal of 10 months for the total project completion time. Hence the entire project will be completed late if the total elapsed time to complete both stages exceeds 10 months. What is the probability that the project will be finished on time, that is, in 10 months or less?

Pr(on time) = (0.15 + 0.15 + 0.05) + (0.10 + 0.20) + 0.05 = 0.70

We can add these probabilities because they are all mutually exclusive outcomes.

(d) What is the most likely or most probable completion time?

Pr(8 months) = 0.15
Pr(9 months) = 0.25
Pr(10 months) = 0.30
Pr(11 months) = 0.15
Pr(12 months) = 0.15

Therefore the most likely completion time of the project is 10 months.

(e) What is the probability that the project will be completed within a year?

Pr(completed within a year) = 1, because the largest possible completion time is 12 months.
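Parts (b) to (e) can be sketched in Python by converting the past-project counts to probabilities and then grouping the sample points by total completion time. The variable names are illustrative, and exact fractions are used to avoid rounding noise.

```python
from collections import defaultdict
from fractions import Fraction

# Past projects from Example 3-13: (stage 1 months, stage 2 months) -> count.
past_projects = {
    (2, 6): 6, (2, 7): 6, (2, 8): 2,
    (3, 6): 4, (3, 7): 8, (3, 8): 2,
    (4, 6): 2, (4, 7): 4, (4, 8): 6,
}
n_projects = sum(past_projects.values())

# Probability of each sample point = relative frequency.
pr_point = {pt: Fraction(c, n_projects) for pt, c in past_projects.items()}

# Distribution of total completion time: add up mutually exclusive sample points.
pr_total = defaultdict(Fraction)
for (design, construction), p in pr_point.items():
    pr_total[design + construction] += p

pr_on_time = sum(p for months, p in pr_total.items() if months <= 10)
most_likely = max(pr_total, key=pr_total.get)
pr_within_year = sum(p for months, p in pr_total.items() if months <= 12)

for months in sorted(pr_total):
    print(months, float(pr_total[months]))
print(float(pr_on_time), most_likely, float(pr_within_year))
```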
Complete Exercise Set 3A Calculating Probabilities in Section 3.8.

3.6 Discrete Probability Distributions

3.6.1 Random Variables
In the previous sections and exercises the examples were concerned with random events and the probabilities of the different outcomes of these events. For example in one toss of a fair coin there were two outcomes, Head and Tail, each with probability 0.5; in Example 3-4 people were customers of one of four banks and the probability of being a customer of each bank was calculated from the frequencies.
In all these examples we have two things: a random variable, namely the outcome of a random experiment, and a set of probabilities corresponding to each of the outcomes. When we combine these two we have the probability distribution of the random variable.

In our examples in this section probability distributions will be displayed as a table, as in Example 3-14. This is because all the random variables are discrete, i.e. the outcomes are all distinct numbers or categories. (Random variables can also be continuous, in which case a graph or table is used to represent the probability distribution. An important example is discussed in section 4 below.)

Random variables are given a single letter name such as X or Y. Thus in one toss of a fair coin X is the outcome, either a head or a tail.

Example 3-14
(a) What is the probability distribution for one toss of a fair coin?

Tossing a fair coin
x         Head    Tail
Pr(x)     0.5     0.5

(b) What is the probability distribution for one roll of a fair die?

Rolling a fair die
x         1      2      3      4      5      6
Pr(x)     1/6    1/6    1/6    1/6    1/6    1/6

What is the probability of getting a number greater than 4? Name the probability rule that you need to calculate this probability.

Pr(X > 4) = Pr(5) + Pr(6) = 1/6 + 1/6 = 2/6 = 1/3.

Since 5 and 6 are mutually exclusive events we can add the individual probabilities (the Addition Law).

(c) Students in a university residence visit a nearby Pizza Parlour with varying frequencies per month. Let X be the number of times a student visits the Pizza Parlour each month. The probability distribution of X could look like this:

x         0      1      2      3
Pr(x)     0.1    0.3    0.4    0.2

If a student is selected at random find the probability that he/she will visit the Pizza Parlour:
i) No times.              Pr(X = 0) = 0.1
ii) Once.                 Pr(X = 1) = 0.3
iii) Two or more times.   Pr(X ≥ 2) = 0.4 + 0.2 = 0.6, as the values of X are mutually exclusive.

NOTE: All the usual probability conditions are satisfied: ΣPr(X) = 1, i.e. the sum of all the probabilities is one, and 0 ≤ Pr(X) ≤ 1 for all values of X.
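A discrete probability distribution maps naturally onto a dictionary from values to probabilities. The sketch below uses the Pizza Parlour distribution; the helper names is_valid_distribution and pr are illustrative, and a small tolerance is assumed because decimal probabilities are stored as floating-point numbers.

```python
import math

# Probability distribution from Example 3-14(c): visits per month -> probability.
pizza = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

def is_valid_distribution(dist, tol=1e-9):
    """Check 0 <= Pr(x) <= 1 for every x and that the probabilities sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    return in_range and math.isclose(sum(dist.values()), 1.0, abs_tol=tol)

def pr(dist, condition):
    """Add the probabilities of the (mutually exclusive) values meeting a condition."""
    return sum(p for x, p in dist.items() if condition(x))

print(is_valid_distribution(pizza))
print(round(pr(pizza, lambda x: x >= 2), 10))  # Pr(X >= 2)
```

The same validity check applies to any table of probabilities, for example when deciding whether a set of subjective estimates forms a proper distribution.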
3.6.2 Graphs of Probability Distributions
We can make a graph of a probability distribution by putting the X values on the horizontal axis and the probabilities on the vertical axis.

Example 3-15
Graphs of two of the probability distributions from Example 3-14, (a) and (c).

(a) [Graph: probability distribution for one toss of a fair coin]

(c) [Graph: probability distribution for the number of visits to the Pizza Parlour]

In these graphs it is the top of the vertical lines that represents the probabilities. The rest of the line is drawn only to make the point more visible.
3.6.3 Expected Value and Standard Deviation of a Random Variable
Given a set of data we can calculate the sample mean and standard deviation. Similarly, given a probability distribution we can calculate the mean and standard deviation of the corresponding random variable. The formulae and procedures are very similar to those for sample data and can be carried out for discrete distributions using Microsoft Excel.

The mean of a numerical random variable X is also called the Expected Value of X, written E(X). The name derives from the idea that the mean is the value of X that would be expected as the average value in the long run. It is calculated from the formula:

µ = E(X) = ∑ X.Pr(X)

The variance of a numerical random variable X is often written Var(X). For a discrete random variable,

Var(X) = E(X²) − µ², where E(X²) = ∑ X².Pr(X)

The standard deviation is the square root of the variance, √Var(X).
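These three formulae translate directly into code. The sketch below is one possible implementation, assuming the distribution is held as a dictionary of value to probability; the function names are illustrative, and the fair die from Example 3-14(b) is used as input.

```python
import math

def expected_value(dist):
    """E(X) = sum of x * Pr(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - mu^2."""
    mu = expected_value(dist)
    e_x_squared = sum(x**2 * p for x, p in dist.items())
    return e_x_squared - mu**2

def std_dev(dist):
    """Standard deviation = square root of the variance."""
    return math.sqrt(variance(dist))

# One roll of a fair die, as in Example 3-14(b).
die = {face: 1 / 6 for face in range(1, 7)}

print(round(expected_value(die), 4))
print(round(variance(die), 4))
print(round(std_dev(die), 4))
```

The functions work for any discrete distribution with numerical values, so the same code can be reused for other tables in this section.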
Example 3-16
Calculate the mean value of the result of one roll of a fair die.

Solution
µ = E(X) = ∑ X.Pr(X)
  = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6
  = 21/6
  = 3.5
The average of all the faces is 3.5, which can be seen by adding the numbers from 1 to 6 and then dividing by 6.

Example 3-17
What is the average number of times per month that a student (from Example 3-14(c)) goes to the Pizza Parlour?

Solution
The average number of times per student is the expected value of X.

µ = E(X) = ∑ X.Pr(X) = 0 × 0.1 + 1 × 0.3 + 2 × 0.4 + 3 × 0.2 = 1.7

i.e. a student selected at random is expected to go to the Pizza Parlour on average 1.7 times per month.

The expected value and variance of a discrete random variable can be calculated using Microsoft Excel as in the following example.

Example 3-18
Calculate the mean and standard deviation of the Pizza Parlour data of Example 3-14(c).

Solution
Using Microsoft Excel 2007 to calculate the expected value, variance and standard deviation of a discrete random variable

Step 1: Enter the probability distribution
In a new Excel spreadsheet, type X in cell A1 and Pr(X) in cell B1. Enter the values of X and Pr(X) given in Example 3-14(c) in cells A2 to A5 and B2 to B5, so that your spreadsheet appears as shown below.

Step 2: Calculating the expected value
In cell C1, type X*Pr(X). To calculate the expected value we need to first multiply the outcome (X) of each event in column A by its probability Pr(X) in column B of the spreadsheet. Therefore, go to cell C2 and type =a2*b2 and press ENTER.

To copy this formula to the other cells in column C, right-click cell C2 and select Copy from the resulting drop-down menu. Then press the Shift key, scroll down with the left mouse button so that C3 to C5 are also highlighted, right-click the mouse again, and select Paste. Your spreadsheet should then appear as shown below.

We now need to add the values in column C since the expected value is the sum ∑ X.Pr(X). We will do this in row 7 of the spreadsheet here. First, type E(X) in cell A7 so we can keep track of our calculations. Then, in cell B7, type =sum(c2:c5) and press ENTER, to obtain the result below.

The mean E(X) is 1.7.
Step 3: Calculating E(X²)
To calculate the variance, we first need to calculate E(X²), which is equal to ∑ X².Pr(X). Therefore, type X^2 in cell D1 and X^2*Pr(X) in E1.

To calculate X^2 for the first outcome (X) in row 2, go to cell D2 and type =a2^2 and press ENTER.

Then copy this formula to the other cells in column D by right-clicking cell D2 and selecting Copy from the resulting drop-down menu. Press the Shift key, scroll down with the left mouse button so that D3 to D5 are also highlighted, right-click the mouse again, and select Paste.

To calculate X^2*Pr(X) we want to multiply the value in a given cell of column D by the corresponding cell in column B. Therefore, go to cell E2 and type =d2*b2 and press ENTER. Then copy and paste this formula to cells E3 through E5 so that your spreadsheet appears as below.

We will calculate E(X²) in row 8. First, type E(X^2) in cell A8. Then, in cell B8, we will sum the X².Pr(X) values. That is, we will sum cells E2 to E5. Hence, in cell B8, type =sum(e2:e5) and press ENTER.

Hence E(X²) = 3.7
Step 4: Calculating the variance
We will now calculate Var(X) in row 9, so type Var(X) in cell A9. Since Var(X) = E(X²) − µ², and we calculated E(X²) in cell B8 and µ in cell B7, go to cell B9 and type =b8-b7^2 and press ENTER.

Hence, variance = 0.81

Step 5: Calculating the standard deviation
We will calculate the standard deviation in row 10 of our spreadsheet. First, type Standard Deviation in cell A10. The standard deviation is the square root of the variance (which we calculated in cell B9), which can be calculated using the sqrt function in Excel. That is, go to cell B10 and type =sqrt(b9) and press ENTER.

The value of the standard deviation is 0.9 in this example.

Example 3-19
A company operates a chain of takeaway food outlets. It intends to build an additional outlet and surveys the following possible locations:

Location      Probability of Success    Estimated Profit if Successful    Estimated Profit if Unsuccessful
Ringwood             0.6                        $35,000                            $2,500
Footscray            0.4                        $60,000                            $7,000
St. Kilda            0.5                        $45,000                            $6,000

Which location should be chosen? Consider both the expected profit and the risk involved as measured by the standard deviation.
Solution
For each location we can calculate the expected value of the profit from the probability distribution.

For Ringwood the distribution is:
x (profit $)     35,000    2,500
Pr(x)            0.6       0.4

The probability of failure is 1 – 0.6 = 0.4 as failure is the complement of success.

For Footscray the distribution is:
x (profit $)     60,000    7,000
Pr(x)            0.4       0.6

For St. Kilda the distribution is:
x (profit $)     45,000    6,000
Pr(x)            0.5       0.5

For Ringwood, the expected value is

µ = E(X) = 0.6 × 35,000 + 0.4 × 2,500 = 22,000.
We obtain the three expected profits and standard deviations individually, using Microsoft Excel as in Example 3-18. The coefficient of variation is the standard deviation divided by the expected profit.

Location      Expected profit ($)    Standard deviation ($)    Coeff of variation
Ringwood           22,000                 15,922                    72%
Footscray          28,200                 25,965                    92%
St. Kilda          25,500                 19,500                    76%

Hence we could choose Footscray as it is the most profitable location, noting that its profit is also the most variable and hence the most risky. St. Kilda is a compromise, with the second highest expected profit and the second lowest risk as measured by the coefficient of variation.
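The comparison of the three locations can also be sketched in Python rather than Excel. The function name summarise and the tuple layout are illustrative choices; each location is treated as a two-outcome distribution (success or failure).

```python
import math

# Example 3-19: location -> (Pr(success), profit if successful, profit if unsuccessful).
locations = {
    "Ringwood":  (0.6, 35_000, 2_500),
    "Footscray": (0.4, 60_000, 7_000),
    "St. Kilda": (0.5, 45_000, 6_000),
}

def summarise(p_success, profit_success, profit_fail):
    """Return (expected profit, standard deviation, coefficient of variation)."""
    # Failure is the complement of success.
    dist = {profit_success: p_success, profit_fail: 1 - p_success}
    mean = sum(x * p for x, p in dist.items())
    var = sum(x**2 * p for x, p in dist.items()) - mean**2
    sd = math.sqrt(var)
    return mean, sd, sd / mean

for name, params in locations.items():
    mean, sd, cv = summarise(*params)
    print(f"{name}: expected ${mean:,.0f}, sd ${sd:,.0f}, CV {cv:.0%}")
```

Putting the three summaries side by side makes the trade-off explicit: a higher expected profit comes with a higher coefficient of variation.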
Complete Exercise Set 3B Probability Distributions in Section 3.8.
TERMS YOU SHOULD KNOW:
Probability                            Event
Tree diagram                           Definition of probability of an event
Basic requirements of probability      Complement
Mutually exclusive                     Addition Law
Independent events                     Multiplication Law
Conditional probability                Discrete random variable
Random variable                        Probability function
Probability distribution               Standard deviation
Expected value

USEFUL FORMULAE:
Pr(event) = 1 − Pr(complement)
Pr(A or B) = Pr(A) + Pr(B), for mutually exclusive events
Pr(A and B) = Pr(A) × Pr(B), for independent events
E(X) = µ = Σ x.p(x)
3.8 Exercises

Exercise Set 3A Calculating Probabilities
(Solutions can be found on page 266)

1. Over a long period of time, the queue length of customers at the teller section of a major bank was observed to have the following probability distribution:

Number in queue     0      1      2      3      4 or more
Probability         0.1    0.2    0.2    0.3    0.2

Find the probability of
(a) At most two people in the queue.
(b) No more than three people in the queue.
(c) At least one person in the queue.
(d) Two or more people in the queue.
(e) Given there is at least one person in the queue, what is the probability that there are 4 or more?

2. An airline has offered a special vacation package to Fiji. The length of stay is either for 3 days or 7 days, and the type of accommodation can be either economy, regular, or deluxe. Consider the experiment of observing the choices made by the next person signing up for the package.
(a) How many experimental outcomes are possible?
(b) Draw a tree diagram for this experiment.

3. An investor has two stocks: stock A and stock B. Each stock may increase in value, decrease in value, or remain unchanged. Consider the experiment as the investment in both of the two stocks.
(a) How many experimental outcomes are possible?
(b) Draw a tree diagram for the experiment.
(c) How many of the experimental outcomes result in an increase in value for at least one of the two stocks?
(d) How many of the experimental outcomes result in an increase in value for both of the stocks?
4. Suppose that a manager of a large apartment complex provides the following subjective probability estimates about the number of vacancies that will exist next month:

Vacancies     Probability
    0            0.05
    1            0.15
    2            0.35
    3            0.25
    4            0.10
    5            0.10

List the components of each of the following events and calculate the probability of the events:
(a) No vacancies.
(b) At least four vacancies.
(c) Two or fewer vacancies.

5. The manager of a furniture store sells from zero to four china cabinets each week. Based on past experience, the following probabilities are assigned to sales of zero, one, two, three, or four cabinets:

Pr(0) = 0.08
Pr(1) = 0.18
Pr(2) = 0.32
Pr(3) = 0.30
Pr(4) = 0.12
Total   1.00

(a) Are these valid probability assignments? Why or why not?
(b) Let A be the event that two or fewer are sold in one week. Find Pr(A).
(c) Let B be the event that four or more are sold in one week. Find Pr(B).

6. A sample of 100 customers of the Adelaide Gas and Electric Company resulted in the following frequency distribution of monthly charges.

Amount ($)     Number
0-49             13
50-99            22
100-149          34
150-199          26
200-249           5

(a) Let A be the event that monthly charges are $150 or more. Find Pr(A).
(b) Let B be the event that monthly charges are less than $150. Find Pr(B).

7. A survey of 50 students at a technical college regarding the number of extracurricular activities resulted in the following data:

Number of Activities     0     1     2     3     4     5
Frequency                8     20    12    6     3     1

(a) Let A be the event that a student participates in at least 1 activity. Find Pr(A).
(b) Let B be the event that a student participates in 3 or more activities. Find Pr(B).
(c) What is the probability that a student participates in exactly 2 activities?
8. A garage sells from 0 to 4 car batteries each week. Based on past experience, the following probabilities are assigned to sales of 0, 1, 2, 3 or 4 batteries. Pr (0) = 0.19 Pr (1) = 0.24 Pr (2) = 0.35 Pr (3) = 0.14 Pr (4) = 0.08
(a)
1.00 Are these valid probability assignments? Why or why not?
(b)
Let A be the event that two or fewer are sold in one week. Find Pr ( A).
(c)
Let B be the event that four or more are sold in one week. Find Pr(B).
(d)
Are A and B mutually exclusive? Find Pr(A and B) and Pr(A or B).
9. A purchasing agent has placed two rush orders for a particular raw material from two different suppliers, A and B. If neither order has arrived in 4 days, the production process must be shut down until at least one of the orders arrives. The probability that supplier A can deliver the material in 4 days is 0.55. The probability that supplier B can deliver the material in 4 days is 0.35. Draw a tree diagram to help you answer the following questions.
(a)
What is the probability that both suppliers deliver the material within 4 days? Since two separate suppliers are involved, we are willing to assume independence.
(b)
What is the probability that the production process is shut down in 4 days because of a shortage in raw material (that is, both orders are late)?
(c)
What is the probability that at least one supplier delivers the material in 4 days?
10. To install a pool fence, a company responds to all sales inquiries by first sending a brochure. It then attends the property to measure and quote, and finally installs the fence. It classifies the outcome of all sales inquiries into three mutually exclusive categories depending on how far the customer progresses in the inquiry: request for brochure; measure and quote given; and pool fence purchased. From its records for the last 6 months it has extracted the result of all inquiries and the suburb of the customer. The results are summarised in the table below.
                  Request for brochure   Measure and quote   Fence purchased   Total
Eastern suburbs   100                    50                  50                200
Western suburbs   50                     10                  0                 60
Other             50                     20                  20                90
Total             200                    80                  70                350
(a)
What is the probability that a sales contact results in a pool fence sale?
(b)
What is the probability that a sales inquiry from a person in the Eastern suburbs results in a pool fence sale?
(c)
What is the probability that a measure and quote inquiry comes from the Eastern or Western suburbs?
(d)
Are the events 'Request for brochure' and 'client is from the Eastern suburbs' independent? Give a reason for your answer.
Exercise Set 3B Discrete Probability Distributions (Solutions can be found on page 30)
1. The heights of the Hawthorn Basketball Club team playing next week (including injury replacements) are:
Height (cm) No. of Players
170 1
175 2
180 2
185 2
190 1
195 4
200 1
205 2
210 1
(a)
What is the probability of selecting a player at random who is 195cm tall?
(b)
What is the probability of selecting a player at random who is 210cm tall?
(c)
Find the probability distribution of this random variable. Determine the expected height of a Hawthorn Basketball Club team member using Microsoft Excel.
2. The demand for a product of Gippsland Industries varies greatly from month to month. Based on the past 2 years of data, the following probability distribution shows the company’s monthly demand.
Unit demand Probability
300 0.20
400 0.30
500 0.35
600 0.15
(a)
If the company places monthly orders based on the expected value of the monthly demand, what should Gippsland’s monthly order quantity be for this product?
(b)
Assume that each unit demanded generates $70 in revenue and that each unit ordered costs $50. How much will the company gain or lose in a month if it places an order based on your answer to part (a) and where the actual demand for the item is 300 units?
(c)
What is the standard deviation for the number of units demanded?
3. An investor has a certain amount of money available to invest now. Three alternative portfolio selections are available. The estimated profits of each portfolio under three economic conditions are shown in the table below.
Event              Probability   Portfolio A   Portfolio B   Portfolio C
Economy declines   0.3           $500          -$2,000       -$7,000
No change          0.5           $1,000        $2,000        -$1,000
Economy expands    0.2           $2,000        $5,000        $20,000
(a)
Which portfolio has the highest expected return?
(b)
Which portfolio has the lowest risk?
(c)
Which investment would you recommend?
4. Consider the state of the economy. There are three possible changes in the economy between this year and next. For each type of change the probability and likely return are given in the following table.
State of Economy   Probability of State p(x)   Return if state occurs (x)
Recession          0.25                        -0.05
Stable             0.5                         0.15
Boom               0.25                        0.35
Find the expected return and the standard deviation.
5. A state lottery is conducted in which 10,000 tickets are sold for $1 each. Six winning tickets are randomly selected: one grand prize winner of $5,000, one second prize winner of $2,000, one third prize winner of $1,000 and three other winners of $500 each.
(a)
Construct a probability distribution, with the random variable (X) equivalent to the net return.
(b)
Compute the expected value of playing this game.
(c)
Would you play this game? Why or why not?
6. The demand for a dried fish product which is a specialty of Tran Trung Industries varies greatly from month to month. Based on the past three years of data, the following probability distribution shows the company’s monthly demand.
Demand (kgs) Probability
200 0.15
300 0.30
400 0.40
500 0.15
(a)
If the company prepares monthly production on the basis of the expected value of the monthly demand, what quantity of dried fish should Tran Trung Industries prepare?
(b)
Assume that each kilogram of fish demanded generates $25 in revenue and that each kilogram of dried fish produced costs $17. How much will the company gain or lose if its monthly production is based on your answer from (a) and the actual demand for the product is 300 kgs?
7. The J.R. Ryland Computer Company is considering a plant expansion that will enable the company to begin production of a new computer product. The company’s managing director must determine whether to make the expansion a medium or large-scale project. An uncertainty involves the demand for the new product, which for planning purposes may be low demand, medium demand, or high demand. The probability estimates for the demands are .20, .50 and .30, respectively. Letting X and Y indicate the annual profit in $1,000s, the firm’s planners have developed profit forecasts for the medium and large-scale expansion projects.
Expansion Profits
Demand   Medium-scale x   Pr(x)   Large-scale y   Pr(y)
Low      50               0.20    0               0.20
Medium   150              0.50    100             0.50
High     200              0.30    300             0.30
(a)
Compute the expected value for the profit associated with the two expansion alternatives. Which decision is preferred for the objective of maximising the expected profit?
(b)
Compute the standard deviation for the profit associated with the two expansion alternatives. Which decision is preferred for the objective of minimising the risk or uncertainty?
ANSWERS TO EXERCISES
Exercises 3A Answers
1. (i)
Pr(X ≤ 2) = Pr(0) + Pr(1) + Pr(2) = 0.1 + 0.2 + 0.2 = 0.5
(ii)
Pr(X ≤ 3) = Pr(X ≤ 2) + Pr(3) = 0.5 + 0.3 = 0.8 OR Pr(X ≤ 3) = 1 – Pr(X ≥ 4) = 1 – 0.2 = 0.8
(iii)
Pr(X ≥ 1) = 1 – Pr(0) = 1 – 0.1 = 0.9 OR Pr(X ≥ 1) = Pr(1) + Pr(2) + Pr(3) + Pr(X ≥ 4) = 0.2 + 0.2 + 0.3 + 0.2 = 0.9
(iv)
Pr(X ≥ 2) = Pr(2) + Pr(3) + Pr(X ≥ 4) = 0.2 + 0.3 + 0.2 = 0.7
(v)
Pr(X ≥ 4 | X ≥ 1) = Pr(X ≥ 4) / Pr(X ≥ 1) = 0.2 / 0.9 = 0.22
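The conditional probability in part (v) can be checked numerically. The following is a minimal Python sketch, assuming the distribution implied by the answers above (Pr(0) = 0.1, Pr(1) = 0.2, Pr(2) = 0.2, Pr(3) = 0.3, Pr(X ≥ 4) = 0.2); the names `pr` and `prob` are illustrative only, and the key 4 is used here to stand for "4 or more".

```python
# Distribution implied by the answers to Question 1; the key 4 stands for "4 or more".
pr = {0: 0.1, 1: 0.2, 2: 0.2, 3: 0.3, 4: 0.2}


def prob(event):
    """Sum the probabilities of every outcome x for which event(x) is true."""
    return sum(p for x, p in pr.items() if event(x))


# Pr(X >= 4 | X >= 1) = Pr(X >= 4 and X >= 1) / Pr(X >= 1)
joint = prob(lambda x: x >= 4 and x >= 1)
given = prob(lambda x: x >= 1)
conditional = joint / given

print(round(conditional, 2))  # 0.22
```

Because every outcome with X ≥ 4 also satisfies X ≥ 1, the joint probability is simply Pr(X ≥ 4), which is why the answer reduces to 0.2 / 0.9.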
2. (a)
Number of outcomes = No. of nights stay × No. of accommodation types = 2 × 3 = 6 possible outcomes
(b)
3. (a)
Number of outcomes = No. of A changes × No. of B changes = 3 × 3 = 9 possible outcomes
(b)
(c)
5 = (↑,↑) (↑,U) (↑,↓) (U,↑) (↓,↑)
(d)
1 = (↑,↑)
4. (a)
Pr(X = 0) = 0.05
(b)
Pr(X ≥ 4) = Pr(4) + Pr(5) = 0.1 + 0.1 = 0.2
(c)
Pr(X ≤ 2) = Pr(0) + Pr(1) + Pr(2) = 0.05 + 0.15 + 0.35 = 0.55
5. (a)
Yes. All the probabilities are between 0 and 1 (0 ≤ Pr(x) ≤ 1) and the sum of the probabilities is 1 (Σ Pr(x) = 1).
(b)
Pr(A) = Pr(0) + Pr(1) + Pr(2) = 0.08 + 0.18 + 0.32 = 0.58
(c)
Pr(B) = Pr(4) = 0.12
6. (a)
Pr(A) = Pr(X ≥ 150) = (26 + 5)/100 = 31/100 = 0.31
(b)
Pr(B) = Pr(X < 150) = 1 – 0.31 = 0.69
7. (a)
Pr(A) = Pr(X ≥ 1) = (20 + 12 + 6 + 3 + 1)/50 = 42/50 = 0.84
(b)
Pr(B) = Pr(X ≥ 3) = (6 + 3 + 1)/50 = 10/50 = 0.2
(c)
Pr(X = 2) = 12/50 = 0.24
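The relative-frequency calculations for Question 7 can be reproduced in a few lines. This is a minimal Python sketch using the frequency table from the question; the names `freq` and `prob` are illustrative, not part of the original exercise.

```python
# Frequency table from Question 7: number of activities -> number of students.
freq = {0: 8, 1: 20, 2: 12, 3: 6, 4: 3, 5: 1}
n = sum(freq.values())  # 50 students in the survey


def prob(event):
    """Relative frequency of the outcomes x for which event(x) is true."""
    return sum(f for x, f in freq.items() if event(x)) / n


p_a = prob(lambda x: x >= 1)   # at least 1 activity
p_b = prob(lambda x: x >= 3)   # 3 or more activities
p_c = prob(lambda x: x == 2)   # exactly 2 activities

print(round(p_a, 2), round(p_b, 2), round(p_c, 2))  # 0.84 0.2 0.24
```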
8. (a)
Yes. All the probabilities are between 0 and 1 (0 ≤ Pr(x) ≤ 1) and the sum of the probabilities is 1 (Σ Pr(x) = 1).
(b)
Pr(A) = Pr(2) + Pr(1) + Pr(0) = 0.35 + 0.24 + 0.19 = 0.78
(c)
Pr(B) = Pr(4) = 0.08
(d)
Yes, A and B are mutually exclusive: there is no overlap in the number of batteries sold.
Pr(A and B) = 0
Pr(A or B) = Pr(A) + Pr(B) = 0.78 + 0.08 = 0.86
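The checks in Question 8 (valid assignment, mutually exclusive events, addition rule) can be sketched in Python as follows. The probabilities come from the question; representing the events A and B as sets of outcomes is an assumption made for this illustration.

```python
import math

# Probabilities assigned to weekly battery sales in Question 8.
pr = {0: 0.19, 1: 0.24, 2: 0.35, 3: 0.14, 4: 0.08}

# (a) Valid if every probability lies in [0, 1] and they sum to 1.
valid = all(0 <= p <= 1 for p in pr.values()) and math.isclose(sum(pr.values()), 1.0)

A = {x for x in pr if x <= 2}  # two or fewer sold
B = {x for x in pr if x >= 4}  # four or more sold


def prob(event):
    """Probability of an event given as a set of outcomes."""
    return sum(pr[x] for x in event)


# (d) Mutually exclusive events share no outcomes, so Pr(A and B) = 0
# and Pr(A or B) = Pr(A) + Pr(B).
mutually_exclusive = A.isdisjoint(B)

print(valid, mutually_exclusive)             # True True
print(round(prob(A), 2), round(prob(B), 2))  # 0.78 0.08
print(round(prob(A | B), 2))                 # 0.86
```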
9.
(a)
Pr(Both on time) = Pr(OT, OT) = 0.1925
(b)
Pr(Both late) = Pr(L, L) = 0.2925
(c)
Pr(At least one on time) = Pr(OT, OT) + Pr(OT, L) + Pr(L, OT) = 0.1925 + 0.3575 + 0.1575 = 0.7075
OR Pr(At least one on time) = 1 – Pr(Both late) = 1 – 0.2925 = 0.7075
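The tree diagram for Question 9 has four branches, and under the independence assumption stated in the question each branch probability is the product of the two supplier probabilities. A minimal Python sketch of that enumeration follows; the labels "OT" (on time) and "L" (late) follow the answer above, and the other names are illustrative.

```python
from itertools import product

# Delivery probabilities from Question 9; independence is the question's own assumption.
p_on_time = {"A": 0.55, "B": 0.35}

# Each branch of the tree is a pair (status of supplier A, status of supplier B).
branches = {}
for a, b in product(["OT", "L"], repeat=2):
    pa = p_on_time["A"] if a == "OT" else 1 - p_on_time["A"]
    pb = p_on_time["B"] if b == "OT" else 1 - p_on_time["B"]
    branches[(a, b)] = pa * pb  # multiply along the branch

both_on_time = branches[("OT", "OT")]
both_late = branches[("L", "L")]
at_least_one = 1 - both_late  # complement of "both late"

print(round(both_on_time, 4), round(both_late, 4), round(at_least_one, 4))
# 0.1925 0.2925 0.7075
```

The four branch probabilities sum to 1, which is a quick check that the tree has been filled in consistently.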
10.
(a)
Pr(contact leads to sale) = 70/350 = 0.2
(b)
Pr(Eastern suburbs inquiry results in a sale) = 50/200 = 0.25
(c)
Pr(Measure and quote from Eastern or Western suburbs) = (50 + 10)/80 = 60/80 = 0.75
(d)
If two events are independent, then Pr(A and B) = Pr(A) × Pr(B).
Let Pr(A) = Pr(Request for a brochure) = 200/350 = 0.571
Let Pr(B) = Pr(Client from Eastern suburbs) = 200/350 = 0.571
Thus, Pr(A) × Pr(B) = 0.571 × 0.571 = 0.326
Now, Pr(A and B) = 100/350 = 0.286
Since Pr(A and B) ≠ Pr(A) × Pr(B), i.e. 0.286 ≠ 0.326, the events are not independent.
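The independence check in part (d) can be run directly from the crosstabulation. The sketch below is a minimal Python illustration; the dictionary layout and the short column keys ("brochure", "quote", "purchased") are assumptions made for the example, while the counts are those in the Question 10 table.

```python
# Counts from the Question 10 table: rows are suburbs, columns are outcomes.
counts = {
    "Eastern": {"brochure": 100, "quote": 50, "purchased": 50},
    "Western": {"brochure": 50, "quote": 10, "purchased": 0},
    "Other": {"brochure": 50, "quote": 20, "purchased": 20},
}
total = sum(sum(row.values()) for row in counts.values())  # 350 inquiries

p_brochure = sum(row["brochure"] for row in counts.values()) / total  # 200/350
p_eastern = sum(counts["Eastern"].values()) / total                   # 200/350
p_both = counts["Eastern"]["brochure"] / total                        # 100/350

# Independence would require Pr(A and B) to equal Pr(A) * Pr(B).
independent = abs(p_both - p_brochure * p_eastern) < 1e-9

print(round(p_both, 3), round(p_brochure * p_eastern, 3), independent)
# 0.286 0.327 False
```

The unrounded product is about 0.327; the worked answer shows 0.326 because it multiplies the already rounded values 0.571 × 0.571. The conclusion is the same either way.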
EXERCISE 3B Answers
1
a
Pr(player is 195 cm tall) = 4/16 = 0.25
b
Pr(player is 210 cm tall) = 1/16 = 0.0625
c
Probability distribution.
Height (cm)   Probability
170           0.0625
175           0.125
180           0.125
185           0.125
190           0.0625
195           0.25
200           0.0625
205           0.125
210           0.0625
Enter the heights and the corresponding probabilities in two columns of the spreadsheet. To calculate the expected value, first multiply each height in column 1 by its probability in column 2: go to cell C2, type =A2*B2 and press ENTER. Copy the formula down the column and sum the products.
Thus the expected height of a Hawthorn basketball club member is 190 centimetres.
2
a
Expected value = 445
b
Revenue = 300 @ $70 = $21,000
Cost = 445 @ $50 = $22,250
Profit/Loss = ($1,250), that is, a loss of $1,250
c
Standard deviation = 97.34
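The three answers to Question 2 follow from the definitions of expected value, E(X) = Σ x·Pr(x), and standard deviation, the square root of Σ (x − E(X))²·Pr(x). A minimal Python sketch, using the demand distribution from the question (variable names are illustrative):

```python
import math

# Monthly demand distribution from Question 2 (Gippsland Industries).
demand = {300: 0.20, 400: 0.30, 500: 0.35, 600: 0.15}

# Expected value: sum of x * Pr(x).
expected = sum(x * p for x, p in demand.items())

# Variance: sum of (x - mean)^2 * Pr(x); the standard deviation is its square root.
variance = sum((x - expected) ** 2 * p for x, p in demand.items())
std_dev = math.sqrt(variance)

# Part (b): order the expected demand at $50 per unit, but sell only 300 units at $70.
profit = 300 * 70 - expected * 50

print(round(expected), round(std_dev, 2), round(profit))  # 445 97.34 -1250
```

The same pattern applies to the Excel approach described for Question 1: the column of products x·Pr(x) is summed to give the expected value.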
3
a
Portfolios B and C have the highest expected return, $1,400 each (Portfolio A's expected return is $1,050).
b
Portfolio A has the lowest risk (as measured by the standard deviation), at $522.
c
It depends on what one is looking for in an investment. Portfolio B has a good return and medium risk, while Portfolio A has minimum risk but a lower return. Portfolio C offers a good return with very high risk.
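The expected return and standard deviation of each portfolio can be computed from the table in Question 3. The following Python sketch is one way to do it; the helper name `mean_and_sd` and the data layout are assumptions made for illustration.

```python
import math

# Probabilities and profits from Question 3, in the order:
# economy declines, no change, economy expands.
probs = [0.3, 0.5, 0.2]
profits = {
    "A": [500, 1000, 2000],
    "B": [-2000, 2000, 5000],
    "C": [-7000, -1000, 20000],
}


def mean_and_sd(values, probs):
    """Expected value and standard deviation of a discrete random variable."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, math.sqrt(var)


summary = {name: mean_and_sd(vals, probs) for name, vals in profits.items()}
for name, (mean, sd) in summary.items():
    print(name, round(mean), round(sd))
# A 1050 522
# B 1400 2498
# C 1400 9656
```

Portfolios B and C share the same expected return, so the standard deviation is what separates them: C carries roughly four times the risk of B.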
4
Thus the expected return is 15% and the standard deviation is 14% (note that the coefficient of variation = 0.1414/0.15 = 94.27%, so there is a great deal of uncertainty associated with the returns).
5
a
Net return $ (x)   Pr(x)
4999               0.0001
1999               0.0001
999                0.0001
499                0.0003
-1                 0.9994
b
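The expected value follows from the distribution in part (a) by summing x·Pr(x). A minimal Python sketch, with the dictionary name `lottery` chosen for illustration:

```python
# Net-return distribution from part (a): prize minus the $1 ticket price.
lottery = {4999: 0.0001, 1999: 0.0001, 999: 0.0001, 499: 0.0003, -1: 0.9994}

# Expected value: sum of x * Pr(x).
expected = sum(x * p for x, p in lottery.items())

print(round(expected, 2))  # -0.05
```

An equivalent check: $9,500 in prizes is paid out against $10,000 in ticket sales, so on average each $1 ticket loses 5 cents.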