Statistics Review

STATISTICS

What is DATA?

Data is a collection of facts, such as values or measurements. It can be numbers, words, measurements, observations or even just descriptions of things.

Qualitative vs Quantitative

Data can be qualitative or quantitative:
- Qualitative data is descriptive information (it describes something).
- Quantitative data is numerical information (numbers).

Quantitative data can also be Discrete or Continuous:
- Discrete data can only take certain values (like whole numbers).
- Continuous data can take any value (within a range).

Put simply: Discrete data is counted, Continuous data is measured.

Example: What do we know about Arrow the Dog?

Qualitative:
- He is brown and black
- He has long hair
- He has lots of energy

Quantitative:
- Discrete: He has 4 legs. He has 2 brothers.
- Continuous: He weighs 25.5 kg. He is 565 mm tall.

To help you remember, think "Quantitative is about Quantity".

Collecting

Data can be collected in many ways. The simplest way is direct observation.

Example: you want to find how many cars pass a certain point on a road in a 10-minute interval. Simply stand at that point on the road and count the cars that pass by in that interval.

You can also collect data by doing a Survey.

Census or Sample

A Census is when you collect data for every member of the group (the whole "population"). A Sample is when you collect data just for selected members of the group.

Example: there are 120 people in your local football club. You can ask everyone (all 120) what their age is. That is a census. Or you could just ask the people who are there this afternoon. That is a sample.

A census is accurate, but hard to do. A sample is not as accurate, but may be good enough, and is a lot easier.

Language

Data or Datum? The singular form is "datum", so we would say "that datum is very high". "Data" is the plural so we can say "the data are available", but it is also a collection of facts, so "the data is available" is fine too.

DISCRETE AND CONTINUOUS DATA

Data can be Descriptive (like "high" or "fast") or Numerical (numbers). And Numerical Data can be Discrete or Continuous: Discrete data is counted, Continuous data is measured.

Discrete Data

Discrete Data can only take certain values. Example: the number of students in a class (you can't have half a student).

Continuous Data

Continuous Data can take any value (within a range). Examples:
- A person's height: could be any value (within the range of human heights), not just certain fixed heights
- Time in a race: you could even measure it to fractions of a second
- A dog's weight
- The length of a leaf
- Lots more!

   

BAR GRAPHS

A Bar Graph (also called Bar Chart) is a graphical display of data using bars of different heights.

Imagine you just did a survey of your friends to find which kind of movie they liked best. Here are the results:

Table: Favorite Type of Movie

Comedy   Action   Romance   Drama   SciFi
4        5        6         1       4

You could show that on a bar graph like this:

It is a really good way to show relative sizes: it is easy to see which types of movie are most liked, and which are least liked, at a glance.

You can use bar graphs to show the relative sizes of many things, such as what type of car people have, how many customers a shop has on different days and so on.

Example: Most Popular Fruit

A survey of 145 people revealed their favorite fruit:

Fruit:    Apple   Orange   Banana   Kiwifruit   Blueberry   Grapes
People:   35      30       10       25          40          5

And here is the bar graph:

For that group of people Blueberries are the most popular and Grapes are the least popular.

Example: Student Grades

In a recent test, this many students got the following grades:

Grade:      A   B    C    D
Students:   4   12   10   2

And here is the bar graph:

You can create graphs like that using our Data Graphs (Bar, Line and Pie) page.

Histograms vs Bar Graphs

Bar Graphs are good when your data is in categories (such as "Comedy", "Drama", etc). But when you have continuous data (such as a person's height) then use a Histogram.

PIE CHART

Pie Chart - A special chart that uses "pie slices" to show relative sizes of data.

Imagine you just did a survey of your friends to find which kind of movie they liked best. Here are the results:

Table: Favorite Type of Movie

Comedy   Action   Romance   Drama   SciFi
4        5        6         1       4

You could show that by this pie chart:

It is a really good way to show relative sizes: it is easy to see which movie types are most liked, and which are least liked, at a glance.

You can create graphs like that using our Data Graphs (Bar, Line and Pie) page. Or you can make them yourself ...

How to Make Them Yourself

First, put your data into a table (like above), then add up all the values to get a total:

Comedy   Action   Romance   Drama   SciFi   TOTAL
4        5        6         1       4       20

Next, divide each value by the total and multiply by 100 to get a percent:

Type      Value   Percent
Comedy    4       4/20 = 20%
Action    5       5/20 = 25%
Romance   6       6/20 = 30%
Drama     1       1/20 = 5%
SciFi     4       4/20 = 20%
TOTAL     20      100%

Now you need to figure out how many degrees for each "pie slice" (correctly called a sector). A Full Circle has 360 degrees, so we do this calculation:

Type      Value   Percent      Degrees
Comedy    4       4/20 = 20%   4/20 × 360° = 72°
Action    5       5/20 = 25%   5/20 × 360° = 90°
Romance   6       6/20 = 30%   6/20 × 360° = 108°
Drama     1       1/20 = 5%    1/20 × 360° = 18°
SciFi     4       4/20 = 20%   4/20 × 360° = 72°
TOTAL     20      100%         360°

More Examples

You can use pie charts to show the relative sizes of many things, such as:
- what type of car people have
- how many customers a shop has on different days
- how popular different breeds of dogs are
- and so on.

Example: Student Grades

Here is how many students got each grade in the recent test:

A   B    C    D
4   12   10   2

And here is the pie chart:
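The percent and degree steps from "How to Make Them Yourself" can be sketched in a few lines of Python. This is only an illustration using the movie survey counts; the function name `pie_sectors` is a hypothetical helper, not something from the original page.

```python
def pie_sectors(counts):
    """Return {label: (percent, degrees)} for a dict of category counts."""
    total = sum(counts.values())
    return {
        # Multiply before dividing so the small whole-number cases stay exact.
        label: (value * 100 / total, value * 360 / total)
        for label, value in counts.items()
    }


# Counts from the "Favorite Type of Movie" table.
movies = {"Comedy": 4, "Action": 5, "Romance": 6, "Drama": 1, "SciFi": 4}
sectors = pie_sectors(movies)

for label, (percent, degrees) in sectors.items():
    print(f"{label}: {percent:.0f}% of the pie, {degrees:.0f} degrees")
```

The degrees of all sectors should add up to 360, which is a handy check before drawing the chart.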

LINE GRAPHS

Line Graph - A graph that shows information that is connected in some way (such as change over time).

You are learning math facts, and each day you do a short test to see how good you are. These are the results:

Table: Facts I got Correct

Day 1   Day 2   Day 3   Day 4
3       4       12      15

And here is the same data as a Line Graph:

You seem to be improving!

You can create graphs like that using our Data Graphs (Bar, Line and Pie) page.

SCATTER PLOTS

A Scatter Plot is a graph of plotted points that show the relationship between two sets of data. In this example, each dot represents one person's weight versus their height. (The data is plotted on the graph as "Cartesian (x,y) Coordinates")

Example: The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. Here are their figures for the last 12 days:

Ice Cream Sales vs Temperature

Temperature °C   Ice Cream Sales
14.2°            $215
16.4°            $325
11.9°            $185
15.2°            $332
18.5°            $406
22.1°            $522
19.4°            $412
25.1°            $614
23.4°            $544
18.1°            $421
22.6°            $445
17.2°            $408

And here is the same data as a Scatter Plot:

It is now easy to see that warmer weather leads to more sales, but the relationship is not perfect. Line of Best Fit You can also draw a "Line of Best Fit" (also called a "Trend Line") on your scatter plot:

Try to have the line as close as possible to all points, and as many points above the line as below. Example: Sea Level Rise

A Scatter Plot of Sea Level Rise:

And here I have drawn on a "Line of Best Fit".

Correlation

When the two sets of data are strongly linked together we say they have a High Correlation. The word Correlation is made of Co- (meaning "together"), and Relation.
- Correlation is Positive when the values increase together, and
- Correlation is Negative when one value decreases as the other increases

Like this:

(Learn More About Correlation)

Negative Correlation

Correlations can be negative, which means there is a correlation but one value goes down as the other value increases.

Example: Birth Rate vs Income

The birth rate tends to be lower in richer countries. Below is a scatter plot for about 100 different countries.

Country      Yearly Production per Person   Birth Rate
Madagascar   $800                           5.70
India        $3,100                         2.85
Mexico       $9,600                         2.49
Taiwan       $25,300                        1.57
Norway       $40,000                        1.78

It has a negative correlation (the line slopes down).

Note: I tried to fit a straight line to the data, but maybe a curve would work better. What do you think?

PICTOGRAPHS A Pictograph is a way of showing data using images. Each image stands for a certain number of things. Example: Apples Sold Here is a pictograph of how many apples were sold at the local shop over 4 months:

   

Note that each picture of an apple means 10 apples (and the half-apple picture means 5 apples). So the pictograph is showing:
- In January 10 apples were sold
- In February 40 apples were sold
- In March 25 apples were sold
- In April 20 apples were sold

It is a fun and interesting way to show data. But it is not very accurate: in the example above we can't show just 1 apple sold, or 2 apples sold etc.

HISTOGRAM

A Histogram is a graphical display of data using bars of different heights. It is similar to a Bar Chart, but a histogram groups numbers into ranges. And you decide what ranges to use!

  

Example: Dress Shop Survey

You asked customers who bought one of the "Aurora" range of skirts how old they were. The ages were from 5 to 25 years old. You decide to put the results into groups of 5:
- The 1 to 5 years old range,
- The 6 to 10 years old range,
- etc...

So when someone says "I am 17" you add 1 to the "16-20" range. And here is the result:

You can see (for example) that there were 30 customers between 6 and 10 years old.

Histograms are a great way to show results of continuous data, such as:
- weight
- height
- how much time
- etc.

But if your data is in categories (such as Country or Favorite Movie), then you should use a Bar Chart. Frequency Histogram A Frequency Histogram is a special histogram that uses vertical columns to show frequencies (how many times each score occurs):

Here I have added up how often 1 occurs (2 times), how often 2 occurs (5 times), etc, and shown them as a histogram.
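The grouping step from the Dress Shop Survey (ages put into ranges 1-5, 6-10, 11-15, ...) can be sketched as below. The list of ages is made up for illustration, since the original survey data is not given, and `age_range` is a hypothetical helper name.

```python
from collections import Counter


def age_range(age, width=5):
    """Return the (low, high) range that contains `age`, for ranges 1-5, 6-10, ..."""
    low = ((age - 1) // width) * width + 1
    return (low, low + width - 1)


# Hypothetical answers from customers (not the survey data behind the original chart).
ages = [17, 6, 9, 23, 5, 17, 10, 25]

# Count how many ages fall in each range; these counts are the bar heights.
histogram = Counter(age_range(age) for age in ages)

for (low, high), count in sorted(histogram.items()):
    print(f"{low}-{high}: {count}")
```

Choosing a different `width` gives different bars, which is what "you decide what ranges to use" means in practice.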

FREQUENCY DISTRIBUTION

  

Frequency

Frequency is how often something occurs.

Example: Sam played football on Saturday Morning, Saturday Afternoon and Thursday Afternoon. The frequency was 2 on Saturday, 1 on Thursday and 3 for the whole week.

Frequency Distribution

By counting frequencies we can make a Frequency Distribution table.

Example: Goals

Sam's team has scored the following numbers of goals in recent games: 2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3

 

Sam put the numbers in order, then added up how often 1 occurs (2 times), how often 2 occurs (5 times), etc, and wrote them down as a Frequency Distribution table:

Goals   Frequency
1       2
2       5
3       4
4       2
5       1

From the table we can see interesting things such as:
- getting 2 goals happens most frequently
- only once did they get 5 goals

This is the definition:

Frequency Distribution: values and their frequency (how often each value occurs).

Here is another example:

Example: Newspapers

These are the numbers of newspapers sold at a local shop over the last 10 days: 22, 20, 18, 23, 20, 25, 22, 20, 18, 20

Let us count how many of each number there is:

Papers Sold   Frequency
18            2
19            0
20            4
21            0
22            2
23            1
24            0
25            1

It is also possible to group the values. Here they are grouped in 5s:

Papers Sold   Frequency
15-19         2
20-24         7
25-29         1

(Learn more about Grouped Frequency Distributions)

Graphs

After creating a Frequency Distribution table you might like to make a Bar Graph or a Pie Chart using the Data Graphs (Bar, Line and Pie) page.
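Counting frequencies by hand is easy to get wrong, so here is a small Python sketch of the same counting using `collections.Counter` from the standard library. It uses the goals and newspaper figures from the examples above; the variable names are just for illustration.

```python
from collections import Counter

# Goals scored by Sam's team in recent games.
goals = [2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3]
goal_frequency = Counter(goals)

for value in sorted(goal_frequency):
    print(value, goal_frequency[value])

# Newspapers sold over the last 10 days, grouped in 5s.
papers = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]

# Each key is the lower bound of a group: 15 means "15-19", 20 means "20-24", ...
grouped = Counter((sold // 5) * 5 for sold in papers)

for low in sorted(grouped):
    print(f"{low}-{low + 4}: {grouped[low]}")
```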

STEM AND LEAF PLOTS

A Stem and Leaf Plot is a special table where each data value is split into a "leaf" (usually the last digit) and a "stem" (the other digits).

Example: "32" is split into "3" (stem) and "2" (leaf).

The "stem" values are listed down, and the "leaf" values go right (or left) from the stem values. The "stem" is used to group the scores and each "leaf" indicates the individual scores within each group.
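As a rough sketch of the stem/leaf split described above, the snippet below groups whole numbers by stem. The scores are invented for illustration, the helper name `stem_and_leaf` is hypothetical, and the code assumes non-negative whole numbers with the last digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative whole numbers by stem (all digits but the last)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # 32 -> stem 3, leaf 2
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical scores, not taken from the original page.
scores = [32, 15, 18, 21, 27, 32, 38]
plot = stem_and_leaf(scores)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```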

CUMULATIVE TABLES AND GRAPHS

Cumulative

Cumulative means "how much so far". Think of the word "accumulate" which means to gather together.

To have cumulative totals, just add up the values as you go.

Example: Jamie has earned this much in the last 6 months:

Month    Earned
March    $120
April    $50
May      $110
June     $100
July     $50
August   $20

To work out the cumulative totals, just add up as you go. The first line is easy, the total earned so far is the same as Jamie earned that month:

Month    Earned   Cumulative
March    $120     $120

But for April, the total earned so far is $120 + $50 = $170:

Month    Earned   Cumulative
March    $120     $120
April    $50      $170

And for May we continue to add up: $170 + $110 = $280

Month    Earned   Cumulative
March    $120     $120
April    $50      $170
May      $110     $280

Do you see how we add the previous month's cumulative total to this month's earnings? Here is the calculation for the rest:
- June is $280 + $100 = $380
- July is $380 + $50 = $430
- August is $430 + $20 = $450

And this is the result:

Month    Earned   Cumulative
March    $120     $120
April    $50      $170
May      $110     $280
June     $100     $380
July     $50      $430
August   $20      $450

The last cumulative total should match the total of all earnings:
- $450 is the last cumulative total ...
- ... it is also the total of all earnings: $120+$50+$110+$100+$50+$20 = $450

So we got it right.

So that's how to do it: add up as you go down the list and you will have cumulative totals. You could also call it a "Running Total".

Graphs

You can make cumulative graphs if you want. Just plot each Month's cumulative total:

Cumulative Bar Graph Cumulative Line Graph
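The "add up as you go" rule is exactly what `itertools.accumulate` in Python's standard library does, so Jamie's cumulative totals can be sketched as follows (the variable names are only illustrative):

```python
from itertools import accumulate

months = ["March", "April", "May", "June", "July", "August"]
earned = [120, 50, 110, 100, 50, 20]

# Running total: each entry is the previous total plus this month's earnings.
cumulative = list(accumulate(earned))

for month, amount, total in zip(months, earned, cumulative):
    print(f"{month}: earned ${amount}, cumulative ${total}")

# The last cumulative total should match the total of all earnings.
print(cumulative[-1] == sum(earned))
```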

RELATIVE FREQUENCY

 

   

   

Relative Frequency is how often something happens divided by all outcomes.

Example: if your team has won 9 games from a total of 12 games played:
- the Frequency of winning is 9
- the Relative Frequency of winning is 9/12 = 75%

All the Relative Frequencies add up to 1 (except for any rounding error).

Example: Travel Survey

92 people were asked how they got to work:
- 35 used a car
- 42 took public transport
- 8 rode a bicycle
- 7 walked

The Relative Frequencies (to 2 decimal places) are:
- Car: 35/92 = 0.38
- Public Transport: 42/92 = 0.46
- Bicycle: 8/92 = 0.09
- Walking: 7/92 = 0.08

0.38+0.46+0.09+0.08 = 1.01

(It would be exactly 1 if we had used perfect accuracy.)
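A short Python sketch of the Travel Survey calculation shows where the 1.01 comes from: the unrounded relative frequencies add up to 1, and only the rounded ones overshoot. The dictionary and variable names are illustrative.

```python
# Travel Survey: how 92 people got to work.
counts = {"Car": 35, "Public Transport": 42, "Bicycle": 8, "Walking": 7}
total = sum(counts.values())

# Relative frequency = frequency divided by the total number of outcomes.
relative = {mode: count / total for mode, count in counts.items()}
rounded = {mode: round(value, 2) for mode, value in relative.items()}

for mode, value in rounded.items():
    print(f"{mode}: {value:.2f}")

print("sum of rounded values:", round(sum(rounded.values()), 2))
print("sum of unrounded values is 1 up to floating-point error:",
      abs(sum(relative.values()) - 1) < 1e-9)
```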

CENTRAL VALUE

MEAN

The mean is just the average of the numbers. It is easy to calculate: add up all the numbers, then divide by how many numbers there are. In other words it is the sum divided by the count.

Example 1: What is the Mean of these numbers?

6, 11, 7

Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers): 24 / 3 = 8
The Mean is 8

Why Does This Work?

It is because 6, 11 and 7 added together is the same as 3 lots of 8:

It is like you are "flattening out" the numbers.

Example 2: Look at these numbers:

3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29

The sum of these numbers is 330
There are fifteen numbers.
The mean is equal to 330 / 15 = 22
The mean of the above numbers is 22

Negative Numbers

How do you handle negative numbers? Adding a negative number is the same as subtracting the number (without the negative). For example 3 + (-2) = 3-2 = 1. Knowing this, let us try an example:

Example 3: Find the mean of these numbers:

3, -7, 5, 13, -2

The sum of these numbers is 3 - 7 + 5 + 13 - 2 = 12
There are 5 numbers.
The mean is equal to 12 ÷ 5 = 2.4
The mean of the above numbers is 2.4
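"The sum divided by the count" translates directly into Python. The sketch below checks the three mean examples above; the function name `mean` is our own (the standard library also offers `statistics.mean`).

```python
def mean(numbers):
    """The sum divided by the count."""
    return sum(numbers) / len(numbers)


example_1 = [6, 11, 7]
example_2 = [3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
example_3 = [3, -7, 5, 13, -2]  # negative numbers need no special handling

print(mean(example_1))  # 24 / 3
print(mean(example_2))  # 330 / 15
print(mean(example_3))  # 12 / 5
```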

MEDIAN VALUE

The Median is the "middle number" (in a sorted list of numbers).

How to Find the Median Value

To find the Median, place the numbers you are given in value order and find the middle number.

Example: find the Median of {12, 3 and 5}

Put them in order: 3, 5, 12
The middle number is 5, so the median is 5.

Example 2

Look at these numbers:

3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29

If we put those numbers in order we have:

3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56

There are fifteen numbers. Our middle number will be the eighth number, which is 23.

The median value of this set of numbers is 23.

(Note that it didn't matter if we had some numbers the same in the list)

Two Numbers in the Middle

BUT, if there is an even number of values, things are slightly different. In that case we need to find the middle pair of numbers, and then find the value that would be half way between them. This is easily done by adding them together and dividing by two.

An example will help:

3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29

If we put those numbers in order we have:

3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56

There are now fourteen numbers and so we don't have just one middle number, we have a pair of middle numbers (the seventh and eighth). In this example the middle numbers are 21 and 23.

To find the value half-way between them, add them together and divide by 2:

21 + 23 = 44
44 ÷ 2 = 22

And, so, the Median in this example is 22.

(Note that 22 was not in the list of numbers ... but that is OK, because half the numbers in the list are less, and half the numbers are greater.)
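Both cases (one middle number, or a middle pair) can be sketched in one small Python function. The name `median` is our own; Python's `statistics.median` follows the same rule.

```python
def median(numbers):
    """Middle value of the sorted list, or the mean of the middle pair."""
    ordered = sorted(numbers)
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        # Odd count: exactly one middle number.
        return ordered[middle]
    # Even count: half way between the two middle numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_list = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
even_list = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]

print(median([12, 3, 5]))
print(median(odd_list))
print(median(even_list))
```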

MODE

The mode is simply the number which appears most often.

Finding the Mode

To find the mode, or modal value, first put the numbers in order, then count how many of each number.

Example: 3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29

In order these numbers are:

3, 5, 7, 12, 13, 14, 20, 23, 23, 23, 23, 29, 39, 40, 56

This makes it easy to see which numbers appear most often. In this case the mode is 23.

Another Example: {19, 8, 29, 35, 19, 28, 15}

Arrange them in order: {8, 15, 19, 19, 28, 29, 35}

19 appears twice, all the rest appear only once, so 19 is the mode.

How to remember? Think "mode is most"

More Than One Mode

You can have more than one mode.

Example: {1, 3, 3, 3, 4, 4, 6, 6, 6, 9}

3 appears three times, as does 6. So there are two modes: at 3 and 6.

Having two modes is called "bimodal". Having more than two modes is called "multimodal".

Grouping

When all values appear the same number of times the idea of a mode is not useful. But you could group them to see if one group has more than the others.

Example: {4, 7, 11, 16, 20, 22, 25, 26, 33}

Each value occurs once, so let us try to group them. We can try groups of 10:
- 0-9: 2 values (4 and 7)
- 10-19: 2 values (11 and 16)
- 20-29: 4 values (20, 22, 25 and 26)
- 30-39: 1 value (33)

In groups of 10, the "20s" appear most often, so we could choose 25 (the middle of that group) as the mode.

You could use different groupings and get a different answer!
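Here is a minimal Python sketch of finding the mode (or modes) by counting, plus the "groups of 10" idea from the Grouping example. The function name `modes` is hypothetical; it returns a list so that bimodal data is handled too.

```python
from collections import Counter


def modes(numbers):
    """Return every value that appears most often, in ascending order."""
    counts = Counter(numbers)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([19, 8, 29, 35, 19, 28, 15]))       # one mode
print(modes([1, 3, 3, 3, 4, 4, 6, 6, 6, 9]))    # bimodal

# Grouping: when every value occurs once, count values per group of 10 instead.
values = [4, 7, 11, 16, 20, 22, 25, 26, 33]
groups = Counter((value // 10) * 10 for value in values)  # key 20 means "20-29"
busiest_group = max(groups, key=groups.get)
print(f"{busiest_group}-{busiest_group + 9} has {groups[busiest_group]} values")
```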

MEAN FROM A FREQUENCY TABLE

It is easy to calculate the Mean: add up all the numbers, then divide by how many numbers there are.

Example 1: What is the Mean of these numbers?

6, 11, 7

Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers): 24 ÷ 3 = 8
The Mean is 8

But sometimes you won't have a simple list of numbers, you might have a frequency table like this (the "frequency" says how often they occur):

Score   Frequency
1       2
2       5
3       4
4       2
5       1

(it says that score 1 occurred 2 times, score 2 occurred 5 times, etc)

You could list all the numbers like this:

\[ \text{Mean} = \frac{1+1 + 2+2+2+2+2 + 3+3+3+3 + 4+4 + 5}{\text{how many numbers}} \]

But rather than do lots of adds (like 3+3+3+3) it is often easier to use multiplication:

\[ \text{Mean} = \frac{2 \times 1 + 5 \times 2 + 4 \times 3 + 2 \times 4 + 1 \times 5}{\text{how many numbers}} \]

And rather than count how many numbers there are, we can add up the frequencies:

\[ \text{Mean} = \frac{2 \times 1 + 5 \times 2 + 4 \times 3 + 2 \times 4 + 1 \times 5}{2 + 5 + 4 + 2 + 1} \]

So let's calculate:

\[ \text{Mean} = \frac{2 + 10 + 12 + 8 + 5}{14} = \frac{37}{14} = 2.64\ldots \]

And that is how to calculate the mean from a frequency table!

Here is another example:

Example: Parking Spaces per House in Hampton Street

Isabella went up and down the street to find out how many parking spaces each house had. Here are her results:

Parking Spaces   Frequency
1                15
2                27
3                8
4                5

What is the mean number of Parking Spaces?

Answer:

\[ \text{Mean} = \frac{15 \times 1 + 27 \times 2 + 8 \times 3 + 5 \times 4}{15 + 27 + 8 + 5} = \frac{15 + 54 + 24 + 20}{55} = \frac{113}{55} = 2.05\ldots \]

The Mean is 2.05 (to 2 decimal places)

(much easier than adding all numbers separately!)

Notation

Now you know how to do it, let's do that last example again, but using formulas.

The symbol \(\Sigma\) (called Sigma) means "sum up" (read more at Sigma Notation).

So we can say "add up all frequencies" this way (where f is frequency):

\[ \sum f \]

And we would use it like this:

\[ \sum f = 15 + 27 + 8 + 5 = 55 \]

Likewise we can add up "frequency times score" this way (where f is frequency and x is the matching score):

\[ \sum fx \]

And the formula for calculating the mean from a frequency table is:

\[ \bar{x} = \frac{\sum fx}{\sum f} \]

The x with the bar on top says "the mean of x".

So now we are ready to do our example above, but with correct notation.

Example: Calculate the Mean of this Frequency Table

x   f
1   15
2   27
3   8
4   5

And here it is:

\[ \bar{x} = \frac{\sum fx}{\sum f} = \frac{15 \times 1 + 27 \times 2 + 8 \times 3 + 5 \times 4}{15 + 27 + 8 + 5} = \frac{113}{55} = 2.05\ldots \]

There you go! You can use sigma notation.

Calculate in the Table

It is often better to do the calculations in the table.

Example: (continued)

From the previous example, calculate f × x in the right-hand column and then do totals:

x         f    fx
1         15   15
2         27   54
3         8    24
4         5    20
TOTALS:   55   113

And the Mean is then easy:

Mean = 113 / 55 = 2.05...
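The "sum of f × x divided by sum of f" rule can be sketched in Python as follows. The frequency tables are stored as dictionaries mapping each score x to its frequency f; the function name `mean_from_frequencies` is a hypothetical helper.

```python
def mean_from_frequencies(table):
    """Mean of a frequency table given as {score x: frequency f}."""
    total_fx = sum(x * f for x, f in table.items())  # sum of f times x
    total_f = sum(table.values())                    # sum of f
    return total_fx / total_f


scores = {1: 2, 2: 5, 3: 4, 4: 2, 5: 1}   # first example: 37 / 14
parking = {1: 15, 2: 27, 3: 8, 4: 5}      # Hampton Street: 113 / 55

print(round(mean_from_frequencies(scores), 2))
print(round(mean_from_frequencies(parking), 2))
```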

RANGE The Range is the difference between the lowest and highest values. Example: In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9. So the range is 9-3 = 6.

It is that simple! But perhaps too simple ...

The Range Can Be Misleading

The range can sometimes be misleading when there are extremely high or low values.

Example: In {8, 11, 5, 9, 7, 6, 3616}: the lowest value is 5, and the highest is 3616, so the range is 3616-5 = 3611.

The single value of 3616 makes the range large, but most values are around 10. So you may be better off using Interquartile Range or Standard Deviation.

Range of a Function Range can also mean all the output values of a function, see Domain, Range and Codomain.

QUARTILE AND INTERQUARTILE RANGE

Quartiles are the values that divide a list of numbers into quarters.
- First put the list of numbers in order
- Then cut the list into four equal parts
- The Quartiles are at the "cuts"

Example: 5, 8, 4, 4, 6, 3, 8

Put them in order: 3, 4, 4, 5, 6, 8, 8

Cut the list into quarters:

And the result is:
- Quartile 1 (Q1) = 4
- Quartile 2 (Q2), which is also the Median, = 5
- Quartile 3 (Q3) = 8

Sometimes a "cut" is between two numbers ... the Quartile is the average of the two numbers.

Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8

The numbers are already in order.

Cut the list into quarters:

In this case Quartile 2 is half way between 5 and 6: Q2 = (5+6)/2 = 5.5

And the result is:
- Quartile 1 (Q1) = 3
- Quartile 2 (Q2) = 5.5
- Quartile 3 (Q3) = 7

Interquartile Range

The "Interquartile Range" is from Q1 to Q3:

To calculate it just subtract Quartile 1 from Quartile 3, like this:

Example: for 3, 4, 4, 5, 6, 8, 8 the Interquartile Range is: Q3 - Q1 = 8 - 4 = 4

Box and Whisker Plot

You can show all the important values in a "Box and Whisker Plot", like this:

A final example covering everything:

  

 

Example: Box and Whisker Plot and Interquartile Range for 4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11

Put them in order: 3, 4, 4, 4, 7, 10, 11, 12, 14, 16, 17, 18

Cut it into quarters: 3, 4, 4 | 4, 7, 10 | 11, 12, 14 | 16, 17, 18

In this case all the quartiles are between numbers:
- Quartile 1 (Q1) = (4+4)/2 = 4
- Quartile 2 (Q2) = (10+11)/2 = 10.5
- Quartile 3 (Q3) = (14+16)/2 = 15

Also:
- The Lowest Value is 3,
- The Highest Value is 18

So now we have enough data for the Box and Whisker Plot:

And the Interquartile Range is: Q3 - Q1 = 15 - 4 = 11
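The three worked examples above are consistent with a "median of each half" rule: Q2 is the median, Q1 is the median of the values below it, and Q3 is the median of the values above it (leaving the middle value out of both halves when the count is odd). The sketch below assumes that rule; the function names are our own. Other tools (spreadsheets, or Python's `statistics.quantiles`) use different conventions and may give slightly different quartiles.

```python
def median(ordered):
    """Median of an already sorted list."""
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def quartiles(numbers):
    """Return (Q1, Q2, Q3) using the median-of-halves rule (an assumption)."""
    ordered = sorted(numbers)
    half = len(ordered) // 2
    lower = ordered[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


data = [4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11]
q1, q2, q3 = quartiles(data)
print("Q1, Q2, Q3:", q1, q2, q3)
print("Interquartile Range:", q3 - q1)
print("Lowest and highest:", min(data), max(data))
```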

STANDARD DEVIATION

The Standard Deviation is a measure of how spread out numbers are.

Its symbol is σ (the Greek letter sigma).

The formula is easy: it is the square root of the Variance. So now you ask, "What is the Variance?"

Variance

The Variance is defined as: the average of the squared differences from the Mean.

To calculate the variance follow these steps:
- Work out the Mean (the simple average of the numbers)
- Then for each number: subtract the Mean and square the result (the squared difference).
- Then work out the average of those squared differences. (Why Square?)

Example

You and your friends have just measured the heights of your dogs (in millimeters):

The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm.

Find out the Mean, the Variance, and the Standard Deviation.

Your first step is to find the Mean:

Answer:

\[ \text{Mean} = \frac{600 + 470 + 170 + 430 + 300}{5} = \frac{1970}{5} = 394 \]

so the mean (average) height is 394 mm. Let's plot this on the chart:

Now, we calculate each dog's difference from the Mean: 206, 76, -224, 36 and -94.

To calculate the Variance, take each difference, square it, and then average the result:

\[ \sigma^2 = \frac{206^2 + 76^2 + (-224)^2 + 36^2 + (-94)^2}{5} = \frac{42{,}436 + 5{,}776 + 50{,}176 + 1{,}296 + 8{,}836}{5} = \frac{108{,}520}{5} = 21{,}704 \]

So, the Variance is 21,704.

And the Standard Deviation is just the square root of Variance, so:

Standard Deviation: σ = √21,704 = 147.32... = 147 (to the nearest mm)

And the good thing about the Standard Deviation is that it is useful. Now we can show which heights are within one Standard Deviation (147mm) of the Mean:

So, using the Standard Deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small.

Rottweilers are tall dogs. And Dachshunds are a bit short ... but don't tell them!

Now try the Standard Deviation Calculator.

But ... there is a small change with Sample Data

Our example was for a Population (the 5 dogs were the only dogs we were interested in).

But if the data is a Sample (a selection taken from a bigger Population), then the calculation changes!

When you have "N" data values that are:
- The Population: divide by N when calculating Variance (like we did)
- A Sample: divide by N-1 when calculating Variance

All other calculations stay the same, including how we calculated the mean.

Example: if our 5 dogs were just a sample of a bigger population of dogs, we would divide by 4 instead of 5 like this:

Sample Variance = 108,520 / 4 = 27,130
Sample Standard Deviation = √27,130 = 164.71... = 165 (to the nearest mm)

Think of it as a "correction" when your data is only a sample.

Formulas

Here are the two formulas, explained at Standard Deviation Formulas if you want to know more:

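The "Population Standard Deviation" (where μ is the population mean):

\[ \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2} \]

The "Sample Standard Deviation" (where x̄ is the sample mean):

\[ s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \]

Looks complicated, but the important change is to divide by N-1 (instead of N) when calculating a Sample Variance.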
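The dog-height example can be reproduced step by step in Python. This is a plain sketch of the steps listed under "Variance" (mean, squared differences, average, square root); the variable names are illustrative. Python's `statistics.pstdev` and `statistics.stdev` compute the population and sample versions directly.

```python
import math

heights = [600, 470, 170, 430, 300]  # dog heights in mm
n = len(heights)

# Step 1: the mean.
mean = sum(heights) / n

# Step 2: squared difference from the mean for each value.
squared_diffs = [(h - mean) ** 2 for h in heights]

# Step 3: average the squared differences.
population_variance = sum(squared_diffs) / n        # divide by N
sample_variance = sum(squared_diffs) / (n - 1)      # divide by N-1

population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print("mean:", mean)
print("population variance and SD:", population_variance, round(population_sd, 2))
print("sample variance and SD:", sample_variance, round(sample_sd, 2))
```

Dividing by N-1 always gives a slightly larger result than dividing by N, which is the "correction" for sample data.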

*Footnote: Why square the differences?

If we just added up the differences from the mean ... the negatives would cancel the positives:

\[ \frac{4 + 4 - 4 - 4}{4} = 0 \]

So that won't work. How about we use absolute values?

\[ \frac{|4| + |4| + |-4| + |-4|}{4} = \frac{4 + 4 + 4 + 4}{4} = 4 \]

That looks good, but what about this case:

\[ \frac{|7| + |1| + |-6| + |-2|}{4} = \frac{7 + 1 + 6 + 2}{4} = 4 \]

Oh No! It also gives a value of 4, even though the differences are more spread out!

So let us try squaring each difference (and taking the square root at the end):

\[ \sqrt{\frac{4^2 + 4^2 + 4^2 + 4^2}{4}} = \sqrt{\frac{64}{4}} = 4 \]

\[ \sqrt{\frac{7^2 + 1^2 + 6^2 + 2^2}{4}} = \sqrt{\frac{90}{4}} = 4.74\ldots \]

That is nice! The Standard Deviation is bigger when the differences are more spread out ... just what we want! In fact this method is a similar idea to distance between points, just applied in a different way. And it is easier to use algebra on squares and square roots than absolute values, which makes the standard deviation easy to use in other areas of mathematics.
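The comparison in the footnote can be checked with a few lines of Python. The two helper names below are our own; they simply implement "average of absolute values" and "square, average, then square root" for the two sets of differences used above.

```python
import math


def mean_absolute(differences):
    """Average of the absolute differences."""
    return sum(abs(d) for d in differences) / len(differences)


def root_mean_square(differences):
    """Square each difference, average, then take the square root."""
    return math.sqrt(sum(d * d for d in differences) / len(differences))


even_spread = [4, 4, -4, -4]
wider_spread = [7, 1, -6, -2]

# Absolute values cannot tell the two cases apart ...
print(mean_absolute(even_spread), mean_absolute(wider_spread))

# ... but squaring gives a bigger result for the more spread out differences.
print(root_mean_square(even_spread), round(root_mean_square(wider_spread), 2))
```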

UNIVARIATE/BIVARIATE

Univariate: one variable, Bivariate: two variables

Univariate means "one variable" (one type of data).

Example: Travel Time (minutes): 15, 29, 8, 42, 35, 21, 18, 42, 26

The variable is Travel Time.

We can do lots of things with univariate data:
- Find a central value using mean, median and mode
- Find how spread out it is using range, quartiles and standard deviation
- Make plots like Bar Graphs, Pie Charts and Histograms

Bivariate means "two variables", in other words there are two types of data.

With bivariate data you have two sets of related data that you want to compare:

Example: An ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. The two variables are Ice Cream Sales and Temperature. Here are their figures for the last 12 days:

Ice Cream Sales vs Temperature


It is now easy to see that warmer weather leads to more sales, but the relationship is not perfect. So with bivariate data we are interested in comparing the two sets of data and finding any relationships. We can use Tables, Scatter Plots, Correlation, Line of Best Fit, and plain old common sense.

SCATTER PLOT OUTLIERS "Outliers" are values that "lie outside" the other values. When we collect data, sometimes there are values that are "far away" from the main group of data ... what do we do with them?

  

Example: Long Jump

A new coach has been working with the Long Jump team this month, and the athletes' performance has changed. Augustus can now jump 0.15m further, June and Carol can jump 0.06m further. Here are all the results:
- Augustus: +0.15m
- Tom: +0.11m
- June: +0.06m
- Carol: +0.06m
- Bob: +0.12m
- Sam: -0.56m

Oh no! Sam got worse. Here are the results on the number line:

The mean is: (0.15+0.11+0.06+0.06+0.12-0.56) / 6 = -0.06 / 6 = -0.01m

So, on average the performance went DOWN. The coach is obviously useless ... right?

Sam's result is an "Outlier" ... what if we remove Sam's result?

Example: Long Jump (continued)

Let us try the results WITHOUT Sam (5 athletes, so we divide by 5):

Mean = (0.15+0.11+0.06+0.06+0.12) / 5 = 0.50 / 5 = 0.10m

Hey, the coach looks much better now! But is that fair? Can we just get rid of values we don't like?

What To Do?

You need to think "why is that value over there?"

It may be quite normal to have high or low values:
- People can be short or tall
- Some days there is no rain, other days there can be a downpour
- Athletes can perform better or worse on different days

Or there may be an unusual reason for extreme data.

Example: Long Jump (continued)

We find out that Sam was feeling sick that day. Not the coach's fault at all. So it is a good idea in this case to remove Sam's result.

When you remove outliers YOU are influencing the data, it is no longer "pure", so you shouldn't just get rid of the outliers without a good reason! And when you do get rid of them, explain what you are doing and why.

Mean, Median and Mode

We saw how outliers affect the mean, but what about the median or mode?

Example: Long Jump (continued)

The median ("middle" value):
- including Sam is: 0.085
- without Sam is: 0.11 (went up a little)

The mode (the most common value):
- including Sam is: 0.06
- without Sam is: 0.06 (stayed the same)

The mode and median didn't change very much. They also stayed around where most of the data is.

So it seems that outliers have the biggest effect on the mean, and not so much on the median or mode.

Hint: calculate the median and mode when you have outliers.
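To see the effect of the outlier for yourself, the Long Jump figures can be run through Python's standard `statistics` module. The dictionary layout and variable names are just one way to set it up.

```python
from statistics import mean, median, mode

# Change in long jump distance (metres) for each athlete.
changes = {"Augustus": 0.15, "Tom": 0.11, "June": 0.06,
           "Carol": 0.06, "Bob": 0.12, "Sam": -0.56}

with_sam = list(changes.values())
without_sam = [value for name, value in changes.items() if name != "Sam"]

for label, data in (("including Sam", with_sam), ("without Sam", without_sam)):
    print(f"{label}: mean {mean(data):.2f}, "
          f"median {median(data):.3f}, mode {mode(data):.2f}")
```

The mean moves a lot when Sam's result is dropped, while the median and mode barely change.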

 

 

CORRELATION

When two sets of data are strongly linked together we say they have a High Correlation.

The word Correlation is made of Co- (meaning "together"), and Relation.
- Correlation is Positive when the values increase together, and
- Correlation is Negative when one value decreases as the other increases

Like this:

Correlation can have a value: 1 is a perfect positive correlation 0 is no correlation (the values don't seem linked at all) -1 is a perfect negative correlation The value shows how good the correlation is (not how steep the line is), and if it is positive or negative. Example: Ice Cream Sales The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day, here are their figures for the last 12 days:   

Ice Cream Sales vs Temperature


You can easily see that warmer weather leads to more sales, the relationship is good but not perfect. In fact the correlation is 0.9575 ... see at the end how I calculated it. Correlation Is Not Good at Curves The correlation calculation only works well for relationships that follow a straight line. Our Ice Cream Example: there has been a heat wave! It gets so hot that people aren't going near the shop, and sales start dropping.

The correlation is now 0: "No Correlation" ... ! The calculated value of correlation is 0 (trust me, I worked it out), which says there is "no correlation". But we can see the data follows a nice curve that reaches a peak around 25°C. The correlation calculation is not "smart" enough to see this. Moral of the story: make a Scatter Plot, and look at it! You may see more than the correlation value says.
Correlation Is Not Causation "Correlation Is Not Causation" ... by that I mean: when there is a correlation it does not mean that one thing causes the other. Example: Sunglasses vs Ice Cream Our Ice Cream shop finds how many sunglasses were sold by a big store for each day and compares them to their ice cream sales:

The correlation between Sunglasses and Ice Cream sales is high. Does this mean that sunglasses make people want ice cream? (More likely, warm sunny weather drives both.)
How To Calculate How did I calculate the value 0.9575 at the top? I used "Pearson's Correlation". There is software that can calculate it for you, such as the CORREL() function in Excel or OpenOffice Calc ... but here is how to calculate it yourself: Let us call the two sets of data "x" and "y" (in our case Temperature is x and Ice Cream Sales is y):
Step 1: Find the mean of x, and the mean of y
Step 2: Subtract the mean of x from every x value (call them "a"), do the same for y (call them "b")
Step 3: Calculate: a × b, a² and b² for every value
Step 4: Sum up a × b, sum up a² and sum up b²
Step 5: Divide the sum of a × b by the square root of [(sum of a²) × (sum of b²)]
Here is how I calculated the first Ice Cream example (values rounded to 1 or 0 decimal places):

  

As a formula it is:

$$\text{correlation} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \times \sum (y - \bar{y})^2}}$$

Where:

Σ is Sigma, the symbol for "sum up"

$(x - \bar{x})$ is each x-value minus the mean of x (called "a" above)

$(y - \bar{y})$ is each y-value minus the mean of y (called "b" above)

You probably won't have to calculate it like that, but at least you know it is not "magic", but simply a routine set of calculations.
Other Methods There are also other ways to calculate a correlation coefficient, such as "Spearman's rank correlation coefficient" (which works on the ranks of the values rather than the values themselves), but I prefer using a spreadsheet like above.
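The five steps of Pearson's Correlation can be sketched in a few lines of Python. This is only an illustration: the function name is my own, and the two small data sets are made up for the sketch (they are NOT the ice cream figures). The second data set follows a curve, to show the correlation coming out as 0 even though the values are clearly linked.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson's correlation, following Steps 1 to 5."""
    mean_x = sum(xs) / len(xs)                  # Step 1: the two means
    mean_y = sum(ys) / len(ys)
    a = [x - mean_x for x in xs]                # Step 2: subtract the means
    b = [y - mean_y for y in ys]
    sum_ab = sum(p * q for p, q in zip(a, b))   # Steps 3 and 4: products, squares, sums
    sum_a2 = sum(p * p for p in a)
    sum_b2 = sum(q * q for q in b)
    return sum_ab / sqrt(sum_a2 * sum_b2)       # Step 5

# Hypothetical data, invented for this sketch
line_x = [1, 2, 3, 4, 5]
line_y = [2, 4, 5, 4, 5]             # roughly follows a straight line
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x * x for x in curve_x]   # follows a perfect curve

print(round(pearson(line_x, line_y), 4))   # 0.7746
print(pearson(curve_x, curve_y))           # 0.0
```

The curved data gives exactly 0, which is the "Correlation Is Not Good at Curves" problem in miniature.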

PROBABILITY How likely something is to happen.

Many events can't be predicted with total certainty. The best we can do is say how likely they are to happen, using the idea of probability. Tossing a Coin When a coin is tossed, there are two possible outcomes: heads (H) or tails (T) We say that the probability of the coin landing H is ½. And the probability of the coin landing T is ½.  

Throwing Dice When a single die is thrown, there are six possible outcomes: 1, 2, 3, 4, 5, 6. The probability of any one of them is 1/6.
Probability In general:
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)
Example: the chances of rolling a "4" with a die Number of ways it can happen: 1 (there is only 1 face with a "4" on it) Total number of outcomes: 6 (there are 6 faces altogether) So the probability = 1/6
Example: there are 5 marbles in a bag: 4 are blue, and 1 is red. What is the probability that a blue marble will be picked? Number of ways it can happen: 4 (there are 4 blues) Total number of outcomes: 5 (there are 5 marbles in total) So the probability = 4/5 = 0.8
Probability Line You can show probability on a Probability Line:

Probability is always between 0 and 1
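Here is a tiny sketch of the "ways it can happen / total outcomes" rule, using exact fractions. The function name is my own choice, not something from the text.

```python
from fractions import Fraction

def probability(ways, total):
    """Number of ways it can happen divided by the total number of outcomes."""
    return Fraction(ways, total)

die_four = probability(1, 6)      # rolling a "4" with one die
blue_marble = probability(4, 5)   # 4 blue marbles in a bag of 5

print(die_four, blue_marble, float(blue_marble))
```

Both answers sit between 0 and 1, as every probability must.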

 

 

 

Probability is Just a Guide Probability does not tell us exactly what will happen, it is just a guide. Example: toss a coin 100 times, how many Heads will come up? Probability says that heads have a 1/2 chance, so we would expect 50 Heads. But when you actually try it out you might get 48 heads, or 55 heads ... or anything really, but in most cases it will be a number near 50. Learn more at Probability Index.
Words Some words have special meaning in Probability:
Experiment: an action where the result is uncertain. Tossing a coin, throwing dice, seeing what pizza people choose are all examples of experiments.
Sample Space: all the possible outcomes of an experiment. Example: choosing a card from a deck There are 52 cards in a deck (not including Jokers) So the Sample Space is all 52 possible cards: {Ace of Hearts, 2 of Hearts, etc... } The Sample Space is made up of Sample Points:
Sample Point: just one of the possible outcomes. Example: Deck of Cards the 5 of Clubs is a sample point, the King of Hearts is a sample point. "King" is not a sample point. As there are 4 Kings that is 4 different sample points.
Event: one or more outcomes of an experiment. Example Events: Getting a Tail when tossing a coin is an event. Rolling a "5" is an event. An event can include one or more possible outcomes: Choosing a "King" from a deck of cards (any of the 4 Kings) is an event. Rolling an "even number" (2, 4 or 6) is also an event.

The Sample Space is all possible outcomes. A Sample Point is just one possible outcome. And an Event can be one or more of the possible outcomes.

Hey, let's use those words, so you get used to them: Example: Alex decides to see how many times a "double" would come up when throwing 2 dice. Each time Alex throws the 2 dice is an Experiment. It is an Experiment because the result is uncertain. The Event Alex is looking for is a "double", where both dice have the same number. It is made up of these 6 Sample Points: {1,1} {2,2} {3,3} {4,4} {5,5} and {6,6} The Sample Space is all possible outcomes (36 Sample Points): {1,1} {1,2} {1,3} {1,4} ... {6,3} {6,4} {6,5} {6,6} These are Alex's Results:
Experiment    Is it a Double?
{3,4}         No
{5,1}         No
{2,2}         Yes
{6,3}         No
...           ...
After 100 Experiments, Alex had 19 "double" Events ... is that close to what you would expect?
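To answer that last question, you can list the whole Sample Space and count the "double" Sample Points. A minimal sketch follows (the variable names are my own):

```python
from fractions import Fraction
from itertools import product

# Sample Space: every ordered result of throwing 2 dice
sample_space = list(product(range(1, 7), repeat=2))

# Event: a "double", where both dice show the same number
doubles = [point for point in sample_space if point[0] == point[1]]

p_double = Fraction(len(doubles), len(sample_space))
expected_in_100 = 100 * p_double   # how many doubles we would expect in 100 throws

print(len(sample_space), len(doubles), p_double, round(float(expected_in_100), 1))
```

That works out to about 17 doubles in 100 throws, so Alex's 19 is reasonably close.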

PROBABILITY LINE Probability is the chance that something will happen. It can be shown on a line.

The probability of an event occurring is somewhere between impossible and certain. As well as words we can use numbers (such as fractions or decimals) to show the probability of something happening: Impossible is zero Certain is one. Here are some fractions on the probability line:  

We can also show the chance that something will happen:

a) The sun will rise tomorrow. b) I will not have to learn mathematics at school. c) If I flip a coin it will land heads up. d) Choosing a red ball from a sack with 1 red ball and 3 green balls

Between 0 and 1 The probability of an event will not be less than 0. This is because 0 is impossible (sure that something will not happen). The probability of an event will not be more than 1. This is because 1 is certain that something will happen. 



BASIC COUNTING PRINCIPLE When there are m ways to do one thing, and n ways to do another, then there are m×n ways of doing both. Example: you have 3 shirts and 4 pants. That means 3×4=12 different outfits. Example: There are 6 flavors of ice-cream, and 3 different cones. That means 6×3=18 different single-scoop ice-creams you could order. It also works when you have more than 2 choices: Example: You are buying a new car. There are 2 body styles: sedan or hatchback There are 5 colors available: There are 3 models:



GL (standard model), SS (sports model with bigger engine) SL (luxury model with leather seats) 



How many total choices? You can see in this "tree" diagram:

You can count the choices, or just do the simple calculation:

Total Choices = 2 × 5 × 3 = 30 Independent or Dependent? But it only works when all choices are independent of each other. If one choice affects another choice (i.e. depends on another choice), then a simple multiplication is not right. Example: You are buying a new car ... but ... the salesman says "You can't choose black for the hatchback" ... well then things change!

You now have only 27 choices. Because your choices are not independent of each other. But you can still make your life easier with this calculation: Choices = 5×3 + 4×3 = 15 + 12 = 27
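Both car counts can be checked by listing every combination. In this sketch the color names are assumptions (the text only says there are 5 colors and that one of them is black); the body styles and models come from the example.

```python
from itertools import product

body_styles = ["sedan", "hatchback"]
colors = ["black", "white", "red", "blue", "silver"]   # assumed names; only "black" is from the text
models = ["GL", "SS", "SL"]

# Independent choices: every combination is allowed
all_choices = list(product(body_styles, colors, models))

# Dependent choices: no black hatchback
allowed = [c for c in all_choices
           if not (c[0] == "hatchback" and c[1] == "black")]

print(len(all_choices), len(allowed))   # 30 27
```

The first count matches 2 × 5 × 3, and the second matches 5×3 + 4×3.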

RANDOM WORDS Random Letters You would think it was easy to create random words ... just pick letters randomly and put them together, and voila! a random word. Well, here are 20 words made that way: tldkl oewkx dmwol vuptg hvwjk naqid avypr zwtip zgnzs bvdhd muyfd ighgd xhlng oyecn vjnsl ssjrx gxald tukxj rvfoq yxzxq It turns out that the words are not only nonsense, but quite hard to pronounce! (Try saying "tldkl" or "oewkx") You see, getting a real word this way is very unlikely ... you would have to try lots of random combinations before getting lucky. Why? Well, English has around 200,000 words (228,000 in the Oxford English Dictionary, including many words no longer used) ... but how many different words can be made with just 5 letters? 26 × 26 × 26 × 26 × 26 = 11,881,376 possible 5 letter words! And that is just the 5 letter words ... Let us guess that there are 40,000 words in English that have 5 letters. So the probability of making a real word just randomly would be: 40,000 / 11,881,376 = 0.003, or about 0.3% chance. So real words are rare. And we can see that putting random letters together is very unlikely to produce a real word.
Vowels We can improve our success by insisting that a word have at least one vowel, since nearly every word in English has one (except fly, by and a few others). Like this: ectot gjaqv kuifg vzicu zspsu pdidb wqdis uerrs ucgej okimw fnevz ewxko ljgew aglgo jpfoq dcytu uwkcj dzioy wekdx xuybk This is a great improvement. More words can be pronounced. But there are still lots of strange words like "zspsu" and "xuybk"
Letter Frequency So, our next improvement is to use fewer of the letters like j, x, z and q and more of the letters like e, t and s. In fact the frequency of letters in the English Language is well known. Here is how many times you would expect to see a letter in every 1,000 letters:

a    b    c    d    e    f    g    h    i    j    k    l    m
82   15   28   42   127  22   20   61   70   2    8    40   24

n    o    p    q    r    s    t    u    v    w    x    y    z
67   75   19   1    60   63   90   27   10   24   2    20   1

Can you see that "e" is common, but "z" is rare? "e" is likely to occur 127 times in every 1,000, or as a ratio 127/1000 = .127 (=12.7%) "z" is likely to occur only 1 time in every 1,000, or as a ratio 1/1000 = .001 (=0.1%) So, by selecting letters based on that frequency (a bit like rolling a 1,000-sided die that has 82 a's, 15 b's ... and only one z), we can get output like this: elnao etgov segty laast aessn siuon oenha eaoas ncoot ctwka dmswo dpuoh eewis ebdni laarm syucs idvos lhina igahh soyie Still no real words, but some are close. And most of them can be pronounced. (Great names if you are writing a science fiction novel!)
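The "1,000-sided die" idea can be sketched with Python's random.choices, which picks items according to weights. The per-thousand counts are the ones from the frequency table; the function name and the word length of 5 are my own choices.

```python
import random
import string

# Occurrences per 1,000 letters, in order a to z, from the frequency table
per_thousand = [82, 15, 28, 42, 127, 22, 20, 61, 70, 2, 8, 40, 24,
                67, 75, 19, 1, 60, 63, 90, 27, 10, 24, 2, 20, 1]

def random_word(length=5, rng=random):
    """Pick each letter independently, weighted by how common it is."""
    letters = rng.choices(string.ascii_lowercase, weights=per_thousand, k=length)
    return "".join(letters)

print([random_word() for _ in range(5)])
```

The output changes on every run, but you should see plenty of e's and t's and hardly any q's or z's.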

but we can do better ...
2-Letter Frequencies We can take the idea of Letter Frequency one step further by asking "what is the frequency of letters that follow another letter?" For example, if we already have a "t", the next letter is very likely to be an "h" (making "th"). To illustrate this, I built up a Table of Two-Letter Frequencies (from Alice's Adventures in Wonderland). Here is the line for "t":

[Table: two-letter frequency row for "t", with columns a to z. Counts as extracted: 275 18 12 990 149 153 333 125 65 54 238 41 727 11 3197 459]

So, "h" occurred 3197 times after a "t" ("th") ... but "b" never followed a "t". OK, let us start with a "t", and let us say we choose an "h" to make "th", then next we would use the "h"-row to choose another letter (maybe an "e" to make "the"), and so on ... well, here is a sample: the cur the bund hof arytowno d sheromasees asemedosouro f soacthake d imon binofowat oaten d heng wa The results are remarkable ... nonsense, but almost like some strange language. In fact we are not just making random words now, we are making random sentences!
Higher Letter Frequencies Why stop there? We can make tables of three letter frequencies or more ...
3 Letter Frequencies How do 3 Letter Frequencies work? Well, say I already have two letters (like "ei") ... we then:
look through the sample text for every time "ei" appears,
randomly choose one of those,
look for the letter following "ei" (possibly "t"),
then add the "t" to make "eit" and start again using "it" (... always the last two letters)
Here is a sample: Either great into get very deep welled of it it, and to wondere started into the book about hear! Now, that looks good! By sampling from a real source we can get good results.
4 Letter Frequencies Using the same method I used groups of 3 Letters to decide on the 4th letter and got: Either the sides or conversations in time to happen next. First, she look down mind

5 Letter Frequencies And with 5 Letter frequencies: There was just in time it all seemed quite natural); but to take out of time as she had not like to do
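The "look at the last few letters, then pick a letter that followed them in the sample text" method can be sketched like this. It is only an illustration: the function name and parameters are my own, and the short sample sentence is a made-up placeholder (the examples in the text were built from Alice's Adventures in Wonderland, which you would load from a file instead). Setting order=2 corresponds to the 3 Letter Frequencies case.

```python
import random

def generate(source, order=2, length=60, seed=None):
    """Grow text one letter at a time, based on the last `order` letters."""
    rng = random.Random(seed)
    # For every run of `order` letters, remember each letter that followed it
    followers = {}
    for i in range(len(source) - order):
        followers.setdefault(source[i:i + order], []).append(source[i + order])
    out = source[:order]
    while len(out) < length:
        options = followers.get(out[-order:])
        if not options:   # dead end: these letters were never followed by anything
            break
        # Picking from the list of followers is the same as picking a random occurrence
        out += rng.choice(options)
    return out

# Placeholder sample text, invented for this sketch
sample_text = "the cat sat on the mat and then the rat ran to the hat "
print(generate(sample_text, order=2, seed=1))
```

A longer source text and a higher order give output that looks more and more like real English, which is the pattern seen in the 3, 4 and 5 letter samples.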

COMPLEMENT Complement of an Event: All outcomes that are NOT the event. When the event is Heads, the complement is Tails. When the event is {Monday, Wednesday} the complement is {Tuesday, Thursday, Friday, Saturday, Sunday}. When the event is {Hearts} the complement is {Spades, Clubs, Diamonds, Jokers}. So the Complement of an event is all the other outcomes (not the ones you want). And together the Event and its Complement make all possible outcomes.
Probability
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)
Example: the chances of rolling a "4" with a die Number of ways it can happen: 1 (there is only 1 face with a "4" on it) Total number of outcomes: 6 (there are 6 faces altogether) So the probability = 1/6
The probability of an event is shown using "P": P(A) means "Probability of Event A" The complement is shown by a little ' mark such as A' (or sometimes A^c or Ā): P(A') means "Probability of the complement of Event A" The two probabilities always add to 1: P(A) + P(A') = 1

Example: Rolling a "5" or "6" Event A: {5, 6} Number of ways it can happen: 2 Total number of outcomes: 6 P(A) = 2/6 = 1/3
The Complement of Event A is {1, 2, 3, 4} Number of ways it can happen: 4 Total number of outcomes: 6 P(A') = 4/6 = 2/3
Let us add them: P(A) + P(A') = 1/3 + 2/3 = 3/3 = 1

Yep, that makes 1 It makes sense, right? Event A plus all outcomes that are not Event A make up all possible outcomes. Why is the Complement Useful? It is sometimes easier to work out the complement first. Example. Throw two dice. What is the probability the two scores are different? Different scores are like getting a 2 and 3, or a 6 and 1. It is quite a long list: A = { (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,3), (2,4), ... etc ! } But the complement (which is when the two scores are the same) is only 6 outcomes: A' = { (1,1), (2,2), (3,3), (4,4), (5,5), (6,6) } And the probability is easy to work out: P(A') = 6/36 = 1/6 Knowing that P(A) and P(A') together make 1, we can calculate: P(A) = 1 - P(A') = 1 - 1/6 = 5/6 So in this case it's easier to work out P(A') first, then find P(A)
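Here is a quick check of the two dice example, done both ways: the long way (counting every "different" outcome) and the short way (using the complement). The variable names are my own.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))       # all 36 results of two dice
same = [o for o in outcomes if o[0] == o[1]]          # A': the two scores are the same
different = [o for o in outcomes if o[0] != o[1]]     # A: the two scores are different

p_same = Fraction(len(same), len(outcomes))           # P(A') = 6/36
p_different = 1 - p_same                              # P(A) = 1 - P(A')

print(p_same, p_different, len(different))
```

Counting the 30 "different" outcomes directly gives the same 5/6, but the complement only needed a list of 6.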

TYPE OF EVENTS

 

 

Life is full of random events! You need to get a "feel" for them to be a smart and successful person. The toss of a coin, the throw of a die and lottery draws are all examples of random events.
Events When we say "Event" we mean one (or more) outcomes. Example Events: Getting a Tail when tossing a coin is an event. Rolling a "5" is an event. An event can include several outcomes: Choosing a "King" from a deck of cards (any of the 4 Kings) is also an event. Rolling an "even number" (2, 4 or 6) is an event.
Independent Events Events can be "Independent", meaning each event is not affected by any other events. This is an important idea! A coin does not "know" that it came up heads before ... each toss of a coin is a perfectly isolated thing. Example: You toss a coin three times and it comes up "Heads" each time ... what is the chance that the next toss will also be a "Head"? The chance is simply 1/2, or 50%, just like ANY OTHER toss of the coin. What it did in the past will not affect the current toss! Some people think "it is overdue for a Tail", but really truly the next toss of the coin is totally independent of any previous tosses. Saying "a Tail is due", or "just one more go, my luck is due" is called The Gambler's Fallacy (Learn more at Independent Events.)
Dependent Events

But some events can be "dependent" ... which means they can be affected by previous events ... Example: Drawing 2 Cards from a Deck After taking one card from the deck there are fewer cards available, so the probabilities change!





 

Let's say you are interested in the chances of getting a King. For the 1st card the chance of drawing a King is 4 out of 52. But for the 2nd card: If the 1st card was a King, then the 2nd card is less likely to be a King, as only 3 of the 51 cards left are Kings. If the 1st card was not a King, then the 2nd card is slightly more likely to be a King, as 4 of the 51 cards left are Kings. This is because you are removing cards from the deck.
With Replacement: When you put each card back after drawing it the chances don't change, as the events are independent. Without Replacement: The chances will change, and the events are dependent. You can learn more about this at Dependent Events: Conditional Probability
Tree Diagrams When you have Dependent Events it helps to make a "Tree Diagram" Example: Soccer Game You are off to soccer, and love being the Goalkeeper, but that depends who is the Coach today: with Coach Sam your probability of being Goalkeeper is 0.5, with Coach Alex your probability of being Goalkeeper is 0.3. Sam is Coach more often ... about 6 of every 10 games (a probability of 0.6). Let's build the Tree Diagram! Start with the Coaches. We know 0.6 for Sam, so it must be 0.4 for Alex (the probabilities must add to 1):

Then fill out the branches for Sam (0.5 Yes and 0.5 No), and then for Alex (0.3 Yes and 0.7 No):

Now that it is neatly laid out we can calculate probabilities (read more at "Tree Diagrams").
Mutually Exclusive Mutually Exclusive means you can't get both events at the same time. It is either one or the other, but not both. Examples: Turning left or right are Mutually Exclusive (you can't do both at the same time). Heads and Tails are Mutually Exclusive. Kings and Aces are Mutually Exclusive.
What isn't Mutually Exclusive Kings and Hearts are not Mutually Exclusive, because you can have a King of Hearts! Like here:



Aces and Kings are Mutually Exclusive

Hearts and Kings are not Mutually Exclusive

INDEPENDENT EVENTS Life is full of random events! You need to get a "feel" for them to be a smart and successful person. The toss of a coin, throwing dice and lottery draws are all examples of random events. Sometimes an event can affect the next event. Example: taking colored marbles from a bag: as you take each marble there are fewer marbles left in the bag, so the probabilities change. We call those Dependent Events, because what happens depends on what happened before (learn more about this at Conditional probability). But otherwise they are Independent Events ...
Independent Events Independent Events are not affected by previous events. This is an important idea! A coin does not "know" it came up heads before ... each toss of a coin is a perfectly isolated thing. Example: You toss a coin and it comes up "Heads" three times ... what is the chance that the next toss will also be a "Head"? The chance is simply 1/2 (or 0.5) just like ANY toss of the coin. What it did in the past will not affect the current toss! Some people think "it is overdue for a Tail", but really truly the next toss of the coin is totally independent of any previous tosses. Saying "a Tail is due", or "just one more go, my luck is due" is called The Gambler's Fallacy. Of course your luck may change, because each toss of the coin has an equal chance.
Probability of Independent Events "Probability" (or "Chance") is how likely something is to happen. So how do we calculate probability?
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Example: what is the probability of getting a "Head" when tossing a coin? Number of ways it can happen: 1 (Head) Total number of outcomes: 2 (Head and Tail) So the probability = 1/2 = 0.5
Example: what is the probability of getting a "5" or "6" when rolling a die? Number of ways it can happen: 2 ("5" and "6") Total number of outcomes: 6 ("1", "2", "3", "4", "5" and "6") So the probability = 2/6 = 1/3 = 0.333...
Ways of Showing Probability Probability goes from 0 (impossible) to 1 (certain):

   

It is often shown as a decimal or fraction. Example: the probability of getting a "Head" when tossing a coin: As a decimal: 0.5 As a fraction: 1/2 As a percentage: 50% Or sometimes like this: 1-in-2 Two or More Events You can calculate the chances of two or more independent events by multiplying the chances. Example: Probability of 3 Heads in a Row For each toss of a coin a "Head" has a probability of 0.5:

And so the chance of getting 3 Heads in a row is 0.125 Notation We use "P" to mean "Probability Of", So, for Independent Events: P(A and B) = P(A) × P(B) Probability of A and B equals the probability of A times the probability of B Example: you are going to a concert, and your friend says it is some time on the weekend between 4 and 12, but won't say more. What are the chances it is on Sunday between 10 and 12? Day: there are two days on the weekend, so P(Sunday) = 0.5 Time: between 4 and 12 is 8 hours, but you want between 10 and 12 which is only 2 hours: P(Your Time) = 2/8 = 0.25 And: P(Sunday and Your Time) = P(Sunday) × P(Your Time) = 0.5 × 0.25 = 0.125 Or a 12.5% chance
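The multiplication rule for Independent Events is easy to try out. In this sketch the helper name p_all is my own; it simply multiplies whatever chances you give it, and the three-Heads answer is cross-checked by listing all the possible results.

```python
from itertools import product
from math import prod

def p_all(*probabilities):
    """P(A and B and ...) for independent events: multiply the chances."""
    return prod(probabilities)

three_heads = p_all(0.5, 0.5, 0.5)
concert = p_all(0.5, 2 / 8)   # P(Sunday) x P(between 10 and 12)

# Cross-check: list all 8 equally likely results of three tosses
tosses = list(product("HT", repeat=3))
by_counting = tosses.count(("H", "H", "H")) / len(tosses)

print(three_heads, concert, by_counting)   # 0.125 0.125 0.125
```

Remember that this only works when the events really are independent.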

Another Example Imagine there are two groups: A member of each group gets randomly chosen for the winners circle, then one of those gets randomly chosen to get the big money prize:  

What is your chance of winning the big prize? There is a 1/5 chance of going to the winners circle and a 1/2 chance of winning the big prize. So you have a 1/5 chance followed by a 1/2 chance ... which makes a 1/10 chance overall: 1/5 × 1/2 = 1/(5×2) = 1/10 Or you can calculate using decimals (1/5 is 0.2, and 1/2 is 0.5): 0.2 × 0.5 = 0.1 So your chance of winning the big money is 0.1 (which is the same as 1/10).

Coincidence! Many "Coincidences" are, in fact, likely. Example: you are in a room with 30 people, and find that Zach and Anna celebrate their birthday on the same day. Would you say "wow, how strange", or "that seems reasonable, with so many people here"? In fact there is a 70% chance that would happen ... so it is likely. Why is the chance so high? Because you are comparing everyone to everyone else (not just one to many). And with 30 people that is 435 comparisons. (Read Shared Birthdays to find out more.)
Example: Snap! Did you ever say something the same as someone else, at the same time too? Wow, how amazing! But you were probably sharing an experience (movie, journey, whatever) and so your thoughts would be similar. And there are only so many ways of saying something ... so it is like the card game "Snap!" ... if you speak enough words together, they will eventually match up. So, maybe not so amazing, just simple chance at work. Can you think of other cases where a "coincidence" was simply a likely thing?
Conclusion
Probability is: (Number of ways it can happen) / (Total number of outcomes)
Dependent Events (such as removing marbles from a bag) are affected by previous events
Independent events (such as a coin toss) are not affected by previous events
You can calculate the probability of 2 or more Independent events by multiplying
Not all coincidences are really unlikely (when you think about them).
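The shared birthday figures can be checked with a short calculation. This is a sketch under simple assumptions that are mine, not the text's: 365 equally likely birthdays and no leap years. It works out the chance that everyone is different, then takes the complement.

```python
from math import comb, prod

def p_shared_birthday(people, days=365):
    """Chance that at least two people share a birthday."""
    # Each new person must miss every birthday already taken
    p_all_different = prod((days - i) / days for i in range(people))
    return 1 - p_all_different

print(comb(30, 2))                        # 435 pairs to compare
print(round(p_shared_birthday(30), 3))    # 0.706
```

So the "70% chance" is about right, and the 435 is simply the number of ways to choose 2 people out of 30.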

DEPENDENT EVENTS Life is full of random events! You need to get a "feel" for them to be a smart and successful person.
Independent Events Events can be "Independent", meaning each event is not affected by any other events. Example: Tossing a coin. Each toss of a coin is a perfectly isolated thing. What it did in the past will not affect the current toss. The chance is simply 1-in-2, or 50%, just like ANY toss of the coin. So each toss is an Independent Event.
Dependent Events But events can also be "dependent" ... which means they can be affected by previous events ...

 

Example: Marbles in a Bag 2 blue and 3 red marbles are in a bag. What are the chances of getting a blue marble? The chance is 2 in 5 But after taking one out you change the chances! So the next time: if you got a red marble before, then the chance of a blue marble next is 2 in 4 if you got a blue marble before, then the chance of a blue marble next is 1 in 4 See how the chances change each time? Each event depends on what happened in the previous event, and is called dependent. That is the kind of thing we will be looking at here. "Replacement" Note: if you had replaced the marbles in the bag each time, then the chances would not have changed and the events would be independent: With Replacement: the events are Independent (the chances don't change) Without Replacement: the events are Dependent (the chances change)  
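You can watch the chances change by listing every way of drawing two marbles without replacement. A minimal sketch (the function name is my own):

```python
from fractions import Fraction
from itertools import permutations

bag = ["blue", "blue", "red", "red", "red"]

# Every ordered pair of two different marbles (drawing without replacement)
draws = list(permutations(bag, 2))

def p_second_blue_given(first):
    """Chance the 2nd marble is blue, given the color of the 1st."""
    matching = [d for d in draws if d[0] == first]
    blue_next = [d for d in matching if d[1] == "blue"]
    return Fraction(len(blue_next), len(matching))

print(p_second_blue_given("red"))    # 1/2, the same as 2 in 4
print(p_second_blue_given("blue"))   # 1/4
```

With replacement the bag would be back to 2 blue in 5 for every draw, so the chance would stay at 2/5.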

Tree Diagram A Tree Diagram is a wonderful way to picture what is going on, so let's build one for our marbles example. There is a 2/5 chance of pulling out a Blue marble, and a 3/5 chance for Red:

We can even go one step further and see what happens when we select a second marble:

If a blue marble was selected first there is now a 1/4 chance of getting a blue marble and a 3/4 chance of getting a red marble. If a red marble was selected first there is now a 2/4 chance of getting a blue marble and a 2/4 chance of getting a red marble. Now we can answer questions like "What are the chances of drawing 2 blue marbles?" Answer: it is a 2/5 chance followed by a 1/4 chance:

Did you see how we multiplied the chances? And got 1/10 as a result. The chances of drawing 2 blue marbles is 1/10 Notation We love notation in mathematics! It means we can then use the power of algebra to play around with the ideas. So here is the notation for probability: P(A) means "Probability Of Event A" In our marbles example Event A is "get a Blue Marble first" with a probability of 2/5: P(A) = 2/5 And Event B is "get a Blue Marble second" ... but for that we have 2 choices: If we got a Blue Marble first the chance is now 1/4 If we got a Red Marble first the chance is now 2/4 So we have to say which one we want, and use the symbol "|" to mean "given": P(B|A) means "Event B given Event A" In other words, event A has already happened, now what is the chance of event B? P(B|A) is also called the "Conditional Probability" of B given A. And in our case: P(B|A) = 1/4 So the probability of getting 2 blue marbles is:  

And we write it as: P(A and B) = P(A) × P(B|A)

"Probability of event A and event B equals the probability of event A times the probability of event B given event A"

Let's do the next example using only notation: Example: Drawing 2 Kings from a Deck Event A is drawing a King first, and Event B is drawing a King second. For the first card the chance of drawing a King is 4 out of 52 P(A) = 4/52

But after removing a King from the deck the 2nd card drawn is less likely to be a King (only 3 of the 51 cards left are Kings): P(B|A) = 3/51 And so: P(A and B) = P(A) x P(B|A) = (4/52) x (3/51) = 12/2652 = 1/221 So the chance of getting 2 Kings is 1 in 221, or about 0.5%
Finding Hidden Data Using Algebra we can also "change the subject" of the formula, like this: Start with: P(A and B) = P(A) x P(B|A) Swap sides: P(A) x P(B|A) = P(A and B) Divide by P(A): P(B|A) = P(A and B) / P(A) And we have another useful formula:

P(B|A) = P(A and B) / P(A)

"The probability of event B given event A equals the probability of event A and event B divided by the probability of event A"

Example: Ice Cream 70% of your friends like Chocolate, and 35% like Chocolate AND like Strawberry. What percent of those who like Chocolate also like Strawberry? P(Strawberry|Chocolate) = P(Chocolate and Strawberry) / P(Chocolate) 0.35 / 0.7 = 50% 50% of your friends who like Chocolate also like Strawberry Big Example: Soccer Game You are off to soccer, and want to be the Goalkeeper, but that depends who is the Coach today: with Coach Sam the probability of being Goalkeeper is 0.5 with Coach Alex the probability of being Goalkeeper is 0.3 Sam is Coach more often ... about 6 out of every 10 games (a probability of 0.6). So, what is the probability you will be a Goalkeeper today?  

Let's build a tree diagram. First we show the two possible coaches: Sam or Alex:

The probability of getting Sam is 0.6, so the probability of Alex must be 0.4 (together the probability is 1) Now, if you get Sam, there is 0.5 probability of being Goalie (and 0.5 of not being Goalie):

If you get Alex, there is 0.3 probability of being Goalie (and 0.7 not):

The tree diagram is complete, now let's calculate the overall probabilities. Remember that: P(A and B) = P(A) x P(B|A) Here is how to do it for the "Sam, Yes" branch:

(When we take the 0.6 chance of Sam being coach and include the 0.5 chance that Sam will let you be Goalkeeper we end up with a 0.3 chance.) But we are not done yet! We haven't included Alex as Coach:

A 0.4 chance of Alex as Coach, followed by a 0.3 chance gives 0.12 And the two "Yes" branches of the tree together make: 0.3 + 0.12 = 0.42 probability of being a Goalkeeper today (That is a 42% chance) Check One final step: complete the calculations and make sure they add to 1:

0.3 + 0.3 + 0.12 + 0.28 = 1 Yes, they add to 1, so that looks right.
Friends and Random Numbers Here is another quite different example of Conditional Probability. 4 friends (Alex, Blake, Chris and Dusty) each choose a random number between 1 and 5. What is the chance that any of them chose the same number? Let's add our friends one at a time ... First, what is the chance that Alex and Blake have the same number? Blake compares his number to Alex's number. There is a 1 in 5 chance of a match. As a tree diagram:

Note: "Yes" and "No" together make 1 (1/5 + 4/5 = 5/5 = 1) Now, let's include Chris ... But there are now two cases to consider: If Alex and Blake did match, then Chris has only one number to compare to. But if Alex and Blake did not match then Chris has two numbers to compare to. And we get this:

For the top line (Alex and Blake did match) we already have a match (a chance of 1/5). But for the "Alex and Blake did not match" line there is now a 2/5 chance of Chris matching (because Chris gets to match his number against both Alex and Blake). And we can work out the combined chance by multiplying the chances it took to get there: Following the "No, Yes" path ... there is a 4/5 chance of No, followed by a 2/5 chance of Yes: (4/5) × (2/5) = 8/25 Following the "No, No" path ... there is a 4/5 chance of No, followed by a 3/5 chance of No: (4/5) × (3/5) = 12/25 Also notice that when you add all chances together you still get 1 (a good check that we haven't made a mistake): (5/25) + (8/25) + (12/25) = 25/25 = 1 Now what happens when we include Dusty? It is the same idea, just more of it:

OK, that is all 4 friends, and the "Yes" chances together make 101/125: Answer: 101/125 But notice something interesting ... if we had followed the "No" path we could have skipped all the other calculations and made our life easier:

The chances of not matching are: (4/5) × (3/5) × (2/5) = 24/125 So the chances of matching are: 1 - (24/125) = 101/125 (And we didn't really need a tree diagram for that!) And that is a popular trick in probability: It is often easier to work out the "No" case (This idea is shown in more detail at Shared Birthdays.)

TREE DIAGRAMS Calculating probabilities can be hard, sometimes you add them, sometimes you multiply them, and often it is hard to figure out what to do ... tree diagrams to the rescue! Here is a tree diagram for the toss of a coin:

There are two "branches" (Heads and Tails) The probability of each branch is written on the branch The outcome is written at the end of the branch  

We can extend the tree diagram to two tosses of a coin:

How do you calculate the overall probabilities? You multiply probabilities along the branches You add probabilities down columns  

Now we can see such things as: The probability of "Head, Head" is 0.5×0.5 = 0.25 All probabilities add to 1.0 (which is always a good check) The probability of getting at least one Head from two tosses is 0.25+0.25+0.25 = 0.75 ... and more    
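The two rules above (multiply along the branches, add down the columns) can be sketched in a few lines of Python. This is only an illustration: the names `branch`, `paths` and `at_least_one_head` are made up for this sketch, and exact fractions are used so the check that everything adds to 1 is exact.

```python
from fractions import Fraction
from itertools import product

# Each toss has two branches, Heads and Tails, each with probability 1/2.
branch = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Multiply along the branches to get the probability of each two-toss path.
paths = {a + b: branch[a] * branch[b] for a, b in product(branch, repeat=2)}

# Add down the column for every path that contains at least one Head.
at_least_one_head = sum(p for path, p in paths.items() if "H" in path)

print(paths["HH"])          # 1/4
print(sum(paths.values()))  # 1
print(at_least_one_head)    # 3/4
```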

That was a simple example using independent events (each toss of a coin is independent of the previous toss), but tree diagrams are really wonderful for figuring out dependent events (where an event depends on what happens in the previous event) like this example: Example: Soccer Game You are off to soccer, and love being the Goalkeeper, but that depends on who is the Coach today: with Coach Sam the probability of being Goalkeeper is 0.5 with Coach Alex the probability of being Goalkeeper is 0.3 Sam is Coach more often ... about 6 out of every 10 games (a probability of 0.6). So, what is the probability you will be a Goalkeeper today?

Let's build the tree diagram. First we show the two possible coaches: Sam or Alex:

The probability of getting Sam is 0.6, so the probability of Alex must be 0.4 (together the probability is 1) Now, if you get Sam, there is 0.5 probability of being Goalie (and 0.5 of not being Goalie):

If you get Alex, there is 0.3 probability of being Goalie (and 0.7 not):

The tree diagram is complete, now let's calculate the overall probabilities. This is done by multiplying each probability along the "branches" of the tree. Here is how to do it for the "Sam, Yes" branch:

(When we take the 0.6 chance of Sam being coach and include the 0.5 chance that Sam will let you be Goalkeeper we end up with a 0.3 chance.) But we are not done yet! We haven't included Alex as Coach:

A 0.4 chance of Alex as Coach, followed by a 0.3 chance, gives 0.12. Now we add the column: 0.3 + 0.12 = 0.42 probability of being a Goalkeeper today (That is a 42% chance) Check One final step: complete the calculations and make sure they add to 1:

0.3 + 0.3 + 0.12 + 0.28 = 1 Yes, it all adds up. Conclusion So there you go, when in doubt draw a tree diagram, multiply along the branches and add the columns. Make sure all probabilities add to 1 and you are good to go.
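The soccer tree can be written out as a small Python sketch. The dictionary names (`p_coach`, `p_goalie_given_coach`, `leaves`) are invented for this illustration; the numbers are the ones from the example.

```python
# Probability of each coach, and of being Goalkeeper under each coach.
p_coach = {"Sam": 0.6, "Alex": 0.4}
p_goalie_given_coach = {"Sam": 0.5, "Alex": 0.3}

# Multiply along each branch to get the four leaves of the tree.
leaves = {}
for coach, pc in p_coach.items():
    pg = p_goalie_given_coach[coach]
    leaves[(coach, "Yes")] = pc * pg
    leaves[(coach, "No")] = pc * (1 - pg)

# Add the "Yes" column.
p_goalkeeper = sum(p for (coach, answer), p in leaves.items() if answer == "Yes")

print(round(p_goalkeeper, 2))           # 0.42
print(round(sum(leaves.values()), 2))   # 1.0
```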

MUTUALLY EXCLUSIVE EVENTS Mutually Exclusive: can't happen at the same time. Examples: Turning left and turning right are Mutually Exclusive (you can't do both at the same time) Tossing a coin: Heads and Tails are Mutually Exclusive Cards: Kings and Aces are Mutually Exclusive What is not Mutually Exclusive: Turning left and scratching your head can happen at the same time Kings and Hearts, because you can have a King of Hearts! Like here:   

 

Aces and Kings are Mutually Exclusive (can't be both). Hearts and Kings are not Mutually Exclusive (can be both). Probability Let's look at the probabilities of Mutually Exclusive events. But first, a definition: Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes) Example: there are 4 Kings in a deck of 52 cards. What is the probability of picking a King? Number of ways it can happen: 4 (there are 4 Kings) Total number of outcomes: 52 (there are 52 cards in total) So the probability = 4/52 = 1/13 Mutually Exclusive When two events (call them "A" and "B") are Mutually Exclusive it is impossible for them to happen together: P(A and B) = 0 "The probability of A and B together equals 0 (impossible)"

But the probability of A or B is the sum of the individual probabilities: P(A or B) = P(A) + P(B) "The probability of A or B equals the probability of A plus the probability of B"

 

 

 

Example: A Deck of Cards In a Deck of 52 Cards: the probability of a King is 1/13, so P(King)=1/13 the probability of an Ace is also 1/13, so P(Ace)=1/13 When we combine those two Events: The probability of a card being a King and an Ace is 0 (Impossible) The probability of a card being a King or an Ace is (1/13) + (1/13) = 2/13 Which is written like this: P(King and Ace) = 0 P(King or Ace) = (1/13) + (1/13) = 2/13 Special Notation Instead of "and" you will often see the symbol ∩ (which is the "Intersection" symbol used in Venn Diagrams) Instead of "or" you will often see the symbol ∪ (the "Union" symbol) Example: Scoring Goals If the probability of: scoring no goals (Event "A") is 20% scoring exactly 1 goal (Event "B") is 15% Then: The probability of scoring no goals and 1 goal is 0 (Impossible)



The probability of scoring no goals or 1 goal is 20% + 15% = 35% Which is written: P(A ∩ B) = 0 P(A ∪ B) = 20% + 15% = 35% Remembering To help you remember, think: "Or has more ... than And" ∪ is like a cup which holds more than ∩ Not Mutually Exclusive Now let's see what happens when events are not Mutually Exclusive. Example: Hearts and Kings

Hearts and Kings together is only the King of Hearts: But Hearts or Kings is: all the Hearts (13 of them) all the Kings (4 of them) But that counts the King of Hearts twice! So we correct our answer, by subtracting the extra "and" part:  

16 Cards = 13 Hearts + 4 Kings - the 1 extra King of Hearts Count them to make sure this works! As a formula this is: P(A or B) = P(A) + P(B) - P(A and B) "The probability of A or B equals the probability of A plus the probability of B minus the probability of A and B"

Here is the same formula, but using ∪ and ∩: P(A ∪ B) = P(A) + P(B) - P(A ∩ B) A Final Example 16 people study French, 21 study Spanish and there are 30 altogether. Work out the probabilities! This is definitely a case of not Mutually Exclusive (you can study French AND Spanish). Let's say b is how many study both languages: people studying French Only must be 16-b people studying Spanish Only must be 21-b And we get:

(16-b) + b + (21-b) = 30 37 - b = 30 b=7 And we can put in the correct numbers:

So we know all this now: P(French) = 16/30 P(Spanish) = 21/30 P(French Only) = 9/30 P(Spanish Only) = 14/30 P(French or Spanish) = 30/30 = 1 P(French and Spanish) = 7/30 Lastly, let's check with our formula: P(A or B) = P(A) + P(B) - P(A and B) Put the values in: 30/30 = 16/30 + 21/30 – 7/30 Yes, it works!      

Summary: Mutually Exclusive A and B together is impossible: P(A and B) = 0 A or B is the sum of A and B: P(A or B) = P(A) + P(B) Not Mutually Exclusive A or B is the sum of A and B minus A and B: P(A or B) = P(A) + P(B) - P(A and B)  
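Both rules in the summary can be checked by simply counting cards. The sketch below builds a 52-card deck and counts; the helper names (`prob`, `is_king`, `is_ace`, `is_heart`) are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def prob(event):
    """Number of ways it can happen / total number of outcomes."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_king(card):
    return card[0] == "K"


def is_ace(card):
    return card[0] == "A"


def is_heart(card):
    return card[1] == "Hearts"


# Mutually exclusive: no card is both a King and an Ace.
print(prob(lambda c: is_king(c) and is_ace(c)))  # 0
print(prob(lambda c: is_king(c) or is_ace(c)))   # 2/13

# Not mutually exclusive: the King of Hearts must not be counted twice.
p_heart_or_king = prob(lambda c: is_heart(c) or is_king(c))
p_heart_and_king = prob(lambda c: is_heart(c) and is_king(c))
print(p_heart_or_king)                           # 4/13 (16 cards out of 52)
```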



FALSE POSITIVE & FALSE NEGATIVE Test Says "Yes" ... or does it? When you have a test that can say "Yes" or "No" (such as a medical test), you have to think: It could be wrong when it says "Yes". It could be wrong when it says "No". Wrong?

It is like being told you did something when you didn't! Or you didn't do it when you really did. There are special names for this, called "False Positive" and "False Negative":

                     They say you did     They say you didn't
You really did       They are right!      "False Negative"
You really didn't    "False Positive"     They are right!

Here are some examples of "false positives" and "false negatives":
Airport Security: a "false positive" is when ordinary items such as keys or coins get mistaken for weapons (machine goes "beep")
Quality Control: a "false positive" is when a good quality item gets rejected, and a "false negative" is when a poor quality item gets accepted
Antivirus software: a "false positive" is when a normal file is thought to be a virus
Medical screening: low-cost tests given to a large group can give many false positives (saying you have a disease when you don't), and then ask you to get more accurate tests.



 

But many people don't understand the true numbers behind "Yes" or "No", like in this example: Example: Allergy or Not? Hunter says she is itchy. There is a test for Allergy to Cats, but this test is not always right: For people that really do have the allergy, the test says "Yes" 80% of the time For people that do not have the allergy, the test says "Yes" 10% of the time ("false positive") Here it is in a table:

                 Test says "Yes"          Test says "No"
Have allergy     80%                      20% "False Negative"
Don't have it    10% "False Positive"     90%

Question: If 1% of the population have the allergy, and Hunter's test says "Yes", what are the chances that Hunter really has the allergy? Do you think 75%? Or maybe 50%? A test similar to this was given to Doctors and most guessed around 75% ... but they were very wrong! (Source: "Probabilistic reasoning in clinical medicine: Problems and opportunities" by David M. Eddy 1982, which this example is based on) There are two good ways to work this out: "Imagine a 1000" and "Tree Diagrams". Try Imagining A Thousand People When trying to understand questions like this, just imagine a large group (say 1000) and play with the numbers: Of 1000 people, only 10 really have the allergy (1% of 1000 is 10) The test is 80% right for people who have the allergy, so it will get 8 of those 10 right. But 990 do not have the allergy, and the test will say "Yes" to 10% of them, which is 99 people it says "Yes" to wrongly (false positive) So out of 1000 people the test says "Yes" to (8+99) = 107 people As a table:

                 1% have it    Test says "Yes"    Test says "No"
Have allergy     10            8                  2
Don't have it    990           99                 891
Total            1000          107                893

So 107 people get a "Yes" but only 8 of those really have the allergy:



8 / 107 = about 7% So, even though Hunter's test said "Yes", it is still only 7% likely that Hunter has a Cat Allergy. As A Tree Drawing a tree diagram can really help:

First of all, let's check that all the percentages add up: 0.8% + 0.2% + 9.9% + 89.1% = 100% (good!) And the two "Yes" answers add up to 0.8% + 9.9% = 10.7%, but only 0.8% are correct. 0.8/10.7 = about 7% (same answer as above)

Conclusion When dealing with false positives and false negatives (or other tricky probability questions) it pays to: Imagine you have 1,000 (of whatever) Or make a tree diagram
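The allergy example can also be worked directly from the three percentages, which is the same arithmetic as the tree diagram. In this sketch the variable names (`prevalence`, `sensitivity`, `false_positive_rate`) are labels chosen here for illustration, not terms used in the text above.

```python
prevalence = 0.01           # 1% of the population have the allergy
sensitivity = 0.80          # test says "Yes" for 80% of people who have it
false_positive_rate = 0.10  # test says "Yes" for 10% of people who don't

# Add the two "Yes" branches of the tree.
p_yes = prevalence * sensitivity + (1 - prevalence) * false_positive_rate

# Of all the "Yes" answers, what share really have the allergy?
p_allergy_given_yes = (prevalence * sensitivity) / p_yes

print(round(p_yes, 3))                # 0.107
print(round(p_allergy_given_yes, 3))  # 0.075, the "about 7%" found above
```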

SHARED BIRTHDAYS This is a great puzzle, and you get to learn a lot about probability along the way ... There are 30 people in a room ... what is the chance that any two of them celebrate their birthday on the same day? Assume 365 days in a year. Some people think "there are 30 people, and 365 days, so 30/365 sounds about right, and 30/365 = 0.08..." But no! The probability is much higher. It is actually likely there are people who share a birthday in that room. Because you should compare everyone to everyone else. And with 30 people that is 435 comparisons. But you also have to be careful not to over-count the chances. I will show you how to do it ... starting with a smaller example: Friends and Random Numbers 4 friends (Alex, Billy, Chris and Dusty) each choose a random number between 1 and 5. What is the chance that any of them chose the same number? We will add our friends one at a time ... First, what is the chance that Alex and Billy have the same number? Billy compares his number to Alex's number. There is a 1 in 5 chance of a match. As a tree diagram:

Note: "Yes" and "No" together make 1 (1/5 + 4/5 = 5/5 = 1) Now, let's include Chris ... But there are now two cases to consider (called "Conditional Probability"): If Alex and Billy did match, then Chris has only one number to compare to. But if Alex and Billy did not match then Chris has two numbers to compare to. And we get this:

For the top line (Alex and Billy did match) we already have a match (a chance of 1/5). But for the "Alex and Billy did not match" there is a 2/5 chance of Chris matching (against both Alex and Billy). And we can work out the combined chance by multiplying the chances it took to get there: Following the "No, Yes" path ... there is a 4/5 chance of No, followed by a 2/5 chance of Yes: (4/5) × (2/5) = 8/25 Following the "No, No" path ... there is a 4/5 chance of No, followed by a 3/5 chance of No: (4/5) × (3/5) = 12/25

Also notice that adding all chances together is 1 (a good check that we haven't made a mistake): (5/25) + (8/25) + (12/25) = 25/25 = 1 Now what happens when we include Dusty? It is the same idea, just more of it:

OK, that is all 4 friends, and the "Yes" chances together make 101/125: Answer: 101/125 But notice something interesting ... if we had followed the "No" path we could have skipped all the other calculations and made our life easier:

The chances of not matching are: (4/5) × (3/5) × (2/5) = 24/125 So the chances of matching are: 1 - (24/125) = 101/125 (And we didn't really need a tree diagram for that!) And that is a popular trick in probability: It is often easier to work out the "No" case
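The "No" shortcut is easy to check against a brute-force count of every possible set of choices (5 × 5 × 5 × 5 = 625 of them). This is a small sketch with made-up variable names, using exact fractions.

```python
from fractions import Fraction
from itertools import product

friends, numbers = 4, 5

# Shortcut: multiply along the "No" branches, then subtract from 1.
p_no_match = Fraction(1)
for already_chosen in range(friends):
    p_no_match *= Fraction(numbers - already_chosen, numbers)
p_match = 1 - p_no_match

# Brute force: list every outcome and count those with a repeated number.
outcomes = list(product(range(1, numbers + 1), repeat=friends))
with_match = sum(1 for outcome in outcomes if len(set(outcome)) < friends)
brute_force = Fraction(with_match, len(outcomes))

print(p_no_match)   # 24/125
print(p_match)      # 101/125
print(brute_force)  # 101/125
```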

    

Example: what are the chances that with 6 people any of them celebrate their Birthday in the same month? (Assume equal months) The "no match" case for: 2 people is 11/12 3 people is (11/12) × (10/12) 4 people is (11/12) × (10/12) × (9/12) 5 people is (11/12) × (10/12) × (9/12) × (8/12) 6 people is (11/12) × (10/12) × (9/12) × (8/12) × (7/12) So the chance of not matching is: (11/12) × (10/12) × (9/12) × (8/12) × (7/12) = 0.22... Flip that around and we get the chance of matching: 1 - 0.22... = 0.78... So, there is a 78% chance of any of them celebrating their Birthday in the same month And now we can try calculating the "Shared Birthday" question we started with: There are 30 people in a room ... what is the chance that any two of them celebrate their birthday on the same day? Assume 365 days in a year. It is just like the previous example! But bigger and more numbers:

The chance of not matching: 364/365 × 363/365 × 362/365 × ... × 336/365 = 0.294... (I did that calculation in a spreadsheet, but there are also mathematical shortcuts) And the probability of matching is 1- 0.294... : The probability of sharing a birthday = 1 - 0.294... = 0.706... Or a 70.6% chance, which is likely! In fact the probability for 23 people is about 50%. And for 57 people it is 99% (almost certain!) So, next time you are in a room with a group of people why not find out if there are any shared birthdays? Footnote: In real life birthdays are not evenly spread out ... more babies are born in Spring. Also Hospitals prefer to work on weekdays, not weekends, so there are more births early in the week. And then there are leap years. But you get the idea.
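Instead of a spreadsheet, the same product can be computed with a short loop. The function name `prob_shared_birthday` is invented for this sketch, and it makes the same assumption as the text: every day (or month) is equally likely.

```python
def prob_shared_birthday(people, days=365):
    """Chance that at least two of `people` share a day, via the "No" case."""
    p_no_match = 1.0
    for already_taken in range(people):
        p_no_match *= (days - already_taken) / days
    return 1 - p_no_match


for group_size in (23, 30, 57):
    print(group_size, round(prob_shared_birthday(group_size), 3))

# The "same month" example: 6 people, 12 equally likely months.
print(round(prob_shared_birthday(6, days=12), 2))  # 0.78
```

The loop prints values close to the 50%, 70.6% and 99% quoted above.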

COMBINATION & PERMUTATION What's the Difference? In English we use the word "combination" loosely, without thinking if the order of things is important. In other words: "My fruit salad is a combination of apples, grapes and bananas" We don't care what order the fruits are in, they could also be "bananas, grapes and apples" or "grapes, apples and bananas", it's the same fruit salad. "The combination to the safe was 472". Now we do care about the order. "724" would not work, nor would "247". It has to be exactly 4-7-2.

So, in Mathematics we use more precise language: If the order doesn't matter, it is a Combination. If the order does matter it is a Permutation. So, we should really call this a "Permutation Lock"! In other words: A Permutation is an ordered Combination. To help you to remember, think "Permutation ... Position" Permutations There are basically two types of permutation: Repetition is Allowed: such as the lock above. It could be "333". No Repetition: for example the first three people in a running race. You can't be first and second.

1. Permutations with Repetition These are the easiest to calculate. When you have n things to choose from ... you have n choices each time! When choosing r of them, the permutations are: n × n × ... (r times) (In other words, there are n possibilities for the first choice, THEN there are n possibilities for the second choice, and so on, multiplying each time.)

Which is easier to write down using an exponent of r: n × n × ... (r times) = n^r Example: in the lock above, there are 10 numbers to choose from (0, 1, ..., 9) and you choose 3 of them: 10 × 10 × ... (3 times) = 10^3 = 1,000 permutations So, the formula is simply: n^r where n is the number of things to choose from, and you choose r of them (Repetition allowed, order matters) 2. Permutations without Repetition In this case, you have to reduce the number of available choices each time. For example, what order could 16 pool balls be in? After choosing, say, number "14" you can't choose it again. So, your first choice would have 16 possibilities, and your next choice would then have 15 possibilities, then 14, 13, etc. And the total permutations would be: 16 × 15 × 14 × 13 × ... = 20,922,789,888,000 But maybe you don't want to choose them all, just 3 of them, so that would be only: 16 × 15 × 14 = 3,360 In other words, there are 3,360 different ways that 3 pool balls could be selected out of 16 balls. But how do we write that mathematically? Answer: we use the "factorial function" The factorial function (symbol: !) just means to multiply a series of descending natural numbers. Examples: 4! = 4 × 3 × 2 × 1 = 24 7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5,040 1! = 1
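These counts can be confirmed by listing the arrangements with Python's `itertools`. This is just a checking sketch; the names `digits`, `balls`, `lock_codes` and `ordered_triples` are made up here.

```python
import math
from itertools import permutations, product

digits = range(10)     # the 10 numbers on the lock: 0 to 9
balls = range(1, 17)   # pool balls numbered 1 to 16

# Repetition allowed: n choices every time, so n ** r arrangements.
lock_codes = sum(1 for _ in product(digits, repeat=3))
print(lock_codes, 10 ** 3)            # 1000 1000

# No repetition: one fewer choice each time, 16 * 15 * 14.
ordered_triples = sum(1 for _ in permutations(balls, 3))
print(ordered_triples, 16 * 15 * 14)  # 3360 3360

# Ordering all 16 balls is 16 factorial.
print(math.factorial(16))             # 20922789888000
```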

Note: it is generally agreed that 0! = 1. It may seem funny that multiplying no numbers together gets you 1, but it helps simplify a lot of equations. So, if you wanted to select all of the billiard balls the permutations would be: 16! = 20,922,789,888,000 But if you wanted to select just 3, then you have to stop the multiplying after 14. How do you do that? There is a neat trick ... you divide by 13! ... (16 × 15 × 14 × 13 × 12 ...) / (13 × 12 ...) = 16 × 15 × 14 = 3,360 Do you see? 16! / 13! = 16 × 15 × 14 The formula is written: n! / (n − r)!

where n is the number of things to choose from, and you choose r of them (No repetition, order matters) Examples: Our "order of 3 out of 16 pool balls" example would be: 16! / (16-3)! = 16! / 13! = 20,922,789,888,000 / 6,227,020,800 = 3,360 (which is just the same as: 16 × 15 × 14 = 3,360)

How many ways can first and second place be awarded to 10 people? 10! / (10-2)! = 10! / 8! = 3,628,800 / 40,320 = 90 (which is just the same as: 10 × 9 = 90)

Notation Instead of writing the whole formula, people use different notations such as these:

Example: P(10,2) = 90 Combinations There are also two types of combinations (remember the order does not matter now): Repetition is Allowed: such as coins in your pocket (5,5,5,10,10) No Repetition: such as lottery numbers (2,14,15,27,30,33)

1. Combinations with Repetition Actually, these are the hardest to explain, so I will come back to this later. 2. Combinations without Repetition This is how lotteries work. The numbers are drawn one at a time, and if you have the lucky numbers (no matter what order) you win! The easiest way to explain it is to: assume that the order does matter (i.e. permutations), then alter it so the order does not matter. Going back to our pool ball example, let us say that you just want to know which 3 pool balls were chosen, not the order. We already know that 3 out of 16 gave us 3,360 permutations. But many of those will be the same to us now, because we don't care what order! For example, let us say balls 1, 2 and 3 were chosen. These are the possibilities:

Order does matter     Order doesn't matter
1 2 3                 1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1

So, the permutations will have 6 times as many possibilities. In fact there is an easy way to work out how many ways "1 2 3" could be placed in order, and we have already talked about it. The answer is: 3! = 3 × 2 × 1 = 6 (Another example: 4 things can be placed in 4! = 4 × 3 × 2 × 1 = 24 different ways, try it for yourself!) So, all we need to do is adjust our permutations formula to reduce it by how many ways the objects could be in order (because we aren't interested in the order any more): n! / (r! × (n − r)!)

That formula is so important it is often just written in big parentheses like this:

where n is the number of things to choose from, and you choose r of them (No repetition, order doesn't matter) It is often called "n choose r" (such as "16 choose 3") And is also known as the "Binomial Coefficient" Notation As well as the "big parentheses", people also use these notations:

Example So, our pool ball example (now without order) is: 16! / (3!(16-3)!) = 16! / (3! × 13!) = 20,922,789,888,000 / (6 × 6,227,020,800) = 560

Or you could do it this way: (16 × 15 × 14) / (3 × 2 × 1) = 3360 / 6 = 560

So remember, do the permutation, then reduce by a further "r!" ... or better still ... Remember the Formula! It is interesting to also note how this formula is nice and symmetrical:

In other words choosing 3 balls out of 16, or choosing 13 balls out of 16 have the same number of combinations. 16! / (3!(16-3)!) = 16! / (13!(16-13)!) = 16! / (3! × 13!) = 560 Pascal's Triangle You can also use Pascal's Triangle to find the values. Go down to row "n" (the top row is 0), and then along "r" places and the value there is your answer. Here is an extract showing row 16:

1   14    91   364  ...
1   15   105   455  1365  ...
1   16   120   560  1820  4368  ...
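Both routes to 560 (the formula and Pascal's Triangle) can be reproduced in a few lines. The helper `pascal_row` is written for this sketch; `math.comb` and `math.perm` are standard-library functions available from Python 3.8.

```python
import math


def pascal_row(n):
    """Build row n of Pascal's Triangle (the top row is row 0)."""
    row = [1]
    for _ in range(n):
        # Each new entry is the sum of the two entries above it.
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row


row16 = pascal_row(16)
print(row16[:6])          # [1, 16, 120, 560, 1820, 4368]
print(math.comb(16, 3))   # 560
print(math.comb(16, 13))  # 560, the symmetry noted above

# "Do the permutation, then reduce by a further r!"
print(math.perm(16, 3) // math.factorial(3))  # 560
```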

1. Combinations with Repetition OK, now we can tackle this one ... Let us say there are five flavors of ice cream: banana, chocolate, lemon, strawberry and vanilla. You can have three scoops. How many variations will there be? Let's use letters for the flavors: {b, c, l, s, v}. Example selections would be {c, c, c} (3 scoops of chocolate) {b, l, v} (one each of banana, lemon and vanilla)

{b, v, v} (one of banana, two of vanilla) (And just to be clear: There are n=5 things to choose from, and you choose r=3 of them. Order does not matter, and you can repeat!) Now, I can't describe directly to you how to calculate this, but I can show you a special technique that lets you work it out. Think about the ice cream being in boxes, you could say "move past the first box, then take 3 scoops, then move along 3 more boxes to the end" and you will have 3 scoops of chocolate! So, it is like you are ordering a robot to get your ice cream, but it doesn't change anything, you still get what you want. Now you could write this down as → ○ ○ ○ → → → (arrow means move, circle means scoop). In fact the three examples above would be written like this:

{c, c, c} (3 scoops of chocolate):                 → ○ ○ ○ → → →
{b, l, v} (one each of banana, lemon and vanilla): ○ → → ○ → → ○
{b, v, v} (one of banana, two of vanilla):         ○ → → → → ○ ○

OK, so instead of worrying about different flavors, we have a simpler problem to solve: "how many different ways can you arrange arrows and circles" Notice that there are always 3 circles (3 scoops of ice cream) and 4 arrows (you need to move 4 times to go from the 1st to 5th container). So (being general here) there are r + (n-1) positions, and we want to choose r of them to have circles. This is like saying "we have r + (n-1) pool balls and want to choose r of them". In other words it is now like the pool balls problem, but with slightly changed numbers. And you would write it like this: (r + n − 1)! / (r! × (n − 1)!)

where n is the number of things to choose from, and you choose r of them (Repetition allowed, order doesn't matter) Interestingly, we could have looked at the arrows instead of the circles, and we would have then been saying "we have r + (n-1) positions and want to choose (n-1) of them to have arrows", and the answer would be the same ...

So, what about our example, what is the answer? (5+3-1)! / (3!(5-1)!) = 7! / (3! × 4!) = 5040 / (6 × 24) = 35 In Conclusion Phew, that was a lot to absorb, so maybe you could read it again to be sure! But knowing how these formulas work is only half the battle. Figuring out how to interpret a real world situation can be quite hard. But at least now you know how to calculate all 4 variations of "Order does/does not matter" and "Repeats are/are not allowed".
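The ice cream answer can be double-checked by listing every selection. The sketch below uses `itertools.combinations_with_replacement` for the listing and `math.comb` for the arrows-and-circles count; the variable names are chosen here for illustration.

```python
import math
from itertools import combinations_with_replacement

flavors = ["b", "c", "l", "s", "v"]  # banana, chocolate, lemon, strawberry, vanilla
scoops = 3

# List every selection where order doesn't matter and repeats are allowed.
selections = list(combinations_with_replacement(flavors, scoops))
print(len(selections))  # 35

# Arrows and circles: r + (n-1) positions, choose r of them for the circles ...
n, r = len(flavors), scoops
print(math.comb(r + n - 1, r))      # 35
# ... or choose (n-1) of them for the arrows. The answer is the same.
print(math.comb(r + n - 1, n - 1))  # 35
```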

QUINCUNX A Quincunx or "Galton Board" (named after Sir Francis Galton) is a triangular array of pegs. Balls are dropped onto the top peg and then bounce their way down to the bottom where they are collected in little bins. Each time a ball hits one of the pegs, it bounces either left or right. But this is interesting: if there is an equal chance of bouncing left or right, then the balls collecting in the bins form the classic "bell-shaped" curve of the normal distribution.

 

(If the probabilities are not even, you still get a nice "skewed" version of the normal distribution.) Formula You can actually calculate the probabilities! Think about this: a ball would end up in the bin k places from the right if it has taken k left turns. In this example, the ball has taken two bounces to the left, and all other bounces were to the right. It ended up in the bin two places from the right. In the general case, if the quincunx has n rows then a possible path for the ball would be k bounces to the left and (n-k) bounces to the right. And if the probability of bouncing to the left is p then we can calculate the probability of a certain path like this:



  



The ball bounces k times to the left with a probability of p: p^k And the other bounces (n-k) have the opposite probability of: (1-p)^(n-k) So, the probability of following such a path is p^k (1-p)^(n-k) But there could be many such paths! For example the left turns could be the 1st and 2nd, or 1st and 3rd, or 2nd and 7th, etc. You could list all such paths (LLRRR.., LRLRR..., LRRL...), but there are two easier ways. How Many Paths You can look at Pascal's Triangle. In fact, the Quincunx is just like Pascal's Triangle, with pegs instead of numbers. The number on each peg shows you how many different paths can be taken to get to that peg. Amazing but true. Or you can use this formula from the subject of Combinations: n! / (k!(n-k)!) This is commonly called "n choose k" and written C(n,k). It is the calculation of the number of ways of distributing k things in a sequence of n.



(The "!" means "factorial", for example 4! = 1×2×3×4 = 24) Putting it all together, the resulting formula is: n! / (k!(n-k)!) × p^k (1-p)^(n-k)



(Which, by the way, is the formula for the binomial distribution.) 



Example: For 10 rows (n=10) and probability of bouncing left of 0.5 (p=0.5), we can calculate the probability of one path into the 3rd bin from the right (k=3) as: p^k (1-p)^(n-k) = 0.5^3 × 0.5^7 = 0.0009765625

also: 10! / (3! × 7!) = 120

(This means there are 120 different paths that would end up with the ball in the 3rd bin from the right.)

So we get: 120 × 0.0009765625 = 0.1171875

In fact we can build a whole table for rows=10 and probability=0.5 like this:

From Right:           10     9      8      7      6      5      4      3      2      1      0
Probability:          0.001  0.010  0.044  0.117  0.205  0.246  0.205  0.117  0.044  0.010  0.001
Example: 100 balls    0      1      4      12     21     24     21     12     4      1      0



Now, of course, this is a random thing so your results may vary from this ideal situation. Another Example: If the probability were 0.8 then the table would look like this:

From Right:           10     9      8      7      6      5      4      3      2      1      0
Probability:          0.107  0.268  0.302  0.201  0.088  0.026  0.006  0.001  0.000  0.000  0.000
Example: 100 balls    11     26     30     20     9      3      1      0      0      0      0
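Both tables can be rebuilt from the binomial formula. In this sketch the function names `quincunx_probability` and `quincunx_table` are invented for illustration, and p is taken to be the probability of bouncing left, as in the text.

```python
import math


def quincunx_probability(n, k, p):
    """Chance of landing k bins from the right: C(n, k) * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def quincunx_table(n, p):
    # Listed from k = n down to k = 0, matching the "From Right" rows above.
    return [round(quincunx_probability(n, k, p), 3) for k in range(n, -1, -1)]


print(quincunx_probability(10, 3, 0.5))  # 0.1171875
print(quincunx_table(10, 0.5))
print(quincunx_table(10, 0.8))
```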



NORMAL DISTRIBUTION Data can be "distributed" (spread out) in different ways.

It can be spread out more on the left

... or more on the right

Or it can be all jumbled up. But there are many cases where the data tends to be around a central value with no bias left or right, and it gets close to a "Normal Distribution" like this:

A Normal Distribution The "Bell Curve" is a Normal Distribution. And the yellow histogram shows some data that follows it closely, but not perfectly (which is usual). It is often called a "Bell Curve" because it looks like a bell. Many things closely follow a Normal Distribution: heights of people size of things produced by machines errors in measurements blood pressure marks on a test We say the data is "normally distributed". The Normal Distribution has: mean = median = mode symmetry about the center 50% of values less than the mean and 50% greater than the mean Quincunx     

  

You can see a normal distribution being created by random chance! It is called the Quincunx and it is an amazing machine. Have a play with it! Standard Deviations The Standard Deviation is a measure of how spread out numbers are (read that page for details on how to calculate it). When you calculate the standard deviation of your data, you will find that (generally): 68% of values are within 1 standard deviation of the mean

95% are within 2 standard deviations

99.7% are within 3 standard deviations
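These three percentages can be checked with the standard library's `statistics.NormalDist` (Python 3.8 or later). The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(z):
    """Share of values within z standard deviations of the mean."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


for z in (1, 2, 3):
    print(z, f"{share_within(z):.1%}")
```

The printed values are close to the rounded 68%, 95% and 99.7% quoted above.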

Example: 95% of students at school are between 1.1m and 1.7m tall. Assuming this data is normally distributed can you calculate the mean and standard deviation? The mean is halfway between 1.1m and 1.7m: Mean = (1.1m + 1.7m) / 2 = 1.4m 95% is 2 standard deviations either side of the mean (a total of 4 standard deviations) so: 1 standard deviation = (1.7m-1.1m) / 4 = 0.6m / 4 = 0.15m And this is the result: It is good to know the standard deviation, because we can say that any value is: likely to be within 1 standard deviation (68 out of 100 will be) very likely to be within 2 standard deviations (95 out of 100 will be) almost certainly within 3 standard deviations (997 out of 1000 will be) Standard Scores The number of standard deviations from the mean is also called the "Standard Score", "sigma" or "z-score". Get used to those words! Example: In that same school one of your friends is 1.85m tall

You can see on the bell curve that 1.85m is 3 standard deviations from the mean of 1.4, so: Your friend's height has a "z-score" of 3.0

It is also possible to calculate how many standard deviations 1.85 is from the mean. How far is 1.85 from the mean?

It is 1.85 - 1.4 = 0.45m from the mean How many standard deviations is that? The standard deviation is 0.15m, so: 0.45m / 0.15m = 3 standard deviations So to convert a value to a Standard Score ("z-score"): first subtract the mean, then divide by the Standard Deviation And doing that is called "Standardizing":  

You can take any Normal Distribution and convert it to The Standard Normal Distribution. Example: Travel Time A survey of daily travel time had these results (in minutes): 26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34 The Mean is 38.8 minutes, and the Standard Deviation is 11.4 minutes (you can copy and paste the values into the Standard Deviation Calculator if you want). Convert the values to z-scores ("standard scores"). To convert 26: first subtract the mean: 26 - 38.8 = -12.8, then divide by the Standard Deviation: -12.8/11.4 = -1.12

So 26 is -1.12 Standard Deviations from the Mean. Here are the first three conversions:

Original Value    Calculation             Standard Score (z-score)
26                (26-38.8) / 11.4 =      -1.12
33                (33-38.8) / 11.4 =      -0.51
65                (65-38.8) / 11.4 =      +2.30
...               ...                     ...
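The rest of the conversions can be done in one pass. This sketch assumes the 11.4 quoted above is the population standard deviation, which is what `statistics.pstdev` computes; the variable names are chosen here for illustration.

```python
from statistics import mean, pstdev

travel_times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

mu = mean(travel_times)       # the Mean
sigma = pstdev(travel_times)  # population Standard Deviation (an assumption)

# Standardize: first subtract the mean, then divide by the standard deviation.
z_scores = [(x - mu) / sigma for x in travel_times]

print(round(mu, 1), round(sigma, 1))  # 38.8 11.4
for x, z in list(zip(travel_times, z_scores))[:3]:
    print(x, f"{z:+.2f}")             # 26 -1.12, then 33 -0.51, then 65 +2.30
```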

And here they are graphically:

You can calculate the rest of the z-scores yourself! Here is the formula for z-score that we have been using: z = (x − μ) / σ where z is the "z-score" (Standard Score), x is the value to be standardized, μ is the mean and σ is the standard deviation. Why Standardize ... ? It can help you make decisions about your data. Example: Professor Willoughby is marking a test. Here are the students' results (out of 60 points): 20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17 Most students didn't even get 30 out of 60, and most will fail. The test must have been really hard, so the Prof decides to Standardize all the scores and only fail people 1 standard deviation below the mean. The Mean is 23, and the Standard Deviation is 6.6, and these are the Standard Scores: -0.45, -1.21, 0.45, 1.36, -0.76, 0.76, 1.82, -1.36, 0.45, -0.15, -0.91 Only 2 students will fail (the ones who scored 15 and 14 on the test) It also makes life easier because we only need one table (the Standard Normal Distribution Table), rather than doing calculations individually for each value of mean and standard deviation. In More Detail Here is the Standard Normal Distribution with percentages for every half of a standard deviation, and cumulative percentages:

 

Example: Your score in a recent test was 0.5 standard deviations above the average. What percentage of people scored lower than you did?
Between 0 and 0.5 is 19.1%
Less than 0 is 50% (left half of the curve)
So the total less than you is: 50% + 19.1% = 69.1%
In theory 69.1% scored less than you did (but with real data the percentage may be different).

A Practical Example: Your company packages sugar in 1 kg bags. When you weigh a sample of bags you get these results: 1007g, 1032g, 1002g, 983g, 1004g, ... (a hundred measurements)
Mean = 1010g
Standard Deviation = 20g
Some values are less than 1000g ... can you fix that?
The normal distribution of your measurements looks like this:
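Both of these percentages can be checked numerically. The following is a small sketch using NormalDist from Python's standard library (available from Python 3.8). It assumes the data really follow a normal distribution with the stated mean and standard deviation, and the variable names are only illustrative.

```python
from statistics import NormalDist

# Test-score example: share of people expected below z = +0.5.
standard = NormalDist()  # mean 0, standard deviation 1
below_half_sd = standard.cdf(0.5)
print(f"scored lower than you: {below_half_sd:.1%}")  # 69.1%

# Sugar example: mean and standard deviation taken from the sample above.
bags = NormalDist(mu=1010, sigma=20)
underweight = bags.cdf(1000)  # share of bags expected to weigh under 1000g
print(f"bags under 1000g: {underweight:.1%}")  # 30.9%, about 31%
```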

31% of the bags are less than 1000g, which is cheating the customer!
Because it is a random thing we can't stop bags having less than 1000g, but we can reduce it a lot ...
if 1000g was at -3 standard deviations there would be only 0.1% (very small)
at -2.5 standard deviations we can calculate: below -3 is 0.1% and between -3 and -2.5 standard deviations is 0.5%, together that is 0.1% + 0.5% = 0.6% (a good choice I think)
So let us adjust the machine to have 1000g at -2.5 standard deviations from the mean. We could adjust it to:
increase the amount of sugar in each bag (this would change the mean), or
make it more accurate (this would reduce the standard deviation)
Let us try both:

 

Adjust the mean amount in each bag
The standard deviation is 20g, and we need 2.5 of them: 2.5 × 20g = 50g
So the machine should average 1050g, like this:

Adjust the accuracy of the machine
Or we can keep the same mean (of 1010g), but then we need 2.5 standard deviations to be equal to 10g: 10g / 2.5 = 4g
So the standard deviation should be 4g, like this:
(We hope the machine is that accurate!)
Or perhaps we could have some combination of better accuracy and slightly larger average size, I will leave that up to you!
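The two adjustments are the same relationship solved for different unknowns: 1000g should sit 2.5 standard deviations below the mean. Here is a rough sketch of that arithmetic, with hypothetical helper names (mean_needed, sd_needed, share_underweight) and the assumption that bag weights are normally distributed.

```python
from statistics import NormalDist

MIN_WEIGHT = 1000.0  # grams, the labelled bag weight
TARGET_Z = 2.5       # how many standard deviations below the mean 1000g should sit


def mean_needed(sd, min_weight=MIN_WEIGHT, z=TARGET_Z):
    """Mean that puts min_weight exactly z standard deviations below it."""
    return min_weight + z * sd


def sd_needed(mean, min_weight=MIN_WEIGHT, z=TARGET_Z):
    """Standard deviation that puts min_weight z standard deviations below the mean."""
    return (mean - min_weight) / z


def share_underweight(mean, sd, min_weight=MIN_WEIGHT):
    """Expected fraction of bags below min_weight under a normal model."""
    return NormalDist(mu=mean, sigma=sd).cdf(min_weight)


new_mean = mean_needed(sd=20)   # 1050.0
new_sd = sd_needed(mean=1010)   # 4.0

print(new_mean, f"{share_underweight(new_mean, 20):.1%}")
print(new_sd, f"{share_underweight(1010, new_sd):.1%}")
```

Either change alone brings the expected share of underweight bags down to about 0.6%. A combination can be explored by calling share_underweight with other mean and standard deviation pairs.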

SKEWED DATA 

Data can be "skewed", meaning it tends to have a long tail on one side or the other:

(Figure: three curves labelled Negative Skew, No Skew and Positive Skew.)



Negative Skew?
Why is it called negative skew? Because the long "tail" is on the negative side of the peak. People sometimes say it is "skewed to the left" (the long tail is on the left hand side). The mean is also on the left of the peak.

The Normal Distribution has No Skew
A Normal Distribution is not skewed. It is perfectly symmetrical. And the Mean is exactly at the peak.

Positive Skew
And positive skew is when the long tail is on the positive side of the peak, and some people say it is "skewed to the right". The mean is on the right of the peak value.



Example: Income Distribution
Here is some data I extracted from a recent Census. As you can see it is positively skewed ... in fact the tail continues way past $100,000.

 

Calculating Skewness
"Skewness" (the amount of skew) can be calculated. For example, you could use the SKEW() function in Excel or OpenOffice Calc.
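If a spreadsheet is not to hand, skewness can also be computed directly. The sketch below uses the adjusted sample-skewness formula, n / ((n - 1)(n - 2)) times the sum of ((x - mean) / s) cubed, where s is the sample standard deviation. As far as I know this is the formula Excel documents for SKEW(), but treat that match as an assumption. The function name sample_skewness is made up for illustration, and the data are the travel times from the earlier example.

```python
from statistics import mean, stdev


def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)**3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    if s == 0:
        raise ValueError("all values are equal, skewness is undefined")
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


# Travel times (minutes) from the z-score example earlier in these notes.
travel_times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

skew = sample_skewness(travel_times)
print(f"skewness = {skew:.2f}")
print("positive skew" if skew > 0 else "negative skew" if skew < 0 else "no skew")
```

A positive result means the long tail is on the right, a negative result means it is on the left, and a perfectly symmetrical data set gives zero.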
