CIVIL ENGINEERING STATISTICS
INTRODUCTION These are Mathematics marks for 30 students who are taking Test 1
12 , 23, 24, 45, 34, 48, 56, 63, 23, 44, 69, 78, 84, 95, 98, 67, 73, 69, 58, 70, 40, 88, 59, 47, 37, 15, 17, 36, 63, 38
How to interpret these marks?
~ Statistics is the science that deals with collecting, classifying, presenting, describing, analyzing and interpreting data to enable us to draw conclusions and make reasonable decisions.
~ Can be divided into 2 categories:
(a) Descriptive statistics
(b) Inferential statistics
Descriptive statistics ~ The activities of collecting, classifying, presenting and describing quantitative data.
Inferential statistics ~ The part dealing with the techniques and methods of interpreting the results obtained from descriptive statistics.
WHAT IS POPULATION?
~ Population is the entire (complete) collection of data whose properties are analyzed. It contains all the subjects of interest.
~ Can be of any size, and its items need not be uniform, but they must share at least one measurable feature.
WHAT IS SAMPLE?
~ A portion of the population selected for study.
~ A sample is any set of entities, cases, subjects, items or experimental units chosen from the population.
WHAT IS RANDOM SAMPLE?
~ A random sample is a sample selected in such a way that each element of the population has the same chance of being selected.
WHAT IS PARAMETER?
~ A parameter is a numerical measurement describing some characteristic of a population.
~ Eg: the population mean, variance
WHAT IS STATISTIC?
~ A statistic is a numerical measurement describing some characteristic of a sample.
~ Eg: the sample mean, variance
WHAT IS VARIABLE?
~ Any measured characteristic or attribute that differs for different elements.
~ For example, if the weight of 30 subjects were measured, then weight would be a variable.
~ Can be classified as quantitative or qualitative.
WHAT IS QUANTITATIVE VARIABLE?
~ The variable being studied is numeric.
~ Measured on an ordinal, interval, or ratio scale.
~ Eg: If the time it took them to respond were measured, then the variable would be quantitative.
WHAT IS QUALITATIVE VARIABLE?
~ The variable being studied is non-numeric.
~ Called "categorical variables".
~ Measured on a nominal scale.
~ Eg: gender, educational level, eye colour. If five-year-old students were asked to name their favourite colour, then the variable would be qualitative.
WHAT IS DATA?
~ A set of data is a collection of observations, measurements or information obtained.
~ Can be classified as quantitative or qualitative.
~ Can be presented in various ways.
WHAT IS QUANTITATIVE DATA?
~ Quantitative data refers to observations which can be measured numerically or counted.
~ Can be divided into discrete data and continuous data.
~ Eg: length, time, temperature and mass
WHAT IS QUALITATIVE DATA?
~ Qualitative data are not in numerical form but are instead assigned as attributes.
~ Eg: race, marital status, age, gender
WHAT IS DISCRETE DATA?
~ Discrete data is a set of data that can only take exact and countable values.
~ For example:
a) The number of students in a class.
b) The number of cars sold on any day at a car dealership.
c) The number of persons in a family.
WHAT IS CONTINUOUS DATA?
~ Continuous data is data that can take any value over a certain interval and can be measured to a certain degree of accuracy (correct to a certain number of decimal places).
~ For example:
a) The weight of students in a class.
b) The time taken to complete an examination.
c) The amount of soda in a 150 ml can.
d) The income of a family.
WHAT IS UNGROUPED DATA?
~ (a) Raw data
(b) Not in the form of intervals
(c) A frequency distribution that has been arranged in order
~ Example:
(i) 3, 5, 6, 2, 5, 2, 4, 6, 5
(ii)
Number of books   Frequency
0
1 3
2 7
3 4
2
WHAT IS GROUPED DATA?
~ The data can be grouped into class intervals before the frequency distribution is constructed.
~ The table constructed is called a frequency distribution table.
~ Example:
Height (cm)   Frequency
150-155       2
155-160       8
160-165       6
165-170       5
Determine whether the data obtained is discrete or continuous data.
(a) The number of books sold by a stationery shop.
(b) The time taken to travel from Kuala Terengganu to Batu Pahat.
(c) The number of A’s in SPM.
(d) The weight of FKAAS students.
(e) The diameter of twenty spheres.
• All data are to be considered as sample unless otherwise stated in the questions.
Example :
Type A
The number of male children in 20 families chosen at random is as follows.
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2
Type B
The above data is called raw data and it can be summarized as a frequency distribution as shown:
Number of male children   0  1  2  3  4  5
Frequency                 2  5  7  3  2  1
The data shown in this frequency distribution table is known as ungrouped data.
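The step from raw data to a frequency distribution can be sketched in Python. This is only an illustration: the function name `frequency_table` is an assumption, and the raw list is read as 20 single-digit values (so the leading "14" is taken as 1 and 4).

```python
from collections import Counter

# Raw data from the example: number of male children in 20 families.
raw = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4, 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]

def frequency_table(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

table = frequency_table(raw)
for value, freq in table:
    print(value, freq)
```

The frequencies sum to 20, the number of families, which is a quick check that no observation was lost.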
MEASURES OF LOCATION (CENTRAL TENDENCY)
MEAN
Given a set of data $x_1, x_2, x_3, \ldots, x_n$, the mean $\bar{x}$ is defined as
$\bar{x} = \frac{\text{sum of all observations}}{\text{number of observations}} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
For a set of data which can be represented in a frequency distribution table, the mean is given by
$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$
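Example : Find the mean of the following data
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2
Solution:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 4 + 2 + \cdots + 2}{20} = \frac{41}{20} = 2.05$
OR
x   0  1  2  3  4  5
f   2  5  7  3  2  1
$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{2(0) + 5(1) + 7(2) + 3(3) + 2(4) + 1(5)}{20} = \frac{41}{20} = 2.05$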
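Both forms of the mean can be checked with a short Python sketch. The function names are illustrative assumptions, and the data are the 20 values from the example read as single digits.

```python
def mean(values):
    """Mean of raw data: sum of observations / number of observations."""
    return sum(values) / len(values)

def mean_from_frequencies(table):
    """Mean from (x, f) pairs: sum(f * x) / sum(f)."""
    total_f = sum(f for _, f in table)
    total_fx = sum(x * f for x, f in table)
    return total_fx / total_f

raw = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4, 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]
table = [(0, 2), (1, 5), (2, 7), (3, 3), (4, 2), (5, 1)]

print(mean(raw))
print(mean_from_frequencies(table))
```

Both calls work from the same total, 41, divided by 20, so they agree.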
Example : To obtain grade A, Saleha must achieve an average of at least 75 marks in four tests. If her average mark for the first three tests is 70, calculate the lowest mark she must get in her fourth test in order to obtain grade A.
Solution:
Let the marks for the four tests be w, x, y, z.
Mean for w, x, y: 70, so w + x + y = 3(70) = 210.
Mean for w, x, y, z must be at least 75:
$\frac{3(70) + z}{4} \ge 75$
$210 + z \ge 300$
$z \ge 90$
So, the lowest mark she must get in her fourth test in order to obtain grade A is 90.
MEDIAN
The median is the middle value of a set of data that is arranged in order of magnitude.
Let $x_{(k)}$ be the $k$th observation in a set of data which has been arranged in ascending or descending order. For example, consider the following set of numbers
9 2 7 10 5 16
After arrangement, it becomes
2 5 7 9 10 16
Thus, the median is 8, halfway between $x_{(3)} = 7$ and $x_{(4)} = 9$.
The median of a set of data $x_1, x_2, \ldots, x_n$ is denoted by $x_m$ and may be calculated as:
$x_m = x_{\left(\frac{n+1}{2}\right)}$, if n is odd
$x_m = \frac{1}{2}\left[x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}\right]$, if n is even
Example :
Find the median for the following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.56, 2.7, 5.48, 8.61, 4.35, 6.22
Solution:
a) The data arranged in ascending order: 17, 20, 21, 24, 28, 32, 36
Since n = 7, which is odd, the median is
$x_m = x_{\left(\frac{7+1}{2}\right)} = x_{(4)} = 24$
b) The data arranged in ascending order: 2.7, 3.56, 4.35, 5.48, 6.22, 8.61
Since n = 6, which is even, the median is
$x_m = \frac{1}{2}\left[x_{\left(\frac{6}{2}\right)} + x_{\left(\frac{6}{2}+1\right)}\right] = \frac{1}{2}\left[x_{(3)} + x_{(4)}\right] = \frac{1}{2}(4.35 + 5.48) = 4.915$
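The odd and even cases of the median rule can be sketched in Python as follows. The function name `median` is an assumption for illustration; the 1-based positions in the formula are shifted by one because Python lists are 0-based.

```python
def median(values):
    """Median by position: middle value if n is odd, mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Position (n + 1) / 2 in 1-based counting.
        return ordered[(n + 1) // 2 - 1]
    # Positions n / 2 and n / 2 + 1 in 1-based counting.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median([21, 24, 17, 28, 36, 20, 32]))
print(median([3.56, 2.7, 5.48, 8.61, 4.35, 6.22]))
```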
MODE
• The mode of a set of data is the value that occurs most frequently.
• The mode may not be unique, or there may be no mode at all.
Example : Find the mode for the following sets of data
a) 2, 3, 3, 4, 5, 28, 5, 5
b) 2, 3, 5, 8, 10
c) 0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5
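A small Python sketch can cover the three situations (one mode, no mode, more than one mode). The function name `modes` is illustrative, and the convention that a data set in which every value occurs exactly once has no mode is an assumption made here.

```python
from collections import Counter

def modes(values):
    """Return all values with the highest frequency; empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: treated as "no mode"
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([2, 3, 3, 4, 5, 28, 5, 5]))
print(modes([2, 3, 5, 8, 10]))
print(modes([0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5]))
```

Under this convention, set a) has the single mode 5, set b) has no mode, and set c) has two modes, 0.4 and 0.7.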
QUARTILES
Quartiles divide a set of data which is arranged in ascending order into 4 equal parts.
To find quartile $Q_k$, let
$r = \frac{k}{4} n$
where n = number of observations and k = the quartile number for $Q_k$.
(i) If r is an integer: $Q_k = \frac{1}{2}\left[r\text{th observation} + (r+1)\text{th observation}\right]$
(ii) If r is not an integer, then round up to the next integer; $Q_k$ is the observation at that position.
Q2 is also called the median.
Interquartile Range = Q3 – Q1
PERCENTILES
Percentiles divide a set of data which is arranged in ascending order into 100 equal parts.
To find percentile $P_k$, let
$r = \frac{k}{100} n$
where n = number of observations and k = the percentile number for $P_k$.
(i) If r is an integer: $P_k = \frac{1}{2}\left[r\text{th observation} + (r+1)\text{th observation}\right]$
(ii) If r is not an integer, then round up to the next integer; $P_k$ is the observation at that position.
Notes: Q1 = P25, Median = Q2 = P50, Q3 = P75
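The position rule for quartiles and percentiles can be written as one Python function. This is a sketch under stated assumptions: the name `quantile_by_position` and the `parts` argument (4 for quartiles, 100 for percentiles) are introduced here for illustration. Integer arithmetic is used to decide whether r is an integer, so no rounding error is involved.

```python
def quantile_by_position(values, k, parts):
    """k-th quantile out of `parts` equal parts, using r = (k / parts) * n."""
    if not values or not 0 < k < parts:
        raise ValueError("need non-empty data and 0 < k < parts")
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(k * n, parts)
    if remainder == 0:
        # r is an integer: average the r-th and (r + 1)-th observations.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # r is not an integer: round up, i.e. take the (whole + 1)-th observation.
    return ordered[whole]

data = [21, 24, 17, 28, 36, 20, 32]
q1 = quantile_by_position(data, 1, 4)
q3 = quantile_by_position(data, 3, 4)
print(q1, q3, q3 - q1)                      # quartiles and interquartile range
print(quantile_by_position(data, 40, 100))  # 40th percentile
```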
Example : Find the median, first quartile (Q1), third quartile (Q3) and 40th percentile (P40) for the following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.5, 2.7, 5.4, 8.6, 4.3, 6.2, 9.9, 7.6
Solution:
a) The data arranged in ascending order: 17, 20, 21, 24, 28, 32, 36
Median: $r = \frac{k}{4} n = \frac{2}{4}(7) = 3.5$ (not an integer), so Median = Q2 = 4th observation = 24
First quartile: $r = \frac{k}{4} n = \frac{1}{4}(7) = 1.75$ (not an integer), so Q1 = 2nd observation = 20
Third quartile: $r = \frac{k}{4} n = \frac{3}{4}(7) = 5.25$ (not an integer), so Q3 = 6th observation = 32
40th percentile: $r = \frac{k}{100} n = \frac{40}{100}(7) = 2.8$ (not an integer), so P40 = 3rd observation = 21
Example :
The following table shows the marks obtained by 30 students in a Mathematics quiz, where the maximum mark is 10.
Marks             2  3  4  5  6  7  8  9  10
No. of students   2  4  3  6  4  5  4  1  1
Find the mean, mode, median, first and third quartiles, interquartile range and the 60th percentile.
Example :
Data 1: 6, 7, 8, 6, 9, 6     mean = 7
Data 2: 5, 7, 2, 6, 13, 9    mean = 7
• Most of the numbers in data 1 are around the mean value.
• Data 2 is more spread away from the mean.
• The difference in the spread can be determined by the measure of dispersion.
MEASURES OF DISPERSION
Three common measures of dispersion are:
• Range
• Variance
• Standard deviation
Range = Largest value – Smallest value
REMARK
• Range is not a good measure of dispersion because it is influenced by the extreme values and the calculation does not cover all observations.
• Variance and standard deviation are the most useful and widely used measures of dispersion. Although they are influenced by the extreme values, the calculations cover all the observations.
REMARK
• Standard deviation measures how spread out the values in a data set are.
• If the data points are all close to the mean, then the standard deviation is close to zero.
• If many data points are far from the mean, then the standard deviation is far from zero.
• If all the data values are equal, then the standard deviation is zero.
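VARIANCE
With the sample mean
$\bar{X} = \frac{\sum x_i}{n}$ or $\bar{X} = \frac{\sum f_i x_i}{\sum f_i}$
the sample variance is
$S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$, for $i = 1, 2, \ldots, n$
Commonly used formulae:
$S^2 = \frac{\sum x_i^2 - n\bar{X}^2}{n - 1} = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}$
For data given with frequencies, where $n = \sum f_i$:
$S^2 = \frac{\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{n}}{n - 1}$
STANDARD DEVIATION
$S = \sqrt{S^2}$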
Example :
Calculate the variance and standard deviation for the following sets of sample data. Hence, determine which data is more dispersed about the mean.
Set 1 : 16, 10, 9, 2, 5, 2, 7
Set 2 : 10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1
For Data 1:
x_i     2  2  5   7   9   10   16     $\sum x_i = 51$
x_i^2   4  4  25  49  81  100  256    $\sum x_i^2 = 519$
$S^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{519 - \frac{51^2}{7}}{6} = 24.5714$
$S = \sqrt{24.5714} = 4.957$
For Data 2:
$\sum x_i = 217$, $\sum x_i^2 = 5929$
$S^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{5929 - \frac{217^2}{12}}{11} = 182.265$
$S = \sqrt{182.265} = 13.5$
Hence, data 2 is more dispersed than data 1.
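The computational formula for the sample variance can be checked with a short Python sketch. The function names are illustrative assumptions; the divisor is n - 1 because the data are treated as samples.

```python
import math

def sample_variance(values):
    """S^2 = (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

def sample_std(values):
    """S = square root of the sample variance."""
    return math.sqrt(sample_variance(values))

set1 = [16, 10, 9, 2, 5, 2, 7]
set2 = [10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1]

for name, data in (("Set 1", set1), ("Set 2", set2)):
    print(name, round(sample_variance(data), 4), round(sample_std(data), 3))
```

The set with the larger standard deviation is the one more dispersed about its mean.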
STEM-AND-LEAF DIAGRAMS
Used to display every data value in a data set.
The digit(s) in the greatest place value(s) of the data values are the stems.
The digits in the next greatest place values are the leaves.
To construct a stem-and-leaf diagram:
1. Place the stems in order vertically from smallest to largest.
2. Place the leaves in order in each row from smallest to largest.
3. Create a key for the stem-and-leaf diagram so that people know how to interpret the diagram.
Shape of distribution
A perfectly symmetric curve is one in which both sides of the distribution would exactly match the other if the figure were folded over its central point. An example is shown below:
[Figure: symmetric distribution]
A symmetric, bell-shaped distribution, a relatively common occurrence, is called a normal distribution.
A distribution is said to be skewed to the right, or positively skewed, when most of the data are concentrated on the left of the distribution. The right tail clearly extends farther from the distribution's centre than the left tail, as shown below:
[Figure: positively skewed distribution]
A distribution is said to be skewed to the left, or negatively skewed, if most of the data are concentrated on the right of the distribution. The left tail clearly extends farther from the distribution's centre than the right tail, as shown below:
[Figure: negatively skewed distribution]
Example: If the stem and leaf plot is turned on its side, it will look like the following:
[Figure: stem-and-leaf plot turned on its side]
The distribution shows that most data are clustered at the right. The left tail extends farther from the data centre than the right tail. Therefore, the distribution is skewed to the left or negatively skewed.
Example :
Marks of a recent Mathematics test are as given below: 73, 42, 67, 78, 99, 84, 91, 82, 86, 94 Based on the marks given: (a) Construct a stem-and-leaf diagram. (b) What is the highest and lowest mark? (c) Interpret the distribution.
Solution:
(a) Mathematics Test Mark
Stem   Leaf
4      2
5
6      7
7      3 8
8      2 4 6
9      1 4 9
Key: 9 | 9 means 99 marks
(b) Highest mark = 99, Lowest mark = 42
(c) Negatively skewed
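The construction steps can be sketched in Python for whole-number data. The function name `stem_and_leaf` is an assumption, and the sketch assumes non-negative integers with the units digit as the leaf and the remaining digits as the stem.

```python
def stem_and_leaf(values):
    """Map each stem (tens and above) to its sorted leaves (units digits)."""
    ordered = sorted(values)
    # Include every stem between the smallest and largest, even if it has no leaves.
    diagram = {stem: [] for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1)}
    for value in ordered:
        diagram[value // 10].append(value % 10)
    return diagram

marks = [73, 42, 67, 78, 99, 84, 91, 82, 86, 94]
for stem, leaves in stem_and_leaf(marks).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Keeping the empty stem 5 in the output preserves the gap in the marks, which matters when reading the shape of the distribution.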
Example :
Given the heights of 20 people are as follows: 154, 143, 148, 139, 143, 147, 153, 162, 136, 147, 144, 143, 139, 142, 143, 156, 151, 164, 157, 149. Construct a stem-and-leaf diagram and state the shortest and tallest height. Interpret the distribution.
Solution:
Stem   Leaf
13     6 9 9
14     2 3 3 3 3 4 7 7 8 9
15     1 3 4 6 7
16     2 4
Key: 13 | 6 means 136 cm
Shortest height = 136 cm
Tallest height = 164 cm
Positively skewed
Exercise: The lengths (in mm) of a straight line, as estimated by 22 students, are given below: 10.5, 8.5, 8.6, 8.1, 7.3, 4.4, 6.6, 6.6, 7.9, 8.7, 8.3, 6.0, 8.7, 7.5, 7.9, 6.0, 9.1, 7.2, 8.4, 8.1, 8.6, 9.3 Construct a stem-and-leaf diagram based on the given data. Interpret the distribution.
BOX-AND-WHISKER PLOTS
[Figure: a horizontal and a vertical box-and-whisker plot, each marking min, Q1, Q2, Q3 and max on an axis from 0 to 70]
To construct a box-and-whisker plot:
STEP 1: Determine the five number summary.
STEP 2: Draw a horizontal axis on which the numbers obtained in step 1 can be located. Above this axis, mark each value of the five number summary with a vertical line.
STEP 3: Connect the quartiles to each other to make a box, and then connect the box to the minimum and maximum with lines.
STEP 4: Calculate the values of the upper and lower inner fences to determine whether the data has outliers.
Upper inner fence = Q3 + 1.5 (Q3 – Q1)
Lower inner fence = Q1 – 1.5 (Q3 – Q1)
[Figure: box-and-whisker plot on an axis from 10 to 100, with min, Q1, Q2, Q3 and max all lying between the lower and upper inner fences]
The data lies within the upper and lower inner fences, so the data has no outlier.
[Figure: box-and-whisker plot on an axis from 10 to 100, with the lower and upper inner fences drawn and one observation marked as an outlier]
An observation that lies outside the fences is known as an outlier.
SHAPE OF DATA DISTRIBUTION (SYMMETRY AND SKEWNESS)
Symmetrical distribution: the ‘whiskers’ are the same length and the median Q2 is in the centre of the box.
[Figure: box-and-whisker plot of a symmetrical distribution showing min, Q1, Q2, Q3, max]
Positively skewed distribution: the left ‘whisker’ is shorter than the right ‘whisker’ and the median is nearer to Q1.
[Figure: box-and-whisker plot of a positively skewed distribution showing min, Q1, Q2, Q3, max]
Negatively skewed distribution: the left ‘whisker’ is longer than the right ‘whisker’ and the median is nearer to Q3.
[Figure: box-and-whisker plot of a negatively skewed distribution showing min, Q1, Q2, Q3, max]
Example :
Data : 40, 32, 61, 52, 65, 68, 41, 61, 70, 66, 57, 55, 45, 51, 62, 69, 31, 50, 72, 66, 41, 54, 65, 79, 66
(a) Display the data in a stem-and-leaf diagram.
(b) Find the first, second and third quartiles, and the upper and lower inner fences.
(c) Construct a box-and-whisker plot for the above data.
Solution :
(a)
Stem   Leaf
3      1 2
4      0 1 1 5
5      0 1 2 4 5 7
6      1 1 2 5 5 6 6 6 8 9
7      0 2 9
Key: 5 | 4 means 54
(b) Number of observations, n = 25, min = 31, max = 79
$r = \frac{1}{4}(25) = 6.25$, Q1 = the 7th observation = 50
$r = \frac{2}{4}(25) = 12.5$, Q2 = the 13th observation = 61
$r = \frac{3}{4}(25) = 18.75$, Q3 = the 19th observation = 66
Upper inner fence = Q3 + 1.5 (Q3 – Q1) = 66 + 1.5(66 – 50) = 90
Lower inner fence = Q1 – 1.5 (Q3 – Q1) = 50 – 1.5(66 – 50) = 26
(c)
[Figure: box-and-whisker plot on an axis from 10 to 100 with min = 31, Q1 = 50, Q2 = 61, Q3 = 66, max = 79, lower inner fence = 26 and upper inner fence = 90]
No outlier. The data is negatively skewed (skewed to the left).
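The inner-fence check for outliers can be sketched in Python. The function names are illustrative assumptions, and the quartiles are passed in as already-computed values (here Q1 = 50 and Q3 = 66 from the example).

```python
def inner_fences(q1, q3):
    """Return (lower, upper) inner fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def outliers(values, q1, q3):
    """Observations lying strictly outside the inner fences."""
    lower, upper = inner_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

data = [40, 32, 61, 52, 65, 68, 41, 61, 70, 66, 57, 55, 45, 51, 62, 69,
        31, 50, 72, 66, 41, 54, 65, 79, 66]

print(inner_fences(50, 66))
print(outliers(data, 50, 66))
```

An empty list from `outliers` corresponds to the "No outlier" conclusion: every observation, from the minimum 31 to the maximum 79, lies between the fences.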
Example :
Stem   Leaf
5      1 9
6      2 3 3 4 4 4 4 4 5
6      8 8 8 9 9 9
7      0 2 2 3 6 7
Key: 5 | 9 means 59°F
From the given stem-and-leaf diagram, construct a box-and-whisker plot. Determine the outliers of the data.
Number of observations, n = 23, min = 51, max = 77
$r = \frac{1}{4}(23) = 5.75$, Q1 = the 6th observation = 64°F
$r = \frac{2}{4}(23) = 11.5$, Q2 = the 12th observation = 68°F
$r = \frac{3}{4}(23) = 17.25$, Q3 = the 18th observation = 70°F
Upper inner fence = Q3 + 1.5 (Q3 – Q1) = 70 + 1.5(70 – 64) = 79°F
Lower inner fence = Q1 – 1.5 (Q3 – Q1) = 64 – 1.5(70 – 64) = 55°F
[Figure: box-and-whisker plot on an axis from 50 to 80 with min = 51, Q1 = 64, Q2 = 68, Q3 = 70, max = 77, lower inner fence = 55 and upper inner fence = 79]
From the boxplot, we can see that the minimum value 51°F is outside the fence, and this value is the outlier. Therefore the whiskers are drawn from 59°F to 77°F.
[Figure: the same box-and-whisker plot with the outlier 51 marked separately and the left whisker starting at 59]
The data is negatively skewed (skewed to the left).
GROUPED DATA
MEAN of a frequency distribution
The mean of a set of grouped data given in the form of a frequency distribution is defined as
$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$
where $\sum_{i=1}^{k} f_i$ = total frequency and $x_i$ = class mark.
Example :
Find the mean for the following data
Class          Frequency, f
0 ≤ x < 10     2
10 ≤ x < 20    17
20 ≤ x < 30    26
30 ≤ x < 40    10
40 ≤ x < 50    5
SOLUTION:
The class mark is the midpoint of each class, for example $x_1 = \frac{0 + 10}{2} = 5$.
Class          Class mark, x_i   Frequency, f_i   f_i x_i
0 ≤ x < 10     5                 2                10
10 ≤ x < 20    15                17               255
20 ≤ x < 30    25                26               650
30 ≤ x < 40    35                10               350
40 ≤ x < 50    45                5                225
                                 $\sum f_i = 60$  $\sum f_i x_i = 1490$
$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{1490}{60} = 24.83$
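The grouped-data mean can be reproduced with a short Python sketch. The function name `grouped_mean` and the (lower, upper, frequency) layout are assumptions chosen for illustration; the class mark is taken as the midpoint of each class.

```python
def grouped_mean(classes):
    """Mean of grouped data given as (lower, upper, frequency) triples."""
    total_f = sum(f for _, _, f in classes)
    # Class mark = midpoint of the class; weight it by the class frequency.
    total_fx = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total_fx / total_f

classes = [(0, 10, 2), (10, 20, 17), (20, 30, 26), (30, 40, 10), (40, 50, 5)]
print(round(grouped_mean(classes), 2))
```

Because every observation in a class is represented by its class mark, the result is an estimate of the mean of the underlying raw data.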
MODE of a frequency distribution
$\text{mode} = L_m + \left(\frac{d_1}{d_1 + d_2}\right) c$
$L_m$ = lower boundary of the class containing the mode
$d_1$ = the difference between the frequency of the mode class and the frequency of the class immediately before it
$d_2$ = the difference between the frequency of the mode class and the frequency of the class immediately after it
$c$ = size of the mode class
Example : Find the mode of the frequency distribution given below:
Class     Frequency
15 - 19   1
20 - 24   4
25 - 29   22
30 - 34   35
35 - 39   20
40 - 44   8
SOLUTION:
The mode class is 30 – 34 and the corresponding frequency is 35.
$L_m = 29.5$, $d_1 = 35 - 22 = 13$, $d_2 = 35 - 20 = 15$, $c = 5$
$\text{mode} = L_m + \left(\frac{d_1}{d_1 + d_2}\right) c = 29.5 + \left(\frac{13}{13 + 15}\right) 5 = 31.8$
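The mode formula for a frequency distribution can be sketched in Python. The function name `grouped_mode` is illustrative. Two assumptions are made here that the text does not state: a missing neighbouring class is treated as having frequency 0, and if several classes share the highest frequency the first one is used.

```python
def grouped_mode(boundaries, freqs):
    """Mode = L_m + d1 / (d1 + d2) * c for the class with the highest frequency.

    boundaries: list of (lower_boundary, upper_boundary) per class
    freqs: list of class frequencies in the same order
    """
    m = max(range(len(freqs)), key=lambda i: freqs[i])  # index of the mode class
    before = freqs[m - 1] if m > 0 else 0
    after = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - before
    d2 = freqs[m] - after
    lower, upper = boundaries[m]
    return lower + d1 / (d1 + d2) * (upper - lower)

boundaries = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5),
              (29.5, 34.5), (34.5, 39.5), (39.5, 44.5)]
freqs = [1, 4, 22, 35, 20, 8]
print(round(grouped_mode(boundaries, freqs), 1))
```

Class boundaries (14.5, 19.5, ...) are used rather than class limits (15, 19, ...) so that the class size is 5.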
The mode can also be estimated from a histogram:
1. Draw a line from the upper left corner of the highest vertical bar to the upper left corner of the next vertical bar.
2. Draw a line from the upper right corner of the highest vertical bar to the upper right corner of the vertical bar before it.
3. The mode is estimated from the intersection point of both lines.
The histogram should be drawn on graph paper in order to obtain an accurate answer.
Example : For the data in example 2, find the mode using the histogram.
SOLUTION:
[Figure: histogram of frequency (5 to 35) against class boundaries 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5, with the intersection of the two lines giving Mode = 31.8]
MEDIAN of a frequency distribution
NOTE : The median of a frequency distribution cannot be found as for ungrouped data, because the data has been grouped in the form of classes. So, we will get an estimated value of the median.
$m = L_m + \left(\frac{\frac{n}{2} - F_L}{f_m}\right) c$
$L_m$ = lower boundary of the median class
$n$ = total frequency
$F_L$ = cumulative frequency of the class before the median class
$f_m$ = frequency of the median class
$c$ = size of the median class
Example : Calculate the median for the following data
Class          Frequency, f
0 ≤ x < 5      7
5 ≤ x < 10     27
10 ≤ x < 15    35
15 ≤ x < 20    54
20 ≤ x < 25    63
25 ≤ x < 30    43
30 ≤ x < 35    25
35 ≤ x < 40    17
40 ≤ x < 45    9
45 ≤ x < 50    4
SOLUTION:
Class          Frequency, f   Cumulative frequency, F
0 ≤ x < 5      7              7
5 ≤ x < 10     27             34
10 ≤ x < 15    35             69
15 ≤ x < 20    54             123
20 ≤ x < 25    63             186
25 ≤ x < 30    43             229
30 ≤ x < 35    25             254
35 ≤ x < 40    17             271
40 ≤ x < 45    9              280
45 ≤ x < 50    4              284
               $\sum f = 284$
Since $\frac{n}{2} = \frac{284}{2} = 142$, the median class is 20 ≤ x < 25, with corresponding frequency 63.
$L_m = 20$, $n = 284$, $F_L = 123$, $f_m = 63$, $c = 5$
Hence, the median is
$m = L_m + \left(\frac{\frac{n}{2} - F_L}{f_m}\right) c = 20 + \left(\frac{\frac{1}{2}(284) - 123}{63}\right) 5 = 21.51$
Quartile: Quartiles divide a set of data which is arranged in ascending order into 4 equal parts.
Percentile: Percentiles divide a set of data which is arranged in ascending order into 100 equal parts.
Decile: Deciles divide a set of data which is arranged in ascending order into 10 equal parts.
For grouped data:
$Q_k = L_k + \left(\frac{\frac{k}{4} n - F_L}{f_k}\right) C_k$, k = 1, 2, 3
$P_k = L_k + \left(\frac{\frac{k}{100} n - F_L}{f_k}\right) C_k$, k = 1, 2, 3, ..., 99
$D_k = L_k + \left(\frac{\frac{k}{10} n - F_L}{f_k}\right) C_k$, k = 1, 2, 3, ..., 9
$L_k$ = lower boundary of the class where $Q_k$, $P_k$, $D_k$ lies
$n$ = total number of observations
$F_L$ = cumulative frequency before the class where $Q_k$, $P_k$, $D_k$ lies
$f_k$ = frequency of the class where $Q_k$, $P_k$, $D_k$ lies
$C_k$ = class width of the class where $Q_k$, $P_k$, $D_k$ lies
Example :
Height (cm)   3-5   6-8   9-11   12-14   15-17   18-20
Frequency     1     2     11     10      5       1
From the above data, calculate:
(a) the first and third quartiles and the interquartile range
(b) the 10th and 90th percentiles
(c) the 5th decile, D5
Solution:
Class limit   Class boundaries   Frequency   Cumulative frequency
3-5           2.5-5.5            1           1
6-8           5.5-8.5            2           3
9-11          8.5-11.5           11          14
12-14         11.5-14.5          10          24
15-17         14.5-17.5          5           29
18-20         17.5-20.5          1           30
(a) First and third quartiles
$\frac{1}{4}(30) = 7.5$, so Q1 is in the third class, with boundaries (8.5 - 11.5).
Thus, $L_k = 8.5$, $f_k = 11$, $F_L = 3$, $c = 3$
$Q_1 = P_{25} = 8.5 + \left(\frac{7.5 - 3}{11}\right) 3 = 9.73$
$\frac{3}{4}(30) = 22.5$, so Q3 is in the fourth class, with boundaries (11.5 - 14.5).
Thus, $L_k = 11.5$, $f_k = 10$, $F_L = 14$, $c = 3$
$Q_3 = P_{75} = 11.5 + \left(\frac{22.5 - 14}{10}\right) 3 = 14.05$
Interquartile range = $Q_3 - Q_1 = 14.05 - 9.73 = 4.32$
(b) $P_{10} = 5.5 + \left(\frac{3 - 1}{2}\right) 3 = 8.5$
$P_{90} = 14.5 + \left(\frac{27 - 24}{5}\right) 3 = 16.3$
(c) $D_5 = P_{50} = \text{Median} = 11.5 + \left(\frac{15 - 14}{10}\right) 3 = 11.8$
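The grouped-data formulas for quartiles, percentiles and deciles share one pattern, which can be sketched as a single Python function. The name `grouped_quantile` and the `parts` argument (4, 10 or 100) are assumptions for illustration. The class containing the quantile is taken to be the first class whose cumulative frequency reaches k·n/parts, which matches the classes chosen in the worked example. With k = 2 and parts = 4 the same function gives the grouped median.

```python
def grouped_quantile(classes, k, parts):
    """L_k + ((k / parts) * n - F_L) / f_k * C_k for grouped data.

    classes: (lower_boundary, upper_boundary, frequency) triples in ascending order
    """
    if not 0 < k < parts:
        raise ValueError("k must satisfy 0 < k < parts")
    n = sum(f for _, _, f in classes)
    target = k * n / parts
    cumulative = 0  # F_L: cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("no observations")

heights = [(2.5, 5.5, 1), (5.5, 8.5, 2), (8.5, 11.5, 11),
           (11.5, 14.5, 10), (14.5, 17.5, 5), (17.5, 20.5, 1)]

q1 = grouped_quantile(heights, 1, 4)
q3 = grouped_quantile(heights, 3, 4)
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))
print(round(grouped_quantile(heights, 10, 100), 2),
      round(grouped_quantile(heights, 90, 100), 2))
print(round(grouped_quantile(heights, 5, 10), 2))
```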
RANGE
Range = upper boundary of the last class – lower boundary of the first class
INTERQUARTILE RANGE
• Defined as the difference between the third quartile and the first quartile
Interquartile range = Q3 – Q1
Variance, $S^2 = \frac{\sum f x^2 - \frac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1}$
Standard deviation, $S = \sqrt{S^2}$
Example : Find the range, variance and standard deviation
Class intervals   Frequency, f   Class mark, x   fx    fx^2
1-3               5              2               10    20
4-6               3              5               15    75
7-9               2              8               16    128
10-12             1              11              11    121
13-15             6              14              84    1176
16-18             4              17              68    1156
                  $\sum f = 21$                  $\sum fx = 204$   $\sum fx^2 = 2676$
Solution:
Range = upper boundary of the last class – lower boundary of the first class = 18.5 – 0.5 = 18
$S^2 = \frac{\sum f x^2 - \frac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1} = \frac{2676 - \frac{204^2}{21}}{20} = 34.71$
$S = \sqrt{34.71} = 5.892$
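The grouped variance calculation can be verified with a brief Python sketch. The function name `grouped_variance` and the (lower limit, upper limit, frequency) layout are illustrative assumptions; class marks are computed as midpoints of the class limits.

```python
import math

def grouped_variance(classes):
    """S^2 = (sum(f x^2) - (sum(f x))^2 / sum(f)) / (sum(f) - 1), x = class mark."""
    n = sum(f for _, _, f in classes)
    marks = [((lower + upper) / 2, f) for lower, upper, f in classes]
    sum_fx = sum(x * f for x, f in marks)
    sum_fx2 = sum(x * x * f for x, f in marks)
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

classes = [(1, 3, 5), (4, 6, 3), (7, 9, 2), (10, 12, 1), (13, 15, 6), (16, 18, 4)]
variance = grouped_variance(classes)
print(round(variance, 2), round(math.sqrt(variance), 3))
```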
Example : Find the mean, variance and standard deviation.
Marks           Number of students
0 ≤ x < 20      9
20 ≤ x < 40     29
40 ≤ x < 60     42
60 ≤ x < 80     26
80 ≤ x < 100    14