Chapter 1 BFC34303.PDF

CIVIL ENGINEERING STATISTICS

INTRODUCTION

These are the Mathematics marks of 30 students who took Test 1:

12, 23, 24, 45, 34, 48, 56, 63, 23, 44, 69, 78, 84, 95, 98, 67, 73, 69, 58, 70, 40, 88, 59, 47, 37, 15, 17, 36, 63, 38

How to interpret these marks?

~ Statistics is the science that deals with collecting, classifying, presenting, describing, analyzing and interpreting data to enable us to draw conclusions and make reasonable decisions
~ Can be divided into 2 categories:
(a) Descriptive statistics
(b) Inferential statistics

Descriptive statistics
~ The activities of collecting, classifying, presenting and describing quantitative data

Inferential statistics
~ The part dealing with the techniques and methods of interpreting the results obtained from the descriptive statistics

WHAT IS POPULATION?
~ A population is the entire (complete) collection of data whose properties are analyzed. It contains all the subjects of interest.
~ Can be of any size, and its items need not be uniform but must share at least one measurable feature.

WHAT IS SAMPLE?
~ A portion of the population selected for study
~ A sample is any set of entities, cases, subjects, items or experimental units chosen from the population.

WHAT IS RANDOM SAMPLE?
~ A random sample is a sample selected in such a way that each element of the population has the same chance of being selected

WHAT IS PARAMETER?
~ A parameter is a numerical measurement describing some characteristic of a population
~ Eg: the population mean, variance

WHAT IS STATISTIC?
~ A statistic is a numerical measurement describing some characteristic of a sample
~ Eg: the sample mean, variance

WHAT IS VARIABLE?
~ Any measured characteristic or attribute that differs for different elements
~ For example, if the weight of 30 subjects were measured, then weight would be a variable.
~ Can be classified as quantitative or qualitative

WHAT IS QUANTITATIVE VARIABLE?
~ The variable being studied is numeric
~ Measured on an ordinal, interval, or ratio scale
~ eg: If the time it took them to respond were measured, then the variable would be quantitative.

WHAT IS QUALITATIVE VARIABLE?
~ The variable being studied is non-numeric
~ Called "categorical variables"
~ Measured on a nominal scale
~ eg: gender, educational level, eye colour. If five-year-old students were asked to name their favourite colour, then the variable would be qualitative.

WHAT IS DATA?
~ A set of data is a collection of observations, measurements or information obtained
~ Can be classified as quantitative or qualitative
~ Can be presented in various ways

WHAT IS QUANTITATIVE DATA?
~ Quantitative data refers to observations which can be measured numerically or counted
~ Can be divided into discrete data and continuous data
~ eg: length, time, temperature and mass

WHAT IS QUALITATIVE DATA?
~ Qualitative data are not in numerical form but are instead assigned as attributes
~ eg: race, marital status, age group, gender

WHAT IS DISCRETE DATA?
~ Discrete data is a set of data that can only take exact and countable values
~ For example:
a) The number of students in a class.
b) The number of cars sold on any day at a car dealership.
c) The number of persons in a family.

WHAT IS CONTINUOUS DATA?
~ Continuous data is data that can take any value over a certain interval and can be measured to a certain degree of accuracy (correct to a certain number of decimal places)
~ For example:
a) The weight of students in a class.
b) The time taken to complete an examination.
c) The amount of soda in a 150 ml can.
d) The income of a family.

WHAT IS UNGROUPED DATA?
~ (a) Raw data
(b) Not in the form of intervals
(c) A frequency distribution that has been arranged in order
~ Example:
(i) 3, 5, 6, 2, 5, 2, 4, 6, 5
(ii)
Number of books   0   1   2   3
Frequency         3   7   4   2

WHAT IS GROUPED DATA?
~ The data are grouped into class intervals before the frequency distribution is constructed
~ The table constructed is called a frequency distribution table
~ Example:
Height (cm)   150-155   155-160   160-165   165-170
Frequency     2         8         6         5

Determine whether the data obtained is discrete or continuous data.
(a) The number of books sold by a stationery shop.
(b) The time taken to travel from Kuala Terengganu to Batu Pahat
(c) The number of A's in SPM
(d) The weight of FKAAS students
(e) The diameter of twenty spheres

• All data are to be considered as sample unless otherwise stated in the questions.

Example :

Type A
The number of male children in 20 families chosen at random is as follows.
1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2

Type B
The above data is called raw data and it can be summarized as a frequency distribution as shown:

Number of male children   0   1   2   3   4   5
Frequency                 2   5   7   3   2   1

The data shown in this frequency distribution table is known as ungrouped data.
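The step from raw data (Type A) to a frequency distribution (Type B) is a simple tally. The sketch below is not part of the original notes; it is a minimal Python illustration, and the names `raw` and `frequency` are chosen here only for the example.

```python
from collections import Counter

# Raw data from the example: number of male children in 20 families
raw = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4, 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]

# Tally how often each value occurs, then order the table by value
frequency = dict(sorted(Counter(raw).items()))

for value, count in frequency.items():
    print(value, count)
```

Each printed row pairs a number of male children with the number of families reporting it, which is the ungrouped frequency table shown above.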

MEASURES OF LOCATION (CENTRAL TENDENCY)

MEAN

Given a set of data $x_1, x_2, x_3, \ldots, x_n$, the mean $\bar{x}$ is defined as

$$\bar{x} = \frac{\text{sum of all observations}}{\text{number of observations}} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

For a set of data which can be represented in a frequency distribution table, the mean is given by

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$

Example : Find the mean of the following data

1 4 2 0 2 3 3 2 1 4 5 2 1 2 0 1 2 3 1 2

Solution:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 4 + 2 + \cdots + 2}{20} = \frac{41}{20} = 2.05$$

OR

x   0   1   2   3   4   5
f   2   5   7   3   2   1

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{2(0) + 5(1) + 7(2) + 3(3) + 2(4) + 1(5)}{20} = \frac{41}{20} = 2.05$$

Example : To obtain grade A, Saleha must achieve an average of at least 75 marks in four tests. If her average mark for the first three tests is 70, calculate the lowest mark she must get in her fourth test in order to obtain grade A.

Solution: Let the marks for the four tests be w, x, y, z.
Mean for w, x, y: 70, so w + x + y = 3(70) = 210.
Mean for w, x, y, z:

$$\frac{3(70) + z}{4} \ge 75 \;\Rightarrow\; \frac{210 + z}{4} \ge 75 \;\Rightarrow\; 210 + z \ge 300 \;\Rightarrow\; z \ge 90$$

So, the lowest mark she must get in her fourth test in order to obtain grade A is 90.
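The two mean formulas and the grade calculation can be checked numerically. The following is a minimal Python sketch under the assumption that the frequency table is stored as a `{value: frequency}` dictionary; the function names are illustrative and do not come from the notes.

```python
def mean(values):
    """Arithmetic mean: sum of all observations / number of observations."""
    return sum(values) / len(values)


def mean_from_frequencies(table):
    """Mean of a frequency table given as {value: frequency}."""
    total_fx = sum(f * x for x, f in table.items())
    return total_fx / sum(table.values())


raw = [1, 4, 2, 0, 2, 3, 3, 2, 1, 4, 5, 2, 1, 2, 0, 1, 2, 3, 1, 2]
table = {0: 2, 1: 5, 2: 7, 3: 3, 4: 2, 5: 1}

mean_raw = mean(raw)                       # 41 / 20
mean_table = mean_from_frequencies(table)  # same data, tabulated

# Saleha: smallest z satisfying (3 * 70 + z) / 4 >= 75
lowest_fourth_mark = 4 * 75 - 3 * 70

print(mean_raw, mean_table, lowest_fourth_mark)
```

Both routes give the same mean because the frequency table is only a compressed form of the raw list.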

MEDIAN

The median is the middle value of a set of data that has been arranged in order of magnitude.

Let $x_{(k)}$ be the $k$-th observation in a set of data which has been arranged in ascending or descending order. For example, consider the following set of numbers

9 2 7 10 5 16

After arrangement, it becomes

2 5 7 9 10 16

Thus, the median is between $x_{(3)} = 7$ and $x_{(4)} = 9$, that is, 8.

The median of a set of data $x_1, x_2, \ldots, x_n$ is denoted by $x_m$ and may be calculated as:

$$x_m = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & \text{if } n \text{ is odd} \\[6pt] \dfrac{1}{2}\left[x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}\right], & \text{if } n \text{ is even} \end{cases}$$

Example :

Find the median for the following sets of data
a) 21, 24, 17, 28, 36, 20, 32
b) 3.56, 2.7, 5.48, 8.61, 4.35, 6.22

Solution:

a) The data arranged in ascending order: 17, 20, 21, 24, 28, 32, 36
Since n = 7, which is odd, the median is

$$x_m = x_{\left(\frac{7+1}{2}\right)} = x_{(4)} = 24$$

b) The data arranged in ascending order: 2.7, 3.56, 4.35, 5.48, 6.22, 8.61
Since n = 6, which is even, the median is

$$x_m = \frac{1}{2}\left[x_{\left(\frac{6}{2}\right)} + x_{\left(\frac{6}{2}+1\right)}\right] = \frac{1}{2}\left[x_{(3)} + x_{(4)}\right] = \frac{1}{2}(4.35 + 5.48) = 4.915$$

MODE

• The mode of a set of data is the value that occurs most frequently.
• The mode may not be unique, or there may be no mode at all.

Example : Find the mode for the following sets of data
a) 2, 3, 3, 4, 5, 28, 5, 5
b) 2, 3, 5, 8, 10
c) 0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5
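The median formula and the definition of the mode can be sketched in Python as follows. This is an illustration rather than part of the notes: the function names are chosen here, and the sketch assumes that a data set has "no mode" when every value occurs exactly once, which is one common reading of the remark above.

```python
from collections import Counter


def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # x_((n+1)/2) in 1-based terms
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of x_(n/2) and x_(n/2+1)


def modes(values):
    """All most-frequent values; an empty list means no mode (assumed convention)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:          # every value occurs once
        return []
    return sorted(v for v, c in counts.items() if c == highest)


median_a = median([21, 24, 17, 28, 36, 20, 32])
median_b = median([3.56, 2.7, 5.48, 8.61, 4.35, 6.22])

modes_a = modes([2, 3, 3, 4, 5, 28, 5, 5])
modes_b = modes([2, 3, 5, 8, 10])
modes_c = modes([0.2, 0.4, 0.4, 0.4, 0.5, 0.7, 0.7, 0.7, 0.5])

print(median_a, median_b, modes_a, modes_b, modes_c)
```

Returning a list keeps the three cases distinct: one mode for set a), none for set b), and two modes for set c).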

QUARTILES

Quartiles divide a set of data which is arranged in ascending order into 4 equal parts.

To find quartile $Q_k$, let

$$r = \frac{k}{4} \times n$$

where n = number of observations and k = quartile number for $Q_k$.

(i) If r is an integer:

$$Q_k = \frac{1}{2}\left[r^{\text{th}} \text{ observation} + (r+1)^{\text{th}} \text{ observation}\right]$$

(ii) If r is not an integer, then round up to the next integer; $Q_k$ is the observation at that position.

$Q_2$ is also called the median.

Interquartile Range = $Q_3 - Q_1$

PERCENTILES

Percentiles divide a set of data which is arranged in ascending order into 100 equal parts.

To find percentile $P_k$, let

$$r = \frac{k}{100} \times n$$

where n = number of observations and k = percentile number for $P_k$.

(i) If r is an integer:

$$P_k = \frac{1}{2}\left[r^{\text{th}} \text{ observation} + (r+1)^{\text{th}} \text{ observation}\right]$$

(ii) If r is not an integer, then round up to the next integer; $P_k$ is the observation at that position.

Notes: $Q_1 = P_{25}$, Median $= Q_2 = P_{50}$, $Q_3 = P_{75}$

Example : Find the median, first quartile (Q1), third quartile (Q3) and 40th percentile (P40) for the following sets of data

a) 21, 24, 17, 28, 36, 20, 32
b) 3.5, 2.7, 5.4, 8.6, 4.3, 6.2, 9.9, 7.6

Solution:

a) The data arranged in ascending order: 17, 20, 21, 24, 28, 32, 36

Median
$$r = \frac{k}{4} \times n = \frac{2}{4} \times 7 = 3.5 \quad (\text{not an integer})$$
Median = Q2 = 4th observation = 24

First quartile
$$r = \frac{k}{4} \times n = \frac{1}{4} \times 7 = 1.75 \quad (\text{not an integer})$$
Q1 = 2nd observation = 20

Third quartile
$$r = \frac{k}{4} \times n = \frac{3}{4} \times 7 = 5.25 \quad (\text{not an integer})$$
Q3 = 6th observation = 32

40th percentile
$$r = \frac{k}{100} \times n = \frac{40}{100} \times 7 = 2.8 \quad (\text{not an integer})$$
P40 = 3rd observation = 21
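The position rule for quartiles and percentiles can be written as a single function. The sketch below is illustrative only: the name `position_rule` and its `parts` argument (4 for quartiles, 100 for percentiles) are assumptions made for this example, and integer arithmetic is used to decide whether r is an integer so that floating-point rounding cannot change the answer.

```python
def position_rule(values, k, parts):
    """k-th quantile when the ordered data are cut into `parts` equal parts.

    parts=4 gives quartiles and parts=100 percentiles; requires 0 < k < parts.
    """
    ordered = sorted(values)
    product = k * len(ordered)       # r = product / parts
    if product % parts == 0:         # r is an integer: average r-th and (r+1)-th
        r = product // parts
        return (ordered[r - 1] + ordered[r]) / 2
    r = product // parts + 1         # r is not an integer: round up
    return ordered[r - 1]


data = [21, 24, 17, 28, 36, 20, 32]
q2 = position_rule(data, 2, 4)
q1 = position_rule(data, 1, 4)
q3 = position_rule(data, 3, 4)
p40 = position_rule(data, 40, 100)

print(q2, q1, q3, p40, q3 - q1)
```

The same function can be applied to data set b), where n = 8 makes r an integer for each quartile and the averaging branch is used instead.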

Example :

The following table shows the marks obtained by 30 students in a Mathematics quiz, where the maximum mark is 10.

Marks             2   3   4   5   6   7   8   9   10
No. of students   2   4   3   6   4   5   4   1   1

Find the mean, mode, median, first and third quartiles, interquartile range and the 60th percentile.

Example :
Data 1: 6, 7, 8, 6, 9, 6   mean = 7
Data 2: 5, 7, 2, 6, 13, 9   mean = 7

• Most of the numbers in data 1 are around the mean value.
• Data 2 is more spread away from the mean.
• The difference in the spread can be determined by the measure of dispersion.

MEASURES OF DISPERSION

Three common measures of dispersion are:
• Range
• Variance
• Standard deviation

Range = Largest value – Smallest value

REMARK
• Range is not a good measure of dispersion because it is influenced by the extreme values and the calculation does not cover all observations.
• Variance and standard deviation are the most useful and widely used measures of dispersion. Although they are influenced by the extreme values, the calculations cover all the observations.

REMARK
• Standard deviation measures how spread out the values in a data set are.
• If the data points are all close to the mean, then the standard deviation is close to zero.
• If many data points are far from the mean, then the standard deviation is far from zero.
• If all the data values are equal, then the standard deviation is zero.

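VARIANCE

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}, \quad \text{for } i = 1, 2, \ldots, n$$

where $\bar{X} = \dfrac{\sum x_i}{n}$, or $\bar{X} = \dfrac{\sum f_i x_i}{\sum f_i}$ for data given in a frequency table.

Commonly used formulae

$$S^2 = \frac{\sum x_i^2 - n\bar{X}^2}{n - 1} = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1}$$

For data given in a frequency table, with $n = \sum f_i$:

$$S^2 = \frac{\sum f_i x_i^2 - \dfrac{\left(\sum f_i x_i\right)^2}{n}}{n - 1}$$

STANDARD DEVIATION

$$S = \sqrt{S^2}$$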

Example :

Calculate the variance and standard deviation for the following sets of sample data. Hence, determine which data is more dispersed about the mean.
Set 1: 16, 10, 9, 2, 5, 2, 7
Set 2: 10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1

For Data 1:

x_i     2   2   5    7    9    10    16
x_i^2   4   4   25   49   81   100   256

$$\sum_{i=1}^{n} x_i = 51, \qquad \sum_{i=1}^{n} x_i^2 = 519$$

$$S^2 = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{519 - \dfrac{(51)^2}{7}}{6} = 24.571$$

$$S = \sqrt{24.571} = 4.957$$

For Data 2:

$$\sum_{i=1}^{n} x_i = 217, \qquad \sum_{i=1}^{n} x_i^2 = 5929$$

$$S^2 = \frac{5929 - \dfrac{(217)^2}{12}}{11} = 182.265$$

$$S = \sqrt{182.265} = 13.5$$

Hence, data 2 is more dispersed than data 1.
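The computational formula for the sample variance can be verified against Python's standard library. This is a minimal sketch; `sample_variance` is a name chosen here for illustration, while `statistics.variance` and `statistics.stdev` are the standard-library functions for the sample (n - 1) variance and standard deviation.

```python
import math
import statistics


def sample_variance(values):
    """S^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


set1 = [16, 10, 9, 2, 5, 2, 7]
set2 = [10, 32, 8, 12, 14, 36, 20, 8, 40, 4, 32, 1]

var1, var2 = sample_variance(set1), sample_variance(set2)
sd1, sd2 = math.sqrt(var1), math.sqrt(var2)

print(round(var1, 3), round(sd1, 3))
print(round(var2, 3), round(sd2, 3))

# Cross-check with the standard library (also divides by n - 1)
print(statistics.variance(set1), statistics.stdev(set2))
```

The larger standard deviation for set 2 is what supports the conclusion that it is the more dispersed data set.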

STEM-AND-LEAF DIAGRAMS

• Used to extract every data value in the dataset.
• The digit(s) in the greatest place value(s) of the data values are the stems.
• The digits in the next greatest place values are the leaves.
• To construct a stem-and-leaf diagram:
1. Place the stems in order vertically from smallest to largest.
2. Place the leaves in order in each row from smallest to largest.
3. Create a key for the stem-and-leaf diagram so that people know how to interpret the diagram.

Shape of distribution

A perfectly symmetric curve is one in which both sides of the distribution would exactly match the other if the figure were folded over its central point. An example is shown below:

A symmetric, bell-shaped distribution, a relatively common occurrence, is called a normal distribution.

A distribution is said to be skewed to the right, or positively skewed, when most of the data are concentrated on the left of the distribution. The right tail clearly extends farther from the distribution's centre than the left tail, as shown below:

A distribution is said to be skewed to the left, or negatively skewed, if most of the data are concentrated on the right of the distribution. The left tail clearly extends farther from the distribution's centre than the right tail, as shown below:

Example: If the stem-and-leaf plot is turned on its side, it will look like the following:

The distribution shows that most data are clustered at the right. The left tail extends farther from the data centre than the right tail. Therefore, the distribution is skewed to the left or negatively skewed .

Example :

Marks of a recent Mathematics test are as given below: 73, 42, 67, 78, 99, 84, 91, 82, 86, 94 Based on the marks given: (a) Construct a stem-and-leaf diagram. (b) What is the highest and lowest mark? (c) Interpret the distribution.

Solution:
(a) Mathematics Test Mark

Stem | Leaf
4    | 2
5    |
6    | 7
7    | 3 8
8    | 2 4 6
9    | 1 4 9

Key: 9 | 9 means 99 marks

(b) Highest mark = 99, Lowest mark = 42
(c) Negatively skewed

Example :

Given the heights of 20 people are as follows: 154, 143, 148, 139, 143, 147, 153, 162, 136, 147, 144, 143, 139, 142, 143, 156, 151, 164, 157, 149. Construct a stem-and-leaf diagram and state the shortest and tallest height. Interpret the distribution.

Solution:

Stem | Leaf
13   | 6 9 9
14   | 2 3 3 3 3 4 7 7 8 9
15   | 1 3 4 6 7
16   | 2 4

Key: 13 | 6 means 136 cm

Shortest height = 136 cm
Tallest height = 164 cm
Positively skewed
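The three construction steps for a stem-and-leaf diagram can be sketched in Python. This illustration assumes whole-number data in which the last digit is the leaf and the remaining digits form the stem; the function name `stem_and_leaf` is chosen here and is not from the notes.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem to its ordered leaves, keeping empty stems in between."""
    rows = defaultdict(list)
    for v in sorted(values):            # sorting first keeps the leaves in order
        stem, leaf = divmod(v, 10)      # e.g. 136 -> stem 13, leaf 6
        rows[stem].append(leaf)
    lowest, highest = min(rows), max(rows)
    return {s: rows.get(s, []) for s in range(lowest, highest + 1)}


marks = [73, 42, 67, 78, 99, 84, 91, 82, 86, 94]
heights = [154, 143, 148, 139, 143, 147, 153, 162, 136, 147,
           144, 143, 139, 142, 143, 156, 151, 164, 157, 149]

marks_plot = stem_and_leaf(marks)
heights_plot = stem_and_leaf(heights)

for stem, leaves in heights_plot.items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```

Keeping the empty stem (5 in the marks example) matters, because a gap in the diagram is part of the shape of the distribution.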

Exercise: The lengths of a straight line, in mm, as estimated by 22 students are given below: 10.5, 8.5, 8.6, 8.1, 7.3, 4.4, 6.6, 6.6, 7.9, 8.7, 8.3, 6.0, 8.7, 7.5, 7.9, 6.0, 9.1, 7.2, 8.4, 8.1, 8.6, 9.3 Construct a stem-and-leaf diagram based on the given data. Interpret the distribution.

BOX-AND-WHISKER PLOTS

[Figure: a horizontal and a vertical box-and-whisker plot, each drawn against a 0 to 70 scale and labelled with min, Q1, Q2, Q3 and max]

BOX-AND-WHISKER PLOTS To

construct a box-and-whisker plot:

STEP 1: Determine the five number summary. STEP 2: Draw a horizontal axis on which the number obtained in step 1 can be located. Above this axis, mark all the five number summary with vertical lines. STEP 3: Connect the quartiles to each other to make box, andand thenminimum connect lines. the box to the a maximum STEP 4: Calculate the values of upper and lower inner fence to determine whether the data Upper inner fence = Q3 + 1.5 (Q3 – Q1) Lower inner fence = Q1 - 1.5 (Q 3 – Q1)

Lowerinnerfence

Upperinnerfence

mi n

10

20

30

Q1

40

Q2

50

max

Q3

60

70

80

100

90

The data lies withi n the upper and low er inner fence, so the data has no outl ier.

Upper inner fence

Lower inner fence

Outlier max

mi n

10

20

30

Q1

40

Q2

50

Q3

60

70

80

90

The observation that lies outsi de fence is know n as outl ier.

100

SHAPE OF DATA DISTRIBUTION (SYMMETRY AND SKEWNESS)

Symmetrical distribution: the 'whiskers' are the same length and the median Q2 is in the centre of the box.

Positively skewed distribution: the left 'whisker' is shorter than the right 'whisker' and the median is nearer to Q1.

Negatively skewed distribution: the left 'whisker' is longer than the right 'whisker' and the median is nearer to Q3.

[Figures: one box-and-whisker plot for each case, labelled min, Q1, Q2, Q3 and max]

Example :
Data: 40, 32, 61, 52, 65, 68, 41, 61, 70, 66, 57, 55, 45, 51, 62, 69, 31, 50, 72, 66, 41, 54, 65, 79, 66
(a) Display the data in a stem-and-leaf diagram.
(b) Find the first, second and third quartiles, and the upper and lower inner fences.
(c) Construct a box-and-whisker plot for the above data.

Solution :
(a)
Stem | Leaf
3    | 1 2
4    | 0 1 1 5
5    | 0 1 2 4 5 7
6    | 1 1 2 5 5 6 6 6 8 9
7    | 0 2 9

Key: 5 | 4 means 54

(b) Number of observations, n = 25, min = 31, max = 79

r = (1/4) × 25 = 6.25, Q1 = the 7th observation = 50
r = (2/4) × 25 = 12.5, Q2 = the 13th observation = 61
r = (3/4) × 25 = 18.75, Q3 = the 19th observation = 66

Upper inner fence = Q3 + 1.5 (Q3 – Q1) = 66 + 1.5(66 - 50) = 90
Lower inner fence = Q1 - 1.5 (Q3 – Q1) = 50 - 1.5(66 - 50) = 26

(c) [Figure: box-and-whisker plot on a 10 to 100 scale with min = 31, Q1 = 50, Q2 = 61, Q3 = 66, max = 79, lower inner fence = 26 and upper inner fence = 90]

No outlier. The data is negatively skewed (skewed to the left).

Example :

Stem | Leaf
5    | 1 9
6    | 2 3 3 4 4 4 4 4 5
6    | 8 8 8 9 9 9
7    | 0 2 2 3 6 7

Key: 5 | 9 means 59 °F

From the given stem-and-leaf diagram, construct a box-and-whisker plot. Determine the outliers of the data.

Number of observations, n = 23, min = 51, max = 77

r = (1/4) × 23 = 5.75, Q1 = the 6th observation = 64 °F
r = (2/4) × 23 = 11.5, Q2 = the 12th observation = 68 °F
r = (3/4) × 23 = 17.25, Q3 = the 18th observation = 70 °F

Upper inner fence = Q3 + 1.5 (Q3 – Q1) = 70 + 1.5(70 - 64) = 79 °F
Lower inner fence = Q1 - 1.5 (Q3 – Q1) = 64 - 1.5(70 - 64) = 55 °F

[Figure: box-and-whisker plot on a 50 to 80 scale with min = 51, Q1 = 64, Q2 = 68, Q3 = 70, max = 77, lower inner fence = 55 and upper inner fence = 79]

From the boxplot, we can see that the minimum value 51 °F is outside the fences, so this value is an outlier. Therefore the whiskers are drawn from 59 °F to 77 °F.

[Figure: the same plot redrawn with the outlier at 51 marked separately and the lower whisker ending at 59]

The data is negatively skewed (skewed to the left).
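The quartiles, inner fences and outliers used for a box-and-whisker plot can be computed directly. The sketch below is an illustration, not the notes' own procedure in code form: the function names and the dictionary keys are assumptions for this example, and the quartiles use the same position rule as in the ungrouped-data section.

```python
def quartile(ordered, k):
    """k-th quartile of already-sorted data using the position rule r = k * n / 4."""
    product = k * len(ordered)
    if product % 4 == 0:                 # r is an integer: average two observations
        r = product // 4
        return (ordered[r - 1] + ordered[r]) / 2
    return ordered[product // 4]         # r rounded up, converted to a 0-based index


def box_plot_summary(values):
    """Quartiles, inner fences, outliers and whisker ends for a box plot."""
    ordered = sorted(values)
    q1, q2, q3 = (quartile(ordered, k) for k in (1, 2, 3))
    lower_fence = q1 - 1.5 * (q3 - q1)
    upper_fence = q3 + 1.5 * (q3 - q1)
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return {"q1": q1, "q2": q2, "q3": q3,
            "lower_fence": lower_fence, "upper_fence": upper_fence,
            "outliers": outliers, "whiskers": (inside[0], inside[-1])}


# Temperatures (°F) read off the stem-and-leaf diagram in the example
temps = [51, 59, 62, 63, 63, 64, 64, 64, 64, 64, 65, 68,
         68, 68, 69, 69, 69, 70, 72, 72, 73, 76, 77]

summary = box_plot_summary(temps)
print(summary)
```

The whisker ends are taken as the smallest and largest observations that lie inside the fences, which is why the lower whisker stops at 59 once 51 is flagged as an outlier.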

GROUPED DATA

MEAN of a frequency distribution

The mean of a set of grouped data given in the form of a frequency distribution is defined as

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$

where $\sum_{i=1}^{k} f_i$ = total frequency and $x_i$ = class mark (the midpoint of the class).

Example :

Find the mean for the following data

Class          Frequency, f
0 ≤ x < 10     2
10 ≤ x < 20    17
20 ≤ x < 30    26
30 ≤ x < 40    10
40 ≤ x < 50    5

SOLUTION:

Class          Class mark, x_i   Frequency, f_i   f_i x_i
0 ≤ x < 10     5                 2                10
10 ≤ x < 20    15                17               255
20 ≤ x < 30    25                26               650
30 ≤ x < 40    35                10               350
40 ≤ x < 50    45                5                225
                                 Σf_i = 60        Σf_i x_i = 1490

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{1490}{60} = 24.83$$
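The grouped mean can be reproduced by computing each class mark as the midpoint of its class. This is a minimal sketch; representing each class as a `(lower, upper, frequency)` tuple is an assumption made for the example.

```python
def grouped_mean(classes):
    """Mean of grouped data; classes is a list of (lower, upper, frequency)."""
    total_f = sum(f for _, _, f in classes)
    # class mark = midpoint of the class = (lower + upper) / 2
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / total_f


classes = [(0, 10, 2), (10, 20, 17), (20, 30, 26), (30, 40, 10), (40, 50, 5)]
result = grouped_mean(classes)
print(round(result, 2))
```

Because every observation in a class is replaced by the class mark, the result is an estimate of the mean of the underlying raw data.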

MODE of a frequency distribution

$$\text{mode} = L_m + \left(\frac{d_1}{d_1 + d_2}\right) c$$

L_m = lower boundary of the class containing the mode
d_1 = the difference between the frequency of the mode class and the frequency of the class immediately before it
d_2 = the difference between the frequency of the mode class and the frequency of the class immediately after it
c = size of the mode class

Example : Find the mode of the frequency distribution given below:

Class     Frequency
15 - 19   1
20 - 24   4
25 - 29   22
30 - 34   35
35 - 39   20
40 - 44   8

SOLUTION: The mode class is 30 – 34 and the corresponding frequency is 35.

L_m = 29.5, d_1 = 35 - 22 = 13, d_2 = 35 - 20 = 15, c = 5

$$\text{mode} = L_m + \left(\frac{d_1}{d_1 + d_2}\right) c = 29.5 + \left(\frac{13}{13 + 15}\right) 5 = 31.8$$
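The mode formula for a frequency distribution can be sketched as below. The function name is illustrative, and the sketch assumes equal class widths and treats a missing neighbouring class (when the mode class is first or last) as having frequency 0; the notes do not cover that case.

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """mode = L_m + d1 / (d1 + d2) * c for equal-width classes."""
    m = max(range(len(frequencies)), key=lambda i: frequencies[i])  # mode class index
    before = frequencies[m - 1] if m > 0 else 0                     # assumed 0 if absent
    after = frequencies[m + 1] if m + 1 < len(frequencies) else 0   # assumed 0 if absent
    d1 = frequencies[m] - before
    d2 = frequencies[m] - after
    return lower_boundaries[m] + d1 / (d1 + d2) * width


lower_boundaries = [14.5, 19.5, 24.5, 29.5, 34.5, 39.5]
frequencies = [1, 4, 22, 35, 20, 8]

estimate = grouped_mode(lower_boundaries, frequencies, 5)
print(round(estimate, 1))
```

The class boundaries 14.5, 19.5, and so on are obtained by moving each class limit half a unit outward, which is where L_m = 29.5 comes from.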

[Figure: histogram of frequency against class boundaries, with the mode read off at the intersection of the two lines described below]

• Draw a line from the left upper corner of the highest vertical bar to the left upper corner of the next vertical bar.
• Draw a line from the right upper corner of the highest vertical bar to the right upper corner of the vertical bar before it.
• The mode is estimated from the intersection point of both lines.
• The histogram should be drawn on graph paper in order to obtain an accurate answer.

Example : For the data in example 2, find the mode using the histogram

SOLUTION:
[Figure: histogram with the frequency axis marked from 5 to 35 and class boundaries 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5]

Mode = 31.8

MEDIAN of a frequency distribution

NOTE: The median of a frequency distribution can't be found in the same way as for ungrouped data because the data has been grouped into classes. So, we will get an estimated value of the median.

$$m = L_m + \left(\frac{\frac{n}{2} - F_L}{f_m}\right) c$$

L_m = lower boundary of the median class
n = total frequency
F_L = cumulative frequency of the class before the median class
f_m = frequency of the median class
c = size of the median class

Example : Calculate the median for the following data

Class          Frequency, f
0 ≤ x < 5      7
5 ≤ x < 10     27
10 ≤ x < 15    35
15 ≤ x < 20    54
20 ≤ x < 25    63
25 ≤ x < 30    43
30 ≤ x < 35    25
35 ≤ x < 40    17
40 ≤ x < 45    9
45 ≤ x < 50    4

SOLUTION:

Class          Frequency, f   Cumulative frequency
0 ≤ x < 5      7              7
5 ≤ x < 10     27             34
10 ≤ x < 15    35             69
15 ≤ x < 20    54             123
20 ≤ x < 25    63             186
25 ≤ x < 30    43             229
30 ≤ x < 35    25             254
35 ≤ x < 40    17             271
40 ≤ x < 45    9              280
45 ≤ x < 50    4              284
               Σf = 284

Since n/2 = 142, the median class is 20 ≤ x < 25 with the corresponding frequency 63.

L_m = 20, n = 284, F_L = 123, f_m = 63, c = 5

Hence, the median is

$$m = L_m + \left(\frac{\frac{n}{2} - F_L}{f_m}\right) c = 20 + \left(\frac{\frac{1}{2}(284) - 123}{63}\right) 5 = 21.51$$
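Locating the median class and applying the formula can be sketched in Python. The function name is chosen for illustration; the sketch assumes equal class widths and takes the median class to be the first class whose cumulative frequency reaches n/2.

```python
from itertools import accumulate


def grouped_median(lower_boundaries, frequencies, width):
    """m = L_m + (n/2 - F_L) / f_m * c for equal-width classes."""
    n = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    # median class: first class whose cumulative frequency reaches n / 2
    m = next(i for i, cf in enumerate(cumulative) if cf >= n / 2)
    f_before = cumulative[m - 1] if m > 0 else 0
    return lower_boundaries[m] + (n / 2 - f_before) / frequencies[m] * width


lower_boundaries = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]
frequencies = [7, 27, 35, 54, 63, 43, 25, 17, 9, 4]

estimate = grouped_median(lower_boundaries, frequencies, 5)
print(round(estimate, 2))
```

`itertools.accumulate` builds the cumulative frequency column of the solution table, so the class search and the value of F_L both come from the same list.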

Quartile: Quartiles divide a set of data which is arranged in ascending order into 4 equal parts
Percentile: Percentiles divide a set of data which is arranged in ascending order into 100 equal parts
Decile: Deciles divide a set of data which is arranged in ascending order into 10 equal parts

For grouped data:

$$Q_k = L_k + \left(\frac{\frac{k}{4} n - F_L}{f_k}\right) c_k, \quad k = 1, 2, 3$$

$$P_k = L_k + \left(\frac{\frac{k}{100} n - F_L}{f_k}\right) c_k, \quad k = 1, 2, 3, \ldots, 99$$

$$D_k = L_k + \left(\frac{\frac{k}{10} n - F_L}{f_k}\right) c_k, \quad k = 1, 2, 3, \ldots, 9$$

L_k = lower boundary of the class where Q_k, P_k, D_k lies
n = total number of observations
F_L = cumulative frequency before the class where Q_k, P_k, D_k lies
f_k = frequency of the class where Q_k, P_k, D_k lies
c_k = class width of the class where Q_k, P_k, D_k lies

Example :

Height (cm)   3-5   6-8   9-11   12-14   15-17   18-20
Frequency     1     2     11     10      5       1

From the above data, calculate:
(a) first, third quartiles & interquartile range
(b) the 10th, 90th percentiles
(c) the 5th decile, D5

Solution:

Class limit   Class boundaries   Frequency   Cumulative frequency
3-5           2.5-5.5            1           1
6-8           5.5-8.5            2           3
9-11          8.5-11.5           11          14
12-14         11.5-14.5          10          24
15-17         14.5-17.5          5           29
18-20         17.5-20.5          1           30

(a) First and third quartiles

For Q1, (1/4)(30) = 7.5, so Q1 is in the third class with boundaries (8.5 - 11.5).
Thus, L_k = 8.5, f_k = 11, F_L = 3, c = 3

$$Q_1 = P_{25} = 8.5 + \left(\frac{7.5 - 3}{11}\right) 3 = 9.73$$

For Q3, (3/4)(30) = 22.5, so Q3 is in the fourth class with boundaries (11.5 - 14.5).
Thus, L_k = 11.5, f_k = 10, F_L = 14, c = 3

$$Q_3 = P_{75} = 11.5 + \left(\frac{22.5 - 14}{10}\right) 3 = 14.05$$

Interquartile range = Q3 - Q1 = 14.05 - 9.73 = 4.32

(b)
$$P_{10} = 5.5 + \left(\frac{3 - 1}{2}\right) 3 = 8.5$$

$$P_{90} = 14.5 + \left(\frac{27 - 24}{5}\right) 3 = 16.3$$

(c)
$$D_5 = P_{50} = \text{Median} = 11.5 + \left(\frac{15 - 14}{10}\right) 3 = 11.8$$
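The three grouped-data formulas differ only in the divisor (4, 10 or 100), so they can share one function. The sketch below is illustrative: the name `grouped_quantile` and the `parts` argument are assumptions for this example, and the class containing the quantile is taken to be the first class whose cumulative frequency reaches (k / parts) × n.

```python
from itertools import accumulate


def grouped_quantile(lower_boundaries, frequencies, width, k, parts):
    """L_k + ((k / parts) * n - F_L) / f_k * c; parts = 4, 10 or 100."""
    n = sum(frequencies)
    target = k * n / parts
    cumulative = list(accumulate(frequencies))
    # class containing the quantile: first cumulative frequency >= target
    j = next(i for i, cf in enumerate(cumulative) if cf >= target)
    f_before = cumulative[j - 1] if j > 0 else 0
    return lower_boundaries[j] + (target - f_before) / frequencies[j] * width


lower_boundaries = [2.5, 5.5, 8.5, 11.5, 14.5, 17.5]
frequencies = [1, 2, 11, 10, 5, 1]

q1 = grouped_quantile(lower_boundaries, frequencies, 3, 1, 4)
q3 = grouped_quantile(lower_boundaries, frequencies, 3, 3, 4)
p10 = grouped_quantile(lower_boundaries, frequencies, 3, 10, 100)
p90 = grouped_quantile(lower_boundaries, frequencies, 3, 90, 100)
d5 = grouped_quantile(lower_boundaries, frequencies, 3, 5, 10)

print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))
print(round(p10, 2), round(p90, 2), round(d5, 2))
```

Calling the function with `k=2, parts=4` gives the grouped median, which matches the identity D5 = P50 = Median used in part (c).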

RANGE

Range = upper boundary of the last class - lower boundary of the first class

INTERQUARTILE RANGE

• Defined as the difference between the third quartile and the first quartile
Interquartile range = Q3 - Q1

VARIANCE

$$S^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1}$$

STANDARD DEVIATION

$$S = \sqrt{\text{Variance}} = \sqrt{S^2}$$

Example : Find the range, variance and standard deviation

Class intervals   Frequency, f   Class mark, x   fx    fx^2
1-3               5              2               10    20
4-6               3              5               15    75
7-9               2              8               16    128
10-12             1              11              11    121
13-15             6              14              84    1176
16-18             4              17              68    1156
                  Σf = 21                        Σfx = 204   Σfx^2 = 2676

Solution:
Range = upper boundary of the last class - lower boundary of the first class = 18.5 – 0.5 = 18

$$S^2 = \frac{\sum f x^2 - \dfrac{\left(\sum f x\right)^2}{\sum f}}{\sum f - 1} = \frac{2676 - \dfrac{(204)^2}{21}}{20} = 34.71$$

$$S = \sqrt{34.71} = 5.892$$
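The grouped variance and standard deviation can be checked from the class marks and frequencies alone. This is a minimal sketch with an illustrative function name; it uses the same n - 1 divisor as the formula above, with n = Σf.

```python
import math


def grouped_variance(class_marks, frequencies):
    """S^2 = (sum(f x^2) - (sum(f x))^2 / sum(f)) / (sum(f) - 1)."""
    n = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(class_marks, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(class_marks, frequencies))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


class_marks = [2, 5, 8, 11, 14, 17]
frequencies = [5, 3, 2, 1, 6, 4]

variance = grouped_variance(class_marks, frequencies)
std_dev = math.sqrt(variance)
data_range = 18.5 - 0.5   # upper boundary of last class - lower boundary of first class

print(round(variance, 2), round(std_dev, 3), data_range)
```

The same function can be applied to any grouped table once the class marks have been computed as class midpoints.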

Example : Find the mean, variance and standard deviation.

Marks           Number of students
0 ≤ x < 20      9
20 ≤ x < 40     29
40 ≤ x < 60     42
60 ≤ x < 80     26
80 ≤ x < 100    14
