Statistical Tools for Managers
Chapter 4 Measure of Central Tendency
4.1.1. Properties of a Good Measure of Central Tendency
A good measure of central tendency should possess, as far as possible, the following properties:
a. Easy to understand.
b. Simple to compute.
c. Based on all observations.
d. Uniquely defined.
e. Possibility of further algebraic treatment.
f. Not unduly affected by extreme values.
4.1.2. Common Measures of Central Tendency
There are three common measures of central tendency:
a. Mean. The average value.
b. Median. The middle value.
c. Mode. The most frequently occurring value.
4.2. Mean
There are three types of mean:
a. Arithmetic Mean (AM).
b. Geometric Mean (GM).
c. Harmonic Mean (HM).
4.2.1. Simple Arithmetic Mean
4.2.1.1. Simple Arithmetic Mean for Ungrouped Data (AM)
\[ \mu = \frac{x_1 + x_2 + x_3 + \dots + x_n}{N} = \frac{\sum_{i=1}^{n} x_i}{N} \]
There is a shortcut method of calculation based on a simple fact: if a constant is subtracted from or added to all data points, the arithmetic mean (AM) is reduced or increased by that amount. Thus,
\[ \mu = A + \frac{\sum_{i=1}^{n} d_i}{N} \]
Where,
A = Arbitrarily selected constant value (assumed mean).
d_i = Deviation of each observation from the assumed mean, d_i = x_i − A.
N = Number of observations.
Note that when the assumed mean A is exactly equal to the arithmetic mean (μ or X̄), the algebraic sum of all deviations is equal to zero. Thus, the algebraic sum of the deviations of all observations about the arithmetic mean is zero. That is, about the arithmetic mean,
\[ \sum_{i=1}^{n} d_i = 0 \]
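The direct formula, the shortcut (assumed mean) formula and the zero-sum property of deviations can be checked numerically. The following is a minimal sketch: the function names and the five observations are illustrative assumptions, not taken from the text.

```python
def arithmetic_mean(values):
    """Direct method: sum of the observations divided by their number N."""
    return sum(values) / len(values)


def shortcut_mean(values, assumed_mean):
    """Shortcut method: A plus the mean of the deviations d_i = x_i - A."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


# Illustrative observations (assumed for this sketch, not from the text).
data = [12, 15, 11, 18, 14]

mu = arithmetic_mean(data)                   # 70 / 5
mu_shortcut = shortcut_mean(data, 15)        # 15 + (-5 / 5)
deviation_sum = sum(x - mu for x in data)    # deviations about the mean

print(mu, mu_shortcut, deviation_sum)
```

Any value of the assumed mean gives the same result. A value close to the centre of the data simply keeps the deviations small when the work is done by hand.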
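4.2.1.2. Simple Arithmetic Mean for Grouped Data
For grouped data, each class is represented by its class mark (mid-point) m_i, and the class frequency f_i acts as its weight. The weighted average is then calculated by dividing the sum of the class marks multiplied by their frequencies by the total number of observations (the sum of all frequencies). Thus, for grouped data,
\[ \mu = \frac{\sum_{i=1}^{n} m_i f_i}{N} = \frac{\sum_{i=1}^{n} m_i f_i}{\sum_{i=1}^{n} f_i} \]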
Example 2: From the following data, compute the arithmetic mean by the direct method, the shortcut method and the step deviation method.

Marks             0-10   10-20   20-30   30-40   40-50   50-60
No. of students      5      10      25      30      20      10
Solution: Let the assumed mean be A = 35 and the step size h = 10.

Calculation Table
Here m_i is the class mark, f_i the number of students, d_i = m_i − A the deviation and d_i′ = (m_i − A)/h the step deviation.

Marks    m_i   f_i   m_i f_i   d_i   f_i d_i   d_i′   f_i d_i′
0-10       5     5        25   -30      -150     -3        -15
10-20     15    10       150   -20      -200     -2        -20
20-30     25    25       625   -10      -250     -1        -25
30-40     35    30      1050     0         0      0          0
40-50     45    20       900    10       200      1         20
50-60     55    10       550    20       200      2         20
∑              100      3300            -200               -20

a. Direct Method:
\[ \mu = \frac{\sum_{i=1}^{6} m_i f_i}{\sum_{i=1}^{6} f_i} = \frac{3300}{100} = 33 \]
b. Shortcut Method:
\[ \mu = A + \frac{\sum_{i=1}^{6} f_i d_i}{\sum_{i=1}^{6} f_i} = 35 + \frac{-200}{100} = 35 - 2 = 33 \]
c. Step Deviation Method:
\[ \mu = A + \frac{\sum_{i=1}^{n} f_i d_i'}{\sum_{i=1}^{n} f_i} \times h = 35 + \frac{-20}{100} \times 10 = 33 \]
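The three grouped-data methods can be run side by side on the marks distribution of Example 2. This is a minimal sketch; the variable names are illustrative, while A = 35 and h = 10 are the values chosen in the solution.

```python
# Class limits and frequencies from Example 2.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 10, 25, 30, 20, 10]

A = 35   # assumed mean used in the solution
h = 10   # step size (class width)

marks = [(lo + hi) / 2 for lo, hi in classes]   # class marks m_i
N = sum(freqs)                                   # total frequency

# a. Direct method: sum(m_i * f_i) / N
direct = sum(m * f for m, f in zip(marks, freqs)) / N

# b. Shortcut method: A + sum(f_i * d_i) / N, with d_i = m_i - A
shortcut = A + sum(f * (m - A) for m, f in zip(marks, freqs)) / N

# c. Step deviation method: A + (sum(f_i * d_i') / N) * h, with d_i' = (m_i - A) / h
step = A + sum(f * (m - A) / h for m, f in zip(marks, freqs)) / N * h

print(direct, shortcut, step)
```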
Note: The answer is the same irrespective of the method used.
4.3.1.6. Merits of Arithmetic Mean
a. Easy to understand and calculate.
b. Takes all values into account.
c. Lends itself to further mathematical treatment.
d. Since the sum of all deviations from the arithmetic mean is zero, it is a point of balance or centre of gravity.
e. The sum of the squared deviations from the arithmetic mean is always the minimum.
4.3.1.7. Limitations of Arithmetic Mean
a. Affected significantly by extreme values.
b. Cannot be computed for an open-end class distribution without some assumptions.
c. May give fallacious conclusions if we depend totally on the arithmetic mean for decision-making.
d. Cannot be determined by inspection or graphically.
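4.3.1.8. Arithmetic Mean of Combined Data
\[ \mu = \frac{N_1 \mu_1 + N_2 \mu_2 + \dots + N_n \mu_n}{N_1 + N_2 + \dots + N_n} \]
Here N_1, N_2, …, N_n are the sizes of the individual data sets and μ_1, μ_2, …, μ_n are their arithmetic means.
4.3.2. Weighted Arithmetic Mean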
There are cases where the relative importance of the different items is not the same. In such cases, we need to compute the weighted arithmetic mean. The procedure is similar to the grouped data calculation studied earlier, where the frequency acts as a weight associated with the class mark. Suppose the data values are x_1, x_2, x_3, …, x_n and the associated weights are W_1, W_2, W_3, …, W_n. Then the weighted arithmetic mean is:
Direct Method
\[ \mu_w = \frac{W_1 x_1 + W_2 x_2 + \dots + W_n x_n}{W_1 + W_2 + \dots + W_n} = \frac{\sum W_i x_i}{\sum W_i} \]
4.3.2.1. Utility of Weighted Mean
Some of the common applications where the weighted mean is extensively used are:
a. Construction of index numbers, e.g. Consumer Price Index, BSE Sensex, etc., where different weights are associated with different items or shares.
b. Comparison of the results of two companies when their sizes are different.
c. Computation of standardized death and birth rates.
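To see why the weights matter, the weighted mean can be compared with the unweighted mean of the same values. The sketch below uses the salaries and staff counts of Example 4 below. The function name `weighted_mean` is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W_i * x_i) / sum(W_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Monthly salaries (Rs.) and head counts from Example 4:
# managers, cooks, waiters.
salaries = [3000, 1200, 1000]
staff = [2, 5, 8]

mean_salary = weighted_mean(salaries, staff)   # 20000 / 15
simple_mean = sum(salaries) / len(salaries)    # ignores the head counts

print(round(mean_salary, 2), round(simple_mean, 2))
```

The unweighted figure is higher because it gives the two managers the same importance as the eight waiters.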
Example 4: Pune University MBA [2770]-104
The management of a hotel has employed 2 managers, 5 cooks and 8 waiters. The monthly salaries of a manager, a cook and a waiter are Rs. 3000, Rs. 1200 and Rs. 1000 respectively. Find the mean salary of the employees. (Note: Although these salaries must be 10 to 15 years old, we use them only to learn the principle.)
Solution: Here we need to calculate the weighted average of the salaries, with the number of employees in each category as weights.
\[ \mu_w = \frac{W_1 x_1 + W_2 x_2 + \dots + W_n x_n}{W_1 + W_2 + \dots + W_n} = \frac{2 \times 3000 + 5 \times 1200 + 8 \times 1000}{2 + 5 + 8} = \text{Rs. } 1333.33 \]
4.3.3. Geometric Mean (GM)
It is defined as the nth root of the product of the n values of the data. If x_1, x_2, …, x_n are the values of the data, then the Geometric Mean is
\[ GM = \sqrt[n]{x_1 \times x_2 \times \dots \times x_n} \]
If the different values are not of equal importance and are assigned different weights, say w_1, w_2, …, w_n, then the weighted Geometric Mean is given by
\[ GM_w = \left( x_1^{w_1} \times x_2^{w_2} \times \dots \times x_n^{w_n} \right)^{1 / (w_1 + w_2 + \dots + w_n)} \]
The Geometric Mean is useful for finding the average percentage increase in sales, production, population, etc. It is the most representative average in the construction of index numbers.
Example 5: A person takes a home loan with a floating interest rate on reducing balance for a 10-year term. The interest rates, as they changed from year to year, are 5.5, 6.25, 7.5, 6.75, 8.25, 9.5, 10.5, 9, 8.25 and 7.5 percent. What was the average interest rate? Would it have been beneficial for him to take a fixed interest rate on reducing balance at 7.5% per annum?
Solution: The average interest rate can be found using the G.M. as follows. First we find the index by dividing the percentage rate by 100 and then adding 1. Then we take the G.M. of these indices as the average index. From it we can find the average interest rate.
\[ \text{Average index (G.M.)} = \sqrt[10]{1.055 \times 1.0625 \times 1.075 \times 1.0675 \times 1.0825 \times 1.095 \times 1.105 \times 1.09 \times 1.0825 \times 1.075} = \sqrt[10]{2.137} = 1.0789 \]
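The index-and-root calculation above can be reproduced in a few lines. This is a minimal sketch: the function name `geometric_mean` is an illustrative assumption, and `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


# Yearly floating rates (percent) from Example 5.
rates = [5.5, 6.25, 7.5, 6.75, 8.25, 9.5, 10.5, 9, 8.25, 7.5]

indices = [1 + r / 100 for r in rates]      # growth index for each year
average_index = geometric_mean(indices)
average_rate = (average_index - 1) * 100    # back to a percentage

print(round(math.prod(indices), 3), round(average_index, 4), round(average_rate, 2))
```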
Thus, the average interest rate = 7.89%. Hence, it would have been beneficial for him to take a fixed interest rate on reducing balance at 7.5% per annum.
4.3.4. Harmonic Mean (HM)
It is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations. Thus, the Harmonic Mean is
\[ HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]
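The definition translates directly into code. The sketch below applies it to the relay-team speeds of Example 6 below and compares the result with the simple average. The function name `harmonic_mean` and the positivity check are assumptions made for this illustration.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs strictly positive values")
    return len(values) / sum(1 / x for x in values)


# Lap speeds (km/hr) of the four relay drivers in Example 6.
speeds = [280, 360, 380, 310]

team_speed = harmonic_mean(speeds)
simple_average = sum(speeds) / len(speeds)

print(round(team_speed, 2), simple_average)
```

Each driver covers the same distance, so the slower laps take more time and pull the overall speed below the simple average of the four speeds.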
Example 6: A relay team has four members who have to drive four laps between two fixed points. The average speeds that the members can achieve are 280, 360, 380 and 310 km/hr. Find the average speed of the team over the event.
Solution: The average speed can be calculated as the Harmonic Mean (HM). Thus, the average speed of the team is
\[ HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{4}{\frac{1}{280} + \frac{1}{360} + \frac{1}{380} + \frac{1}{310}} = 327.69 \text{ km/hr} \]
4.3.5. Weighted Harmonic Mean
If a weight is attached to each observation, then the weighted Harmonic Mean is
\[ HM = \frac{w_1 + w_2 + \dots + w_n}{\frac{w_1}{x_1} + \frac{w_2}{x_2} + \dots + \frac{w_n}{x_n}} = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}} \]
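For a journey, the weights are the distances and each term w_i / x_i is the time spent on that leg, so the weighted Harmonic Mean is total distance divided by total time. The sketch below uses the four legs of Example 7 below. The function and variable names are illustrative assumptions.

```python
def weighted_harmonic_mean(values, weights):
    """sum(w_i) / sum(w_i / x_i); for a journey, w_i is distance and x_i is speed."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


# Legs of the sortie in Example 7: distance (km) flown at each speed (km/hr).
distances = [200, 250, 150, 300]
speeds = [700, 950, 1700, 800]

mission_time = sum(d / s for d, s in zip(distances, speeds))   # hours
average_speed = weighted_harmonic_mean(speeds, distances)       # km/hr

print(round(average_speed, 2), round(mission_time, 3))
```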
The Harmonic Mean is useful in computing the average rate of increase in profits, the average speed of a journey, the average price of articles sold, etc. For example, if an airplane travels distances w_1, w_2, w_3, …, w_n at speeds x_1, x_2, x_3, …, x_n km/hr respectively, then the average speed is equal to the weighted Harmonic Mean of the speeds, with the distances w_1, w_2, w_3, …, w_n as weights.
Example 7: An aircraft travels 200 km up to the border at a speed of 700 km/hr (economical), then 250 km up to the target in enemy territory at a speed of 950 km/hr. After dropping the bombs, it travels at a runaway speed of 1700 km/hr for 150 km up to our nearest border, and then at a speed of 800 km/hr to the base at a distance of 300 km. Find the average speed of the sortie. Also find the mission time.
Solution: For the average speed, we need to find the weighted Harmonic Mean. Thus the average sortie speed is
\[ HM = \frac{w_1 + w_2 + \dots + w_n}{\frac{w_1}{x_1} + \frac{w_2}{x_2} + \dots + \frac{w_n}{x_n}} = \frac{200 + 250 + 150 + 300}{\frac{200}{700} + \frac{250}{950} + \frac{150}{1700} + \frac{300}{800}} = 889.23 \text{ km/hr} \]
Mission time = 900 / 889.23 = 1.012 hr, i.e. approximately 1 hr.
4.4. Median (Md)
If the number of observations is odd, the median is the middle observation of the data arranged in order:
\[ M_d = \left( \frac{N + 1}{2} \right)^{\text{th}} \text{ observation} \]
If the number of observations is even, then the median is the arithmetic mean of the two middle observations.
\[ M_d = \frac{\left( \frac{N}{2} \right)^{\text{th}} \text{ observation} + \left( \frac{N}{2} + 1 \right)^{\text{th}} \text{ observation}}{2} \]
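The odd and even cases can be handled by one small routine. The sketch below is illustrative: it uses the entrance-test scores that appear in Example 9 later in this chapter, and drops the last score to obtain an odd-sized sample. The standard library's `statistics.median` follows the same rule and is used here only as a cross-check.

```python
import statistics


def median(values):
    """Middle observation for odd N; mean of the two middle ones for even N."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # the ((N + 1) / 2)-th observation
    return (ordered[mid - 1] + ordered[mid]) / 2   # (N/2)-th and (N/2 + 1)-th


# Entrance-test scores used in Example 9 later in the chapter.
scores = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
          17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

even_case = median(scores)        # N = 20: average of the 10th and 11th values
odd_case = median(scores[:-1])    # N = 19: the 10th ordered value

print(even_case, odd_case, statistics.median(scores))
```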
In the case of grouped data, we first find the value N/2. Then, from the cumulative frequencies, we find the class in which the (N/2)th item falls. Such a class is called the Median Class. The median is then calculated by the formula:
\[ M_d = L + \frac{\frac{N}{2} - pcf}{f} \times h \]
Where,
L = Lower limit of the median class.
N = Total frequency.
pcf = Cumulative frequency of the class preceding the median class.
f = Frequency of the median class.
h = Class interval of the median class.
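The steps of the grouped-data formula (find N/2, locate the median class from the cumulative frequencies, then interpolate) can be sketched as follows, using the age distribution of Example 8 below. The function name `grouped_median` and the assumption of equal-width classes are choices made for this illustration.

```python
from itertools import accumulate


def grouped_median(lower_limits, freqs, width):
    """L + ((N/2 - pcf) / f) * h, assuming equal-width classes."""
    total = sum(freqs)                       # N
    target = total / 2                       # position of the (N/2)-th item
    cumulative = list(accumulate(freqs))     # cumulative frequencies
    # Median class: first class whose cumulative frequency reaches N/2.
    k = next(i for i, c in enumerate(cumulative) if c >= target)
    pcf = cumulative[k - 1] if k > 0 else 0  # preceding cumulative frequency
    return lower_limits[k] + (target - pcf) / freqs[k] * width


# Age classes 20-25, 25-30, ..., 55-60 and worker counts from Example 8.
lower_limits = [20, 25, 30, 35, 40, 45, 50, 55]
workers = [14, 28, 33, 30, 20, 15, 13, 7]

md = grouped_median(lower_limits, workers, 5)
print(round(md, 2))
```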
Let us understand the logic of the formula. The median is the value of the (N/2)th observation. This observation falls in the median class, whose lower limit is L. The cumulative frequency of the class preceding the median class is pcf. Thus, the median observation is the (N/2 − pcf)th observation in the median class (counted from the lower limit of the median class). Now, if we consider that all f observations in the median class are evenly spaced from the lower limit L to the upper limit L + h, the value of the median can be found by simple proportion.
Example 8: Calculate the median for the following data.

Age              20-25   25-30   30-35   35-40   40-45   45-50   50-55   55-60
No. of Workers      14      28      33      30      20      15      13       7
Solution:

Age      Frequency (f)   Cumulative frequency (cf)
20-25               14                          14
25-30               28                          42
30-35               33                          75
35-40               30                         105
40-45               20                         125
45-50               15                         140
50-55               13                         153
55-60                7                         160
Now, N = 160, or N/2 = 80.
The 80th item lies in class 35-40. Hence, pcf = 75, f = 30, h = 5 and L = 35. Therefore, the median is
\[ M_d = L + \frac{\frac{N}{2} - pcf}{f} \times h = 35 + \frac{\frac{160}{2} - 75}{30} \times 5 = 35.83 \]
4.4.1. Mathematical Properties of Median
a. An important mathematical property of the median is that the sum of the absolute deviations about the median is a minimum, i.e. \( \sum |x - M_d| \) is minimum.
b. The median is affected by the total number of observations rather than by the values of the observations.
4.4.2. Merits of Median
a. Easy to determine and easy to explain.
b. Less distorted by extreme values than the arithmetic mean.
c. Can be computed for an open-end distribution.
d. The median is the only measure of central tendency that can be used for qualitative ranked data.
4.4.3. Demerits of Median
a. The data need to be rearranged in order. For a computer, sorting is an expensive operation.
b. In the case of an even number of observations, the median cannot be exactly determined.
c. Less familiar than the average.
d. Does not take into account the data values and their spread. It is insensitive to them.
e. Not capable of algebraic treatment.
4.5. Quantiles
Quantiles are related positional measures of central tendency. They are useful and frequently employed. The most familiar quantiles are quartiles, deciles and percentiles. We are familiar with percentile scores in competitive aptitude tests or the examinations of a few institutes. If your score is at the 90th percentile, it means that 90% of the candidates who took the test received a score lower than yours. Similarly, if your income is at the 95th percentile in your organisation, you are in the group of the top 5% highest-paid employees in your company.
4.5.1. Percentile
The Pth percentile of a group of observations is that observation below which lie P% (P percent) of the observations. The position of the Pth percentile is given by
\[ \frac{(n + 1) \times P}{100} \]
where n is the number of data points.
Example 9: In a computerized entrance test, 20 candidates appear on a particular day. Their scores are: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the 80th and 90th percentiles of the data.
Solution: First, we arrange the data in ascending order: 6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24.
The 80th percentile of the data set is the observation lying in the position
\[ \frac{(n + 1) \times P}{100} = \frac{(20 + 1) \times 80}{100} = 16.8 \]
Now, the 16th observation is 19 and the 17th observation is 20. Therefore, the 80th percentile is the point lying 0.8 of the way from 19 to 20, which is 19.8.
The 90th percentile is similarly found as the observation lying in the position
\[ \frac{(n + 1) \times P}{100} = \frac{(20 + 1) \times 90}{100} = 18.9 \]
The 18th observation is 21 and the 19th observation is 22. Therefore, the 90th percentile is the point lying 0.9 of the way from 21 to 22, which is 21.9.
4.5.2. Quartile
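Quartiles are the 25th, 50th and 75th percentiles, so the position rule (n + 1) × P / 100 of Section 4.5.1 covers them as well. The following is a minimal sketch of that rule with linear interpolation between neighbouring observations. The function name `percentile_by_position` and the clamping of positions that fall outside the data are assumptions of this sketch, and statistical packages may use other interpolation conventions.

```python
def percentile_by_position(values, p):
    """P-th percentile using the position rule (n + 1) * P / 100."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p / 100          # 1-based, possibly fractional
    position = min(max(position, 1), n)   # assumption: clamp to the data range
    whole = int(position)                 # observation at or just below the position
    fraction = position - whole
    if whole == n:                        # already at the largest observation
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


# Entrance-test scores from Examples 9 and 10.
scores = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
          17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

p80 = percentile_by_position(scores, 80)
p90 = percentile_by_position(scores, 90)
q1, q2, q3 = (percentile_by_position(scores, p) for p in (25, 50, 75))

print(round(p80, 2), round(p90, 2), q1, q2, q3)
```

The same function gives the deciles when P is set to 10, 20, 30 and so on.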
Example 10: In a computerized entrance test, 20 candidates appear on a particular day. Their scores are: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the quartiles of the data.
Solution: First, we arrange the data in ascending order: 6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24.
a) The first quartile is the observation in position
\[ \frac{(n + 1) \times 25}{100} = 5.25 \]
The value of the observation corresponding to the 5.25th position is 13.25.
b) The second quartile, or median, is the observation in position
\[ \frac{(n + 1) \times 50}{100} = 10.5 \]
The value of the observation corresponding to the 10.5th position is 16.
c) The third quartile is the observation in position
\[ \frac{(n + 1) \times 75}{100} = 15.75 \]
The value of the observation corresponding to the 15.75th position is 18.75.
Note: The 0th quartile is the same as the 0th percentile, which is the minimum observation. Similarly, the 4th quartile is the 100th percentile, which equals the maximum observation.
4.5.3. Deciles
These are the values which divide the total number of observations into 10 equal parts. There are 11 deciles (including the 0th and the 10th). The method of calculating deciles is the same as that for percentiles. We can use the percentile formula by substituting P = 10, 20, 30, etc. for the 1st, 2nd, 3rd, etc. deciles.
4.6. Mode
The mode of a data set is the value that occurs most frequently. There are many situations in which the arithmetic mean and the median fail to reveal the true characteristic of the data (the most representative figure), e.g. the most common size of shoes or the most common size of garments. In such cases the mode is the best-suited measure of central tendency. There could be multiple modal values, which occur with equal frequency. In some cases the mode may be absent. For grouped data, the modal class is defined as the class with the maximum frequency. Then the mode is calculated as:
\[ \text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h \]
Where,
L = Lower limit of the modal class.
Δ1 = Difference between the frequency of the modal class and that of the preceding class.
Δ2 = Difference between the frequency of the modal class and that of the succeeding class.
h = Size of the modal class.
Example 11: In a computerized entrance test, 20 candidates appear on a particular day. Their scores are: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the mode of the data.
Solution: The value 16 occurs 3 times, which is the maximum for any observation. Therefore, Mode = 16.
Example 12: In a computerized entrance test, 19 candidates appear on a particular day. Their scores are: 9, 6, 12, 10, 13, 15, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the mode of the data.
Solution: The values 14, 16, 17 and 18 each occur 2 times, which is the maximum for any observation. Therefore, the modes are 14, 16, 17 and 18 (this is a multimodal distribution).
Example 13: In a computerized entrance test, 15 candidates appear on a particular day. Their scores are: 9, 6, 12, 10, 13, 15, 14, 16, 24, 21, 22, 19, 18, 20, 17. Find the mode of the data.
Solution: No value occurs more than once. Therefore, the data has no mode.
4.7. Relationship Among Mean, Median and Mode
A distribution in which the mean, the median and the mode coincide is known as a symmetrical (bell-shaped) distribution. The Normal Distribution is one such symmetric distribution, and it is very commonly used. If the distribution is skewed, the mean, the median and the mode are not equal. In a moderately skewed distribution, the distance between the mean and the median is approximately one third of the distance between the mean and the mode. This can be expressed as:
Mean – Median = (Mean – Mode) / 3
Mode = 3 × Median – 2 × Mean
Thus, if we know the values of two of these measures, the third can be approximately determined in any moderately skewed distribution. In any skewed distribution the median lies between the mean and the mode. In the case of a right-skewed (positively skewed) distribution, which has a long right tail, Mode < Median < Mean.
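The empirical relation is simple arithmetic, and a short sketch shows how any one of the three measures follows from the other two. The function names and the summary figures below are hypothetical, chosen only to illustrate the rearrangement, and the result is an approximation that holds only for moderately skewed data.

```python
def estimate_mode(mean, median):
    """Empirical relation for moderately skewed data: Mode = 3*Median - 2*Mean."""
    return 3 * median - 2 * mean


def estimate_mean(median, mode):
    """The same relation rearranged: Mean = (3*Median - Mode) / 2."""
    return (3 * median - mode) / 2


# Hypothetical summary figures for a right-skewed distribution (illustration only).
mean, median = 52.0, 50.0

mode = estimate_mode(mean, median)                 # 3*50 - 2*52
recovered_mean = estimate_mean(median, mode)       # should give back the mean
gap_ratio = (mean - median) / (mean - mode)        # the one-third rule

print(mode, recovered_mean, round(gap_ratio, 4))
```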
Example Set
Example 17: The inflation rate in percent for the past six months is given as 5.5, 6.2, 7.2, 6, 6.5 and 5.9. Find the average inflation rate over the past six months.
Solution: The average inflation rate can be found using the G.M. as follows. First we find the index by dividing the percentage rate by 100 and then adding 1. Then we take the G.M. of these indices as the average index. From it we can find the average inflation rate.
\[ \text{Average index (G.M.)} = \sqrt[6]{1.055 \times 1.062 \times 1.072 \times 1.06 \times 1.065 \times 1.059} = \sqrt[6]{1.4359} = 1.062 \]
Thus, the average inflation rate = 6.2%.
Example 18: Pune University MBA [2875]104
The expenditure of 1000 families is given below:

Expenditure in Rs.    40-59   60-79   80-99   100-119   120-139
Number of Families       50       -     500         -        50
The median of the distribution is Rs. 87. Calculate the missing frequencies and, for the completed distribution table, calculate the mode.
Solution:
i) Let the missing frequency of class 60-79 be x. Since the total frequency is 1000, the frequency of class 100-119 is (1000 – 50 – x – 500 – 50) = 400 – x. Since the median is given as 87, the median class is 80-99. Now,
\[ M_d = L + \frac{\frac{N}{2} - pcf}{f} \times h \]
Where,
L = 80 (lower limit of the median class).
N = 1000 (total frequency).
pcf = 50 + x (cumulative frequency of the class preceding the median class).
f = 500 (frequency of the median class).
h = 20 (class interval of the median class).
Thus,
\[ 87 = 80 + \frac{500 - (50 + x)}{500} \times 20 \;\Rightarrow\; 7 \times 25 = 500 - (50 + x) \;\Rightarrow\; 50 + x = 325 \]
Or, x = 275.
Thus the missing frequency of class 60-79 is 275. Also, the frequency of class 100-119 is (400 – x) = 125.
ii) Since the highest frequency is in class 80-99, it is the modal class. Now,
\[ \text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h \]
Where,
L = 80 (lower limit of the modal class).
Δ1 = 225 (difference between the frequency of the modal class and that of the preceding class).
Δ2 = 375 (difference between the frequency of the modal class and that of the succeeding class).
h = 20 (size of the modal class).
\[ \text{Mode} = 80 + \frac{225}{225 + 375} \times 20 = 80 + 7.5 = 87.5 \]
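Both parts of Example 18 can be verified with a short script: the median formula is rearranged to recover the preceding cumulative frequency, and the grouped mode formula is then applied to the completed table. This is a sketch only. The function name `grouped_mode` and the variable names are illustrative, and L = 80 and h = 20 are taken from the solution above.

```python
def grouped_mode(lower_limit, f_modal, f_prev, f_next, width):
    """Mode = L + delta1 / (delta1 + delta2) * h for the modal class."""
    delta1 = f_modal - f_prev    # modal class minus preceding class
    delta2 = f_modal - f_next    # modal class minus succeeding class
    return lower_limit + delta1 / (delta1 + delta2) * width


# Known quantities from Example 18.
N = 1000
median_given = 87
L, h, f_median = 80, 20, 500
f_first = 50                 # frequency of class 40-59

# Rearranging median = L + (N/2 - pcf) / f * h to solve for pcf.
pcf = N / 2 - (median_given - L) * f_median / h
x = pcf - f_first                        # missing frequency of class 60-79
y = N - (50 + x + 500 + 50)              # missing frequency of class 100-119

mode = grouped_mode(L, f_median, x, y, h)
print(x, y, mode)
```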
Example 20: JHU MBA [102] 2004
The following data are scores on a management examination taken by a group of 22 people: 88, 56, 64, 45, 52, 76, 54, 79, 38, 98, 69, 77, 71, 45, 60, 78, 90, 81, 87, 44, 80, 41. Find the mean, median, standard deviation and 60th percentile.
Solution: Number of observations N = 22.
a)
\[ \bar{X} = \frac{\sum_{i=1}^{n} x_i}{N} = \frac{88 + 56 + 64 + 45 + 52 + 76 + 54 + 79 + 38 + 98 + 69 + 77 + 71 + 45 + 60 + 78 + 90 + 81 + 87 + 44 + 80 + 41}{22} = 66.9545 \]
b) To calculate the median, we need to arrange the data in ascending order as follows:
38, 41, 44, 45, 45, 52, 54, 56, 60, 64, 69, 71, 76, 77, 78, 79, 80, 81, 87, 88, 90, 98
Since the number of observations is even, the median is
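\[ M_d = \frac{\left( \frac{N}{2} \right)^{\text{th}} \text{ observation} + \left( \frac{N}{2} + 1 \right)^{\text{th}} \text{ observation}}{2} = \frac{11^{\text{th}} \text{ observation} + 12^{\text{th}} \text{ observation}}{2} = \frac{69 + 71}{2} = 70 \]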
c)
\[ P^{\text{th}} \text{ percentile} = \left( \frac{(n + 1) \times P}{100} \right)^{\text{th}} \text{ observation} \]
\[ 60^{\text{th}} \text{ percentile} = \left( \frac{(22 + 1) \times 60}{100} \right)^{\text{th}} \text{ observation} = 13.8^{\text{th}} \text{ observation} \]
Since the position is fractional, we need to interpolate between the 13th and 14th observations. The 13th observation is 76 and the 14th observation is 77. Thus, by interpolation, 60th percentile = 13.8th observation = 76.8.
4.9. Exercise
6. Calculate the arithmetic mean and mode from the following:

Monthly salary (Rs.)   400-600   600-800   800-1000   1000-1200   1200-1400
Number of Workers            4        10         12           6           2

Ans: Mean = 852.94, Mode = 850
Pune University BBA [2791]-203