1. Measures of Central Tendency, Arithmetic Mean, Median, Mode, Harmonic, Geometric Mean

Measures of central tendency, also called averages of the first order, locate the most central value of a data set. The three most common are the arithmetic mean, the median, and the mode. Two further averages are also used: the geometric mean, which suits multiplicative quantities such as index numbers, and the harmonic mean, which suits rates and ratios and is often used by financial institutions.

Measures of Central Tendency

  • Arithmetic Mean
  • Median
  • Mode
  • Harmonic Mean
  • Geometric Mean

Arithmetic Mean

The arithmetic mean is an average that gives the central value of the data. For ungrouped data, it is the sum of all values divided by the number of values. It is denoted by:

    \[ \overline{\mathbf{X}}\  \]

Properties of Arithmetic Mean

(i) The sum of the deviations of the values from their mean is always zero.

    \[ Ungrouped\ data\ \sum\left( X - \overline{X} \right) = 0,\  \]

    \[  Grouped\ data\ \sum f\left( X - \overline{X} \right) = 0\ \]

(ii) The sum of squared deviations from the mean is less than the sum of squared deviations from any arbitrary origin A other than the mean.

    \[  Ungrouped\ data\ \sum\left( X - \overline{X} \right)^{2} < \sum(X - A)^{2}\ \]

    \[  Grouped\ data\ \sum f\left( X - \overline{X} \right)^{2} < \sum f(X - A)^{2}\ \]

(iii) The arithmetic mean has an additive property: given the means and sizes of several distributions, we can calculate the combined mean of all of them.

    \[ Combined\ Mean\ \overline{X}_{c} = \frac{n_{1}\overline{X}_{1} + n_{2}\overline{X}_{2} + n_{3}\overline{X}_{3} + \ldots + n_{k}\overline{X}_{k}}{n_{1} + n_{2} + n_{3} + \ldots + n_{k}}\ \]

(iv) If we know the mean of one variable and the regression equation linking it to another variable, we can calculate the mean of the other variable.

    \[ a = \overline{Y} - b\overline{X},\ so\ \overline{Y} = a + b\overline{X},\ where\ a\ is\ the\ intercept\ and\ b\ is\ the\ regression\ coefficient.\  \]

(v) If every value in the distribution equals the same constant, then the mean of that distribution is that constant.
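Properties (i) to (iii) can be checked numerically. The sketch below uses a small illustrative data set that is not taken from the text, and the names `sum_sq`, `group_1`, and `group_2` are chosen only for this example.

```python
# Illustrative data (an assumption for this sketch, not from the article).
values = [2, 4, 6, 8, 10]
mean = sum(values) / len(values)

# (i) Deviations of the values from their mean sum to zero.
sum_dev = sum(x - mean for x in values)


# (ii) Squared deviations are smallest when measured from the mean.
def sum_sq(origin):
    """Sum of squared deviations of the values from a chosen origin."""
    return sum((x - origin) ** 2 for x in values)


# (iii) Combined mean of two groups from their sizes and means.
group_1, group_2 = values[:2], values[2:]
n1, n2 = len(group_1), len(group_2)
mean_1, mean_2 = sum(group_1) / n1, sum(group_2) / n2
combined = (n1 * mean_1 + n2 * mean_2) / (n1 + n2)

print(sum_dev, sum_sq(mean), sum_sq(5), combined)
```

The combined mean of the two groups matches the mean of the whole data set, which is what the additive property states.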

Formulas for Arithmetic Mean

We have three different methods to calculate the arithmetic mean:

  • Direct Method
  • Indirect or Short-Cut Method
  • Step Deviation or Coding Method

Formulas for ungrouped data

(1)

    \[ Direct\ Method\ A.M\ \overline{X} = \frac{\sum X}{n}\  \]

(2)

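    \[ Indirect\ Method\ A.M\ \overline{X} = A + \frac{\sum D}{n},\ where\ D = X - A\  \]

(3)

    \[ Step\ Deviation\ or\ Coding\ Method\ A.M\ \overline{X} = A + \frac{\sum U}{n}\ \times C,\ where\ U = \frac{X - A}{C}\  \]

Here n is the number of values, A is an arbitrary origin, and C is a common divisor of the deviations D.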

Example for Understanding ungrouped data

The students of the class obtained marks in the subject of statistics as follows:

| Student | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Marks | 20 | 60 | 40 | 80 | 90 | 50 | 60 | 40 |

Calculate the arithmetic mean through the three methods given above.

Solution:

| Students | Marks Obtained (X) | D = X - A | U = (X - A)/C |
|---|---|---|---|
| A | 20 | -30 | -3 |
| B | 60 | 10 | 1 |
| C | 40 | -10 | -1 |
| D | 80 | 30 | 3 |
| E | 90 | 40 | 4 |
| F | 50 (selected as A) | 0 | 0 |
| G | 60 | 10 | 1 |
| H | 40 | -10 | -1 |
| Sum | ∑X = 440 | ∑D = 40 | ∑U = 4 |

Note: “A” is an arbitrary origin; you may choose any of the X values (here A = 50).

“C” is a common divisor that divides all values of “D” exactly (here C = 10).

    \[ (i)Direct\ Method\ A.M\ \overline{X} = \frac{\sum X}{n} = \frac{440}{8} = 55\ \]

    \[ (ii)Indirect\ Method\ A.M\ \overline{X} = A + \frac{\sum D}{n}\ \]

    \[ Indirect\ Method\ A.M\ \overline{X} = 50 + \frac{40}{8} = 55\ \]

    \[ (iii)Step\ Deviation\ or\ Coding\ Method\ A.M\ \overline{X} = A + \frac{\sum U}{n}\ \times C\ \]

    \[ Step\ Deviation\ or\ Coding\ Method\ A.M\ \overline{X} = 50 + \frac{4}{8}\ \times 10 = 55\ \]
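The three methods can also be checked with a short script. This is a minimal sketch using the marks from the example above; the variable names are chosen for illustration only.

```python
# Marks of students A to H from the worked example.
marks = [20, 60, 40, 80, 90, 50, 60, 40]
n = len(marks)

A = 50  # arbitrary origin, as selected in the example
C = 10  # common divisor of the deviations

# (i) Direct method: sum of the values divided by their count.
mean_direct = sum(marks) / n

# (ii) Indirect (short-cut) method: use deviations D = X - A.
D = [x - A for x in marks]
mean_indirect = A + sum(D) / n

# (iii) Step-deviation (coding) method: use U = (X - A) / C.
U = [(x - A) / C for x in marks]
mean_coded = A + (sum(U) / n) * C

print(mean_direct, mean_indirect, mean_coded)
```

All three methods give 55, because the short-cut and coding methods only shift and rescale the data before averaging and then undo that change.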

Example for understanding grouped data

Calculate arithmetic mean by using:

(a) Direct method

(b) Indirect/Shortcut Method

(c) Step-deviation/Coding method

| Marks | No. of Students |
|---|---|
| 30–39 | 1 |
| 40–49 | 3 |
| 50–59 | 11 |
| 60–69 | 21 |
| 70–79 | 43 |
| 80–89 | 32 |
| 90–99 | 9 |

Solution:

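| Marks | f | Class Boundaries | X | fX | D = X - 64.5 | fD | U = (X - 64.5)/10 | fU |
|---|---|---|---|---|---|---|---|---|
| 30–39 | 1 | 29.5–39.5 | 34.5 | 34.5 | -30 | -30 | -3 | -3 |
| 40–49 | 3 | 39.5–49.5 | 44.5 | 133.5 | -20 | -60 | -2 | -6 |
| 50–59 | 11 | 49.5–59.5 | 54.5 | 599.5 | -10 | -110 | -1 | -11 |
| 60–69 | 21 | 59.5–69.5 | 64.5 | 1354.5 | 0 | 0 | 0 | 0 |
| 70–79 | 43 | 69.5–79.5 | 74.5 | 3203.5 | 10 | 430 | 1 | 43 |
| 80–89 | 32 | 79.5–89.5 | 84.5 | 2704 | 20 | 640 | 2 | 64 |
| 90–99 | 9 | 89.5–99.5 | 94.5 | 850.5 | 30 | 270 | 3 | 27 |
| Sum | ∑f = 120 | | | ∑fX = 8880 | | ∑fD = 1140 | | ∑fU = 114 |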

(i)  Direct Method A.M

    \[  \mathbf{\ }\overline{\mathbf{X}}\mathbf{=}\frac{\mathbf{\sum fX}}{\mathbf{\sum f}}\mathbf{=}\frac{\mathbf{8880}}{\mathbf{120}}\mathbf{= 74}\ \]

(ii) Indirect or Shortcut Method

    \[ \mathbf{A.M\ }\overline{\mathbf{X}}\mathbf{= A + \ }\frac{\mathbf{\sum fD}}{\mathbf{\sum f}}\  \]

    \[ \mathbf{A.M\ }\overline{\mathbf{X}}\mathbf{= 64.5 +}\frac{\mathbf{1140}}{\mathbf{120}}\mathbf{= 74}\ \]

(iii) Step-deviation or Coding Method (h = 10 is the class width)

    \[  \mathbf{A.M\ }\overline{\mathbf{X}}\mathbf{= A + \ }\frac{\mathbf{\sum fU}}{\mathbf{\sum f}}\mathbf{\times h}\ \]

    \[ \mathbf{\ A.M}\overline{\mathbf{X}}\mathbf{= 64.5 + \ }\frac{\mathbf{114}}{\mathbf{120}}\mathbf{\times 10}\ \]

    \[ \mathbf{\ A.M}\overline{\mathbf{X}}\mathbf{= 64.5 + \ }\frac{\mathbf{1140}}{\mathbf{120}}\mathbf{= 64.5 + 9.5 = 74}\ \]
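The grouped calculation follows the same pattern, with each class mark weighted by its frequency. The sketch below reuses the class marks and frequencies from the table above; the variable names are illustrative.

```python
# Class marks (midpoints) X and frequencies f from the solution table.
midpoints = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
freqs = [1, 3, 11, 21, 43, 32, 9]

A = 64.5  # arbitrary origin used in the example
h = 10    # class width

total_f = sum(freqs)
pairs = list(zip(freqs, midpoints))

# (i) Direct method: sum(fX) / sum(f).
sum_fx = sum(f * x for f, x in pairs)
mean_direct = sum_fx / total_f

# (ii) Indirect method: A + sum(fD) / sum(f), with D = X - A.
sum_fd = sum(f * (x - A) for f, x in pairs)
mean_indirect = A + sum_fd / total_f

# (iii) Step-deviation method: A + sum(fU) / sum(f) * h, with U = (X - A) / h.
sum_fu = sum(f * (x - A) / h for f, x in pairs)
mean_coded = A + sum_fu / total_f * h

print(total_f, sum_fx, sum_fd, sum_fu)
print(mean_direct, mean_indirect, mean_coded)
```

The intermediate sums reproduce the totals of the table (120, 8880, 1140, 114), and each method returns a mean of 74.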

Median

The median is another type of average. It is the middle value of the data once the data have been arranged in order. Its main properties are discussed below:

Properties of Median

  1. Ordering of the data: the median must be calculated after arranging the data in ascending order.
  2. Not affected by extreme values: the median depends only on the middle values, so extreme outliers do not affect it.
  3. Ordinal data: the median is not limited to numerical data; it can also be found for ordinal data, that is, categories that have a natural order.
  4. Positional value: the median is called a positional average. After ordering the data, for an odd number of values the median is the middle value; for an even number of values it is half the sum of the two middle values.
  5. Robust for skewed distributions: when the data concentrate on one side of the distribution, either left or right, the median usually represents the centre better than the arithmetic mean, because it is not pulled towards the long tail.

Formulas for Median

The formula for ungrouped data:

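    \[  Median\ \widetilde{X} = The\ Value\ of\ \left( \frac{n + 1}{2} \right)th\ item.\ \]

The formula for grouped data:

    \[ Median\ \widetilde{X} = L + \frac{h}{f}\left( \frac{n}{2} - C \right)\  \]

Here L is the lower class boundary of the median class, h is the class width, f is the frequency of the median class, n is the total frequency, and C is the cumulative frequency of the class preceding the median class.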

Example for understanding ungrouped even data:

There are 10 students in a class who got the following marks in the paper of Statistics. Calculate median.

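| Students | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Marks | 55 | 45 | 40 | 68 | 90 | 50 | 40 | 30 | 25 | 90 |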

Solution:

Arranging the data

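| Students | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Marks | 25 | 30 | 40 | 40 | 45 | 50 | 55 | 68 | 90 | 90 |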

    \[ Median\ \widetilde{X} = The\ Value\ of\ \left( \frac{n + 1}{2} \right)th\ item.\ \]

    \[ Median\ \widetilde{X} = The\ Value\ of\ \left( \frac{10 + 1}{2} \right)th\ item.\ \]

    \[ Median\ \widetilde{X} = The\ value\ of\ 5.5th\ item\ \]

    \[ Median\ \widetilde{X} = 5th\ value + 0.5\ (6th - 5th\ value)\ \]

    \[ Median\ \widetilde{X} = 45 + 0.5(50 - 45)\ \]

    \[ Median\ \widetilde{X} = 45 + 2.5\ \]

    \[ Median\ \widetilde{X} = 47.5\ \]

Example for understanding ungrouped odd data:

There are 9 students in a class who got the following marks in the paper of Statistics. Calculate median.

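| Students | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Marks | 55 | 45 | 40 | 68 | 90 | 50 | 40 | 30 | 25 |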

Solution:

Arranging the data

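| Students | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Marks | 25 | 30 | 40 | 40 | 45 | 50 | 55 | 68 | 90 |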

    \[ Median\ \widetilde{X} = The\ Value\ of\ \left( \frac{n + 1}{2} \right)th\ item.\  \]

    \[ Median\ \widetilde{X} = The\ Value\ of\ \left( \frac{9 + 1}{2} \right)th\ item.\  \]

    \[ Median\ \widetilde{X} = The\ value\ of\ 5th\ item\  \]

    \[  Median\ \widetilde{X} = 45\ \]
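Both ungrouped examples can be reproduced in code. The helper `median_by_position` below is a hypothetical name written for this sketch; it follows the (n + 1)/2 position rule from the text and is compared with `statistics.median` from the Python standard library.

```python
import statistics


def median_by_position(values):
    """Median as the value of the (n + 1)/2-th item of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2         # 1-based position; ends in .5 when n is even
    lower = int(position)          # whole part of the position
    fraction = position - lower    # 0.0 for odd n, 0.5 for even n
    value = ordered[lower - 1]
    if fraction:
        # Move half-way from the lower item towards the next item.
        value += fraction * (ordered[lower] - ordered[lower - 1])
    return value


even_marks = [55, 45, 40, 68, 90, 50, 40, 30, 25, 90]
odd_marks = [55, 45, 40, 68, 90, 50, 40, 30, 25]

print(median_by_position(even_marks), statistics.median(even_marks))
print(median_by_position(odd_marks), statistics.median(odd_marks))
```

For the ten marks the 5.5th item lies half-way between 45 and 50, and for the nine marks the 5th item is 45, in agreement with the hand calculations.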

Example for understanding grouped data:

Find median for the following data:

| Weight | 35–39 | 40–44 | 45–49 | 50–54 | 55–59 | 60–64 |
|---|---|---|---|---|---|---|
| Frequency | 3 | 10 | 21 | 15 | 1 | 4 |

Solution

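| Weight | f | Class Boundaries | X | C.F |
|---|---|---|---|---|
| 35–39 | 3 | 34.5–39.5 | 37 | 3 |
| 40–44 | 10 | 39.5–44.5 | 42 | 13 |
| 45–49 | 21 | 44.5–49.5 | 47 | 34 |
| 50–54 | 15 | 49.5–54.5 | 52 | 49 |
| 55–59 | 1 | 54.5–59.5 | 57 | 50 |
| 60–64 | 4 | 59.5–64.5 | 62 | 54 |
| Sum | ∑f = 54 = n | | | |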

Selection of the median class: n/2 = 54/2 = 27, which falls in the class whose cumulative frequency is 34, that is, the class 45–49.

So the lower class boundary L is 44.5.

h is the class width, equal to the difference between consecutive class marks X, which is 5.

f = 21 and C = 13, the cumulative frequency of the preceding class.

    \[ \mathbf{Median\ \ }\widetilde{\mathbf{X}}\mathbf{= L +}\frac{\mathbf{h}}{\mathbf{f}}\mathbf{(}\frac{\mathbf{n}}{\mathbf{2}}\mathbf{- \ C)}\  \]

    \[ \mathbf{Median\ \ }\widetilde{\mathbf{X}}\mathbf{= 44.5 +}\frac{\mathbf{5}}{\mathbf{21}}\left( \frac{\mathbf{54}}{\mathbf{2}}\mathbf{- \ 13} \right)\mathbf{= 47.83}\  \]
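The grouped-median steps (find the class containing the n/2-th item, then interpolate inside it) can be sketched as follows. The function name `grouped_median` and its parameters are assumptions made for this illustration, and equal class widths are assumed.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median of grouped data: L + (h / f) * (n / 2 - C)."""
    n = sum(freqs)
    cumulative = 0  # C: cumulative frequency before the current class
    for L, f in zip(lower_boundaries, freqs):
        if cumulative + f >= n / 2:
            # This class contains the (n / 2)-th item: interpolate within it.
            return L + (width / f) * (n / 2 - cumulative)
        cumulative += f
    raise ValueError("frequency table is empty")


lower_boundaries = [34.5, 39.5, 44.5, 49.5, 54.5, 59.5]
freqs = [3, 10, 21, 15, 1, 4]

median = grouped_median(lower_boundaries, freqs, width=5)
print(round(median, 2))
```

With n/2 = 27 the loop stops at the class with lower boundary 44.5, where C = 13 and f = 21, giving about 47.83.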

Mode

The mode is also a type of average. In a dataset, the mode is the most frequent, or most repeated, value. Its key properties are given below:

Properties of Mode

  1. Multiple modes: a dataset can have more than one mode, because several values may share the highest frequency.
  2. Not affected by extreme values: the mode depends only on the most frequent value, so extreme outliers do not affect it.
  3. Nominal data: the mode is useful for nominal data, for example the most frequently sold bike model in a year.
  4. May not exist: a dataset has no mode if every observation appears only once.
  5. Applicability: the mode is best suited to categorical or discrete data, but it is also used for continuous data.

Formulas for Mode

Formula for ungrouped data

    \[  \mathbf{Mode\ }\widehat{\mathbf{X}}\mathbf{= The\ most\ repeated\ value}\ \]

Formula for grouped data

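    \[ \mathbf{Mode\ }\widehat{\mathbf{X}}\mathbf{= L + \ }\frac{\mathbf{f_{m} - f_{1}}}{\left( \mathbf{f_{m} - f_{1}} \right)\mathbf{+ (f_{m} - f_{2})}}\mathbf{\ \times h}\ \]

Here L is the lower class boundary of the modal class, fm is the frequency of the modal class, f1 and f2 are the frequencies of the classes before and after it, and h is the class width.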

Example for understanding ungrouped data:

A survey of 10 persons about their monthly income gave the following results:

| Persons | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Income | 100 | 150 | 300 | 450 | 500 | 300 | 600 | 650 | 600 | 700 |

Calculate Mode

Solution

Arranging the data

| Persons | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Income | 100 | 150 | 300 | 300 | 450 | 500 | 600 | 600 | 650 | 700 |

    \[ \mathbf{Mode\ }\widehat{\mathbf{X}}\mathbf{=}The\ most\ repeated\ value\ \]

The most repeated values are 300 and 600, each occurring twice, so this data set has two modes.
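Counting frequencies is easy to automate. The sketch below counts the incomes with `collections.Counter` and, as a cross-check, calls `statistics.multimode`, which is available in Python 3.8 and later; the variable names are illustrative.

```python
from collections import Counter
import statistics

incomes = [100, 150, 300, 450, 500, 300, 600, 650, 600, 700]

# Count how often each income occurs, then keep the most frequent values.
counts = Counter(incomes)
highest = max(counts.values())
modes = sorted(value for value, count in counts.items() if count == highest)

print(modes)
print(statistics.multimode(incomes))
```

Both approaches report 300 and 600, each of which occurs twice.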

Example for understanding grouped data:

The weights of the 40 male students at a college are given in the following frequency table:

| Weight | 118–126 | 127–135 | 136–144 | 145–153 | 154–162 | 163–171 | 172–180 |
|---|---|---|---|---|---|---|---|
| Frequency | 3 | 5 | 9 | 12 | 5 | 4 | 2 |

Calculate mode.

Solution

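| Weight | f | Class Boundaries | X |
|---|---|---|---|
| 118–126 | 3 | 117.5–126.5 | 122 |
| 127–135 | 5 | 126.5–135.5 | 131 |
| 136–144 | 9 | 135.5–144.5 | 140 |
| 145–153 | 12 | 144.5–153.5 | 149 |
| 154–162 | 5 | 153.5–162.5 | 158 |
| 163–171 | 4 | 162.5–171.5 | 167 |
| 172–180 | 2 | 171.5–180.5 | 176 |
| Sum | ∑f = 40 | | |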

Selection of the modal class: the maximum frequency is 12, so the modal class is 145–153.

    \[ L = 144.5,\ f_{m} = 12,\ f_{1} = 9,\ f_{2} = 5\ and\ class\ width\ h = 9\ \]

    \[  \mathbf{Mode\ }\widehat{\mathbf{X}}\mathbf{=}L + \ \frac{f_{m} - f_{1}}{(f_{m} - f_{1}) + (f_{m} - f_{2})} \times h\ \]

    \[ \mathbf{Mode\ }\widehat{\mathbf{X}} = 144.5 + \ \frac{12 - 9}{(12 - 9) + (12 - 5)} \times 9\ \]

    \[ \mathbf{Mode\ }\widehat{\mathbf{X}} = 144.5 + \frac{27}{10} = 147.2\ \]
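The grouped-mode formula can be sketched in code as well. The function `grouped_mode` is a hypothetical helper written for this example. It assumes equal class widths and a single modal class, and it treats a missing neighbouring class as having frequency 0, which is an assumption of this sketch rather than something stated in the text.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode of grouped data: L + (fm - f1) / ((fm - f1) + (fm - f2)) * h."""
    # Index of the modal class (the first class with the highest frequency).
    m = max(range(len(freqs)), key=freqs.__getitem__)
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0               # class before the modal class
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0  # class after the modal class
    return lower_boundaries[m] + (fm - f1) / ((fm - f1) + (fm - f2)) * width


lower_boundaries = [117.5, 126.5, 135.5, 144.5, 153.5, 162.5, 171.5]
freqs = [3, 5, 9, 12, 5, 4, 2]

mode = grouped_mode(lower_boundaries, freqs, width=9)
print(round(mode, 1))
```

For the weights table the modal class starts at 144.5 with fm = 12, f1 = 9, and f2 = 5, so the result rounds to 147.2.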

Harmonic Mean

The harmonic mean is also a type of average. It is useful when dealing with rates and ratios, or more generally when the reciprocals of the values are the natural quantities to average. It is considered an important average in financial and monetary institutions because it gives less weight to large values and so reduces overestimation.

Properties of Harmonic Mean

Inequality

For positive values, the harmonic mean is always less than or equal to the arithmetic mean.

Reciprocal Relationship

It is used for data in which the values have reciprocal relationships, or in which the values are rates and ratios.

Limited Applicability

As discussed above, it suits reciprocal quantities such as time, speed, rates, and ratios. It is not suitable for other types of data, and it cannot be computed if any value is zero.

Weighted Harmonic Mean

We can assign weights to the values according to their sensitivity or importance and calculate the harmonic mean of the weighted data, which is called the weighted harmonic mean.

Independence from Order and Grouping

Like the arithmetic mean, the harmonic mean does not depend on the order of the values, so there is no need to sort or group the data before calculating it.

Formulas of Harmonic Mean

Formula for Harmonic Mean Ungrouped Data

    \[  \mathbf{Harmonic\ Mean\ H.M =}\frac{\mathbf{n}}{\mathbf{\sum}\left( \frac{\mathbf{1}}{\mathbf{X}} \right)}\ \]

Formula for Harmonic Mean Grouped Data

    \[ \mathbf{Harmonic\ Mean\ H.M =}\frac{\mathbf{\sum f}}{\mathbf{\sum}\left( \frac{\mathbf{f}}{\mathbf{X}} \right)}\   \]

Example for understanding ungrouped data:

ABC Company's investment returns for 10 years, expressed as fractions, are given below:

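| Year | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 |
|---|---|---|---|---|---|---|---|---|---|---|
| Return | 0.10 | 0.15 | 0.08 | 0.30 | 0.20 | 0.25 | 0.15 | 0.10 | 0.25 | 0.20 |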

Calculate Harmonic Mean

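| Year | Return (X) | 1/X |
|---|---|---|
| 2011 | 0.10 | 10.0 |
| 2012 | 0.15 | 6.7 |
| 2013 | 0.08 | 12.5 |
| 2014 | 0.30 | 3.3 |
| 2015 | 0.20 | 5.0 |
| 2016 | 0.25 | 4.0 |
| 2017 | 0.15 | 6.7 |
| 2018 | 0.10 | 10.0 |
| 2019 | 0.25 | 4.0 |
| 2020 | 0.20 | 5.0 |
| Sum | | 67.2 |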

    \[  \mathbf{Harmonic\ Mean\ H.M =}\frac{\mathbf{n}}{\mathbf{\sum}\left( \frac{\mathbf{1}}{\mathbf{X}} \right)}\ \]

    \[  \mathbf{Harmonic\ Mean\ H.M =}\frac{\mathbf{10}}{\mathbf{67.2}}\mathbf{= 0.1488\ or\ 14.88\%}\ \]
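The same calculation can be run without rounding the reciprocals. The sketch below applies the formula directly and compares it with `statistics.harmonic_mean` from the Python standard library; the variable names are illustrative. Because the table rounds each reciprocal to one decimal place, the unrounded result (about 0.14888) differs very slightly from the 0.1488 obtained above.

```python
import statistics

returns = [0.10, 0.15, 0.08, 0.30, 0.20, 0.25, 0.15, 0.10, 0.25, 0.20]

# H.M = n / sum(1 / X), using unrounded reciprocals.
sum_reciprocals = sum(1 / x for x in returns)
hm_manual = len(returns) / sum_reciprocals

# Cross-check with the standard library.
hm_library = statistics.harmonic_mean(returns)

print(round(sum_reciprocals, 3), round(hm_manual, 5), round(hm_library, 5))
```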

Example for understanding grouped data:

The percentage marks of 40 male students at a college are given in the following frequency table:

| Percentage of Marks | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 | 70–80 | 80–90 |
|---|---|---|---|---|---|---|---|
| Frequency | 3 | 5 | 9 | 12 | 5 | 4 | 2 |

Calculate Harmonic Mean.

Solution

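| Percentage of Marks | f | X | f/X |
|---|---|---|---|
| 20–30 | 3 | 25 | 0.120 |
| 30–40 | 5 | 35 | 0.143 |
| 40–50 | 9 | 45 | 0.200 |
| 50–60 | 12 | 55 | 0.218 |
| 60–70 | 5 | 65 | 0.077 |
| 70–80 | 4 | 75 | 0.053 |
| 80–90 | 2 | 85 | 0.024 |
| Sum | ∑f = 40 | | ∑f/X = 0.835 |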

    \[  \mathbf{Harmonic\ Mean\ H.M =}\frac{\mathbf{\sum f}}{\mathbf{\sum}\left( \frac{\mathbf{f}}{\mathbf{X}} \right)}\ \]

    \[ \mathbf{Harmonic\ Mean\ H.M =}\frac{\mathbf{40}}{\mathbf{0.835}}\mathbf{= 47.90}\  \]
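For grouped data the class marks stand in for the individual values. The sketch below computes the grouped harmonic mean from the table above with illustrative variable names. It keeps full precision in each f/X term, so the result (about 47.91) differs slightly from the 47.90 obtained from the rounded column.

```python
# Class marks X and frequencies f from the solution table.
midpoints = [25, 35, 45, 55, 65, 75, 85]
freqs = [3, 5, 9, 12, 5, 4, 2]

# H.M = sum(f) / sum(f / X)
total_f = sum(freqs)
sum_f_over_x = sum(f / x for f, x in zip(freqs, midpoints))
hm_grouped = total_f / sum_f_over_x

print(total_f, round(sum_f_over_x, 3), round(hm_grouped, 2))
```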

Geometric Mean

The geometric mean is also a type of average. It is defined as the nth root of the product of n numbers, and is denoted by G.M. It works best for quantities whose growth is exponential or multiplicative, or which are expressed as ratios. It is widely used in index numbers, by price monitoring and controlling authorities and financial institutions, and in sciences such as economics and biology.

Properties of Geometric Mean

Non-Negative: The geometric mean is non-negative if all the data are non-negative.

Symmetry: The geometric mean is independent of the order of the data.

Small Value Data Sensitivity: The geometric mean is sensitive to small values. Because the values are multiplied together, a value close to zero pulls the geometric mean down sharply, and a single zero makes it zero.

Positive: If all the data values are positive, the geometric mean is greater than zero.

Inequality: If all the data values are positive, the geometric mean is less than the arithmetic mean; the two are equal only when all the values are the same.

Logarithmic Data: The logarithm of the geometric mean equals the arithmetic mean of the logarithms of the data values.

Formulas of Geometric Mean

Formula for ungrouped data

    \[  \mathbf{G.M = \ Antilog\ }\left( \frac{\mathbf{\sum logx}}{\mathbf{n}} \right)\ \]

Formula for grouped data

    \[  \mathbf{G.M = \ Antilog\ }\left( \frac{\mathbf{\sum flogx}}{\mathbf{\sum f}} \right)\ \]

Example for understanding Ungrouped Data

Calculate geometric mean for the data given below:

| Year | 2018 | 2019 | 2020 | 2021 | 2022 |
|---|---|---|---|---|---|
| Sugar Price | 15 | 20 | 10 | 20 | 15 |

Solution:

| Year | Sugar Price (X) | Log X |
|---|---|---|
| 2018 | 15 | 1.1761 |
| 2019 | 20 | 1.3010 |
| 2020 | 10 | 1.0000 |
| 2021 | 20 | 1.3010 |
| 2022 | 15 | 1.1761 |
| Sum | | 5.9542 |

    \[  \mathbf{G.M = \ }\mathbf{Antilog\ }\left( \frac{\mathbf{\sum logx}}{\mathbf{n}} \right)\ \]

    \[  \mathbf{G.M = \ Antilog\ }\left( \frac{\mathbf{5.9542}}{\mathbf{5}} \right)\  \]

    \[  \mathbf{G.M = \ Antilog}\mathbf{(1.19084)}\  \]

    \[  \mathbf{G.M =}\mathbf{15.518}\ \]
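The antilog route and the nth-root definition give the same answer, which a few lines of code can confirm. This sketch uses base-10 logarithms as in the table and cross-checks with `statistics.geometric_mean` (Python 3.8 and later); the variable names are illustrative.

```python
import math
import statistics

prices = [15, 20, 10, 20, 15]
n = len(prices)

# Antilog of the mean of the base-10 logarithms.
mean_log = sum(math.log10(p) for p in prices) / n
gm_antilog = 10 ** mean_log

# Definition: the n-th root of the product of the n values.
gm_root = math.prod(prices) ** (1 / n)

# Cross-check with the standard library.
gm_library = statistics.geometric_mean(prices)

print(round(mean_log, 5), round(gm_antilog, 3))
```

All three values agree at about 15.518, illustrating that the logarithm of the geometric mean is the arithmetic mean of the logarithms.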

Example for understanding grouped Data

The following data give the average monthly wages (in $) of workers at Harley & Co.

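| Wages of workers | No. of Workers | Wages of workers | No. of Workers |
|---|---|---|---|
| 140-145 | 1 | 170-175 | 10 |
| 145-150 | 3 | 175-180 | 8 |
| 150-155 | 2 | 180-185 | 5 |
| 155-160 | 4 | 185-190 | 4 |
| 160-165 | 4 | 190-195 | 2 |
| 165-170 | 6 | 195-200 | 1 |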

Calculate Geometric Mean

Solution

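| Wages of workers | No. of Workers (f) | X | Log X | f log X |
|---|---|---|---|---|
| 140-145 | 1 | 142.5 | 2.1538 | 2.1538 |
| 145-150 | 3 | 147.5 | 2.1688 | 6.5064 |
| 150-155 | 2 | 152.5 | 2.1833 | 4.3665 |
| 155-160 | 4 | 157.5 | 2.1973 | 8.7891 |
| 160-165 | 4 | 162.5 | 2.2109 | 8.8434 |
| 165-170 | 6 | 167.5 | 2.2240 | 13.3441 |
| 170-175 | 10 | 172.5 | 2.2368 | 22.3679 |
| 175-180 | 8 | 177.5 | 2.2492 | 17.9936 |
| 180-185 | 5 | 182.5 | 2.2613 | 11.3063 |
| 185-190 | 4 | 187.5 | 2.2730 | 9.0920 |
| 190-195 | 2 | 192.5 | 2.2844 | 4.5689 |
| 195-200 | 1 | 197.5 | 2.2956 | 2.2956 |
| Sum | 50 | | | 111.6276 |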

    \[  \mathbf{G.M =}\mathbf{Antilog\ }\left( \frac{\mathbf{\sum flogx}}{\mathbf{\sum f}} \right)\  \]

    \[  \mathbf{G.M = Antilog\ }\left( \frac{\mathbf{111.6276}}{\mathbf{50}} \right)\  \]

    \[  \mathbf{G.M = Antilog\ }\left( \mathbf{2.232552} \right)\ \]

    \[  \mathbf{G.M =}\mathbf{170.83}\  \]
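The grouped geometric mean weights each log X by its class frequency. The sketch below rebuilds the class marks from the lower class limits, assuming the class width of 5 shown in the table, and uses illustrative variable names.

```python
import math

# Lower class limits 140, 145, ..., 195; class marks are the midpoints.
lower_limits = range(140, 200, 5)
midpoints = [lo + 2.5 for lo in lower_limits]
freqs = [1, 3, 2, 4, 4, 6, 10, 8, 5, 4, 2, 1]

# G.M = Antilog(sum(f * log X) / sum(f)), with base-10 logarithms.
total_f = sum(freqs)
sum_f_log = sum(f * math.log10(x) for f, x in zip(freqs, midpoints))
gm_grouped = 10 ** (sum_f_log / total_f)

print(total_f, round(sum_f_log, 2), round(gm_grouped, 1))
```

The weighted sum of logarithms comes out close to the table total of 111.6276, and the antilog is about 170.8, in line with the hand calculation.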
