Measurements of central tendency - Maths & Stats - Statistics - The Library at Leeds Beckett University

Learning Outcomes

If you work through this section you should be able to:

Understand the key concepts of central tendency.
Calculate the mean, median, and mode.
Understand the differences between the mean, median, and mode.

The central tendency of a distribution is an estimate of the "centre" of a distribution of values.

There are three major types of estimates of central tendency: mean, median, and mode. Under different conditions, some measures may be more suitable to represent a set of values than others.

In the following sections, we will look at the mean, median, and mode, learn how to calculate them, and see under what conditions each is most appropriate.

The mean is commonly referred to as the average. It is calculated as the sum of all the values divided by the number of values.

The formula is: $\bar{x} = \frac{\sum x}{n}$

Example

A group of students achieved the following scores on a test: 5, 6, 8, 9, 5, 2, 7.

The mean is

$\bar{x} = \frac{\sum x}{n} = \frac{5 + 6 + 8 + 9 + 5 + 2 + 7}{7} = \frac{42}{7} = 6$
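The same calculation can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are arbitrary.

```python
# Test scores from the example above.
scores = [5, 6, 8, 9, 5, 2, 7]

# Mean = sum of all the values divided by the number of values.
mean = sum(scores) / len(scores)
print(sum(scores), len(scores), mean)  # 42 7 6.0
```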

If the data is summarised in a frequency distribution, the formula is:

$\bar{x} = \frac{\sum f x}{\sum f}$

Example

A group of 54 students achieved the following scores on a test:

Grade	Number of students achieving grade (f)	fx
0	1	0
1	0	0
2	2	4
3	4	12
4	6	24
5	12	60
6	18	108
7	7	49
8	3	24
9	1	9
10	0	0
	$\sum f$ = 54	$\sum f x$ = 290

The mean is

$\bar{x} = \frac{\sum f x}{\sum f} = \frac{290}{54} \approx 5.37$
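As a rough check, the frequency-table calculation can be sketched in Python. The dictionary below simply copies the grade and frequency columns from the table; the name `freq` is an arbitrary choice.

```python
# grade -> number of students achieving that grade (f)
freq = {0: 1, 1: 0, 2: 2, 3: 4, 4: 6, 5: 12, 6: 18, 7: 7, 8: 3, 9: 1, 10: 0}

sum_f = sum(freq.values())                     # total number of students
sum_fx = sum(x * f for x, f in freq.items())   # each grade weighted by its frequency
mean = sum_fx / sum_f

print(sum_f, sum_fx, round(mean, 2))  # 54 290 5.37
```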

If the data is summarised in a grouped frequency distribution, x refers to the mid-points of the groups.

Example

The frequency table shows the time, in minutes, that a group of 35 students spent on their homework:

Time spent	Mid-point(x)	Number of students (f)	fx
0-15	7.5	2	15
16-30	23	3	69
31-45	38	15	570
46-60	53	10	530
61-75	68	4	272
76-90	83	1	83
		$\sum f$ = 35	$\sum f x$ = 1539

The mean is

$\bar{x} = \frac{\sum f x}{\sum f} = \frac{1539}{35} \approx 43.97$
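A minimal Python sketch of the grouped calculation follows. It assumes, as the table does, that each mid-point is the average of the lower and upper class limits; the tuple layout is just one convenient way to hold the table.

```python
# (lower limit, upper limit, frequency) for each class in the table above
classes = [(0, 15, 2), (16, 30, 3), (31, 45, 15),
           (46, 60, 10), (61, 75, 4), (76, 90, 1)]

# Mid-point of each class = (lower limit + upper limit) / 2
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]

sum_f = sum(f for _, _, f in classes)
sum_fx = sum(m * f for m, (_, _, f) in zip(midpoints, classes))
mean = sum_fx / sum_f

print(midpoints)                      # [7.5, 23.0, 38.0, 53.0, 68.0, 83.0]
print(sum_f, sum_fx, round(mean, 2))  # 35 1539.0 43.97
```

Because the mid-points stand in for the actual times, the result is an estimate of the mean rather than its exact value.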

The mean is usually suitable for discrete or continuous data. An important property of the mean is that its calculation uses all of the data values, so under certain conditions it can be truly representative of the whole data set. However, the mean can be strongly influenced by extreme values; in that case the median may be a better indicator.

The median is the middle value of a data set when the values are arranged in order. Half of the values lie at or below the median, and the other half lie at or above it.

For example, in a test, a group of 5 pupils scored the following marks:

2, 5, 3, 8, 6

To find the median, we first need to arrange the marks in order of magnitude (smallest first):

2, 3, 5, 6, 8

The middle mark is 5, therefore the median is 5.

If there is an odd number of data values, the median is exactly the middle value for a set of ordered data. If there is an even number of data values, the median is the average of the two middle values.

For example, in a test, a group of 6 pupils scored the following marks:

2, 5, 3, 8, 6, 7

The marks are rearranged as below:

2, 3, 5, 6, 7, 8

The two middle values are 5 and 6.

The median is (5+6)/2=5.5.

The formula is:

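Median $= \begin{cases} \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ term} & \text{if } n \text{ is odd} \\ \frac{1}{2} \left[ \left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2} + 1\right)^{\text{th}} \text{ term} \right] & \text{if } n \text{ is even} \end{cases}$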
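The odd and even cases can be sketched as a small Python function. The function name `median` is our own choice for illustration; note that Python lists are indexed from 0, so the $k$th term of the ordered data is `ordered[k - 1]`.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)   # arrange in order of magnitude, smallest first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th term, which is index n // 2 when counting from 0
        return ordered[mid]
    # n even: average of the (n / 2)th and (n / 2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 5, 3, 8, 6]))     # 5
print(median([2, 5, 3, 8, 6, 7]))  # 5.5
```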

If the data is summarised in a frequency distribution, you need to work out cumulative frequency first.

Example

A group of 54 students achieved the following scores on a test:

Grade	Number of students achieving grade (f)	Cumulative frequency
0	1	1
1	0	1
2	2	3
3	4	7
4	6	13
5	12	25
6	18	43
7	7	50
8	3	53
9	1	54
10	0	54
	$\sum f$ = 54

There is an even number of values in the data set, 54, so the median is the average of the two middle values. Using $\frac{54}{2}$ and $\frac{54}{2} + 1$ identifies the 27th and 28th data values as the middle two values. The cumulative frequency column shows that the 26th to 43rd data values are 6, therefore both the 27th and 28th data values are 6. Therefore, the median $= \frac{1}{2} (6 + 6) = 6$.
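The cumulative-frequency lookup described above might be sketched in Python as follows. The helper `value_at` is a hypothetical name introduced only for this illustration; it returns the grade found at a given position in the ordered data.

```python
from itertools import accumulate

grades = list(range(11))                       # grades 0 to 10
freqs = [1, 0, 2, 4, 6, 12, 18, 7, 3, 1, 0]    # f column from the table
cum = list(accumulate(freqs))                  # cumulative frequency column

def value_at(position):
    """Return the grade at a 1-based position in the ordered data."""
    for grade, c in zip(grades, cum):
        if position <= c:
            return grade
    raise ValueError("position exceeds the number of values")

n = cum[-1]
if n % 2 == 1:
    median = value_at((n + 1) // 2)
else:
    median = (value_at(n // 2) + value_at(n // 2 + 1)) / 2

print(cum)     # [1, 1, 3, 7, 13, 25, 43, 50, 53, 54, 54]
print(median)  # 6.0
```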

If the data is summarised in a grouped frequency distribution, apply the formula below to calculate an estimated median:

Estimated median $= b + \frac{(n / 2) - f_{b}}{f_{m}} \times w$

where:

$b$ is the lower class boundary of the group containing the median
$n$ is the total number of values
$f_{b}$ is the cumulative frequency of the groups before the median group
$f_{m}$ is the frequency of the median group
$w$ is the group width

Example

The frequency table shows the time, in minutes, that a group of 35 students spent on their homework:

Time spent	Number of students (f)	Cumulative frequency
0-15	2	2
16-30	3	5
31-45	15	20
46-60	10	30
61-75	4	34
76-90	1	35
	$\sum f$ = 35

There is an odd number of values in the data set, 35, so the median is the middle value. Using $\frac{35 + 1}{2}$ identifies the 18th data value as the middle value. The cumulative frequency column shows that the 6th to 20th data values are in the 31-45 interval; therefore, the median interval is 31-45.

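Here $b = 30.5$ (the boundary halfway between 30 and 31), $f_{b} = 5$, $f_{m} = 15$, and $w = 45.5 - 30.5 = 15$, so:

Estimated median $= 30.5 + \frac{(35 / 2) - 5}{15} \times 15 = 43$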
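The estimated-median formula can be sketched in Python as below. One assumption is made explicit here: because the class limits are whole minutes with a gap of 1 between classes, each class boundary is taken to lie 0.5 below the lower limit and 0.5 above the upper limit, which reproduces the 30.5 used in the example.

```python
# (lower limit, upper limit, frequency) for each class in the table above
classes = [(0, 15, 2), (16, 30, 3), (31, 45, 15),
           (46, 60, 10), (61, 75, 4), (76, 90, 1)]

n = sum(f for _, _, f in classes)

# Walk through the classes until the cumulative frequency reaches n / 2.
f_b = 0  # cumulative frequency of the groups before the median group
for lo, hi, f_m in classes:
    if f_b + f_m >= n / 2:
        break
    f_b += f_m

# Assumption: boundaries lie 0.5 either side of the whole-number class limits.
b = lo - 0.5               # lower class boundary of the median group
w = (hi + 0.5) - b         # group width

estimated_median = b + (n / 2 - f_b) * w / f_m
print((lo, hi), b, w, estimated_median)  # (31, 45) 30.5 15.0 43.0
```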

The median is particularly useful when there are extreme values, as the middle value is not affected by exceptional cases. However, when there is a large sample size without extreme values, the mean is usually a better indicator of central tendency.
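To illustrate the effect of an extreme value, the sketch below uses Python's standard `statistics` module. The second list is hypothetical data made up for this illustration: it is the five-pupil example from earlier with the top mark replaced by an extreme value of 80.

```python
import statistics

marks = [2, 3, 5, 6, 8]           # marks from the earlier five-pupil example
with_extreme = [2, 3, 5, 6, 80]   # hypothetical: top mark replaced by 80

print(statistics.mean(marks), statistics.median(marks))                # 4.8 5
print(statistics.mean(with_extreme), statistics.median(with_extreme))  # 19.2 5
```

The single extreme value moves the mean from 4.8 to 19.2, while the median stays at 5.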

The mode of a data set is the most frequently occurring value(s). If no data value appears more frequently than any other values in the set, then there will be no mode. A data set can have more than one mode.

Examples

1. If a group of students achieved the following scores on a test:

5, 6, 8, 9, 5, 2, 7,

5 occurs twice, so the mode is 5.

2. If a group of students achieved the following scores on a test:

8, 9, 10, 10, 10, 11, 11, 11, 12, 13,

both 10 and 11 occur three times, so the modes are 10 and 11.
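A possible Python sketch of this definition is shown below. The function name `modes` is hypothetical; it returns every most frequent value, and an empty list when no value occurs more often than the others, following the definition given above.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the mode(s) of a non-empty list, or [] if there is none."""
    counts = Counter(values)
    top = max(counts.values())
    # If several distinct values all occur equally often, there is no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([5, 6, 8, 9, 5, 2, 7]))                    # [5]
print(modes([8, 9, 10, 10, 10, 11, 11, 11, 12, 13]))   # [10, 11]
print(modes([1, 2, 3]))                                # []
```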

If the data is summarised in a frequency distribution, the mode is easier to identify: it is the value with the highest frequency.

Example

A group of 54 students achieved the following scores on a test:

Grade	Number of students achieving grade (f)	Cumulative frequency
0	1	1
1	0	1
2	2	3
3	4	7
4	6	13
5	12	25
6	18	43
7	7	50
8	3	53
9	1	54
10	0	54

18 students achieved Grade 6. The number of students achieving grade 6 is larger than any other grade, therefore the mode is 6.
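Reading the mode off a frequency table amounts to finding the largest frequency. A short Python sketch, with the table copied into a dictionary named `freq` for illustration:

```python
# grade -> number of students achieving that grade (f)
freq = {0: 1, 1: 0, 2: 2, 3: 4, 4: 6, 5: 12, 6: 18, 7: 7, 8: 3, 9: 1, 10: 0}

highest = max(freq.values())
# Collect every grade that reaches the highest frequency (there may be ties).
modal_grades = [grade for grade, f in freq.items() if f == highest]

print(highest, modal_grades)  # 18 [6]
```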

If the data is summarised in a grouped frequency distribution, the class or category that occurs most frequently is called the modal class or modal category.

Example

The frequency table shows the time, in minutes, that a group of 35 students spent on their homework:

Time spent	Number of students (f)
0-15	2
16-30	3
31-45	15
46-60	10
61-75	4
76-90	1

The 31-45 minute class has the highest frequency (15 students), therefore the modal class is 31-45.

The mode is easy to calculate and can be used for both qualitative data and quantitative data. It is not affected by extreme values, but does not use all the values in a data set.

Activity

You can download a version of this Measurements of central tendency activity in Word format:

Measurements of central tendency activity
