If you work through this section you should be able to:
Dispersion refers to the spread of the values around the central tendency. Common measures of dispersion include the range, the inter-quartile range, and the standard deviation.
In the following sections, we will look at the range, inter-quartile range, standard deviation, and variance, and learn how to calculate them.
The range is the simplest measure of dispersion. It is calculated by subtracting the smallest value from the largest value in a data set. The formula is:
Range = largest value − smallest value
A group of students achieved the following scores on a test: 5, 6, 8, 9, 5, 2, 7.
The range is 9 − 2 = 7.
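As a quick check of the arithmetic above, here is a minimal Python sketch. The helper name `value_range` is our own choice for illustration and is not defined elsewhere in this guide.

```python
def value_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


scores = [5, 6, 8, 9, 5, 2, 7]
print(value_range(scores))  # 9 - 2 = 7
```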
When comparing two sets of data, a data set with a larger range is more spread out; a data set with a smaller range shows less variation among its values.
Two groups of students achieved the following scores on a test.
Group | Score 1 | Score 2 | Score 3 | Score 4 | Score 5 | Mean | Range |
---|---|---|---|---|---|---|---|
Group One | 4 | 5 | 5 | 6 | 5 | 5 | 2 |
Group Two | 1 | 9 | 8 | 3 | 4 | 5 | 8 |
The two groups share the same mean, 5, but the dispersion of the scores for Group Two (range = 8) is much larger than that for Group One (range = 2).
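The comparison can be reproduced with a short Python sketch. It assumes the first five numbers in each row of the table are the scores, which is the reading consistent with the stated means and ranges; the variable names are illustrative only.

```python
from statistics import mean

groups = {
    "Group One": [4, 5, 5, 6, 5],
    "Group Two": [1, 9, 8, 3, 4],
}

# Store (mean, range) for each group so the two can be compared.
summary = {}
for name, scores in groups.items():
    summary[name] = (mean(scores), max(scores) - min(scores))
    print(name, "mean =", summary[name][0], "range =", summary[name][1])
```

Both groups have a mean of 5, so the mean alone cannot tell them apart; the range is what shows the difference in spread.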
The range is easy to calculate and interpret, but it is very sensitive to extreme values because only the largest and smallest values are used in its calculation. A single unusually high or low value can therefore change it substantially, so it is often not a very informative measure of dispersion.
The quartiles divide an ordered data set into four equal sections or quarters.
The lower quartile (Q1) is a quarter of the way through the distribution and is the value below which the lowest 25% of the data values lie.
The second quartile (Q2) is the median, which divides the data set into two halves.
The upper quartile (Q3) is three quarters of the way through a distribution and is the value above which the highest 25% of the data values lie.
The inter-quartile range is calculated by subtracting the lower quartile from the upper quartile. The formula is:
Inter-quartile range = upper quartile − lower quartile
1. A group of 7 students spent the following minutes on their homework:
15, 55, 25, 18, 36, 48, 40.
a) First the data needs to be arranged in numerical order from the smallest value to the largest value, as below:
15, 18, 25, 36, 40, 48, 55.
b) There is an odd number of values in the data set, so the median is the middle value, 36.
Q2 = 36
c) Use the median to divide the data into two halves, leaving out the median itself. The first half of the data set is:
15, 18, 25.
The median for the first half of the data set is 18.
Q1 = 18
d) The second half of the data set includes:
40, 48, 55.
The median for the second half of the data set is 48.
Q3 = 48
e) Therefore,
Inter-quartile range = Q3 − Q1 = 48 − 18 = 30.
2. A group of 8 students achieved the following scores on a test:
5, 6, 8, 9, 5, 2, 7, 4.
a) First the data needs to be arranged in numerical order from the smallest value to the largest value, as below:
2, 4, 5, 5, 6, 7, 8, 9.
b) Then identify the median: there is an even number of values in the data set, so the median is the average of the middle two values, 5 and 6.
Q2 = (5 + 6)/2 = 5.5
c) Using the median to divide all the data into two halves, the first half of the data set is:
2, 4, 5, 5.
There is an even number of values in the first half of the data set, so Q1 is the average of its middle two values, 4 and 5.
Q1 = (4 + 5)/2 = 4.5
d) The second half of the data set is:
6, 7, 8, 9.
Again, there is an even number of values in the second half of the data set, so Q3 is the average of its middle two values, 7 and 8.
Q3 = (7 + 8)/2 = 7.5
e) Therefore,
Inter-quartile range = Q3 − Q1 = 7.5 − 4.5 = 3.
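The steps used in both worked examples can be sketched in Python as follows. This is one possible implementation of the "median of each half" approach described above; the function names `quartiles` and `inter_quartile_range` are our own and are assumptions for illustration.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd count, the median itself is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


def inter_quartile_range(values):
    """Return Q3 - Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


homework_minutes = [15, 55, 25, 18, 36, 48, 40]
test_scores = [5, 6, 8, 9, 5, 2, 7, 4]

print(quartiles(homework_minutes), inter_quartile_range(homework_minutes))
print(quartiles(test_scores), inter_quartile_range(test_scores))
```

Other quartile conventions exist. Library routines such as Python's `statistics.quantiles` interpolate between values and can return slightly different quartiles for the second data set, so results from software may not always match a hand calculation.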
The inter-quartile range is a useful measure of dispersion given that it is not too sensitive to extreme values. However, not all data values are used in its calculation.
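To illustrate this robustness, the sketch below replaces the largest homework time with a hypothetical extreme value of 300 minutes (invented purely for this illustration) and compares the two measures. It uses `statistics.quantiles` with the `"exclusive"` method, which for a data set of 7 values picks the 2nd and 6th ordered values, the same Q1 and Q3 found in the first worked example.

```python
from statistics import quantiles


def spread(values):
    """Return (range, inter-quartile range) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return max(values) - min(values), q3 - q1


minutes = [15, 18, 25, 36, 40, 48, 55]
with_outlier = [15, 18, 25, 36, 40, 48, 300]  # hypothetical extreme value

print(spread(minutes))
print(spread(with_outlier))
```

The range jumps from 40 to 285, while the inter-quartile range stays at 30, because the extreme value lies outside the middle half of the data.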
You may also find the Box and Whisker Charts section useful, as it shows the quartiles pictorially.
The standard deviation shows how widely dispersed the values in a data set are around the mean. A larger standard deviation means that the data values in the data set are very spread out, while a smaller standard deviation means that the data are quite concentrated around the mean.
The formula is:
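$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
where:
$x$ is each value in the data set,
$\bar{x}$ is the mean of the data set,
$n$ is the number of values in the data set,
$\sum$ means "the sum of".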
A group of students achieved the following scores on a test: 5, 6, 8, 9, 5, 2, 7.
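The mean is
$$\bar{x} = \frac{5 + 6 + 8 + 9 + 5 + 2 + 7}{7} = \frac{42}{7} = 6$$
The standard deviation is
$$SD = \sqrt{\frac{(5-6)^2 + (6-6)^2 + (8-6)^2 + (9-6)^2 + (5-6)^2 + (2-6)^2 + (7-6)^2}{7 - 1}} = \sqrt{\frac{32}{6}} = 2.31$$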
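The calculation can be checked with a short Python sketch. It assumes the divisor is n − 1 (the sample form of the standard deviation), since that is the form that reproduces the answer of 2.31 above; `statistics.stdev` from the standard library uses the same divisor.

```python
from math import sqrt
from statistics import mean, stdev

scores = [5, 6, 8, 9, 5, 2, 7]

x_bar = mean(scores)                                # 42 / 7 = 6
squared_devs = [(x - x_bar) ** 2 for x in scores]   # these sum to 32
sd_manual = sqrt(sum(squared_devs) / (len(scores) - 1))

print(round(sd_manual, 2))        # step-by-step calculation
print(round(stdev(scores), 2))    # library function, same divisor (n - 1)
```

Dividing by n instead of n − 1 (as `statistics.pstdev` does) would give a slightly smaller value, so it is worth checking which form a calculator or software package uses.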
If the data is summarised in a frequency distribution, the formula is:
$$SD = \sqrt{\frac{\sum f(x - \bar{x})^2}{\left(\sum f\right) - 1}}$$
A group of 20 students achieved the following scores on a test:
Grade ($x$) | Number of students achieving grade ($f$) | $fx$ | Deviation ($x - \bar{x}$) | Squared deviation | Squared deviation × frequency |
---|---|---|---|---|---|
1 | 2 | 2 | -2 | 4 | 8 |
2 | 3 | 6 | -1 | 1 | 3 |
3 | 10 | 30 | 0 | 0 | 0 |
4 | 3 | 12 | 1 | 1 | 3 |
5 | 2 | 10 | 2 | 4 | 8 |
Total | $\sum f$ = 20 | $\sum fx$ = 60 | | | $\sum f(x - \bar{x})^2$ = 22 |
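The mean is
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{60}{20} = 3$$
The standard deviation is
$$SD = \sqrt{\frac{22}{20 - 1}} = \sqrt{\frac{22}{19}} = 1.08$$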
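The frequency-table calculation can be sketched in Python as below. The dictionary maps each grade to the number of students achieving it, taken from the table above; as before, the sketch assumes a divisor of (total frequency − 1), which is what reproduces the answer of 1.08.

```python
from math import sqrt

# grade -> number of students achieving that grade
frequencies = {1: 2, 2: 3, 3: 10, 4: 3, 5: 2}

n = sum(frequencies.values())                               # total frequency
mean = sum(f * x for x, f in frequencies.items()) / n       # sum of fx / sum of f
sum_sq = sum(f * (x - mean) ** 2 for x, f in frequencies.items())
sd = sqrt(sum_sq / (n - 1))

print(n, mean, sum_sq, round(sd, 2))
```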
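Compared to the range and inter-quartile range, the standard deviation uses all the values in a data set. If the data set is normally distributed, the following rules apply:
About 68% of the values lie within one standard deviation of the mean.
About 95% of the values lie within two standard deviations of the mean.
About 99.7% of the values lie within three standard deviations of the mean.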
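For normally distributed data, the proportion of values lying within a given number of standard deviations of the mean can be computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is our own, for illustration only.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

These proportions hold for any normal distribution, whatever its mean and standard deviation; they do not apply to data that are strongly skewed.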