
Descriptive Statistics: DISPERSION (3/5)

 


Dispersion shows how widely the data are spread.

Dispersion measures the spread of a dataset and shows how the data are distributed across different intervals. Measures of dispersion identify the lowest and the highest values and describe how the values in between are spread. Knowing the spread of items, a manager can judge whether a given result sits near the bottom, in the lower half, in the upper half, or near the top.

The three most frequently used methods to measure dispersion are:

1. RANGE

2. QUARTILES

3. INTER-QUARTILE RANGE



1. RANGE

What is range? Range is the difference between the highest and the lowest items in a dataset.

How to calculate range? Firstly, arrange all the items in the dataset in ascending order. The formula for range is:

Range = Highest result – Lowest result

Uses of range in business management: Range is widely used in business as a quick measure of how spread out the data are. When the spread of items in the dataset is large, the arithmetic mean is not very representative, because a large spread indicates large differences between individual values.

Example 1: The data below show the number of hours that survey respondents watched television last month:

Number of hours per week (in ascending order):
Last month: 1, 2, 3, 3, 4, 5, 5, 8, 10, 11, 12, 13, 13, 13, 14, 15, 15

Range = 15 – 1 = 14

The time that respondents spent watching television had a range of 14 hours.

Example 2: The data below list the sizes of all the shoes sold in the shoe shop last month:

Shoe sizes sold (in ascending order):
Last month: 36, 36, 36, 36, 38, 40, 41, 41, 42, 42, 44, 45

Range = 45 – 36 = 9

The range of shoe sizes sold in the shoe shop was 9.

Example 3: A convenience store records the number of sales of Pepsi bottles at its store in the last week:

Number of bottles sold (in ascending order):
Last week: 15, 20, 25, 30, 31, 37, 40

Range = 40 – 15 = 25

The range of bottles of Pepsi sold in the store was 25.
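The three examples above can be checked with a few lines of Python. This is a minimal sketch: the function name value_range and the variable names are illustrative and do not come from the article.

```python
def value_range(values):
    """Return the difference between the highest and the lowest value."""
    ordered = sorted(values)  # ascending order, as in the first step above
    return ordered[-1] - ordered[0]


# Data copied from Examples 1 to 3
tv_hours = [1, 2, 3, 3, 4, 5, 5, 8, 10, 11, 12, 13, 13, 13, 14, 15, 15]
shoe_sizes = [36, 36, 36, 36, 38, 40, 41, 41, 42, 42, 44, 45]
pepsi_bottles = [15, 20, 25, 30, 31, 37, 40]

print(value_range(tv_hours))       # 14
print(value_range(shoe_sizes))     # 9
print(value_range(pepsi_bottles))  # 25
```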

Advantages of range: It is very easy to calculate and understand. Range sets the boundaries of the dataset on both ends.

Disadvantages of range: It can be distorted by extreme result(s), in which case it may be of little practical use to managers. To make range results more realistic, the highest and the lowest results can be excluded from the calculation.
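Excluding the highest and the lowest results before calculating the range could look like the sketch below. The function name trimmed_range and its trim parameter are hypothetical, and dropping exactly one item from each end is an assumption; the data are the Pepsi sales from Example 3.

```python
def trimmed_range(values, trim=1):
    """Range after dropping `trim` items from each end of the sorted data."""
    ordered = sorted(values)
    if len(ordered) <= 2 * trim:
        raise ValueError("not enough items left after trimming")
    kept = ordered[trim:len(ordered) - trim]  # drop the extremes on both ends
    return kept[-1] - kept[0]


pepsi_bottles = [15, 20, 25, 30, 31, 37, 40]

print(trimmed_range(pepsi_bottles))  # 37 - 20 = 17
```

With 15 and 40 removed, the range falls from 25 to 17.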



2. QUARTILES

What are quartiles? Quartiles (Q1, Q2, Q3) divide a dataset into four equal parts (quarters) in order to see the distribution of results. Each of the four sections represents 25% of the observations. The median divides the data into two halves, and then quartiles divide each half into two halves again.

How to calculate quartiles? Firstly, place all the results in the dataset in ascending order. Secondly, find the median, which is the middle value. The median divides the dataset in half (50% of the values lie below the median and 50% lie above it). Then, find the median of the first half to break down the first 50% of the data into the first two quarters. In the same way, find the median of the second half to break down the second 50% of the data into two more quarters.

  1. First Quarter: It includes items from the smallest number up to Q1.
  2. Second Quarter: It includes items from Q1 up to the median (Q2).
  3. Third Quarter: It includes items from the median (Q2) up to Q3.
  4. Fourth Quarter: It includes items from Q3 up to the highest number.
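The steps above can be sketched in Python as follows. The function names are illustrative, and leaving the median itself out of both halves when the number of items is odd is an assumption about the convention; other quartile conventions exist and can give slightly different results. The data are the Pepsi sales (odd number of items) and shoe sizes (even number of items) from the range examples.

```python
def median(ordered):
    """Median of a list that is already sorted in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two middle values


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves steps."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two items")
    half = n // 2
    lower = ordered[:half]             # items below the median position
    upper = ordered[half + (n % 2):]   # items above it; the median is skipped when n is odd
    return median(lower), median(ordered), median(upper)


pepsi_bottles = [15, 20, 25, 30, 31, 37, 40]
shoe_sizes = [36, 36, 36, 36, 38, 40, 41, 41, 42, 42, 44, 45]

print(quartiles(pepsi_bottles))  # (20, 30, 37)
print(quartiles(shoe_sizes))     # (36.0, 40.5, 42.0)
```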

Example 1: The following numbers show the distribution of employee appraisal results in ascending order. There are currently 23 full-time production workers:

40, 46, 52, 62, 65, 66, 69, 70, 71, 72, 73, 74, 75, 76, 77, 81, 82, 83, 86, 87, 90, 95, 99
Q1 = 66, Q2 = 74, Q3 = 83

The median (Q2) is the 12th value ((23 + 1) / 2 = 12), which is 74. It means that half of the workers scored below 74 on the employee appraisal and the other half scored above 74. Q1, which is 66, is the middle value of the scores between the smallest value (40) and the median (Q2). In this case, Q1 is the 6th value. Q3, which is 83, is the middle value of the scores between the median (Q2) and the highest value (99). In this case, Q3 is the 18th value.

Let’s interpret the numbers that represent quartiles.

  1. A score of 66 (Q1, or lower quartile) represents the first quartile and is the 25th percentile. 66 is the median of the lower half of the scores, that is, the scores from 40 to 73. It tells us that roughly 25% of the scores are less than 66 and roughly 75% are greater.
  2. A score of 74 (Q2, or the median, or middle quartile) represents the second quartile and is the 50th percentile. 74 is the median of the whole dataset: about 50% of the scores are below 74 and about 50% are above 74.
  3. A score of 83 (Q3, or upper quartile) represents the third quartile and is the 75th percentile. 83 is the median of the upper half of the scores, that is, the scores from 75 to 99. Roughly 75% of the scores are below 83 and roughly 25% are above 83.
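The percentages in this interpretation can be checked against the 23 appraisal scores. The sketch below simply counts how many scores fall below and above each quartile; the helper name count_below_above is illustrative. In a dataset this small, the shares come out close to 25%, 50% and 75% rather than exactly equal to them.

```python
scores = [40, 46, 52, 62, 65, 66, 69, 70, 71, 72, 73, 74,
          75, 76, 77, 81, 82, 83, 86, 87, 90, 95, 99]


def count_below_above(values, cut):
    """Count the values strictly below and strictly above a cut-off."""
    below = sum(1 for v in values if v < cut)
    above = sum(1 for v in values if v > cut)
    return below, above


for name, cut in [("Q1", 66), ("Q2", 74), ("Q3", 83)]:
    below, above = count_below_above(scores, cut)
    print(f"{name} = {cut}: {below} below ({below / len(scores):.0%}), "
          f"{above} above ({above / len(scores):.0%})")

# Q1 = 66: 5 below (22%), 17 above (74%)
# Q2 = 74: 11 below (48%), 11 above (48%)
# Q3 = 83: 17 below (74%), 5 above (22%)
```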

Uses of quartiles in business management: Quartiles enable business managers to group items into four defined parts to see how all the data are distributed. Splitting the items into four groups allows managers to rank them and see how each item compares with the entire set of observations. Quartiles can help a business determine its relative ranking in the industry according to sales revenue, profit, market share, market capitalization, etc. About 25% of the items are less than the lower quartile, 50% are less than the median, and 75% are less than the upper quartile. Quartiles can also support employee incentive schemes: larger salaries, bonuses and pay increases can go to those in the top quarter (TOP25), while additional training can be offered to underperforming workers in the bottom quarter (BOTTOM25).

Advantages of quartiles: Quartiles make it possible to measure the spread of items above and below the median by dividing the dataset into four equal groups. Therefore, quartiles help to determine how good a result is: whether it was a success, a mediocre result or a failure.

Disadvantages of quartiles: Quartiles need to be calculated differently depending on whether the dataset includes an odd or an even number of items.

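TIP: Use the QUARTILE function in Microsoft Excel to calculate quartiles in very large datasets. Spreadsheet functions may interpolate between neighbouring values, so the results can differ slightly from the manual method described above.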
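Outside a spreadsheet, the Python standard library offers statistics.quantiles for the same job. The sketch below applies it to the 23 appraisal scores with both of its built-in methods, to show that software tools do not all follow the same quartile rule. Which rule a particular spreadsheet function follows is not covered here, so it is worth checking the tool's documentation.

```python
import statistics

scores = [40, 46, 52, 62, 65, 66, 69, 70, 71, 72, 73, 74,
          75, 76, 77, 81, 82, 83, 86, 87, 90, 95, 99]

# n=4 asks for the three cut points that split the data into quarters.
exclusive = statistics.quantiles(scores, n=4, method="exclusive")
inclusive = statistics.quantiles(scores, n=4, method="inclusive")

print(exclusive)  # [66.0, 74.0, 83.0]
print(inclusive)  # [67.5, 74.0, 82.5]
```

For this dataset the "exclusive" method reproduces the manually calculated quartiles (66, 74, 83), while the "inclusive" method interpolates between neighbouring scores.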


3. INTER-QUARTILE RANGE

What is inter-quartile range? It is the range of items between the upper quartile (Q3) and the lower quartile (Q1) in the dataset. It shows the range of the middle 50% of the data while ignoring the bottom 25% and top 25% of the results.

How to calculate inter-quartile range? To show the spread of the central 50% of a dataset, subtract the first quartile from the third quartile. The formula for inter-quartile range is:

Inter-quartile range = Upper quartile (Q3) – Lower quartile (Q1)

Example 1: The following numbers show the distribution of employee appraisal results in ascending order. There are currently 23 full-time production workers:

40, 46, 52, 62, 65, 66, 69, 70, 71, 72, 73, 74, 75, 76, 77, 81, 82, 83, 86, 87, 90, 95, 99
Q1 = 66, Q2 = 74, Q3 = 83

The inter-quartile range is 17 (83 – 66), which is the difference between the upper quartile (Q3) and the lower quartile (Q1). It means that the middle half of the data spans 17 points.
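The same calculation can be sketched in Python. The function name interquartile_range is illustrative, and the quartiles are found as the medians of the lower and upper halves, with the overall median left out of both halves when the number of items is odd (an assumed convention).

```python
from statistics import median


def interquartile_range(values):
    """Q3 minus Q1, with the quartiles taken as the medians of each half."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])          # median of the lower half
    q3 = median(ordered[half + n % 2:])  # median of the upper half
    return q3 - q1


scores = [40, 46, 52, 62, 65, 66, 69, 70, 71, 72, 73, 74,
          75, 76, 77, 81, 82, 83, 86, 87, 90, 95, 99]

print(interquartile_range(scores))  # 83 - 66 = 17
```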

Uses of inter-quartile range in business management: It measures the variability of the results around the median.

Advantages of inter-quartile range: The range includes the extreme values at both ends of the dataset, which can distort the result and give a misleading picture. The inter-quartile range overcomes this problem: because it ignores the lowest 25% and the highest 25% of results, it is less likely to be distorted.
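A short sketch can illustrate this robustness. It compares the range and the inter-quartile range of the 23 appraisal scores with a hypothetical variation in which the lowest score (40) is replaced by an extreme value of 5. The modified dataset is invented for illustration only and does not come from the article; the function names are also illustrative.

```python
from statistics import median


def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


def interquartile_range(values):
    """Q3 minus Q1, with the quartiles taken as the medians of each half."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    return median(ordered[half + n % 2:]) - median(ordered[:half])


scores = [40, 46, 52, 62, 65, 66, 69, 70, 71, 72, 73, 74,
          75, 76, 77, 81, 82, 83, 86, 87, 90, 95, 99]

# Hypothetical variation (not from the article): the lowest score becomes 5.
with_outlier = [5] + scores[1:]

print(value_range(scores), interquartile_range(scores))              # 59 17
print(value_range(with_outlier), interquartile_range(with_outlier))  # 94 17
```

The single extreme score stretches the range from 59 to 94, while the inter-quartile range stays at 17.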

Disadvantages of inter-quartile range: Data must be ordered from the lowest item to the highest item, which can be a problem for very large datasets.

This article showed in detail how numerical data can be summarized using statistical techniques for measuring dispersion. The range shows the difference between the highest and the lowest value in the data sample, quartiles divide a dataset into four equal parts to show the distribution of results, and the inter-quartile range shows the spread of the central 50%, excluding the bottom 25% and top 25% of the results.

You can find out more about statistical analysis of market research results here.