
Statistical Techniques in Sales Forecasting

 


This article shows how statistical techniques can be used for sales forecasting in a business organization. Sales forecasting predicts the future level of sales in a business from past sales data. Business managers rely on this historical data, recorded over a given period of time, to predict the future.

What are statistical techniques?

Statistical techniques are a method of sales forecasting that is based on historical sales data, for example the past 12 months.

Statistical analysis investigates past sales data with an attempt to identify key features of the data such as average sales, central point of sales, frequency of sales, range of sales, variation of sales and changes in sales.

Interpreting and analyzing previous statistical data can help business managers make well-informed assumptions about what future sales results might be by comparing them with the past.



Methods of statistical techniques in sales forecasting

There are several statistical techniques that can be used to analyze past sales data over the last 12 months. The thirteen most common methods of descriptive statistics used in business management are grouped into five categories:

1. AVERAGE. Shows the center point of sales data:

a. Arithmetic mean – Average monthly sales

b. Median – Middle value of monthly sales

2. FREQUENCY. Shows how often particular sales occurred:

a. Mode – Most frequent monthly sales

b. Frequency data – Most frequent average monthly sales

c. Grouped frequency data – Frequency of monthly sales within different groups of sales data

3. DISPERSION. Shows how widely monthly sales are spread:

a. Range – Difference between the highest and lowest monthly sales

b. Quartiles – Distribution of monthly sales into 4 equal groups within sales data

c. Inter-quartile range – Range of the central 50% of the sales data

4. DEVIATION. Shows distance of monthly sales from the center point (mean):

a. Variance – Spread of monthly sales from the mean

b. Standard deviation – Average difference between monthly sales and the mean

c. Mean deviation – Average of differences between monthly sales and the mean

5. CHANGE. Shows how monthly sales changed over time:

a. Index numbers – Changes in monthly sales

b. Weighted index numbers – Changes in monthly sales when months are of unequal importance

Sales forecasting is done in order to help the business identify in advance any problems and opportunities related to sales of products.

Example of using statistical techniques in a business

Let’s say that your small business generated USD$57,000 of sales revenue in 2021. This was higher than in the previous year, 2020, when your firm brought in USD$39,000 in sales revenue. Over the two years, monthly sales were never constant; sales revenue varied from month to month.

The table below shows the exact amounts of sales revenue your business generated each month in 2020 and 2021:

| Month | Sales revenue (2020) | Sales revenue (2021) |
|---|---|---|
| January | USD$3,500 | USD$5,000 |
| February | USD$500 | USD$2,000 |
| March | USD$1,000 | USD$3,000 |
| April | USD$1,500 | USD$1,500 |
| May | USD$3,500 | USD$6,500 |
| June | USD$5,000 | USD$7,000 |
| July | USD$7,000 | USD$5,500 |
| August | USD$4,500 | USD$5,000 |
| September | USD$5,500 | USD$6,000 |
| October | USD$4,000 | USD$8,000 |
| November | USD$1,500 | USD$2,500 |
| December | USD$1,500 | USD$5,000 |
| Total | USD$39,000 | USD$57,000 |


1. AVERAGE. It shows what the center point of the sales dataset is.

a. Arithmetic mean – Average monthly sales. This simple average shows the average monthly sales revenue in a period of one year. It is the sum of all monthly sales revenues divided by 12 months.

Arithmetic mean = (x1 + x2 + (…) + xn) / n

2020: Arithmetic mean = USD$39,000 / 12 = USD$3,250

2021: Arithmetic mean = USD$57,000 / 12 = USD$4,750

The business generated USD$3,250 per month in sales revenue on average in 2020, and USD$4,750 per month on average in 2021. The mean value of monthly sales revenue increased from USD$3,250 to USD$4,750, or by USD$1,500 per month which means that the business increased its average monthly sales revenue by 46.15%.

2022: If the business increases its average monthly sales revenue next year by 46.15%, the arithmetic mean will be USD$6,942 in 2022.
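The arithmetic above can be reproduced with a minimal Python sketch. The list and function names (`sales_2020`, `sales_2021`, `arithmetic_mean`) are illustrative assumptions, and the 2022 figure simply assumes that the 2020 to 2021 growth rate repeats.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


mean_2020 = arithmetic_mean(sales_2020)
mean_2021 = arithmetic_mean(sales_2021)

# Relative growth of the mean between the two years.
growth = (mean_2021 - mean_2020) / mean_2020

# Assumption: the same growth rate applies again in 2022.
projected_mean_2022 = mean_2021 * (1 + growth)

print(f"{mean_2020:.0f} {mean_2021:.0f} {growth:.2%} {projected_mean_2022:.0f}")
# prints: 3250 4750 46.15% 6942
```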

b. Median – Middle value of monthly sales. Median is the middle value of monthly sales revenue. In our case, there is an even number of items (12 months) in the dataset, which means that the median sales revenue is the midpoint between the two central items (the 6th and the 7th values when the dataset is sorted in ascending order).

Median (even) = (1st middle value + 2nd middle value) / 2

2020: Median (even) = (USD$3,500 + USD$3,500) / 2 = USD$3,500

2021: Median (even) = (USD$5,000 + USD$5,000) / 2 = USD$5,000

The median sales revenue generated by the business is USD$3,500 per month in 2020, and USD$5,000 per month in 2021. The median value of monthly sales revenue increased from USD$3,500 to USD$5,000, that is by USD$1,500 per month, or by 42.86%. Because the median divides the sorted dataset into two equal halves, six months in 2020 had sales revenue of USD$3,500 or less and the other six had USD$3,500 or more. Likewise, six months in 2021 had sales revenue of USD$5,000 or less and the other six had USD$5,000 or more.

2022: If the business increases its median monthly sales revenue next year by 42.86%, the median will be USD$7,143 in 2022.
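As a quick check of the median figures, here is a small sketch using `statistics.median` from the Python standard library, which averages the two central values when the number of items is even. The variable names are illustrative, and the 2022 projection assumes the same percentage growth repeats.

```python
from statistics import median

# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]

# median() sorts the data internally; with 12 items it returns the
# midpoint of the 6th and 7th values.
median_2020 = median(sales_2020)
median_2021 = median(sales_2021)

median_growth = (median_2021 - median_2020) / median_2020

# Assumption: the same growth rate applies again in 2022.
projected_median_2022 = median_2021 * (1 + median_growth)

print(f"{median_2020:.0f} {median_2021:.0f} {median_growth:.2%} {projected_median_2022:.0f}")
# prints: 3500 5000 42.86% 7143
```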



2. FREQUENCY. It shows how often particular sales occurred.

a. Mode – Most frequent monthly sales. The mode is the most frequently earned sales revenue in the period of 12 months.

2020: The mode of 2020’s results is USD$1,500, which appeared three times during the year, in April, November and December. This means that the business earned USD$1,500 in sales revenue more often than any other value of sales revenue.

2021: The mode of 2021’s results is USD$5,000, which appeared three times during the year, in January, August and December. This means that the business earned USD$5,000 in sales revenue more often than any other value of sales revenue.
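One way to find the mode and the number of times it occurs is with `collections.Counter`, sketched below. The helper name `mode_with_count` is an assumption for illustration. Note that if several values were tied for the highest count, this sketch would report only one of them.

```python
from collections import Counter

# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def mode_with_count(values):
    """Return (most frequent value, number of occurrences)."""
    return Counter(values).most_common(1)[0]


mode_2020, count_2020 = mode_with_count(sales_2020)
mode_2021, count_2021 = mode_with_count(sales_2021)

print(mode_2020, count_2020)  # prints: 1500 3
print(mode_2021, count_2021)  # prints: 5000 3
```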

b. Frequency data – Most frequent average monthly sales. Frequency data records how often each value of monthly sales revenue occurred during the year. Weighting each value by its frequency gives the mean monthly sales revenue.

Mean frequency = ∑fx / ∑f

Where:

x – Monthly sales revenues

f – Frequency for monthly sales revenues

∑x – Sum of monthly sales revenues

∑f – Sum of frequencies for all monthly sales revenues

∑fx – Sum of each monthly sales revenue multiplied by its frequency

The following table shows frequencies of monthly sales revenues in 2020:

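| Sales revenue (x) | Frequency (f) | fx |
|---|---|---|
| USD$500 | 1 | 500 |
| USD$1,000 | 1 | 1,000 |
| USD$1,500 | 3 | 4,500 |
| USD$3,500 | 2 | 7,000 |
| USD$4,000 | 1 | 4,000 |
| USD$4,500 | 1 | 4,500 |
| USD$5,000 | 1 | 5,000 |
| USD$5,500 | 1 | 5,500 |
| USD$7,000 | 1 | 7,000 |
| Total | ∑f = 12 | ∑fx = 39,000 |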

2020: Mean frequency = USD$3,250

The following table shows frequencies of monthly sales revenues in 2021:

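| Sales revenue (x) | Frequency (f) | fx |
|---|---|---|
| USD$1,500 | 1 | 1,500 |
| USD$2,000 | 1 | 2,000 |
| USD$2,500 | 1 | 2,500 |
| USD$3,000 | 1 | 3,000 |
| USD$5,000 | 3 | 15,000 |
| USD$5,500 | 1 | 5,500 |
| USD$6,000 | 1 | 6,000 |
| USD$6,500 | 1 | 6,500 |
| USD$7,000 | 1 | 7,000 |
| USD$8,000 | 1 | 8,000 |
| Total | ∑f = 12 | ∑fx = 57,000 |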

2021: Mean frequency = USD$4,750

The mean frequency of USD$3,250 in 2020 and USD$4,750 in 2021 is the average monthly sales revenue calculated from the frequency tables. Because every month is counted exactly once, each result equals the arithmetic mean for that year.

2022: If the business increases its average monthly sales revenue next year by 46.15%, the mean frequency will be USD$6,942 in 2022.
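The frequency-table calculation ∑fx / ∑f can be sketched as follows. The function name `mean_from_frequencies` is an illustrative assumption, and `collections.Counter` is used here only as a convenient way to build the frequency column.

```python
from collections import Counter

# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def mean_from_frequencies(values):
    """Build the x / f table, then return (sum of f, sum of fx, sum of fx / sum of f)."""
    frequencies = Counter(values)                      # maps x -> f
    sum_f = sum(frequencies.values())                  # sum of frequencies
    sum_fx = sum(x * f for x, f in frequencies.items())
    return sum_f, sum_fx, sum_fx / sum_f


print(mean_from_frequencies(sales_2020))  # prints: (12, 39000, 3250.0)
print(mean_from_frequencies(sales_2021))  # prints: (12, 57000, 4750.0)
```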

c. Grouped frequency data – Frequency of monthly sales within different groups of sales data. Grouped frequency data shows how different values of monthly sales revenue appear within different groups of sales revenue in the dataset of 12 months.

Mean frequency = ∑fx / ∑f, where x is the midpoint of each group and f is the number of months in that group

The following table shows grouped data in 2020:

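| Sales revenue (x) | Midpoint | Frequency (f) | fx | Cumulative frequency |
|---|---|---|---|---|
| $0 to $2,000 | $1,000 | 5 | $5,000 | 5 |
| $2,001 to $4,000 | $3,000 | 3 | $9,000 | 8 |
| $4,001 to $6,000 | $5,000 | 3 | $15,000 | 11 |
| $6,001 to $8,000 | $7,000 | 1 | $7,000 | 12 |
| Total | | ∑f = 12 | ∑fx = 36,000 | |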

2020: Mean frequency = USD$36,000 / 12 = USD$3,000

The mean frequency in this business in 2020 is USD$3,000. It is an estimate of the average monthly sales revenue based on group midpoints, which is why it differs from the arithmetic mean of USD$3,250. The modal group is the group with the highest frequency among all groups in the dataset. The modal group in this business in 2020 is the first group, with 5 monthly sales revenues between USD$0 and USD$2,000.

The following table shows grouped data in 2021:

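| Sales revenue (x) | Midpoint | Frequency (f) | fx | Cumulative frequency |
|---|---|---|---|---|
| $0 to $2,000 | $1,000 | 2 | $2,000 | 2 |
| $2,001 to $4,000 | $3,000 | 2 | $6,000 | 4 |
| $4,001 to $6,000 | $5,000 | 5 | $25,000 | 9 |
| $6,001 to $8,000 | $7,000 | 3 | $21,000 | 12 |
| Total | | ∑f = 12 | ∑fx = 54,000 | |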

2021: Mean frequency = USD$54,000 / 12 = USD$4,500

The mean frequency in this business in 2021 is USD$4,500. It is an estimate of the average monthly sales revenue based on group midpoints, which is why it differs from the arithmetic mean of USD$4,750. The modal group is the group with the highest frequency among all groups in the dataset. The modal group in this business in 2021 is the third group, with 5 monthly sales revenues between USD$4,001 and USD$6,000.
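The grouped calculation can be sketched in Python as below. The group boundaries and rounded midpoints are copied from the tables above; the names `GROUPS`, `grouped_frequencies`, `grouped_mean` and `modal_group` are assumptions made for this illustration.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]

# (lower bound, upper bound, midpoint) for each group, as used in the tables.
GROUPS = [
    (0, 2000, 1000),
    (2001, 4000, 3000),
    (4001, 6000, 5000),
    (6001, 8000, 7000),
]


def grouped_frequencies(values):
    """Count how many monthly values fall inside each group."""
    return [sum(1 for v in values if low <= v <= high) for low, high, _ in GROUPS]


def grouped_mean(values):
    """Estimate the mean as sum(f * midpoint) / sum(f)."""
    freqs = grouped_frequencies(values)
    sum_fx = sum(f * mid for f, (_, _, mid) in zip(freqs, GROUPS))
    return sum_fx / sum(freqs)


def modal_group(values):
    """Return the (lower, upper) bounds of the group with the highest frequency."""
    freqs = grouped_frequencies(values)
    low, high, _ = GROUPS[freqs.index(max(freqs))]
    return low, high


print(grouped_frequencies(sales_2020), grouped_mean(sales_2020), modal_group(sales_2020))
# prints: [5, 3, 3, 1] 3000.0 (0, 2000)
print(grouped_frequencies(sales_2021), grouped_mean(sales_2021), modal_group(sales_2021))
# prints: [2, 2, 5, 3] 4500.0 (4001, 6000)
```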



3. DISPERSION. It shows how widely monthly sales are spread from the lowest to the highest monthly sales revenue.

a. Range – Difference between the highest and lowest monthly sales. Range is the difference between the highest and the lowest monthly sales revenue over the period of 12 months.

Range = Highest result – Lowest result

2020: Range = USD$7,000 – USD$500 = USD$6,500

2021: Range = USD$8,000 – USD$1,500 = USD$6,500

The range of monthly sales revenue is USD$6,500 in both 2020 and 2021. The difference between the highest and the lowest monthly sales revenue is the same in both years, meaning that monthly sales revenues were spread just as widely in 2021 as in 2020.

2022: The range of monthly sales revenue might also be USD$6,500 next year.
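The range calculation is short enough to sketch in a few lines of Python. The function name `value_range` is an illustrative assumption.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def value_range(values):
    """Highest result minus lowest result."""
    return max(values) - min(values)


range_2020 = value_range(sales_2020)
range_2021 = value_range(sales_2021)

print(range_2020, range_2021)  # prints: 6500 6500
```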

b. Quartiles – Distribution of monthly sales into 4 equal groups within sales data. Quartiles (Q1, Q2, Q3) divide the sorted monthly sales revenues into four equal parts in order to show how they are distributed. Each of the four parts contains 25% of the observations. The median divides the data into two halves, and the quartiles Q1 and Q3 then divide each half in two again.

2020: Quartiles in 2020 are as follows:

500 1,000 1,500 | 1,500 1,500 3,500 | 3,500 4,000 4,500 | 5,000 5,500 7,000

Q1 = 1,500, Q2 = 3,500, Q3 = 4,750

The median (Q2) is the midpoint between the 6th and the 7th values, which is USD$3,500. It means that half of the monthly sales revenues were at or below USD$3,500 and the other half were at or above USD$3,500. Q1, which is USD$1,500, is the median of the lower half of the data, from the smallest monthly sales revenue (USD$500) up to the median. In this case, Q1 is the midpoint between the 3rd and the 4th values, which are both USD$1,500. Q3, which is USD$4,750, is the median of the upper half of the data, from the median up to the highest monthly sales revenue (USD$7,000). In this case, Q3 is the midpoint between the 9th value (USD$4,500) and the 10th value (USD$5,000).

Let’s interpret the numbers that represent quartiles.

1. Monthly sales revenue of USD$1,500 (Q1, or lower quartile) represents the first quartile and is the 25th percentile. USD$1,500 is the median of the lower half of the available data. It tells us that about 25% of the monthly sales revenues are at or below USD$1,500 and about 75% are at or above it.

2. Monthly sales revenue of USD$3,500 (Q2, or the median, or middle quartile) represents the second quartile and is the 50th percentile. USD$3,500 is the median of the whole dataset: 50% of the monthly sales revenues are at or below USD$3,500 and 50% are at or above USD$3,500.

3. Monthly sales revenue of USD$4,750 (Q3, or upper quartile) represents the third quartile and is the 75th percentile. USD$4,750 is the median of the upper half of the available data. It tells us that 75% of the monthly sales revenues are below USD$4,750 and 25% of the monthly sales revenues are above USD$4,750.

2021: Quartiles in 2021 are as follows:

1,500 2,000 2,500 | 3,000 5,000 5,000 | 5,000 5,500 6,000 | 6,500 7,000 8,000

Q1 = 2,750, Q2 = 5,000, Q3 = 6,250

The median (Q2) is the midpoint between the 6th and the 7th values, which is USD$5,000. It means that half of the monthly sales revenues were at or below USD$5,000 and the other half were at or above USD$5,000. Q1, which is USD$2,750, is the median of the lower half of the data, from the smallest monthly sales revenue (USD$1,500) up to the median. In this case, Q1 is the midpoint between the 3rd value (USD$2,500) and the 4th value (USD$3,000). Q3, which is USD$6,250, is the median of the upper half of the data, from the median up to the highest monthly sales revenue (USD$8,000). In this case, Q3 is the midpoint between the 9th value (USD$6,000) and the 10th value (USD$6,500).

Let’s interpret the numbers that represent quartiles.

1. Monthly sales revenue of USD$2,750 (Q1, or lower quartile) represents the first quartile and is the 25th percentile. USD$2,750 is the median of the lower half of the available data. It tells us that 25% of the monthly sales revenues are less than USD$2,750 and 75% of the monthly sales revenues are greater.

2. Monthly sales revenue of USD$5,000 (Q2, or the median, or middle quartile) represents the second quartile and is the 50th percentile. USD$5,000 is the median of the whole dataset: 50% of the monthly sales revenues are at or below USD$5,000 and 50% are at or above USD$5,000.

3. Monthly sales revenue of USD$6,250 (Q3, or upper quartile) represents the third quartile and is the 75th percentile. USD$6,250 is the median of the upper half of the available data. It tells us that 75% of the monthly sales revenues are below USD$6,250 and 25% of the monthly sales revenues are above USD$6,250.

c. Inter-quartile range – Range of the central 50% of the sales data. It is the range of monthly sales revenues between the upper quartile (Q3) and the lower quartile (Q1) in the year. It shows the range of the middle 50% of the monthly sales revenues while ignoring the bottom 25% and top 25% of the results.

Inter-quartile range = Upper quartile (Q3) – Lower quartile (Q1)

2020: Inter-quartile range = USD$4,750 – USD$1,500 = USD$3,250

The inter-quartile range is USD$3,250, which is the difference between the upper quartile (Q3) and the lower quartile (Q1). It means that the range of the middle half of monthly sales revenues is USD$3,250.

2021: Inter-quartile range (2021) = USD$6,250 – USD$2,750 = USD$3,500

The inter-quartile range is USD$3,500, which is the difference between the upper quartile (Q3) and the lower quartile (Q1). It means that the range of the middle half of monthly sales revenues is USD$3,500.

2022: If the inter-quartile range increases next year by the same 7.7% as it did between 2020 and 2021, it will be approximately USD$3,769 in 2022.
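The quartiles and inter-quartile range above follow the "median of each half" method, which can be sketched as below. The function names are illustrative assumptions. Library routines such as Python's `statistics.quantiles` interpolate between values differently and may return slightly different quartiles, so the halves are computed by hand here.

```python
from statistics import median

# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # With an odd number of items the middle value belongs to neither half.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)


def interquartile_range(values):
    """Upper quartile (Q3) minus lower quartile (Q1)."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


iqr_2020 = interquartile_range(sales_2020)
iqr_2021 = interquartile_range(sales_2021)

# Assumption: the 2020 to 2021 growth rate of the IQR repeats in 2022.
projected_iqr_2022 = iqr_2021 * (iqr_2021 / iqr_2020)

print(quartiles(sales_2020), iqr_2020)  # prints: (1500.0, 3500.0, 4750.0) 3250.0
print(quartiles(sales_2021), iqr_2021)  # prints: (2750.0, 5000.0, 6250.0) 3500.0
print(round(projected_iqr_2022))        # prints: 3769
```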



4. DEVIATION. It shows distance of monthly sales from the center point (mean) of the sales dataset.

a. Variance – Spread of monthly sales from the mean. Variance measures how much monthly sales revenues in a year differ from the arithmetic mean, that is, how far monthly sales revenues are spread from the center of the dataset.

σ² = Σ(xᵢ – x̄)² / (n – 1)

Where:

xᵢ = Monthly sales revenues

x̄ = Arithmetic mean of sales revenues during a year

n = Number of values in the dataset which is 12 months

STEP 1: Find the arithmetic mean by adding up all monthly sales revenues and dividing by 12:

2020: Arithmetic mean = USD$3,250

2021: Arithmetic mean = USD$4,750

STEP 2: To find each result’s deviation from the arithmetic mean, subtract the arithmetic mean from each result.

STEP 3: To square each deviation from the mean, multiply each deviation from the mean by itself.

STEP 4: To find the sum of squares, add up all of the squared deviations. The sum of squares is 46,250,000 for 2020 and 47,250,000 for 2021.

STEP 5: To find the variance for this dataset, divide the sum of squares by n – 1:

2020: Variance σ² (for the sample) = 4,204,545

2021: Variance σ² (for the sample) = 4,295,455

The variance in 2021 is slightly larger than the variance in 2020, meaning that monthly sales revenues in 2021 lie slightly farther from their arithmetic mean, and from each other, than in 2020. In 2020, monthly sales revenues are more concentrated around the arithmetic mean.
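The five steps can be followed one by one in a short Python sketch. The function name `sample_variance` is an illustrative assumption; it returns the intermediate sum of squares as well, so each step can be checked.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def sample_variance(values):
    """Return (sum of squares, sample variance) following steps 1 to 5."""
    n = len(values)
    mean = sum(values) / n                      # step 1: arithmetic mean
    deviations = [x - mean for x in values]     # step 2: deviation from the mean
    squared = [d * d for d in deviations]       # step 3: square each deviation
    sum_of_squares = sum(squared)               # step 4: sum of squares
    return sum_of_squares, sum_of_squares / (n - 1)  # step 5: divide by n - 1


ss_2020, var_2020 = sample_variance(sales_2020)
ss_2021, var_2021 = sample_variance(sales_2021)

print(f"{ss_2020:,.0f} {var_2020:,.0f}")  # prints: 46,250,000 4,204,545
print(f"{ss_2021:,.0f} {var_2021:,.0f}")  # prints: 47,250,000 4,295,455
```

The standard library function `statistics.variance` also divides by n – 1 and can be used as a cross-check.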

b. Standard deviation – Average difference between monthly sales and the mean. Standard deviation measures the average deviation of monthly sales revenues from the arithmetic mean. Or, the average difference between the arithmetic mean and all sales revenues generated in a year. Standard deviation shows the typical deviation of monthly sales revenues from the center of the dataset.

Standard Deviation = √σ²

Where:

σ² = Variance of the dataset

2020: Standard Deviation = √4,204,545 ≈ USD$2,050

2021: Standard Deviation = √4,295,455 ≈ USD$2,073

The typical deviation of monthly sales revenues in 2020 from the arithmetic mean of USD$3,250 is about USD$2,050. This means that the 12 monthly sales revenues typically lie about USD$2,050 away from the arithmetic mean. The typical deviation of monthly sales revenues in 2021 from the arithmetic mean of USD$4,750 is about USD$2,073. This means that the 12 monthly sales revenues typically lie about USD$2,073 away from the arithmetic mean.
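As a hedged cross-check of the square roots, the sketch below uses `statistics.stdev` from the Python standard library, which takes the square root of the sample variance (the one divided by n – 1). The variable names are illustrative.

```python
from statistics import stdev

# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]

# stdev() is the sample standard deviation: sqrt(sum of squares / (n - 1)).
sd_2020 = stdev(sales_2020)   # roughly 2050.5
sd_2021 = stdev(sales_2021)   # roughly 2072.5

print(f"{sd_2020:.1f} {sd_2021:.1f}")
```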

c. Mean deviation – Average of differences between monthly sales and the mean. Mean deviation measures the average of deviations between monthly sales revenues and the arithmetic mean value. Or, the average of differences of all monthly sales revenues from the arithmetic mean.

STEP 1: Find the arithmetic mean by adding up all the results and dividing by the number of results:

2020: Arithmetic mean = USD$3,250

2021: Arithmetic mean = USD$4,750

STEP 2: To find each result’s deviation from the arithmetic mean, subtract the arithmetic mean from each result and express it as an absolute value.

STEP 3: To find the sum of absolute deviations, add up all of the absolute deviations:

2020: Sum of absolute deviations = USD$20,500

2021: Sum of absolute deviations = USD$20,000

STEP 4: To find the mean deviation for this dataset, divide the sum of all absolute deviations by n – 1:

2020: Mean deviation (for the sample) = USD$1,864

2021: Mean deviation (for the sample) = USD$1,818

In both years the mean deviation (USD$1,864 in 2020 and USD$1,818 in 2021) is lower than the standard deviation (about USD$2,050 in 2020 and USD$2,073 in 2021). This is because mean deviation does not square the deviations, so the result is less affected by extreme values.
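The mean deviation steps can be sketched as below. The function name `mean_deviation` and its `divisor_offset` parameter are assumptions for illustration. The default follows this article and divides by n – 1; many textbooks define mean absolute deviation with a divisor of n instead, which `divisor_offset=0` reproduces.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]


def mean_deviation(values, divisor_offset=1):
    """Return (sum of absolute deviations, mean deviation).

    divisor_offset=1 divides by n - 1 as in this article;
    divisor_offset=0 divides by n, the more common textbook definition.
    """
    n = len(values)
    mean = sum(values) / n
    sum_abs = sum(abs(x - mean) for x in values)
    return sum_abs, sum_abs / (n - divisor_offset)


sum_abs_2020, md_2020 = mean_deviation(sales_2020)
sum_abs_2021, md_2021 = mean_deviation(sales_2021)

print(f"{sum_abs_2020:.0f} {md_2020:.0f}")  # prints: 20500 1864
print(f"{sum_abs_2021:.0f} {md_2021:.0f}")  # prints: 20000 1818
```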



5. CHANGE. It shows how monthly sales changed over time.

a. Index numbers – Changes in monthly sales. Index numbers show any change in values of sales revenues between the current year and the previous year. There are two basic methods of calculating simple or unweighted index numbers including Simple Average of Price Relatives Method and Simple Aggregative Method. The year 2020 will be considered as a base year.

| Month | P0 (2020) | P1 (2021) | Sales revenue relative (R) |
|---|---|---|---|
| January | USD$3,500 | USD$5,000 | 142.86 |
| February | USD$500 | USD$2,000 | 400.00 |
| March | USD$1,000 | USD$3,000 | 300.00 |
| April | USD$1,500 | USD$1,500 | 100.00 |
| May | USD$3,500 | USD$6,500 | 185.71 |
| June | USD$5,000 | USD$7,000 | 140.00 |
| July | USD$7,000 | USD$5,500 | 78.57 |
| August | USD$4,500 | USD$5,000 | 111.11 |
| September | USD$5,500 | USD$6,000 | 109.10 |
| October | USD$4,000 | USD$8,000 | 200.00 |
| November | USD$1,500 | USD$2,500 | 166.67 |
| December | USD$1,500 | USD$5,000 | 333.33 |
| Total | USD$39,000 | USD$57,000 | ΣR = 2,267.35 |

Where:

R = (P1 / P0) x 100, the sales revenue relative for each month

P1 = Sales revenue in the year for which index number is to be found (2021)

P0 = Sales revenue in the base year (2020)

P01 = Index number

ΣR = Sum of sales revenue relatives

n = Number of values in the dataset which is 12 months

A. Simple Average of Price Relatives Method:

Index Number P01 = ΣR / n

Index Number P01 = 2,267.35 / 12 = 188.95

The index number of 188.95 shows that, averaged across the 12 months, monthly sales revenues in 2021 were 88.95% higher than in the same months of 2020.

B. Simple Aggregative Method:

Index Number P01 = (ΣP1 / ΣP0) x 100

Index Number P01 = (USD$57,000 / USD$39,000) x 100 = 146.15

The index number of 146.15 shows that total sales revenue in 2021 was 46.15% higher than in 2020.
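Both unweighted index methods can be sketched in Python as follows. The variable names are illustrative assumptions, and the relatives are kept at full precision rather than rounded to two decimals as in the table, so only the final index numbers are rounded.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]  # P0, base year
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]  # P1

# Sales revenue relative for each month: R = (P1 / P0) x 100.
relatives = [p1 / p0 * 100 for p0, p1 in zip(sales_2020, sales_2021)]

# A. Simple average of relatives: sum of R divided by n.
index_average_of_relatives = sum(relatives) / len(relatives)

# B. Simple aggregative method: (sum of P1 / sum of P0) x 100.
index_aggregative = sum(sales_2021) / sum(sales_2020) * 100

print(f"{index_average_of_relatives:.2f} {index_aggregative:.2f}")
# prints: 188.95 146.15
```

The two methods differ because the average of relatives gives every month equal weight, so a small month such as February (USD$500 to USD$2,000) pulls the index up far more than it affects the yearly total.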

b. Weighted index numbers – Changes in monthly sales when months are of unequal importance. Weighted index numbers show any change in values of sales revenues between the current year and the previous year when some months are more important than other months. There are two basic methods of calculating weighted index numbers including Weighted Average of Relatives Method and Weighted Aggregative Method. The year 2020 will be considered as a base year. In the case of your business, let’s assume that the summer months of June, July and August are twice as important as the other nine months, so the weights (W) for those three months were increased from 1 to 2.

| Month | Weight (W) | P0 (2020) | P1 (2021) | Sales revenue relative (R) | R x W |
|---|---|---|---|---|---|
| January | 1 | USD$3,500 | USD$5,000 | 142.86 | 142.86 |
| February | 1 | USD$500 | USD$2,000 | 400.00 | 400.00 |
| March | 1 | USD$1,000 | USD$3,000 | 300.00 | 300.00 |
| April | 1 | USD$1,500 | USD$1,500 | 100.00 | 100.00 |
| May | 1 | USD$3,500 | USD$6,500 | 185.71 | 185.71 |
| June | 2 | USD$5,000 | USD$7,000 | 140.00 | 280.00 |
| July | 2 | USD$7,000 | USD$5,500 | 78.57 | 157.14 |
| August | 2 | USD$4,500 | USD$5,000 | 111.11 | 222.22 |
| September | 1 | USD$5,500 | USD$6,000 | 109.10 | 109.10 |
| October | 1 | USD$4,000 | USD$8,000 | 200.00 | 200.00 |
| November | 1 | USD$1,500 | USD$2,500 | 166.67 | 166.67 |
| December | 1 | USD$1,500 | USD$5,000 | 333.33 | 333.33 |
| Total | | USD$39,000 | USD$57,000 | ΣR = 2,267.35 | Σ(R x W) = 2,597.03 |

A. Weighted Average of Relatives Method:

Weighted Index Number P01 = Σ(R x W) / ΣW

Weighted Index Number P01 = 2,597.03 / 15 = 173.14

The index number of 173.14 shows that, when months carry different importance throughout the year, monthly sales revenues in 2021 were on average 73.14% higher than in the same months of 2020.
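The weighted average of relatives can be sketched as below, under the same assumption as the example: June, July and August carry a weight of 2 and every other month a weight of 1. The variable names are illustrative.

```python
# Monthly sales revenue in USD, January to December, taken from the table above.
sales_2020 = [3500, 500, 1000, 1500, 3500, 5000, 7000, 4500, 5500, 4000, 1500, 1500]  # P0, base year
sales_2021 = [5000, 2000, 3000, 1500, 6500, 7000, 5500, 5000, 6000, 8000, 2500, 5000]  # P1

# Assumed weights: the summer months (June, July, August) count twice.
weights = [1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1]

# Sales revenue relative for each month: R = (P1 / P0) x 100.
relatives = [p1 / p0 * 100 for p0, p1 in zip(sales_2020, sales_2021)]

# Weighted average of relatives: sum of (R x W) divided by sum of W.
sum_rw = sum(r * w for r, w in zip(relatives, weights))
sum_w = sum(weights)
weighted_index = sum_rw / sum_w

print(f"{sum_w} {sum_rw:.2f} {weighted_index:.2f}")
# prints: 15 2597.03 173.14
```

The weighted index is lower than the unweighted 188.95 because the three summer months, which grew less than the average (and fell in July), now count for more.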

In summary, the different statistical techniques used in sales forecasting rely on historical sales data to analyze past sales and make assumptions about the future. They help to identify key features of the data such as average sales, central point of sales, frequency of sales, range of sales, variation of sales and changes in sales, so that business managers can make well-informed predictions about what future sales results might be based on the past.