I have combined the summaries of Chapters 17 and 18 into one blog post since both deal with how to handle seasonal data. Chapter 17 covers how to smooth out seasonal data. Chapter 18 covers how to adjust the data to allow for seasonality. Chapter 16 covers Average and Range Charts, which are used in Chapter 18, so I summarize it here as well.
In Chapter 17, the moving average is discussed as a way to smooth out seasonal data. I think we all know that already, but there are some points worth noting. An n-period moving average tends to remove any n-period pattern. For example, a 7-day moving average tends to remove the weekly pattern from a time series. Also, a single extreme value may create an artificial spike in the moving averages, because it stays inside the averaging window for n periods. So, before interpreting an n-period bump in the smoothed series as a real change, check whether it is simply due to an extreme value.
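To make both points concrete, here is a minimal sketch in plain Python. The daily sales numbers are made up for illustration, and `moving_average` is my own helper, not something from the book. It shows a 7-day moving average flattening a weekly pattern, and one extreme value lifting seven consecutive smoothed points.

```python
def moving_average(values, n):
    """Trailing n-period moving average; returns len(values) - n + 1 points."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]


# Hypothetical daily sales: flat weekdays with a weekend bump, repeated for 3 weeks.
week = [100, 100, 100, 100, 100, 130, 140]
series = week * 3

# Every 7-day window contains exactly one of each weekday,
# so the weekly pattern cancels out and the smoothed series is flat.
smoothed = moving_average(series, 7)

# Replace one ordinary day with an extreme value (hypothetical).
spiked = list(series)
spiked[10] = 500
smoothed_spiked = moving_average(spiked, 7)

# The extreme value sits inside 7 consecutive windows,
# so 7 consecutive smoothed points are lifted above the flat level.
lifted = [v for v in smoothed_spiked if v > smoothed[0]]
print(smoothed[:3])
print(len(lifted))
```

The lifted stretch lasts exactly as long as the window, which is why a bump of that length in a moving average deserves a second look at the raw data.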
The year-to-date plot is also discussed in Chapter 17 as a way to smooth out seasonal data, but the author does not recommend it. A year-to-date value is the cumulative sum from the start of the year up to a particular date. The plot creates a few problems. First, it creates a false impression. When we compare the year-to-date value for a particular month, say 10/2017 against 10/2018, the reader will tend to interpret it as a monthly comparison when it is actually a year-to-date comparison. Second, the scale of the year-to-date chart makes it hard to spot variations. Since each point is an accumulated sum from the start of the year, the variation in any particular month is small relative to the total, especially near the end of the year. The author points out a few more problems, but I did not find them critical.
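The scale problem is easy to see with a little arithmetic. The sketch below uses made-up monthly values (a flat 100 per month) and a hypothetical bump of 20 in a single month, then measures how large that bump is relative to the year-to-date total at that month.

```python
from itertools import accumulate


def year_to_date(monthly_values):
    """Cumulative sum from the start of the year, one value per month."""
    return list(accumulate(monthly_values))


# Hypothetical baseline: the same value every month.
baseline = [100] * 12


def bump_share(month_index, bump=20):
    """Size of a one-month bump relative to the year-to-date total at that month."""
    bumped = list(baseline)
    bumped[month_index] += bump
    return bump / year_to_date(bumped)[month_index]


january_share = bump_share(0)    # 20 / 120, about 0.167
december_share = bump_share(11)  # 20 / 1220, about 0.016
print(january_share, december_share)
```

The same bump of 20 is roughly ten times less visible in December than in January, purely because of the accumulation.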
Before we move on to Chapter 18, we need to cover Average and Range Charts. Unlike the XmR chart, which plots individual values, the Average and Range Chart plots the average of each subgroup of individual values. For example, the average of the values in the first hour is plotted as one point. The average of those subgroup averages then serves as the central line, and the limits are computed around it. As with the XmR chart, the author does not explain why a particular scaling factor is used to compute the 3-sigma limits. Users need to refer to a table of scaling factors to compute the limits. This is the number one shortcoming of the book.
The advantage of using subgroup averages rather than individual values is that it improves sensitivity: the limits become narrower because averaging removes some of the variation within each subgroup. However, this assumes that the values collected within a subgroup are homogeneous, that is, the variation within a subgroup represents just routine, background variation. One misuse of the chart, for example, is using a weekly average when there is a clear pattern within the week (sales peak on the weekend).
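Here is a minimal sketch of how the limits can be computed. The constants A2, D3 and D4 are the commonly tabulated control-chart factors for subgroup sizes 2 to 5; I am assuming the book's table matches them. The measurements and the function name `average_and_range_limits` are hypothetical.

```python
from statistics import mean

# Commonly tabulated control-chart constants by subgroup size n: (A2, D3, D4).
CHART_CONSTANTS = {
    2: (1.880, 0.0, 3.268),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def average_and_range_limits(subgroups):
    """Central lines and limits for an Average and Range Chart."""
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = CHART_CONSTANTS[n]

    averages = [mean(s) for s in subgroups]        # one plotted point per subgroup
    ranges = [max(s) - min(s) for s in subgroups]  # within-subgroup spread
    grand_average = mean(averages)                 # central line of the Average Chart
    average_range = mean(ranges)                   # central line of the Range Chart

    return {
        "center": grand_average,
        "average_lower": grand_average - a2 * average_range,
        "average_upper": grand_average + a2 * average_range,
        "range_center": average_range,
        "range_lower": d3 * average_range,
        "range_upper": d4 * average_range,
    }


# Hypothetical measurements: three subgroups of four values each.
subgroups = [[10, 12, 11, 13], [11, 10, 12, 11], [12, 13, 11, 12]]
limits = average_and_range_limits(subgroups)
print(limits)
```

Note that the limits for the averages are driven by the average range within subgroups. That is why the homogeneity assumption matters: if a subgroup contains a systematic pattern, the ranges are inflated and so are the limits.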
Chapter 18 is actually a step-by-step guide to deseasonalizing data:
1. Plot your data in a running record. If a repeating pattern is apparent, go to Step 2. Otherwise, go to Step 6.
2. Use a few complete cycles of the seasonal pattern to obtain seasonal relatives.
   - Seasonal relative = this period’s value / average value of two or more periods
3. Place the seasonal relatives on an Average and Range Chart where each subgroup represents a single “season”. Points outside the limits on the Average Chart indicate detectable seasonal effects, while points far outside these limits denote strong seasonal effects. If the data show only weak seasonality, go to Step 6.
4. Estimate the seasonal factors for every period. The seasonal factors for a five-day cycle must sum to 5, those for a seven-day cycle must sum to 7, and so on.
   - Just repeat Step 2 for all periods and confirm that the factors sum to the number of periods in the cycle. In other words, they average to 1, since they are ratios.
5. Deseasonalize the baseline and future values by dividing each value by the seasonal factor for that period. Place the deseasonalized baseline values on an XmR chart. Use these values to compute limits for the deseasonalized future values. Interpret the chart for deseasonalized future values in the usual manner.
   - Deseasonalized value = value / seasonal factor for the period
   - Baseline values are the historical values.
   - Future values are the predictions or forecasts. They are calculated according to the formula provided, but the author does not explain its origin or rationale. It is attached at the end of the article.
6. Place the individual values on an XmR chart. If this chart is useful, interpret it in the usual way. If the limits on the X Chart are so wide that they do not provide any useful information about your process, go to Step 7.
7. When noise dominates a time series, it essentially becomes a report card on the past.
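The core arithmetic of the steps above can be sketched as follows. This is my own reading, not the author's code. I am assuming that each seasonal relative is a value divided by the average of its own cycle, and that each seasonal factor is the average of the relatives for that period across cycles. The XmR limits use the usual 2.66 scaling factor on the average moving range. The quarterly data and all function names are hypothetical, and the Average and Range Chart check on the seasonal relatives is skipped.

```python
from statistics import mean


def seasonal_factors(cycles):
    """Average seasonal relative per period across complete cycles.

    Assumption: relative = value / average of the cycle it belongs to.
    """
    relatives = [[v / mean(cycle) for v in cycle] for cycle in cycles]
    # zip(*relatives) groups the relatives by period position within the cycle.
    return [mean(period) for period in zip(*relatives)]


def deseasonalize(values, factors):
    """Divide each value by the seasonal factor for its period."""
    return [v / factors[i % len(factors)] for i, v in enumerate(values)]


def xmr_limits(values):
    """Natural process limits for the X chart: mean +/- 2.66 * average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


# Hypothetical baseline: three complete cycles of four periods each.
baseline_cycles = [
    [80, 100, 120, 100],
    [88, 110, 132, 110],
    [84, 105, 126, 105],
]

factors = seasonal_factors(baseline_cycles)
baseline = [v for cycle in baseline_cycles for v in cycle]
deseasonalized = deseasonalize(baseline, factors)
lower, center, upper = xmr_limits(deseasonalized)

print(factors)       # should sum to the number of periods in the cycle (4 here)
print(lower, center, upper)
```

Because each cycle's relatives sum to the number of periods, the averaged factors do too, which is the check described in Step 4. Deseasonalized future values would then be compared against `lower` and `upper`.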