How to mitigate overfitting when forecasting demand?
Our video about overfitting has received its share of attention since it was published 5 years ago, that is, half a century ago in startup years for a company like Lokad. Years later, we have made a lot of progress, but overfitting remains a tough problem.
In short, overfitting represents the risk that your forecasting model is accurate only at predicting the past, not the future. A good forecasting model should be good at predicting the data you do not have.
A common misconception is that there is no way to assess a model except by checking its performance against the historical data. True, the historical data must be leveraged; however, if there is one insight to remember from the Vapnik-Chervonenkis theory, it is that not all models are born equal: some models carry a lot more structural risk - a concept that is part of the theory - than others. Entire classes of models can be considered safe or unsafe from a purely theoretical perspective, which translates into very real accuracy improvements.
Overfitting issues cannot be avoided entirely, but they can be mitigated nonetheless.
There are several ways to mitigate overfitting. First, the one rule you should never break: a forecasting model should never be assessed against the data that was used to train it in the first place. Many toolkits regress models on the entire history and then estimate the overall fit afterward. As the name suggests, such a process gives you the fit and nothing more. In particular, the fit should not be interpreted as any kind of expected accuracy; it is not. The error measured on the training data - the fit - is typically much lower than the error observed on data the model has never seen.
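This rule can be illustrated with a minimal Python sketch (synthetic data and toy models of our own invention, not Lokad's actual engine). A model that memorizes its training history achieves a perfect fit, yet forecasts worse on held-out data than a plain average:

```python
import random
from statistics import mean

random.seed(7)
# Hypothetical flat demand with noise; there is no real seasonality.
history = [20 + random.gauss(0, 4) for _ in range(104)]
train, test = history[:52], history[52:]

def mae(actual, forecast):
    """Mean absolute error between two equal-length series."""
    return mean(abs(a - f) for a, f in zip(actual, forecast))

# "Seasonal" model that memorizes each training week verbatim.
seasonal = train
fit_error = mae(train, seasonal)  # exactly 0.0: a perfect fit
test_error = mae(test, [seasonal[t % 52] for t in range(52, 104)])

# A plain average fits the training data worse but generalizes better.
level = mean(train)
mean_fit = mae(train, [level] * 52)
mean_test = mae(test, [level] * 52)
```

Despite its flawless fit, the memorizing model loses to the plain average on the held-out year: the fit told us nothing about accuracy on data the model had not seen.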
Second, one simple way of mitigating overfitting is to perform extensive back-testing. In practice, this means your process needs to split the input dataset over dozens - if not hundreds - of incremental date thresholds, re-training all the forecasting models and re-assessing them at each threshold. Back-testing requires a lot of processing power. Being able to allocate the massive processing power that extensive back-testing demands was actually one of the primary reasons why Lokad migrated toward cloud computing in the first place.
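The back-testing loop above can be sketched as follows; the moving-average model and the synthetic demand are placeholders of our own choosing, but the structure - re-train and re-assess at each incremental date threshold - is the general pattern:

```python
import random
from statistics import mean

random.seed(1)
# Hypothetical daily demand history: 120 periods of noisy level.
demand = [30 + random.gauss(0, 5) for _ in range(120)]

def forecast_next(history):
    """Toy model: moving average of the last 8 periods."""
    return mean(history[-8:])

# Back-test: slide the date threshold forward, re-training the model
# on everything before the threshold and scoring it on what follows.
errors = []
for threshold in range(20, len(demand)):
    train = demand[:threshold]       # data known "as of" the threshold
    actual = demand[threshold]       # the next, unseen period
    errors.append(abs(forecast_next(train) - actual))

backtest_mae = mean(errors)
print(f"back-tested MAE over {len(errors)} thresholds: {backtest_mae:.2f}")
```

Each threshold costs one full re-training pass, which is why back-testing over hundreds of thresholds, models and items quickly adds up to the massive processing requirements mentioned above.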
Third, even the most extensive back-testing is worth little if your time-series are sparse in the first place, that is, if the time-series represent items with low sales volumes. Indeed, since most of the data points of such time-series are zeroes, the back-testing process learns very little by iterating over zeroes. Unfortunately for commerce, about 90% of the items sold or serviced have a demand history that is considered sparse from a statistical viewpoint. In order to address this problem, the performance of the model should be assessed from a multiple time-series viewpoint. It is not the performance of the model over a single time-series that matters, but its performance over well-defined clusters of time-series. Model selection then becomes a balance between local and global empirical accuracy.
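A minimal sketch of this cross-series assessment, under assumed conditions (synthetic 0/1 sparse series, constant-level candidate models, squared error): each series alone is too noisy to rank the candidates, but scoring them over the whole cluster recovers the right demand level.

```python
import random
from statistics import mean

random.seed(3)

# Hypothetical cluster of sparse demand series: mostly zeroes,
# with roughly one unit sold every ten periods.
def sparse_series(rate, n=60):
    return [1 if random.random() < rate else 0 for _ in range(n)]

cluster = [sparse_series(0.1) for _ in range(50)]

def mse(series, level):
    """Mean squared error of a constant-level forecast."""
    return mean((x - level) ** 2 for x in series)

# Candidate demand levels, assessed over the entire cluster at once
# rather than over any single series.
candidates = [0.0, 0.05, 0.1, 0.2]
scores = {c: mean(mse(s, c) for s in cluster) for c in candidates}
best = min(scores, key=scores.get)
```

Pooling 50 series yields thousands of observations instead of 60, enough for the empirical scores to separate the candidates; the balance between this global score and each item's local history is where model selection happens.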
Any questions? Don’t hesitate to post them as comments.
Reader Comments (2)
Shaun, Back-testing is a very fundamental tool in statistics. It has been used for decades in virtually all domains: finance, meteorology, transport, energy, … Back-testing does not require “trust”; it is about the only solid methodology known to us when it comes to assessing the accuracy of a predictive process, and there is a mountain of literature on the subject. For a gentle introduction, you should really start reading “The Elements of Statistical Learning” at http://statweb.stanford.edu/~tibs/ElemStatLearn/
4 years ago | Joannes Vermorel
Joannes, This is extremely interesting. However, I am cringing about how I will explain this to clients. So if I understand correctly, one can perform back-testing in Lokad; however, since back-testing – which upon reading is a form of machine learning – is an advanced algorithm, does one not have a significant challenge in explaining what was done? Mustn’t one basically trust that it works? Obviously this is a main focus for Lokad, so all of you there have given this a lot of thought. However, I do think there is a limited corporate audience able to understand the specific details of how it works. So does Lokad rely upon the results of research that Lokad has performed on previous companies – showing the net benefit of this back-testing?
4 years ago | Shaun Snapp