Most engineers will tell you that:
You can't optimize what you don't measure
Turns out that forecasting is no exception. Measuring forecast accuracy is one of the few cornerstones of any forecasting technology.
A frequent misconception about accuracy measurement is that Lokad has to wait until the forecasted period has passed before it can finally compare the forecasts with what really happened.
Although this approach works to some extent, it comes with severe drawbacks:
- It's painfully slow: a 6-month-ahead forecast takes 6 months to be validated.
- It's very sensitive to overfitting. Overfitting should not be taken lightly: it's one of the few things that are very likely to wreak havoc in your accuracy measurements.
Measuring the accuracy of delivered forecasts is a tough piece of work for us. Accuracy measurement accounts for roughly half of the complexity of our forecasting technology: the more advanced the forecasting technology, the greater the need for robust accuracy measurements.
In particular, Lokad returns the forecast accuracy associated with every single forecast that we deliver (for example, our Excel add-in reports forecast accuracy). The metric used for accuracy measurement is the MAPE (Mean Absolute Percentage Error).
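For readers who have not met the MAPE before, here is a minimal Python sketch of the textbook definition: the average of |actual - forecast| / |actual| over all periods. The function name `mape` and the choice to skip periods whose actual value is zero (where the percentage error is undefined) are assumptions made for this illustration, not a description of Lokad's internal implementation.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, returned as a fraction (0.10 means 10%)."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")

    # Assumption for this sketch: periods with a zero actual are skipped,
    # because the percentage error is undefined there.
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")

    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


if __name__ == "__main__":
    # Two weeks of sales, each forecast off by 10% of the actual value.
    print(mape([100, 200], [110, 180]))
```

Since both forecasts in the usage example miss by 10% of the actual value, the resulting MAPE is 10%.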
In order to compute an estimated accuracy, Lokad relies (roughly) on cross-validation tuned for time-series forecasts. Cross-validation is simpler than it sounds. If we consider a weekly forecast 10 weeks ahead with 3 years (roughly 150 weeks) of history, then the cross-validation looks like:
- Take the 1st week, forecast 10 weeks ahead, and compare the results to the original data.
- Take the first 2 weeks, forecast 10 weeks ahead, and compare.
- Take the first 3 weeks, forecast 10 weeks ahead, and compare.
- And so on, adding one week of history at each step.
The process is rather tedious, as we end up recomputing forecasts about 150 times for only 3 years of history. Obviously, cross-validation screams for automation, and there is little hope of going through such a process without computer support. Yet, computers typically cost less than business forecast errors, and Lokad relies on cloud computing to deliver such intensive computations.
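The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `naive_forecast` is a hypothetical placeholder model (it repeats the last observed value), `rolling_origin_mape` is a name invented for this example, and only cut-off points followed by a full horizon of known values are scored, so 150 weeks of history with a 10-week horizon yield 140 forecasts rather than exactly 150. It is not Lokad's actual implementation.

```python
def naive_forecast(history, horizon):
    """Hypothetical placeholder model: repeat the last observed value."""
    return [history[-1]] * horizon


def rolling_origin_mape(series, horizon, forecast=naive_forecast):
    """Cross-validate on a growing history; return (number of forecasts, MAPE)."""
    errors = []
    folds = 0

    # Use the first 1, 2, 3, ... points as history. Assumption: stop once
    # fewer than `horizon` known values remain to compare against.
    for cutoff in range(1, len(series) - horizon + 1):
        history = series[:cutoff]
        actual = series[cutoff:cutoff + horizon]
        predicted = forecast(history, horizon)
        folds += 1

        for a, p in zip(actual, predicted):
            if a != 0:  # percentage error is undefined for zero actuals
                errors.append(abs(a - p) / abs(a))

    if not errors:
        raise ValueError("not enough non-zero history to estimate accuracy")
    return folds, sum(errors) / len(errors)


if __name__ == "__main__":
    weekly_sales = [100 + (week % 4) * 5 for week in range(150)]  # synthetic data
    folds, accuracy = rolling_origin_mape(weekly_sales, horizon=10)
    print(f"{folds} forecasts recomputed, estimated MAPE = {accuracy:.1%}")
```

Any forecasting function taking `(history, horizon)` can be passed in place of the placeholder, which makes it possible to compare models on the same history without waiting for the future to unfold.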
Attempts to "simplify" the process outlined above are very likely to end up with overfitting problems. We suggest staying very careful, as overfitting isn't a problem to be taken lightly. When in doubt, stick to a complete cross-validation.