Failing at forecasting Sin(x) and Cos(x)

Published on by Joannes Vermorel.

We have been asked whether Lokad is capable of forecasting a time-series defined by a simple mathematical function, for example f(x) = 1/x Sin(x+1). The answer is NO. I will even say that Lokad will fail miserably at forecasting polynomials and trigonometric expressions.

When succeeding on toy maths means failing on real-world data

Lokad fails on toy mathematical expressions because they are a totally different situation from real-world business time-series such as sales, call volumes or market prices. Actually, any forecasting method that is highly efficient on toy mathematical expressions would also fail miserably on real-world data. Unless you've got a good statistical background, this result will probably seem counter-intuitive. If it does not work on simple mathematical expressions, how could it work on highly complex real-world time-series?

Actually, the true explanation is really complicated (if you're ready for that, then go on and start reading the Vapnik books). Intuitively, the key issue of statistical forecasting is not to build a model that accurately fits your past data, but to build a model that accurately fits the data that you don't have (i.e. future values). This definition is tricky: how could you know that your model is good if the quality criterion is based precisely on data that does not exist yet?
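To make the gap between fitting the past and fitting the future concrete, here is a minimal Python sketch. Everything in it is a hypothetical illustration and not Lokad's method: the history is a flat level of 100 with hand-picked noise, the "perfect fit" model is a polynomial forced through every past point, and the competing model simply forecasts the historical average.

```python
# Hypothetical toy history: a flat level of 100 plus hand-picked "noise".
history = [100 + n for n in [3, -2, 1, -4, 2, -1, 4, -3, 2, -2]]
future = [100 + n for n in [1, -1, 2]]  # values the models never see

def interpolant(x, ys):
    """Evaluate at x the Lagrange polynomial passing through every (i, ys[i])."""
    total = 0.0
    for i, yi in enumerate(ys):
        weight = 1.0
        for m in range(len(ys)):
            if m != i:
                weight *= (x - m) / (i - m)
        total += yi * weight
    return total

def mean_model(x, ys):
    """Ignore x and forecast the historical average."""
    return sum(ys) / len(ys)

def mae(model, xs, actuals):
    """Mean absolute error of a model that is always fitted on `history`."""
    return sum(abs(model(x, history) - a) for x, a in zip(xs, actuals)) / len(xs)

past_x = range(len(history))
future_x = range(len(history), len(history) + len(future))

fit_poly = mae(interpolant, past_x, history)
fit_mean = mae(mean_model, past_x, history)
fcst_poly = mae(interpolant, future_x, future)
fcst_mean = mae(mean_model, future_x, future)

print(f"past error:   polynomial={fit_poly:.2f}  mean={fit_mean:.2f}")
print(f"future error: polynomial={fcst_poly:.2f}  mean={fcst_mean:.2f}")
```

The polynomial reproduces the past with no error at all, yet its forecasts are off by orders of magnitude, while the cruder average misses the past slightly and stays close on the unseen values.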

In the toy maths situation, since you already know the mathematical function, you expect the forecasting algorithm to be able to guess this function too. Yet, this is not mathematically possible, because there is an infinite number of mathematical functions that could have produced the very same time-series values. Additionally, have you ever encountered any real-world situation with a perfectly clean and de-noised time-series? We haven't. And if you also assume that noise exists, then there is no reason to even assume that a simple function exists to explain the observed data.
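The "infinite number of functions" argument can be sketched in a few lines of Python. The example assumes the toy function is read as (1/x) * sin(x + 1) and that the history consists of its values at x = 1 to 10; the function g and the constant c are hypothetical, chosen only for illustration.

```python
import math

def f(x):
    """The toy function, read here as (1/x) * sin(x + 1) (an assumption)."""
    return math.sin(x + 1) / x

# Observed "history": the function sampled at x = 1, 2, ..., 10.
observed_x = list(range(1, 11))

def g(x, c=1e-6):
    """A different function that matches f at every observed point.

    The added product term is zero at x = 1..10, so g agrees with f on the
    whole history, yet it differs everywhere else. Each value of c gives
    another such function.
    """
    bump = 1.0
    for xi in observed_x:
        bump *= (x - xi)
    return f(x) + c * bump

max_history_gap = max(abs(f(x) - g(x)) for x in observed_x)
future_gap = abs(f(15) - g(15))

print("largest gap on the history:", max_history_gap)
print("gap at x = 15:", future_gap)
```

Both functions produce exactly the same ten observations, so no algorithm looking only at the history can tell them apart, even though they disagree wildly a few steps later.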

In conclusion, Lokad does not assume that any toy maths expression even exists to explain the observed business data, because empirical evaluations indicate that this kind of assumption is totally wrong. As a consequence, Lokad fails on toy maths expressions. Yet, this is the price to pay to perform accurate forecasts on real-world business time-series.

Categories: accuracy, forecasting, technical, tips