Entries in accuracy (16)
Errors are part of the (web) service
Since its launch in November 2006, Lokad has been providing programmatic access to its forecasts through the Lokad Web API. From the very first day, we have done our best to ensure the best forecasting accuracy. Yet, the future cannot always be accurately predicted based on historical data alone.
"It is far better to foresee even without certainty than not to foresee at all." Henri Poincaré, in The Foundations of Science, 1946
This situation is typically dealt with through an estimation of the forecast error: in addition to the forecast itself, an estimate of the forecast error is also computed. For example, this is the suggested approach for safety stock calculation.
Thus, we have decided to extend the Web API with estimated forecast errors. For any forecasting task, it is now possible to retrieve the Mean Absolute Percentage Error (MAPE); see the GetMape and GetMapes web methods. The new version of the Web API stays fully compatible with existing applications.
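For readers unfamiliar with the metric, here is a minimal sketch of how a Mean Absolute Percentage Error is usually computed from past forecasts and the actual values. It is not the implementation behind GetMape; the function name `mape`, the sample figures and the choice to skip periods with zero actual sales are all assumptions made for illustration.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, as a fraction (0.1 means 10%)."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    # Assumption: periods with zero actual sales are skipped, because the
    # percentage error is undefined there.
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Made-up figures, for illustration only.
actual_sales = [100, 120, 80, 100]
forecast_sales = [110, 108, 88, 95]
print(f"MAPE: {mape(actual_sales, forecast_sales):.1%}")
```

The three first periods are each off by 10% and the last one by 5%, so the average error is a little under 9%.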
Dealing with exceptional sales
Lately, a few users have been asking how Lokad handles exceptional sales that won't be happening again anytime soon. This issue is usually known as outlier management: how to deal with data points that are completely off compared to the other points.
Lokad is only looking at historical data, and since we don't know that the exceptional sales aren't going to happen again (expert knowledge), this event is going to be interpreted as "real" sales, leading to potentially over-estimated sales forecasts. This situation is a typical case of demand forecasts vs. sales forecasts.
At this point, the natural solution would be to give the user the possibility to adjust the historical data so that it truly reflects the demand. We are currently working on a solution dedicated to safety stock calculations that would handle this sort of situation. Stay tuned.
Choosing the right forecast period
Forecasting consists of producing figures that are supposed to reflect the future. But those figures depend heavily on the period chosen for data aggregation. Lokad supports the most frequently used periods: quarter-hour, half-hour, hour, day, week, month, quarter, semester, year ...
Intuitively, the longer the considered period, the easier it is to make an accurate forecast. For example, yearly forecasts eliminate seasonal variations. However, a short forecasting period might provide a false sense of accuracy (e.g. forecasting daily candy sales over the next two years), whereas a long period might be unsuited to operational decisions (e.g. trying to optimize the weekly worker schedules of the candy manufacturing unit based on yearly forecasts).
A careful choice of the forecasting period is essential to make the most of forecasting. Yet, surprisingly, this question is frequently left mostly unanswered in books treating the subject of forecasting for practitioners (usually focusing on sales or demand forecasting). Typical answers are "most of the manufacturing industry is using monthly forecasts" and "many large retailers are using weekly forecasts".
Yet, simple assumptions can lead to practical quantitative clues to make this choice. If we just assume that forecast errors follow a normal distribution, then the expected increase of the error when switching to a shorter period is:
- year → month: √(12/1) ≈ 3.5 (i.e. error multiplied by 3.5)
- month → week: √(31/7) ≈ 2.1 (assuming a month with 31 days)
- week → day: √(6/1) ≈ 2.4 (assuming 6 business days per week)
- hour → quarter-hour: √(4/1) = 2
Although the normal distribution assumption rarely holds exactly, those figures are quite representative of most situations. They can be used to evaluate whether changing the forecasting period is worthwhile when the forecast error is too high or when the forecast period is too long.
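The square roots above can be recovered with a short reasoning. If a long period is made of n shorter periods with independent demand of the same mean μ and standard deviation σ, the total has mean nμ and standard deviation σ√n, so its relative error is σ/(μ√n), against σ/μ for a single short period: a ratio of √n. The sketch below simply recomputes the figures of the list; the function name `error_multiplier` is ours, and the independence assumption is stated in its docstring.

```python
import math


def error_multiplier(sub_periods):
    """Factor by which the relative forecast error grows when one long period
    is split into `sub_periods` shorter ones, assuming independent and
    identically distributed demand in each sub-period."""
    return math.sqrt(sub_periods)


# Ratios taken from the list above.
transitions = {
    "year -> month": 12 / 1,
    "month -> week": 31 / 7,
    "week -> day": 6 / 1,
    "hour -> quarter-hour": 4 / 1,
}
for name, ratio in transitions.items():
    print(f"{name}: error x {error_multiplier(ratio):.2f}")
```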
Past stock-outs may generate future stock-outs
Accurate forecasts are critical because each extra percent of forecast error comes with a steep price, literally. Indeed, the costs associated with forecast errors are usually supra-linear or, put more simply, they increase much faster than the error itself.
As a simple example, a greater forecast error increases the need for safety stocks and thus working capital requirements. But if the working capital goes too high, bank interest rates start to rise, leading to even more expensive safety stocks.
But there are also more subtle negative consequences: past forecast errors may lower future accuracy. Indeed, historical demand itself is rarely known; instead, we usually rely on historical sales data as an efficient approximation of the demand. Yet, this approximation is not perfect. For example, a stock-out prevents any sale from being made for a particular product. Yet, in case of a stock-out, zero sales do not equate to zero demand.
For statistical forecasting algorithms that rely on time-series analysis, it can be quite hard, using sales data alone, to distinguish zero sales caused by a stock-out from zero demand. As a result, a lot of stock-outs (as they lead to lower sales) can be statistically interpreted as lower demand, which eventually generates even more stock-outs.
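The bias can be made concrete with a toy example. The figures below are invented, and excluding known stock-out days is only one simple way to correct the history, shown here as an assumption rather than as the way Lokad handles the problem.

```python
def average(values):
    return sum(values) / len(values)


# Hypothetical data: true demand is a steady 10 units per day over 10 days.
daily_demand = [10] * 10
# Days on which the shelf was empty (illustrative assumption).
stock_out_days = {3, 4, 8}

# Recorded sales are zero on stock-out days, even though demand was not.
daily_sales = [0 if day in stock_out_days else demand
               for day, demand in enumerate(daily_demand)]

# Averaging raw sales treats every zero as "no demand".
naive_estimate = average(daily_sales)

# If stock-out days are known, they can be left out of the history.
corrected_estimate = average([sales for day, sales in enumerate(daily_sales)
                              if day not in stock_out_days])

print("true demand per day:          ", average(daily_demand))
print("estimate from raw sales:      ", naive_estimate)
print("estimate excluding stock-outs:", corrected_estimate)
```

With three stock-out days out of ten, the raw sales average understates demand by 30%, which would lead to smaller replenishment orders and, in turn, to more stock-outs.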
Increasing your forecast accuracy now is one of the keys to increasing your forecast accuracy tomorrow. Accurate forecasting is not the destination but the journey.
The No. 1 never-asked question about forecasting
There is a question of utmost importance when it comes to statistical forecasting: what is the error function used during the learning process? Indeed, it is the error function that lets you evaluate whether a forecast is good or bad. It's also the very same error function that drives your learning process when building a statistical model.
Finding an error function isn't hard. Quite the opposite, there are plenty of error functions available: Mean Squared Error (MSE), Mean Absolute Deviation (MAD), Median Absolute Deviation, Mean Absolute Percentage Error (MAPE), and so on.
Yet, in almost 1 year of existence for Lokad, the question of the choice of error function has never been raised by any customer. Well, this situation is very natural, as Lokad is precisely taking charge of the whole forecasting process.
For those who might be interested, the answer is, unfortunately, not simple. Lokad uses several error functions depending on the context. For the benchmarks, we often use a bounded version of the MAPE (identical to the classical MAPE, but upper bounded to 1). The upper bound makes the process more robust against pathological time-series that would otherwise have had huge errors.
Yet, if the data is not too noisy (i.e. not too many outliers), then we often use the MSE function, which tends to be much more practical from a computational viewpoint.
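To illustrate the difference between the two functions, here is a small sketch. The post does not say exactly where the upper bound is applied, so capping the error of each period at 1 is an assumption, as is the treatment of periods with zero actual sales; the function names and the data are ours.

```python
def bounded_mape(actuals, forecasts, bound=1.0):
    """MAPE where the percentage error of each period is capped at `bound`.
    Capping per period is an assumption made for this sketch."""
    errors = []
    for a, f in zip(actuals, forecasts):
        if a == 0:
            # Assumption: any miss against a zero actual counts as the maximum error.
            errors.append(0.0 if f == 0 else bound)
        else:
            errors.append(min(abs(a - f) / abs(a), bound))
    return sum(errors) / len(errors)


def mse(actuals, forecasts):
    """Mean Squared Error."""
    return sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals)


# Made-up series: the third point is pathological (actual 1, forecast 50).
actuals = [100, 100, 1, 100]
forecasts = [90, 110, 50, 100]

print("bounded MAPE:", bounded_mape(actuals, forecasts))
print("MSE:", mse(actuals, forecasts))
```

Without the bound, the third point alone would contribute a 4900% error and dominate the average; with the bound it counts as 100% at most. The MSE has no such protection, which is why it is better suited to data without too many outliers.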
Perceived quality issues in forecasting
Software quality is a challenge. As a software developer, you have to make sure that your code keeps working in all sorts of unexpected situations. Plenty of methods, tools and processes are available to improve quality. Lokad makes intensive use of those.
But there is another aspect: perceived quality. The user's opinion depends on many (many) purely subjective aspects of your product. For B2C, product design and aesthetics are probably among the top factors in perceived quality (think iPod).
So far, so good: as a product developer, it simply means that you need to invest a certain amount of effort in your product design. But what happens when perceived quality conflicts with actual quality? (Think of the devilish method required to change your iPod battery.)
In the case of Lokad, where we are delivering time-series forecasts, the situation is even more complicated because statistical forecasting is just so not intuitive.
For example, we have many customers who actually try out a couple of points to see what they get. Yet, this is really not the way to go to evaluate Lokad. The right way involves a proper training dataset and a testing dataset of your own actual business data (plus many other considerations, but it's beyond the scope of this post).
Unfortunately for us, many customers are judging Lokad on the forecast they get after entering a dozen points generated by some function like Cos(x) or Sin(x). Actually, it would be possible to hard-code a few heuristics in Lokad just to detect those attempts (and their underlying mathematical functions). But those heuristics would actually decrease the overall accuracy for the users having real business data in their accounts.
Then, we have another issue: our forecasts are not exactly real time. You can retrieve your forecasts any time, but if you retrieve your forecasts 0.1s after finishing the upload of your data (through our Web Services API), Lokad won't have had the time to try complex/advanced statistical models. As a result, you will get real-time but naive forecasts.
Lokad does its best to provide an end-to-end forecasting service, but to some extent it can't escape the Law of Leaky Abstractions: in order to make the most of Lokad, one needs to understand, at least a little, how statistical forecasting works and how the constraints are handled by Lokad.
Would you pay for moving average?
Many customers are asking THE question: which forecasting models are you using? Indeed, our technology page isn't very specific on the subject.
Disclaimer: I am not really going to answer this question in this post, so please, don't be too disappointed.
Actually, there are two main reasons why we do not disclose this information:
- it's a proprietary technology (like Google search).
- it's a super counter-intuitive technology.
Yet, in order to clarify the situation, I can say that Lokad is not using any silver-bullet forecasting model (i.e. a super-model that would fit all situations), but tons of models instead.
For example, we do use the simple moving average (among others, naturally), which is probably the most naive forecasting method. Intuitively, the simple moving average says: if you want to know the total sales next month, just take the average monthly sales over the last 6 months.
At first sight, it might appear shocking to sell forecasts if, in the end, it's a moving average model that gets used. But, in my opinion, it is not.
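As a minimal sketch, the simple moving average described above fits in a few lines. The function name and the sales figures are made up for the example.

```python
def moving_average_forecast(history, window=6):
    """Forecast the next period as the mean of the last `window` periods."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not history:
        raise ValueError("history must not be empty")
    # If the history is shorter than the window, the whole history is used.
    recent = history[-window:]
    return sum(recent) / len(recent)


monthly_sales = [120, 135, 128, 140, 150, 145, 155, 160]  # made-up figures
next_month = moving_average_forecast(monthly_sales, window=6)
print(f"forecast for next month: {next_month:.1f}")
```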
Indeed, producing forecasts through a statistical model is only the last step of a complicated process. Before that, you need to choose the model to be used. And, this step is very complicated.
Thus, Lokad can indeed produce a forecast based on a moving average model, if we detect the moving average model as being the best available model for this particular situation (in practice, this situation arises for very short or very erratic time-series).
Batteries Included. Python motto.
But the key difficulty of the problem is to understand why the moving average model has been selected. With regular statistical packages, choosing the right model is the user's burden. With Lokad, it's part of the service.
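To give an idea of what "choosing the model" means, here is a deliberately simplistic sketch: a few naive candidate models are scored on the most recent points of the series, each forecast being made with only the data that precedes it, and the candidate with the lowest error wins. This is a generic holdout comparison, not Lokad's proprietary selection process; the candidates, the error measure and the sample series are all assumptions.

```python
def moving_average(history, window=3):
    recent = history[-window:]
    return sum(recent) / len(recent)


def last_value(history):
    return history[-1]


def seasonal_naive(history, season=4):
    # Repeat the value observed one season ago; fall back to the last value.
    return history[-season] if len(history) >= season else history[-1]


CANDIDATES = {
    "moving average": moving_average,
    "last value": last_value,
    "seasonal naive": seasonal_naive,
}


def backtest_error(model, series, holdout):
    """Mean absolute error of one-step-ahead forecasts over the last `holdout` points."""
    errors = []
    for t in range(len(series) - holdout, len(series)):
        forecast = model(series[:t])  # the model only sees data before t
        errors.append(abs(series[t] - forecast))
    return sum(errors) / len(errors)


def select_model(series, holdout=4):
    """Return the name of the best candidate and the score of every candidate."""
    scores = {name: backtest_error(model, series, holdout)
              for name, model in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up series: one with a clean 4-period pattern, one erratic around 10.
seasonal_series = [10, 20, 30, 40] * 3
erratic_series = [10, 14, 7, 12, 9, 15, 6, 11, 13, 8, 12, 9]

print(select_model(seasonal_series)[0])
print(select_model(erratic_series)[0])
```

On the patterned series the seasonal candidate is selected, while on the erratic one the moving average comes out ahead, which matches the remark above about erratic time-series.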
PS: there are more complex variants of the moving average where decreasing coefficients (also called weights) are applied to the time-series, but that's beyond the scope of this discussion.
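For the curious, a weighted variant could look like the sketch below. The function name, the weights and the figures are arbitrary choices for illustration.

```python
def weighted_moving_average(history, weights):
    """Weighted mean of the most recent points.

    `weights[0]` applies to the most recent point, `weights[1]` to the one
    before it, and so on. Decreasing weights give more importance to recent data.
    """
    if not weights or len(history) < len(weights):
        raise ValueError("need at least as many history points as weights")
    recent_first = history[::-1][:len(weights)]
    total = sum(w * x for w, x in zip(weights, recent_first))
    return total / sum(weights)


monthly_sales = [100, 110, 120, 130]  # made-up figures, oldest first
print(weighted_moving_average(monthly_sales, weights=[4, 3, 2, 1]))
```

With these weights the forecast is 120, against 115 for the plain average of the same four months, because the most recent (and highest) values count more.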
Business forecasting and its practical application
Practical Business Forecasting is a relatively recent development.
Businesses must consider impartially how they, in their different businesses, can benefit from a systematic forecast of business conditions. Many may be somewhat skeptical (which in some ways of itself is a healthy attitude of mind), but any who are need to remember how skeptical they were of things such as costing, planning, routing, functional organization and other things which are now very much accepted as essential to the efficient conduct of industry.
The attitude of many towards business forecasting still may be, unfortunately, “You might as well try to forecast the weather!” Wouldn’t the practical and sensible response to this be that we do try to forecast the weather? Realistically, we know that, in most climates, the weather can never really be forecasted for any substantial period of time with any certainty. Yet, surely, a scientific forecast with a certain margin of error is better than no forecast at all?
Don’t most of us act on this belief by listening to the forecasts? Farmers and other agricultural businesses have to rely on forecasts to harvest their crops. As a matter of fact, many have mentioned that they are very agreeably surprised at the degree of accuracy of weather forecasting as a whole.
It is not my intention to suggest that at the present time the vast complex of business can be forecasted very far in advance, nor with complete accuracy. Yet everyone in business must continually be forecasting, in a careful and systematic way, based on certain dependable principles. Forecasting is likely to be a much more certain guide than any feelings of the moment we may have.
The point is we must forecast. We cannot take any important action in our business lives without forecasting!
As a matter of fact I would go so far as to agree with the viewpoint that a man who does not forecast may be defined as a savage! He has very little sense of time. The past and the future invariably have very little meaning for him.
For anyone to ever say that because the future is difficult to foresee we should give up all and any attempts to foresee it seems absolutely ridiculous. To be sure any business man holding a view similar to this should retire immediately, while he may still have some money on which to do so.
FORECAST OR FAIL! Every man or woman in business must forecast!
Selecting a forecasting method
The choice of a forecasting technique is significantly influenced by the stage of the product life cycle and sometimes by the type of firm or the industry for which a decision is being made.
At the beginning of a product life cycle, relatively small expenditures are made for research and market investigation. During the product introduction phase, these expenditures start to increase. In the rapid growth stage, because decisions involve considerable amounts of money, a high level of accuracy is desirable. And after the product has entered the maturity stage, decisions are much more routine, involving marketing and manufacturing. These are important considerations in choosing a sales forecast technique.
After evaluating the particular stages of the product along with firm and industry life cycles, a further probe is necessary. Instead of selecting a forecasting technique by using whatever seems applicable, decision-makers should determine what is most appropriate. Some techniques are quite simple and inexpensive; others are extremely complex, require significant amounts of time to develop, and may be quite expensive. Some are best suited for short-term projections; others are better suited for intermediate or long-term forecasts.
The choice of technique or techniques depends on the following criteria:
- How much will it cost to develop the forecasting model compared with the potential gains resulting from its use? The choice is one of benefit-cost trade-off.
- How complicated are the relationships that must be forecast?
- Is the forecast for short-run or long-run purposes?
- How much accuracy is desired?
- Is there a minimum tolerance level of error?
- What data are available? Techniques vary in the amount of data they require.
Now, a few comments about one of the methods used:
The Qualitative Approach – the qualitative or judgmental approach can be useful in formulating short-term forecasts and can also supplement projections based on the use of any of the quantitative methods.
Four of the better-known qualitative forecasting methods are Executive Opinions, the Delphi Method, Sales Force Polling and Consumer Surveys.
Beginning with Executive Opinions – with this approach the subjective estimates of executives or experts from sales, production, finance, purchasing and administration are averaged to generate a forecast about future sales.
The Delphi Method – this method is a group technique in which a panel of experts is individually questioned about their perceptions of future events.
Sales Force Polling – some companies use as a forecast source sales people who have continual contact with customers. It is believed that sales people, who are closest to the customers, have significant insights into the future market.
Consumer Surveys – some companies conduct their own market surveys of consumer purchasing plans. These range from telephone contact and personal interviews to questionnaires.
For more detailed information, see also Strategic Business Forecasting, by Jae K. Shim.
Want more accuracy? Start uploading now!
Lokad does not only provide hosted forecasting services: we also continuously monitor the forecast accuracy for every single account that has been populated with time-series data (such as the sales data provided by the Lokad add-ons).
The purpose of this ongoing monitoring is first to detect any issue with our algorithms early, but ultimately, the purpose is also to improve the overall forecasting accuracy, either by tweaking and improving our algorithms to better match our customer data, or by introducing new algorithms that handle specific situations more accurately.
Care a lot about accuracy but not ready to integrate forecasting in your daily operations? Upload your data now and revert your Lokad account to Free (that way you won't get charged). By uploading your data now, you make it possible for the Lokad staff to start improving our forecasting technology taking into account the specific needs expressed by your business data.