Demand, Sales and Workload Forecasting Software

Entries in insights (16)

Wisdom of crowds and future enterprise software

What do reCAPTCHA, Akismet and Lokad have in common?

Well, they are all application blocks leveraging their respective communities to better fulfill their goals.

At Lokad, we believe (like many others, actually) that the future of enterprise software, such as ERP or CRM, will rely on Service-Oriented Architecture (SOA for short), where each component of the system is largely decoupled from the other components, and eventually provided by independent software vendors.

Considering our initial question, let's note that:

  • Akismet collects blog comments from a crowd of blogs in order to better filter out the blog spam.
  • reCAPTCHA (*) provides an anti-botnet filtering system to distinguish real human users from machines on the web.
  • Lokad collects demand data from a crowd of companies in order to improve the overall forecast accuracy.

A modular approach gives the enterprise much more flexibility to design an IT system that really fits its needs (rather than the other way around). Modularity also reduces the risk of being dragged down because one piece of the system has become obsolete after a particular software vendor stopped investing in it.

Yet another benefit brought by specialized actors (think Akismet for blog spam filtering) is that it suddenly becomes possible to leverage the community to deliver smarter behaviors: Akismet uses the information obtained on each blog to improve its spam filtering on all the other blogs.

Today, it seems that the first wave of crowd-enabled enterprise components has been mostly oriented toward security tasks. The rise of security standards such as OpenID is likely to push this componentization of security blocks within enterprise software even further.

Yet, we believe that the second wave of crowd-enabled components will be oriented toward business analytics, ranging from intellectual property management and customer behavior analysis to operations research. Obviously, we want Lokad to be a leading player in this second wave. :-)

(*) Due to a botnet attack on our forums last week, we have upgraded the forum captcha to reCAPTCHA. The number of non-human registrations has dropped from hundreds per day to zero.

Posted on Monday, October 6, 2008 at 02:48PM by Joannes Vermorel

Keeping track of errors to improve later on

Lately, a couple of customers have asked whether Lokad keeps track of its past forecast errors in order to improve its future forecasts.

The answer is simple: yes, we do, but there is more to it than that. In particular, we do not wait for

  • the forecasts to be requested,
  • the course of events to happen,
  • the historical data to be updated,

to finally compare our past forecasts with what really happened. Indeed, such an approach would be way too slow and inefficient.

Instead, we use cross-validation methods adapted to time-series forecasting. The process is simpler than it sounds; let's start with an example.

Let's assume that we have a single time-series worth 1 year of weekly sales data (i.e. 52 points). We want to produce 4-week sales forecasts, but we also want to estimate the forecasting error. The process goes as follows:

  • take the first N points (with N = 10 initially).
  • create a forecasting model based on those N points.
  • create a 4-week-ahead forecast based on this model.
  • compare the forecast with the actual values in the complete series.
  • increment N by 1 point (i.e. 1 week).
  • repeat for as long as 4 actual points remain beyond N.
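The steps above can be sketched in a few lines of Python. This is only an illustration, not the code running at Lokad: the moving-average model, the function names and the sales figures are all made up for the example, and any model with the same signature could be plugged in instead.

```python
from statistics import mean


def moving_average_forecast(history, horizon, window=4):
    """Hypothetical stand-in model: repeat the mean of the last `window` points."""
    level = mean(history[-window:])
    return [level] * horizon


def rolling_origin_errors(series, horizon=4, initial_n=10,
                          model=moving_average_forecast):
    """Mean absolute error of each successive `horizon`-step forecast."""
    errors = []
    n = initial_n
    # Stop once fewer than `horizon` actual points remain to compare against.
    while n + horizon <= len(series):
        forecast = model(series[:n], horizon)   # model built on the first n points
        actual = series[n:n + horizon]          # what really happened next
        errors.append(mean(abs(f - a) for f, a in zip(forecast, actual)))
        n += 1                                  # move the origin forward by one week
    return errors


# Made-up weekly sales, 52 points.
weekly_sales = [100 + (week % 4) * 5 for week in range(52)]
errors = rolling_origin_errors(weekly_sales)
estimated_error = mean(errors)  # overall estimate of the 4-week forecast error
```

To compare two models, one would run `rolling_origin_errors` once per model on the same series and keep the model with the lowest `estimated_error`.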

With cross-validation, we can accurately estimate the expected forecast error of a forecasting model. In particular, if you have two different models, cross-validation can help you choose the best one (*). Cross-validation can also be used to adjust model parameters in order to find the ones that best fit the data.

The Lokad team continuously monitors accuracy on delivered forecasts with such cross-validation methods and keeps working on more accurate forecasting models. Thus, we do keep track of our forecast errors, but without waiting for them to happen.

(*) If you try too many models, then you are likely to end up with overfitting issues, but this problem is beyond the scope of this post.

Posted on Monday, September 1, 2008 at 03:34PM by Joannes Vermorel

Forecasting methods for Excel

Lokad delivers an advanced forecasting technology, but for those who just need something simple, we have published a short guide about forecasting in Excel.

It is meant to be as simple as possible, and shows what can be done with Excel in terms of forecasting, using linear regression or exponential fitting, through trendlines or the Excel Analysis ToolPak (ATP).

It should allow small businesses, retailers or manufacturers, to easily perform a basic forecast of sales and demand.

It also gives useful advice on how to make the most of Excel, use comments, and name your files and worksheets.

Posted on Wednesday, June 18, 2008 at 11:33AM by Joannes Vermorel

Unfortunate "period start" setting name

In all the Lokad applications, both add-ons and the web application itself, there is a mysterious forecasting task setting named the period start.

Mea culpa. Choosing this name was really unfortunate and led to a lot of confusion.

Thus, I have decided to rename this parameter the period reference. Its default value is 2001-01-01 (i.e. Monday, January 1st, 2001), and unless you know why you need to change this value, I strongly suggest that you leave it as it is.

Let's start with a practical example. Let's assume that, as a retailer, you need monthly sales forecasts in order to make your monthly replenishment orders. Yet, your business months start on the 15th of each month. All your suppliers expect your orders to be placed on the 15th, and for years, all your monthly sales analyses have started on the 15th.

In such a situation, it would be a real pain if Lokad arbitrarily decided that a month had to start on the 1st. In order to avoid such a pitfall, Lokad provides an additional setting in the forecasting task definition that lets you adjust when you want the period to start: it's the period reference.

If you want your monthly forecasts to start on the 15th of each month, then you can use 2001-01-15 as your period reference (or 1999-02-15 or 2017-11-15, the result would be the same). This date is used as a reference to infer the starting dates of all the other periods.

Further examples:

  • If the period reference is set to 2001-01-15 for a yearly forecast, then all years start on January 15th (instead of January 1st).
  • If the period reference is set to 2001-01-15 for a weekly forecast, then all weeks start on Mondays (because 2001-01-15 was a Monday).
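For readers who prefer code, here is a rough Python sketch of how a period boundary can be inferred from a reference date. It is only an illustration of the idea: the `period_start` function and its arguments are hypothetical, not part of any Lokad API, and the sketch assumes a reference day that exists in every month (28th or earlier).

```python
from datetime import date, timedelta


def period_start(day, reference, period):
    """Start of the period containing `day`, aligned on `reference` (hypothetical helper)."""
    if period == "week":
        # Python's % always returns a value in 0..6 here, even for a future reference.
        offset = (day - reference).days % 7
        return day - timedelta(days=offset)
    if period == "month":
        # Assumption: reference.day <= 28, so that it exists in every month.
        start = day.replace(day=reference.day)
        if start > day:
            # The boundary of this month is still ahead: step back one month.
            last_of_previous = start.replace(day=1) - timedelta(days=1)
            start = last_of_previous.replace(day=reference.day)
        return start
    if period == "year":
        start = day.replace(month=reference.month, day=reference.day)
        if start > day:
            start = start.replace(year=start.year - 1)
        return start
    raise ValueError("unsupported period: " + period)


today = date(2008, 4, 7)
monthly = period_start(today, date(2001, 1, 15), "month")
same_monthly = period_start(today, date(2017, 11, 15), "month")
yearly = period_start(today, date(2001, 1, 15), "year")
weekly = period_start(today, date(2001, 1, 16), "week")  # 2001-01-16 was a Tuesday
```

Whether the reference lies in the past or in the future makes no difference: only its position within the week, month or year is used.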

In summary: The period reference, previously named "period start", is a date (past or future, it does not matter) used by Lokad to adjust the period boundaries, both for historical data aggregation and for the forecasts themselves. In particular, it has nothing to do with the starting date of the forecasts.

Posted on Monday, April 7, 2008 at 08:39PM by Joannes Vermorel

Dealing with exceptional sales

Lately, a few users have been asking how Lokad handles an exceptional sale that won't happen again anytime soon. This issue is usually known as outlier management: how to deal with data points that are completely off compared to the other points.

Lokad only looks at historical data, and since we have no way of knowing that the exceptional sale isn't going to happen again (that is expert knowledge), this event is going to be interpreted as "real" sales, potentially leading to over-estimated sales forecasts. This situation is a typical case of demand forecasts vs. sales forecasts.

At this point, the natural solution would be to give the user the possibility to adjust the historical data so that it truly reflects the demand. We are currently working on a solution dedicated to safety stock calculations that would handle this sort of situation. Stay tuned.

Posted on Wednesday, January 30, 2008 at 10:58AM by Joannes Vermorel

Prediction markets vs. Lokad

In a previous post, I discussed the various worlds of forecasting software, outlining 3 main categories:

  • Deterministic simulation software
  • Expert insights aggregation software
  • Statistical forecasting software

Lokad is clearly a member of the third category. However, those three categories do not really compete with each other, since they are usually not suited to the same types of situations.

In the second category, insight aggregation software, prediction market software seems to attract more and more interest. Jed Christiansen has a very interesting review of prediction market software.

The overview page of Inlink provides an insightful summary of prediction markets:

Prediction markets enable a diverse group of people to predict the answer to a question by buying and selling shares in stocks representing the possible answers. Using a stock market-like mechanism allows people to express their opinion as a "weighted vote" over time in response to new information or a change of opinion. And unlike a poll, a prediction market is asking "what will happen?" vs. "what do you want to happen?"

For example, if we ask the question: "Who will win the singing contest?" The four contestants would be represented as stocks that people buy shares in. If "Contestant A" has a stock price of $56, that means "the crowd" thinks there is a 56% chance that contestant will win. When people buy shares in that contestant, the price goes up. When they sell shares in that contestant, the price goes down. The stock price of an answer represents the probability of that answer being correct. The highest priced stock after a period of time is considered the group's answer to the question posed.

The main difference with classical insight aggregation software is that the participants are financially involved in getting the right forecast.

Compared to Lokad (or to any statistical forecasting software), the main benefit of prediction markets is the ability to rationally tackle a forecast that depends on (potentially) irrational customer desires, even when no relevant data is available. The crowd offers a remedy for the small-group-of-experts bias that usually plagues classical prospective methods such as the Delphi method.

Yet, like any insight aggregation method, prediction markets involve quite an expensive forecasting process to get a single question answered. For example, it would not seem a very practical approach for call centers that require 96 quarter-hour forecasts on a daily basis to predict inbound call volumes. If meaningful historical data is available, then statistical forecasts should be as accurate (if not more so) and much cheaper.

In its own statistical way, Lokad also uses the wisdom of the crowd, in a sense, except that instead of considering a panel of people, we consider a panel of business time-series that we exploit to improve the overall forecasting accuracy. In both cases, leveraging larger input datasets to improve forecasting accuracy is a key idea.

Posted on Monday, January 21, 2008 at 07:24PM by Joannes Vermorel

Choosing the right forecast period

Forecasting consists of producing figures that are supposed to reflect the future. But those figures depend heavily on the period chosen for data aggregation. Lokad supports the most frequently used periods: quarter-hour, half-hour, hour, day, week, month, quarter, semester, year ...

Intuitively, the longer the considered period, the easier it is to make an accurate forecast. For example, yearly forecasts eliminate seasonal variations. However, a short forecasting period might provide a false sense of accuracy (e.g. forecasting daily candy sales over the next two years), whereas a long period might be unsuited to operational decisions (e.g. trying to optimize the weekly worker schedules of the candy manufacturing unit based on yearly forecasts).

A careful choice of the forecasting period is essential to make the most of forecasting. Yet, surprisingly, this question is frequently left mostly unanswered in books on forecasting for practitioners (which usually focus on sales or demand forecasting). Typical answers are "most of the manufacturing industry uses monthly forecasts" and "many large retailers use weekly forecasts".

Yet, simple assumptions can lead to practical quantitative clues to make this choice. If we just assume that forecast errors are independent and follow a normal distribution, then the expected increase in relative error when switching to a shorter period is:

  • year → month: √(12/1) ≈ 3.5 (i.e. relative error multiplied by 3.5)
  • month → week: √(31/7) ≈ 2.1 (assuming a month with 31 days)
  • week → day: √(6/1) ≈ 2.4 (assuming 6 business days per week)
  • hour → quarter-hour: √(4/1) = 2

Although the normal distribution assumption rarely holds exactly, those figures are quite representative of most situations. They can be used to assess whether it is worth changing the forecasting period when the forecast error is too high or when the forecast period is too long.
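The square-root figures come from a simple argument: if a long period is the sum of k shorter periods with independent errors of the same size, the error of the sum grows like √k while the total grows like k, so the relative error of each short period is √k times larger than that of the long one. The small Python sketch below just recomputes the factors; the function name and the dictionary are made up for the example.

```python
from math import sqrt


def relative_error_factor(sub_periods):
    """Factor applied to the relative error when one long period is split
    into `sub_periods` shorter ones (assumes independent, identically sized errors)."""
    return sqrt(sub_periods)


# Number of short periods per long period, using the same conventions as in the list.
switches = {
    "year -> month": 12,
    "month -> week": 31 / 7,        # a month with 31 days
    "week -> day": 6,               # 6 business days per week
    "hour -> quarter-hour": 4,
}

factors = {name: round(relative_error_factor(k), 2) for name, k in switches.items()}

for name, factor in factors.items():
    print(f"{name}: relative error multiplied by about {factor}")
```

The same function can be used for any other switch, for example `relative_error_factor(24)` when going from daily to hourly forecasts for a business open around the clock.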

Posted on Monday, January 14, 2008 at 10:16AM by Joannes Vermorel

Chinese food vs. Sports bar while forecasting sales

Lokad has a pretty unique approach to forecasting: we leverage all the data that we have to perform every single forecast. While talking with a customer, I was asked whether Lokad would mix Chinese food data with sports bar data. Indeed, the customer was worried that we might mix data exhibiting very different sales patterns, even though both belong to the same food and drink retail industry.

In fact, the more abstract question was: how refined is the notion of industry segments within Lokad? Well, the real answer is that we don't have any notion of industry segments in Lokad. And, in my humble opinion, it would be a really poor idea to even try to improve statistical forecasts based on such information:

  • No matter how refined your industry segment classification is, it's still a very poor approximation of reality. Industry segments are changing all the time, and who knows whether sales of Thai food exhibit the same patterns as sales of Vietnamese food. In a way, this is why dmoz.org is massively less popular and useful than search engines.
  • Even a small point of sale usually generates a dozen time-series (one for each product being sold) over at least 200 working days per year. Thus, one year of history already represents more than 1,000 numbers to be exploited. The information contained in those numbers dwarfs the amount of information contained in the classification, which would typically be represented as just a few numbers.
  • Creating a classification that matches the forecasting purposes is probably as hard as, if not harder than, the forecasting task itself. Indeed, an efficient classification would have to tell whether business segments will exhibit the same patterns in the future.

Instead of relying on such a manual classification, Lokad is relying directly on statistical correlations: if some data can be used to improve the considered forecast, then do it; if the data cannot be used to achieve that, then just ignore the data. With proper statistical tools, more data does not hurt and storing data has never been cheaper.

Back to the initial Chinese food vs. sports bar example, the reality is more complex than it seems. Some products sold in both places, let's say ice cream, might exhibit similar sales patterns because they depend a lot on the weather, while some others, let's say beers, might behave very differently. Lokad relies on automated processes to validate the correlations for every single forecast, as opposed to doing it once for a whole industry segment.

Introducing industry segments in Lokad would be like reverting from full text search to a hierarchical directory: time-consuming, and, in the end, much less efficient.

Posted on Monday, January 7, 2008 at 05:42PM by Joannes Vermorel

Past stock-outs may generate future stock-outs

Accurate forecasts are critical because each extra percent of forecast error comes with a steep price, literally. Indeed, the costs associated with forecast errors are usually supra-linear; put more simply, the costs increase much faster than the error itself.

As a simple example, a greater forecast error increases the need for safety stocks and thus working capital requirements. But if the working capital goes too high, bank interest rates start to rise, leading to even more expensive safety stocks.

But there are also more subtle negative consequences: past forecast errors may lower future accuracy. Indeed, historical demand itself is rarely known; instead, we usually rely on historical sales data as an efficient approximation of the demand. Yet, this approximation is not perfect. For example, a stock-out prevents any sale from being made for a particular product. Yet, in case of a stock-out, zero sales do not equate to zero demand.

For statistical forecasting algorithms that rely on time-series analysis, it can be quite hard, using sales data alone, to distinguish zero sales caused by a stock-out from zero demand. As a result, a lot of stock-outs (as they lead to lower sales) can be statistically interpreted as lower demand, which eventually generates even more stock-outs.

Increasing your forecast accuracy now is one of the keys to increasing your forecast accuracy tomorrow. Accurate forecasting is not the destination but the journey.
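A tiny numerical illustration may help. The Python sketch below uses made-up figures and assumes that we happen to know which days were stock-out days, which is precisely the information that sales data alone does not contain. It is not a description of how Lokad handles stock-outs.

```python
from statistics import mean

# Made-up figures: true daily demand (unknown in practice) and stock availability.
daily_demand = [10, 12, 9, 11, 10, 12, 9, 11]
in_stock = [True, True, False, True, False, True, True, False]

# Observed sales: a stock-out day records zero sales, whatever the demand was.
daily_sales = [demand if available else 0
               for demand, available in zip(daily_demand, in_stock)]

# Reading every zero as "zero demand" drags the estimate down.
naive_estimate = mean(daily_sales)

# Ignoring the stock-out days (when they are known) stays close to the true demand.
masked_estimate = mean(sales for sales, available in zip(daily_sales, in_stock)
                       if available)

true_average_demand = mean(daily_demand)
```

With these figures the naive estimate is well below the true average demand, so an order sized on it would be too small, which is the vicious circle described above.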

Posted on Monday, December 17, 2007 at 10:26AM by Joannes Vermorel

Simplicity vs. Flexibility - Forecasts at stake

One year ago, when we initially designed Lokad, we considered many potential features:

  • a more powerful data model than time-series.
  • more features in the web application (we decided to have fewer).
  • more indicators besides forecasts (confidence intervals, estimated errors, ...).

Each time so far, we decided to do less rather than more, because we chose simplicity over flexibility (and Lokad isn't the only company doing this). The main issue with traditional statistical software is that you end up with a list of models so long that only a statistician can comprehend and use it.

Simplicity often means faster and cheaper. What can the benefits of forecasting be if the process is too complicated and too time-consuming to be used anyway? Lokad focuses on removing all the technical obstacles that would prevent even small companies from starting to use forecasting.

Posted on Monday, November 26, 2007 at 10:55AM by Joannes Vermorel