About this blog

Lokad staff post tips, news and tutorials here, related either to Lokad or to business forecasting in general.

Check our archives for a selection of posts to help your business with insights about forecasting.

Entries in forecasting (15)

Tuesday
23Feb2010

Measuring forecast accuracy

Most engineers will tell you that:

You can't optimize what you don't measure

Turns out that forecasting is no exception. Measuring forecast accuracy is one of the few cornerstones of any forecasting technology.

A frequent misconception about accuracy measurement is that Lokad has to wait until the forecast period has elapsed before it can finally compare the forecasts with what really happened.

Although this approach works to some extent, it comes with severe drawbacks:

  • It's painfully slow: a 6 months ahead forecast takes 6 months to be validated.
  • It's very sensitive to overfitting. Overfitting should not be taken lightly, and it's one of the few things that is very likely to wreak havoc in your accuracy measurements.

Measuring the accuracy of delivered forecasts is a tough piece of work for us. Accuracy measurement accounts for roughly half of the complexity of our forecasting technology: the more advanced the forecasting technology, the greater the need for robust accuracy measurements.

In particular, Lokad returns the forecast accuracy associated with every single forecast that we deliver (for example, our Excel add-in reports forecast accuracy). The metric used for accuracy measurement is the MAPE (Mean Absolute Percentage Error).
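For readers who want to see the metric spelled out, here is a minimal Python sketch of the MAPE. The function name, the sample numbers and the choice to skip periods with zero actual sales are assumptions made for illustration; this is not Lokad's implementation.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, expressed in percent.

    Periods where the actual value is zero are skipped, because the
    percentage error is undefined there (an assumption of this sketch).
    """
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts) if a != 0]
    if not errors:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(errors) / len(errors)


# Hypothetical weekly sales and the forecasts that were made for them.
weekly_sales = [100, 120, 80, 100]
weekly_forecasts = [110, 108, 88, 100]
print(round(mape(weekly_sales, weekly_forecasts), 2))
```

A lower value means a more accurate forecast; a perfect forecast gives 0.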

In order to compute an estimated accuracy, Lokad proceeds (roughly) through cross-validation tuned for time-series forecasts. Cross-validation is simpler than it sounds. If we consider a weekly forecast 10 weeks ahead with 3 years (aka 150 weeks) of history, then the cross-validation looks like:

  1. Take the 1st week, forecast 10 weeks ahead, and compare results to original.
  2. Take the 2 first weeks, forecast 10 weeks ahead, and compare.
  3. Take the 3 first weeks, forecast 10 weeks ahead, and compare.
  4. ...

The process is rather tedious, as we end up recomputing forecasts about 150 times for only 3 years of history. Obviously, cross-validation screams for automation, and there is little hope of going through such a process without computer support. Yet, computers typically cost less than business forecast errors, and Lokad relies on cloud computing to deliver such highly intensive computations.
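The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `naive_forecast` is a placeholder model (it repeats the last observed value), the sales history is made up, and the function names are hypothetical. The actual process is, as said above, only roughly similar.

```python
def naive_forecast(history, horizon):
    """Placeholder model: repeat the last observed value (an assumption)."""
    return [history[-1]] * horizon


def rolling_origin_mape(series, horizon, forecast_fn, min_history=1):
    """Backtest forecast_fn at every possible cut-off point of the series.

    For each cut-off t, the model only sees series[:t] and is scored against
    the next `horizon` values (fewer near the end of the history).
    Returns the MAPE, in percent, over all scored points.
    """
    errors = []
    for t in range(min_history, len(series)):
        history = series[:t]
        actuals = series[t:t + horizon]
        forecasts = forecast_fn(history, horizon)
        for actual, forecast in zip(actuals, forecasts):
            if actual != 0:  # percentage error is undefined for zero actuals
                errors.append(abs(actual - forecast) / abs(actual))
    if not errors:
        raise ValueError("no point could be scored")
    return 100.0 * sum(errors) / len(errors)


# Hypothetical 150 weeks of sales with a mild upward trend.
sales = [100 + week for week in range(150)]
print(round(rolling_origin_mape(sales, horizon=10, forecast_fn=naive_forecast), 2))
```

The key point is that each forecast is produced using only the data that precedes the cut-off, so the model is never scored on values it has already seen.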

Attempts to "simplify" the process outlined are very likely to end up with overfitting problems. We suggest staying very careful, as overfitting isn't a problem to be taken lightly. When in doubt, stick to a complete cross-validation.

Tuesday
12Jan2010

Ask.Lokad and get answers

Although some of you might have already noticed, we quietly launched ask.lokad.com about one month ago as a replacement for our old forums, which were suffering from several terminal Web 1.0 design syndromes.

Ask.Lokad is powered by StackExchange, the engine behind StackOverflow, the most popular Q/A forum for programming matters.

We have decided to revamp our forums because we were getting the feeling that most simple questions, such as "Should I consider working days or calendar days for lead time?" or "Why does ASA get negative when the number of agents gets below call intensity?", tend to be poorly addressed on the web.

Although answers are probably buried in textbooks somewhere, our ambition is to make this sort of knowledge as openly reachable as possible.

Have a question about forecasting, or about the usage of forecasts in your business? Don't hesitate to post on Ask.Lokad; we do our best to get all questions addressed. Then, as Lokad does not have a monopoly over smart people, the community can also challenge our answers with their own.

Thursday
07Jan2010

Tags and Events user guide

First of all, best wishes for 2010. Lokad grew nicely and kept moving forward during 2009, and we really hope to keep moving forward according to plan in 2010.

As a first token of goodwill, we have finally published the long missing User guide for Tags and Events.

Indeed, tags and events are a powerful extension of our Forecasting API that lets Lokad refine forecasts through meta-data. Both LSSC and L3C support this framework.

Tags and Events are especially useful in retail and manufacturing to better handle promotion forecasts, but also product launches as well as all sorts of factors that impact the business.

Monday
14Dec2009

Where do Windows Azure folks get their inspiration?

At PDC'09, Microsoft and Lokad unveiled a case study about Windows Azure. Yet, imagine our surprise when we discovered the following video at the Windows Azure booth (check for video links below).

Once upon a time, there was a little company with little funds, but great ambitions.

Data Analytics with Windows Azure

The little company wanted to process truckloads of historical data. Yet, it could not afford to buy tons of computing stuff.

Import and export data in Windows Azure

Yet, through Windows Azure, the little company was suddenly able to process truckloads of data, and to output truckloads of forecasts too.

Feed IT systems with Windows Azure

The massive amount of forecasts would then flow into the IT systems of large retailers to optimize their companies.

The little company produced lots of profits, and its employees lived happily ever after. THE END

Monday
30Nov2009

Unified pricing system

We are pleased to announce that we are upgrading Lokad toward a new pricing for our Forecasting Services. The details have already been published, check it out.

Although your mileage may vary, simulations indicate that this change will represent, on average, a 20% saving for current customers. Our goal remains to stay far ahead of the competition, both in terms of forecast accuracy and in terms of TCO (Total Cost of Ownership).

Pricing has been simplified ...

It's simple enough to be expressed with a single compact formula: $0.15 * forecasts^(2/3). And, yes, we still have the power 2/3 coefficient. For those who don't enjoy mental calculations of cube roots, we do provide a calculator.

Wait! What do you call a forecast?  If you want sales forecasts for a single product for the next 3 months, one value for each month ahead, it counts as 3. If you repeat your forecasts twice during the same month, it counts as 6. If you do the same with 1000 products instead of a single one, it counts as 6,000 forecasts.

Then, the power 2/3 just acts as a large volume discount:

  • 1k forecasts cost $15
  • 1M forecasts cost $1.5k (instead of $15k)
  • 1B forecasts cost $150k (instead of $15M)
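The counting rule and the formula above can be checked with a short Python sketch. The function names are hypothetical and this is not the calculator mentioned above; it simply applies $0.15 * forecasts^(2/3) as stated in this post.

```python
def count_forecasts(products, periods_ahead, runs):
    """Number of billed forecasts: one per product, per period ahead, per run."""
    return products * periods_ahead * runs


def forecast_cost(forecasts, unit_price=0.15, exponent=2 / 3):
    """Cost in dollars for a given volume of forecasts (power 2/3 pricing)."""
    return unit_price * forecasts ** exponent


# 1000 products, 3 months ahead, forecasts repeated twice in the month.
volume = count_forecasts(products=1000, periods_ahead=3, runs=2)
print(volume, "forecasts ->", f"${forecast_cost(volume):,.2f}")

# The three volumes listed above.
for volume in (1_000, 1_000_000, 1_000_000_000):
    print(f"{volume:,} forecasts -> ${forecast_cost(volume):,.2f}")
```

Multiplying the volume by 1000 only multiplies the cost by 100, which is where the volume discount comes from.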

Bottom line, it's rather simple. Our inspiration was a mix of the Windows Azure pricing and the Twilio pricing.

In particular, it must be noted that there is no setup fee. Obviously, such a pricing is only made possible because Lokad is powered by cloud computing.

Then, if you only need Lokad once in a while, you won't be charged unless you actually use it. If your account stays idle for 1 month, then you don't get charged at all!

Finally, there is no threshold effect in our pricing (thanks to the power 2/3 approach). The more forecasts you need, the higher the costs, but the higher the volume discount too.

Our old pricing system, which had not been revised for almost 2 years, was suffering from one major issue: there were subtleties. Not major ones, but enough for people to routinely estimate their subscription costs at thousands of dollars when the real figure was less than a hundred.

We are not going to make the same mistake twice. Our pricing page now includes dedicated simulators for inventory optimization and call center optimization. Any doubt about the new costs? Just type in your number of SKUs or your number of calling queues.

... but it's still a variable pricing

Many analysts have been expressing concerns about variable pricing in software. "How can I get my business plans in order if everything is changing all the time?" you might ask.

In our humble opinion, and as far as forecasting is concerned, we believe that variable pricing solves many more problems than it causes. You might wonder what will happen if the subscription costs increase. If your Lokad subscription costs increase, it means that your company is growing and so are your forecasting needs. The last thing you want while undergoing steady growth is to get your forecasts wrong, and to let improper planning wreak havoc in your business. Then, our volume discount factor (power 2/3) ensures that the more you grow, the more volume discounts you get from Lokad.

But there is the other situation that analysts usually don't bother to consider: what if my business is going down, what if we are downsizing, what if branches get sold?

With Lokad, your subscription will be going down accordingly. You will not be stuck with an over-sized on-premise solution to maintain. Pay-as-you-go guarantees that the software you buy today will not accelerate the demise of your business if the economy turns out to be really rough.

Saturday
14Nov2009

Internet is needed for your forecasts

Do I really need an Internet connection to get your forecasts? is a question frequently asked by prospects having a look at our forecasting technology.

Well, the answer is YES. With Lokad, there is no work-around. Our forecasting engine does not come as an on-premises solution.

But why should we need an internet connection for an algorithmic processing such as forecasting?

The answer to this question is one of the core reasons that led to the very existence of Lokad in the first place.

When we started working on the Lokad project - back in 2006 - we quickly realized that forecasting, despite appearances, was a total misfit for local processing.

1. You can't get your forecasts right without having the data at hand. Researchers have been looking for decades for a universal forecasting model, but the consensus among the community is that there is no free lunch; universal models do not exist, or rather, they tend to perform poorly. This is the primary reason why forecasting toolkits feature so many models (don't click this link, it's a 3000-page manual for a popular toolkit). With Lokad, the process is much simpler because the data is made available to Lokad. Hence, it does not matter any more if thousands of parameters are needed, as parameters are handled by Lokad directly.

2. Advanced forecasting is quite resource intensive, but the need to forecast is only intermittent. Even a small retailer with 10 points of sale and 10k product references already represents 100k time-series to be forecasted. If we consider a typical performance of 10k series per hour for a single CPU (which is already quite optimistic for complex models), then computing sales forecasts for the 10 points of sale takes a total of 10h of CPU time. Obviously, retailers prefer not to wait for 10h to get their forecasts. Buying an amazingly powerful workstation is possible, but then does it make sense to have so much processing power staying idle 99% of the time when forecasts are made only once a week? Outsourcing the processing power is the obvious cost-effective approach here.

3. Forecasting is still under fast-paced evolution. Since our launch about 3 years ago, Lokad has been upgraded every month or so. Our forecasting technology is not some indisputable achievement carved in stone; on the contrary, it is still undergoing a rapid evolution. Every month, the statistical learning research community moves forward with loads of fresh ideas. In such a context, on-premise solutions undergo a rapid decay until the day the discrepancy between the performance of the current version and the performance of the deployed version is so great that the company has no choice but to rush an upgrade. Aggressively developed SaaS ensures that customers benefit from the latest improvements without even having to worry about it.
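The arithmetic behind point 2 can be reproduced with a small Python sketch. The function names and the list of CPU counts are hypothetical, and the wall-clock estimate assumes that time-series can be forecast independently of each other, so that the work splits evenly across CPUs.

```python
def cpu_hours(points_of_sale, products, series_per_cpu_hour):
    """Total CPU time needed to forecast one series per product per point of sale."""
    series = points_of_sale * products
    return series / series_per_cpu_hour


def wall_clock_hours(total_cpu_hours, cpus):
    """Elapsed time, assuming the work splits evenly across CPUs (an assumption)."""
    return total_cpu_hours / cpus


# Figures from point 2: 10 points of sale, 10k products, 10k series per CPU-hour.
total = cpu_hours(points_of_sale=10, products=10_000, series_per_cpu_hour=10_000)
print("CPU time:", total, "hours")

for cpus in (1, 10, 100):  # hypothetical numbers of rented CPUs
    print(cpus, "CPU(s):", wall_clock_hours(total, cpus), "hours of waiting")
```

The CPU time, and hence the bill, stays the same whatever the number of CPUs; only the waiting time shrinks, which is why renting many CPUs for a short while fits an intermittent workload.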

In our opinion, going for an on-premise solution for your forecasts is like entering a golf competition with a large handicap. It might make the game more interesting, but it does not maximize your chances. Don't expect your competitors to be fair enough to start with the same handicap just because you do.

Thursday
15Oct2009

What's your statistical model?

We have already disclosed a few insights about what's being used at Lokad. Yet, a frequent support request remains what's your model, precisely?

We're looking through various forecasting statistical packages with the intent on selecting one at some point in the near future. One thing I find lacking in Lokad is to see which statistical model was used. I understand that the selection of which model is used is a trade secret, but I would like to verify the final selection, in the trial that is, with our in-house mathematician before we trust you with our actual forecasts. Most software vendors operating in this space provide the model selected. Is it possible to get that result with Lokad?

Well, unfortunately, the correct answer is that Lokad isn't a statistical package. In particular, we don't deliver models, we deliver forecasts.

The whole architecture of Lokad has been designed around this very assumption, which unfortunately is very ill-suited to deliver any information about our models.

Our forecast flow, which grabs input data and outputs forecasts, is:

  • vastly more complex compared to models shipped with statistical packages. Forecasts cannot be associated with well-known models.
  • tailored for distributed computing in the clouds, thus, the design feels very alien when compared to classic toolkits.
  • subject to ongoing changes, as we are carrying out experiments on a daily basis with agile deployment strategies.

But this design has very specific benefits too:

  • no need to tune complex forecasting parameters.
  • no need to constantly watch your parameters, we monitor the results.
  • scales up as much as you need to, up to millions of forecasts.
  • handles complex patterns that are way beyond classical toolkits.

Then, we don't ask anyone to take our results for granted. Just go and see for yourself, our trial is free for 30 days.

Monday
05Oct2009

What's wrong with promotion forecasts?

It has been a little while since we posted some technical content about our forecasting methods. Let's discuss further the case of promotion forecasts.

Promotion forecasts are especially interesting for two reasons:

  • promotions represent a heavy part of the business for many companies.
  • forecasts are likely to end up really wrong when ad-hoc methods are used.

In order to clarify the traps looming behind promotion forecasts, let's have a look at the following schema. The two black curves represent fictitious product sales - illustrating typical consumer response to a promotion.

Model for retail promotion forecasts

Disclaimer: Those behaviors have been measured on Lokad databases. Yet, market responses vary a lot from one promotion to the next and from one business to the next. Our purpose here is not to provide accurate quantitative models, but rather to focus on what we believe to be key insights for promotion forecasts.

1. Primary promotion impact

The most important effect of a promotion is usually a large sales increase of the promoted product for the duration of the operation. This effect is illustrated with (1) in our schema.

Guessing that the sales are likely to increase is trivial, yet precisely estimating the impact of the promotion - that is to say the extra sales generated by the promotion itself - is a complicated process. Lokad is using its tags+events framework to do this.

It can be noted that classical forecasting methods such as exponential smoothing completely miss promotional patterns.

2. Demand drop due to market saturation

The second effect of a promotion - which is too frequently overlooked - is the demand drop that comes just after the end of the promotion. See point (2) on the schema.

Indeed, by observing thousands of promotional operations, we have found that frequently, just after the end of a promotion, sales levels are dropping below the initial pre-promotion sales levels.

This drop reflects the temporary market saturation caused by the promotion itself. Basically, people who would have bought the product anyway have hurried their decision, resulting in fewer sales when the promotion stops.

When this drop is overlooked AND combined with a naive forecasting method, the combination can lead to extremely poor inventory management and overstocks. Indeed, the delayed behavior of exponential smoothing is likely to suggest inventory replenishments precisely when the demand is going to drop.
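The delayed behavior of exponential smoothing is easy to reproduce. The Python sketch below applies simple exponential smoothing to a fictitious weekly series with a promotion followed by a demand drop; the sales figures and the smoothing factor are made up for illustration and do not come from Lokad data.

```python
def exponential_smoothing(series, alpha):
    """One-step-ahead forecasts with simple exponential smoothing.

    forecasts[t] is the forecast for period t and only uses series[:t],
    except forecasts[0], which is initialized to the first observation.
    """
    level = series[0]
    forecasts = [level]
    for value in series[:-1]:
        level = alpha * value + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


# Fictitious weekly sales: baseline, 2 promotion weeks (1), 2 weeks of
# post-promotion drop (2), then back to the baseline.
sales = [100] * 6 + [300] * 2 + [70] * 2 + [100] * 4
forecasts = exponential_smoothing(sales, alpha=0.5)

for week, (actual, forecast) in enumerate(zip(sales, forecasts)):
    print(f"week {week:2d}  actual {actual:3d}  forecast {forecast:6.1f}")
```

In this example the forecast is still at the baseline when the promotion starts, and it peaks in the first week after the promotion, exactly when the demand falls below the baseline.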

3. Mechanical echo due to customer synchronization

The third pattern comes from the synchronization effect of the promotion on customer replenishments. This point is illustrated with (3) on the schema.

All products (consumable or not) tend to have a lifecycle of their own. The promotion synchronizes - to some extent - the consumption patterns of customers. As a result, after the initial shock (the promotion itself), there are likely to be decreasing echoes in customer demand.

In our measurements, it appears that most of the time only the first echo can be measured with enough confidence to be of any use for forecasting purposes. Subsequent echoes do exist but they are too diffuse to be used to refine forecasts.

Again, if the demand bounce (3) happens right after the demand drop (2), then the exponential smoothing model is likely to cause frequent inventory shortages, because it completely fails to anticipate the bounce.

4. Demand drift due to market evolution

The fourth effect is the demand drift caused by the promotion. This effect is illustrated by point (4) in the schema.

A promotion acts as a market perturbation, and the base sales level is likely to be different after the promotion; hopefully higher, but our measurements show that the opposite, i.e. lower sales, also happens, albeit less frequently.

In our experience, establishing a quantitative forecast for the drift is really hard. Yet, the drift can be taken into account in order to speed up forecast adjustments after the end of the promotion.

5. Sales cannibalization

Sales cannibalization is typically the second major effect of a promotion in terms of absolute sales volumes. Cannibalized products end up with reduced sales during the promotion as illustrated by (5) on the schema.

Yet cannibalizations are harder to model, because the effect can be diffuse and potentially impact a wide range of unrelated products. Indeed, depending on the situation, some customers may have a fixed threshold for their spending. The immediate consequence of the threshold is that the opportunity spending offered by the promotion ends up compensated by restrictions on somewhat unrelated products.

Lokad is trying hard to provide fine-grained promotion forecasts, taking into account all the effects that we can measure. If you decide to go for in-house forecasts (which we don't recommend, but well), we suggest making sure that neither (2) nor (5) gets overlooked.

Friday
14Aug2009

Better product launch forecasts

Forecasting is hard, even when a significant amount of historical data is available. When historical data is limited, forecasting is much harder. But then, what about forecasting when there is simply no historical data?

The no-data situation is more frequent than it looks: for every product launch, a company has to forecast future sales for the new product, while there are precisely no records for this product.

In practice, we have found that many companies - already using robust statistical tools to forecast their regular sales - just guesstimate when it comes to product launches (or one-shot promotions). We have also found that, in many situations, guesstimates are vastly inaccurate.

Obviously, if there is absolutely no historical data, then, indeed, statistical forecasting tools (such as Lokad) are powerless. Yet, in most companies, new products are launched on a regular basis, and this history of launches can be analyzed to figure out patterns of early sales.

Lokad takes advantage of historical product launches (when such data is available) to forecast the sales of a product even if there is no data yet for this particular product. Typically, we estimate that 20 product launches or so are needed to start learning launch patterns. In practice, there is no hard-coded lower limit on the number of product launches in our technology, but with fewer than 20 launches, forecasts tend to become erratic.

In practice, you can use the Safety Stock Calculator to forecast product launches. Note that raw sales data is not enough in the case of product launches; tags and events are needed as well (well, at least tags or events):

  • Tags should be provided in order to describe the product. Tags typically express similarities that exist between products (ex: color, size, category, product family, ...). Those tags are used by Lokad to match the new product with existing ones. Typically, a tag is a permanent descriptor of the product: it does not change over time.
  • Events should be (optionally) provided to describe the launch operation itself. Events are just like tags, but positioned at a certain date. Events typically represent marketing operations that support the product launch. An event usually has a lifetime shorter than the product itself (otherwise it should be considered a tag).
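To make the distinction between tags and events concrete, here is a small Python sketch of how such data could be represented, together with a deliberately naive way of using tags to match a new product with past launches. Everything in it is an assumption made for illustration: the class and function names, the Jaccard similarity, the 0.5 threshold and the sample products. It is not Lokad's data model nor its matching method, and the forecast below ignores events to stay short.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Event:
    """A dated operation, e.g. a marketing campaign supporting a launch."""
    name: str
    start: date
    end: date


@dataclass
class Product:
    name: str
    tags: frozenset                                    # permanent descriptors
    events: list = field(default_factory=list)         # dated operations
    launch_sales: list = field(default_factory=list)   # first weeks of sales, if known


def tag_similarity(a, b):
    """Jaccard similarity between two tag sets (hypothetical matching rule)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def launch_forecast(new_product, past_launches, min_similarity=0.5):
    """Average the early sales of past launches whose tags resemble the new product."""
    similar = [p for p in past_launches
               if p.launch_sales
               and tag_similarity(new_product.tags, p.tags) >= min_similarity]
    if not similar:
        return None  # nothing comparable: no statistical forecast possible
    weeks = min(len(p.launch_sales) for p in similar)
    return [sum(p.launch_sales[w] for p in similar) / len(similar)
            for w in range(weeks)]


past = [
    Product("red shirt S", frozenset({"shirt", "red", "S"}), launch_sales=[10, 20, 30]),
    Product("red shirt M", frozenset({"shirt", "red", "M"}), launch_sales=[20, 30, 40]),
    Product("blue sofa", frozenset({"sofa", "blue"}), launch_sales=[1, 1, 1]),
]
new = Product("red shirt L", frozenset({"shirt", "red", "L"}),
              events=[Event("TV campaign", date(2009, 9, 1), date(2009, 9, 14))])
print(launch_forecast(new, past))
```

With only three past launches the result is of course meaningless; as noted above, 20 launches or so are needed before patterns start to emerge.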

The distinction between tags and events helps Lokad to separate the relative position of the product within the distribution channels of the company (tags) from the impact of the marketing operations themselves (events).

Still unsure how to proceed with your product launches? Don't hesitate to drop us questions in the forums.

Tuesday
07Jul2009

Favorite Forecasting Models

What kind of forecasting models are you using internally? This is a question frequently asked both by customers and partners of Lokad.

Addressing this question is tricky for us for two reasons:

  • Our technology is a core asset. Thus, we have no plan to disclose all the details (although we aren’t utterly secretive either).
  • Our technology is complex. We are using many models, and a cornerstone component is precisely the model selection.

Thus, rather than giving the exact list of models used by Lokad, I am going to list my own personal favorite models. I don’t claim that these models represent the complete list of models used at Lokad, nor that all these models are actually used in production at Lokad; yet it should give you some insights into what we are doing at Lokad.

First there are the plain old classics: autoregressive, moving average, (double, triple) exponential smoothing, Box-Jenkins, Holt-Winters, ARMA, ARIMA ... Those models typically handle neither multi-series nor tags or events; yet simplicity is king in many situations. Don’t discard moving average just because it looks too simple to be good.
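To show how little is needed for the simplest of these classics, here is a moving average forecast in a few lines of Python. The function name and the sample history are hypothetical.

```python
def moving_average_forecast(history, window, horizon):
    """Forecast every future period as the mean of the last `window` values."""
    if window < 1 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    level = sum(history[-window:]) / window
    return [level] * horizon


# Hypothetical weekly sales; forecast the next 3 weeks from the last 2.
print(moving_average_forecast([10, 12, 14, 16], window=2, horizon=3))
```

The only parameter is the window size: a short window reacts quickly but is noisy, a long window is stable but slow to follow changes.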

Then, for more advanced models, I'd rather speak of approaches than models. Indeed, the more complex the model, the more latitude is left to the mathematician to tweak in subtle ways the behavior of the forecasting model.

The Bayesian approach: establishing graphs of relationships is especially useful in the context of Lokad where we exploit correlations between time-series. It’s also useful in order to deal with tags and events.

The large margin approach: Support Vector Machines (SVM) have become incredibly popular these days. Although, as far as time-series are concerned, it’s rather Support Vector Regression (SVR) that is the most useful for us. As a minor drawback, SVM and SVR are typically quite expensive in terms of raw processing power.

The mixture / boosting approach: mixing loads of simple predictors in order to improve the overall forecast works well. The combination of a large number of simple predictors can be used to reflect really complex behaviors.
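As a minimal illustration of the mixture idea, the Python sketch below averages three very simple one-step-ahead predictors with equal weights. The predictors, the equal weighting and the sample history are assumptions chosen for brevity; this is plain averaging, not boosting, and not a description of what runs in production at Lokad.

```python
def last_value(history):
    """Predict that the next value equals the latest one."""
    return history[-1]


def full_average(history):
    """Predict the mean of the whole history."""
    return sum(history) / len(history)


def recent_average(history, window=2):
    """Predict the mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def combine(history, predictors):
    """Equal-weight mixture: average the predictions of all simple predictors."""
    predictions = [predict(history) for predict in predictors]
    return sum(predictions) / len(predictions)


history = [10, 20, 30, 60]  # hypothetical sales
predictors = [last_value, full_average, recent_average]
for predict in predictors:
    print(predict.__name__, predict(history))
print("mixture", combine(history, predictors))
```

More refined mixtures replace the equal weights with weights learned from past errors of each predictor.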

The meta-heuristic approach: genetic algorithms, neural networks, genetic programming and other evolutionary / adaptive approaches. Those approaches are powerful but also notorious for their intrinsic sensitivity to many tuning parameters.

As a final note, our technology is still undergoing a fast-paced evolution. New models get put in production every month or so. This list isn’t definitive and cloud computing is actually creating a lot of opportunities for us to push models that were just too expensive in the past.