About this blog

Lokad staff post tips, news and tutorials here, related either to Lokad or to business forecasting in general.

Check our archives for a selection of posts to help your business with insights about forecasting.

Entries in forecasting (17)

Tuesday
Aug 31, 2010

Width vs. Depth: rotate your sales forecasts by 90 degrees

We have already discussed why Lokad does not care much whether it forecasts Chinese food or sports bar beverages. Another way of thinking about our technology is to rotate your sales forecasts by 90 degrees.

We observe that a consumer product has, on average, a 3-year lifecycle. Since a product in the catalog is, on average, halfway through that lifecycle, the amount of data available for any single product is about 18 months. When we look at the sales history with monthly aggregation, 18 months of data means 18 points.

With 18 data points, no matter how smart or advanced your forecasting theory is, you can't do much, simply because there is an utter lack of data to perform any robust statistical analysis. With 18 points, even a pattern as obvious as seasonality becomes a challenge to observe, because we don't even have 2 complete seasonal cycles.

Your mileage may vary from one industry to the next, but unless your products stay in the market for decades, you are most likely to face this issue.

As a direct consequence, classical forecasting toolkits require statisticians to tweak forecasting models for every single product because no non-trivial statistical model can be robustly fit with only 18 points as input data.

Yet, Lokad does not require any statistician, and the magic lies in the 90 degrees rotation: our models do not iterate over the data one time-series at a time, but over all time-series at once. Thus, we have a lot more input data available, and consequently we can succeed with rather advanced models.

This approach is just common sense: if you want to forecast the seasonality of your new chocolate bar, the seasonality of the other chocolate bars seems like a good candidate. Why should you treat each chocolate bar in strict isolation from the others?
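To make the chocolate bar intuition concrete, here is a minimal Python sketch. It is not Lokad's actual model, which is not described in this post; it only illustrates the idea of borrowing a seasonal shape from sibling products. The function names, the 12-month normalization and the sample figures are all assumptions made for illustration.

```python
def pooled_seasonal_profile(histories):
    """Average the normalized monthly shape of several products.

    Each history is a list of 12 monthly sales figures (January to December)
    with a non-zero mean. Illustrative assumption, not Lokad's method.
    """
    profile = [0.0] * 12
    for sales in histories:
        mean = sum(sales) / 12
        for month in range(12):
            # Dividing by the mean removes the volume, keeping only the shape.
            profile[month] += sales[month] / mean / len(histories)
    return profile


def seasonal_forecast(base_level, profile):
    """Scale a flat monthly base level by the pooled seasonal coefficients."""
    return [base_level * coefficient for coefficient in profile]


# Hypothetical sibling chocolate bars, both peaking in December.
siblings = [
    [10] * 11 + [32],
    [20] * 11 + [64],
]
profile = pooled_seasonal_profile(siblings)
new_bar = seasonal_forecast(50, profile)  # the new bar inherits the December peak
```

The new bar has no history of its own here, yet its forecast already carries the December peak observed on the other bars.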

Yet, from a computational perspective, the problem has just become a lot harder: if you have 10,000 SKUs, the number of associations between two SKUs is roughly 100 million (and 10,000 SKUs is nowhere near a large number). That's precisely where the cloud kicks in: even if your algorithms are well designed so as not to suffer a strict quadratic complexity, you're still going to need a lot of processing power. The cloud just happens to make this processing power available on demand at a very low price.
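The 100 million figure is simply the square of the number of SKUs. As a quick check in Python (the function name is ours, for illustration only):

```python
from math import comb


def pair_counts(n_skus):
    """Return (ordered pairs, unordered distinct pairs) for n_skus SKUs."""
    ordered = n_skus * n_skus    # every SKU compared against every SKU
    unordered = comb(n_skus, 2)  # each distinct pair counted once
    return ordered, unordered


ordered, unordered = pair_counts(10_000)
print(ordered)    # 100000000
print(unordered)  # 49995000
```

Even when each distinct pair is counted only once, the count is still about 50 million, which is why the growth is called quadratic.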

Without the cloud, it is simply not possible to deliver this kind of technology.

Tuesday
Apr 06, 2010

Forecast's species: classification vs. regression

The word forecasting covers a very large spectrum of processes, technologies and even markets. In the past, we introduced the worlds of forecasting software, distinguishing between:

  • Deterministic simulation software
  • Expert aggregation software
  • Statistical forecasting software

Lokad falls in the last category as our technology is purely statistical. Yet, Lokad is far from covering the entire statistical spectrum on its own. Two broad categories of forecasts exist in statistical forecasting (*):

  • Classification forecasts
  • Regression forecasts

(*) We are oversimplifying here for the sake of clarity, as statistical learning subtleties are well beyond the scope of this modest blog post.

Classification attempts to separate (or classify) objects according to their properties. The illustration below, from Tomasz Malisiewicz, shows a classification task that tries to separate images picturing a chair from images picturing a table.

Illustration from tombone's blog

The output of a classification is binary (or rather discrete): objects get assigned to classes with more or less confidence, i.e. higher or lower probabilities.

On the other hand, regressions typically output curves. The illustration below considers a time-series representing historical sales, and displays the corresponding forecast.

The regression forecast is a curve rather than a binary setting (or a combination of binary settings). Inputs get prolonged into the future.

How does this distinction impact the business?

Well, it turns out that Lokad - as it stands in early 2010 - only delivers regression forecasts. Thus, there are many interesting problems that cannot be tackled by Lokad because these are classification problems:

  • Customer segmentation: for each customer, we would like to evaluate the probability of achieving a successful up-sell through a direct marketing action. Following the same idea, we could try to predict churn as well.
  • Fraud detection: for each transaction, we would like to evaluate - based on the transaction pattern - the probability for the operation to be a fraud attempt.
  • Deal prioritization: based on the properties of the prospect (availability of budget, industry, contact rank in the company, expressed level of interest, ...), we would like to evaluate the likelihood to get a profitable deal out of each prospect to prioritize the sales team efforts.

Frequently, we are asked whether Lokad could deliver classification forecasts as well. Unfortunately, the answer will be negative for the time being. Although rooted in the same mathematical theory, classification and regression entail very different technologies, and Lokad is pushing all its efforts toward regression problems.

That said, we are not dismissive of classification problems; they truly deserve attention and effort. For 2010, we are sticking to our roadmap, but further ahead, classification could be a natural extension of our forecasting services.

Tuesday
Feb 23, 2010

Measuring forecast accuracy

Most engineers will tell you that:

You can't optimize what you don't measure

Turns out that forecasting is no exception. Measuring forecast accuracy is one of the few cornerstones of any forecasting technology.

A frequent misconception about accuracy measurement is that Lokad has to wait for the forecasted period to pass before it can finally compare the forecasts with what really happened.

Although this approach works to some extent, it comes with severe drawbacks:

  • It's painfully slow: a 6 months ahead forecast takes 6 months to be validated.
  • It's very sensitive to overfitting. Overfitting should not be taken lightly; it is one of the few things very likely to wreak havoc in your accuracy measurements.

Measuring the accuracy of delivered forecasts is a tough piece of work for us. Accuracy measurement accounts for roughly half of the complexity of our forecasting technology: the more advanced the forecasting technology, the greater the need for robust accuracy measurements.

In particular, Lokad returns the forecast accuracy associated with every single forecast that we deliver (for example, our Excel add-in reports forecast accuracy). The metric used for accuracy measurement is the MAPE (Mean Absolute Percentage Error).
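For readers unfamiliar with the metric, the MAPE is the average of the absolute errors, each expressed as a percentage of the actual value. A minimal Python sketch follows; skipping periods with zero actual sales is our own assumption (the percentage error is undefined there), not a statement about how Lokad handles them.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent.

    Periods where the actual value is zero are skipped (assumption),
    since the percentage error is undefined for them.
    """
    ratios = [
        abs(actual - forecast) / abs(actual)
        for actual, forecast in zip(actuals, forecasts)
        if actual != 0
    ]
    if not ratios:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(ratios) / len(ratios)


# Two periods, each forecast off by 10% of the actual value.
error = mape([100, 200], [110, 180])
```

Here both forecasts miss by 10% of the actual value, so the MAPE is about 10%.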

In order to compute an estimated accuracy, Lokad proceeds (roughly) through cross-validation tuned for time-series forecasts. Cross-validation is simpler than it sounds. If we consider a weekly forecast 10 weeks ahead with 3 years (roughly 150 weeks) of history, then the cross-validation looks like:

  1. Take the 1st week, forecast 10 weeks ahead, and compare results to original.
  2. Take the 2 first weeks, forecast 10 weeks ahead, and compare.
  3. Take the 3 first weeks, forecast 10 weeks ahead, and compare.
  4. ...

The process is rather tedious, as we end up recomputing forecasts about 150 times for only 3 years of history. Obviously, cross-validation screams for automation, and there is little hope of going through such a process without computer support. Yet, computers typically cost less than business forecast errors, and Lokad relies on cloud computing to deliver such compute-intensive processing.

Attempts to "simplify" the process outlined above are very likely to end up with overfitting problems. We suggest staying very careful, as overfitting isn't a problem to be taken lightly. When in doubt, stick to a complete cross-validation.
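The procedure above can be sketched in a few lines of Python. This is an illustration only: the naive "repeat the last value" forecaster stands in for a real model, the function names are ours, and the sketch only uses cut points that leave a complete horizon of actual values to compare against (so slightly fewer than one forecast per week of history).

```python
def naive_forecast(history, horizon):
    """Placeholder model (assumption): repeat the last observed value."""
    return [history[-1]] * horizon


def rolling_origin_mape(series, horizon, forecaster=naive_forecast):
    """Time-series cross-validation: grow the history one period at a time,
    forecast `horizon` periods ahead, and compare with what really happened."""
    ratios = []
    for cut in range(1, len(series) - horizon + 1):
        history = series[:cut]
        future = series[cut:cut + horizon]
        predicted = forecaster(history, horizon)
        for actual, forecast in zip(future, predicted):
            if actual != 0:  # percentage error is undefined for zero actuals
                ratios.append(abs(actual - forecast) / abs(actual))
    if not ratios:
        raise ValueError("not enough non-zero data to estimate accuracy")
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical weekly sales; 10-weeks-ahead accuracy estimated from history alone.
weekly_sales = [20, 22, 19, 25, 30, 28, 26, 24, 27, 31, 29, 33, 35, 30, 28, 32]
estimated_mape = rolling_origin_mape(weekly_sales, horizon=10)
```

Note that no future data is needed: the accuracy estimate is computed entirely from the history already at hand, and each forecast only ever sees data that precedes the period it predicts.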

Tuesday
Jan 12, 2010

Ask.Lokad and get answers

As some of you might have already noticed, we quietly launched ask.lokad.com about one month ago as a replacement for our old forums, which were suffering from a terminal case of Web 1.0 design syndrome.

Ask.Lokad is powered by StackExchange, the engine behind StackOverflow, the most popular Q/A forum for programming matters.

We decided to revamp our forums because we were getting the feeling that most simple questions, such as "Should I consider working days or calendar days for lead time?" or "Why does ASA get negative when the number of agents gets below call intensity?", tend to be poorly addressed on the web.

Although answers are probably buried in textbooks somewhere, our ambition is to make this sort of knowledge as openly reachable as possible.

Have a question about forecasting, or about the usage of forecasts in your business? Don't hesitate to post on Ask.Lokad; we do our best to get all questions addressed. And, as Lokad does not have a monopoly over smart people, the community can also challenge our answers with their own.

Thursday
Jan 07, 2010

Tags and Events user guide

First of all, best wishes for 2010. Lokad grew nicely and moved forward during 2009, and we really hope to keep moving forward according to plan in 2010.

As a first token of goodwill, we have finally published the long missing User guide for Tags and Events.

Indeed, tags and events are a powerful extension of our Forecasting API that lets Lokad refine forecasts through meta-data. Both LSSC and L3C support this framework.

Tags and Events are especially useful in retail and manufacturing to better handle promotion forecasts, but also product launches as well as all sorts of factors that impact the business.

Monday
Dec 14, 2009

Where do Windows Azure folks get their inspiration?

At PDC'09, Microsoft and Lokad unveiled a case study about Windows Azure. Yet, imagine our surprise when we discovered the following video at the Windows Azure booth (check for video links below).

Once upon a time, there was a little company with little funds, but great ambitions.

Data Analytics with Windows Azure

The little company wanted to process truckloads of historical data. Yet, it could not afford buying tons of computing stuff.

Import and export data in Windows Azure

Yet, through Windows Azure, the little company was suddenly able to process truckloads of data, and to output truckloads of forecasts too.

Feed IT systems with Windows Azure

The massive amount of forecasts would then flow into the IT systems of large retailers to optimize their companies.

The little company produced lots of profits, and its employees lived happily ever after. THE END

Monday
Nov 30, 2009

Unified pricing system

We are pleased to announce that we are moving Lokad to a new pricing for our Forecasting Services. The details have already been published, check it out.

Although your mileage may vary, simulations indicate that this change will represent, on average, a 20% saving for current customers. Our goal remains to stay far ahead of the competition, both in terms of forecast accuracy and in terms of TCO (Total Cost of Ownership).

Pricing has been simplified ...

It's simple enough to be expressed with a single compact formula: $0.15 * forecasts^(2/3). And yes, we still have the power 2/3 coefficient. For those who don't enjoy mental calculations of cube roots, we do provide a calculator.

Wait! What do you call a forecast? If you want sales forecasts for a single product for the next 3 months, one value for each month ahead, it counts as 3. If you repeat your forecasts twice during the same month, it counts as 6. If you do the same with 1000 products instead of a single one, it counts as 6,000 forecasts.

Then, the power 2/3 just acts as a large volume discount:

  • 1k forecasts cost $15
  • 1M forecasts cost $1.5k (instead of $15k)
  • 1B forecasts cost $150k (instead of $15M)
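For those who prefer code to cube roots, the formula and the forecast counting rule above can be sketched in Python. The function names are ours; the actual billing logic may include details not covered in this post.

```python
def forecast_count(products, periods_ahead, runs):
    """Number of billable forecasts: one per product, per period, per run."""
    return products * periods_ahead * runs


def cost_usd(forecasts):
    """Pricing formula from this post: $0.15 * forecasts^(2/3)."""
    return 0.15 * forecasts ** (2 / 3)


def linear_cost_usd(forecasts):
    """What the bill would be at the unit price paid for the first 1k forecasts."""
    unit_price = cost_usd(1_000) / 1_000  # about $0.015 per forecast
    return unit_price * forecasts


# 1000 products, 3 months ahead, forecasts repeated twice in the month.
count = forecast_count(products=1000, periods_ahead=3, runs=2)  # 6000

for volume in (1_000, 1_000_000, 1_000_000_000):
    print(volume, round(cost_usd(volume)), round(linear_cost_usd(volume)))
```

The loop reproduces the three price points listed above, next to what the same volumes would cost without the power 2/3 discount.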

Bottom line, it's rather simple. Our inspiration was a mix of the Windows Azure pricing and the Twilio pricing.

In particular, it must be noted that there is no setup fee. Obviously, such a pricing is only made possible because Lokad is powered by cloud computing.

Then, if you happen to need Lokad only once in a while, you won't be charged unless you actually use it. If your account stays idle for 1 month, then you don't get charged at all!

Finally, there is no threshold effect in our pricing (thanks to the power 2/3 approach). The more forecasts you need, the higher the costs, but the higher the volume discount too.

Our old pricing system, which had not been revised for almost 2 years, suffered from one major issue: there were subtleties. Not major ones, but enough for people to routinely estimate their subscription costs at thousands of dollars when the actual cost was less than a hundred.

We are not going to make the same mistake twice. Our pricing page now includes dedicated simulators for inventory optimization and call center optimization. Any doubt about the new costs? Just type in your number of SKUs or your number of calling queues.

... but it's still a variable pricing

Many analysts have been expressing concerns about variable pricing in software. "How can I get my business plans in order if everything is changing all the time?" you might ask.

In our humble opinion, and as far as forecasting is concerned, variable pricing solves many more problems than it causes. You might wonder: what will happen if the subscription cost increases? If your Lokad subscription cost increases, it means that your company is growing and so are your forecasting needs. The last thing you want while undergoing steady growth is to get your forecasts wrong and let improper planning wreak havoc in your business. Then, our volume discount factor (power 2/3) ensures that the more you grow, the more volume discounts you get from Lokad.

But there is the other situation that analysts usually don't bother to consider: what if my business is going down, what if we are downsizing, what if branches get sold?

With Lokad, your subscription will be going down accordingly. You will not be stuck with an over-sized on-premise solution to maintain. Pay-as-you-go guarantees that the software you buy today will not accelerate the demise of your business if the economy turns out to be really rough.

Saturday
Nov 14, 2009

Internet is needed for your forecasts

"Do I really need an Internet connection to get your forecasts?" is a question frequently asked by prospects having a look at our forecasting technology.

Well, the answer is YES. With Lokad, there is no work-around. Our forecasting engine does not come as an on-premises solution.

But why should we need an internet connection for an algorithmic processing such as forecasting?

The answer to this question is one of the core reasons that led to the very existence of Lokad in the first place.

When we started working on the Lokad project - back in 2006 - we quickly realized that forecasting, despite appearances, was a total misfit for local processing.

1. You can't get your forecasts right without having the data at hand. Researchers have been looking for decades for a universal forecasting model, but the consensus among the community is that there is no free lunch; universal models do not exist, or rather, they tend to perform poorly. This is the primary reason why forecasting toolkits feature so many models (don't click this link, it's a 3000-page manual for a popular toolkit). With Lokad, the process is much simpler because the data is made available to Lokad. Hence, it does not matter any more if thousands of parameters are needed, as parameters are handled by Lokad directly.

2. Advanced forecasting is quite resource intensive, but the need to forecast is only intermittent. Even a small retailer with 10 points of sale and 10k product references already represents 100k time-series to be forecasted. If we consider a typical performance of 10k series per hour for a single CPU (which is already quite optimistic for complex models), then computing sales forecasts for the 10 points of sale takes a total of 10h of CPU time. Obviously, retailers prefer not to wait 10h to get their forecasts. Buying an amazingly powerful workstation is possible, but then does it make sense to have so much processing power staying idle 99% of the time when forecasts are made only once a week? Outsourcing the processing power is the obvious cost-effective approach here.

3. Forecasting is still evolving at a fast pace. Since our launch about 3 years ago, Lokad has been upgraded every month or so. Our forecasting technology is not some indisputable achievement carved in stone; on the contrary, it is still undergoing a rapid evolution. Every month, the statistical learning research community moves forward with loads of fresh ideas. In such a context, on-premise solutions undergo a rapid decay until the day the discrepancy between the performance of the current version and the performance of the deployed version is so great that the company has no choice but to rush an upgrade. An aggressively developed SaaS ensures that customers benefit from the latest improvements without even having to worry about it.
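The back-of-the-envelope estimate in point 2 can be reproduced with a few lines of Python. The function names are ours, and the second function assumes the workload splits perfectly across machines, which is an idealization.

```python
def cpu_hours(points_of_sale, products, series_per_cpu_hour):
    """Total CPU time needed to forecast one time-series per product per store."""
    series = points_of_sale * products
    return series / series_per_cpu_hour


def wall_clock_hours(total_cpu_hours, workers):
    """Elapsed time if the work is spread evenly over `workers` CPUs
    (assumes perfect parallelism, for illustration only)."""
    return total_cpu_hours / workers


total = cpu_hours(points_of_sale=10, products=10_000, series_per_cpu_hour=10_000)
print(total)                        # 10.0
print(wall_clock_hours(total, 20))  # 0.5
```

With 20 CPUs rented for half an hour once a week, the retailer gets the forecasts quickly without owning hardware that sits idle the rest of the time.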

In our opinion, going for an on-premise solution for your forecasts is like entering a golf competition with a large handicap. It might make the game more interesting, but it does not maximize your chances. Don't expect your competitors to be fair enough to start with the same handicap just because you do.

Thursday
Oct 15, 2009

What's your statistical model?

We have already disclosed a few insights about what's being used at Lokad. Yet, a frequent support request remains "what's your model, precisely?"

We're looking through various forecasting statistical packages with the intent on selecting one at some point in the near future. One thing I find lacking in Lokad is to see which statistical model was used. I understand that the selection of which model is used is a trade secret, but I would like to verify the final selection, in the trial that is, with our in-house mathematician before we trust you with our actual forecasts. Most software vendors operating in this space provide the model selected. Is it possible to get that result with Lokad?

Well, unfortunately, the correct answer is that Lokad isn't a statistical package. In particular, we don't deliver models, we deliver forecasts.

The whole architecture of Lokad has been designed around this very assumption, which unfortunately is very ill-suited to deliver any information about our models.

Our forecast flow, which grabs input data and outputs forecasts, is:

  • vastly more complex compared to models shipped with statistical packages. Forecasts cannot be associated with well-known models.
  • tailored for distributed computing in the clouds, thus, the design feels very alien when compared to classic toolkits.
  • subject to ongoing changes, as we are carrying out experiments on a daily basis with agile deployment strategies.

But this design has very specific benefits too:

  • no need to tune complex forecasting parameters.
  • no need to constantly watch your parameters, we monitor the results.
  • scales up as much as you need to, up to millions of forecasts.
  • handles complex patterns that are way beyond classical toolkits.

Then, we don't ask anyone to take our results for granted. Just go and see for yourself, our trial is free for 30 days.

Monday
Oct 05, 2009

What's wrong with promotion forecasts?

It has been a little while since we posted some technical content about our forecasting methods. Let's discuss further the case of promotion forecasts.

Promotion forecasts are especially interesting for two reasons:

  • promotions represent a heavy part of the business for many companies.
  • forecasts are likely to end up really wrong when ad-hoc methods are used.

In order to clarify the traps looming behind promotion forecasts, let's have a look at the following schema. The two black curves represent fictitious product sales - illustrating typical consumer response to a promotion.

Model for retail promotion forecasts

Disclaimer: Those behaviors have been measured on Lokad databases. Yet, market responses vary a lot from one promotion to the next and from one business to the next. Our purpose here is not to provide accurate quantitative models, but rather to focus on what we believe to be key insights for promotion forecasts.

1. Primary promotion impact

The most important effect of a promotion is usually a large sales increase of the promoted product for the duration of the operation. This effect is illustrated with (1) in our schema.

Guessing that the sales are likely to increase is trivial, yet precisely estimating the impact of the promotion - that is to say the extra sales generated by the promotion itself - is a complicated process. Lokad is using its tags+events framework to do this.

It can be noted that classical forecasting methods such as exponential smoothing completely miss promotional patterns.

2. Demand drop due to market saturation

The second effect of a promotion - which is too frequently overlooked - is the demand drop that comes just after the end of the promotion. See point (2) on the schema.

Indeed, by observing thousands of promotional operations, we have found that frequently, just after the end of a promotion, sales levels are dropping below the initial pre-promotion sales levels.

This drop reflects the temporary market saturation caused by the promotion itself. Basically, people who would have bought the product anyway have hurried their decision, resulting in fewer sales when the promotion stops.

When this drop is overlooked AND combined with a naive forecasting method, the combination can lead to extremely poor inventory management and overstocks. Indeed, the delayed response of exponential smoothing is likely to suggest inventory replenishments precisely when the demand is going to drop.
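The delayed response of exponential smoothing is easy to reproduce. The Python sketch below uses simple exponential smoothing on made-up weekly sales (baseline, 2 promotion weeks, then a post-promotion drop); the figures and the smoothing factor are assumptions chosen for illustration, not measurements from Lokad databases.

```python
def one_step_forecasts(sales, alpha=0.3):
    """Simple exponential smoothing.

    forecasts[t] is the forecast for period t, made before seeing sales[t].
    """
    level = sales[0]
    forecasts = []
    for observed in sales:
        forecasts.append(level)
        # The level only moves a fraction `alpha` toward each new observation.
        level = alpha * observed + (1 - alpha) * level
    return forecasts


# Made-up weekly sales: baseline 100, promotion at 300, then a drop to 70.
sales = [100, 100, 100, 100, 300, 300, 70, 70, 100, 100]
forecasts = one_step_forecasts(sales)

first_week_after_promo = 6
print(sales[first_week_after_promo], round(forecasts[first_week_after_promo]))
```

In the first week after the promotion, the model still forecasts about 202 units while only 70 are sold: it calls for replenishment exactly when demand falls below its usual level.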

3. Mechanical echo due to customer synchronization

The third pattern comes from the synchronization effect of the promotion on customer replenishments. This point is illustrated with (3) on the schema.

All products (consumable or not) tend to have a lifecycle of their own. The promotion synchronizes - to some extent - the consumption patterns of customers. As a result, after the initial shock (the promotion itself), there are likely to be decreasing echoes in customer demand.

In our measurements, it appears that most of the time only the first echo can be measured with enough confidence to be of any use for forecasting purposes. Subsequent echoes do exist, but they are too diffuse to be used to refine forecasts.

Again, if the demand bounce (3) happens right after the demand drop (2), then the exponential smoothing model is likely to cause frequent inventory shortages, because it completely fails to anticipate the bounce.

4. Demand drift due to market evolution

The fourth effect is the demand drift caused by the promotion. This effect is illustrated by point (4) in the schema.

A promotion acts as a market perturbation, and the base sales level is likely to be different after the promotion; hopefully higher, but our measurements show that the opposite, i.e. lower sales, also happens, albeit less frequently.

In our experience, establishing a quantitative forecast for the drift is really hard. Yet, the drift can be taken into account in order to speed up forecast adjustments after the end of the promotion.

5. Sales cannibalization

Sales cannibalization is typically the second major effect of a promotion in terms of absolute sales volumes. Cannibalized products end up with reduced sales during the promotion as illustrated by (5) on the schema.

Yet cannibalizations are harder to model, because the effect can be diffuse and potentially impact a wide range of unrelated products. Indeed, depending on the situation, some customers may have a fixed threshold for their spending. The immediate consequence of the threshold is that the opportunity spending offered by the promotion ends up compensated by restrictions on somewhat unrelated products.

Lokad is trying hard to provide fine-grained promotion forecasts, taking into account all the effects that we can measure. If you decide to go for in-house forecasts (which we don't recommend, but well), we suggest making sure that neither (2) nor (5) gets overlooked.