Filtering by Tag: retail

Promotion planning in general merchandise retail – Data Challenges

Published on by Joannes Vermorel.


Forecasting is almost always a difficult exercise, but there is one area in general merchandise retail considered an order of magnitude more complicated than the rest: promotion planning. At Lokad, promotion planning is one of the frequent challenges we tackle for our largest clients, typically through ad-hoc Big Data missions.

This post is the first of a series on promotion planning. We are going to cover the various challenges faced by retailers when forecasting promotional demand, and give some insights into the solutions we propose.

The first challenge faced by retailers when tackling promotions is the quality of the data. This problem is usually vastly underestimated, by mid-size and large retailers alike. Yet, without high-quality data about past promotions, the whole planning initiative faces a Garbage In, Garbage Out problem.

Data quality problems in promotional records

The quality of promotion data is typically poor, or at least much worse than the quality of the regular sales data. A promotional record, at the most disaggregated level, consists of an item identifier, a store identifier, a start date and an end date, plus all the dimensions describing the promotion itself.
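To make the shape of such a record concrete, here is a minimal sketch in Python. The class name, field names and the example dimension keys are assumptions made for illustration; they do not come from any particular retail system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PromotionRecord:
    """One promotion for one item in one store (most disaggregated level)."""
    item_id: str       # item identifier, e.g. a GTIN
    store_id: str      # store identifier
    start_date: date
    end_date: date
    # Dimensions describing the promotion itself (hypothetical keys).
    dimensions: dict = field(default_factory=dict)

    def is_active(self, day: date) -> bool:
        """True if the promotion covers the given day (bounds included)."""
        return self.start_date <= day <= self.end_date


record = PromotionRecord(
    item_id="ITEM-001",
    store_id="STORE-042",
    start_date=date(2012, 3, 1),
    end_date=date(2012, 3, 14),
    dimensions={"mechanism": "-20%", "facing": "end of aisle"},
)
```

Keeping the descriptive dimensions in a separate mapping is only one possible design; it makes it easy to add descriptors later without changing the core of the record.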

Those promotional records have numerous problems:

  • Records exist, but the store did not fully implement the promotion plan, especially with regard to the facing.
  • Records exist, but the promotion never happened anywhere in the network. Indeed, promotion deals are typically negotiated 3 to 6 months in advance with suppliers. Sometimes a deal gets canceled with only a few weeks’ notice, but the corresponding promotional data is never cleaned up.
  • Off-the-record initiatives from stores, such as moving an overstocked item to an end-of-aisle shelf, are not recorded. Facing is one of the strongest factors driving the promotional uplift, and should not be underestimated.
  • Details of the promotion mechanisms are not accurately recorded. For example, the presence of custom packaging, and the structured description of the packaging, are rarely preserved.

After having observed similar issues on many retailers’ datasets, we believe that the explanation is simple: there is little or no operational imperative to correct promotional records. Indeed, if the sales data are off, it creates so many operational and accounting problems that fixing the problem becomes the No. 1 priority very quickly.

In contrast, promotional records can remain wildly inaccurate for years. As long as nobody attempts to produce some kind of forecasting model based on those records, inaccurate records have a negligible negative impact on retailer operations.

The primary solution to those data quality problems is to put data quality processes in place, and to validate empirically how resilient those processes are when facing live store conditions.

However, the best process cannot fix broken past data. As 2 years of good promotional data is typically required to get decent results, it’s important to invest early and aggressively in the historization of promotional records.

Structural data problems

Beyond issues with promotional records, the accurate planning of promotions also suffers from broader and more insidious problems related to the way the information is collected in retail.

Truncating the history: Most retailers do not indefinitely preserve their sales history. Usually "old" data get deleted following two rules:

  • if the record is older than 3 years, then delete the record.
  • if the item has not been sold for 1 year, then delete the item, and delete all the associated sales records.
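To see what these two rules destroy, they can be sketched as a small Python function. This is only an illustration of the truncation practice described above, not a recommendation; the function name, the tuple layout and the thresholds expressed in days are assumptions.

```python
from datetime import date


def truncate_history(sales, today, max_age_days=3 * 365, max_idle_days=365):
    """Apply the two deletion rules to a list of (item_id, sale_date, quantity)."""
    # Most recent sale date per item, needed for the "not sold for 1 year" rule.
    last_sale = {}
    for item_id, sale_date, _ in sales:
        if item_id not in last_sale or sale_date > last_sale[item_id]:
            last_sale[item_id] = sale_date

    kept = []
    for item_id, sale_date, quantity in sales:
        too_old = (today - sale_date).days > max_age_days
        item_idle = (today - last_sale[item_id]).days > max_idle_days
        if not too_old and not item_idle:
            kept.append((item_id, sale_date, quantity))
    return kept


sales = [
    ("A", date(2008, 6, 1), 5),   # older than 3 years: deleted
    ("A", date(2011, 12, 1), 3),  # recent sale of an active item: kept
    ("B", date(2010, 6, 1), 7),   # item B not sold for over a year: deleted
]
kept = truncate_history(sales, today=date(2012, 1, 1))
```

In this made-up example, two of the three records disappear, including the whole history of item B, which could have been a useful look-alike for a future product.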

Obviously, depending on the retailer, thresholds might differ, but while most large retailers have been around for decades, it’s exceptional to find a non-truncated 5-year sales history. Those truncations are typically based on two false assumptions:

  • storing old data is expensive: Storing the entire 10-year sales data (down to the receipt level) of Walmart – and your company is certainly smaller than Walmart – can be done for less than 1000 USD of storage per month. Data storage is not just ridiculously cheap now, it was already ridiculously cheap 10 years ago, as far as retail networks are concerned.
  • old data serve no purpose: While 10-year-old data certainly serve no operational purpose, from a statistical viewpoint, even 10-year-old data can be useful to refine the analysis of many problems. Simply put, a long history gives a much broader range of possibilities to validate the performance of forecasting models and to avoid overfitting problems.

Replacing GTINs by in-house product codes: Many retailers preserve their sales history encoded with alternative item identifiers instead of the native GTINs (aka UPC or EAN13, depending on whether you are in North America or Europe). Replacing GTINs with ad-hoc identification codes is frequently believed to make it easier to track GTIN substitutions and to help avoid a segmented history.

Yet, GTIN substitutions are not always accurate, and incorrect entries become near-impossible to track down. Worse, once two GTINs have been merged, the former data are lost: it’s no longer possible to reconstruct the two original sets of sales records.

Instead, it’s a much better practice to preserve GTIN entries, because GTINs represent the physical reality of the information being collected by the POS (point of sale). The hints for GTIN substitutions should then be persisted separately, making it possible to revise associations later on if the need arises.
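A minimal sketch of this separation in Python could look as follows. The four-digit codes are made-up placeholders rather than real GTINs, and the function names are assumptions for illustration.

```python
from collections import defaultdict

# Raw sales stay keyed by the code scanned at the POS: (gtin, quantity).
raw_sales = [("0001", 10), ("0002", 4), ("0001", 6), ("0003", 2)]

# Substitution hints live in a separate, revisable table: old GTIN -> successor.
substitutions = {"0001": "0002"}


def canonical(gtin, substitutions):
    """Follow substitution hints until reaching a GTIN with no successor."""
    seen = set()
    while gtin in substitutions and gtin not in seen:  # guard against cycles
        seen.add(gtin)
        gtin = substitutions[gtin]
    return gtin


def merged_totals(raw_sales, substitutions):
    """Total quantity per canonical GTIN, computed on demand from raw sales."""
    totals = defaultdict(int)
    for gtin, quantity in raw_sales:
        totals[canonical(gtin, substitutions)] += quantity
    return dict(totals)


merged = merged_totals(raw_sales, substitutions)
unmerged = merged_totals(raw_sales, {})
```

Because the merge is computed on demand, dropping or correcting an entry in the hints table restores the original, separate histories; nothing is lost in the raw sales.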

Not preserving the packaging information: In food retail, many products come in a variety of distinct formats: from individual portions to family portions, from single bottles to packs, from regular format to +25% promotional formats, etc.

Preserving the information about those formats is important because, for many customers, an alternative format of the same product is frequently a good substitute when the other format is missing.

Yet again, while it might be tempting to merge the sales into some kind of meta-GTIN where all size variants have been merged, there might be exceptions, and not all sizes are equal substitutes (e.g. 18g Nutella vs 5kg Nutella). Thus, the packaging information should be preserved, but kept apart from the raw sales.

Data quality, a vastly profitable investment

Data quality is one of the few areas where investments are typically rewarded tenfold in retail. Better data improve all downstream results, from the most naïve to the most advanced methods. In theory, data quality would suffer from the principle of diminishing returns; however, our own observations indicate that, except for a few rising stars of online commerce, most retailers are very far from the point where investing more in data quality would not be vastly profitable.

Then, unlike building advanced predictive models, data quality does not require complicated technologies, but a lot of common sense and a strong sense of simplicity.

Stay tuned: next time, we will discuss the process challenges of promotion planning.

Categories: Tags: promotion forecasting retail data

Running a very large retail network on a smartphone

Published on by Joannes Vermorel.

Long before ‘big data’ became the technology buzzword of 2012, retail networks were among the pioneers in dealing with the large amounts of data produced by their supply chain and their point of sale systems. Recognizing the richness and immense value of their data, heavy IT infrastructure investments have, in many cases, been made.

However, to date, the limitations and cost of the required infrastructure have left the reality far behind ambition and promise. This is particularly true for the richest retail data source, which also dwarfs all others in size: receipts generated by point of sale systems. Collecting and processing receipts of hundreds or even thousands of stores has remained a daunting, and very expensive, task.

What about running a large retail network on a smartphone instead?

While this question is provocative both from a technical and commercial point of view, we explain in this whitepaper how fundamental operations such as collecting and processing receipts for retail networks of up to 1000 stores can be done on a smartphone. The source code used by Lokad to produce the results presented in this whitepaper has been made available as open source under a very liberal license (BSD) on GitHub.

By sharing some insights on big data for retail, we hope to further fuel the advance in retail data exploitation. Any feedback will be appreciated, don't hesitate to contact us.

Categories: community, retail Tags: opensource retail whitepaper

Out-of-shelf trilemma

Published on by Joannes Vermorel.

Most people are familiar with the notion of a dilemma, where two possibilities are offered, neither of which is acceptable. Safety stock analysis is a classic mathematical dilemma: you can choose between more stock or more stockouts, yet both generate extra costs.

However, there exist situations where the trade-off is more complex, in the sense that there are more than 2 unfavorable options to balance. When there are 3 unfavorable options to be balanced, the situation is called a trilemma.

Out-of-shelf (OOS) monitoring, à la Shelfcheck, faces a trilemma when it comes to the quality of the alerts being delivered:

  • Sensitivity, the percentage of OOS problems being captured.
  • Precision, the percentage of true alerts within all OOS alerts.
  • Latency, the delay between the start of the OOS problem and the alert.

Pick any two, but you can't have them all. In fact, for any given demand forecasting accuracy, improving any one of those 3 metrics comes at the expense of the other 2.
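To make the three metrics concrete, here is a small Python sketch that computes them from a list of alerts, assuming the true OOS start times are known after the fact. The function name, the data layout (times in hours) and the rule used to decide whether an alert is "true" are illustrative assumptions, not a description of how Shelfcheck works.

```python
def alert_quality(oos_starts, alerts, max_delay):
    """
    oos_starts: dict shelf_id -> start time of a real OOS problem.
    alerts: list of (shelf_id, alert_time).
    An alert counts as true if it fires on a shelf with a real OOS problem,
    no earlier than its start and no later than max_delay after it.
    Returns (sensitivity, precision, mean latency of captured problems).
    """
    first_hit = {}   # shelf_id -> delay of the earliest true alert
    true_alerts = 0
    for shelf, alert_time in alerts:
        start = oos_starts.get(shelf)
        if start is not None and 0 <= alert_time - start <= max_delay:
            true_alerts += 1
            delay = alert_time - start
            if shelf not in first_hit or delay < first_hit[shelf]:
                first_hit[shelf] = delay

    sensitivity = len(first_hit) / len(oos_starts) if oos_starts else 0.0
    precision = true_alerts / len(alerts) if alerts else 0.0
    latency = sum(first_hit.values()) / len(first_hit) if first_hit else None
    return sensitivity, precision, latency


oos_starts = {"s1": 10, "s2": 20, "s3": 30, "s4": 40}
alerts = [("s1", 12), ("s2", 26), ("s5", 15), ("s3", 29)]
quality = alert_quality(oos_starts, alerts, max_delay=8)
```

In this made-up example, half of the OOS problems are captured, half of the alerts are true, and the captured problems are detected after 4 hours on average. Waiting longer before alerting would typically improve precision, but at the cost of latency.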

Categories: insights, on shelf availability, retail Tags: insights oos out-of-shelf retail

Big data in retail, a reality check

Published on by Joannes Vermorel.

Cloud computing being so 2011, big data is going to be a key IT buzzword for 2012. Yet, as far as we understand our retail clients, there is one data source that holds above 90% of the total information value in their possession: market basket data (tagged with loyalty card information when available).

For any mid-to-large retail network, the informational value of market basket data simply dwarfs just about all alternative data sources, be it:

  • In-store video data, which remain difficult to process, and primarily focused on security.
  • Social media data, which are very noisy and reflect bot implementations as much as human behaviors.
  • Market analyst’s reports, which require the scarcest resource of all: management attention.

Yet, besides basic sales projections (aka sales per product, per store, per region, per week …), we observe that, as of January 2012, most retailers are doing very little with their market basket data. Even forecasting for inventory optimization is typically nothing more than a moving average variant at the store level. More elaborate methods are used for warehouses, but then, retailers are no longer leveraging basket data, but past warehouse shipments.

Big Data vendors promise to bring an unprecedented level of data processing power to their clients to let them harness all the potential of their big data. Yet, is this going to bring profitable changes to retailers? Not necessarily so.

The storage capacity sitting on the shelves of an average hypermarket with 20+ external drives on display (assuming 500GB per drive) typically exceeds the raw storage needed to persist a whole 3 years of history of a 1000-store network (i.e. 10TB of market basket data). Hence, raw data storage is not a problem, or, at least, not an expensive problem. Then, data I/O (input/output) is a more challenging matter, but again, by choosing an adequate data representation (the details would go beyond the scope of this post), it’s hardly a challenge as of 2012.
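The back-of-the-envelope arithmetic behind this comparison can be spelled out in a few lines of Python. The figures are the ones quoted above; the per-store breakdown is a derived number, and decimal units (1 TB = 1000 GB) are assumed.

```python
DRIVES_ON_DISPLAY = 20   # lower bound, the text says "20+"
GB_PER_DRIVE = 500
STORES = 1000
YEARS = 3
HISTORY_TB = 10          # market basket history quoted in the text

# Capacity sitting on the shelf, in decimal terabytes.
shelf_tb = DRIVES_ON_DISPLAY * GB_PER_DRIVE / 1000

# Implied volume of market basket data per store and per year, in gigabytes.
gb_per_store_year = HISTORY_TB * 1000 / (STORES * YEARS)

print(f"shelf capacity: {shelf_tb:.0f} TB")
print(f"per store, per year: {gb_per_store_year:.1f} GB")
```

With exactly 20 drives, the shelf holds 10 TB, which already matches the 3 years of history; each store contributes only a few gigabytes per year.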

We observe that the biggest challenge posed by Big Data is simply manpower requirements to do anything operational with it. Indeed, the data is primarily big in the sense that the company resources, to run the Big Data software and to implement whatever suggestions come out of it, are thin.

Producing a wall of metrics out of market basket data is easy; but it’s much harder to build a set of metrics worth the time spent reading them, considering the hourly costs of employees.

As far as we understand our retail clients, the manpower constraint alone explains why so little is being done with market basket data on an ongoing basis: while CPU has never been so cheap, staffing has never been so expensive.

Thus, we believe that Big Data successes in retail will be encountered by lean solutions that treat, not processing power, but people, as the scarcest resource of all.

Categories: business, insights, market Tags: bigdata insights retail

European Supermarket Magazine features Lokad

Published on by Joannes Vermorel.

Europe’s dedicated magazine for the supermarket sector ESM is featuring Lokad this month in a two page article titled Embracing the Cloud. The publication gives a great overview of Lokad to retailers and details why the cloud is such a game changer for retail. ESM had taken notice of Lokad’s presentation at the EuroShop 2011 in Düsseldorf, which particularly impressed the Editor Kevin Kelly.

Also, we agreed to offer ESM’s readers a free proof of concept to benchmark the retailer’s current forecasting accuracy compared to the one delivered by Lokad. Consider a 3 weeks net execution time to get hard facts on your forecasting practice.

In this issue, ESM investigates the highly competitive European grocery market, including the growing prominence of private labels, and covers areas of interest to both buyers and other senior management working in the retail and manufacturing sectors, such as supply chain and logistics management, technology (such as EPoS), packaging and design, and environmental best practices. The magazine appears bimonthly.

Categories: retail, supply chain Tags: cloud computing press retail supply chain

Business is UP but forecasts are DOWN

Published on by Joannes Vermorel.

Statistical demand forecasting is a counter-intuitive science. This point was pressed a couple of times before, but let's have a look at another misleading situation.

If every single product segment of my business is growing fast, then at least some products should have an upward sales trend as well. Right? Otherwise, we would not be growing at all.

This statement looks like just plain common sense; and yet it's wrong, very wrong. We live in a fast-paced economy. Having an identical product sold for more than 3 years is the exception rather than the norm in most consumer goods businesses. As a result, product life-cycles tend to dwarf the organic growth of retailers.

This situation is illustrated by the schema below.

This is a set of product sales plotted on the same graph. Each curve is associated with a particular product, and products are launched over time. Each product comes with its own lifecycle pattern. The lifecycle patterns here illustrate a typical novelty effect: sales quickly ramp up after product launch, and then the product enters its downward phase, which ends when the product is finally phased out of the market.

Yet, how does an upward trend, coming from the retailer itself, impact this picture? Let's have another look at the illustration below.

Sales are higher with a positively trended retailer, yet this growth is nowhere near strong enough to compensate for the product lifecycle effect. The sales of the product are still decreasing, albeit at a slower rate.

This situation outlines how we can have a fast-growing retail business with only negatively trended product sales. The main trick lies in the fact that new products keep being launched.

Alas, this situation generates a lot of confusion. Indeed, when sales forecasts severely mismatch overall expectations, it becomes very tempting to fix the forecasts.

Since most forecasting tools are poorly suited to deal with highly variable or intermittent demand anyway, it is tempting to aggregate sales per family or per category to produce an aggregated forecast, and then to de-aggregate forecasts at the SKU level using ratios. This approach is named top-down forecasting, and it is heavily used in many industries (textile among others).

Top-down forecasts produce results that look much closer to intuitive expectations: a growth is observed in the sales forecasts, and it matches growth observed on the various business segments.

Yet, by producing the forecast at the TOP level, the forecasting model captures a fictitious upward trend that only results from the contribution of regular product launches. If this fictitious trend ends up applied to a lower level (aka SKUs or products), then we significantly over-forecast the sales of each individual product.

Near worst case: massive overstock is generated for products precisely at the time they are phased out of the market.
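This effect can be reproduced with a toy simulation in Python. All numbers are assumptions chosen for illustration: one product is launched per period, each product decays linearly over 6 periods, and each new launch peaks 5% higher than the previous one. The top-down forecast extrapolates the aggregate trend and splits it using current sales shares.

```python
LIFETIME = 6    # periods a product stays on the market (assumed)
GROWTH = 1.05   # each new launch peaks 5% higher than the previous one (assumed)


def sales(launch, t):
    """Sales at period t of the product launched at period `launch`."""
    age = t - launch
    if age < 0 or age >= LIFETIME:
        return 0.0
    peak = 100.0 * GROWTH ** launch
    return peak * (1 - age / LIFETIME)   # linear decline after launch


def total(t):
    """Aggregated sales of all products at period t."""
    return sum(sales(launch, t) for launch in range(t + 1))


t = 20
growth_ratio = total(t) / total(t - 1)   # trend observed at the TOP level
top_forecast = total(t) * growth_ratio   # aggregated forecast for t + 1

overforecast = {}
for launch in range(t - LIFETIME + 1, t + 1):   # products alive at period t
    share = sales(launch, t) / total(t)
    top_down = top_forecast * share             # de-aggregated forecast
    overforecast[launch] = top_down - sales(launch, t + 1)
```

In this toy setup, the business grows by about 5% per period while every single product declines, and `overforecast` is positive for every product alive at period `t`. The oldest product, launched at period 15, is phased out at `t + 1`, yet the top-down split still assigns it positive sales.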

From a forecasting perspective, a good forecasting system should be able to capture lifecycle effects. It means that sales forecasts may significantly differ from the overall business forecast. Business can go UP while every single product is going DOWN. In such a situation, trying to fix forecasts is most likely going to make them worse.

Addendum: Despite the date of this post (April 1st, 2011), this post is not a joke.

Categories: forecasting, insights Tags: forecasting insights lifecycle retail trend

Better promotion forecasts in retail

Published on by Joannes Vermorel.

Since our major Tags+Events upgrade last fall, we have been very actively working on promotion forecasting for retail. We now have thousands of promotional events in our databases; and the analysis of those events has led us to very interesting findings.

Although it’s hardly surprising, we have found that:

  • promotion forecasts, when performed manually by practitioners, usually involve forecast errors above 60% on average. Your mileage may vary, but typical sales forecast errors in retail are usually closer to 20%.
  • including promotion data through tags and events reduces the average forecast error by roughly 50%. Again, your mileage may vary depending on the amount of data that you have on your promotional events.

As a less intuitive result, we have also found that rule-based methods and linear methods, although widely advertised by some experts and some software tools, are very weak against overfitting, and can distort the evaluation of the forecast error, leading to a false impression of performance in promotion forecasting.

Also, note that this 50% improvement has been achieved with quite a limited amount of information, usually no more than 2 or 3 binary descriptors per promotion.

Even crude data about your promotions lead to significant forecast improvements, which translate into significant working capital savings.

The first step to improve your promotion forecasts consists in gathering accurate promotion data. In our experience, this step is the most difficult and the most costly one. If you do not have accurate records of your promotions, then there is little hope of getting accurate forecasts. As the saying goes, Garbage In, Garbage Out.

Yet, we did notice that even a single promotion descriptor, a binary variable that just indicates whether the article is currently promoted or not, can lead to a significant forecast improvement. Thus, although your records need to be accurate, they don’t need to be detailed to improve your forecasts.

Thus, we advise you to keep precise track of the timing of your promotions: when did they start? when did they end? Note that for eCommerce, front page display often has an effect comparable to a product promotion, thus you need to keep track of the evolution of your front page.

Then, article description matters. Indeed, in our experience, even the most frequently promoted articles are not going to have more than a dozen promotions in their market lifetime. On average, the number of known past promotions for a given article is ridiculously low, ranging from zero to one. As a result, you can’t expect any reliable results by focusing on the past promotions of a single product at a time because, most of the time, there aren’t any.

So instead, you have to focus on articles that resemble the article that you are planning to promote. With Lokad, you can do that by associating tags with your sales. Typically, retailers use a hierarchy to organize their catalog. Think of an article hierarchy with families, sub-families, articles, variants, etc.

Translating a hierarchical catalog into tags can be done quite simply following the process illustrated below for a fictitious candy reseller:

The tags associated with the sales history of medium lemon lollipops would be LOLLIPOPS, LEMON, MEDIUM

This process will typically create 2 to 6 tags per article in your catalog - depending on the complexity of your catalog.
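As a rough sketch of this translation, assuming the catalog is available as one path per article (family, flavor, size), the tags can be derived in a few lines of Python. The SKU codes, the catalog layout and the function names are made up for the example and do not describe Lokad's import format.

```python
def tags_for(path):
    """Turn a catalog path (family, sub-family, variant, ...) into upper-case tags."""
    return [level.strip().upper() for level in path if level.strip()]


# Hypothetical catalog of the fictitious candy reseller.
catalog = {
    "SKU-001": ("Lollipops", "Lemon", "Medium"),
    "SKU-002": ("Lollipops", "Cherry", "Large"),
    "SKU-003": ("Chocolate bars", "Dark", ""),
}

tags = {sku: tags_for(path) for sku, path in catalog.items()}


def shared_tags(sku_a, sku_b):
    """Tags two articles have in common, a crude hint of how alike they are."""
    return sorted(set(tags[sku_a]) & set(tags[sku_b]))
```

Here `tags["SKU-001"]` is `['LOLLIPOPS', 'LEMON', 'MEDIUM']`, and the two lollipops share the `LOLLIPOPS` tag, which is what lets the promotions of one article inform the forecast of the other.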

We have said that even very limited information about your promotions could be used to improve your sales forecasts right away. Yet, more detailed promotion information clearly improves the forecast accuracy.

We have found that two items are very valuable to improve the forecast accuracy:

  • the mechanism that describes the nature of the discount offered to your customers. Typical mechanisms are flat discount (ex. -20%) but there are many other mechanisms such as free shipping or discount for larger quantities (ex: buy one and get one for free).
  • the communication that describes how your customers get notified of the promotional event. Typically, communication includes marketing operations such as radio, newspaper or local ads, but also the custom packaging (if any) and the visibility of promoted articles within the point of sales.

In case of larger distribution networks, the overall availability of the promotion should also be described if articles aren’t promoted everywhere. Such situation typically arises if point of sales managers can opt out from promotional operations.

Discussing with professionals, we have found that many retailers are expecting a set of rules to be produced by Lokad; and those rules are expected to explain promotions such as


Basically, those expected rules always follow more or less the same patterns:

  • A set of binary conditions that defines the scope of the rule.
  • A set of linear coefficients to estimate the effect of the rule.

We have found that many tools in the software market are available to help you discover those rules in your data; which, seemingly, has led many people to believe that this approach was the only one available.

Yet, according to our experiments, rule-based methods are far from being optimal. Worse, those rules are really weak against overfitting. This weakness frequently leads to painful situations where there is a significant gap between estimated forecast accuracy and real forecast accuracy.

Overfitting is a very subtle, and yet, very important, phenomenon in statistical forecasting. Basically, the central issue in forecasting is that you want to build a model that is very accurate against the data you don’t have.

In particular, the statistical theory indicates that it is possible to build models that happen to be very accurate when applied to the historical data, and still very inaccurate to predict the future. The problem is that, in practice, if you do not carefully think of the overfitting problem beforehand, building such a model is not a mere possibility, but the most probable outcome of your process.

Thus, you really need to optimize your model against the data you don’t have. Yet, this problem looks like a complete paradox, because, by definition, you can’t measure anything if you don’t have the corresponding data. And we have found that many professionals gave up on this issue, because it doesn’t look like a tractable problem anyway.

Our advice is: DON’T GIVE UP

The core issue with those rules is that they perform too well on historical data. Each rule you add is mechanically reducing the forecast error that you are measuring on your historical data. If you add enough rules, you end-up with an apparent near-zero forecasting error. Yet, the empirical error that you measure on your historical data is an artifact of the process used to build the rules in the first place. Zero forecast error on historical data does not translate itself into zero forecast error on future promotions. Quite the opposite in fact, as such models tend to perform very poorly on future promotions.
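The mechanism can be illustrated with a deliberately naive rule model in Python. The promotions and uplift values are made up, and the "one rule per combination of conditions" scheme is a simplified stand-in for rule-based tools in general, not a description of any specific product. Each rule predicts the average uplift of the historical promotions matching its scope, and promotions matching no rule fall back to the overall average.

```python
from collections import defaultdict


def fit_rules(records, n_conditions):
    """One rule per distinct combination of the first n_conditions descriptors."""
    groups = defaultdict(list)
    for descriptors, uplift in records:
        groups[descriptors[:n_conditions]].append(uplift)
    rules = {scope: sum(v) / len(v) for scope, v in groups.items()}
    fallback = sum(uplift for _, uplift in records) / len(records)
    return rules, fallback


def mean_abs_error(model, records, n_conditions):
    rules, fallback = model
    errors = [abs(uplift - rules.get(descriptors[:n_conditions], fallback))
              for descriptors, uplift in records]
    return sum(errors) / len(errors)


# (category, mechanism, month) -> observed uplift factor (made-up data).
history = [
    (("candy", "-20%", "jan"), 2.0), (("candy", "-20%", "feb"), 3.0),
    (("candy", "bogo", "mar"), 4.5), (("soda", "-20%", "apr"), 2.4),
    (("soda", "bogo", "may"), 3.5), (("soda", "bogo", "jun"), 4.0),
]
future = [
    (("candy", "-20%", "jul"), 2.6), (("soda", "bogo", "aug"), 4.1),
    (("candy", "bogo", "sep"), 3.9),
]

train_err, test_err = [], []
for n in range(4):   # from no condition at all to fully specific rules
    model = fit_rules(history, n)
    train_err.append(mean_abs_error(model, history, n))
    test_err.append(mean_abs_error(model, future, n))
```

As conditions are added, the error measured on the history keeps shrinking and reaches exactly zero with the most specific rules. On the held-out promotions, those same rules do no better than having no rule at all, while the intermediate model (category plus mechanism) performs best.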

Although optimizing for the data you don’t have is hard, statistical learning theory offers both theoretical understanding and practical solutions to this problem. The central idea consists of introducing the notion of structural risk minimization, which balances the empirical error against the complexity of the model.

This will be discussed in a later post, stay tuned.

(Shameless plug) Many of those modern solutions, i.e. mathematical models that happen to be careful about the overfitting issue, have been implemented by Lokad, so that you don’t have to hire a team of experts to benefit from them.

Categories: business, forecasting, insights, retail, sales, supply chain Tags: forecasting insights promotion retail theory