Filtering by Tag: supply chain

Entropy analysis for supply chain IT system discovery

Published on by Joannes Vermorel.

The IT landscape of supply chains is nearly always complex. Indeed, by nature, supply chains involve multiple actors, multiple sites, multiple systems, etc. As a result, building data-driven insights in supply chains is a challenge due to the sheer heterogeneity of the IT landscape. Too frequently, supply chain analytics deliver nonsensical results precisely because of underlying garbage in, garbage out problems.

At Lokad, we have not only developed a practice that thoroughly surveys the IT landscape and the datasets inhabiting it, but we have also created some bits of technology to facilitate the surveying operations themselves. In this post, we detail one step of our surveying methodology, which is based on Shannon entropy. We successfully leveraged entropy analysis for several large scale supply chain initiatives.

Our surveying process starts by reviewing all the database tables that are understood to be relevant for the supply chain initiative. Supply chains are complex and, consequently, the IT systems that operate the supply chains reflect this complexity. Moreover, the IT systems might have been evolving for several decades, and layers of complexity tend to take over those systems. As a result, it’s not uncommon to identify dozens of database tables with each table having dozens of columns, i.e. fields in the database terminology.

For large supply chains, we have observed situations where the total number of distinct fields is above 10,000. By leveraging entropy analysis, we are able to remove half of the columns from the picture immediately and consequently reduce the remaining amount of work significantly.

Dealing with that many columns is a major undertaking. The problem is not data-processing capabilities: with cloud computing and adequate data storage, it’s relatively straightforward to process thousands of columns. The real challenge is to make sense of all those fields. As a rule of thumb, we estimate that well-written documentation of a field takes up about one page, when interesting use cases and edge cases are covered. Without proper documentation of the data, data semantics are lost, and odds are very high that any complex analysis performed on the data will suffer from massive garbage in garbage out headaches. Thus, with 10,000 fields, we are faced with the production of a 10,000-page manual, which demands a really monumental effort.

Yet, we have observed that those large IT systems also carry a massive amount of dead weight. While the raw number of fields appears to be very high, in practice, it does not mean that every column found in the system contains meaningful data. At the extreme, the column might be entirely empty or constant and, thus, contain no information whatsoever. A few fields can be immediately discarded because they are truly empty. However, we have observed that fully empty fields are actually pretty rare. Sometimes, the only non-constant information in the column dates from the day when the system was turned on; the field was never used again afterward. While truly empty fields are relatively rare, we usually observe that degenerate fields are extremely numerous. These fields contain almost no data, well below any reasonable threshold to leverage this data for production purposes.

For example, a PurchaseOrders table containing over one million rows might have an Incoterms column that is non-empty in only 100 rows; furthermore, all those rows are more than five years old, and 90 rows contain the entry thisisatest. In this case, the field Incoterms is clearly degenerate, and there is no point in even trying to make sense of this data. Yet, a naive SQL filter will fail to identify such a column as degenerate.

Thus, a tool to identify degenerate columns is needed. It turns out that Shannon entropy is an excellent candidate. Shannon entropy is a mathematical tool to measure the quantity of information that is contained in a message. The entropy is measured in Shannons, which is a unit of measurement somewhat akin to bits of information. By treating the values found in the column itself as the message, Shannon entropy gives us a measure of the information contained in the column expressed in Shannons.

While all of this might sound highly theoretical, putting this insight into practice is extremely straightforward. All it takes is to use the entropy() aggregator provided by Envision. The tiny script below illustrates how we can use Envision to produce the entropy analysis of a table with 3 fields.

read "data.csv" as T[*]
show table "List of entropies" with

An entropy lower than 0.1 is a very good indicator of a degenerate column. If the entropy is lower than 0.01, the column is guaranteed to be degenerate.
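For readers without access to Envision, the same measurement can be sketched in a few lines of Python. This is only an illustration of Shannon entropy over the values of a column, not the implementation behind Envision's entropy() aggregator. The function name and the three sample columns are made up for the example, and the Incoterms column assumes that the 10 remaining non-empty rows hold 10 distinct values.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (base 2, i.e. in shannons) of the values found in one column."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty column carries no information
    # Sum of -p * log2(p) over the observed frequency p of each distinct value.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical column modeled on the Incoterms example: one million rows,
# 90 of them "thisisatest", 10 other distinct values, everything else empty.
incoterms = [""] * 999_900 + ["thisisatest"] * 90 + [f"code{i}" for i in range(10)]
constant = ["EUR"] * 1_000            # a constant column
balanced = ["A", "B", "C", "D"] * 250  # four equally frequent values

for name, column in [("Incoterms", incoterms), ("Currency", constant), ("Zone", balanced)]:
    print(f"{name}: {shannon_entropy(column):.4f}")
```

Under these assumptions, the Incoterms column lands well below the 0.01 threshold, the constant column scores zero, and the balanced column scores 2 shannons.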

Our experience indicates that performing an initial filtering based on entropy measurements reliably eliminates between one-third and two-thirds of the initial fields from the scope of interest. The savings in time and effort are very substantial: for large supply chain projects, we are talking about man-years being saved through this analysis.

We unintentionally discovered a positive side effect of entropy filtering: it lowers the IT fatigue associated with the (re)discovery of the IT systems. Indeed, investigating a degenerate field typically proves to be an exhausting task. As the field is not used - sometimes not used any more - nobody is quite sure whether the field is truly degenerate or if the field is playing a critical, but obscure, role in the supply chain processes. Because of the complexities of supply chains, there is frequently nobody who can positively affirm that a given field is not used. Entropy filtering immediately eliminates the worst offenders that are guaranteed to lead us on a wild-goose chase.

Categories: Tags: supply chain envision insights No Comments

The Supply Chain Scientist

Published on by Joannes Vermorel.

Artificial intelligence has been making steady progress over the last few decades. However, while self-driving cars might be just around the corner, we are still decades away from having software smart enough to devise a supply chain strategy. Yet, at the same time, it would be incorrect to conclude that supply chain as a whole is still decades away from being positively impacted by machine learning algorithms.

Lokad’s supply chain science competency was born out of the observation that while algorithms alone were insufficient, they actually became formidable enablers in the hands of capable supply chain experts. Machine learning offers the possibility to achieve unprecedented levels of supply chain performance by taking care of all the extensive but otherwise clerical micro-decisions that your supply chain requires: when to order a product, when to move a unit of stock, when to produce more items, etc.

The Supply Chain Scientist is a mix between a data scientist and a supply chain expert. This person is responsible for the proper data preparation and the proper quantitative modelling of your supply chain. Indeed, it takes human supply chain insights to realize that some relevant data may be missing from a project and to align the optimization parameters with the supply chain strategy of the company.

Too often, supply chain initiatives come with fragmented responsibilities:

  • Data preparation is owned by the IT team
  • Statistics and reporting are owned by the BI (business intelligence) team
  • Supply chain execution is owned by the supply chain team

The traditional S&OP answer to this issue is the creation of collective ownership through monthly meetings between many stakeholders, ideally having the whole thing owned by the CEO. However, while we are certainly not opposed to the principle of collective ownership, our experience indicates that things tend to move forward rather slowly when it comes to traditional S&OP.

In contrast to the collective ownership established through scheduled meetings, the Supply Chain Scientist holds the vital role of taking on the end-to-end ownership of all the quantitative aspects of a supply chain initiative.

This focused ownership is critical in order to avoid the all-too-common pitfalls associated with traditional supply chain organizations:

  • Data is incorrectly extracted and prepared, primarily because the IT team has limited insights in relation to the use of the data.
  • Statistics and reporting misrepresent the business; they provide less-than-useful insights and suffer from less-than-perfect data inputs.
  • Execution relies heavily on ad-hoc Excel sheets in order to try to mitigate the two problems described above, while creating an entire category of new problems.

When we begin a quantitative supply chain initiative with a client company, we start by making sure that a Supply Chain Scientist is available to execute the initiative.

Learn more about supply chain scientists

Categories: Tags: insights supply chain 1 Comment

The test of supply chain performance

Published on by Joannes Vermorel.

Answering these 12 questions tells you more about your supply chain performance than nearly all benchmarks and audits that the market has to offer. This test should take about 5 minutes of your time.

  1. Can supply chain operate without Excel?
  2. Is ABC analysis regarded as obsolete?
  3. Is all relevant data documented by the supply chain teams?
  4. Do you record historical stock levels?
  5. Do supply chain teams monitor the quality of their data?
  6. Do you forecast lead times?
  7. Do you prevent any manual intervention on forecasts?
  8. Do you manage operational constraints, e.g. MOQs, as data?
  9. Do you quantify the cost of supply chain failures?
  10. Can decision-making systems be left unattended for a week?
  11. Can you recompute all decisions in one hour?
  12. Are all decisions prioritized against each other?

If your company isn’t answering yes to at least 10 of those questions, then, a supply chain initiative has the potential to deliver a sizeable ROI. If you don't have 8 positive answers, ROI can be expected to be massive. If your company does not reach 6 positive answers, then, in our book, supply chain optimization hasn't even started yet.
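The reading of the score can be written down as a small Python function. This is a sketch of the thresholds stated above; the function name, the labels and the sample answers are invented for the illustration.

```python
def interpret_score(yes_count):
    """Map the number of 'yes' answers (0 to 12) to the reading given in the text."""
    if not 0 <= yes_count <= 12:
        raise ValueError("the test has 12 questions")
    if yes_count >= 10:
        return "no specific warning"
    if yes_count >= 8:
        return "sizeable ROI potential"
    if yes_count >= 6:
        return "massive ROI potential"
    return "supply chain optimization has not started yet"

# Hypothetical answers to the 12 questions, in order.
answers = [True, False, True, True, False, False,
           True, True, False, True, False, True]
yes_count = sum(answers)
print(yes_count, "->", interpret_score(yes_count))
```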

Read more about the Lokad test of supply chain performance.

Categories: Tags: supply chain insights No Comments

Preparing enterprise data takes 6 months

Published on by Joannes Vermorel.

How long does it take to get started with Lokad? Answering this question is tough because often our answer is about 3 to 6 months. Hell, 6 months! How can your software be so clunky that it can take up to 6 months to get started? Well, our typical set-up phases can be broken down as follows:

  • 90 to 180 days: preparing the data
  • 3 to 30 days: configuring Lokad

This shows that Lokad’s setup is actually lightweight. Yes, there is room for improvement, but if we consider that Lokad offers programmatic capabilities which completely fit the business drivers, the process can already be considered as lean.

The biggest challenge, which makes pretty much everything else seem insignificant, is data preparation. Preparing data is the art of producing numbers that make sense out of raw data obtained from the company systems.

It is tempting to underestimate the amount of effort that needs to be invested upfront in order to deliver numbers that make sense. In fact, data preparation is too often reduced to a simple data cleaning operation as if the challenge could simply be addressed by filtering the few parts in the data that happen to be incorrect (such as the negative stock levels).

Yet, the true challenge lies in uncovering and documenting the precise semantics of the data. When we begin a project, we consider ourselves lucky if we have about one line of documentation per field for every database table that is made available to us. By the end of the project, we have about one page worth of documentation per field.

If data preparation takes 6 months, why not just postpone using Lokad for 6 months, so that all the data is ready just before Lokad starts working on it?

Establishing a data-driven company culture takes years. If your company does not already have a team of data scientists working for it, not much will happen for the next 6 months as far as your supply chain project is concerned. Hence, after 6 months of waiting, your company will still be stuck with another 6 months of data preparation. One of the core know-hows we have developed at Lokad consists precisely in uncovering all the subtle “gotchas” that may backfire against supply chain initiatives.

Lokad can be a key driver for change in your company’s supply chain. Don’t hesitate to contact us; we are happy to discuss these matters with you in more detail.

Categories: Tags: insights supply chain No Comments

Ionic data storage for high scalability in supply chain

Published on by Joannes Vermorel.

Supply chains moved quite early on towards computer-based management systems. Yet, as a result, many large companies have decade-old supply chain systems which tend to be sluggish when it comes to crunching a lot of data. Certainly, tons of Big Data technologies are available nowadays, but companies are treading carefully. Many, if not most, of those Big Data technologies are critically dependent on top-notch engineering talent to work smoothly; and not all companies succeed, unlike Facebook, in rewriting layers of Big Data technologies to make them work.

Being able to process vast amounts of data has been a long-standing commitment of Lokad. Indeed, optimizing a whole supply chain typically requires hundreds of incremental adjustments. As hypotheses get refined, it’s typically the entire chain of calculations that needs to be re-executed. Getting results that encompass the whole supply chain network in minutes rather than hours lets you complete a project in a few weeks while it would have dragged on for a year otherwise.

And this is why we started our migration towards cloud computing back in 2009. However, merely running on top of a cloud computing platform does not guarantee that vast amounts of data can be processed swiftly. Worse still, while using many machines offers the possibility to process more data, it also tends to make data processing slower, not faster. In fact, delays tend to take place when data is moved around from one machine to the next, and also when machines need to coordinate their work.

As a result, merely throwing more machines at a data processing problem does not further reduce the data processing time. The algorithms need to be made smarter, and every single machine should be able to do more with no more computing resources.

A few weeks ago, we released a new high-performance column storage format code-named Ionic that is heavily optimized for high-speed concurrent data processing. This format is also geared towards supply chain optimization as it natively supports the storage and handling of distributions of probabilities. And these distributions are critical in order to be able to take advantage of probabilistic forecasts. Ionic is not intended to be used as an exchange format between Lokad and its clients. For data exchange, using a flat text file format, such as CSV, is just fine. The Ionic format is intended to be used as an internal data format to speed up everything that happens within Lokad. Thanks to Ionic, Lokad can now process hundreds of gigabytes worth of input data with relative ease.

In particular, the columnar aspect of the Ionic format ensures that columns can be loaded and processed separately. When addressing supply chain problems, we are routinely facing ERP extractions where tables have over 100 columns, and up to 500 columns for the worst offenders. Ionic delivers a massive performance boost when it comes to dealing with that many columns.

From Lokad’s perspective, we are increasingly perceiving data processing capabilities as a critical success factor in the implementation of supply chain optimization projects. Longer processing time means that less gets done every single day, which is problematic since ultimately every company operates under tight deadlines.

The Ionic storage format is one more step in our Big Data journey.

Categories: Tags: technology release supply chain cloud computing bigdata No Comments

Will compilation save supply chains?

Published on by Joannes Vermorel.

Yes. To a noticeable extent. And I would never have ventured to put forward this opinion when founding Lokad nearly a decade ago.

By compilation I refer to the art of crafting compilers, that is, computer programs that translate source code into another language. Few people outside the ranks of programmers know what a compiler does, and few people within the ranks of programmers know how a compiler is designed. At first, compilation concerns appear distant (to say the least) to supply chain concerns. Yet, nowadays, at Lokad, it’s compilation stuff that keeps saving the day; one supply chain project after another.

Shameless plug: software engineers with compilation skills don’t grow on trees, and we are hiring. Want to work on stuff that matters? Well, the next time your plane is late because a part was missing, or the next time the drug you seek is out of stock, just remember that you could have made a difference by joining Lokad :-)

Supply chains are complex, maddeningly complex. Globalization has multiplied sourcing opportunities, but delays are longer and more erratic than ever. Sales channels are multiplying too: there are physical stores, online stores, marketplaces, resellers, wholesalers, ... And now, thanks to Amazon, everyone, everywhere expects everything to be ordered and received overnight. Supply chain expectations are higher than ever.

Approaching supply chain problems with anything less than the full expressiveness of a programming language does not work. Just like Lego programming is not going to happen, supply chain challenges won’t fit into checkboxes and dropdowns. This does not prevent software vendors from trying, mind you. Solutions that include more than 1000 tables, each table hovering at around 100 fields on average, are all too common. And while the client company is only using about 1% of the solution’s feature area, they still have to cope with its pervasive complexity.

Compilation saves the day because it provides a huge body of knowledge and know-how when it comes to crafting high-quality abstractions intended as power tools for solving statistical and combinatorial problems (and much more actually). And most supply chain challenges happen to be precisely statistical and combinatorial. For example, at Lokad, by introducing an algebra of distributions, we managed to "crack down" on complicated lead time problems which were resisting our more direct approaches through packaged software features.

What makes language features different from, say, the usual app features (wysiwyg), is that language features are much less sensitive to the specificities of a given challenge than their app feature counterparts. For example, let’s consider a situation where your stock-out detection logic backfires in the specific case of ultra-seasonal products. If the feature is delivered through a language construct, then you can always narrow down the data scope until the feature works exactly where it’s intended to do so; possibly dynamically adjusting the scope through an ad-hoc numerical analysis. In contrast, with an app feature, you’re stuck with the filtering options that have been built into this feature. App features are a good fit only if your problems are narrow and well-defined, which is actually very unlike supply chain optimization.

In supply chain, programmability shines because:

  • Problems are both highly numerical and very structured
  • Supply chains are modular and this modularity needs to be leveraged
  • The number of variables is significant but not overwhelming
  • Fitting the precise shape of the problems is critical

It is slightly amusing to see how many software vendors tend to gradually re-invent programmability. As the user interface grows in depth and complexity, with the possibility to add filters, options, pre-process or post-process-hooks, templated alerts, KPI monitors, the user interface gradually becomes a programmable thing, and reaches the point where only a programmer can actually make sense of it (precisely thanks to his or her pre-existing programming skills). Programmable yes, but in a highly convoluted way.

Compilation is the art of amplifying engineering skills: one has to craft abstractions and language constructs that streamline thinking about the resolution of problems. As Brian Kernighan famously wrote: “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?” The same logic applies to supply chain optimization, because it’s essentially the same thing as writing code. Well, at Lokad, it literally is the same thing.

Conventional IT wisdom states that one should automate the easy parts first, leaving human experts to cope with the more complex elements. Yet, in supply chain, this approach backfires badly every single time. The most complex parts of supply chain are nearly always the most costly ones, the ones that urgently need attention. The easy parts can take care of themselves through min/max inventory or Kanban. Just like you wouldn’t build software for autonomous cars by refining software for automatic train operations, you can’t tackle difficult supply chain problems by refining software initially designed to resolve simple challenges.

Naturally, compilation alone isn’t sufficient to cope with supply chain challenges. Machine learning, big data processing and a sizable amount of human skills are worth mentioning as well. However, in all cases, having carefully crafted high-quality abstractions helps considerably. Machine learning is vastly simpler when input data is well-prepared. Big data processing is also much more straightforward when computations lend themselves easily to a high degree of parallelization.

Categories: Tags: hiring insights supply chain No Comments

Solving the general MOQ problem

Published on by Joannes Vermorel.

Minimal Order Quantities (MOQs) are ubiquitous in supply chain. At a fundamental level, MOQs represent a simple way for the supplier to indicate that there are savings to be made when products are ordered in batches rather than being ordered unit by unit. From the buyer's perspective, however, dealing with MOQs is far from being a trivial matter. The goal is not merely to satisfy the MOQs - which is easy, just order more - but to satisfy the MOQs while maximizing the ROI.

Lokad has been dealing with MOQs for years already. Yet, so far, we were using numerical heuristics implemented through Envision whenever MOQs were involved. Unfortunately, those heuristics were somewhat tedious to implement repeatedly, and the results we were obtaining were not always as good as we wanted them to be - albeit already a lot better than their "manual" counterparts.

Thus, we finally decided to roll our own non-linear solver for the general MOQ problem. This solver can be accessed through a function named moqsolv in Envision. Solving the general MOQ problem is hard - really hard - and under the hood, a fairly complex piece of software is at work. However, through this solver, Lokad now offers a simple and uniform way to deal with all types of MOQs commonly found in commerce or manufacturing.

Categories: Tags: insights supply chain envision No Comments

The Stock Reward Function

Published on by Joannes Vermorel.

The classic way of thinking about replenishment consists of establishing one target quantity per SKU. This target quantity typically takes the form of a reorder point which is dynamically adjusted based on the demand forecast for the SKU. However, over the years at Lokad, we have realized that this approach was very weak in practice, no matter how good the (classic) forecasts.

Savvy supply chain practitioners usually tend to outperform this (classic) approach with a simple trick: instead of looking at SKUs in isolation, they would step back and look at the bigger picture, while taking into consideration the fact that all SKUs compete for the same budget. Then, practitioners would choose the SKUs that seem to be the most pressing. This approach outperforms the usual reorder point method because, unlike the latter, it gives priority to certain replenishments. And as any business manager would know, even very basic task prioritization is better than no prioritization at all.

In order to reproduce this nice “trick”, in early 2015 we upgraded Lokad towards a more powerful form of ordering policy known as prioritized ordering. This policy precisely adopts the viewpoint that all SKUs compete for the next unit to be bought. Thanks to this policy, we are getting the best of both worlds: advanced statistical forecasts combined with the sort of domain expertise which was unavailable to the software so far.

However, the prioritized ordering policy requires a scoring function to operate. Simply put, this function converts the forecasts plus a set of economic variables into a score value. By assigning a specific score to every SKU and every unit of these SKUs, this scoring function offers the possibility to rank all “atomic” purchase decisions. By atomic, we refer to the purchase of 1 extra unit for 1 SKU. As a result, the scoring function should be as aligned to the business drivers as possible. However, while crafting approximate “rule-of-thumb” scoring functions is reasonably simple, defining a proper scoring function is a non-trivial exercise. Without getting too far into the technicalities, the main challenge lies in the “iterated” aspect of the replenishments where the carrying costs keep accruing charges until units get sold. Calculating 1 step ahead is easy, 2 steps ahead a little harder, and N steps ahead is pretty complicated actually.
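To make the idea of ranking “atomic” purchase decisions more concrete, here is a minimal Python sketch. It relies on a rule-of-thumb scoring function of the approximate kind mentioned above: the expected margin if the unit sells, minus the carrying cost if it does not, per unit of money spent. Everything in it is an assumption made for the illustration (the field names, the per-unit selling probabilities, the budget constraint), and it is neither the stock reward function nor Lokad's implementation of prioritized ordering.

```python
import heapq

def unit_score(sku, k):
    """Rule-of-thumb score for buying the k-th extra unit (k starts at 0) of a SKU."""
    p_sell = sku["p_sell"][k]  # assumed probability that demand reaches this unit
    gain = p_sell * sku["margin"] - (1.0 - p_sell) * sku["carrying_cost"]
    return gain / sku["price"]

def prioritized_order(skus, budget):
    """Greedily buy the best-scored next unit across all SKUs until the budget runs out."""
    # heapq is a min-heap, so scores are negated to pop the highest score first.
    heap = [(-unit_score(s, 0), name, 0) for name, s in skus.items() if s["p_sell"]]
    heapq.heapify(heap)
    order = {name: 0 for name in skus}
    while heap:
        neg_score, name, k = heapq.heappop(heap)
        sku = skus[name]
        if -neg_score <= 0 or sku["price"] > budget:
            continue  # unprofitable or unaffordable unit: this SKU leaves the race
        budget -= sku["price"]
        order[name] += 1
        if k + 1 < len(sku["p_sell"]):
            heapq.heappush(heap, (-unit_score(sku, k + 1), name, k + 1))
    return order

# Two hypothetical SKUs competing for the same budget.
skus = {
    "A": {"price": 10.0, "margin": 5.0, "carrying_cost": 1.0, "p_sell": [0.9, 0.7, 0.4, 0.1]},
    "B": {"price": 20.0, "margin": 12.0, "carrying_cost": 2.0, "p_sell": [0.8, 0.5, 0.2]},
}
print(prioritized_order(skus, budget=50.0))
```

The sketch only looks one step ahead for each unit, which is precisely the limitation discussed above: it ignores the carrying costs that keep accruing until the unit finally gets sold.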

Not so long ago, we managed to crack this problem with the stock reward function. This function breaks down the challenges through three economic variables: the per-unit profit margin, the per-unit stock-out cost and the per-unit carrying cost. Through the stock reward function, one can get the actual economic impact broken down into margins, stock-outs and carrying costs.

The stock reward function represents a superior alternative to all the scoring functions we have used so far. Actually, it can even be considered as a mini framework that can be adjusted with a small (but quite expressive) set of economic variables in order to best tackle the strategic goals of merchants, manufacturers or wholesalers. We recommend using this function whenever probabilistic forecasts are involved.

Over the course of the coming weeks, we will gradually update all our Envision templates and documentation materials to reflect this new Lokad capability.

Categories: Tags: insights supply chain envision No Comments

Supply Chain Antipatterns, a Lokad initiative

Published on by Joannes Vermorel.

Most supply chain initiatives fail. Yet, over the years at Lokad, we started to realize that those failures are far from random: there are recurrent patterns that invariably lead to failure. Thus, we decided to start a side initiative to survey the most frequent misguided responses to supply chain challenges through our new blog Supply Chain Antipatterns.

The blog comes with real comic strips produced in-house by the Lokad team.

This initiative is intended as a collaborative effort. Don't hesitate to tell the tale of your own misguided attempts (you may remain anonymous too). It might help more than a few companies to avoid falling for the same problem in the future.

Categories: Tags: antipatterns insights supply chain No Comments

Optimal service level and order quantity

Published on by Joannes Vermorel.

In the inventory optimization literature, one of the most recurring concepts is the service level, i.e. the desired probability of not hitting a stock-out situation. The service level expresses the tradeoff between too much inventory and too many stock-outs. However, experts typically remain vague when it comes to choosing service level values; a pattern also followed by most inventory software products...

That's why we have spent a bit of time to craft a formula that gives optimal service levels. Naturally, the optimality is not obtained without assumptions. However, we believe those are reasonable enough to preserve the efficiency of the formula for most businesses.

Another subject that receives too little attention is the optimal order quantity: the quantity to be ordered in order to minimize the combination of purchase costs, carrying costs, shipping costs, etc. As of January 2012, it's fascinating to notice that most of the industry still relies on the Wilson formula devised back in 1913. Yet, this formula comes with strong assumptions that do not make much sense any more for the supply chain of the 21st century.
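For reference, the Wilson formula is usually written as follows. The symbols are the conventional textbook notation rather than anything taken from this post: D is the demand per period, assumed constant, S is the flat cost of placing one order, and H is the cost of carrying one unit for one period.

```latex
% Total cost per period for an order quantity Q:
% ordering cost (D/Q orders, each costing S) plus carrying cost (average stock Q/2).
C(Q) = \frac{D}{Q}\,S + \frac{Q}{2}\,H

% Setting dC/dQ = 0 gives the Wilson economic order quantity:
Q^{*} = \sqrt{\frac{2\,D\,S}{H}}
```

The strong assumptions are visible in the notation itself: a steady demand, a flat ordering cost, and a unit price that does not depend on the quantity ordered.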

Thus, we have designed another economic order quantity formula that emphasizes volume discounts (instead of a flat ordering cost) for larger purchases. The formula (or rather the approach) is fairly general, and could be applied to any pricing structure, including non-linear situations where specific quantities are favored because they match the size of a crate or a pallet.

Both situations are illustrated with Excel sheets (so you don't even need Lokad to get started).

Categories: docs, technical Tags: supply chain No Comments

European Supermarket Magazine features Lokad

Published on by Joannes Vermorel.

Europe’s dedicated magazine for the supermarket sector ESM is featuring Lokad this month in a two page article titled Embracing the Cloud. The publication gives a great overview of Lokad to retailers and details why the cloud is such a game changer for retail. ESM had taken notice of Lokad’s presentation at the EuroShop 2011 in Düsseldorf, which particularly impressed the Editor Kevin Kelly.

Also, we agreed to offer ESM’s readers a free proof of concept to benchmark the retailer’s current forecasting accuracy against the one delivered by Lokad. Expect about 3 weeks of net execution time to get hard facts on your forecasting practice.

In this issue, ESM investigates the highly competitive European grocery market, including the growing prominence of private labels, and covers areas of interest to both buyers and other senior management working in the retail and manufacturing sectors, such as supply chain and logistics management, technology (such as EPoS), packaging and design, and environmental best practices. The magazine appears bimonthly.

Categories: retail, supply chain Tags: cloud computing press retail supply chain No Comments

Reverse supply chain gotcha for demand forecasting

Published on by Joannes Vermorel.

As far as forecasting is concerned, there is an old saying, Garbage In Garbage Out, which says that even the best forecasting technology cannot fix incorrect data inputs. However, in our experience, the sales data held by retailers and manufacturers is usually rather clean. In this day and age of electronic transactions, the amount of sales data incorrectly entered is vanishingly small.

Nevertheless, there are still subtleties when it comes to forecasting. In particular, Lokad strives to deliver demand forecasts rather than sales forecasts.

When companies consolidate their historical demand data, we have noticed that returns (part of the reverse supply chain process) and cancelations frequently end up counted as negative quantities in the consolidated sales history. Such a pattern typically generates negative sales that are rather easy to spot, even if they happen to be rather rare as the number of returns is typically small compared to the number of orders.

This behavior is correct from an accounting viewpoint, but unfortunately misleading from a forecasting viewpoint. Let's see why.

Let's consider that 9 units of a product are ordered on Monday. Upon delivery on Tuesday, considering overnight shipping, 3 units are returned. In the end, only 6 units have really been sold on Monday. Yet, what is the correct demand forecast for Monday?

  • If we assume that 6 units is the right demand for Monday, then only 6 units are stocked, which leads to only 6 clients being served on Monday, the remaining 3 clients most likely being turned away for lack of available merchandise. Then, on Tuesday, 2 units (instead of 3) are still returned, assuming the same one-in-three return rate. This approach leads to 4 effective sales for Monday.
  • If we assume that 9 units is the right demand for Monday, then 9 units are stocked and consequently shipped. Minus 3 returns on Tuesday, this approach leads to 6 effective sales for Monday.

Obviously, the second scenario is better for the retailer. As a general guideline, data that comes from the reverse supply chain should have no impact on the demand forecasts.
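The arithmetic of the two scenarios can be replayed with a few lines of Python. This is only a sketch: the function name `effective_sales` and the idea of applying a fixed return rate (one in three, as in the example) to the units actually shipped are assumptions made for illustration.

```python
def effective_sales(demand, stocked, return_rate):
    """Units kept by customers once returns are deducted.

    Assumption for this sketch: returns are a fixed share of the
    units actually shipped, rounded to whole units.
    """
    served = min(demand, stocked)           # cannot ship more than what is stocked
    returned = round(served * return_rate)  # returns only apply to shipped units
    return served - returned


# Scenario 1: stock sized on net sales (6 units).
print(effective_sales(demand=9, stocked=6, return_rate=1 / 3))
# Scenario 2: stock sized on gross demand (9 units).
print(effective_sales(demand=9, stocked=9, return_rate=1 / 3))
```

Sizing the stock on the net figure shrinks effective sales further, because returns still apply to whatever is shipped.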

Then, concerning the cancelations, there is a grey area.

  • If an order is canceled quickly, before the item has had any chance of being shipped, the item becomes immediately available for a new order to be placed. In such a situation, it makes sense not to count the quickly canceled item as part of the demand.
  • Yet, if the cancelation happens a couple of days after the order, but still before the shipment, then one item in inventory has been considered as unavailable during the whole period between the order and the cancelation. Not counting this long-delayed cancelation as demand would potentially prevent actual customers from purchasing the item they want.

The second insight is even stronger if the cancelation is caused by the long shipping delay in the first place.

As a general guideline, all operations that allocate inventory need to be counted as demand, even if those operations are canceled later on. The cancelation can be considered an uncertain replenishment, but not a smaller demand. For a top seller associated with a rather stable return percentage, this would eventually mean that the stock on order, which represents the inventory expected to arrive in the future, could be increased according to the amount of sales in process to reflect the expected cancelation rate.
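One possible reading of this adjustment, written as a minimal Python sketch: the function name, its parameters, and the purely linear treatment of the cancelation rate are assumptions made for illustration, not a formula taken from Salescast.

```python
def adjusted_stock_on_order(stock_on_order, sales_in_process, cancelation_rate):
    """Treat expected cancelations as an uncertain replenishment.

    Hypothetical helper: units tied up in in-process sales are expected
    to come back to inventory at the observed cancelation rate.
    """
    if not 0.0 <= cancelation_rate <= 1.0:
        raise ValueError("cancelation_rate must be between 0 and 1")
    return stock_on_order + sales_in_process * cancelation_rate


# 100 units on order, 40 units in process, 5% of orders usually canceled.
print(adjusted_stock_on_order(100, 40, 0.05))
```

The demand itself is left untouched; only the expected future inventory is revised upward.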

Categories: accuracy, insights, supply chain Tags: reorder reverse supply chain No Comments

How bulk purchases impact safety stocks

Published on by Joannes Vermorel.

We have published an introductory tutorial that provides the formula used to compute the reorder point based on the forecasted demand, the demand uncertainty, the lead time and a couple of other factors.

This classical safety stock calculation comes with a couple of key assumptions about the demand. We have already posted about how to handle varying lead times. Then, there is another implicit assumption in the classical formula: buyers are assumed to be independent.

Recently, we have been approached by a company that frequently sells items in bulk to classrooms. Although most of the sales are single-item orders, from time to time a 20-30 item order comes in for a whole classroom. The graph below illustrates the resulting sales pattern of intermittent bulk purchases.

Disclaimer: numbers are made up and some results are wildly approximated for the sake of simplicity.

Over those 12 months, we can see that we have 2 patterns:

  • ongoing single-unit orders, which account for 13 orders per month on average.
  • intermittent bulk sales, each of which adds about 30 orders on average.

The average monthly sales stand at 23 orders per month, but if we remove the bulk order factor, then the average drops to 13 orders per month.

Now, what is the right safety stock in this situation? If we consider the classical safety stock formula with typical settings, then we are going to have a reorder point established at roughly 30 items: the 23 orders per month average, plus the safety stock itself covering the demand uncertainty. Bulk purchases of 30 items are very likely to fall short by a handful of items.

Yet, the classical safety stock calculation is less than optimal: here, we end up storing about twice as many items as we need to address the individual purchases, and yet the safety stock is not high enough to cover big bulk purchases.

In order to address bulk purchases, we need to refine our safety stock formula to take this pattern into account. For the sake of simplicity, we are going to model the bulk purchase pattern as a single factor later reintegrated in the safety stock formula.

In order to reflect the bulkiness of the sales, we could consider the largest purchase for each item being sold. Yet, this value is not robust in the statistical sense, as a single super-large historical purchase can completely skew the results.


Instead, we should rather consider a quantile of the bulk order quantity distribution, as illustrated by the threshold Q in the illustration above, where all orders have been sorted from the smallest quantity to the largest one.

In this safety stock analysis, there is a natural fit for the quantile value to be adopted for Q: it should be equal to the service level, as defined for the classical safety stock formula, that is to say the desired probability of not having a shortage.

Let us call yQ the bulk quantity associated with probability Q (in the illustration above, we have yQ = 30). Technically, yQ is the inverse cumulative distribution function of sales taken at the quantile Q. The reorder point calculation becomes:

R = D + MAX(σL * cdf(P); yQ)

where D is the forecasted demand and σL * cdf(P) is the safety stock as computed based on the demand uncertainty.

Computing yQ in Excel is a bit tedious because there is no equivalent of the PERCENTILE function for the inverse cumulative distribution. We have to resort either to a histogram scheme or to a VBA macro.

The ICMD User Defined Function for Excel, pasted below, performs the yQ calculation, assuming the sales orders are listed in an Excel range and sorted in increasing order.

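' Inverse cumulative distribution
' r: order quantities, sorted in increasing order
' q: quantile, expressed as a value between 0 and 1
Function ICMD(r As Range, q As Double) As Double
    Dim c As Range, d As Range
    ' Computing the total
    Dim s As Double
    For Each c In r
        s = s + c.Value
    Next c
    ' Finding the threshold
    Dim a As Double
    For Each d In r
        a = a + d.Value
        If a >= (q * s) Then
            ICMD = d.Value
            Exit For
        End If
    Next d
End Function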

Based on this refined formula applied to the sample data, we obtain a reorder point R = 13 (demand forecast) + 30 (bulk quantity) = 43, which is sufficient to address the bulk purchase with a high probability while keeping the inventory as low as possible.
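For readers working outside Excel, the same calculation can be sketched in Python. The function names are hypothetical, the order history is made up to match the averages quoted above (156 single-unit orders and 4 bulk orders of 30 units over 12 months), and the safety stock of 7 units is simply the gap between the roughly 30-item classical reorder point and the 23 monthly average.

```python
def bulk_quantile(order_quantities, service_level):
    """Order size at which the cumulative quantity reaches the given share of the total.

    Same logic as the ICMD macro, except that sorting is done here.
    """
    if not order_quantities:
        raise ValueError("order_quantities must not be empty")
    if not 0.0 <= service_level <= 1.0:
        raise ValueError("service_level must be between 0 and 1")
    ordered = sorted(order_quantities)
    threshold = service_level * sum(ordered)
    running = 0
    for quantity in ordered:
        running += quantity
        if running >= threshold:
            return quantity
    return ordered[-1]  # only reached through floating-point rounding


def reorder_point(demand, safety_stock, bulk_quantity):
    """R = D + MAX(safety stock; yQ), as in the refined formula."""
    return demand + max(safety_stock, bulk_quantity)


# Made-up history: 13 single-unit orders per month plus 4 bulk orders a year.
orders = [1] * 156 + [30] * 4
y_q = bulk_quantile(orders, service_level=0.95)
print(reorder_point(demand=13, safety_stock=7, bulk_quantity=y_q))  # 43
```

Because the quantile is taken over cumulative quantities rather than order counts, the few bulk orders weigh heavily as soon as the service level is high.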

Got business-specific constraints too? Don't hesitate to let us know. We can adjust Salescast to better fit your business.

Categories: insights, safety stock, supply chain Tags: formula safety stock supply chain 1 Comment

Refreshing Min/Max inventory planning

Published on by Joannes Vermorel.

Modeling inventory replenishments

Min/Max inventory planning has been available for decades. Yet, some people argue that Min/Max drives higher costs and that it should be replaced with other methods.

Before jumping to conclusions, let’s first try to clarify the situation. For a given SKU (Stock Keeping Unit), the inventory manager needs only two values to specify his inventory management policy:

  • A threshold, named reorder point, which defines if any reorder should be made (Point 3 in the schema).
  • A quantity, named reorder quantity, to be reordered, if any (Point 1 in the schema).

The Min/Max system simply states that:

MIN = ReorderPoint
MAX = ReorderQuantity + InventoryOnHand  + InventoryOnOrder

Thus, as long as you’re not carving your Min & Max values in stone, the Min/Max system is perfectly generic: it can express any reorder policy. As far as inventory optimization is concerned, adopting the Min/Max convention is neutral, as it’s just a way to express your replenishment policy. Contrary to what people seem to believe, Min/Max neither defines nor prevents any inventory optimization strategy.

What about LSSC and Min/Max?

Let’s see how our Safety Stock Calculator can be plugged into a Min/Max framework. The goal is to update the Min & Max values to optimize the inventory based on the forecasts delivered by Lokad.

The calculator reports reorder points. Thus, handling MIN values is rather straightforward since MIN = ReorderPoint. The calculator even lets you export reorder points directly into any 3rd party database. Yet, MAX values are slightly more complicated. The MAX definition states that:

MAX = ReorderQuantity + InventoryOnHand  + InventoryOnOrder

Let’s start with the ReorderQuantity. The safety stock analysis gives us:

ReorderQuantity = LeadDemand + SafetyStock
                             - InventoryOnHand - InventoryOnOrder

Which could be rewritten as:

ReorderQuantity = ReorderPoint - InventoryOnHand - InventoryOnOrder

where ReorderPoint = LeadDemand + SafetyStock. Thus,

MAX = ReorderQuantity + InventoryOnHand + InventoryOnOrder

becomes

MAX = (ReorderPoint - InventoryOnHand - InventoryOnOrder)
    + InventoryOnHand + InventoryOnOrder

Which simplifies into MAX = ReorderPoint, that is to say MAX = MIN.

Obviously there is something fishy going on here. Did you spot what’s wrong in our reasoning?

Well, we haven’t defined any cost associated with order operations. Consequently, the math ends up telling us something rather obvious: without an extra cost for a new order (except the cost of buying the product from the supplier), the optimal planning involves an infinite number of replenishments, where the size of each replenishment tends to zero (or rather tends to 1 if we assume that no fractional product can be ordered).

Getting back to a more reasonable situation, we need to introduce the EOQ (Economic Order Quantity): the minimal amount of inventory that maintains the expected profit margin on the product. Note that our definition differs a bit from the historical EOQ, which is a tradeoff between the fixed cost per order and the holding cost.

In our experience, the EOQ is a complex product-specific mix:

  • It depends on volume discounts.
  • It depends on product lifetime, and potentially expiration dates.
  • It depends (potentially) on other orders being placed at the same time.
  • ...

Thus, we are not going to define EOQ here, as it would go beyond the scope of this post. Instead, we are just going to assume that this value is known to the retailers (somehow). Introducing the EOQ leads to:

MAX = MIN + EOQ


What’s the impact of EOQ on service level?

Let’s have another look at the schema. Point 2 illustrates what happens when the reorder quantity is made larger: the replenishment cycle gets longer too (see Point 4), as it takes more time to reach the reorder point.

Other things being equal, increasing the EOQ also increases the service level, yet in a rather inefficient way, as it leads to a very uniform increase of your inventory levels that is not going to accurately match the demand.

Thus, we suggest taking the smallest EOQ that maintains the desired margin on the products being ordered.
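The Min/Max convention discussed in this post can be summarized with a short Python sketch. Two assumptions are made here and are not spelled out in the post: the reorder is triggered when on-hand plus on-order inventory falls to MIN or below, and the quantity ordered at that point is at least the EOQ, which amounts to MAX = MIN + EOQ. Function names are hypothetical.

```python
def min_max(lead_demand, safety_stock, eoq):
    """Return (MIN, MAX) for one SKU.

    MIN is the reorder point; MAX adds the EOQ on top of it
    (assumption of this sketch: MAX = MIN + EOQ).
    """
    reorder_point = lead_demand + safety_stock
    return reorder_point, reorder_point + eoq


def reorder_quantity(min_level, max_level, on_hand, on_order):
    """Quantity to order now, from MAX = ReorderQuantity + OnHand + OnOrder."""
    position = on_hand + on_order
    if position > min_level:
        return 0  # still above the reorder point, nothing to order
    return max_level - position


low, high = min_max(lead_demand=20, safety_stock=10, eoq=50)
print(low, high)
print(reorder_quantity(low, high, on_hand=25, on_order=0))
```

With an EOQ of zero, the sketch collapses to MAX = MIN, which is the degenerate case derived earlier.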

Categories: insights, safety stock, supply chain Tags: inventory optimization safety stock supply chain 4 Comments

Modeling varying lead time

Published on by Joannes Vermorel.

Yesterday, we discussed why lead times were varying in the first place. Let's go further and see how varying lead times impact safety stock calculations.

Schema of a lead time distribution

Let's start with qualitative insights of a lead time distribution. For the sake of simplicity, we are considering working days here to avoid week-end artifacts.

  1. The lead time distribution starts with a gap (Point 1) that reflects the minimal amount of time needed to perform shipping and transport.
  2. Then, there is the lead time mode, which corresponds to the typical shipping and transport time when the product happens to be available at hand in the supplier inventory. This mode is located at Point 2.
  3. If replenishment takes longer, it's because the supplier has been encountering a shortage. As illustrated by Point 3, the lead time distribution is rather flat, and reflects the lead time mode of the supplier itself, that is to say the amount of time needed by the supplier to replenish its own inventory.
  4. Finally, there are rare situations where replenishment takes even longer (Point 4). This situation happens if both the supplier and the supplier of the supplier get a shortage of their own at the same time; or if there are disruptions at the producer level.

The safety stock model proposed in our sample Excel spreadsheet does not take varying lead times into account. Yet, this formula can be adjusted in a simple way to take lead time variations into account.

If we assume supplier shortages to be independent from those of the retailer being replenished, then the lead time should be adjusted to match the desired service level. Obviously, if the supplier supplies only one company, the retailer itself, this assumption does not make much sense; but it's well suited to the frequent situation where many retailers place orders with a larger wholesaler.

Visually, as illustrated in the schema here above, if the desired service level is at 70%, then the surface of the area colored in orange must represent 70% of the total area under the curve; that way the lead time ends up matching the desired service level.

Looking at the schema, it is clear that the higher the service level, the larger the corresponding lead time, which is a rather reasonable behavior.

In other words, instead of handling the full complexity of the lead time distribution, we propose a mathematical trick where a single lead time quantile that matches the service level is used. This single value reflects the amount of uncertainty borne by the retailer to ensure a certain level of service to its own customers.

Percentile formula in Microsoft Excel

The good news is that Microsoft Excel natively supports quantile calculations through the PERCENTILE function. Thus, you can list all the observed lead times in a single Excel column and then apply the PERCENTILE function, the first argument being your list of observations, the second argument being the service level percentage expressed as a value between 0 and 1 (e.g. 0.30 represents 30%).

Once you have computed this lead time quantile, you can inject the value as such into the Safety Stock Calculator. It will directly reflect the lead time variations in the reorder point calculation.
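Outside Excel, the same quantile can be computed with a few lines of Python. The sketch below mimics the linear interpolation between observations that PERCENTILE performs; the function name `percentile_inc` and the sample lead times are made up for illustration.

```python
def percentile_inc(values, k):
    """Quantile with linear interpolation between sorted observations.

    k is the service level expressed between 0 and 1, as for PERCENTILE.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be between 0 and 1")
    ordered = sorted(values)
    rank = k * (len(ordered) - 1)          # fractional position in the sorted list
    low = int(rank)                        # index of the observation just below
    high = min(low + 1, len(ordered) - 1)  # index of the observation just above
    fraction = rank - low
    return ordered[low] + fraction * (ordered[high] - ordered[low])


# Made-up lead times, in working days.
lead_times = [2, 3, 3, 3, 3, 4, 4, 7, 9, 12]
print(percentile_inc(lead_times, 0.70))  # lead time matching a 70% service level
print(percentile_inc(lead_times, 0.90))  # lead time matching a 90% service level
```

The resulting value can then be used as the single lead time fed into the safety stock calculation.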

Schema of a lead time distribution, higher service level

This analysis, initiated with real-world eCommerce observations, is leading us to an interesting conclusion: in order to ensure high service levels, someone has to take the financial hit as far as inventory levels are concerned.

In our first schema, the orange area was illustrating a lead time associated with a 70% service level (numbers here are made up, it's just for the sake of the explanation), but what happens if the retailer wants to increase its service level?

Well, there is a threshold effect matching the service level of the supplier itself. In the present case, we have a supplier with a service level at 75%. This threshold is caused by the lead time distribution itself, which comes with a strong statistical mode.

If the retailer wants service levels below 75% (i.e. below the supplier's own service level), then the matching lead times are small, e.g. 3 days for the real-world example considered in the previous post.

On the contrary, if the retailer wants service levels above 75%, then the matching lead times inflate very fast. This behavior is visually illustrated by the second schema displaying a 90% service level. As you can see, the matching lead time more than doubles, which mechanically more or less doubles the amount of inventory as well.

As we were saying initially, high service levels - which increase sales along with customer satisfaction - do not come for free. In the end, a company in the chain ends up paying for that. Retailers need to be careful about the service levels offered by their own suppliers, because the threshold effect that we just outlined radically impacts the amount of inventory needed to satisfy their own customers.

Categories: business, insights, safety stock, supply chain Tags: formula lead time model safety stock supply chain 3 Comments

Understanding varying lead time

Published on by Joannes Vermorel.

Recently, quite a few questions have been raised about varying lead times, and how they impact safety stock calculations. Indeed, Salescast associates a static lead time with each SKU.

It's clear though that lead times aren't deterministic and that there is some uncertainty involved when a reorder is sent to a supplier. Thus, we have been - rightfully - asked for a safety stock formula taking into account the lead time uncertainty as well as the future demand uncertainty.

Yet, before jumping straight into the equations, every quantitative business analysis must start with some data at hand to better understand what's going on. Fortunately, Anthony Holloway (from k9cuisine, Dog Food & Dog Treats) was kind enough to hand over to us a small dataset representing a series of observed lead times.

Lead time distribution with calendar days

As expected, lead times are varying. Yet, it can already be noted that it's clearly not a normal distribution (it's way too asymmetric for that). Thus, the uncertainty behind the demand forecast and the uncertainty behind the lead time cannot be handled with a symmetric formula.

Our experience at Lokad tells us that - on average - there is not much variation in lead time due to shipping and transport. Those two operations tend to be fairly deterministic.

Usually, the root cause of lead time variation is supplier-side shortage. Yet, with the graph here above, the shortage pattern isn't that obvious.

Yet, as we thought more about this graph, we noticed a suspicious pattern: the 2 lead time peaks are at 3 and 5 days respectively. This looks an awful lot like a week-end effect to us. In other words, shipping & transport usually take 3 working days, that is to say 3 or 5 calendar days depending on whether a week-end falls in between.

Discussing more with the retailer who had produced the dataset, we got the confirmation that, indeed, lead times were expressed as plain calendar days.

Lead time distribution with working days

Thus, we decided to apply a working days correction to this graph. The green graph here above displays the very same lead time distribution as the previous blue one, but lead times are now expressed as working days instead.

With the new corrected graph, a much sharper pattern emerges. We have a clear two-phase process:

  • the supplier has the product available at hand, in which case the lead time is 3 days on average, with a minor +1/-1 variation.
  • the supplier does not have the product at hand, in which case the lead time takes a random amount of time, with a rather flat distribution reflecting the supplier-side (longer) lead time.

Furthermore, we can even easily evaluate the supplier-side service level, that is to say the probability that the supplier does not hit a shortage when a reorder is made. In the present case, if we consider 5 days as the shortage indicator threshold, then we end up with a service level of 75%, which is within the usual range for retail wholesalers.
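Both steps, the working days correction and the service level estimate, can be sketched in Python. The function names are hypothetical, the sample lead times are made up, public holidays are ignored, and a lead time strictly below the threshold is assumed to mean that the supplier had the product at hand.

```python
from datetime import date, timedelta


def working_days_between(order_date, delivery_date):
    """Count Monday-to-Friday days after order_date, up to and including delivery_date."""
    days = 0
    current = order_date
    while current < delivery_date:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days += 1
    return days


def supplier_service_level(lead_times, shortage_threshold):
    """Share of reorders delivered faster than the shortage threshold."""
    if not lead_times:
        raise ValueError("lead_times must not be empty")
    on_time = sum(1 for lead_time in lead_times if lead_time < shortage_threshold)
    return on_time / len(lead_times)


# Ordered on a Thursday, delivered the following Tuesday: 5 calendar days.
print(working_days_between(date(2024, 1, 4), date(2024, 1, 9)))
# Made-up lead times in working days, with a 5-day shortage threshold.
print(supplier_service_level([3, 3, 2, 4, 3, 3, 9, 12], shortage_threshold=5))
```

The Thursday order illustrates the week-end effect: 5 calendar days shrink to 3 working days once Saturday and Sunday are skipped.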

A key question remains: how should we adjust the Salescast settings to take into account varying lead times? The question will be addressed in the next post. Stay tuned.

Categories: insights, supply chain Tags: formula lead time model supply chain 2 Comments