Gold sponsor ACM programming contest

Published on by Joannes Vermorel.

The future of supply chains depends on critical innovations such as deep learning or Bitcoin, the only blockchain that appears to have at least a plausible chance of delivering the massive on-chain scalability that supply chains require. Young talents capable of delivering tomorrow’s innovation should be attracted and nurtured. Thus, Lokad is proud to announce that - thanks to our sponsors CoinGeek and nChain - we have become a Gold Sponsor of the ACM programming contest (SWERC) for Southwestern Europe.

Programming contests are tough on students. It takes a lot of courage to participate, and naturally a huge amount of talent to have any chance of making it to the top ranks. Yet, it’s also a great opportunity to take part in a high-spirited event, intended to raise the bar of computer science skills of a whole generation.

SWERC participants need to demonstrate superior abilities in solving tough algorithmic problems, which are directly relevant to supply chains. Indeed, crafting flexible, scalable, expressive algorithms is of prime importance in order to obtain more accurate demand forecasts and a better resolution of non-linear optimization problems.

Join us at SWERC 2018 and be part of the future of supply chains!

Categories: Tags: events No Comments

Beyond in-memory databases

Published on by Joannes Vermorel.

Most IT buzzwords age poorly, and for a good reason: most tech that once provided a competitive advantage gets superseded by superior alternatives within a decade or less. Thus, if a software vendor keeps pushing a buzzword past its expiration date (1), then the simplest explanation is that its R&D team has not even realized that the world has moved on.

Anecdotally, multiple venture capitalists have also told me that they were wary of investing in any software company that was more than a few years old, because most companies never manage to decouple their own tech from the tech landscape that defined them when they started.

(1) The tech market itself defines when technology “expires”. When looking at a given piece of technology, at best you can only guesstimate how long it will remain reasonably close to the state of the art.

In-memory databases used to be an IT buzzword, and this did not age well: any software company that markets itself nowadays as delivering in-memory computing or in-memory is pushing outdated pieces of technology to the market (2). Don’t get me wrong though: making the most of the “memory” - more on this later - has never been more important; however, the computing landscape is now more complex than it used to be.

(2) In theory, it could just be the marketing team that happens to be lagging behind, while the tech team has leaped forward already. However, I have never met any software company that was suffering from this problem. You can safely assume that marketing is always ahead of tech.

In the late 90s and early 00s, a specific type of volatile memory, colloquially referred to as RAM, had become affordable enough that increasingly interesting and valuable datasets could fit “in-memory”. At the time, most software was engineered around the idea that RAM was so expensive and limited that going to great lengths of complication, just to restrict RAM pressure as much as possible, was a worthwhile approach. By simply revisiting most problems from a fresh, unconstrained approach, i.e. “in-memory” computing, many software vendors achieved tremendous speed-ups against older products, which relied exclusively on spinning disks.

Fast forward to 2018: in-memory databases are an outdated perspective, and they have been for years already. There are many types of data storage:

  • L1 CPU cache
  • L2/L3 CPU cache
  • Local RAM
  • Local GPU RAM
  • Local SSD
  • Local HDD
  • Remote RAM
  • Remote SSD
  • Remote HDD
  • Tape or Optical storage

I am not even listing newer storage technologies like the Intel Optane, which almost represents a class of devices of its own.

Vendors promoting “in-memory” computing are hinting that their software technology is predominantly geared toward the exploitation of two types of memory: the local RAM and the remote RAM. While making the most of the RAM, both local and remote, is certainly a good thing, it also points to engineering approaches that underuse the alternatives.

For example, over the last two decades, the CPU cache has gone from 512KB to over 60MB for high-end CPUs. With that much CPU cache, it’s now possible to do “in-cache computing”, bringing massive speed-ups over plain “in-memory computing”. However, leveraging the cache does require the minification of many data structures, or even smarter strategies, well beyond what is considered necessary or even desirable from the RAM perspective.
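
To give a rough idea of what minification buys, here is a minimal Python sketch - sizes are indicative and platform-dependent, and this is merely an illustration, not Lokad’s implementation - comparing a pointer-heavy representation of one million demand values against a compact 32-bit array that stands a chance of flowing through the CPU cache in cache-friendly chunks:

import sys
import numpy as np

# One million demand observations, stored two ways.
n = 1_000_000

# Pointer-heavy representation: a Python list of boxed integers.
boxed = list(range(n))
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(v) for v in boxed)

# Compact representation: a contiguous array of 32-bit integers.
compact = np.arange(n, dtype=np.int32)
compact_bytes = compact.nbytes

print(f"boxed list : {boxed_bytes / 1e6:.0f} MB")   # tens of MB, scattered across RAM
print(f"int32 array: {compact_bytes / 1e6:.0f} MB") # 4 MB, contiguous and cache-friendly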

However, only pointing out that CPU cache is faster than local RAM would be missing the point. Nowadays, good software engineering involves maxing out the respective capabilities of all those classes of data storage. Thanks to cloud computing, assembling an ad-hoc mix of computing resources has never been easier.

Thus, Lokad is not delivering “in-memory” software technology, because it would prevent us from taking advantage of the other options that are presently available to us. For example, while we could rent machines with up to 2TB of RAM, it would be needlessly expensive for our clients. There are many algorithms that can be entirely streamed; thus processing TBs of data does not require TBs of RAM, and considering that 1GB of RAM is about 1000x more expensive than 1GB of HDD, it’s not an implementation detail.
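
As an illustration of the streaming argument, here is a minimal Python sketch - the file name and column names are hypothetical - that aggregates demand per product over an arbitrarily large CSV file while holding only one row plus the running totals in RAM:

import csv
from collections import defaultdict

def stream_demand_totals(path):
    """Aggregate the demand per product without ever loading the file in memory."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):  # one row in RAM at a time
            totals[row["product_id"]] += float(row["quantity"])
    return totals

# Works the same on a 1 MB file or a 1 TB file, provided the number of
# distinct products fits in RAM.
# totals = stream_demand_totals("sales_history.csv")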

Our goal is not to adhere to some rigid perspective on software engineering, but to stick to the broader engineering perspective, which consists of doing the most you can with the budget you have. In other words, we are inclined to use in-memory processing whenever it outcompetes the alternatives, but no more.

Then, as computing resources are not completely adjustable on-demand - e.g. you cannot realistically rent a machine without a CPU cache - at Lokad, we strive to make the most of all the resources that are being paid for, even if those resources were not strictly requested in the first place.

Categories: Tags: insights technology No Comments

Book: The Quantitative Supply Chain

Published on by Joannes Vermorel.

My latest book, The Quantitative Supply Chain, is out! This book presents the Lokad way of envisioning supply chains, which is best summarized through our supply chain manifesto. Your supply chain deserves what machine learning and big data have to offer.

The book can be acquired online.

Lokad will also be offering 100 copies for free to supply chain practitioners. Just drop an email to contact@lokad.com titled Free QSC book from your professional address. The free copies will be served on a FIFO basis, which we feel is the proper way to proceed with a supply chain audience.

Categories: Tags: books 1 Comment

From CRPS to cross-entropy

Published on by Joannes Vermorel.

Our deep learning technology is an important milestone for both us and our clients. Some of the changes associated with deep learning are obvious and tangible, even for the non-expert. For example, the Lokad offices are now littered with Nvidia boxes associated with relatively high-end gaming products. When I started Lokad back in 2008, I would certainly not have anticipated that we would have so much high-end gaming hardware involved in the resolution of supply chain challenges.

Then, some other changes are a lot subtler and yet as critically important: transitioning from CRPS (continuous ranked probability score) to cross-entropy is one of those changes.

The systematic use at Lokad of the CRPS metric was introduced at the same time as our 4th generation forecasting engine, our first native probabilistic engine. CRPS had been introduced as a generalization of the pinball-loss function, and it served its purpose well. At the time, Lokad would never have cracked its aerospace or fashion challenges – supply chain wise – without this metric. Yet CRPS, which, roughly speaking, generalizes the mean absolute error to probabilistic forecasts, is not without flaws.

For example, from the CRPS perspective, it’s OK to assign a zero probability to an estimated outcome, as long as the bulk of the probability mass isn’t too far off from the actual observed outcome. This is exactly what you would expect from a generalization of the mean absolute error. Yet, this also implies that the probabilistic models may claim with absolute certainty that certain events won’t happen, while those events do indeed happen. This sort of vastly incorrect statistical statement about the future comes with a cost that is structurally underestimated by CRPS.

The cross-entropy, in contrast, assigns an infinite penalty to a model that is proven wrong after assigning a zero probability to an outcome that does happen nonetheless. Thus, from the cross-entropy perspective, models must embrace the “all futures are possible, just not equally probable” perspective. Assigning a uniform zero probability whenever there isn’t sufficient data for an accurate probability estimate isn’t a valid answer anymore.
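
A minimal Python sketch of the contrast - the toy forecast and the discrete CRPS below are purely illustrative, not Lokad’s implementation - shows how a forecast that assigns a zero probability to the observed outcome keeps a finite CRPS yet receives an infinite cross-entropy penalty:

import numpy as np

# Toy probabilistic forecast over demand levels 0..5 (probabilities sum to 1).
probs = np.array([0.0, 0.5, 0.3, 0.2, 0.0, 0.0])
observed = 0  # the outcome the model declared impossible

# Discrete CRPS: squared distance between the forecast CDF and the observed step CDF.
levels = np.arange(len(probs))
cdf_forecast = np.cumsum(probs)
cdf_observed = (levels >= observed).astype(float)
crps = np.sum((cdf_forecast - cdf_observed) ** 2)

# Cross-entropy: negative log-likelihood of the observed outcome.
with np.errstate(divide="ignore"):
    cross_entropy = -np.log(probs[observed])

print(f"CRPS          : {crps:.2f}")       # finite, looks acceptable
print(f"Cross-entropy : {cross_entropy}")  # inf: a zero probability is never forgiven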

However, cross-entropy is not only superior from a purely theoretical perspective. In practice, using cross-entropy to drive the statistical learning process ultimately yields models that happen to be superior against both metrics: cross-entropy and CRPS; even if CRPS happens to be absent from the optimization process altogether.

Cross-entropy is the fundamental metric driving our 5th generation forecasting engine. This metric substantially departs from the intuition that was backing our older forecasting engines. For the first time, Lokad adopts a full Bayesian perspective on statistical learning, while our previous iterations were more grounded in the frequentist perspective.

Check out our latest knowledge base entry about cross-entropy.

Categories: Tags: statistics insights forecasting measure No Comments

Deep Learning as 5th gen forecasting engine

Published on by Joannes Vermorel.

As part of our core commitment to delivering the most accurate forecasts that technology can produce, we are proud to announce that our 5th generation forecasting engine is now live at Lokad. This engine brings the largest accuracy improvement that we have ever managed to achieve in a single release. The engine’s design relies on a relatively recent flavor of machine learning named deep learning. For supply chains, large forecasting accuracy improvements can translate to equally large returns: serving more clients, serving them faster, while facing less inventory risk.

From probabilistic forecasting to deep learning

About 18 months ago, we announced the 4th generation of our forecasting technology. The 4th gen was the first to deliver true probabilistic forecasts. Probabilistic forecasts are essential in supply chains because the costs are concentrated on the statistical extremes, when demand happens to be unexpectedly high or low. In contrast, traditional forecasting methods - like classic daily, weekly or monthly forecasts - that only focus on delivering median or average forecasts, are blind to the problem. As a consequence, those methods usually fail to deliver satisfying returns for companies.

Partly by chance, it turns out that deep learning happens to be heavily geared toward probabilistic forecasts by design. The motivation for this perspective was, however, entirely unrelated to supply chain concerns. Deep learning algorithms favor optimization built on top of a probabilistic / Bayesian perspective, with metrics like cross-entropy, because these metrics provide large gradient values that are especially suitable for stochastic gradient descent, the “one” algorithm that makes deep learning possible.

In the specific case of supply chains, it happens that the foundations of deep learning are fully aligned with the actual business requirements!

Beyond the hype of artificial intelligence

Artificial intelligence - powered by deep learning in practice - has been the buzzword of the year in 2017. Claims are bold, enthralling and, well, fuzzy. From Lokad’s vantage point, we observe that the majority of these enterprise AI techs are not living up to their expectations. Very few companies can secure over half a billion USD in funding, like Instacart, to gather a world-class deep learning team in order to successfully tackle a supply chain challenge.

With this release, Lokad is making AI-grade forecasting technology accessible to any reasonably “digitalized” company. Obviously, the whole thing is still powered by historical supply chain data, so the data must be accessible to Lokad; but our technology requires zero deep learning expertise. Unlike virtually every single “enterprise” AI tech, Lokad does not rely on manual feature engineering. As far as our clients are concerned, the upgrade from our previous probabilistic forecasts to deep learning will be seamless.

Lokad is the first software company to provide a turnkey AI-grade forecasting technology, accessible to tiny one-man ecommerces, yet scaling up to the largest supply chain networks, which can include thousands of locations and a million product references.

The age of GPU computing

Deep learning remained somewhat niche until the community managed to upgrade its own software building blocks to take advantage of GPUs (graphics processing units). Those GPUs differ greatly from CPUs (central processing units), which still power the vast majority of apps nowadays, with the notable exception of computer games, which rely intensively on both CPUs and GPUs.

Along with the complete rewrite of our forecasting engine for this 5th iteration, we have also significantly upgraded the low level infrastructure of Lokad. Indeed, in order to serve companies, the Lokad platform now leverages GPUs as well as CPUs. Lokad is now taking advantage of the GPU-powered machines that can be rented on Microsoft Azure, the cloud computing platform that supports Lokad.

Through the massive processing power of the GPUs, we are not only making our forecasts more accurate, we are making them much faster too. Through a grid of GPUs, we are now typically getting the forecasts about 3x to 6x faster, for any sizeable dataset (*).

(*) For ultra-small datasets, our 5th gen forecasting engine is actually slower, and takes a few more minutes - which is largely inconsequential in practice.

Product launches and promotions

Our 5th generation forecasting engine brings substantial improvements to hard forecasting situations, most notably product launches and promotions. From our perspective, product launches, albeit very difficult, remain a tad easier than promotion forecasts. The difference in difficulty is driven by the quality of the historical data, which is invariably lower for promotions compared to product launches. Promotion data gets better over time once the proper QA processes are in place.

In particular, we are seeing deep learning as a massive opportunity for fashion brands who are struggling with product launches that dominate their sales: launching a new product isn’t the exception, it’s the rule. Then, as color and size variants vastly inflate the number of SKUs, the situation is made even more complex.

Early access to deep forecasting

We are planning to gradually upgrade our entire client base to our newest forecasting engine. This gradual deployment is intended to make sure we don’t inadvertently introduce regressions where the latest version could happen to be less accurate than the older one. As version 5.0 is externally identical to version 4.0, the upgrade will be fully transparent. Clients will only notice the extra accuracy. By the end of Q1 2018, all the forecasts generated through Envision will be powered by 5.0.

If you wish for early access to deep learning at Lokad for your company, just drop us an email.

Categories: Tags: forecasting release No Comments

The age of online supply chains

Published on by Joannes Vermorel.

We are proud to announce that Lokad now has a native integration with Piwik, an open source web tracking software. In short, Piwik is just like Google Analytics, except that it gives your company full control over your own web traffic data. In particular, it makes sense not to trust the same company with both your SEM (search engine marketing) spending and your web conversion metrics: Piwik gives you precisely that.

Predicting the next trends is a tough problem in most verticals. Some verticals, like fashion, are even more erratic than others, making the problem even more acute. From Lokad’s data science perspective, any data that could prove valuable in identifying these trends more accurately is welcome. Leveraging the historical sales data is obvious, but what about the web traffic itself?

Customers start looking for products before they buy them. While most customers won’t give you a six-month head start by browsing the product they intend to buy, they are likely to give you some hints a few days in advance (from time to time). This extra information could be put to good use, and make the difference between a product that is immediately available and a product that ends up out of stock at the worst time possible.

The idea of leveraging this behavior from web visitors isn’t new. There is an entire industry known as retargeting dedicated to this very challenge from a marketing perspective. However, Lokad is now bringing this perspective to supply chains. More than ever, the future of supply chains will be online.

Categories: Tags: release bigdata partners No Comments

Periodic demand forecast

Published on by Joannes Vermorel.

As a rule of thumb, whenever supply chain is involved, probabilistic forecasts yield superior results compared to traditional periodic forecasts, i.e., forecasts expressed per day, week or month. Yet, there are also a few situations where demand uncertainty is very low, such as when demand is very steady and non-sparse. In those situations, it might still make sense to consider periodic demand forecasts.

Therefore, we have extended our latest forecasting engine to support periodic forecasts. This feature is live and readily available for all Lokad accounts. The output of the periodic forecast takes the form of a table with 4 columns:

  • the product identifier
  • the target date for the forecast
  • the mean value for the demand
  • the sigma (square root of the variance) for the demand

In particular, unlike the probabilistic forecast, the periodic forecast does return fractional demand values, even if the historical demand is expressed as pure integers, typically reflecting the number of units sold.

Under the hood, this periodic forecast is obtained by generating a probabilistic forecast, and by projecting the mean and the variance of the resulting distributions. This periodic forecast benefits from all the perks associated with the probabilistic forecast, such as native support for stockouts.
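
As a minimal Python sketch of that projection - assuming the probabilistic forecast is given as a vector of probabilities over integer demand levels - the mean and sigma fall out directly from the distribution:

import math

def project_mean_sigma(probs):
    """Collapse a discrete demand distribution into (mean, sigma)."""
    mean = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, math.sqrt(variance)

# Probabilities for a demand of 0, 1, 2 or 3 units over the period.
mean, sigma = project_mean_sigma([0.1, 0.4, 0.3, 0.2])
print(mean, sigma)  # 1.6 and ~0.92: fractional, even though the demand is integral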

Categories: Tags: insights forecasting release No Comments

Text mining for better demand forecasts

Published on by Joannes Vermorel.

We are proud to announce that Lokad now features text mining capabilities that assist its forecasting engine in delivering accurate demand forecasts, even when looking at products associated with sparse and intermittent demand that do not benefit from attributes such as categories and hierarchies. This feature is live; check out the label option of our forecasting engine.

The primary forecasting challenge faced by supply chains is the sparsity of the data: most products don’t have a decade’s worth of relevant historical data and aren’t served by thousands of units when considering the edges of the supply chain network. Traditional forecasting methods, which rely on the assumption that the time series are both long and non-sparse, perform poorly for this very reason.

Lokad is looking at supply chain historical data from another angle: instead of looking at the depth of the data, which tends to be nonexistent, we are looking at the width of the data, that is, all the correlations that exist between the products. As there are frequently thousands of products, many correlations can be leveraged to improve the forecasting accuracy significantly. Yet, when establishing those correlations, we cannot rely on the demand history alone, because many products, such as the products that are about to be launched, don’t even have historical data yet. Thus, the forecasting engine of Lokad has introduced a mechanism to leverage categories and hierarchies instead.

Leveraging categories and hierarchies for increased forecasting accuracy works great. However, this approach suffers from one specific limitation: it relies on the availability of categories and hierarchies. Indeed, many companies haven’t invested much in master data setups, and, as a result, cannot benefit from much fine-grained information about the products that flow through the supply chain. Previously, when no category and no hierarchy were available, our forecasting engine was essentially crippled in its capability to cope with sparse and intermittent demand.

The new text mining capabilities of the Lokad forecasting engine are a game changer: the engine is now capable of processing the plain-text description of products to establish the correlations between the products. In practice, we observe that while companies may be lacking proper categorizations for their products, a plain-text description of the products is nearly always available, dramatically improving the applicability of the width-first forecasting perspective of Lokad.

For example, if a diverse set of products happens to be named “Something Christmas”, and all those products exhibit a consistent seasonal spike before Christmas, then the forecasting engine can identify this pattern and automatically apply the inferred seasonality to a new product that has the keyword “Christmas” in its description. This is exactly what happens under the hood at Lokad when plain-text labels are fed to the forecasting engine.

Our example above is simplistic, but, in practice, text mining involves uncovering complex relationships between words and demand patterns that can be observed in the historical data. Products sharing similar descriptions may share similar trends, similar life-cycles, similar seasonalities. However, two products with similar descriptions may share the same trend but not the same seasonality, etc. The forecasting engine of Lokad is based on machine learning algorithms that automatically identify the relevant information from the plain-text descriptions of the products. The engine requires no preprocessing of the product descriptions.
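
As a heavily simplified Python sketch of the intuition - the actual engine relies on machine learning, not on naive keyword matching - each label can be turned into a bag of tokens, and the overlap between labels hints at which past products a new product resembles:

def tokens(label):
    """Lowercase bag-of-words representation of a product label."""
    return set(label.lower().split())

def similarity(label_a, label_b):
    """Jaccard overlap between two product labels."""
    a, b = tokens(label_a), tokens(label_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# A new product inherits demand patterns from the most similar past products.
past = ["Red Christmas sweater", "Blue summer dress", "Christmas mug gift set"]
new = "Green Christmas sweater"
print(sorted(past, key=lambda p: similarity(new, p), reverse=True)[:2])
# ['Red Christmas sweater', 'Christmas mug gift set']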

Our motto is to make the most of the data you have. With text mining capabilities, we are once again lowering the requirements to bring your company to the age of quantitative supply chains. Any questions? Just drop us a line at contact@lokad.com.

Categories: Tags: forecasting insights technology release No Comments

Entropy analysis for supply chain IT system discovery

Published on by Joannes Vermorel.

The IT landscape of supply chains is nearly always complex. Indeed, by nature, supply chains involve multiple actors, multiple sites, multiple systems, etc. As a result, building data-driven insights in supply chains is a challenge due to the sheer heterogeneity of the IT landscape. Too frequently, supply chain analytics deliver nonsensical results precisely because of underlying garbage in, garbage out problems.

At Lokad, we have not only developed a practice that thoroughly surveys the IT landscape and the datasets inhabiting it, but we have also created some bits of technology to facilitate the surveying operations themselves. In this post, we detail one step of our surveying methodology, which is based on Shannon entropy. We successfully leveraged entropy analysis for several large scale supply chain initiatives.

Our surveying process starts by reviewing all the database tables that are understood to be relevant for the supply chain initiative. Supply chains are complex and, consequently, the IT systems that operate the supply chains reflect this complexity. Moreover, the IT systems might have been evolving for several decades, and layers of complexity tend to take over those systems. As a result, it’s not uncommon to identify dozens of database tables with each table having dozens of columns, i.e. fields in the database terminology.

For large supply chains, we have observed situations where the total number of distinct fields is above 10,000. By leveraging entropy analysis, we are able to remove half of the columns from the picture immediately and consequently reduce the remaining amount of work significantly.

Dealing with that many columns is a major undertaking. The problem is not data-processing capabilities: with cloud computing and adequate data storage, it’s relatively straightforward to process thousands of columns. The real challenge is to make sense of all those fields. As a rule of thumb, we estimate that well-written documentation of a field takes up about one page, when interesting use cases and edge cases are covered. Without proper documentation of the data, data semantics are lost, and odds are very high that any complex analysis performed on the data will suffer from massive garbage in garbage out headaches. Thus, with 10,000 fields, we are faced with the production of a 10,000-page manual, which demands a really monumental effort.

Yet, we have observed that those large IT systems also carry a massive amount of dead weight. While the raw number of fields appears to be very high, in practice, it does not mean that every column found in the system contains meaningful data. At the extreme, a column might be entirely empty or constant and, thus, contain no information whatsoever. A few fields can be immediately discarded because they are truly empty. However, we have observed that fully empty fields are actually pretty rare. Sometimes, the only non-constant information in the column dates from the day when the system was turned on; the field was never to be reused afterward. While truly empty fields are relatively rare, we usually observe that degenerate fields are extremely numerous: columns with almost no data, well below any reasonable threshold to leverage this data for production purposes.

For example, a PurchaseOrders table containing over one million rows might have an Incoterms column that is non-empty in only 100 rows; furthermore, all those rows are more than five years old, and 90 rows contain the entry thisisatest. In this case, the field Incoterms is clearly degenerate, and there is no point in even trying to make sense of this data. Yet, a naive SQL filter will fail to identify such a column as degenerate.

Thus, a tool to identify degenerate columns is needed. It turns out that Shannon entropy is an excellent candidate. Shannon entropy is a mathematical tool to measure the quantity of information that is contained in a message. The entropy is measured in Shannons, which is a unit of measurement somewhat akin to bits of information. By treating the values found in the column itself as the message, Shannon entropy gives us a measure of the information contained in the column expressed in Shannons.

While all of this might sound highly theoretical, putting this insight into practice is extremely straightforward. All it takes is to use the entropy() aggregator provided by Envision. The tiny script below illustrates how we can use Envision to produce the entropy analysis of a table with 3 fields.

read "data.csv" as T[*]
show table "List of entropies" with
  entropy(T.Field1)
  entropy(T.Field2)
  entropy(T.Field3)


An entropy lower than 0.1 is a very good indicator of a degenerate column. If the entropy is lower than 0.01, the column is guaranteed to be degenerate.
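
For readers who want to reproduce this measurement outside of Envision, here is a minimal Python sketch - assuming the table fits in a pandas DataFrame - that computes the same per-column entropy, in Shannons (base-2 logarithm):

import math
import pandas as pd

def shannon_entropy(column):
    """Entropy, in Shannons, of the empirical distribution of a column's values."""
    freqs = column.value_counts(normalize=True, dropna=False)
    return -sum(p * math.log2(p) for p in freqs if p > 0)

df = pd.read_csv("data.csv")
for field in df.columns:
    print(field, round(shannon_entropy(df[field]), 3))  # well below 0.1: degeneracy suspect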

Our experience indicates that performing an initial filtering based on entropy measurements reliably eliminates between one-third and two-thirds of the initial fields from the scope of interest. The savings in time and effort are very substantial: for large supply chain projects, we are talking about man-years being saved through this analysis.

We unintentionally discovered a positive side effect of entropy filtering: it lowers the IT fatigue associated with the (re)discovery of the IT systems. Indeed, investigating a degenerate field typically proves to be an exhausting task. As the field is not used - sometimes not used any more - nobody is quite sure whether the field is truly degenerate or if the field is playing a critical, but obscure, role in the supply chain processes. Because of the complexities of supply chains, there is frequently nobody who can positively affirm that a given field is not used. Entropy filtering immediately eliminates the worst offenders that are guaranteed to lead us on a wild-goose chase.

Categories: Tags: supply chain envision insights No Comments

The illustrated stock reward function

Published on by Joannes Vermorel.

The stock reward function is a key ingredient to make the most of probabilistic forecasts in order to boost your supply chain performance. The stock reward is used for computing the return on investment for every extra unit of stock to be purchased or manufactured.

The stock reward function is expressive and can be used like a mini-framework for addressing many different situations. However, as a minor downside, it’s not always easy to make sense of the calculations performed with the stock reward function. Below you’ll find a short list of graphs that represent the various transformations applied to the forecasts.

The first graph - entitled Future demand - represents a probabilistic demand forecast associated with a given SKU. The curve represents a distribution of probabilities, with the total area under the curve equal to one. In the background, this future demand is implicitly associated with a probabilistic lead time forecast, also represented as a distribution of probabilities. Such a distribution is typically generated through a probabilistic forecasting engine.

The Marginal fill rate graph represents the fraction of extra demand that is captured by each extra unit of stock. In other words, this graph demonstrates what happens to the fill rate as the stock increases. Since we are representing a marginal fill rate here, the total area under the curve remains equal to one. The marginal fill rate distribution can be computed with the fillrate() function.
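
As a minimal Python sketch of that transformation - a plain re-derivation for illustration, not the implementation behind fillrate() - the marginal fill rate of the k-th unit of stock is the probability that the demand reaches at least k units, normalized by the mean demand so that the area under the curve stays equal to one:

import numpy as np

def marginal_fill_rate(demand_probs):
    """demand_probs[k] = P(demand = k); returns the marginal fill rate of each extra unit."""
    probs = np.asarray(demand_probs, dtype=float)
    mean_demand = float(np.dot(np.arange(len(probs)), probs))
    # The k-th unit of stock only serves demand when at least k units are demanded,
    # hence its marginal contribution is P(demand >= k).
    tail = 1.0 - np.cumsum(probs)[:-1]
    return tail / mean_demand

mfr = marginal_fill_rate([0.1, 0.2, 0.4, 0.2, 0.1])
print(mfr, mfr.sum())  # decreasing weights, total area equal to one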

The Demand with backorders graph is identical to the Future demand graph, except that 8 units have been introduced to represent a backorder. The backorder represents guaranteed demand since these units have already been bought by clients. As a result, when backordered units are introduced, the probability distribution of demand is shifted to the right, as the backordered units represent guaranteed demand. The shift operator >> is available as part of the algebra of distributions to compute such a transformation over the initial distribution.
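
As a minimal Python sketch of the effect of such a shift - an illustration of the idea, not the Envision operator itself - the support is extended and all the probability mass moves up by the number of backordered units:

import numpy as np

def shift_right(demand_probs, backordered_units):
    """Extend the support and move all the probability mass up by the backordered units."""
    return np.concatenate([np.zeros(backordered_units), np.asarray(demand_probs, dtype=float)])

shifted = shift_right([0.1, 0.2, 0.4, 0.2, 0.1], 8)
print(shifted.argmax())  # the mode moved from a demand level of 2 to a demand level of 10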

The Fill rate with backorders graph is also very similar to the original Marginal fill rate graph, but has also been shifted 8 units to the right. Here, the plotted fill rate is only associated with the uncertain demand, hence the shape of the distribution remains the same.

The Margin graph represents the margin economic reward as computed by the stock reward function taking the Demand with backorders as input. The stock reward can be visualized as a distribution, but this is not a distribution of probabilities: the area under the curve is not equal to one but is instead equal to the total margin which would be captured with unlimited inventory. On the left of the graph, each backordered unit yields the same margin, which is not surprising as there is no uncertainty in capturing the margin given that the units have already been bought.

The Stockout penalty represents the second component of the stock reward function. The shape of the distribution might feel a bit unexpected, but this shape merely reflects that, by construction of the stock reward function, the total area under the curve is zero. Intuitively, starting from a stock level of zero, we have the sum of all the stockout penalties as we are missing all the demand. Then, as we move to the right with higher stock levels we are satisfying more and more demand and thus further reducing the stockout penalties; until there is no penalty left because the entire demand has been satisfied. The stock-out penalty of not serving backorders is represented as greater than the penalty of not serving the demand that follows. Here, we are illustrating the assumption that clients who have already backordered typically have greater service expectations than clients who haven’t yet bought any items.

The Carrying costs graph represents the third and last component of the stock reward function. As there is no upper limit for the carrying costs - it’s always possible to keep one more unit in stock, thus further increasing the carrying costs - the distribution is divergent: it tends to negative infinity on the right. The total area under the curve is negative infinity, although this is a rather theoretical perspective. On the left, the carrying costs associated with the backordered units are zero: indeed, as those units have already been bought by clients they won’t incur any carrying costs, since those units will be shipped to clients as soon as possible.

The final stock reward - not represented above - would be obtained by summing the three components of the stock reward function. The resulting distribution would be interpreted as the ROI for each extra unit of stock to be acquired. This distribution typically starts with positive values, the first units of stock being profitable, but converges to negative infinity as we move to higher stock levels, given the unbounded carrying costs.

The term support classically refers to the demand levels associated with non-zero probabilities. In the graphs above, the term support is used loosely to refer to the entire range that needs to be processed as non-zero values by Envision. In particular, it’s worth mentioning that there are multiple calculations that require the distribution support to be extended in order to make sure that the final resulting distribution isn’t truncated.

  • The shift operation, which happens when backorders are present, requires the support to be increased by the number of backordered units.
  • The margin and carrying cost components of the stock reward function have no theoretical limits on the right, and can require arbitrarily large extensions of the support.
  • Ordering constraints, such as MOQs, may require having inventory levels that are even greater than the ones reached by the shifted distributions. Properly assessing the tail of the distribution is key for estimating whether the MOQ can be profitably satisfied or not.

In practice, the Envision runtime takes care of automatically adjusting the support to make sure that distributions aren’t truncated during the calculations.

Categories: Tags: insights inventory optimization No Comments

The Supply Chain Scientist

Published on by Joannes Vermorel.

Artificial intelligence has been making steady progress over the last few decades. However, while self-driving cars might be just around the corner, we are still decades away from having software smart enough to devise a supply chain strategy. Yet, at the same time, it would be incorrect to conclude that supply chain as a whole is still decades away from being positively impacted by machine learning algorithms.

Lokad’s supply chain science competency was born out of the observation that while algorithms alone were insufficient, they actually became formidable enablers in the hands of capable supply chain experts. Machine learning offers the possibility to achieve unprecedented levels of supply chain performance by taking care of all the extensive but otherwise clerical micro-decisions that your supply chain requires: when to order a product, when to move a unit of stock, when to produce more items, etc.

The Supply Chain Scientist is a mix between a data scientist and a supply chain expert. This person is responsible for the proper data preparation and the proper quantitative modelling of your supply chain. Indeed, it takes human supply chain insights to realize that some relevant data may be missing from a project and to align the optimization parameters with the supply chain strategy of the company.

Too often, supply chain initiatives come with fragmented responsibilities:

  • Data preparation is owned by the IT team
  • Statistics and reporting is owned by the BI (business intelligence) team
  • Supply chain execution is owned by the supply chain team

The traditional S&OP answer to this issue is the creation of collective ownership through monthly meetings between many stakeholders, ideally having the whole thing owned by the CEO. However, while we are certainly not opposed to the principle of collective ownership, our experience indicates that things tend to move forward rather slowly when it comes to traditional S&OP.

In contrast to the collective ownership established through scheduled meetings, the Supply Chain Scientist holds the vital role of taking on the end-to-end ownership of all the quantitative aspects of a supply chain initiative.

This focused ownership is critical in order to avoid the all-too-common pitfalls associated with traditional supply chain organizations, which are:

  • Data is incorrectly extracted and prepared, primarily because the IT team has limited insights in relation to the use of the data.
  • Statistics and reporting misrepresent the business; they provide less-than-useful insights and suffer from less-than-perfect data inputs.
  • Execution relies heavily on ad-hoc Excel sheets in order to mitigate the two problems described above, while creating an entire category of new problems.

When we begin a quantitative supply chain initiative with a client company, we start by making sure that a Supply Chain Scientist is available to execute the initiative.

Learn more about supply chain scientists

Categories: Tags: insights supply chain 1 Comment

Fashion demand forecasting

Published on by Joannes Vermorel.

Forecasting is hard. Forecasting the future of fashion is insanely hard. As a result, for the most part, the fashion industry still relies on crude methods such as Open-To-Buy, which are nothing but glorified top-down moving averages. Yet, most supply chain practitioners would argue that as long as there isn’t something that can actually beat Open-To-Buy in the real world, then Open-To-Buy isn’t outdated, no matter how crude the method might be. In fact, until recently, our own observations were aligned with what fashion companies were telling us: nothing really works for fashion, and guesswork remains the best option among all the other, even less satisfactory, alternatives.

Our probabilistic forecasting engine, released last year, became a game changer for fashion. After years of struggling with fashion demand patterns, we finally have a forecasting engine that is natively architected for the specific challenges of the fashion sector. Over the last couple of months, we have been driving the supply chains of multiple fashion companies, and well, it actually works! Considering the track record of forecasting vendors in the fashion industry, the odds weren’t really in our favor.

Demand in fashion is typically driven by novelty, and new products come together through collections. Collections are essential from the fashion perspective; yet, at the same time, they represent a massive forecasting challenge.

The demand needs to be forecast for products that haven’t been sold yet.

Fashion isn’t about products that haven’t been sold for long, fashion is about products that haven’t been sold at all. This perspective is a fundamental mismatch with the time-series forecasting approach that represents the foundation of nearly all forecasting systems - not Lokad though. Indeed, from a time-series perspective, in the case of fashion, time-series have zero historical depth, hence there is nothing to rely on for the purpose of forecasting.

Lokad’s probabilistic forecasting engine takes a completely different stance: it actively leverages the various product attributes - brand, style, color, fabric, size, price point, category, etc. - in order to build a demand forecast based on the performance of similar products in the previous collections.

One of the things that Lokad’s forecasting engine does not do is to require products to be manually paired between collections. First, establishing those pairs is very complicated and extremely time-consuming. Supply chain practitioners are not supposed to be the slaves of their own systems; if the systems require thousands of products to be manually paired, chances are that this time would be better invested in producing a manual forecast that directly benefits from human insights. Second, in fashion, 1-to-1 mapping between the old and the new collections does not actually make any sense most of the time. New collections are likely to redefine the codes in subtle yet important ways: one product may become many, and vice-versa. A methodology that exclusively relies on 1-to-1 pairings is guaranteed to deliver rather naive results about the future collections.

Lokad’s forecasting engine is all about computing all those similarities in a fully automated manner through machine learning algorithms. Artificial Intelligence is now all the rage in the media, but under the surface it boils down to machine learning algorithms that have undergone steady and yet gradual progress over the last 3 decades. Lokad leverages several classes of machine learning algorithms, tailored for supply chain purposes.

In addition, Lokad delivers probabilistic forecasts. Instead of delivering one demand forecast - the median or the mean - that is (nearly) guaranteed to be incorrect, Lokad delivers the probabilities for (nearly) all the demand scenarios. This aspect is of critical importance for the fashion industry because uncertainty is irreducible; and a good supply order frequently boils down to a risk analysis.

In fashion, the two main risks are lost opportunities if there is not enough stock, and stock depreciations if the goods have to be sold with a very aggressive discount during the sales period - in order to liquidate the remaining stocks of a collection. Lokad has native capabilities to deal with this specific risk analysis that is so important in fashion.

Intrigued by Lokad’s capabilities for fashion? Don’t hesitate to book a demo call with us.

Categories: Tags: forecasting fashion No Comments

Machine learning jobs at Lokad

Published on by Joannes Vermorel.

Machine learning and artificial intelligence have become buzzwords. Given that Lokad has come to be identified as one of the key European companies that generate real-world decisions driven by machine learning - supply chain decisions actually - we are getting a growing number of applicants.

The good news: we are still hiring!

In this post, we review the three realms of machine learning that exist at Lokad and what you need to do to maximize the odds of getting an interview with us, and ideally be hired afterwards.

Kudos to the applicants who will be able to mention that they have read this blog post during their interview. Smart people are curious people, and if you can’t be bothered doing a bit of research on your future employer, you’re probably not fit for the machine learning industry anyway.

Job 1: Predictive business modelling

Improving the supply chain performance of a company through machine learning takes significant effort. The data needs to be well-prepared. The resolution of the challenge should be fully aligned with the vision and the strategy of the client company. The supply chain teams should be coached to embrace a new and more capable analytical solution. Measurable results should be collected, and one should be prepared to have these results challenged by top management. At Lokad, the data modelling team, or more simply put, the data team, is responsible for tackling those challenges.

For this specific position, we are looking for engineers with a strong analytical mindset who are capable not only of understanding the strengths and the limitations of the machine learning engines that are made available to them, but also of implementing real-life set-ups that will be integrated into the daily workflows of real-world supply chains. Improvements are real and mistakes are real too. In your interview, it is advised to demonstrate your understanding of the Lokad product as documented on our website. Bonus points if you can outline how Lokad’s technology can be used to address actual supply chain challenges.

Job 2: Crafting the Big Data infrastructure

Machine learning is critically dependent on data. In fact, the more data is available, the better machine learning works. Lokad seeks talented software engineers who can design all the infrastructure that supports the different machine learning bits. The importance of the whole data pipeline is not to be underestimated: a deficient pipeline is one of the primary causes of failure of data-driven initiatives. The infrastructure needs to be not only fast and reliable, but also able to cope with the hefty computing requirements of the machine learning algorithms themselves.

For this role, we are looking for software engineers with a strong taste for complex distributed back-office processing. You should not be afraid of tackling complicated algorithms, such as dealing with a radix tree, and of implementing such algorithms yourself. Ideally, in your interview, you should demonstrate not only your capacity to understand and implement this kind of algorithmic processing, but also to deliver code that can be maintained and that is fit for production.

Job 3: Hardcore machine learning science

Most modern machine learning algorithms are complicated not only from a statistical perspective, but also from a purely algorithmic perspective. Lokad seeks talented mathematicians who are willing to acquire the software development skills it takes to implement those “hardcore” machine learning algorithms. We have developed our own suite of algorithms which are specifically designed for supply chain needs. Do not expect to plug in an open source machine learning toolkit and move on: our clients are critically dependent on algorithms that have been designed to accommodate specific supply chain challenges.

For this position, we are looking for mathematicians or software developers with a strong propensity for numerical analysis and optimization, who have the ambition to deal with stunningly difficult problems. You should not be afraid of rolling out your own class of algorithms which may be somewhat unlike what is considered to be “mainstream” machine learning. Ideally, in your interview, you should be able to demonstrate why Lokad requires alternative approaches and maybe even shed some personal insights on the case.

More about Careers at Lokad.

Categories: Tags: hiring machine learning No Comments

The test of supply chain performance

Published on by Joannes Vermorel.

Answering these 12 questions tells more about your supply chain performance than nearly all benchmarks and audits that the market has to offer. This test should take about 5 minutes of your time.

  1. Can supply chain operate without Excel?
  2. Is ABC analysis regarded as obsolete?
  3. Is all relevant data documented by the supply chain teams?
  4. Do you record historical stock levels?
  5. Do supply chain teams monitor the quality of their data?
  6. Do you forecast lead times?
  7. Do you prevent any manual intervention on forecasts?
  8. Do you manage operational constraints, e.g. MOQs, as data?
  9. Do you quantify the cost of supply chain failures?
  10. Can decision-making systems be left unattended for a week?
  11. Can you recompute all decisions in one hour?
  12. Are all decisions prioritized against each other?

If your company isn’t answering yes to at least 10 of those questions, then a supply chain initiative has the potential to deliver a sizeable ROI. If you don’t have 8 positive answers, the ROI can be expected to be massive. If your company does not reach 6 positive answers, then, in our book, supply chain optimization hasn’t even started yet.

Read more about the Lokad test of supply chain performance.

Categories: Tags: supply chain insights No Comments

2017, year of quantitative supply chain

Published on by Joannes Vermorel.

Thanks to the probabilistic forecasting engine that we released last year, our capacity to optimize supply chains has dramatically improved over the last couple of months. Through our growing experience, we have come to realize that there are 5 principles that drive the success of the supply chain initiatives undertaken by Lokad:

  1. All possible futures must be considered; a probability for each possibility.
  2. All feasible decisions must be considered; an economic score for each possibility.
  3. There are no absolutes, only relative costs and opportunities.
  4. Being in control requires automation of every mundane task.
  5. Data requires more effort and brings more returns than you expect.

We decided to name this approach: quantitative supply chain.

You can also read our full Quantitative Supply Chain Manifesto

The quantitative supply chain approach acknowledges the near-infinite computing resources that are available to supply chain practitioners today. It also puts the management back in control of their own supply chain by freeing up teams from unending manual entries required by traditional solutions.

For 2017, we wish you incredible supply chain performance!

Categories: Tags: history insights No Comments

Markdown tile and Summary tile

Published on by Joannes Vermorel.

The dashboards produced by Lokad are composite: they are built of tiles that can be rearranged as you see fit. We have many different tiles available: linechart, barchart, piechart, table, histogram, etc. This tile approach offers great flexibility when it comes to crafting a dashboard that contains the exact figures your company needs. Recently, we have introduced two extra tiles in order to help fine-tune your dashboards even further.


The Summary tile offers a compact approach for displaying KPIs (key performance indicators). While it was already possible to use the Table tile for a similar purpose, that approach required 1 tile to be introduced for every KPI. As a result, dashboards containing a dozen or more KPIs were needlessly large. In contrast, the Summary tile offers a more practical way of gathering a couple of key figures in one place. As usual, the real challenge is not to present thousands of numbers to the supply chain practitioner - that part is easy - but rather to present the 10 numbers that are worth being read - and that part is hard; the Summary tile happens to be the best tile to gather those 10 numbers.

The Markdown tile offers the possibility to display simply formatted text in the dashboard. As the name suggests, the text gets formatted using the Markdown syntax, which is rather straightforward. One of the most urgent needs addressed by the Markdown tile is the possibility to embed detailed legends within dashboards. Indeed, when composing complex tables, such as suggested purchase quantities, it is important to make sure there is no remaining ambiguity concerning the semantics of each table column. The Markdown tile represents a practical way of delivering contextual documentation and making sure that no numbers get misinterpreted. It also provides an opportunity to document the intent behind the numbers, which is too frequently lost amid technicalities: the documentation can outline why a number is shown on the dashboard in the first place.

Categories: Tags: envision release No Comments

Preparing enterprise data takes 6 months

Published on by Joannes Vermorel.

How long does it take to get started with Lokad? Answering this question is tough because often our answer is about 3 to 6 months. Hell, 6 months! How can your software be so clunky that it can take up to 6 months to get started? Well, our typical set-up phases can be broken down as follows:

  • 90 to 180 days: preparing the data
  • 3 to 30 days: configuring Lokad

This shows that Lokad’s setup is actually lightweight. Yes, there is room for improvement, but if we consider that Lokad offers programmatic capabilities which completely fit the business drivers, the process can already be considered as lean.

The biggest challenge, which makes pretty much everything else seem insignificant, is data preparation. Preparing data is the art of producing numbers that make sense out of raw data obtained from the company systems.

It is tempting to underestimate the amount of effort that needs to be invested upfront in order to deliver numbers that make sense. In fact, data preparation is too often reduced to a simple data cleaning operation, as if the challenge could simply be addressed by filtering out the few parts of the data that happen to be incorrect (such as negative stock levels).

Yet, the true challenge lies in uncovering and documenting the precise semantics of the data. When we begin a project, we consider ourselves lucky if we have about one line of documentation per field for every database table that is made available to us. By the end of the project, we have about one page worth of documentation per field.

If data preparation takes 6 months, why not just postpone using Lokad for 6 months? So that we have all the data just before Lokad starts working on it.

Establishing a data-driven company culture takes years. If your company does not already have a team of data scientists working for it, not much will happen for the next 6 months as far as your supply chain project is concerned. Hence, after 6 months of waiting, your company will still be stuck with another 6 months of data preparation. One of the core know-hows we have developed at Lokad consists precisely in uncovering all the subtle “gotchas” that may backfire against supply chain initiatives.

Lokad can be a key driver for change in your company’s supply chain. Don’t hesitate to contact us, we are happy to discuss these matters with you in more detail.

Categories: Tags: insights supply chain No Comments