A few weeks ago, in Berlin, Lokad was invited on stage at Vizions 2017 to give a talk about probabilistic forecasting and how this paradigm will change the future of supply chain for the fashion industry.
The stock reward function is a key ingredient to make the most of probabilistic forecasts in order to boost your supply chain performance. The stock reward is used for computing the return on investment for every extra unit of stock to be purchased or manufactured.
The stock reward function is expressive and can be used like a mini-framework for addressing many different situations. However, as a minor downside, it’s not always easy to make sense of the calculations performed with the stock reward function. Below you’ll find a short list of graphs that represent the various transformations applied to the forecasts.
The first graph - entitled Future demand - represents a probabilistic demand forecast associated with a given SKU. The curve represents a distribution of probabilities, with the total area under the curve equal to one. In the background, this future demand is implicitly associated with a probabilistic lead time forecast, also represented as a distribution of probabilities. Such a distribution is typically generated through a probabilistic forecasting engine.
The Marginal fill rate graph represents the fraction of extra demand that is captured by each extra unit of stock. In other words, this graph demonstrates what happens to the fill rate as the stock increases. Since we are representing a marginal fill rate here, the total area under the curve remains equal to one. The marginal fill rate distribution can be computed with the fillrate() function.
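As a rough illustration of this transformation, the marginal fill rate can be derived from a discrete demand distribution in a few lines. The Python sketch below is purely illustrative; `marginal_fill_rate` is a hypothetical stand-in for Envision's `fillrate()` function, assuming the distribution is represented as an array where index k holds P(demand = k).

```python
def marginal_fill_rate(demand):
    """Fraction of expected demand captured by each extra unit of stock.

    `demand` is a list where demand[k] = P(demand = k).
    """
    mean = sum(k * p for k, p in enumerate(demand))
    # The k-th unit of stock (k = 1, 2, ...) gets consumed with
    # probability P(demand >= k); dividing by the mean demand makes
    # the marginal fill rates sum to 1, mirroring the unit area
    # under the curve described above.
    return [sum(demand[k:]) / mean for k in range(1, len(demand))]
```

For instance, with a demand of either 1 or 2 units, each with probability 0.5, the first unit of stock captures two thirds of the expected demand, and the second unit the remaining third.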
The Demand with backorders graph is identical to the Future demand graph, except that 8 units have been introduced to represent a backorder. The backorder represents guaranteed demand, since these units have already been bought by clients. As a result, when backordered units are introduced, the probability distribution of demand is shifted to the right by the corresponding number of units. The shift operator >> is available as part of the algebra of distributions to compute such a transformation over the initial distribution.
The Fill rate with backorders graph is also very similar to the original Marginal fill rate graph, but has also been shifted 8 units to the right. Here, the plotted fill rate is only associated with the uncertain demand, hence the shape of the distribution remains the same.
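The effect of the shift can be sketched as follows; `shift` is a hypothetical Python stand-in for the Envision >> operator, again assuming distributions are stored as arrays of probabilities indexed by demand level.

```python
def shift(dist, units):
    """Shift a distribution `units` steps to the right, modeling
    backordered units as guaranteed demand."""
    # The support must be extended by `units`, otherwise the right
    # tail of the distribution would be truncated.
    return [0.0] * units + list(dist)

demand = [0.1, 0.3, 0.4, 0.2]       # P(demand = 0), P(demand = 1), ...
with_backorders = shift(demand, 8)  # 8 backordered units
```

Note that the shifted array is longer than the original: extending the support is precisely what keeps the distribution from being truncated.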
The Margin graph represents the margin economic reward as computed by the stock reward function taking the Demand with backorders as input. The stock reward can be visualized as a distribution, but this is not a distribution of probabilities: the area under the curve is not equal to one but is instead equal to the total margin which would be captured with unlimited inventory. On the left of the graph, each backordered unit yields the same margin, which is not surprising as there is no uncertainty in capturing the margin given that the units have already been bought.
The Stockout penalty represents the second component of the stock reward function. The shape of the distribution might feel a bit unexpected, but it merely reflects that, by construction of the stock reward function, the total area under the curve is zero. Intuitively, at a stock level of zero, we incur the sum of all the stockout penalties, as we are missing all the demand. Then, as we move to the right with higher stock levels, we satisfy more and more demand and thus further reduce the stockout penalties; until there is no penalty left because the entire demand has been satisfied. The stockout penalty of not serving backorders is represented as greater than the penalty of not serving the demand that follows. Here, we are illustrating the assumption that clients who have already backordered typically have greater service expectations than clients who haven’t yet bought any items.
The Carrying costs graph represents the third and last component of the stock reward function. As there is no upper limit for the carrying costs - it’s always possible to keep one more unit in stock thus further increasing the carrying costs - the distribution is divergent: it tends to negative infinity on the right. The total area under the curve is negative infinity, although this is a rather theoretical perspective. On the right, the carrying costs associated with the backordered units are zero: indeed, as those units have already been bought by clients they won’t incur any carrying costs, since those units will be shipped to clients as soon as possible.
The final stock reward - not represented above - would be obtained by summing the three components of the stock reward function. The resulting distribution would be interpreted as the ROI for each extra unit of stock to be acquired. This distribution typically starts with positive values, the first units of stock being profitable, but converges to negative infinity as we move to higher stock levels, given the unbounded carrying costs.
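Under strong simplifying assumptions - a single period, a flat per-unit margin, stockout penalty and carrying cost - the shape of this per-unit reward can be sketched as below. This is an illustrative approximation, not the actual stock reward implementation.

```python
def unit_roi(demand, margin, stockout_penalty, carrying_cost):
    """Approximate economic reward of holding the k-th unit of stock.

    `demand` is a list where demand[k] = P(demand = k). If the k-th
    unit sells (probability P(demand >= k)), it earns the margin and
    avoids a stockout penalty; otherwise it incurs a carrying cost.
    """
    rewards = []
    for k in range(1, len(demand) + 1):
        p_sold = sum(demand[k:])  # P(demand >= k)
        rewards.append((margin + stockout_penalty) * p_sold
                       - carrying_cost * (1.0 - p_sold))
    return rewards
```

As expected, the first units of stock come out with positive rewards, while deeper stock levels become dominated by carrying costs and turn negative.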
The term support classically refers to the demand levels associated with non-zero probabilities. In the graphs above, the term support is used loosely to refer to the entire range that needs to be processed as non-zero values by Envision. In particular, it’s worth mentioning that multiple calculations require the distribution support to be extended in order to make sure that the final resulting distribution isn’t truncated.
- The shift operation, which happens when backorders are present, requires the support to be increased by the number of backordered units.
- The margin and carrying cost components of the stock reward function have no theoretical limits on the right, and can require arbitrarily large extensions of the support.
- Ordering constraints, such as MOQs, may require having inventory levels that are even greater than the ones reached by the shifted distributions. Properly assessing the tail of the distribution is key for estimating whether the MOQ can be profitably satisfied or not.
In practice, the Envision runtime takes care of automatically adjusting the support to make sure that distributions aren’t truncated during the calculations.
Artificial intelligence has been making steady progress over the last few decades. However, while self-driving cars might be just around the corner, we are still decades away from having software smart enough to devise a supply chain strategy. Yet, at the same time, it would be incorrect to conclude that supply chain as a whole is still decades away from being positively impacted by machine learning algorithms.
Lokad’s supply chain science competency was born out of the observation that while algorithms alone were insufficient, they actually became formidable enablers in the hands of capable supply chain experts. Machine learning offers the possibility to achieve unprecedented levels of supply chain performance by taking care of all the extensive but otherwise clerical micro-decisions that your supply chain requires: when to order a product, when to move a unit of stock, when to produce more items, etc.
The Supply Chain Scientist is a mix between a data scientist and a supply chain expert. This person is responsible for the proper data preparation and the proper quantitative modelling of your supply chain. Indeed, it takes human supply chain insights to realize that some relevant data may be missing from a project and to align the optimization parameters with the supply chain strategy of the company.
Too often, supply chain initiatives come with fragmented responsibilities:
- Data preparation is owned by the IT team
- Statistics and reporting is owned by the BI (business intelligence) team
- Supply chain execution is owned by the supply chain team
The traditional S&OP answer to this issue is the creation of collective ownership through monthly meetings between many stakeholders, ideally having the whole thing owned by the CEO. However, while we are certainly not opposed to the principle of collective ownership, our experience indicates that things tend to move forward rather slowly when it comes to traditional S&OP.
In contrast to the collective ownership established through scheduled meetings, the Supply Chain Scientist holds the vital role of taking on the end-to-end ownership of all the quantitative aspects of a supply chain initiative.
This focused ownership is critical in order to avoid too common pitfalls associated with traditional supply chain organizations which are:
- Data is incorrectly extracted and prepared, primarily because the IT team has limited insights in relation to the use of the data.
- Statistics and reporting misrepresent the business; they provide less-than-useful insights and suffer from less-than-perfect data inputs.
- Execution relies heavily on ad-hoc Excel sheets in order to try to mitigate the two problems described above, while creating an entire category of new problems.
When we begin a quantitative supply chain initiative with a client company, we start by making sure that a Supply Chain Scientist is available to execute the initiative.
Learn more about supply chain scientists
Forecasting is hard. Forecasting the future of fashion is insanely hard. As a result, for the most part, the fashion industry still relies on crude methods such as Open-To-Buy, which are nothing but glorified top-down moving averages. Yet, most supply chain practitioners would argue that as long as there isn’t something that can actually beat Open-To-Buy in the real world, then Open-To-Buy isn’t outdated, no matter how crude the method might be. In fact, until recently, our own observations were aligned with what fashion companies were telling us: nothing really works for fashion, and guesswork remains the best option among all the other, even less satisfactory, alternatives.
Our probabilistic forecasting engine, released last year, became a game changer for fashion. After years of struggling with fashion demand patterns, we finally have a forecasting engine that is natively architected for the specific challenges of the fashion sector. Over the last couple of months, we have been driving the supply chains of multiple fashion companies, and well, it actually works! Considering the track record of forecasting vendors in the fashion industry, the odds weren’t really in our favor.
Demand in fashion is typically driven by novelty, and new products come together through collections. Collections are essential from the fashion perspective; yet, at the same time, they represent a massive forecasting challenge.
The demand needs to be forecast for products that haven’t been sold yet.
Fashion isn’t about products that haven’t been sold for long, fashion is about products that haven’t been sold at all. This perspective is a fundamental mismatch with the time-series forecasting approach that represents the foundation of nearly all forecasting systems - not Lokad though. Indeed, from a time-series perspective, in the case of fashion, time-series have zero historical depth, hence there is nothing to rely on for the purpose of forecasting.
Lokad’s probabilistic forecasting engine takes a completely different stance: it actively leverages the different product attributes: brand, style, color, fabric, size, price point, category, etc, in order to build a demand forecast based on the performance of similar products in the previous collections.
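A crude flavor of this attribute-driven approach can be rendered as a nearest-neighbor estimate: forecast a new product from the observed demand of the most similar products in past collections. Lokad's actual engine is considerably more sophisticated; the attribute names, sample data and similarity metric below are purely illustrative.

```python
def similarity(a, b):
    """Count the attribute values shared by two products."""
    return sum(1 for key in a if key != "demand" and a.get(key) == b.get(key))

def forecast_new_product(new_product, past_products, k=2):
    """Average the demand of the k most similar past products."""
    ranked = sorted(past_products,
                    key=lambda p: similarity(new_product, p),
                    reverse=True)
    top = ranked[:k]
    return sum(p["demand"] for p in top) / len(top)

past_collection = [
    {"color": "red",  "fabric": "cotton", "price": "mid",  "demand": 120},
    {"color": "red",  "fabric": "wool",   "price": "mid",  "demand": 80},
    {"color": "blue", "fabric": "silk",   "price": "high", "demand": 30},
]
new_item = {"color": "red", "fabric": "cotton", "price": "mid"}
forecast = forecast_new_product(new_item, past_collection)
```

Note that no manual pairing between collections is involved: the similarity is computed automatically from the attributes themselves.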
One of the things that Lokad’s forecasting engine does not do is to require products to be manually paired between collections. First, establishing those pairs is very complicated and extremely time-consuming. Supply chain practitioners are not supposed to be the slaves of their own systems; if the systems require thousands of products to be manually paired, chances are that this time would be better invested in producing a manual forecast that directly benefits from human insights. Second, in fashion, 1-to-1 mapping between the old and the new collections does not actually make any sense most of the time. New collections are likely to redefine the codes in subtle yet important ways: one product may become many, and vice-versa. A methodology that exclusively relies on 1-to-1 pairings is guaranteed to deliver rather naive results about the future collections.
Lokad’s forecasting engine is all about computing all those similarities in a fully automated manner through machine learning algorithms. Artificial Intelligence is now all the rage in the media, but under the surface it boils down to machine learning algorithms that have undergone steady and yet gradual progress over the last 3 decades. Lokad leverages several classes of machine learning algorithms, tailored for supply chain purposes.
In addition, Lokad delivers probabilistic forecasts. Instead of delivering one demand forecast - the median or the mean - that is (nearly) guaranteed to be incorrect, Lokad delivers the probabilities for (nearly) all the demand scenarios. This aspect is of critical importance for the fashion industry because uncertainty is irreducible; and a good supply order frequently boils down to a risk analysis.
In fashion, the two main risks are lost opportunities if there is not enough stock, and stock depreciations if the goods have to be sold with a very aggressive discount during the sales period - in order to liquidate the remaining stocks of a collection. Lokad has native capabilities to deal with this specific risk analysis that is so important in fashion.
Intrigued by Lokad’s capabilities for fashion? Don’t hesitate to book a demo call with us.
Machine learning and artificial intelligence have become buzzwords. Given that Lokad has become identified as one of the key European companies generating real-world decisions driven by machine learning - supply chain decisions, actually - we are receiving a growing number of applicants.
The good news: we are still hiring!
In this post, we review the three realms of machine learning that exist at Lokad and what you need to do to maximize the odds of getting an interview with us, and ideally be hired afterwards.
Kudos to the applicants who will be able to mention that they have read this blog post during their interview. Smart people are curious people, and if you can’t be bothered doing a bit of research on your future employer, you’re probably not fit for the machine learning industry anyway.
Job 1: Predictive business modelling
Improving the supply chain performance of a company through machine learning takes significant effort. The data need to be well-prepared. The resolution of the challenge should be fully aligned with the vision and the strategy of the client company. The supply chain teams should be coached to embrace a new and more capable analytical solution. Measurable results should be collected, and one should be prepared to have these results challenged by top management. At Lokad, the data modelling team, or more simply put, the data team, is responsible for tackling those challenges.
For this specific position, we are looking for engineers with a strong analytical mindset who are capable not only of understanding the strengths and the limitations of the machine learning engines that are made available to them, but are also able to implement real-life set-ups that will be integrated into the daily workflows of real-world supply chains. Improvements are real and mistakes are real too. In your interview, it is advised to demonstrate your understanding of the Lokad product as documented on our website. Bonus points if you can outline how Lokad’s technology can be used to address actual supply chain challenges.
Job 2: Crafting the Big Data infrastructure
Machine learning is critically dependent on data. In fact, the more data is available, the better machine learning works. Lokad seeks talented software engineers who can design all the infrastructure that supports the different machine learning bits. The importance of the whole data pipeline is not to be underestimated: a deficient pipeline is one of the primary failure causes of data-driven initiatives. The infrastructure needs to be not only fast and reliable, but also needs to be able to cope with the hefty computing requirements of the machine learning algorithms themselves.
For this role, we are looking for software engineers with a strong taste for complex distributed back-office processing. You should not be afraid of tackling complicated algorithms, such as a radix tree, and of implementing such algorithms yourself. Ideally, in your interview, you should demonstrate not only your capacity to understand and implement this kind of algorithmic processing, but also to deliver code that can be maintained and that is fit for production.
Job 3: Hardcore machine learning science
Most modern machine learning algorithms are complicated not only from a statistical perspective, but also from a purely algorithmic perspective. Lokad seeks talented mathematicians who are willing to acquire the software development skills it takes to implement those “hardcore” machine learning algorithms. We have developed our own suite of algorithms which are specifically designed for supply chain needs. Do not expect to plug in one open source machine learning toolkit and move on: our clients are critically dependent on algorithms that have been designed to accommodate specific supply chain challenges.
For this position, we are looking for mathematicians or software developers with a strong propensity for numerical analysis and optimization, who have the ambition to deal with stunningly difficult problems. You should not be afraid of rolling out your own class of algorithms which may be somewhat unlike what is considered to be “mainstream” machine learning. Ideally, in your interview, you should be able to demonstrate why Lokad requires alternative approaches and maybe even shed some personal insights on the case.
More about Careers at Lokad.
Answering these 12 questions tells more about your supply chain performance than nearly all the benchmarks and audits that the market has to offer. This test should take about 5 minutes of your time.
- Can supply chain operate without Excel?
- Is ABC analysis regarded as obsolete?
- Is all relevant data documented by the supply chain teams?
- Do you record historical stock levels?
- Do supply chain teams monitor the quality of their data?
- Do you forecast lead times?
- Do you prevent any manual intervention on forecasts?
- Do you manage operational constraints, e.g. MOQs, as data?
- Do you quantify the cost of supply chain failures?
- Can decision-making systems be left unattended for a week?
- Can you recompute all decisions in one hour?
- Are all decisions prioritized against each other?
If your company isn’t answering yes to at least 10 of those questions, then a supply chain initiative has the potential to deliver a sizeable ROI. If you don't have 8 positive answers, the ROI can be expected to be massive. If your company does not reach 6 positive answers, then, in our book, supply chain optimization hasn't even started yet.
Read more about the Lokad test of supply chain performance.
Thanks to the probabilistic forecasting engine that we released last year, our capacity to optimize supply chains has dramatically improved over the last couple of months. Through our growing experience, we have come to realize that there are 5 principles that drive the success of the supply chain initiatives undertaken by Lokad:
- All possible futures must be considered; a probability for each possibility.
- All feasible decisions must be considered; an economic score for each decision.
- There are no absolutes, only relative costs and opportunities.
- Being in control requires automation of every mundane task.
- Data requires more effort and brings more returns than you expect.
We decided to name this approach: quantitative supply chain.
You can also read our full Quantitative Supply Chain Manifesto
The quantitative supply chain approach acknowledges the near-infinite computing resources that are available to supply chain practitioners today. It also puts the management back in control of their own supply chain by freeing up teams from unending manual entries required by traditional solutions.
For 2017, we wish you incredible supply chain performance!
The dashboards produced by Lokad are composite: they are built of tiles that can be rearranged as you see fit. We have many different tiles available: linechart, barchart, piechart, table, histogram, etc. This tile approach offers great flexibility when it comes to crafting a dashboard that contains the exact figures your company needs. Recently, we have introduced two extra tiles in order to help fine-tune your dashboards even further.
The Summary tile offers a compact approach for displaying KPIs (key performance indicators). While it was already possible to use the Table tile for a similar purpose, that approach required one tile to be introduced for every KPI. As a result, dashboards containing a dozen or more KPIs were needlessly large. In contrast, the Summary tile offers a more practical way of gathering a couple of key figures in one place. As usual, the real challenge is not to present thousands of numbers to the supply chain practitioner - that part is easy - but rather to present the 10 numbers that are worth being read - and that part is hard; the Summary tile happens to be the best tile to gather those 10 numbers.
The Markdown tile offers the possibility to display simply formatted text in the dashboard. As the name suggests, the text gets formatted using the Markdown syntax, which is rather straightforward. One of the most urgent needs addressed by the Markdown tile is the possibility to embed detailed legends within dashboards. Indeed, when composing complex tables, such as suggested purchase quantities, it is important to make sure there is no remaining ambiguity concerning the semantics of each table column. The Markdown tile represents a practical way of delivering contextual documentation and making sure that no numbers get misinterpreted. It also provides an opportunity to document the intent behind the numbers, which is too frequently lost amid technicalities: the documentation can outline why a number is shown on the dashboard in the first place.
How long does it take to get started with Lokad? Answering this question is tough because often our answer is about 3 to 6 months. Hell, 6 months! How can your software be so clunky that it can take up to 6 months to get started? Well, our typical set-up phases can be broken down as follows:
- 90 to 180 days: preparing the data
- 3 to 30 days: configuring Lokad
This shows that Lokad’s setup is actually lightweight. Yes, there is room for improvement, but if we consider that Lokad offers programmatic capabilities which completely fit the business drivers, the process can already be considered as lean.
The biggest challenge, which makes pretty much everything else seem insignificant, is data preparation. Preparing data is the art of producing numbers that make sense out of raw data obtained from the company systems.
It is tempting to underestimate the amount of effort that needs to be invested upfront in order to deliver numbers that make sense. In fact, data preparation is too often reduced to a simple data cleaning operation as if the challenge could simply be addressed by filtering the few parts in the data that happen to be incorrect (such as the negative stock levels).
Yet, the true challenge lies in uncovering and documenting the precise semantics of the data. When we begin a project, we consider ourselves lucky if we have about one line of documentation per field for every database table that is made available to us. By the end of the project, we have about one page worth of documentation per field.
If data preparation takes 6 months, why not just postpone using Lokad for 6 months, so that all the data is ready just before Lokad starts working on it?
Establishing a data-driven company culture takes years. If your company does not already have a team of data scientists working for it, not much will happen for the next 6 months as far as your supply chain project is concerned. Hence, after 6 months of waiting, your company will still be stuck with another 6 months of data preparation. One of the core know-hows we have developed at Lokad consists precisely in uncovering all the subtle “gotchas” that may backfire against supply chain initiatives.
Lokad can be a key driver for change in your company’s supply chain. Don’t hesitate to contact us, we are happy to discuss these matters with you in more detail.
Forecasting promotions is notoriously difficult. It involves data challenges, process challenges and optimization challenges. As promotions are present everywhere in the retail sector, they have been a long-term concern for Lokad.
However, while nearly every single retailer has its share of promotions, and while nearly every forecasting vendor claims to provide full support for handling promotions, the reality is that nearly all forecasting solutions out there are far from satisfying in this regard. Worse still, our experience indicates that most of these solutions actually achieve poorer results, as far as forecasting accuracy is concerned, than if they were to use the naive approach of simply ignoring promotions altogether.
What makes promotions so challenging is the degree of uncertainty that is routinely observed when working with them. From the classic forecasting perspective, which only considers the mean or median future demand, this extra uncertainty is very damaging to the forecasting process. In fact, the numerical outputs of such forecasting solutions are so unreliable that they do not provide any reasonable options for using their figures to optimize the supply chain.
Yet, at Lokad, over the years, we have become quite good at dealing with uncertain futures. In particular, with our 4th generation probabilistic forecasting engine, we now have the technology that is completely geared towards the precise quantification of very uncertain situations. The probabilistic viewpoint does not make the uncertainty go away, however, instead of dismissing the case entirely, it provides a precise quantitative analysis of the extent of this uncertainty.
Our probabilistic forecasting engine has recently been upgraded to be able to natively support promotions. When promotional data is provided to Lokad, we expect both past and future promotions to be flagged as such. Past promotions are used to assess the quantitative uplift, as well as to correctly factor in the demand distortions introduced by the promotions themselves. Future promotions are used to anticipate the demand uplift and adjust the forecasts accordingly.
Unlike most classic forecasting solutions, our forecasting engine does not expect the historical data to be “cleaned” of the promotional spikes in any way. Indeed, no one will ever know for sure what would have happened if a promotion had not taken place.
Finally, regardless of the amount of machine learning and advanced statistical efforts that Lokad is capable of delivering in order to forecast promotions, careful data preparation remains as critical as ever. End-to-end promotion forecasts are fully supported as part of our inventory optimization as a service package.
Supply chains moved quite early on towards computer-based management systems. Yet, as a result, many large companies have decade-old supply chain systems which tend to be sluggish when it comes to crunching a lot of data. Certainly, tons of Big Data technologies are available nowadays, but companies are treading carefully. Many, if not most, of those Big Data companies are critically dependent on top-notch engineering talent to get their technologies working smoothly; and not every company succeeds, as Facebook did, in rewriting layers of Big Data technologies to make them work.
Being able to process vast amounts of data has been a long-standing commitment of Lokad. Indeed, optimizing a whole supply chain typically requires hundreds of incremental adjustments. As hypotheses get refined, it’s typically the entire chain of calculations that needs to be re-executed. Getting results that encompass the whole supply chain network in minutes rather than hours lets you complete a project in a few weeks while it would have dragged on for a year otherwise.
And this is why we started our migration towards cloud computing back in 2009. However, merely running on top of a cloud computing platform does not guarantee that vast amounts of data can be processed swiftly. Worse still, while using many machines offers the possibility to process more data, it also tends to make data processing slower, not faster. In fact, delays tend to take place when data is moved around from one machine to the next, and also when machines need to coordinate their work.
As a result, merely throwing more machines at a data processing problem does not further reduce the data processing time. The algorithms need to be made smarter, and every single machine should be able to do more with no more computing resources.
A few weeks ago, we released a new high-performance column storage format code-named Ionic that is heavily optimized for high-speed concurrent data processing. This format is also geared towards supply chain optimization, as it natively supports the storage of distributions of probabilities. These distributions are critical in order to take advantage of probabilistic forecasts. Ionic is not intended to be used as an exchange format between Lokad and its clients. For data exchange, a flat text file format, such as CSV, is just fine. The Ionic format is intended to be used as an internal data format to speed up everything that happens within Lokad. Thanks to Ionic, Lokad can now process hundreds of gigabytes worth of input data with relative ease.
In particular, the columnar aspect of the Ionic format ensures that columns can be loaded and processed separately. When addressing supply chain problems, we are routinely facing ERP extractions where tables have over 100 columns, and up to 500 columns for the worst offenders. Ionic delivers a massive performance boost when it comes to dealing with that many columns.
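The gain can be illustrated with a toy comparison between a row layout and a columnar layout; the structures below are merely illustrative and bear no relation to the actual Ionic implementation.

```python
# Row layout: every record carries all its fields; reading a single
# field still forces a scan over entire records.
rows = [{"sku": "A", "qty": 3, "price": 9.9},
        {"sku": "B", "qty": 1, "price": 4.5}]

# Columnar layout: one array per field; a query that only needs `qty`
# never touches `sku` or `price` - with hundreds of columns per table,
# this is where the performance boost comes from.
columns = {"sku": ["A", "B"], "qty": [3, 1], "price": [9.9, 4.5]}

total_qty = sum(columns["qty"])  # reads a single column
```

With a 500-column extraction, a columnar store lets a calculation over a handful of fields skip the overwhelming majority of the data altogether.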
From Lokad’s perspective, we are increasingly perceiving data processing capabilities as a critical success factor in the implementation of supply chain optimization projects. Longer processing time means that less gets done every single day, which is problematic since ultimately every company operates under tight deadlines.
The Ionic storage format is one more step into our Big Data journey.
Yes. To a noticeable extent. And I would never have ventured to put forward this opinion when founding Lokad nearly a decade ago.
By compilation I refer to the art of crafting compilers, that is, computer programs that translate source code into another language. Few people outside the ranks of programmers know what a compiler does, and few people within the ranks of programmers know how a compiler is designed. At first, compilation concerns appear distant (to say the least) to supply chain concerns. Yet, nowadays, at Lokad, it’s compilation stuff that keeps saving the day; one supply chain project after another.
Shameless plug: software engineers with compilation skills don’t grow on trees, and we are hiring. Want to work on stuff that matters? Well, the next time your plane is late because a part was missing, or the next time the drug you seek is out of stock, just remember that you could have made a difference by joining Lokad :-)
Supply chains are complex, maddeningly complex. Globalization has multiplied sourcing opportunities, but delays are longer and more erratic than ever. Sales channels are multiplying too: there are physical stores, online stores, marketplaces, resellers, wholesalers, ... And now, thanks to Amazon, everyone, everywhere expects everything to be ordered and received overnight. Supply chain expectations are higher than ever.
Approaching supply chain problems with anything less than the full expressiveness of a programming language does not work. Just like Lego programming is not going to happen, supply chain challenges won’t fit into checkboxes and dropdowns. This does not prevent software vendors from trying, mind you. Solutions that include more than 1000 tables, each table hovering at around 100 fields on average, are all too common. And while the client company is only using about 1% of the solution’s feature surface, it still has to cope with the solution’s pervasive complexity.
Compilation saves the day because it provides a huge body of knowledge and know-how when it comes to crafting high-quality abstractions intended as power tools for solving statistical and combinatorial problems (and much more actually). And most supply chain challenges happen to be precisely statistical and combinatorial. For example, at Lokad, by introducing an algebra of distributions, we managed to "crack down" on complicated lead time problems which were resisting our more direct approaches through packaged software features.
What makes language features different from, say, the usual app features (WYSIWYG), is that language features are much less sensitive to the specificities of a given challenge than their app counterparts. For example, let’s consider a situation where your stock-out detection logic backfires in the specific case of ultra-seasonal products. If the feature is delivered through a language construct, then you can always narrow down the data scope until the feature works exactly where it’s intended to, possibly adjusting the scope dynamically through an ad-hoc numerical analysis. In contrast, with an app feature, you’re stuck with the filtering options that have been built into that feature. App features are a good fit only if your problems are narrow and well-defined, which is very unlike supply chain optimization.
In supply chain, programmability shines because:
- Problems are both highly numerical and very structured
- Supply chains are modular and this modularity needs to be leveraged
- The number of variables is significant but not overwhelming
- Fitting the precise shape of the problems is critical
It is slightly amusing to see how many software vendors gradually re-invent programmability. As the user interface grows in depth and complexity, with the possibility to add filters, options, pre-processing and post-processing hooks, templated alerts and KPI monitors, the user interface gradually becomes a programmable thing, and reaches the point where only a programmer can actually make sense of it (precisely thanks to his or her pre-existing programming skills). Programmable yes, but in a highly convoluted way.
Compilation is the art of amplifying engineering skills: one has to craft abstractions and language constructs that streamline the resolution of problems. As Brian Kernighan famously wrote: "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" The same logic applies to supply chain optimization, because it’s essentially the same thing as writing code. Well, at Lokad, it literally is the same thing.
Conventional IT wisdom states that one should automate the easy parts first, leaving human experts to cope with the more complex elements. Yet, in supply chain, this approach backfires badly every single time. The most complex parts of supply chain are nearly always the most costly ones, the ones that urgently need attention. The easy parts can take care of themselves through min/max inventory or Kanban. Just like you wouldn’t build software for autonomous cars by refining software for automatic train operations, you can’t tackle difficult supply chain problems by refining software initially designed to resolve simple challenges.
Naturally, compilation alone isn’t sufficient to cope with supply chain challenges. Machine learning, big data processing and a sizable amount of human skills are worth mentioning as well. However, in all cases, carefully crafted high-quality abstractions help considerably. Machine learning is vastly simpler when input data is well-prepared. Big data processing is also much more straightforward when computations lend themselves to a high degree of parallelization.
The future is uncertain, and one of the best mathematical tools we have for coping with this fact is the distribution of probability. Lokad features both a probabilistic forecasting engine and an algebra of distributions. These two capabilities get along pretty well when it comes to dealing with complex, erratic and very uncertain supply chain situations. At their core, these capabilities rely enormously on processing distributions of probabilities. Yet, until recently, Lokad was lacking convenient ways to visualize these distributions.
As a result, we have introduced a new tile - the histogram - which is specifically intended for visualizing the distributions of probabilities. The histogram can be used to visualize a synthetic distribution:
The above graph has been obtained with the following script:
show histogram "Poisson" with poisson(3)
The uncertainty of the future lead time can be visualized with:
Similarly, an uncertain future demand, integrated over an uncertain future lead time, can be visualized with:
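The Envision one-liners above delegate the heavy lifting to the platform. As a language-agnostic sketch of what a demand-integrated-over-lead-time histogram represents, here is a Monte Carlo approximation in Python; the demand and lead time distributions below are purely hypothetical:

```python
import random
from collections import Counter

random.seed(0)

daily_demand = [0, 0, 1, 1, 2, 3]  # hypothetical empirical daily demand (units)
lead_times = [2, 3, 3, 4, 7]       # hypothetical empirical lead times (days)

trials = 100_000
histogram = Counter()
for _ in range(trials):
    lt = random.choice(lead_times)                         # sample a lead time
    total = sum(random.choice(daily_demand) for _ in range(lt))
    histogram[total] += 1                                  # demand over that lead time

# histogram[k] / trials approximates P(demand over lead time = k)
```

Plotting `histogram` as a bar chart yields the same kind of picture the histogram tile renders natively.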
Classic forecasts - where the future is supposed to be known - are comfortable and intuitive. Yet, unfortunately, they happen to be deeply wrong as well, and routinely lead to grim supply chain results. Therefore, out of necessity, we remain stuck with probabilities. Nevertheless, with proper visualization tools, we hope to make these probabilities a bit easier to handle.
Once again, we are hiring. We are looking for a Software Engineer and a Business Data Analyst.
You will join a team of talented software engineers to further develop our cloud-based data crunching apps. We have infrastructure, data processing, scalability and reliability challenges, and need your help in addressing them.
At Lokad, you will benefit from the coaching of an awesome dev team. You will gain skills in Big Data processing and cloud computing apps. Our codebase is clean, documented and heavily (unit) tested. Our offices are quiet (no open space!), bright, and you can get three monitors.
We are a C#/.NET shop, and you will be developing under Visual Studio, the source code being versioned in Git. Our apps are hosted on Microsoft Azure. In addition, with .NET Core coming later this year, we also anticipate a few strategic migrations towards Linux.
We expect you to have strong software development skills. A taste for low-level high-performance computing is a big plus, and a keen interest in distributed systems is very much appreciated. Contributions to open source projects are also highly regarded.
Big Data Analyst
Your role is to make sure our clients get the most from Lokad. You will address complex supply chain issues and craft quantitative strategies. Your goal is also to keep refining these strategies over time to keep them aligned with the needs of our fast-growing clients.
At Lokad, you will benefit from the extensive training and coaching of our expert team. You will gain skills in Big Data, predictive analysis and overall quantitative optimization for business. You will learn how to achieve measurable business results grounded on scientific analysis of data.
About one quarter of your time is spent interacting with clients in order to better understand their business (mainly over the phone, in English). The rest of your time is spent on what could be likened to advanced Excel-like analytics; except that you're dealing with Big Data and Machine Learning through Lokad's platform.
We expect you to have a keen interest in data and quantitative analysis in general. Good Excel skills are a plus; and having even the most modest programming skills is a bonus too. An engineering background is usually a good fit. We also expect you to be fluent in English as the majority of our clients are located overseas. 2 years or more of professional experience are expected.
To apply, just drop your resume at email@example.com.
Lokad is a software company that specializes in Big Data for commerce. We help merchants and a few other verticals (aerospace, manufacturing) to forecast their inventory and optimize their prices. We are profitable and we are growing fast. We are closing deals in North America, Europe and Asia. The vast majority of our clients are based outside of France. We are located 50m from Place d'Italie in Paris (France).
The releases of Lokad are done on Tuesdays, and every Tuesday, we release a few more useful bits. Sometimes we release major components - like our latest probabilistic forecasting engine - but nearly every week comes with a few more features and enhancements. Software development at Lokad is very incremental.
A few weeks ago, we improved our line chart. Until then, it was only possible to specify one color - the primary color - for the line chart, and if multiple lines were present, Envision was auto-picking a color for each line. However, with 4 lines or more, our line charts were becoming somewhat unreadable:
Thus, we have improved the syntax of the line chart to offer the possibility to specify a color for each line:
Through this syntax we get the much improved visual:
We are hiring again!
You will join a team of talented software engineers to further develop our cloud-based data crunching apps. We have infrastructure, data processing, scalability and reliability challenges, and we need your help in addressing them.
At Lokad, you will benefit from the coaching of an awesome dev team. You will gain skills in Big Data processing and cloud computing apps. Our codebase is clean, documented and heavily (unit) tested. Our offices are quiet (no open space!), bright, and you can get three monitors.
We are a C#/.NET shop, and you will be developing under Visual Studio, the source code being versioned in Git. Our apps are hosted on Microsoft Azure. With .NET Core coming this year, we anticipate a few strategic migrations toward Linux.
We expect strong software development skills from you. A taste for low-level high-performance computing is a big plus, and a keen interest in distributed systems is much appreciated. Contributions to open source projects are also highly regarded. We are located 50m from Place d'Italie in Paris (France).
Lokad is a software company that specializes in Big Data for commerce. We help merchants, and a few other verticals (aerospace, manufacturing), to forecast their inventory and to optimize their prices. We are profitable and we are growing fast. We are closing deals in North America, Europe and Asia. The vast majority of our clients are based outside of France.
Lokad is the winner of the 2010 Windows Azure Partner of the Year Award, and was named as one of Europe’s 100 hottest startups by Wired Magazine (09/2012).
To apply, drop your resume to firstname.lastname@example.org.
The future is uncertain. Yet, nearly all predictive supply chain solutions make the opposite assumption: they assume that their forecasts are correct, and hence roll out their simulations based on those forecasts. Implicitly, the future is assumed to be certain and complications ensue.
From a historical perspective, software engineers did not make those assumptions without reason: a deterministic future was the only option that early - and not so early - computers could realistically process. Thus, while dealing with an uncertain future was known to be the better approach in theory, in practice it was not even an option.
In addition, a few mathematical tricks were found early in the 20th century in order to circumvent this problem. For example, the classic safety stock analysis assumes that both the lead times and the demand follow a normal distribution pattern. The normal distribution assumption is convenient from a computing viewpoint because all it takes is two variables to model the future: the mean and the variance.
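To illustrate just how little the normal assumption demands computationally, here is a minimal Python sketch of the textbook reorder point calculation, assuming independent, normally distributed daily demand and lead time (the function name and parameters are our own illustration, not Lokad's API):

```python
from statistics import NormalDist

def reorder_point(mu_d, sigma_d, mu_lt, sigma_lt, service_level):
    """Classic reorder point under the normal assumption:
    daily demand ~ N(mu_d, sigma_d), lead time in days ~ N(mu_lt, sigma_lt)."""
    mean = mu_d * mu_lt
    # Textbook variance of demand accumulated over a random lead time.
    std = (mu_lt * sigma_d ** 2 + mu_d ** 2 * sigma_lt ** 2) ** 0.5
    z = NormalDist().inv_cdf(service_level)  # safety factor for the service level
    return mean + z * std

rp = reorder_point(10, 3, 5, 1, 0.95)  # e.g. 10 units/day, 5-day lead time
```

Two means and two variances suffice, which is exactly why this model was computationally attractive, and exactly why it fails when reality is not normally distributed.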
Yet again, the normal distribution assumption - both for the lead times and the demand - proved to be incorrect in all but a few situations, and complications ensued.
Back in 2012 at Lokad, we realized that the classic inventory forecasting approach was simply not working: mean or median forecasts were not addressing the right problem. No matter how much technology we poured into the case, it was not going to work satisfactorily.
Thus, we shifted to quantile forecasts, which can be interpreted as forecasting the future with an intended bias. Soon we realized that quantiles were invariably superior to the classic safety stock analysis, if only because quantiles were zooming in on where it really mattered from a supply chain perspective.
However, while going quantile, we realized that we had lost quite a few things in the process. Indeed, unlike classic mean forecasts, quantile forecasts are not additive, so it was not possible to make sense of a sum of quantiles, for example. In practice, the loss wasn’t too great: since classic forecasts weren’t making much sense in the first place, summing them up wasn’t a reasonable option anyway.
Over the years, while working with quantiles, we realized that so many of the things we took for granted had become a lot more complicated: demand quantities could no longer be summed or subtracted or linearly adjusted. In short, while moving towards an uncertain future, we had lost the tools to operate on this uncertain future.
Back in 2015, we introduced quantile grids. While quantile grids were not exactly the same as our full-fledged probabilistic forecasts just yet, our forecasting engine was already starting to deliver probabilities instead of quantile estimates. Distributions of probabilities are much more expressive than simple quantile estimates, and, as it turns out, it is possible to define an algebra over distributions.
While the term algebra might sound technical, it’s not that complicated; it means that simple operations, such as the sum, the product or the difference, can be defined in ways which are not only mathematically consistent, but also highly relevant from a supply chain perspective.
As a result, just a few weeks ago, we integrated an algebra of distributions right into Envision, our domain-specific language dedicated to commerce optimization. Thanks to this algebra of distributions, it becomes straightforward to carry out seemingly simple operations such as summing two uncertain lead times (say an uncertain production lead time plus an uncertain transport lead time). The sum of those two lead times is carried out through an operation known as a convolution. While the calculation itself is fairly technical, in Envision, all it takes is to write
A = B +* C, where
+* is the convolution operator used to sum up independent random variables (*).
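As a quick sketch of what the +* convolution computes for two independent discrete random variables, here is the same operation written out in plain Python; the distributions are hypothetical value-to-probability dicts:

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q,
    where p and q map values to probabilities."""
    out = {}
    for x, px in p.items():
        for y, py in q.items():
            # Every pair of outcomes contributes its joint probability.
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

production = {3: 0.5, 4: 0.5}   # hypothetical production lead time (days)
transport = {1: 0.2, 2: 0.8}    # hypothetical transport lead time (days)
total = convolve(production, transport)
# total comes out as approximately {4: 0.1, 5: 0.5, 6: 0.4}
```

In Envision, `total = production +* transport` performs this computation natively, with the platform handling the numerical details.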
Through this algebra of distributions, most of the “intuitive” operations which were possible with classic forecasts are back: random variables can be summed, multiplied, stretched, exponentiated, etc. And while relatively complex calculations take place behind the scenes, probabilistic formulas are no more complicated than plain Excel formulas from the Envision perspective.
Instead of wishing for the forecasts to be perfectly accurate, this algebra of distributions lets us embrace uncertain futures: supplier lead times tend to vary, quantities delivered may differ from quantities ordered, customer demand changes, products get returned, inventory may get lost or damaged … Through this algebra of distributions it becomes much more straightforward to model most of those uncertain events with minimal coding efforts.
Under the hood, processing distributions is quite intensive; and once again, we would never have ventured into those territories without a cloud computing platform that handles this type of workload - Microsoft Azure in our case. Nevertheless, computing resources have never been cheaper, and your company’s next $100k purchase order is probably well worth spending a few CPU hours - costing less than $1 and executed in just a few minutes - to make sure that the ordered quantities are sound.
(*) A random variable is a distribution whose total mass equals 1; it’s a special type of distribution. Envision can process distributions of probabilities (aka random variables), but also more general distributions.
File formats are staggeringly diverse. At Lokad, our ambition is to support all the (reasonable) tabular file formats. We were already supporting CSV (comma-separated values) files with all their variants - which can involve varying separators or varying line returns.
However, tabular files can become very large, and in order to make the file transfer to Lokad faster, these files can be compressed. Lossless compression of flat text files works very well, frequently yielding a compression ratio below 10%, i.e. the resulting compressed file is less than 10% of the original file.
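This order of magnitude is easy to check. A small Python sketch with a synthetic, deliberately repetitive flat file (the content is made up for illustration):

```python
import gzip

# A synthetic, highly repetitive flat file - typical of tabular ERP extractions.
data = ("sku,qty\n" + "A123,1\n" * 100_000).encode("utf-8")

compressed = gzip.compress(data)
ratio = len(compressed) / len(data)
# On repetitive tabular data, the ratio comes out far below 0.10.
```

Real-world files are less repetitive than this toy example, yet flat tabular data routinely compresses below 10% of its original size.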
Then again, compression formats are staggeringly diverse as well. So far, we were only supporting the venerable and ubiquitous GZip - the compression format used to compress web pages for example.
Two more formats are now supported by Lokad: Zip - popularized by WinZip and famous for its .zip file extension - and 7z - one of the most efficient compression formats available on the market. In both cases, these are archive formats, hence a single .zip file can contain many files within the archive. For now, Lokad only supports single-file archives.
This choice makes sense in practice because if the flat file is so large that it requires compression in the first place, producing an even bigger archive gathering multiple large files tends to be impractical. Instead, we suggest using incremental file uploads.
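As a sketch of what a single-file archive looks like, here is one produced with Python's standard zipfile module; the file name inside the archive is hypothetical:

```python
import io
import zipfile

# Build a single-file .zip archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("orders.csv", "sku,qty\nA123,1\n")

# Reading it back: exactly one entry inside the archive.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    names = zf.namelist()
```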
Check out our documentation about how to read files in Envision.
A little over one year ago, we unveiled quantile grids as our 3.0 forecasting technology. More than ever, Lokad remains committed to delivering the best forecasts that technology can produce, and today, our 4th generation of forecasting technology, namely our probabilistic forecasting engine, is live and available in production for all clients. This new engine consists of a complete rewrite of our forecasting technology stack, and addresses many long-standing challenges that we were facing.
The future is uncertain no matter how good the forecasting technology. Back in 2012, when Lokad first ventured into the depths of quantile forecasting, we quickly realized that uncertainty should not be dismissed, as the classic forecasting approach does, but should rather be embraced. Simply put, supply chain costs are concentrated at the statistical extremes: it's the surprisingly high demand that causes stock-outs, and the surprisingly low demand that causes dead inventory. In between, the supply chain tends to operate quite smoothly.
With quantile grids, Lokad was delivering a much more fine-grained vision of possible future outcomes. However, as the name suggests, our quantile grids were built on top of our quantile forecasts - multiple layers of quantiles, actually. These quantile grids proved to be tremendously useful over the last year, but while our forecasting engine was producing probabilities, internally nearly none of its logic was working directly with probabilities. The probabilities we computed were a byproduct of a quantile forecasting system.
Because of these quantile roots, our forecasting engine 3.0 had multiple subtle limitations. And while most of these limitations were too subtle to be noticed by clients, they did not go ignored by Lokad’s R&D team. Thus, we decided to reboot our entire forecasting technology with a true native probabilistic forecasting perspective; and this was the start of the forecasting engine 4.0.
Lead time forecasting
Lead times are frequently assumed to be a given. However, while past lead times are known, future lead times can only be estimated. For years, Lokad had under-estimated the challenge of accurately approximating the future lead times. Lead times are subtle: most statistical patterns, such as seasonality (and the Chinese New Year in particular), which impact the demand, also impact the lead time.
In our forecasting engine 4.0, lead times have become first-class citizens with their own lead time forecasting mode. Lead times now benefit from dedicated built-in forecasting models. Naturally, with our engine being a probabilistic forecasting engine, lead time forecasts are a distribution of probabilities associated with an uncertain time period.
Integrated demand forecasting
Lead times vary, and yet, our forecasting engine 3.0 was stuck with fixed lead times. From a traditional perspective, the classic safety stock analysis assumes that lead time follows a normal distribution, while nearly all measurements we have ever carried out indicate that varying lead times are clearly not normally distributed. While our experiments routinely showed that having a fixed lead time was better than having a flawed model, being stuck with static lead times was nevertheless not the perfectly satisfying solution we were looking for.
The forecasting engine 4.0 introduces the concept of integrated demand forecasting, with integrated signifying integrated over the lead time. The engine takes a full distribution of lead time probabilities, and produces the corresponding probabilistic demand forecast. In practice, the lead time distribution is also computed by the forecasting engine as seen previously. Integrated demand forecasting finally brings a satisfying answer to the challenge of dealing with varying lead times.
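A minimal sketch of the integration itself, assuming, purely for illustration, Poisson daily demand and a discrete lead time distribution (this is not Lokad's actual model, just the underlying mixture idea):

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    """Probability that a Poisson(lam) variable equals k."""
    return exp(-lam) * lam ** k / factorial(k)

def integrated_demand(daily_rate, lead_time_dist, k_max=60):
    """P(total demand = k), with demand integrated over an uncertain lead time.
    lead_time_dist maps a lead time (in days) to its probability."""
    out = [0.0] * (k_max + 1)
    for days, p_lt in lead_time_dist.items():
        for k in range(k_max + 1):
            # Over `days` days, Poisson demand accumulates to Poisson(rate * days).
            out[k] += p_lt * poisson_pmf(daily_rate * days, k)
    return out

# Daily demand rate of 2 units, lead time of either 3 or 5 days.
dist = integrated_demand(2.0, {3: 0.5, 5: 0.5})
```

The result is a mixture of the per-lead-time demand distributions, weighted by the lead time probabilities, which is precisely the distribution one needs to price the risk of a stock-out during replenishment.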
New products forecasting
Forecasting demand for new products is plain hard. Since, in this case, forecasting obviously cannot rely on the sales history, the forecasting engine has to rely on other data known about the product prior to its launch. Our forecasting engine 3.0 already had a tags framework, precisely geared towards this specific use case. However, tags were unfortunately not carrying as much information as we would have liked, and some accuracy was left on the table.
With 4.0, this specific challenge is revisited with the introduction of categories and hierarchies. Categories and hierarchies are more expressive as well as more structured than tags, and convey a lot more information. The forecasting engine 4.0 takes full advantage of this richer data framework to deliver more accurate forecasts, with new-product forecasting being the most acute use case.
Stock-outs and promotions
The intent of the forecasting engine is to forecast future demand. However, our knowledge of past demand is typically imperfect, with only past sales really being known. Sales typically tend to be a reasonable approximation of the demand, but sales come with multiple biases, the most common being stock-outs and promotions. Our engine 3.0 already had a few heuristics to deal with these biases, and quantile forecasts are intrinsically more robust than (classic) average forecasts. Yet, once again, the situation was not entirely satisfying for us.
The engine 4.0 introduces the notion of biased demand, which can be either censored or inflated. When the demand for a given product on a given day is marked as censored, we are telling the forecasting engine that the demand should have been higher, and that the true demand for that day remains unknown. The engine leverages this information to refine the forecasts, even when the history is full of events which have distorted the demand signal.
While quantile forecasts are vastly superior to classic average or median forecasts when it comes to estimating the probabilities of rare events, quantiles begin demonstrating their limits when it comes to estimating super-rare events. For example, our quantile models were struggling to properly deal with items sold only once or twice a year, as well as handling service levels higher than 98%.
Native probabilistic models, as implemented in our engine 4.0, are much better behaved when it comes to ultra-sparse demand and “rare” events in general. These models could have been implemented within a quantile forecasting framework (a probabilistic forecast can be easily turned into a quantile forecast); but our engine 3.0 did not have the infrastructure to support them. So they were implemented into the engine 4.0 instead.
Blended into Envision
Versions 2.0 and 3.0 of our forecasting engine came with a web user interface. At first glance, it seemed easy. However, the user interface was glossing over what represents the true challenge of using any forecasting engine: retaining complete control over the data fed into it. Indeed, garbage-in, garbage-out remains an all too frequent problem.
The engine 4.0 is interfaced from within Envision, our domain-specific language geared towards quantitative optimization for commerce. Calling the forecasting engine takes a series of data arguments provided from an Envision script. This approach requires a bit more upfront effort; however, the productivity benefits kick in rapidly, as soon as adjustments need to be made to the input data.
The release of our forecasting engine 4.0 is only the first part of a series of important improvements that have been brought to Lokad over the last few weeks. Stay tuned for more.
When data scientists work with Envision, our domain-specific language tailored for quantitative optimization for commerce, we want to ensure that they are as productive as possible. Indeed, data scientists don't grow on trees, and when you happen to have one available, you want to make the most of their time.
A data analysis begins by loading input data, which happens to be stored as flat files within Lokad. Therefore, an Envision script always starts with a few statements such as:
read "/sample/Lokad_Items.tsv"
read "/sample/Lokad_Orders.tsv" as Orders
read "/sample/Lokad_PurchaseOrders.tsv" as PurchaseOrders
While the Envision syntax is compact and straightforward, file names may, on the other hand, be fairly complex. From the beginning, our source code editor has shipped with autocompletion; until recently, however, autocompletion was not providing suggestions for file names. A few days ago, the code editor was upgraded, and file names are now suggested as follows:
This feature was part of a larger upgrade which also made the Envision code source editor more responsive and more suitable for dealing with large scripts.