Periodic demand forecast

Published on by Joannes Vermorel.

As a rule of thumb, whenever a supply chain is involved, probabilistic forecasts yield superior results compared to traditional periodic forecasts; i.e., forecasts expressed per day, week or month. Yet, there are also a few situations where demand uncertainty is very low, such as when demand is very steady and non-sparse. In those situations, it might still make sense to consider periodic demand forecasts.

Therefore, we have extended our latest forecasting engine to support periodic forecasts. This feature is live and readily available for all Lokad accounts. The output of the periodic forecast takes the form of a table with 4 columns:

  • the product identifier
  • the target date for the forecast
  • the mean value for the demand
  • the sigma (square root of the variance) for the demand

In particular, unlike the probabilistic forecast, the periodic forecast does return fractional demand values, even if the historical demand is expressed as pure integers, typically reflecting the number of units sold.

Under the hood, this periodic forecast is obtained by generating a probabilistic forecast, and then projecting the mean and the variance of the resulting distributions. This periodic forecast benefits from all the perks associated with the probabilistic forecast, such as native support for stockouts.
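To make the projection concrete, here is a minimal Python sketch - an illustration of the arithmetic only, not Lokad's implementation - showing how a discrete probabilistic forecast, given as the probabilities P(demand = k), collapses into the mean and sigma columns of the periodic forecast; the function name and the toy probabilities are ours:

import math

def project_mean_sigma(probabilities):
    # probabilities[k] is the probability that the demand equals k units.
    mean = sum(k * p for k, p in enumerate(probabilities))
    variance = sum(p * (k - mean) ** 2 for k, p in enumerate(probabilities))
    return mean, math.sqrt(variance)

# Example: a sparse weekly demand concentrated on 0, 1 or 2 units.
mean, sigma = project_mean_sigma([0.5, 0.3, 0.2])
print(mean, sigma)  # 0.7 and ~0.78: fractional values, even though the history is integral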

Tags: insights, forecasting, release

Text mining for better demand forecasts

Published on by Joannes Vermorel.

We are proud to announce that Lokad is now featuring text mining capabilities that assist its forecasting engine in delivering accurate demand forecasts, even when looking at products associated with sparse and intermittent demand that do not benefit from attributes such as categories and hierarchies. This feature is live; check out the label option of our forecasting engine.

The primary forecasting challenge faced by supply chains is the sparsity of the data: most products don’t have a decade’s worth of relevant historical data and aren’t served by thousands of units when considering the edges of the supply chain network. Traditional forecasting methods, which rely on the assumption that the time series are both long and non-sparse, perform poorly for this very reason.

Lokad is looking at supply chain historical data from another angle: instead of looking at the depth of the data, which tends to be nonexistent, we are looking at the width of the data, that is, all the correlations that exist between the products. As there are frequently thousands of products, many correlations can be leveraged to improve the forecasting accuracy significantly. Yet, when establishing those correlations, we cannot rely on the demand history alone, because many products, such as the products that are about to be launched, don’t even have historical data yet. Thus, the forecasting engine of Lokad has introduced a mechanism to leverage categories and hierarchies instead.

Leveraging categories and hierarchies for increased forecasting accuracy works great. However, this approach suffers from one specific limitation: it relies on the availability of categories and hierarchies. Indeed, many companies haven’t invested much in master data setups, and, as a result, cannot benefit from much fine-grained information about the products that flow through the supply chain. Previously, when no category and no hierarchy were available, our forecasting engine was essentially crippled in its capability to cope with sparse and intermittent demand.

The new text mining capabilities of the Lokad forecasting engine are a game changer: the engine is now capable of processing the plain-text description of products to establish the correlations between the products. In practice, we observe that while companies may be lacking proper categorizations for their products, a plain-text description of the products is nearly always available, dramatically improving the applicability of the width-first forecasting perspective of Lokad.

For example, if a diverse set of products happens to be named Something Christmas, and all those products exhibit a consistent seasonal spike before Christmas, then the forecasting engine can identify this pattern and automatically apply the inferred seasonality to a new product that has the keyword Christmas in its description. This is exactly what happens under the hood at Lokad when plain-text labels are fed to the forecasting engine.
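As a deliberately simplified sketch of this keyword effect - written in Python for illustration, and in no way a description of the actual engine, which relies on machine learning rather than hand-picked keywords - products can be indexed by the tokens of their labels, so that a new product inherits demand patterns pooled from the products sharing its tokens; all identifiers and labels below are made up:

from collections import defaultdict

products = {
    "P1": "Snowman Christmas mug",
    "P2": "Christmas tree ornament, red",
    "P3": "Stainless steel water bottle",
}

def tokens(label):
    return {word.strip(",.").lower() for word in label.split()}

# Index products by the tokens appearing in their plain-text labels.
index = defaultdict(set)
for product_id, label in products.items():
    for token in tokens(label):
        index[token].add(product_id)

# A brand-new product whose label contains "Christmas" would inherit the
# seasonality learned from the pooled history of P1 and P2.
print(index["christmas"])  # prints {'P1', 'P2'} (set order may vary)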

Our example above is simplistic, but, in practice, text mining involves uncovering complex relationships between words and demand patterns that can be observed in the historical data. Products sharing similar descriptions may share similar trends, similar life-cycles, similar seasonalities. However, two products with similar descriptions may share the same trend but not the same seasonality, etc. The forecasting engine of Lokad is based on machine learning algorithms that automatically identify the relevant information from the plain-text descriptions of the products. The engine requires no preprocessing of the product descriptions.

Our motto is to make the most of the data you have. With text mining capabilities, we are once again lowering the requirements to bring your company to the age of quantitative supply chain. Any questions? Just drop us a line at contact@lokad.com.

Tags: forecasting, insights, technology, release

Entropy analysis for supply chain IT system discovery

Published on by Joannes Vermorel.

The IT landscape of supply chains is nearly always complex. Indeed, by nature, supply chains involve multiple actors, multiple sites, multiple systems, etc. As a result, building data-driven insights in supply chains is a challenge due to the sheer heterogeneity of the IT landscape. Too frequently, supply chain analytics deliver nonsensical results precisely because of underlying garbage in, garbage out problems.

At Lokad, we have not only developed a practice that thoroughly surveys the IT landscape and the datasets inhabiting it, but we have also created some bits of technology to facilitate the surveying operations themselves. In this post, we detail one step of our surveying methodology, which is based on Shannon entropy. We successfully leveraged entropy analysis for several large scale supply chain initiatives.

Our surveying process starts by reviewing all the database tables that are understood to be relevant for the supply chain initiative. Supply chains are complex and, consequently, the IT systems that operate the supply chains reflect this complexity. Moreover, the IT systems might have been evolving for several decades, and layers of complexity tend to take over those systems. As a result, it’s not uncommon to identify dozens of database tables with each table having dozens of columns, i.e. fields in the database terminology.

For large supply chains, we have observed situations where the total number of distinct fields is above 10,000. By leveraging entropy analysis, we are able to remove half of the columns from the picture immediately and consequently reduce the remaining amount of work significantly.

Dealing with that many columns is a major undertaking. The problem is not data-processing capabilities: with cloud computing and adequate data storage, it’s relatively straightforward to process thousands of columns. The real challenge is to make sense of all those fields. As a rule of thumb, we estimate that well-written documentation of a field takes up about one page, when interesting use cases and edge cases are covered. Without proper documentation of the data, data semantics are lost, and odds are very high that any complex analysis performed on the data will suffer from massive garbage in garbage out headaches. Thus, with 10,000 fields, we are faced with the production of a 10,000-page manual, which demands a really monumental effort.

Yet, we have observed that those large IT systems also carry a massive amount of dead weight. While the raw number of fields appears to be very high, in practice, it does not mean that every column found in the system contains meaningful data. At the extreme, a column might be entirely empty or constant and, thus, contain no information whatsoever. A few fields can be immediately discarded because they are truly empty. However, we have observed that fully empty fields are actually pretty rare. Sometimes, the only non-constant information in the column dates from the day when the system was turned on; the field was never used again afterward. While truly empty fields are relatively rare, we usually observe that degenerate fields are extremely numerous. These columns contain almost no data, well below any reasonable threshold for leveraging this data for production purposes.

For example, a PurchaseOrders table containing over one million rows might have an Incoterms column that is non-empty in only 100 rows; furthermore, all those rows are more than five years old, and 90 of them contain the entry thisisatest. In this case, the field Incoterms is clearly degenerate, and there is no point in even trying to make sense of this data. Yet, a naive SQL filter will fail to identify such a column as degenerate.

Thus, a tool to identify degenerate columns is needed. It turns out that Shannon entropy is an excellent candidate. Shannon entropy is a mathematical tool to measure the quantity of information that is contained in a message. The entropy is measured in Shannons, which is a unit of measurement somewhat akin to bits of information. By treating the values found in the column itself as the message, Shannon entropy gives us a measure of the information contained in the column expressed in Shannons.
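For readers who prefer to see the computation spelled out, here is a small Python sketch of Shannon entropy applied to a column - an illustration with made-up values, not Lokad's code - reusing the degenerate Incoterms example given above:

import math
from collections import Counter

def shannon_entropy(column):
    # Treat the observed frequency of each distinct value as its probability.
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A degenerate column: one million rows, almost all of them empty.
incoterms = [""] * 999_900 + ["thisisatest"] * 90 + ["FOB"] * 10
print(shannon_entropy(incoterms))  # roughly 0.0015 Shannons: degenerate

# A healthy categorical column with a balanced mix of values.
status = ["open", "closed", "cancelled", "pending"] * 250_000
print(shannon_entropy(status))     # 2.0 Shannons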

While all of this might sound highly theoretical, putting this insight into practice is extremely straightforward. All it takes is to use the entropy() aggregator provided by Envision. The tiny script below illustrates how we can use Envision to produce the entropy analysis of a table with 3 fields.

read "data.csv" as T[*]
show table "List of entropies" with
  entropy(T.Field1)
  entropy(T.Field2)
  entropy(T.Field3)


An entropy lower than 0.1 is a very good indicator of a degenerate column. If the entropy is lower than 0.01, the column is guaranteed to be degenerate.

Our experience indicates that performing an initial filtering based on entropy measurements reliably eliminates between one-third and two-thirds of the initial fields from the scope of interest. The savings in time and effort are very substantial: for large supply chain projects, we are talking about man-years being saved through this analysis.

We unintentionally discovered a positive side effect of entropy filtering: it lowers the IT fatigue associated with the (re)discovery of the IT systems. Indeed, investigating a degenerate field typically proves to be an exhausting task. As the field is not used - sometimes not used any more - nobody is quite sure whether the field is truly degenerate or if the field is playing a critical, but obscure, role in the supply chain processes. Because of the complexities of supply chains, there is frequently nobody who can positively affirm that a given field is not used. Entropy filtering immediately eliminates the worst offenders that are guaranteed to lead us on a wild-goose chase.

Tags: supply chain, envision, insights

The illustrated stock reward function

Published on by Joannes Vermorel.

The stock reward function is a key ingredient to make the most of probabilistic forecasts in order to boost your supply chain performance. The stock reward is used for computing the return on investment for every extra unit of stock to be purchased or manufactured.

The stock reward function is expressive and can be used like a mini-framework for addressing many different situations. However, as a minor downside, it’s not always easy to make sense of the calculations performed with the stock reward function. Below you’ll find a short list of graphs that represent the various transformations applied to the forecasts.

The first graph - entitled Future demand - represents a probabilistic demand forecast associated with a given SKU. The curve represents a distribution of probabilities, with the total area under the curve equal to one. In the background, this future demand is implicitly associated with a probabilistic lead time forecast, also represented as a distribution of probabilities. Such a distribution is typically generated through a probabilistic forecasting engine.

The Marginal fill rate graph represents the fraction of extra demand that is captured by each extra unit of stock. In other words, this graph demonstrates what happens to the fill rate as the stock increases. Since we are representing a marginal fill rate here, the total area under the curve remains equal to one. The marginal fill rate distribution can be computed with the fillrate() function.
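As a rough sketch of the arithmetic behind such a curve - using one common definition of the fill rate, which may not match the exact semantics of fillrate(), and written in Python purely for illustration - the k-th unit of stock captures the fraction P(demand >= k) of the expected demand, and those fractions sum to one:

def marginal_fill_rate(probabilities):
    # probabilities[k] = P(demand = k); returns one fraction per unit of stock.
    mean_demand = sum(k * p for k, p in enumerate(probabilities))
    tail = 1.0  # running value of P(demand >= k)
    marginals = []
    for k, p in enumerate(probabilities):
        if k > 0:
            marginals.append(tail / mean_demand)
        tail -= p
    return marginals

print(marginal_fill_rate([0.1, 0.3, 0.4, 0.2]))  # decreasing values that sum to 1.0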

The Demand with backorders graph is identical to the Future demand graph, except that 8 units have been introduced to represent a backorder. The backorder represents guaranteed demand since these units have already been bought by clients. As a result, when backordered units are introduced, the probability distribution of demand is shifted to the right, the backordered units being guaranteed demand. The shift operator >> is available as part of the algebra of distributions to compute such a transformation over the initial distribution.
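The shift itself is a simple operation; the sketch below, in Python and with toy numbers of our own, mirrors what the >> operator does to a discrete distribution when 8 backordered units are present:

def shift_right(probabilities, backorders):
    # P(demand = k) becomes P(total demand = k + backorders).
    return [0.0] * backorders + list(probabilities)

future_demand = [0.1, 0.3, 0.4, 0.2]          # P(demand = 0..3)
demand_with_backorders = shift_right(future_demand, 8)
print(demand_with_backorders)                 # the probability mass now starts at 8 units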

The Fill rate with backorders graph is also very similar to the original Marginal fill rate graph, but has also been shifted 8 units to the right. Here, the plotted fill rate is only associated with the uncertain demand, hence the shape of the distribution remains the same.

The Margin graph represents the margin component of the economic reward, as computed by the stock reward function taking the Demand with backorders as input. The stock reward can be visualized as a distribution, but this is not a distribution of probabilities: the area under the curve is not equal to one but is instead equal to the total margin which would be captured with unlimited inventory. On the left of the graph, each backordered unit yields the same margin, which is not surprising as there is no uncertainty in capturing the margin given that the units have already been bought.

The Stockout penalty represents the second component of the stock reward function. The shape of the distribution might feel a bit unexpected, but this shape merely reflects that, by construction of the stock reward function, the total area under the curve is zero. Intuitively, starting from a stock level of zero, we have the sum of all the stockout penalties as we are missing all the demand. Then, as we move to the right with higher stock levels we are satisfying more and more demand and thus further reducing the stockout penalties; until there is no penalty left because the entire demand has been satisfied. The stock-out penalty of not serving backorders is represented as greater than the penalty of not serving the demand that follows. Here, we are illustrating the assumption that clients who have already backordered typically have greater service expectations than clients who haven’t yet bought any items.

The Carrying costs graph represents the third and last component of the stock reward function. As there is no upper limit for the carrying costs - it’s always possible to keep one more unit in stock, thus further increasing the carrying costs - the distribution is divergent: it tends to negative infinity on the right. The total area under the curve is negative infinity, although this is a rather theoretical perspective. On the left, the carrying costs associated with the backordered units are zero: indeed, as those units have already been bought by clients, they won’t incur any carrying costs, since those units will be shipped to clients as soon as possible.

The final stock reward - not represented above - would be obtained by summing the three components of the stock reward function. The resulting distribution would be interpreted as the ROI for each extra unit of stock to be acquired. This distribution typically starts with positive values, the first units of stock being profitable, but converges to negative infinity as we move to higher stock levels, given the unbounded carrying costs.
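As a very rough illustration of how the three components combine - with made-up per-unit values that do not reproduce the exact construction described above - the reward of each extra unit of stock is the sum of its margin, its stockout component and its carrying cost, and stock remains worth acquiring only while that sum stays positive:

margin        = [5.0, 5.0, 4.0, 2.5, 1.0, 0.3]        # decreasing marginal margin
stockout      = [3.0, 3.0, 1.5, 0.8, 0.2, 0.1]        # stockout component (toy values)
carrying_cost = [-0.5, -0.7, -1.0, -1.5, -2.5, -4.0]  # grows ever more negative

stock_reward = [m + s + c for m, s, c in zip(margin, stockout, carrying_cost)]
profitable_units = sum(1 for r in stock_reward if r > 0)
print(stock_reward, profitable_units)  # the reward turns negative after a few units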

The term support classically refers to the demand levels associated with non-zero probabilities. In the graphs above, the term support is used loosely to refer to the entire range that needs to be processed as non-zero values by Envision. In particular, it’s worth mentioning that there are multiple calculations that require the distribution support to be extended in order to make sure that the final resulting distribution isn’t truncated.

  • The shift operation, which happens when backorders are present, requires the support to be increased by the number of backordered units.
  • The margin and carrying cost components of the stock reward function have no theoretical limits on the right, and can require arbitrarily large extensions of the support.
  • Ordering constraints, such as MOQs, may require having inventory levels that are even greater than the ones reached by the shifted distributions. Properly assessing the tail of the distribution is key for estimating whether the MOQ can be profitably satisfied or not.

In practice, the Envision runtime takes care of automatically adjusting the support to make sure that distributions aren’t truncated during the calculations.

Tags: insights, inventory optimization

The Supply Chain Scientist

Published on by Joannes Vermorel.

Artificial intelligence has been making steady progress over the last few decades. However, while self-driving cars might be just around the corner, we are still decades away from having software smart enough to devise a supply chain strategy. Yet, at the same time, it would be incorrect to conclude that supply chain as a whole is still decades away from being positively impacted by machine learning algorithms.

Lokad’s supply chain science competency was born out of the observation that while algorithms alone were insufficient, they actually became formidable enablers in the hands of capable supply chain experts. Machine learning offers the possibility to achieve unprecedented levels of supply chain performance by taking care of all the extensive but otherwise clerical micro-decisions that your supply chain requires: when to order a product, when to move a unit of stock, when to produce more items, etc.

The Supply Chain Scientist is a mix between a data scientist and a supply chain expert. This person is responsible for the proper data preparation and the proper quantitative modelling of your supply chain. Indeed, it takes human supply chain insights to realize that some relevant data may be missing from a project and to align the optimization parameters with the supply chain strategy of the company.

Too often, supply chain initiatives come with fragmented responsibilities:

  • Data preparation is owned by the IT team
  • Statistics and reporting are owned by the BI (business intelligence) team
  • Supply chain execution is owned by the supply chain team

The traditional S&OP answer to this issue is the creation of collective ownership through monthly meetings between many stakeholders, ideally having the whole thing owned by the CEO. However, while we are certainly not opposed to the principle of collective ownership, our experience indicates that things tend to move forward rather slowly when it comes to traditional S&OP.

In contrast to the collective ownership established through scheduled meetings, the Supply Chain Scientist holds the vital role of taking on the end-to-end ownership of all the quantitative aspects of a supply chain initiative.

This focused ownership is critical in order to avoid the all-too-common pitfalls associated with traditional supply chain organizations, which are:

  • Data is incorrectly extracted and prepared, primarily because the IT team has limited insights in relation to the use of the data.
  • Statistics and reporting misrepresent the business; they provide less-than-useful insights and suffer from less-than-perfect data inputs.
  • Execution relies heavily on ad-hoc Excel sheets in order to try to mitigate the two problems described above, while creating an entire category of new problems.

When we begin a quantitative supply chain initiative with a client company, we start by making sure that a Supply Chain Scientist is available to execute the initiative.

Learn more about supply chain scientists

Tags: insights, supply chain

Fashion demand forecasting

Published on by Joannes Vermorel.

Forecasting is hard. Forecasting the future of fashion is insanely hard. As a result, for the most part, the fashion industry still relies on crude methods such as Open-To-Buy which are nothing but glorified top-down moving averages. Yet, most supply chain practitioners would argue that as long as there isn’t something that can actually beat Open-To-Buy in the real world, then Open-To-Buy isn’t outdated, no matter how crude the method might be. In fact, until recently, our own observations were aligned with what fashion companies were telling us: nothing really works for fashion, and guesswork remains the best option among all the other, even less satisfactory, alternatives.

Our probabilistic forecasting engine, released last year, became a game changer for fashion. After years of struggling with fashion demand patterns, we finally have a forecasting engine that is natively architected for the specific challenges of the fashion sector. Over the last couple of months, we have been driving the supply chains of multiple fashion companies, and well, it actually works! Considering the track record of forecasting vendors in the fashion industry, the odds weren’t really in our favor.

Demand in fashion is typically driven by novelty, and new products come together through collections. Collections are essential from the fashion perspective; yet, at the same time, they represent a massive forecasting challenge.

The demand needs to be forecast for products that haven’t been sold yet.

Fashion isn’t about products that haven’t been sold for long, fashion is about products that haven’t been sold at all. This perspective is a fundamental mismatch with the time-series forecasting approach that represents the foundation of nearly all forecasting systems - not Lokad though. Indeed, from a time-series perspective, in the case of fashion, time-series have zero historical depth, hence there is nothing to rely on for the purpose of forecasting.

Lokad’s probabilistic forecasting engine takes a completely different stance: it actively leverages the different product attributes: brand, style, color, fabric, size, price point, category, etc, in order to build a demand forecast based on the performance of similar products in the previous collections.
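As a deliberately naive sketch of this attribute-driven stance - in Python, with invented attributes and numbers, and bearing no relation to the actual machine learning algorithms involved - the demand of a brand-new product can be estimated by weighting previous-collection products according to how many attributes they share with it:

past_products = [
    {"style": "dress", "fabric": "wool", "price_band": "mid",  "weekly_sales": 12.0},
    {"style": "dress", "fabric": "silk", "price_band": "high", "weekly_sales": 4.0},
    {"style": "coat",  "fabric": "wool", "price_band": "mid",  "weekly_sales": 7.0},
]
new_product = {"style": "dress", "fabric": "wool", "price_band": "mid"}

def similarity(a, b):
    # Count of shared attribute values: a crude stand-in for learned similarities.
    return sum(a[key] == b[key] for key in ("style", "fabric", "price_band"))

weights = [similarity(new_product, p) for p in past_products]
estimate = sum(w * p["weekly_sales"] for w, p in zip(weights, past_products)) / sum(weights)
print(estimate)  # 9.0 units per week, driven mostly by the most similar past products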

One of the things that Lokad’s forecasting engine does not do is to require products to be manually paired between collections. First, establishing those pairs is very complicated and extremely time-consuming. Supply chain practitioners are not supposed to be the slaves of their own systems; if the systems require thousands of products to be manually paired, chances are that this time would be better invested in producing a manual forecast that directly benefits from human insights. Second, in fashion, 1-to-1 mapping between the old and the new collections does not actually make any sense most of the time. New collections are likely to redefine the codes in subtle yet important ways: one product may become many, and vice-versa. A methodology that exclusively relies on 1-to-1 pairings is guaranteed to deliver rather naive results about the future collections.

Lokad’s forecasting engine is all about computing all those similarities in a fully automated manner through machine learning algorithms. Artificial Intelligence is now all the rage in the media, but under the surface it boils down to machine learning algorithms that have undergone steady and yet gradual progress over the last 3 decades. Lokad leverages several classes of machine learning algorithms, tailored for supply chain purposes.

In addition, Lokad delivers probabilistic forecasts. Instead of delivering one demand forecast - the median or the mean - that is (nearly) guaranteed to be incorrect, Lokad delivers the probabilities for (nearly) all the demand scenarios. This aspect is of critical importance for the fashion industry because uncertainty is irreducible; a good supply order frequently boils down to a risk analysis.

In fashion, the two main risks are lost opportunities if there is not enough stock, and stock depreciations if the goods have to be sold with a very aggressive discount during the sales period - in order to liquidate the remaining stocks of a collection. Lokad has native capabilities to deal with this specific risk analysis that is so important in fashion.

Intrigued by Lokad’s capabilities for fashion? Don’t hesitate to book a demo call with us.

Tags: forecasting, fashion

Machine learning jobs at Lokad

Published on by Joannes Vermorel.

Machine learning and artificial intelligence have become buzzwords. Given that Lokad has become identified as one of the key European companies that generate real-world decisions driven by machine learning - supply chain decisions actually - we are getting a growing number of applicants.

The good news: we are still hiring!

In this post, we review the three realms of machine learning that exist at Lokad and what you need to do to maximize the odds of getting an interview with us, and ideally be hired afterwards.

Kudos to the applicants who will be able to mention that they have read this blog post during their interview. Smart people are curious people, and if you can’t be bothered doing a bit of research on your future employer, you’re probably not fit for the machine learning industry anyway.

Job 1: Predictive business modelling

Improving the supply chain performance of a company through machine learning takes significant effort. The data needs to be well-prepared. The resolution of the challenge should be fully aligned with the vision and the strategy of the client company. The supply chain teams should be coached to embrace a new and more capable analytical solution. Measurable results should be collected, and one should be prepared to have these results challenged by top management. At Lokad, the data modelling team, or more simply put, the data team, is responsible for tackling those challenges.

For this specific position, we are looking for engineers with a strong analytical mindset who are capable not only of understanding the strengths and the limitations of the machine learning engines that are made available to them, but are also able to implement real-life set-ups that will be integrated into the daily workflows of real-world supply chains. Improvements are real and mistakes are real too. In your interview, it is advised to demonstrate your understanding of the Lokad product as documented on our website. Bonus points if you can outline how Lokad’s technology can be used to address actual supply chain challenges.

Job 2: Crafting the Big Data infrastructure

Machine learning is critically dependent on data. In fact, the more data is available, the better machine learning works. Lokad seeks talented software engineers who can design all the infrastructure that supports the different machine learning bits. The importance of the whole data pipeline is not to be underestimated: a deficient pipeline is one of the primary causes of failure for data-driven initiatives. The infrastructure needs to be not only fast and reliable, but also able to cope with the hefty computing requirements of the machine learning algorithms themselves.

For this role, we are looking for software engineers with a strong taste for complex distributed back-office processing. You should not be afraid of tackling complicated algorithms, such as dealing with a radix tree, and of implementing such algorithms yourself. Ideally, in your interview, you should demonstrate not only your capacity to understand and implement this kind of algorithmic processing, but also to deliver code that can be maintained and that is fit for production.

Job 3: Hardcore machine learning science

Most modern machine learning algorithms are complicated not only from a statistical perspective, but also from a purely algorithmic perspective. Lokad seeks talented mathematicians who are willing to acquire the software development skills it takes to implement those “hardcore” machine learning algorithms. We have developed our own suite of algorithms which are specifically designed for supply chain needs. Do not expect to plug in an open source machine learning toolkit and move on: our clients are critically dependent on algorithms that have been designed to accommodate specific supply chain challenges.

For this position, we are looking for mathematicians or software developers with a strong propensity for numerical analysis and optimization, who have the ambition to deal with stunningly difficult problems. You should not be afraid of rolling out your own class of algorithms which may be somewhat unlike what is considered to be “mainstream” machine learning. Ideally, in your interview, you should be able to demonstrate why Lokad requires alternative approaches and maybe even shed some personal insights on the case.

More about Careers at Lokad.

Tags: hiring, machine learning

The test of supply chain performance

Published on by Joannes Vermorel.

Answering these 12 questions tells more about your supply chain performance than nearly all the benchmarks and audits that the market has to offer. This test should take about 5 minutes of your time.

  1. Can supply chain operate without Excel?
  2. Is ABC analysis regarded as obsolete?
  3. Is all relevant data documented by the supply chain teams?
  4. Do you record historical stock levels?
  5. Do supply chain teams monitor the quality of their data?
  6. Do you forecast lead times?
  7. Do you prevent any manual intervention on forecasts?
  8. Do you manage operational constraints, e.g. MOQs, as data?
  9. Do you quantify the cost of supply chain failures?
  10. Can decision-making systems be left unattended for a week?
  11. Can you recompute all decisions in one hour?
  12. Are all decisions prioritized against each other?

If your company isn’t answering yes to at least 10 of those questions, then, a supply chain initiative has the potential to deliver a sizeable ROI. If you don't have 8 positive answers, ROI can be expected to be massive. If your company does not reach 6 positive answers, then, in our book, supply chain optimization hasn't even started yet.

Read more about the Lokad test of supply chain performance.

Tags: supply chain, insights

2017, year of quantitative supply chain

Published on by Joannes Vermorel.

Thanks to the probabilistic forecasting engine that we released last year, our capacity to optimize supply chains has dramatically improved over the last couple of months. Through our growing experience, we have come to realize that there are 5 principles that drive the success of the supply chain initiatives undertaken by Lokad:

  1. All possible futures must be considered; a probability for each possibility.
  2. All feasible decisions must be considered; an economic score for each possibility.
  3. There are no absolutes, only relative costs and opportunities.
  4. Being in control requires automation of every mundane task.
  5. Data requires more effort and brings more returns than you expect.

We decided to name this approach: quantitative supply chain.

You can also read our full Quantitative Supply Chain Manifesto

The quantitative supply chain approach acknowledges the near-infinite computing resources that are available to supply chain practitioners today. It also puts the management back in control of their own supply chain by freeing up teams from unending manual entries required by traditional solutions.

For 2017, we wish you incredible supply chain performance!

Tags: history, insights

Markdown tile and Summary tile

Published on by Joannes Vermorel.

The dashboards produced by Lokad are composite: they are built of tiles that can be rearranged as you see fit. We have many different tiles available: linechart, barchart, piechart, table, histogram, etc. This tile approach offers great flexibility when it comes to crafting a dashboard that contains the exact figures your company needs. Recently, we have introduced two extra tiles in order to help fine-tune your dashboards even further.


The Summary tile offers a compact approach for displaying KPIs (key performance indicators). While it was already possible to use the Table tile for a similar purpose, this approach required one tile for every KPI. As a result, dashboards containing a dozen or more KPIs were needlessly large. In contrast, the Summary tile offers a more practical way of gathering a couple of key figures in one place. As usual, the real challenge is not to present thousands of numbers to the supply chain practitioner - that part is easy - but rather to present the 10 numbers that are worth reading - and that part is hard. The Summary tile happens to be the best tile for gathering those 10 numbers.

The Markdown tile offers the possibility to display simply formatted text in the dashboard. As the name suggests, the text gets formatted using the Markdown syntax, which is rather straightforward. One of the most urgent needs addressed by the Markdown tile is the possibility to embed detailed legends within dashboards. Indeed, when composing complex tables, such as suggested purchase quantities, it is important to make sure there is no remaining ambiguity concerning the semantics of each table column. The Markdown tile represents a practical way of delivering contextual documentation and making sure that no numbers get misinterpreted. It also provides an opportunity to document the intent behind the numbers, which is too frequently lost amid technicalities: the documentation can outline why a number is shown on the dashboard in the first place.

Tags: envision, release

Preparing enterprise data takes 6 months

Published on by Joannes Vermorel.

How long does it take to get started with Lokad? Answering this question is tough because often our answer is about 3 to 6 months. Hell, 6 months! How can your software be so clunky that it can take up to 6 months to get started? Well, our typical set-up phases can be broken down as follows:

  • 90 to 180 days: preparing the data
  • 3 to 30 days: configuring Lokad

This shows that Lokad’s setup is actually lightweight. Yes, there is room for improvement, but if we consider that Lokad offers programmatic capabilities which completely fit the business drivers, the process can already be considered as lean.

The biggest challenge, which makes pretty much everything else seem insignificant, is data preparation. Preparing data is the art of producing numbers that make sense out of raw data obtained from the company systems.

It is tempting to underestimate the amount of effort that needs to be invested upfront in order to deliver numbers that make sense. In fact, data preparation is too often reduced to a simple data cleaning operation as if the challenge could simply be addressed by filtering the few parts in the data that happen to be incorrect (such as the negative stock levels).

Yet, the true challenge lies in uncovering and documenting the precise semantics of the data. When we begin a project, we consider ourselves lucky if we have about one line of documentation per field for every database table that is made available to us. By the end of the project, we have about one page worth of documentation per field.

If data preparation takes 6 months, why not just postpone using Lokad for 6 months, so that all the data is ready just before Lokad starts working on it?

Establishing a data-driven company culture takes years. If your company does not already have a team of data scientists working for it, not much will happen for the next 6 months as far as your supply chain project is concerned. Hence, after 6 months of waiting, your company will still be stuck with another 6 months of data preparation. One of the core know-hows we have developed at Lokad consists precisely in uncovering all the subtle “gotchas” that may backfire against supply chain initiatives.

Lokad can be a key driver for change in your company’s supply chain. Don’t hesitate to contact us, we are happy to discuss these matters with you in more detail.

Tags: insights, supply chain

Probabilistic promotions forecasting

Published on by Joannes Vermorel.

Forecasting promotions is notoriously difficult. It involves data challenges, process challenges and optimization challenges. As promotions are present everywhere in the retail sector, they have been a long-term concern for Lokad.

However, while nearly every single retailer has its share of promotions, and while nearly every forecasting vendor claims to provide full support for handling promotions, the reality is that nearly all forecasting solutions out there are far from being satisfying in this regard. Worse still, our experience indicates that most such solutions actually achieve poorer results, as far as forecasting accuracy is concerned, than the naive approach which consists of simply ignoring promotions altogether.

What makes promotions so challenging is the degree of uncertainty that is routinely observed when working with promotions. From the classic forecasting perspective, which only considers the mean or median future demand, this extra uncertainty is very damaging to the forecasting process. In fact, the numerical outputs of such forecasting solutions are so unreliable that they do not provide any reasonable option for using their figures to optimize the supply chain.

Yet, at Lokad, over the years, we have become quite good at dealing with uncertain futures. In particular, with our 4th generation probabilistic forecasting engine, we now have the technology that is completely geared towards the precise quantification of very uncertain situations. The probabilistic viewpoint does not make the uncertainty go away; however, instead of dismissing the case entirely, it provides a precise quantitative analysis of the extent of this uncertainty.

Our probabilistic forecasting engine has recently been upgraded to be able to natively support promotions. When promotional data is provided to Lokad, we expect both past and future promotions to be flagged as such. Past promotions are used to assess the quantitative uplift, as well as to correctly factor in the demand distortions introduced by the promotions themselves. Future promotions are used to anticipate the demand uplift and adjust the forecasts accordingly.
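The sketch below - plain Python with invented numbers, and only a caricature of what the engine actually does - illustrates the role of those flags: past flagged periods yield an uplift estimate, and future flagged periods get their baseline forecast adjusted accordingly:

history = [
    {"units": 100, "promo": False},
    {"units": 110, "promo": False},
    {"units": 240, "promo": True},
    {"units": 95,  "promo": False},
    {"units": 260, "promo": True},
]

baseline  = sum(h["units"] for h in history if not h["promo"]) / sum(1 for h in history if not h["promo"])
promo_avg = sum(h["units"] for h in history if h["promo"]) / sum(1 for h in history if h["promo"])
uplift = promo_avg / baseline                      # about 2.5 with these toy numbers

next_week_promo = True                             # the future period is flagged
forecast_next_week = 105 * (uplift if next_week_promo else 1.0)
print(round(uplift, 2), round(forecast_next_week))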

Unlike most classic forecasting solutions, our forecasting engine does not expect the historical data to be “cleaned” of the promotional spikes in any way. Indeed, no one will ever know for sure what would have happened if a promotion had not taken place.

Finally, regardless of the amount of machine learning and advanced statistical efforts that Lokad is capable of delivering in order to forecast promotions, careful data preparation remains as critical as ever. End-to-end promotion forecasts are fully supported as part of our inventory optimization as a service package.

Tags: forecasting, promotion, insights

Ionic data storage for high scalability in supply chain

Published on by Joannes Vermorel.

Supply chains moved quite early on towards computer-based management systems. Yet, as a result, many large companies have decade-old supply chain systems which tend to be sluggish when it comes to crunching a lot of data. Certainly, tons of Big Data technologies are available nowadays, but companies are treading carefully. Many, if not most, of those Big Data companies are critically dependent on top-notch engineering talent to get their technologies working smoothly; and not all companies succeed, unlike Facebook, in rewriting layers of Big Data technologies to make them work.

Being able to process vast amounts of data has been a long-standing commitment of Lokad. Indeed, optimizing a whole supply chain typically requires hundreds of incremental adjustments. As hypotheses get refined, it’s typically the entire chain of calculations that needs to be re-executed. Getting results that encompass the whole supply chain network in minutes rather than hours lets you complete a project in a few weeks while it would have dragged on for a year otherwise.

And this is why we started our migration towards cloud computing back in 2009. However, merely running on top of a cloud computing platform does not guarantee that vast amounts of data can be processed swiftly. Worse still, while using many machines offers the possibility to process more data, it also tends to make data processing slower, not faster. In fact, delays tend to take place when data is moved around from one machine to the next, and also when machines need to coordinate their work.

As a result, merely throwing more machines at a data processing problem does not reduce any further the data processing time. The algorithms need to be made smarter, and every single machine should be able to do more with no more computing resources.

A few weeks ago, we released a new high-performance column storage format code-named Ionic that is heavily optimized for high-speed concurrent data processing. This format is also geared towards supply chain optimization as it natively supports the storage of distributions of probabilities. And these distributions are critical in order to be able to take advantage of probabilistic forecasts. Ionic is not intended to be used as an exchange format between Lokad and its clients. For data exchange, using a flat text file format, such as CSV, is just fine. The Ionic format is intended to be used as an internal data format to speed up everything that happens within Lokad. Thanks to Ionic, Lokad can now process hundreds of gigabytes worth of input data with relative ease.

In particular, the columnar aspect of the Ionic format ensures that columns can be loaded and processed separately. When addressing supply chain problems, we are routinely facing ERP extractions where tables have over 100 columns, and up to 500 columns for the worst offenders. Ionic delivers a massive performance boost when it comes to dealing with that many columns.
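The benefit of the columnar layout can be illustrated with a small Python sketch - a toy model of the general idea, not of the Ionic format itself: when each field is stored as its own array, a calculation touching 2 columns out of 500 only needs to load those 2 columns:

# Row-oriented layout: every query deserializes entire rows.
row_store = [
    {"sku": "A", "qty": 3, "price": 9.9,  "incoterms": ""},
    {"sku": "B", "qty": 1, "price": 19.5, "incoterms": ""},
]

# Column-oriented layout: one array per field, loadable independently.
column_store = {
    "sku":   ["A", "B"],
    "qty":   [3, 1],
    "price": [9.9, 19.5],
    # ... the hundreds of remaining columns can stay on disk, untouched.
}

total_value = sum(q * p for q, p in zip(column_store["qty"], column_store["price"]))
print(total_value)  # only the 'qty' and 'price' columns were needed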

From Lokad’s perspective, we are increasingly perceiving data processing capabilities as a critical success factor in the implementation of supply chain optimization projects. Longer processing time means that less gets done every single day, which is problematic since ultimately every company operates under tight deadlines.

The Ionic storage format is one more step into our Big Data journey.

Tags: technology, release, supply chain, cloud computing, bigdata

Will compilation save supply chains?

Published on by Joannes Vermorel.

Yes. To a noticeable extent. And I would never have ventured to put forward this opinion when founding Lokad nearly a decade ago.

By compilation I refer to the art of crafting compilers, that is, computer programs that translate source code into another language. Few people outside the ranks of programmers know what a compiler does, and few people within the ranks of programmers know how a compiler is designed. At first, compilation concerns appear distant (to say the least) to supply chain concerns. Yet, nowadays, at Lokad, it’s compilation stuff that keeps saving the day; one supply chain project after another.

Shameless plug: software engineers with compilation skills don’t grow on trees, and we are hiring. Want to work on stuff that matters? Well, the next time your plane is late because a part was missing, or the next time the drug you seek is out of stock, just remember that you could have made a difference by joining Lokad :-)

Supply chains are complex, maddeningly complex. Globalization has multiplied sourcing opportunities, but delays are longer and more erratic than ever. Sales channels are being multiplied too: there are physical stores, online stores, marketplaces, resellers, wholesalers, ... And now, thanks to Amazon, everyone, everywhere expects everything to be ordered and received overnight. Supply chain expectations are higher than ever.

Approaching supply chain problems with anything less than the full expressiveness of a programming language does not work. Just like Lego programming is not going to happen, supply chain challenges won’t fit into checkboxes and dropdowns. This does not prevent software vendors from trying, mind you. Solutions that include more than 1000 tables, each table hovering at around 100 fields on average, are all too common. And while the client company is only using about 1% of the solution’s feature area, they still have to cope with its pervasive complexity.

Compilation saves the day because it provides a huge body of knowledge and know-how when it comes to crafting high-quality abstractions intended as power tools for solving statistical and combinatorial problems (and much more actually). And most supply chain challenges happen to be precisely statistical and combinatorial. For example, at Lokad, by introducing an algebra of distributions, we managed to "crack down" on complicated lead time problems which were resisting our more direct approaches through packaged software features.
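To give a flavor of what such an abstraction buys - in plain Python rather than Envision, with toy numbers, and merely as a sketch of the underlying mathematics - the demand integrated over an uncertain lead time can be obtained by mixing the L-fold convolutions of the daily demand distribution with the lead time probabilities:

def convolve(a, b):
    # Distribution of the sum of two independent discrete variables.
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

daily_demand = [0.6, 0.3, 0.1]     # P(demand per day = 0, 1, 2)
lead_time = {2: 0.7, 3: 0.3}       # P(lead time = 2 days or 3 days)

demand_over_lead_time = {}
for days, p_days in lead_time.items():
    dist = [1.0]
    for _ in range(days):
        dist = convolve(dist, daily_demand)
    for units, p_units in enumerate(dist):
        demand_over_lead_time[units] = demand_over_lead_time.get(units, 0.0) + p_days * p_units

print(demand_over_lead_time)       # a distribution over units; probabilities sum to 1.0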

What makes language features different from, say, the usual app features (wysiwyg), is that language features are much less sensitive to the specificities of a given challenge than their app features counterparts. For example, let’s consider a situation where your stock-out detection logic backfires in the specific case of ultra-seasonal products. If the feature is delivered through a language construct, then you can always narrow down the data scope until the feature works exactly where it’s intended to do so; possibly dynamically adjusting the scope through an ad-hoc numerical analysis. In contrast, with an app feature, you’re stuck with the filtering options that have been built into this feature. App features are a good fit only if your problems are narrow and well-defined, which is actually very unlike supply chain optimization.

In supply chain, programmability shines because:

  • Problems are both highly numerical and very structured
  • Supply chains are modular and this modularity needs to be leveraged
  • The number of variables is significant but not overwhelming
  • Fitting the precise shape of the problems is critical

It is slightly amusing to see how many software vendors tend to gradually re-invent programmability. As the user interface grows in depth and complexity, with the possibility to add filters, options, pre-process or post-process-hooks, templated alerts, KPI monitors, the user interface gradually becomes a programmable thing, and reaches the point where only a programmer can actually make sense of it (precisely thanks to his or her pre-existing programming skills). Programmable yes, but in a highly convoluted way.

Compilation is the art of amplifying engineering skills: one has to craft abstractions and language constructs that streamline thinking the resolution of problems. As Brian Kernighan famously wrote: Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it? The same logic applies to supply chain optimization, because it’s essentially the same thing as writing code. Well, at Lokad, it literally is the same thing.

Conventional IT wisdom states that one should automate the easy parts first, leaving human experts to cope with the more complex elements. Yet, in supply chain, this approach backfires badly every single time. The most complex parts of supply chain are nearly always the most costly ones, the ones that urgently need attention. The easy parts can take care of themselves through min/max inventory or Kanban. Just like you wouldn’t build software for autonomous cars by refining software for automatic train operations, you can’t tackle difficult supply chain problems by refining software initially designed to resolve simple challenges.

Naturally, compilation alone isn’t sufficient to cope with supply chain challenges. Machine learning, big data processing and a sizable amount of human skills are worth mentioning as well. However, in all cases, having carefully crafted high-quality abstractions helps considerably. Machine learning is vastly simpler when input data is well-prepared. Big data processing is also much more straightforward when computations lend themselves easily to a high degree of parallelization.

Tags: hiring, insights, supply chain

Visualizing probabilities with histograms

Published on by Joannes Vermorel.

The future is uncertain, and one of the best mathematical tools we have for coping with this fact is the distribution of probability. Lokad features both a probabilistic forecasting engine and an algebra of distributions. These two capabilities get along pretty well when it comes to dealing with complex, erratic and very uncertain supply chain situations. At their core, these capabilities rely enormously on processing distributions of probabilities. Yet, until recently, Lokad was lacking convenient ways to visualize these distributions.

As a result, we have introduced a new tile - the histogram - which is specifically intended for visualizing the distributions of probabilities. The histogram can be used to visualize a synthetic distribution:

The above graph has been obtained with the following script:

show histogram "Poisson" with poisson(3)

The uncertainty of the future lead time can be visualized with:

Similarly, an uncertain future demand, integrated over an uncertain future lead time, can be visualized with:

Classic forecasts - where the future is supposed to be known - are comfortable and intuitive. Yet, unfortunately, they happen to be deeply wrong as well, and routinely lead to grim supply chain results. Therefore, out of necessity, we remain stuck with probabilities. Nevertheless, with proper visualization tools, we hope to make these probabilities a bit easier to handle.

Tags: release, features

Hiring Big Data Analyst and Software Engineer

Published on by Joannes Vermorel.

Once again, we are hiring. We are looking for a Software Engineer and a Business Data Analyst.

Software Engineer

You will join a team of talented software engineers in order to further develop our cloud-based data crunching apps. We have infrastructure, data processing, scalability and reliability challenges, and need your help in addressing them.

At Lokad, you will benefit from the coaching of an awesome dev team. You will gain skills in Big Data processing and cloud computing apps. Our codebase is clean, documented and heavily (unit) tested. Our offices are quiet (no open space!), bright, and you can get three monitors.

We are a C#/.NET shop, and you will be developing under Visual Studio, the source code being versioned in Git. Our apps are hosted on Microsoft Azure. In addition, with .NET Core coming later this year, we also anticipate a few strategic migrations towards Linux.

We expect you to have strong software development skills. A taste for low-level high performance computing is a big plus, and a vivid interest in distributed systems is very much appreciated. Contributions to open source projects are also highly regarded.

Big Data Analyst

Your role is to make sure our clients get the most from Lokad. You will address complex supply chain issues and craft quantitative strategies. Your goal is also to keep refining these strategies over time to keep them aligned with the needs of our fast-growing clients.

At Lokad, you will benefit from the extensive training and coaching of our expert team. You will gain skills in Big Data, predictive analysis and overall quantitative optimization for business. You will learn how to achieve measurable business results grounded on scientific analysis of data.

About one quarter of your time is spent interacting with clients in order to better understand their business (mainly over the phone, in English). The rest of your time is spent in what could be akin to advanced Excel-like analytics; except that you're dealing with Big Data and Machine Learning through the use of Lokad's platform.

We expect you to have a keen interest in data and quantitative analysis in general. Good Excel skills are a plus; and having even the most modest programming skills is a bonus too. An engineering background is usually a good fit. We also expect you to be fluent in English as the majority of our clients are located overseas. 2 years or more of professional experience are expected.

About Lokad

To apply, just drop your resume at contact@lokad.com.

Lokad is a software company that specializes in Big Data for commerce. We help merchants and a few other verticals (aerospace, manufacturing) to forecast their inventory and optimize their prices. We are profitable and we are growing fast. We are closing deals in North America, Europe and Asia. The vast majority of our clients are based outside of France. We are located 50m from Place d'Italie in Paris (France).

Tags: hiring

Multicolor line charts

Published on by Joannes Vermorel.

The releases of Lokad are done on Tuesdays, and every Tuesday, we release a few more useful bits. Sometimes we release major components - like our latest probabilistic forecasting engine - but nearly every week comes with a few more features and enhancements. Software development at Lokad is very incremental.

A few weeks ago, we improved our line chart. Previously, it was only possible to specify one color - the primary color - for the line chart, and then, if multiple lines were present, Envision would auto-pick one color for each line. However, with 4 lines or more, our line charts were becoming somewhat unreadable:

Thus, we have improved the syntax of the line chart to offer the possibility to specify a color for each line:

Through this syntax we get the much improved visual:

Tags: release, features

Senior software engineer wanted!

Published on by Joannes Vermorel.

We are hiring again!

You will join a team of talented software engineers in order to further develop our cloud-based data crunching apps. We have infrastructure, data processing, scalability and reliability challenges. We need your help to get those challenges addressed.

At Lokad, you will benefit from the coaching of an awesome dev team. You will gain skills in Big Data processing and cloud computing apps. Our codebase is clean, documented and heavily (unit) tested. Our offices are quiet (no open space!), bright, and you can get three monitors.

We are a C#/.NET shop, and you will be developing under Visual Studio, the source code being versioned in Git. Our apps are hosted on Microsoft Azure. With .NET Core coming this year, we anticipate a few strategic migrations toward Linux.

We expect strong software development skills from you. A taste for low-level high performance computing is a big plus. A vivid interest in distributed systems is very much appreciated. Contributions to open source projects are also highly regarded. We are located 50m from Place d'Italie in Paris (France).

Lokad is a software company that specializes in Big Data for commerce. We help merchants, and a few other verticals (aerospace, manufacturing), to forecast their inventory and to optimize their prices. We are profitable and we are growing fast. We are closing deals in North America, Europe and Asia. The vast majority of our clients are based outside of France.

Lokad is the winner of the 2010 Windows Azure Partner of the Year Award, and was named as one of Europe’s 100 hottest startups by Wired Magazine (09/2012).

To apply, drop your resume to contact@lokad.com.

Tags: hiring