The test of supply chain performance

Published on by Joannes Vermorel.

Answering these 12 questions tells more about your supply chain performance than nearly all the benchmarks and audits that the market has to offer. This test should take about 5 minutes of your time.

  1. Can supply chain operate without Excel?
  2. Is ABC analysis regarded as obsolete?
  3. Is all relevant data documented by the supply chain teams?
  4. Do you record historical stock levels?
  5. Do supply chain teams monitor the quality of their data?
  6. Do you forecast lead times?
  7. Do you prevent any manual intervention on forecasts?
  8. Do you manage operational constraints, e.g. MOQs, as data?
  9. Do you quantify the cost of supply chain failures?
  10. Can decision-making systems be left unattended for a week?
  11. Can you recompute all decisions in one hour?
  12. Are all decisions prioritized against each other?

If your company isn’t answering yes to at least 10 of those questions, then a supply chain initiative has the potential to deliver a sizeable ROI. If you don't have 8 positive answers, the ROI can be expected to be massive. If your company does not reach 6 positive answers, then, in our book, supply chain optimization hasn't even started yet.

Read more about the Lokad test of supply chain performance.

Tags: supply chain insights

2017, year of quantitative supply chain

Published on by Joannes Vermorel.

Thanks to the probabilistic forecasting engine that we released last year, our capacity to optimize supply chains has dramatically improved over the last couple of months. Through our growing experience, we have come to realize that there are 5 principles that drive the success of the supply chain initiatives undertaken by Lokad:

  1. All possible futures must be considered; a probability for each possibility.
  2. All feasible decisions must be considered; an economic score for each possibility.
  3. There are no absolutes, only relative costs and opportunities.
  4. Being in control requires automation of every mundane task.
  5. Data requires more effort and brings more returns than you expect.

We decided to name this approach: quantitative supply chain.

You can also read our full Quantitative Supply Chain Manifesto.

The quantitative supply chain approach acknowledges the near-infinite computing resources that are available to supply chain practitioners today. It also puts the management back in control of their own supply chain by freeing up teams from unending manual entries required by traditional solutions.

For 2017, we wish you incredible supply chain performance!

Tags: history insights

Markdown tile and Summary tile

Published on by Joannes Vermorel.

The dashboards produced by Lokad are composite: they are built of tiles that can be rearranged as you see fit. We have many different tiles available: linechart, barchart, piechart, table, histogram, etc. This tile approach offers great flexibility when it comes to crafting a dashboard that contains the exact figures your company needs. Recently, we have introduced two extra tiles in order to help fine-tune your dashboards even further.

The Summary tile offers a compact approach for displaying KPIs (key performance indicators). While it was already possible to use the Table tile for a similar purpose, that approach required one tile for every KPI. As a result, dashboards containing a dozen or more KPIs were needlessly large. In contrast, the Summary tile offers a more practical way of gathering a couple of key figures in one place. As usual, the real challenge is not to present thousands of numbers to the supply chain practitioner - that part is easy - but rather to present the 10 numbers that are worth reading - and that part is hard; the Summary tile happens to be the best tile for gathering those 10 numbers.

The Markdown tile offers the possibility to display simply formatted text in the dashboard. As the name suggests, the text gets formatted using the Markdown syntax, which is rather straightforward. One of the most urgent needs addressed by the Markdown tile is the possibility to embed detailed legends within dashboards. Indeed, when composing complex tables, such as suggested purchase quantities, it is important to make sure there is no remaining ambiguity concerning the semantics of each table column. The Markdown tile represents a practical way of delivering contextual documentation and making sure that no numbers get misinterpreted. It also provides an opportunity to document the intent behind the numbers, which is too frequently lost amid technicalities: the documentation can outline why a number is shown on the dashboard in the first place.

Tags: envision release

Preparing enterprise data takes 6 months

Published on by Joannes Vermorel.

How long does it take to get started with Lokad? Answering this question is tough because often our answer is about 3 to 6 months. Hell, 6 months! How can your software be so clunky that it can take up to 6 months to get started? Well, our typical set-up phases can be broken down as follows:

  • 90 to 180 days: preparing the data
  • 3 to 30 days: configuring Lokad

This shows that Lokad’s setup is actually lightweight. Yes, there is room for improvement, but if we consider that Lokad offers programmatic capabilities which completely fit the business drivers, the process can already be considered as lean.

The biggest challenge, which makes pretty much everything else seem insignificant, is data preparation. Preparing data is the art of producing numbers that make sense out of raw data obtained from the company systems.

It is tempting to underestimate the amount of effort that needs to be invested upfront in order to deliver numbers that make sense. In fact, data preparation is too often reduced to a simple data cleaning operation as if the challenge could simply be addressed by filtering the few parts in the data that happen to be incorrect (such as the negative stock levels).

Yet, the true challenge lies in uncovering and documenting the precise semantics of the data. When we begin a project, we consider ourselves lucky if we have about one line of documentation per field for every database table that is made available to us. By the end of the project, we have about one page worth of documentation per field.

If data preparation takes 6 months, why not just postpone using Lokad for 6 months, so that all the data is ready just before Lokad starts working on it?

Establishing a data-driven company culture takes years. If your company does not already have a team of data scientists working for it, not much will happen for the next 6 months as far as your supply chain project is concerned. Hence, after 6 months of waiting, your company will still be stuck with another 6 months of data preparation. One of the core know-hows we have developed at Lokad consists precisely in uncovering all the subtle “gotchas” that may backfire against supply chain initiatives.

Lokad can be a key driver for change in your company’s supply chain. Don’t hesitate to contact us, we are happy to discuss these matters with you in more detail.

Tags: insights supply chain

Probabilistic promotions forecasting

Published on by Joannes Vermorel.

Forecasting promotions is notoriously difficult. It involves data challenges, process challenges and optimization challenges. As promotions are present everywhere in the retail sector, they have been a long-term concern for Lokad.

However, while nearly every single retailer has its share of promotions, and while nearly every forecasting vendor claims to provide full support for handling promotions, the reality is that nearly all forecasting solutions out there are far from satisfactory in this regard. Worse still, our experience indicates that most of these solutions actually achieve poorer results, as far as forecasting accuracy is concerned, than the naive approach which consists of simply ignoring promotions altogether.

What makes promotions so challenging is the degree of uncertainty that is routinely observed when working with them. From the classic forecasting perspective, which only considers the mean or median future demand, this extra uncertainty is very damaging to the forecasting process. In fact, the numerical outputs of such forecasting solutions are so unreliable that they do not provide any reasonable option for using their figures to optimize the supply chain.

Yet, at Lokad, over the years, we have become quite good at dealing with uncertain futures. In particular, with our 4th generation probabilistic forecasting engine, we now have the technology that is completely geared towards the precise quantification of very uncertain situations. The probabilistic viewpoint does not make the uncertainty go away, however, instead of dismissing the case entirely, it provides a precise quantitative analysis of the extent of this uncertainty.

Our probabilistic forecasting engine has recently been upgraded to be able to natively support promotions. When promotional data is provided to Lokad, we expect both past and future promotions to be flagged as such. Past promotions are used to assess the quantitative uplift, as well as to correctly factor in the demand distortions introduced by the promotions themselves. Future promotions are used to anticipate the demand uplift and adjust the forecasts accordingly.

Unlike most classic forecasting solutions, our forecasting engine does not expect the historical data to be “cleaned” of the promotional spikes in any way. Indeed, no one will ever know for sure what would have happened if a promotion had not taken place.

Finally, regardless of the amount of machine learning and advanced statistical efforts that Lokad is capable of delivering in order to forecast promotions, careful data preparation remains as critical as ever. End-to-end promotion forecasts are fully supported as part of our inventory optimization as a service package.

Tags: forecasting promotion insights

Ionic data storage for high scalability in supply chain

Published on by Joannes Vermorel.

Supply chains moved quite early on towards computer-based management systems. Yet, as a result, many large companies are running decades-old supply chain systems which tend to be sluggish when it comes to crunching a lot of data. Certainly, tons of Big Data technologies are available nowadays, but companies are treading carefully. Many, if not most, of those Big Data technologies critically depend on top-notch engineering talent to work smoothly; and not every company succeeds, as Facebook did, in rewriting entire layers of its Big Data stack to make them work.

Being able to process vast amounts of data has been a long-standing commitment of Lokad. Indeed, optimizing a whole supply chain typically requires hundreds of incremental adjustments. As hypotheses get refined, it’s typically the entire chain of calculations that needs to be re-executed. Getting results that encompass the whole supply chain network in minutes rather than hours lets you complete a project in a few weeks while it would have dragged on for a year otherwise.

And this is why we started our migration towards cloud computing back in 2009. However, merely running on top of a cloud computing platform does not guarantee that vast amounts of data can be processed swiftly. Worse still, while using many machines offers the possibility to process more data, it also tends to make data processing slower, not faster. In fact, delays tend to take place when data is moved around from one machine to the next, and also when machines need to coordinate their work.

As a result, merely throwing more machines at a data processing problem does not further reduce the processing time. The algorithms need to be made smarter, and every single machine should be able to do more with no more computing resources.

A few weeks ago, we released a new high-performance column storage format code-named Ionic that is heavily optimized for high-speed concurrent data processing. This format is also geared towards supply chain optimization, as it natively supports the storage of distributions of probabilities. These distributions are critical in order to take advantage of probabilistic forecasts. Ionic is not intended to be used as an exchange format between Lokad and its clients. For data exchange, flat text file formats, such as CSV, are just fine. The Ionic format is intended to be used as an internal data format to speed up everything that happens within Lokad. Thanks to Ionic, Lokad can now process hundreds of gigabytes worth of input data with relative ease.

In particular, the columnar aspect of the Ionic format ensures that columns can be loaded and processed separately. When addressing supply chain problems, we are routinely facing ERP extractions where tables have over 100 columns, and up to 500 columns for the worst offenders. Ionic delivers a massive performance boost when it comes to dealing with that many columns.
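
As a rough illustration of why a columnar layout helps - this is not the actual Ionic format, just a minimal Python sketch with made-up data - consider the difference between storing records row by row and storing one array per column:

```python
# Row-major: each record carries all of its fields. Reading one field
# still forces the whole record to be deserialized.
rows = [
    {"sku": "A", "qty": 4, "price": 9.5},
    {"sku": "B", "qty": 7, "price": 3.2},
    {"sku": "A", "qty": 1, "price": 9.5},
]

# Column-major (columnar): one contiguous array per column. A table
# with 500 columns can expose a single column without touching the
# other 499, which is where the performance boost comes from.
columns = {
    "sku":   [r["sku"] for r in rows],
    "qty":   [r["qty"] for r in rows],
    "price": [r["price"] for r in rows],
}

# Summing the quantities only reads the "qty" array.
total_qty = sum(columns["qty"])
```

The same principle, applied to compressed on-disk storage rather than in-memory lists, is what makes columnar formats fast on very wide ERP extractions.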

From Lokad’s perspective, we are increasingly perceiving data processing capabilities as a critical success factor in the implementation of supply chain optimization projects. Longer processing time means that less gets done every single day, which is problematic since ultimately every company operates under tight deadlines.

The Ionic storage format is one more step into our Big Data journey.

Tags: technology release supply chain cloud computing bigdata

Will compilation save supply chains?

Published on by Joannes Vermorel.

Yes. To a noticeable extent. And I would never have ventured to put forward this opinion when founding Lokad nearly a decade ago.

By compilation I refer to the art of crafting compilers, that is, computer programs that translate source code into another language. Few people outside the ranks of programmers know what a compiler does, and few people within the ranks of programmers know how a compiler is designed. At first, compilation concerns appear distant (to say the least) to supply chain concerns. Yet, nowadays, at Lokad, it’s compilation stuff that keeps saving the day; one supply chain project after another.

Shameless plug: software engineers with compilation skills don’t grow on trees, and we are hiring. Want to work on stuff that matters? Well, the next time your plane is late because a part was missing, or the next time the drug you seek is out of stock, just remember that you could have made a difference by joining Lokad :-)

Supply chains are complex, maddeningly complex. Globalization has multiplied sourcing opportunities, but delays are longer and more erratic than ever. Sales channels are multiplying too: there are physical stores, online stores, marketplaces, resellers, wholesalers, ... And now, thanks to Amazon, everyone, everywhere expects everything to be ordered and received overnight. Supply chain expectations are higher than ever.

Approaching supply chain problems with anything less than the full expressiveness of a programming language does not work. Just like Lego programming is not going to happen, supply chain challenges won’t fit into checkboxes and dropdowns. This does not prevent software vendors from trying, mind you. Solutions that include more than 1000 tables, each table hovering at around 100 fields on average, are all too common. And while the client company is only using about 1% of the solution’s feature area, it still has to cope with its pervasive complexity.

Compilation saves the day because it provides a huge body of knowledge and know-how when it comes to crafting high-quality abstractions intended as power tools for solving statistical and combinatorial problems (and much more actually). And most supply chain challenges happen to be precisely statistical and combinatorial. For example, at Lokad, by introducing an algebra of distributions, we managed to "crack down" on complicated lead time problems which were resisting our more direct approaches through packaged software features.

What makes language features different from, say, the usual app features (wysiwyg) is that language features are much less sensitive to the specificities of a given challenge than their app feature counterparts. For example, let’s consider a situation where your stock-out detection logic backfires in the specific case of ultra-seasonal products. If the feature is delivered through a language construct, then you can always narrow down the data scope until the feature works exactly where it’s intended to, possibly dynamically adjusting the scope through an ad-hoc numerical analysis. In contrast, with an app feature, you’re stuck with the filtering options that have been built into this feature. App features are a good fit only if your problems are narrow and well-defined, which is actually very unlike supply chain optimization.

In supply chain, programmability shines because:

  • Problems are both highly numerical and very structured
  • Supply chains are modular and this modularity needs to be leveraged
  • The number of variables is significant but not overwhelming
  • Fitting the precise shape of the problems is critical

It is slightly amusing to see how many software vendors tend to gradually re-invent programmability. As the user interface grows in depth and complexity, with the possibility to add filters, options, pre-process or post-process-hooks, templated alerts, KPI monitors, the user interface gradually becomes a programmable thing, and reaches the point where only a programmer can actually make sense of it (precisely thanks to his or her pre-existing programming skills). Programmable yes, but in a highly convoluted way.

Compilation is the art of amplifying engineering skills: one has to craft abstractions and language constructs that streamline the thinking that goes into resolving problems. As Brian Kernighan famously wrote: “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?” The same logic applies to supply chain optimization, because it’s essentially the same thing as writing code. Well, at Lokad, it literally is the same thing.

Conventional IT wisdom states that one should automate the easy parts first, leaving human experts to cope with the more complex elements. Yet, in supply chain, this approach backfires badly every single time. The most complex parts of supply chain are nearly always the most costly ones, the ones that urgently need attention. The easy parts can take care of themselves through min/max inventory or Kanban. Just like you wouldn’t build software for autonomous cars by refining software for automatic train operations, you can’t tackle difficult supply chain problems by refining software initially designed to resolve simple challenges.

Naturally, compilation alone isn’t sufficient to cope with supply chain challenges. Machine learning, big data processing and a sizable amount of human skills are worth mentioning as well. However, in all cases, carefully crafted high-quality abstractions help considerably. Machine learning is vastly simpler when input data is well-prepared. Big data processing is also much more straightforward when computations lend themselves easily to a high degree of parallelization.

Tags: hiring insights supply chain

Visualizing probabilities with histograms

Published on by Joannes Vermorel.

The future is uncertain, and one of the best mathematical tools we have for coping with this fact is the distribution of probability. Lokad features both a probabilistic forecasting engine and an algebra of distributions. These two capabilities get along pretty well when it comes to dealing with complex, erratic and very uncertain supply chain situations. At their core, these capabilities rely enormously on processing distributions of probabilities. Yet, until recently, Lokad was lacking convenient ways to visualize these distributions.

As a result, we have introduced a new tile - the histogram - which is specifically intended for visualizing the distributions of probabilities. The histogram can be used to visualize a synthetic distribution:

The above graph has been obtained with the following script:

show histogram "Poisson" with poisson(3)

The uncertainty of the future lead time can be visualized with:

Similarly, an uncertain future demand, integrated over an uncertain future lead time, can be visualized with:

Classic forecasts - where the future is supposed to be known - are comfortable and intuitive. Yet, unfortunately, they happen to be deeply wrong as well, and routinely lead to grim supply chain results. Therefore, out of necessity, we remain stuck with probabilities. Nevertheless, with proper visualization tools, we hope to make these probabilities a bit easier to handle.

Tags: release features

Hiring Big Data Analyst and Software Engineer

Published on by Joannes Vermorel.

Once again, we are hiring. We are looking for a Software Engineer and a Business Data Analyst.

Software Engineer

You will join a team of talented software engineers in order to further develop our cloud-based data crunching apps. We have infrastructure, data processing, scalability and reliability challenges, and need your help in addressing them.

At Lokad, you will benefit from the coaching of an awesome dev team. You will gain skills in Big Data processing and cloud computing apps. Our codebase is clean, documented and heavily (unit) tested. Our offices are quiet (no open space!), bright, and you can get three monitors.

We are a C#/.NET shop, and you will be developing under Visual Studio, the source code being versioned in Git. Our apps are hosted on Microsoft Azure. In addition, with .NET Core coming later this year, we also anticipate a few strategic migrations towards Linux.

We expect you to have strong software development skills. A taste for low-level high performance computing is a big plus, and a vivid interest in distributed systems is very much appreciated. Contributions to open source projects are also highly regarded.

Big Data Analyst

Your role is to make sure our clients get the most from Lokad. You will address complex supply chain issues and craft quantitative strategies. Your goal is also to keep refining these strategies over time to keep them aligned with the needs of our fast-growing clients.

At Lokad, you will benefit from the extensive training and coaching of our expert team. You will gain skills in Big Data, predictive analysis and overall quantitative optimization for business. You will learn how to achieve measurable business results grounded on scientific analysis of data.

About one quarter of your time is spent interacting with clients in order to better understand their business (mainly over the phone, in English). The rest of your time is spent on what could be akin to advanced Excel-like analytics; except that you're dealing with Big Data and Machine Learning through the use of Lokad's platform.

We expect you to have a keen interest in data and quantitative analysis in general. Good Excel skills are a plus; and having even the most modest programming skills is a bonus too. An engineering background is usually a good fit. We also expect you to be fluent in English as the majority of our clients are located overseas. 2 years or more of professional experience are expected.

About Lokad

To apply, just drop your resume at

Lokad is a software company that specializes in Big Data for commerce. We help merchants and a few other verticals (aerospace, manufacturing) to forecast their inventory and optimize their prices. We are profitable and we are growing fast. We are closing deals in North America, Europe and Asia. The vast majority of our clients are based outside of France. We are located 50m from Place d'Italie in Paris (France).

Tags: hiring

Multicolor line charts

Published on by Joannes Vermorel.

The releases of Lokad are done on Tuesdays, and every Tuesday, we release a few more useful bits. Sometimes we release major components - like our latest probabilistic forecasting engine - but nearly every week comes with a few more features and enhancements. Software development at Lokad is very incremental.

A few weeks ago, we improved our line chart. Previously, it was only possible to specify one color - the primary color - for the line chart, and then, if multiple lines were present, Envision auto-picked one color for each line. However, with 4 lines or more, our line charts were becoming somewhat unreadable:

Thus, we have improved the syntax of the line chart to offer the possibility to specify a color for each line:

Through this syntax we get the much improved visual:

Tags: release features

Senior software engineer wanted!

Published on by Joannes Vermorel.

We are hiring again!

You will join a team of talented software engineers in order to further develop our cloud-based data crunching apps. We have infrastructure, data processing, scalability and reliability challenges. We need your help to get those challenges addressed.

At Lokad, you will benefit from the coaching of an awesome dev team. You will gain skills in Big Data processing and cloud computing apps. Our codebase is clean, documented and heavily (unit) tested. Our offices are quiet (no open space!), bright, and you can get three monitors.

We are a C#/.NET shop, and you will be developing under Visual Studio, the source code being versioned in Git. Our apps are hosted on Microsoft Azure. With .NET Core coming this year, we anticipate a few strategic migrations toward Linux.

We expect strong software development skills from you. A taste for low-level high performance computing is a big plus. A vivid interest in distributed systems is very much appreciated. Contributions to open source projects are also highly regarded. We are located 50m from Place d'Italie in Paris (France).

Lokad is a software company that specializes in Big Data for commerce. We help merchants, and a few other verticals (aerospace, manufacturing), to forecast their inventory and to optimize their prices. We are profitable and we are growing fast. We are closing deals in North America, Europe and Asia. The vast majority of our clients are based outside of France.

Lokad is the winner of the 2010 Windows Azure Partner of the Year Award, and was named as one of Europe’s 100 hottest startups by Wired Magazine (09/2012).

To apply, drop your resume to

Tags: hiring

Working with uncertain futures

Published on by Joannes Vermorel.

The future is uncertain. Yet, nearly all predictive supply chain solutions make the opposite assumption: they assume that their forecasts are correct, and hence roll out their simulations based on those forecasts. Implicitly, the future is assumed to be certain and complications ensue.

From a historical perspective, software engineers were not making those assumptions without a reason: a deterministic future was the only option that the early - and not so early - computers could realistically process. Thus, while dealing with an uncertain future was known to be the best approach in theory, in practice it was not even an option.

In addition, a few mathematical tricks were found early in the 20th century in order to circumvent this problem. For example, the classic safety stock analysis assumes that both the lead times and the demand follow a normal distribution pattern. The normal distribution assumption is convenient from a computing viewpoint because all it takes is two variables to model the future: the mean and the variance.
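
For illustration, the classic safety stock calculation can be sketched in a few lines of Python. The demand and lead time figures below are made up, and the formula is the textbook one combining demand variance and lead time variance under the normal assumption:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs: demand and lead time, each summarized by just
# a mean and a standard deviation, as the normal assumption requires.
mean_demand, std_demand = 20.0, 5.0   # units per day
mean_lead, std_lead = 7.0, 2.0        # days
service_level = 0.95

# z-score matching the target service level (~1.645 for 95%).
z = NormalDist().inv_cdf(service_level)

# Standard deviation of demand over the lead time, combining the
# variance of the demand and the variance of the lead time.
std_lt_demand = sqrt(mean_lead * std_demand**2
                     + mean_demand**2 * std_lead**2)

safety_stock = z * std_lt_demand
reorder_point = mean_demand * mean_lead + safety_stock
```

The appeal of this model is obvious - four parameters and one closed-form formula - which is exactly why it survived so long despite its normal distribution assumption being wrong in most supply chain situations.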

Yet again, the normal distribution assumption - both for the lead times and the demand - proved to be incorrect in nearly all but a few situations, and complications ensued.

Back in 2012 at Lokad, we realized that the classic inventory forecasting approach was simply not working: mean or median forecasts were not addressing the right problem. No matter how much technology we poured on the case, it was not going to work satisfyingly.

Thus, we shifted to quantile forecasts, which can be interpreted as forecasting the future with an intended bias. Soon we realized that quantiles were invariably superior to the classic safety stock analysis, if only because quantiles were zooming in on where it really mattered from a supply chain perspective.

However, while going quantile, we realized that we had lost quite a few things in the process. Indeed, unlike classic mean forecasts, quantile forecasts are not additive, so it was not possible to make sense of a sum of those quantiles, for example. In practice, the loss wasn’t too great: since classic forecasts weren’t making much sense in the first place, summing them up wasn’t a reasonable option anyway.

Over the years, while working with quantiles, we realized that so many of the things we took for granted had become a lot more complicated: demand quantities could no longer be summed or subtracted or linearly adjusted. In short, while moving towards an uncertain future, we had lost the tools to operate on this uncertain future.
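
The non-additivity of quantiles is easy to verify numerically. The following sketch (pure standard library, with two hypothetical Poisson demand streams) shows that summing per-stream 90% quantiles does not yield the 90% quantile of the total demand:

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    # Probability of observing exactly k events for a Poisson(lam).
    return exp(-lam) * lam**k / factorial(k)

def quantile(lam, tau, k_max=200):
    # Smallest k such that the Poisson(lam) CDF reaches tau.
    acc = 0.0
    for k in range(k_max):
        acc += poisson_pmf(lam, k)
        if acc >= tau:
            return k
    return k_max

# Two independent demand streams, each Poisson(3); their sum is Poisson(6).
q_single = quantile(3, 0.90)  # 90% quantile of one stream
q_sum    = quantile(6, 0.90)  # 90% quantile of the total demand

# q_single + q_single overshoots q_sum: quantiles do not add up.
```

This is precisely why naively stacking quantile forecasts across SKUs, locations or time buckets overestimates the aggregate, and why an algebra over full distributions is needed instead.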

Back in 2015, we introduced quantile grids. While quantile grids were not exactly the same as our full-fledged probabilistic forecasts just yet, our forecasting engine was already starting to deliver probabilities instead of quantile estimates. Distributions of probabilities are much more expressive than simple quantile estimates, and, it turns out that it is possible to define an algebra over distributions.

While the term algebra might sound technical, it’s not that complicated; it means that simple operations such as the sum, the product or the difference can be defined in ways which are not only mathematically consistent, but also highly relevant from the supply chain perspective.

As a result, just a few weeks ago, we integrated an algebra of distributions right into Envision, our domain-specific language dedicated to commerce optimization. Thanks to this algebra of distributions, it becomes straightforward to carry out seemingly simple operations such as summing two uncertain lead times (say an uncertain production lead time plus an uncertain transport lead time). The sum of those two lead times is carried out through an operation known as a convolution. While the calculation itself is fairly technical, in Envision all it takes is to write A = B +* C, where +* is the convolution operator used to sum up independent random variables (*).
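
For readers curious about what a convolution actually computes, here is a minimal Python sketch. The lead time distributions below are purely hypothetical; each list holds the probability of every possible duration in days:

```python
def convolve(p, q):
    # Distribution of B + C for two independent discrete random
    # variables B and C, each given as a list of probabilities
    # indexed by value (here, days).
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# Hypothetical lead times: production, then transport.
production = [0.0, 0.2, 0.5, 0.3]  # P(1 day)=0.2, P(2)=0.5, P(3)=0.3
transport  = [0.0, 0.6, 0.4]       # P(1 day)=0.6, P(2)=0.4

# Distribution of the total lead time (production + transport).
total = convolve(production, transport)
```

Every probability of the total is obtained by summing over all the ways the two durations can combine, which is exactly what a `B +* C` expression hides behind a single operator.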

Through this algebra of distributions, most of the “intuitive” operations which were possible with classic forecasts are back: random variables can be summed, multiplied, stretched, exponentiated, etc. And while relatively complex calculations are taking place behind the scenes, probabilistic formulas are no more complicated than plain Excel formulas from the Envision perspective.

Instead of wishing for the forecasts to be perfectly accurate, this algebra of distributions lets us embrace uncertain futures: supplier lead times tend to vary, quantities delivered may differ from quantities ordered, customer demand changes, products get returned, inventory may get lost or damaged … Through this algebra of distributions it becomes much more straightforward to model most of those uncertain events with minimal coding efforts.

Under the hood, processing distributions is quite intensive; and once again, we would never have ventured into those territories without a cloud computing platform that handles this type of workload - Microsoft Azure in our case. Nevertheless, computing resources have never been cheaper, and your company’s next $100k purchase order is probably well worth spending a few CPU hours - costing less than $1 and executed in just a few minutes - to make sure that the ordered quantities are sound.

(*) A random variable is a special type of distribution: one whose total mass equals 1. Envision can process distributions of probabilities (aka random variables), but more general distributions as well.

Categories: Tags: insights forecasting envision No Comments

WinZip and 7z file formats now supported

Published on by Joannes Vermorel.

File formats are staggeringly diverse. At Lokad, our ambition is to support all the (reasonable) tabular file formats. We were already supporting CSV (comma-separated values) files with all their variants - which can involve varying separators or varying line returns.

However, tabular files can become very large, and in order to speed up file transfers to Lokad, these files can be compressed. Lossless compression of flat text files works very well, frequently yielding a compression ratio below 10%, i.e. the resulting compressed file weighs less than 10% of the original file.
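The reason is that tabular exports are highly repetitive. A small Python sketch with synthetic (hypothetical) data illustrates the effect:

```python
import gzip

# Synthetic flat file: repetitive tabular text, like most data exports.
rows = "\n".join(f"SKU-{i % 50}\t{i}\t2017-01-01" for i in range(10000))
raw = rows.encode("utf-8")
compressed = gzip.compress(raw)

# Highly repetitive data routinely compresses to a small fraction of
# its original size; the exact ratio depends on the data at hand.
assert len(compressed) < len(raw) // 5
```

Real-world ratios vary with the data, but flat text exports are consistently among the most compressible payloads.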

Then again, compression formats are staggeringly diverse as well. So far, we were only supporting the venerable and ubiquitous GZip - the compression format used to compress web pages for example.

The two formats Zip - popularized by WinZip and its famous .zip file extension - and 7z - one of the most efficient compression formats available on the market - are now supported by Lokad. Both are archive formats, hence a single .zip file can contain many files within the archive. For now, Lokad only supports single-file archives.

This choice makes sense in practice because if the flat file is so large that it requires compression in the first place, producing an even bigger archive gathering multiple large files tends to be impractical. Instead, we suggest using incremental file uploads.

Check out our documentation about how to read files in Envision.

Categories: Tags: release technology No Comments

Forecasting 4.0 with Probabilistic Forecasts

Published on by Joannes Vermorel.

A little over one year ago, we unveiled quantile grids as our 3.0 forecasting technology. More than ever, Lokad remains committed to delivering the best forecasts that technology can produce, and today, our 4th generation of forecasting technology, namely our probabilistic forecasting engine, is live and available in production for all clients. This new engine consists of a complete rewrite of our forecasting technology stack, and addresses many long-standing challenges that we were facing.

True probabilities

The future is uncertain no matter how good the forecasting technology. Back in 2012, when Lokad first ventured into the depths of quantile forecasting, we quickly realized that uncertainty should not be dismissed, as the classic forecasting approach does, but should rather be embraced. Simply put, supply chain costs are concentrated at the statistical extremes: it's the surprisingly high demand that causes stock-outs, and the surprisingly low demand that causes dead inventory. In the middle, the supply chain tends to operate quite smoothly.

With quantile grids, Lokad was delivering a much more fine-grained vision of possible future outcomes. However, as the name suggests, our quantile grids were built on top of our quantile forecasts, multiple layers of quantiles actually. These quantile grids proved to be tremendously useful over the last year, but while our forecasting engine was producing probabilities, internally, nearly all its logic was not working directly with probabilities. The probabilities we computed were a byproduct of a quantile forecasting system.

Because of these quantile roots, our forecasting engine 3.0 had multiple subtle limitations. And while most of these limitations were too subtle to be noticed by clients, they did not go unnoticed by Lokad’s R&D team. Thus, we decided to reboot our entire forecasting technology with a truly native probabilistic forecasting perspective; and this was the start of the forecasting engine 4.0.

Lead time forecasting

Lead times are frequently assumed to be a given. However, while past lead times are known, future lead times can only be estimated. For years, Lokad had underestimated the challenge of accurately approximating future lead times. Lead times are subtle: most statistical patterns, such as seasonality (and the Chinese New Year in particular), which impact the demand, also impact the lead times.

In our forecasting engine 4.0, lead times have become first-class citizens with their own lead time forecasting mode. Lead times now benefit from dedicated built-in forecasting models. Naturally, with our engine being a probabilistic forecasting engine, lead time forecasts are a distribution of probabilities associated with an uncertain time period.

Integrated demand forecasting

Lead times vary, and yet, our forecasting engine 3.0 was stuck with fixed lead times. The classic safety stock analysis assumes that lead times follow a normal distribution, while nearly all the measurements we have ever carried out indicate that varying lead times are clearly not normally distributed. While our experiments routinely showed that a fixed lead time was better than a flawed model, being stuck with static lead times was nevertheless not the satisfying solution we were looking for.

The forecasting engine 4.0 introduces the concept of integrated demand forecasting, with integrated signifying integrated over the lead time. The engine takes a full distribution of lead time probabilities, and produces the corresponding probabilistic demand forecast. In practice, the lead time distribution is also computed by the forecasting engine as seen previously. Integrated demand forecasting finally brings a satisfying answer to the challenge of dealing with varying lead times.
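The mechanics can be sketched in a few lines of Python - assuming, purely for illustration, i.i.d. daily demand and hypothetical distributions: the demand over L days is the L-fold convolution of the daily demand, and these distributions are mixed according to the lead time probabilities:

```python
def convolve(a, b):
    """Distribution of the sum of two independent discrete variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def integrated_demand(lead_time, daily_demand):
    """Mixture of the total-demand distributions over L days,
    weighted by P(lead time = L days)."""
    result = [0.0]
    cumulative = [1.0]  # demand over 0 days: certainly 0 units
    for L, p in enumerate(lead_time):
        if L > 0:
            cumulative = convolve(cumulative, daily_demand)
        if p > 0:
            result += [0.0] * (len(cumulative) - len(result))
            for k, q in enumerate(cumulative):
                result[k] += p * q
    return result

lead_time = [0.0, 0.5, 0.5]     # 1 or 2 days, equally likely (hypothetical)
daily_demand = [0.3, 0.5, 0.2]  # 0, 1 or 2 units per day (hypothetical)
dist = integrated_demand(lead_time, daily_demand)
assert abs(sum(dist) - 1.0) < 1e-9
```

The actual engine does not assume i.i.d. daily demand; this merely shows why the output is again a full probability distribution.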

New products forecasting

Forecasting demand for new products is plain hard. Since, in this case, forecasting obviously cannot rely on the sales history, the forecasting engine has to rely on other data known about the product prior to its launch. Our forecasting engine 3.0 already had a tag framework, precisely geared towards this specific use case. However, tags were unfortunately not carrying as much information as we would have liked, and some accuracy was left on the table.

With 4.0, this specific challenge is revised with the introduction of categories and hierarchies. Categories and hierarchies are more expressive as well as more structured than tags, and convey a lot more information. The forecasting engine 4.0 takes full advantage of this richer data framework to deliver more accurate forecasts, with new-product forecasting being the most acute use case.

Stock-outs and promotions

The intent of the forecasting engine is to forecast future demand. However, our knowledge of past demand is typically imperfect, with only past sales really being known. Sales typically tend to be a reasonable approximation of the demand, but sales come with multiple biases, the most common being stock-outs and promotions. Our engine 3.0 already had a few heuristics to deal with these biases, plus quantile forecasts are intrinsically more robust than (classic) average forecasts. Yet, once again, the situation was not entirely satisfying for us.

The engine 4.0 introduces the notion of biased demand, which can be either censored or inflated. When the demand for a given product on a given day is marked as censored, we are telling the forecasting engine that the demand should have been higher, and that the true demand for that day remains unknown. The engine leverages this information to refine the forecasts, even when the history is full of events which have distorted the demand signal.
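To illustrate why flagging matters, here is a deliberately crude Python sketch with hypothetical data; a real probabilistic engine treats censored observations as lower bounds on demand rather than discarding them, but even this naive treatment shows the direction of the bias:

```python
# (sales, censored) pairs for a hypothetical SKU: censored=True marks
# days with a stock-out, where true demand was at least the recorded sales.
history = [(4, False), (6, False), (0, True), (5, False), (1, True)]

# Naive estimate: treats recorded sales as the true demand.
naive = sum(s for s, _ in history) / len(history)

# Crude illustration: drop censored days instead of treating them as
# lower bounds; stock-out days no longer drag the estimate down.
uncensored = [s for s, c in history if not c]
adjusted = sum(uncensored) / len(uncensored)
assert adjusted > naive
```

Dropping censored days still discards information; the point is only that ignoring censorship systematically underestimates demand.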

Ultra-sparse demand

While quantile forecasts are vastly superior to classic average or median forecasts when it comes to estimating the probabilities of rare events, quantiles begin demonstrating their limits when it comes to estimating super-rare events. For example, our quantile models were struggling to properly deal with items sold only once or twice a year, as well as handling service levels higher than 98%.

Native probabilistic models, as implemented in our engine 4.0, are much better behaved when it comes to ultra-sparse demand and “rare” events in general. These models could have been implemented within a quantile forecasting framework (a probabilistic forecast can be easily turned into a quantile forecast); but our engine 3.0 did not have the infrastructure to support them. So they were implemented into the engine 4.0 instead.

Blended into Envision

Versions 2.0 and 3.0 of our forecasting engine came with a web user interface. At first glance, this seemed easy. However, the user interface was actually dismissing what represents the true challenge of using (any) forecasting engine: keeping complete control of the data transferred into the forecasting engine. Indeed, garbage in, garbage out remains an all too frequent problem.

The engine 4.0 is interfaced from within Envision, our domain-specific language geared towards quantitative optimization for commerce. Calling the forecasting engine takes a series of data arguments provided from an Envision script. This approach requires a bit more upfront effort; however, the productivity benefits kick in rapidly, as soon as adjustments are made to the input data.

The release of our forecasting engine 4.0 is only the first part of a series of important improvements that have been brought to Lokad over the last few weeks. Stay tuned for more.

Categories: Tags: release forecasting No Comments

Autocomplete file paths with Envision

Published on by Joannes Vermorel.

When data scientists work with Envision, our domain-specific language tailored for quantitative optimization for commerce, we want to ensure that they are as productive as possible. Indeed, data scientists don't grow on trees, and when you happen to have one available, you want to make the most of their time.

A data analysis begins by loading input data, which happens to be stored as flat files within Lokad. Therefore, an Envision script always starts with a few statements such as:

read "/sample/Lokad_Items.tsv"
read "/sample/Lokad_Orders.tsv" as Orders
read "/sample/Lokad_PurchaseOrders.tsv" as PurchaseOrders

While the Envision syntax is compact and straightforward, file names may, on the other hand, be fairly complex. From the beginning, our source code editor has shipped with autocompletion; however, until recently, autocompletion was not providing suggestions for file names. A few days ago, the code editor was upgraded, and file names are now suggested as you type.

This feature was part of a larger upgrade which also made the Envision code source editor more responsive and more suitable for dealing with large scripts.

Categories: Tags: envision release features No Comments

Proofs of concept don’t work in quantitative supply chain optimization

Published on by Joannes Vermorel.

Proofs of concept are one of the most frequent requests we get from prospective clients looking to try out our supply chain optimization service. Yet, we frequently decline such requests; first because they hurt the client’s company itself, and second because they also hurt Lokad in the process. Since POCs – or proofs of concept – are so widespread in B2B software, it is usually hard to grasp why they can be downright harmful in the specific case of quantitative supply chain optimization (1). In this post, we gather our findings on POCs, considering them to be a supply chain “anti-pattern”.

POCs do not cost less

One core assumption behind the POC methodology is that POCs cost less than the real thing. Unfortunately, this assumption is nearly always incorrect.

First, establishing a small scope within an entire supply chain network barely moves the needle. In the past, software vendors struggled with scalability problems, and actual full-scale deployments typically required heavy upfront hardware investments, potentially bundled with software licenses such as databases. Without these investments, it was not even possible to start processing data. Yet, in today’s age of cloud computing, this constraint no longer exists, and if an app is designed correctly, nothing extra is required to start processing data. The cloud computing bill will increase only marginally for every additional client, but all in all, this cost is negligible compared to, say, the costs involved in establishing a discussion with the prospect. Second, the bulk of the initial efforts consists of qualifying data; those efforts are rooted in establishing a commercial B2B relationship with the client, and shrinking the scope does little to reduce them.

Worse still, having more data typically makes things easier, not harder, whenever statistical forecasting is involved. Therefore, by restricting the data scope, POCs tend to make things more difficult, and hence more costly, when compared to addressing the full scope of challenges. Our experience indicates that even when POCs focus on only 5% of the entire supply chain network, these 5% typically involve almost the entire complexity of the network as a whole. Actually, it is precisely because POCs embed nearly all the complexity of a full-scale project, that POCs would be expected to make sense in the first place.

Dismissing the complexity is indeed not an option. If your supply network includes container shipments and unreliable suppliers, how could a POC possibly be convincing if these elements are not factored into the initiative? If any specific constraint is ignored, such as MOQs (minimal order quantities), the numerical results end up unusable.

The costs of the POC are driven by the efforts to be invested on both sides, by both Lokad and its client, in managing the full complexity of the supply chain. Those costs are driven by the specificities of the business being considered, with scale having only a marginal impact.

POCs increase the odds of failure

When opting for a POC, companies frequently end up merely trying stuff to improve their supply chain. However, in this specific case, I would like to quote Yoda: “Do. Or do not. There is no try.” Despite the claims of software vendors, optimizing a supply chain is hard. The problem with POCs is that they give too much leeway for all parties to fail.

  • Extracting sales history is hellishly complicated. Alas, there is no alternative anyway: one will never succeed at optimizing supply chain without data representing the demand.
  • Electronic stock levels are inaccurate. Technology can help auto-detect the most obvious deviations, and help prioritize recounts. However, it is not uncommon for supply chain managers to have to deal with phantom inventory too.
  • Forecasts remain poor no matter what. Businesses should learn to embrace an uncertain future, instead of wishing this uncertainty away. Probabilistic forecasts are particularly good at capturing future uncertainty.

Each of these complications is one more excuse to drop the ball.

There are situations where solutions are expected to be easy and uneventful: creating a new email account for a new employee for example. However, optimizing supply chain is nearly always difficult: if the company has been around for more than a few years, the “easy” part of supply chain optimization has already been done years ago. The “difficult” part is what remains.

In our experience, most POCs fail at the initial stages of the project, when teams are still struggling with data issues. Yet, this says nothing about the inventory optimization solution itself, because the solution is never put to the test.

POCs sidetrack supply chain optimization initiatives

POCs emphasize a viewpoint that is not exactly the production viewpoint. Executives seek benchmarks to be made or KPIs to be established. However, what if a certain KPI happens to be more difficult to compute than performing the optimization itself? What if the KPI itself, while instructive, does not offer any tractable options to improve anything?

Our experience indicates that POCs routinely get sidetracked by considerations that are simply non-requirements from a production perspective. Trying to address those requirements typically poisons the POC, because suddenly the POC becomes an even greater challenge than production itself.

Also, as the main point of a POC is to seek reassurance, most POCs suffer from the gold plating anti-pattern, where the client company pressures the vendor into capturing every single aspect of its business, even at the expense of the overall reliability of the solution. The resulting solution is often too brittle to be of any use from a production perspective.

We have seen many POCs fail on “imaginary” problems as well. For example, if the best forecasting model, empirically tested over thousands of SKUs, happens to be non-seasonal and outperforms all other available seasonal models, should this be considered a problem? There is no question whether the business in question is seasonal: it is. But what if the best known way to anticipate future demand is to merely ignore seasonality in this case? Should this be considered a problem? In our experience, this single “problem” has been considered a blocking issue for many POCs, even while supply chain practitioners themselves were admitting that the ultimate purchase order quantities suggested were sound.

Go for production and revise project if needed

POCs are usually and rightfully perceived as distractions by practitioners who need to keep the business running while the next-gen solution is coming. Our experience indicates that going straight for production is cheaper and less risky. However, this should be done with the proper methodology.

First, failing on the “logistics of data” is not an option. You can’t optimize what you don’t measure. If the data is meaningless, then all optimization attempts will be meaningless too. Success is a requirement, since otherwise the company may no longer exist a few years from now. It so happens that the vast majority of the efforts to be invested are associated with this logistics of data, and this investment can be almost fully separated from the solution being considered for production. And this is a good thing! If the optimization solution were for some reason falling short, the investment would not be lost; it would merely need to be redirected to a better alternative solution.

Second, while the goal is to shoot straight for production, it does not mean that numbers go unchallenged, quite the opposite. The old and new process should coexist, picking as many low hanging fruits as possible from the old process (2) while the new one gets polished.

Then, dozens of issues typically arise. It is important to sort them out:

  • problems that were already impacting the old process, albeit in a more silent way. Good processes and good technologies make problems obvious; this is not a defect but a virtue.
  • problems that can’t be fixed by the software being deployed. If the SKU picking is unreliable in the warehouse, don’t expect the demand forecasting module to make it trustworthy.
  • mismatches between real problems and expectations. Statistical forecasting is deeply counter-intuitive; don’t let your expectations override what quantitative measurements tell you.
  • design issues that can’t be solved without significantly redesigning the solution, which usually happens when the software does not have the right angle to tackle the challenge.

The last point requires another solution to be considered. However, as mentioned above, this should not be the end of the initiative, merely the beginning of a collaboration with another vendor.

Abandoning the idea of a POC usually means losing the entire momentum that has been invested in the initiative. Furthermore, most POCs fail for the wrong reasons, which means that the odds of success of future attempts will barely be improved as the real challenges remain mostly untouched.

Going straight for production is actually less risky than it sounds. It helps prevent an entire class of failures that tend to be ignored in the case of POCs, while they should not be. It forces the initiative to adopt a narrow focus on what is actually needed to obtain improvements, and to put wishful thinking aside. When facing a serious vendor failure, a company can still capitalize on its internal momentum and switch to another vendor, without losing said momentum, as usually happens with POCs.

(1) There are many ways to optimize supply chain: better processes, better suppliers, better transporters, better hiring … This post focuses on quantitative optimization: supply chain challenges that can be addressed through statistical forecasts and/or numeric solvers.

(2) Fixing phantom inventory is of benefit to all inventory optimization processes. The same is true for revisiting and improving inventory valuations.

Categories: Tags: insights business No Comments

Q&A about inventory optimization software

Published on by Joannes Vermorel.

Under the supervision of Prof. Dr. Stefan Minner, Leander Zimmermann and Patrick Menzel are writing a thesis at the Technical University of Munich. The goal of this study is to compare inventory optimization software. Lokad did receive their questionnaire, and with the permission of the authors, we are publishing here both their questions and our answers.

1. When did you introduce your optimization software to the market?

Lokad was launched in 2008, but as a pure demand forecasting solution at the time. We started to do end-to-end supply chain optimization in 2012.

2. For which company sizes is your software suitable?

We have clients ranging from 1-man companies to companies over 100,000 employees. However, below 500k€ worth of inventory, the statistical optimization of the supply chain is frequently not worth the effort.

3. For a midsized company of around 50-250 employees and sales of around 10-25 million euros per year, what would be the price of your standard software package?

This would be our Premier package at $2500 / month. However, the package covers a lot more than just software. Pure software is only 1/5th of our fees or so.

The bulk of the fee goes into paying a data scientist at Lokad who manages the account, leveraging our technology stack to deliver the final results. That's what we call inventory optimization as a service.

4. Is your software suitable for different industries? (e.g. pharmacy, metal, perishable goods, …)

Yes, we support diverse verticals from aerospace to fashion with fresh food in the middle. However, our software is primarily a programmatic toolkit tailored for quantitative supply chain optimization. While we do address many verticals, it usually takes a data scientist to craft the finalized solution.

5. What characteristics of your software differentiate you from other optimization software? (Unique selling proposition)

Classic forecasts, and by extension the classic inventory optimization theory, work poorly - surprisingly poorly, even. It took Lokad years to realize that the main challenge - statistically speaking - lies in the extreme cases, and that is what costs money in reality. Lokad delivers probabilistic forecasts. Whenever inventory is involved, probabilistic forecasts are just better than the classic ones.

6. For which computer platforms is your software applicable? (e.g. Microsoft, Apple, Linux, …)

Lokad is a SaaS (webapp) built on top of a cloud computing platform (Microsoft Azure). Our clients are very diverse. However, in supply chain, there are still more IBM Mainframes out there than OSX setups.

However, without a cloud computing platform, it would be very impractical to run the machine learning algorithms that Lokad routinely leverages. Thus, our software is not designed to run on-premises.

7. Does your company provide standardized or personalized software solutions?

Tricky question and subtle answer.

Lokad delivers a packaged platform. We are multi-tenant: all our clients run on the same app. In this respect, we are heavily standardized.

Yet, Lokad delivers a domain-specific language called Envision. Through this language, it's possible to tailor bespoke solutions. In practice, most of our clients benefit from fully personalized solutions.

Lokad has crafted a technology intended to deliver personalized supply chain solutions at a fraction of the costs usually involved with such solutions by boosting the expert's productivity.

8. If it is a standardized software, which features are included in the standard package of your software?

We have over 100 pages worth of documentation. For the sake of concision, they won't be listed here.

9. Are there add-ons available? If yes, which? (e.g. spare parts, …)

We don’t have add-ons in the sense that every single plan - even our free plan - includes all features without restriction.

10. For which stages/levels can your software optimize inventory management? (e.g. factory, warehouse, supplier, …)

We cover pretty much all supply chain stages - warehouses, points of sale, workshops - both for forward and reverse logistics.

11. Is your software solving the problems optimally or heuristically?

Computer Science tells you that nearly every non-trivial numerical optimization problem can only be resolved approximately. Even something as basic as bin packing is already NP-complete, and bin packing is far from being a complex supply chain problem.
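To make the point concrete, here is the classic first-fit decreasing heuristic for bin packing, in Python with hypothetical item sizes; it is fast and provably near-optimal, but not optimal, which is exactly the kind of trade-off real-world solvers make:

```python
def first_fit_decreasing(items, capacity):
    """Classic bin-packing heuristic: place each item, largest first,
    into the first bin that still has room; open a new bin otherwise.
    Near-optimal in practice, but not guaranteed optimal."""
    bins = []
    for item in sorted(items, reverse=True):
        for b in bins:
            if sum(b) + item <= capacity:
                b.append(item)
                break
        else:
            bins.append([item])
    return bins

# Hypothetical shipment: item volumes against a container capacity of 10.
bins = first_fit_decreasing([7, 5, 4, 3, 2, 2, 1], capacity=10)
```

Finding the provably minimal number of bins is NP-complete; a heuristic like this trades a guarantee of optimality for tractability.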

Many vendors - maybe even Lokad (I try hard to resist marketing superlatives) - may claim to have an "optimal" solution, but, at best, this should be considered dolus bonus, aka an acceptable lie, akin to TV ads boasting an unforgettable experience or similar semi-ridiculous claims.

I advise checking my earlier post about the top 10 lies of forecasting vendors. Any vendor seriously claiming to deliver an "optimal" solution - in the mathematical sense - would either be lying or delusional.

12. Which algorithms is your software using? (e.g. Silver-Meal, Wagner-Within, ...)

Both Silver-Meal and Wagner-Within come from the classic perspective where future demand cannot be expressed as arbitrary non-parametric distributions of probabilities. In our book, those algorithms fail at delivering satisfying answers whenever uncertainty is present.

Lokad is using over 100 distinct algorithms, most of them having no known name in the scientific literature. Specialization is king. Most of those algorithms are only new/better in the sense that they provide a superior solution to a very narrow class of problems - as opposed to generic numeric solvers.

13. Where are the limits in terms of input quantities which can be calculated at once? (e.g. size of cargo, different products, period of time, …)

The numerical limits of our technology are typically ridiculously high compared to the actual size of supply chain challenges. For example, no more than 2^32 SKUs can be processed at once. Through cloud computing, we can tap nearly unbounded computing resources.

That being said, unbounded computing resources also imply unbounded computing costs. Thus, while we don’t have hard limits on data inputs or outputs, we pay attention to keep those computing costs under control, adjusting the amount of computing resources to the scale of the business challenge to be addressed.

14. How many variables can be chosen and how many are given? (e.g. degree of service, period of time, Lot size, ...)

Lokad is designed around Envision, a domain-specific programming language dedicated to supply chain optimization. This language offers programmatic capabilities, hence, again, the hard limits are so high that they are irrelevant in practice; for example, our language would not support more than 2^31 variables.

However, dealing with more than 100 heterogeneous variables at once would already be an insanely costly undertaking from a practical perspective: each variable needs to be qualified, fed with proper data, properly adjusted to fit into the bigger model, etc.

15. Does your inventory management support multiple supply chains for one stock?

Yes. There might be multiple sources AND multiple consumers for a given stock. Inventory can be serial too: each unit of stock may have some unique properties influencing the rest of the chain. This situation is commonly found in aerospace for example.

16. If yes, can those supply chains be prioritized/classified? (e.g. ABC/XYZ products)

Yes. However, prioritization is usually more expressive than classification. We strongly discourage our clients from using ABC analysis, because a lot of valuable information gets lost through such a crude classification.

17. Which method of demand forecasting is implemented? (e.g. moving average, exponential smoothing, Winter’s Method, …)

Moving average, exponential smoothing, Holt and/or Winter’s methods, all those methods produce classic forecasts – aka average or median forecasts. Those forecasts invariably work poorly for inventory optimization because they can’t capture a truly stochastic vision of the future. Plus, as a separate concern, they can’t correlate demand patterns between SKUs either.

As the counterpart of the constrained optimization algorithms (detailed above), Lokad also has over 100 algorithms in the field of statistical forecasting. Most of those algorithms have no well-known name in the literature either. Yet, again, specialization is king.

18. How many past periods are considered to calculate the future demand?

The idea that past demand should be represented as periods is mostly wrong. The granularity of the demand is important: 10 clients ordering 1 unit each is not the same thing as 1 client ordering 10 units at once. Our algorithms are typically not based on periods.

Then, in terms of depth of history, our algorithms typically try to leverage all the history available. In practice, it’s rare that looking further than 10 years back yields any gain in future forecasts. So there is no hard limit; it’s just that the past fades into numerical irrelevance.

19. Is the seasonal change in demand included in the forecast? (yes/no)

Yes. However, seasonality is only one of the cyclicities that exist in the demand: day of week and day of the month are also important, and also handled. Then, we have also made recent progress on quasi-seasonality: patterns that don’t exactly fit the Gregorian calendar such as Easter, Chinese New Year, Ramadan, Mother’s day, etc.

20. What kind of performance measures can be analyzed? (e.g. waiting time, ready rate, non-stockout probability, degree of service, …)

As long as you can write a program to express your metric, it should be feasible with Lokad. Yet again, Lokad offers a domain-specific programming language, so we are flexible by design. In the end, there is one metric to rule them all: the dollars of error.

21. Does your software support the implementation of penalty costs? (e.g. cost for “out of stock”, “capacity limits reached”, …)

Yes, it's one special case of the many business drivers that we take into account. Those penalties can take many numerical shapes: linear or not, deterministic or not, etc.

22. Which are your three strongest competitors in your market segment?

Excel, Excel and Excel. Number 4 is pen+paper+guesswork.

23. Do you have a list of companies (mid-size to large-size) using your software?

See our customers page.

Categories: Tags: insights software No Comments

Uploading very large files through the web

Published on by Joannes Vermorel.

The web was not really intended to transfer giganormous files. In order to do that, there are other (older) protocols like FTP (the File Transfer Protocol) or its secure alternative FTPS and SFTP. Lokad was already supporting many file receiving options, including web uploads. However, until today, our web uploads were restricted to files weighing less than 200MB.

Today, we have released a new version of our web upload features, and it's now possible to upload arbitrarily large files through your favorite web browser into Lokad. Our web upload component is smart enough to perform retries, so if your internet connection faces glitches midway, it will not restart the upload from scratch, but resume the transfer.

While uploading a 10GB flat file through your web browser might not be a very practical option when operating in production, it can be very handy to quickly get started with Lokad; especially if you're not too comfortable with FTP clients like FileZilla.

PS: we never pushed any official announcement, but we have been supporting public key authentication for SFTP for a while as well.

Categories: Tags: bigfiles data release No Comments

Joining tables with Envision

Published on by Joannes Vermorel.

When it comes to supply chain optimization, it’s important to accommodate the challenges while minimizing the amount of reality distortion that gets introduced in the process. The tools should embrace the challenge as it stands instead of distorting the challenge to make it fit within the tools.

Two years ago, we introduced Envision, a domain-specific language, precisely intended as a way to accommodate the incredibly diverse range of situations found in supply chain. From day 1, Envision offered a programmatic expressiveness that was a significant step forward compared to traditional supply chain tools. However, this flexibility was still limited by the viewpoint taken by Envision itself on the supply chain data.

A few months ago, we introduced a generic JOIN mechanism in Envision. Envision is no longer limited to natural joins as it was initially, and can now process a much broader range of tabular data. In supply chain, arbitrary table joins are particularly useful for accommodating complex scenarios such as multi-sourcing, one-way compatibilities, multi-channel setups, etc.

For readers who may already be familiar with SQL, joining tables feels like a rather elementary operation; however, in SQL, combining complex numeric calculations with table joins rapidly produces source code that is obscure and verbose. Moreover, joining large tables also raises quite a few performance issues, which need to be carefully addressed either by adjusting the SQL queries themselves, or by adjusting the database itself through the introduction of table indexes.

One of the key design goals for Envision was to give up some of the capabilities of SQL in exchange for a much lower coding overhead when facing supply chain optimization challenges. As a result, the initial Envision was solely based on natural joins, which almost entirely removed the coding overhead associated with JOIN operations as they are usually written in SQL.

Natural joins have their limits however, and we lifted those limits by introducing the left-by syntax within Envision. Through left-by statements, it becomes possible to join arbitrary tables within Envision. Under the hood, Envision takes care of creating optimized indexes to keep the calculations fast even when dealing with giganormous data files.
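To illustrate the kind of problem that arbitrary joins unlock - say, the multi-sourcing scenario mentioned above - here is a plain-Python sketch of the join semantics (not Envision's actual left-by syntax); the data is purely illustrative:

```python
# Illustrative multi-sourcing join in plain Python (not Envision syntax):
# each SKU may have several eligible suppliers, and an explicit join lets
# us attach, to every order line, the cheapest eligible offer.
orders = [{"sku": "A", "qty": 3}, {"sku": "B", "qty": 1}]
offers = [  # supplier price list, possibly several rows per SKU
    {"sku": "A", "supplier": "Contoso", "price": 2.0},
    {"sku": "A", "supplier": "Fabrikam", "price": 1.5},
    {"sku": "B", "supplier": "Contoso", "price": 4.0},
]

def cheapest_offer(sku):
    """Among all offers matching the SKU, keep the cheapest one."""
    candidates = [o for o in offers if o["sku"] == sku]
    return min(candidates, key=lambda o: o["price"])

# Left join: every order line is kept and enriched with its best offer.
joined = [{**line, **cheapest_offer(line["sku"])} for line in orders]
```

A natural join keyed on a single shared identifier cannot express this "pick the best among several matches" relationship; an explicit join with an arbitrary key can.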

From a pure syntax perspective, left-by is a minor addition to the Envision language; from a supply chain perspective, however, this one feature significantly improved the capacity of Lokad to accommodate the most complex situations.

If you don’t have an in-house data scientist who happens to be a supply chain expert too, we do. Lokad provides an end-to-end service where we take care of implementing your supply chain solution.

Categories: Tags: envision technical release No Comments

Insights on the Lokad tech evolution

Published on by Joannes Vermorel.

The technology of Lokad has evolved so much that people who have had the chance to trial Lokad even two years ago would barely recognize the app as it stands today.

The "old" Lokad was completely centered around our forecasting engine - i.e. what you can see as a forecasting project in your Lokad account today. As a result, our forecasting engine gradually gained tons of features not even remotely related to statistics. About two years ago, our forecasting engine had become a jack-of-all-trades responsible for almost everything:

  • data preparation with the possibility to accommodate a large diversity of data formats
  • reporting analytics with a somewhat complex, and somewhat flexible, Excel forecasting report
  • scheduled execution through a webcron integration or through the API

Then, during the last two years, we have gradually introduced stand-alone replacements for those features that now live outside our forecasting engine. However, calling those new features mere replacements is unfair, because those replacements are vastly more powerful than their original counterparts.

  • We can now process very diverse files, varying in size, in complexity and even in data formats. Plus, we have many data connectors too.
  • The capabilities of our old Excel forecasting report are dwarfed by the newer reporting capabilities of Envision.
  • Scheduling and orchestration are now first-class citizens, which also encompass the data retrieval from other apps.

Because those new features are plainly superior to the old ones, we are gradually phasing out the cruft, that is, phasing out all the non-forecasting related things that still live inside our forecasting engine.

In order to keep the process smooth, we are gradually - but actively - migrating all our clients from the old Lokad to the new Lokad; and when an old feature isn't used anymore, we remove it entirely.

The old Excel forecasting report is a tough case for us. The challenge is not to merely duplicate the report itself within Envision (that alone isn't hard at all) - the challenge is that the underlying thinking that went into this report is now fairly outdated. Indeed, over the years, Lokad has introduced better forecasting technologies - the latest iteration being probabilistic forecasts - which cannot be made to fit within this report. By design, this one report is stuck with a legacy approach to forecasting, which unfortunately is not such a good fit as far as inventory optimization is concerned.

In contrast, combining probabilistic forecasts with business drivers does require more effort both on the Lokad side and the client side, but the business results simply don't compare. The former optimizes percentages of error while the latter optimizes dollars of error. Unsurprisingly, once our clients realize how much money they leave on the table by not doing the latter, they never consider going back to the former.

Then, our data integrations are currently undergoing a similar, and no less radical, transformation. When we started developing data connectors, we tried to fit all the data we were retrieving into the framework established by our forecasting engine; that is, producing files such as Lokad_Items.tsv, Lokad_Orders.tsv, etc. This approach was initially appealing because it forced a normalization on the data retrieved and processed by Lokad.

Unfortunately, this abstraction - like all abstractions - is leaky. Apps don't all agree on what exactly a product or an order is; there are tons of subtle differences to account for, and it was simply not possible to accommodate all the business subtleties through some kind of data normalization.

Thus, we have started to take the data integration challenge from another angle: retrieve the app data while preserving as much as possible the original structures and concepts. The main drawback of this approach is that it requires more initial efforts to get results because the data is not transformed upfront to be compatible with all the default expectations of Lokad.

However, because the data doesn't suffer misguided transformations, Lokad no longer gets stuck being unable to accommodate business subtleties that don't fit the framework. With some programmatic glue, we accommodate the business needs down to the minute details.

As with our old Excel report, the transition toward native data - as opposed to normalized data - follows our experience, which indicates that investing a little more in getting the numbers aligned with the business yields a lot more results.

Categories: Tags: insights No Comments