Deming’s PDCA Cycle & Data Science

Tom Breur

20 August 2017

Dr. W. Edwards Deming (usually just called “Deming”) played a seminal role, from the 1950s onward, in enabling a spectacular industrial revolution that started in Japan and the Far East. He is often considered the godfather of Total Quality Management (TQM), although his influence extends well beyond that. TQM thinking fed into “Just-in-Time” (JIT) manufacturing and later into Lean, which is closely associated with the Toyota Production System (TPS).

One of the cornerstones throughout this evolution has been the continued reliance on the scientific method. For some people this triggers associations with “academic” or “not practical”, but nothing could be further from the truth. Business (!) transformation is guided by strict adherence to facts and hypothesis testing: the so-called Deming PDCA cycle of “Plan-Do-Check-Act.”

For data scientists, the Deming cycle shows up in methodologies like CRISP-DM (Cross Industry Standard Process for Data Mining) and SAS’ SEMMA (Sample, Explore, Modify, Model, Assess). Both do more or less the same thing, and should be considered analytics’ equivalents of Deming’s PDCA cycle.

The last step in this cycle in particular, feeding results from data science models back to the initial business objectives, all too often suffers. This is called “closing the loop”, and it often turns out to be the weakest link in the cycle, either because data analysts run out of bandwidth or because preparing the next campaign gets priority over analyzing the previous one. That is understandable from a commercial perspective, but a pity. Past campaign response (typically purchasing behavior) is the most valid feedback for learning what customers like: they vote with their feet!

When you use a predictive model to optimize campaign response, it behooves the business owner(s) to analyze the results each and every time the model gets (re)used, and to evaluate whether it keeps predicting as expected. How to incorporate such tests was the subject of an old paper of mine on how to evaluate campaign response. Unless you incorporate two test groups in your campaign, you lose the opportunity to explain after the fact whether, and why, response is sliding.
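To make “keeps predicting as expected” a little more concrete, here is a minimal Python sketch of one possible check. It is my illustration for this post, not the method from the paper: it compares the number of responders the model expected with the number actually observed, and expresses the gap as a z-score. The function name `calibration_check`, its parameters, and the default threshold of 2.0 are all hypothetical choices, and the check assumes responses are independent of each other.

```python
import math


def calibration_check(predicted, responded, z_threshold=2.0):
    """Compare expected vs. observed responders for one campaign run.

    predicted: model scores, each a response probability between 0 and 1
    responded: actual outcomes, 1 for a response and 0 otherwise
    z_threshold: hypothetical cut-off for flagging a run for review
    """
    if len(predicted) != len(responded):
        raise ValueError("predicted and responded must have the same length")

    expected = sum(predicted)    # responders the model anticipated
    observed = sum(responded)    # responders that actually showed up
    # Variance of a sum of independent Bernoulli outcomes.
    variance = sum(p * (1.0 - p) for p in predicted)

    z = (observed - expected) / math.sqrt(variance) if variance > 0 else 0.0
    return {
        "expected": expected,
        "observed": observed,
        "z": z,
        "needs_review": abs(z) > z_threshold,
    }
```

Running this after every campaign and keeping the z-scores gives you a simple trend line: a model that is decaying tends to show up as a gap that widens from run to run. Note that this only tells you that response is off, not why; separating model effects from campaign effects still requires the test groups.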

Experience has taught me that product managers never ask for an explanation when response exceeds their expectations. Disappointing business results, however, invariably trigger some “reflection”, but this can only be done scientifically if you built in both test groups.

Predictive models have the unfortunate habit of decaying over time, mostly because of population drift or because the interrelations between variables change. You can imagine a continuum from relatively stable, unchanging variables to ephemeral, transaction-oriented variables. An example of a stable variable would be gender, or (to a lesser extent) socio-economic status. An example of a volatile variable might be the number of transactions in the last week, or the number of days since the most recent call to the contact center. In my experience, the latter (volatile) category of variables tends to produce more accurate predictions. Unfortunately, these variables also generate models that decay more rapidly.
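One common way to put a number on population drift for a single variable is the Population Stability Index (PSI). This is a technique I am bringing in here for illustration; the sketch below is a minimal, assumption-laden version. It uses equal-width bins derived from the baseline sample, and the function name, `n_bins`, and the small `eps` floor for empty bins are hypothetical choices.

```python
import math


def population_stability_index(baseline, current, n_bins=10, eps=1e-6):
    """PSI of one numeric variable between a baseline and a current sample.

    Bins are equal-width and derived from the baseline sample only;
    current values outside that range fall into the first or last bin.
    Both samples must be non-empty.
    """
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 if the baseline is constant.
    width = (hi - lo) / n_bins or 1.0

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        total = len(values)
        # Floor at eps so that empty bins do not cause log(0).
        return [max(c / total, eps) for c in counts]

    base_shares = shares(baseline)
    curr_shares = shares(current)
    return sum(
        (c - b) * math.log(c / b) for b, c in zip(base_shares, curr_shares)
    )
```

A PSI of zero means the two samples fall into the bins in identical proportions; the larger the value, the more the population has shifted. Tracking it per input variable is one way to see whether the volatile variables really are the ones that move first.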

As a result of this dynamic, there is a price to pay for more accurate predictive models: they will likely need to be reviewed and updated more often. Models that leverage lots of volatile variables may predict “better”, but are likely to have a much shorter half-life. No free lunch here.

Outside this “inner cycle” of model updating runs another loop: which variables can you make available at run-time, when models need to be deployed? There is a constant drive to enhance the array of variables that are available for predictions. This “data R&D” work is valuable, albeit not particularly glamorous. Making all those data available at run-time can be extremely CPU-intensive, and horsepower doesn’t come free. This is one more reason to evaluate how your models have done (and continue to do), and to calculate how much ($$$) your efforts have generated. After all, how else do you justify (all) your investments in data science?
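As a back-of-the-envelope illustration of that calculation, here is a small Python sketch. It assumes you kept a holdout control group that did not receive the campaign, so that the difference in response rates can be read as the incremental effect. The function name and all parameters (margin per response, total cost) are hypothetical; real numbers would come from your own finance and campaign data.

```python
def campaign_net_gain(treated_n, treated_responders,
                      control_n, control_responders,
                      margin_per_response, total_cost):
    """Estimate the net monetary gain of one campaign run.

    treated_*: size of, and responders in, the group that got the campaign
    control_*: size of, and responders in, the holdout group that did not
    margin_per_response: assumed average margin earned per response
    total_cost: campaign cost plus the share of data science and
                infrastructure cost you attribute to this run
    """
    treated_rate = treated_responders / treated_n
    control_rate = control_responders / control_n

    # Responders that would not have responded without the campaign.
    incremental_responders = (treated_rate - control_rate) * treated_n
    incremental_margin = incremental_responders * margin_per_response
    return incremental_margin - total_cost
```

With made-up numbers, `campaign_net_gain(10_000, 500, 1_000, 30, 40.0, 5_000.0)` compares a 5% response rate against a 3% baseline: roughly 200 incremental responders, worth about 8,000 in margin, which leaves about 3,000 after costs.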


