25 June 2017
When I started using computers, they mostly didn’t have terminals yet. The introduction of “screens” for interacting with the operator was a huge breakthrough. Technology hasn’t stood still since then, quite the contrary. Computers and smartphones are ubiquitous, and this has triggered revolutionary innovations. We now live in a data-driven economy. The power of data should not be confined to those with technical skills (typically IT). We are bringing data to end users and providing interfaces that require little or no technical (or data) knowledge. Gartner refers to this latter user group as “Citizen BI users” or “Citizen Data Scientists.”
I have noticed an interesting phenomenon. End-user tools keep getting more user-friendly, and “citizen users” no longer want to rely on IT to have their data needs met. That seems like genuine progress to me. As testimony to that success, I now see citizen data scientists perform very elaborate data manipulation, all within the confines of user-friendly end-user reporting tools. Great! Power to the people. But with that power has come a new risk, and it is one that seems very familiar… Citizen BI users are re-creating the old “Excel hell” (or here) straight inside their own self-service BI tools!
What do I mean by this? Whenever a BI application gets released, invariably the first and most commonly asked question is: “Can I export the results to Excel?” IT has always resented this reflex, claiming that releasing data into uncontrolled environments inevitably leads to (data) chaos: lack of control, lack of consistency, and, perhaps most importantly, poor maintainability for applications that grow organically downstream from the official corporate data source. We call this “Excel hell” because these federated data applications can easily mushroom out of control. In that end game, the very flexibility and versatility of Excel come back to bite you in the bum.
From a bird’s-eye view, you notice that individual contributors are confined by their idiosyncratic boundaries, and within those constraints they do the best they can. But that local optimum need not be the best solution across the value chain. The same data transformation (say, a subquery that is needed for a report) can be done in the reporting tool, or it can be performed upstream and its result passed on to the report. The latter solution, by implication, is more likely to be identified as a requirement and maybe even documented as such. Not to mention the superior use of CPU horsepower when you run the transformation on the server (only once…) rather than inside every reporting tool.
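To make the trade-off concrete, here is a minimal sketch in Python/pandas (the data, function names, and columns are all hypothetical, purely for illustration): two “reports” that each repeat the same aggregation inside themselves, versus computing that shared transformation once upstream and handing the result to every downstream report.

```python
import pandas as pd

# Hypothetical raw data, standing in for the corporate data source.
sales = pd.DataFrame({
    "region": ["EU", "EU", "US", "US"],
    "amount": [100, 150, 200, 250],
})

# Pattern 1: each report duplicates the same "subquery" internally.
def report_totals(raw):
    # repeated work: the same groupby runs in every report
    return raw.groupby("region", as_index=False)["amount"].sum()

def report_shares(raw):
    totals = raw.groupby("region", as_index=False)["amount"].sum()  # repeated again
    return totals.assign(share=totals["amount"] / totals["amount"].sum())

# Pattern 2: compute the shared transformation once, upstream,
# then pass the result to every downstream report.
regional_totals = sales.groupby("region", as_index=False)["amount"].sum()

def report_totals_v2(totals):
    return totals

def report_shares_v2(totals):
    return totals.assign(share=totals["amount"] / totals["amount"].sum())
```

In pattern 2 the shared step runs once, is visible as an explicit artifact (and thus easier to document as a requirement), and every report is guaranteed to see the same numbers.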
My friend Ronald Damhof invented, and has subsequently popularized, a “four quadrant” model in which data needs like these can migrate over time from quadrant IV to quadrant II (see also this article). It is wonderful when self-service enables the discovery of extracted features. But if there is no (convenient!) path to make these enriched data features available to the entire organization, then standardization of context as well as robustness and maintainability are compromised. Absent any realistic input into the corporate release process, users naturally default to loading their (!) data into Excel. There they have control. There they can determine their own fate. They need to do their thing, with or without help from IT.
All things considered, self-service BI solutions may place even higher demands on data governance than the “old” operating model (IT “in control”) ever did. However, the wisdom of the crowds, massively parallel experimentation, and a democratic process to surface the ideal representation of data –and hence business processes– must surely be in the best interest of value creation. Besides, history has shown that once the genie is out of the bottle, there is no way to get it back in… I remain hopeful that careful data governance, collaboration between IT and citizen users, the transition from Agile to DevOps to NoOps, microservices, and containerization can help mitigate some of these side effects of self-service BI.