5 February 2017
Hal Varian (Chief Economist at Google) famously predicted that statistician would be the sexy job of the next ten years. Data scientists have been riding along, and getting lots of “air time”, which is all the more interesting since the term “data science” emerged only recently. Some say William S. Cleveland introduced it in 2001, but tracing back the real origins is always tricky. Data science and business intelligence (BI) are often positioned as different disciplines. I’d like to add my two cents. For one thing, I consider the differences less pronounced than many others do. Secondly, given how similar their respective activities and goals are, I want to make a case for how the two can benefit from close collaboration.
Both Data Science and Business Intelligence revolve around deriving business insights from data. Not much difference there. Yet data scientists seem to be considered much more “sexy”, for some reason. I have always felt that many of the goals and responsibilities are very similar. In fact, I think they are more alike than different. But that’s just me, of course. Let’s have a look at some of the respective skillsets and typical results, and see where that takes us.
My perspective is that the main difference between BI analysts and data scientists is the time horizon they focus on. Whereas BI is usually backward-looking, data science is mostly targeted at forward-looking insights. That may sound like a more fundamental divide than I believe it is. Personally, I see it as a nuance, a slight difference in emphasis. Think about this: for the majority of use cases, analysis of historical data (the focus of BI analysis) is also done to make extrapolations about the future. That may often be by implication, rather than explicitly. Few business people look at historical data to produce an account of the past per se. They are not historians; they use these data to drive the business forward.
Another distinction I often hear is that BI focuses on the “what”, whereas data science aims to uncover the “why” of important business events. Again, I would argue that this positions them as more different than I see them. A substantial part of the “art” of data science is telling a story with data, and that almost always means giving a structured and credible overview of seemingly disparate business facts that everyone can agree with. You sum up an array of agreed-upon observations, and then posit a hypothesis about how or why they are connected. BI analysts tend to leave the explanation to their report readers, whereas data scientists are more inclined to make explicit reference to the causal nature of connections between facts, a reference to “why”, if you like.
BI outputs are mostly consumed at the enterprise level, or at least at the department level. Producing them always involves a fair amount of processing and (re)structuring of data. This is where BI analysts can join forces with data scientists, and where both can improve their effectiveness. Data scientists can nicely piggyback on these efforts. When data scientists do all of those transformations (referred to as ETL) themselves, this leads to two kinds of problems. Firstly, it is inefficient, because they are effectively duplicating work. Secondly, and more importantly, it carries a very real risk of inconsistencies in results. When two people try to do the same transformations, there is a very real possibility that, although they started off with the same raw data, they wind up with slightly different result sets. For this reason, data scientists and BI analysts who tap into the same source data really ought to be joined at the hip!
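To make the consistency risk concrete, here is a minimal sketch in Python. Everything in it is hypothetical and purely illustrative: the order rows, the column layout, and the two function names are my own assumptions, not taken from any real system. It shows two people computing "monthly revenue" from the same raw data, where one transformation removes duplicate rows and the other forgets to.

```python
from collections import defaultdict

# Hypothetical raw order rows: (order_id, month, amount, status)
RAW_ORDERS = [
    (1, "2017-01", 100.0, "paid"),
    (2, "2017-01", 250.0, "paid"),
    (2, "2017-01", 250.0, "paid"),      # duplicate row, e.g. from a re-sent export
    (3, "2017-01", 80.0, "refunded"),
    (4, "2017-02", 120.0, "paid"),
]


def monthly_revenue_bi(rows):
    """Hypothetical BI transform: skip duplicate order ids and refunds."""
    seen = set()
    totals = defaultdict(float)
    for order_id, month, amount, status in rows:
        if order_id in seen or status == "refunded":
            continue
        seen.add(order_id)
        totals[month] += amount
    return dict(totals)


def monthly_revenue_ds(rows):
    """Hypothetical ad hoc transform: skips refunds but does not deduplicate."""
    totals = defaultdict(float)
    for order_id, month, amount, status in rows:
        if status != "refunded":
            totals[month] += amount
    return dict(totals)


bi = monthly_revenue_bi(RAW_ORDERS)
ds = monthly_revenue_ds(RAW_ORDERS)
print("BI report:      ", bi)
print("Data science:   ", ds)
print("Results agree?  ", bi == ds)
```

Both functions look reasonable on their own, yet the January totals differ because of a single duplicated row. Sharing one agreed transformation, rather than writing two, removes this class of discrepancy.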