The big Nothing of Data

“Data is the new oil” is one of the quotes you see often lately, and basically there is nothing wrong with it. Unfortunately, I often see a strong focus on gathering, generating and storing large amounts of data. And because we are so fond of sticking labels on things, we have a label for this too: we call it ‘data hoarding’.


There might be a good reason for this data hoarding. Because everyone is ‘into’ Big Data, you have to be there too, don’t you? The lack of a clear definition doesn’t help here at all. Just by hoarding huge amounts of data you can actually claim to be into Big Data. If that’s all you need, you can rest assured and stop reading any further.


I use a different definition of Big Data. I don’t use 3, 4, 5 or even more V’s (which apparently have to be used to define Big Data or Analytics). My starting point for discussing Big Data is its purpose. And there is just one purpose (challenge me, please): to generate insights. Insights are the cornerstones of decision making (other cornerstones might be blindness, intuition, emotions, and so on).

Everyone knows that the best decision making is based on facts, and this is where data can play its role. But just throwing huge amounts of data at a decision maker probably doesn’t do the trick. The data needs to be prepared, sliced, diced and analyzed before it can do its magic. This is where data analytics comes into play.
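To make that “prepare, slice and dice” step a bit more concrete, here is a minimal sketch using pandas on a made-up table of sales events; the column names and numbers are purely illustrative:

```python
# Hypothetical example: turning raw sales events into a summary a
# decision maker can actually use. All column names and values are made up.
import pandas as pd

# Raw data as it might arrive from an operational system
raw = pd.DataFrame({
    "region":  ["North", "North", "South", "South", "South"],
    "product": ["A", "B", "A", "A", "B"],
    "revenue": [120.0, 80.0, 200.0, 150.0, 60.0],
})

# 'Slice and dice': aggregate revenue per region and product
summary = raw.groupby(["region", "product"], as_index=False)["revenue"].sum()

# The decision maker sees a small, interpretable table, not the raw feed
print(summary.sort_values("revenue", ascending=False))
```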


So should we just throw large amounts of data at the data analytical person (the data scientist)? He or she will probably like it, but whether this leads to results remains the question. It then depends all too much on serendipity, and serendipity just doesn’t cross your doorstep frequently enough. So this process of data analytics (or data science) needs to be guided. I must admit that allowing data scientists to simply have their way with data has its merits too: innovation is sparked by creativity, and creativity is sparked by degrees of freedom (in my opinion at least).


I just learned from a Coursera course on Data Science that the question of what to analyze is even more important than the data itself. This question is the guiding mechanism for data science (call it structured data science).
So there is a holy matrimony in obtaining valuable insights between:
1. The decision maker and his question;
2. The Data Scientist and his data skills;
3. The Data and its uncaptured potential; and
4. The Subject Matter Expert and his knowledge of the meaning and use of the data.

And then there is this thing called IoT (the Internet of Things). I don’t want to go into much depth on IoT, but basically it does two things (see the sketch after this list):
1. It generates events which trigger a predefined action (mostly of another device).
2. It generates data, more data and even more data.
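
A toy sketch of those two roles, assuming a hypothetical temperature sensor; the device name, threshold and “switch on fan” action are all made up:

```python
# Toy illustration of the two roles of an IoT device: triggering a
# predefined action on an event, and accumulating measurement data.
# The device name, threshold and 'switch_on_fan' action are invented.
from datetime import datetime

readings = []  # the ever-growing pile of data the article warns about

def switch_on_fan():
    # Placeholder for the 'predefined action' (often on another device)
    print("Fan switched on")

def handle_measurement(sensor_id: str, temperature_c: float) -> None:
    # Role 1: the event triggers a predefined action
    if temperature_c > 30.0:
        switch_on_fan()
    # Role 2: the same event also produces data, and more data...
    readings.append({
        "sensor": sensor_id,
        "temperature_c": temperature_c,
        "timestamp": datetime.utcnow().isoformat(),
    })

handle_measurement("sensor-42", 31.5)
print(len(readings), "readings stored")
```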

Data generated by IoT is often described as providing the factual foundation for decision making. This can only be true if the data is processed by proper data science, guided or structured by a proper demand for insights.
There are several angles to this demand for insights, each accompanied by its own form of data analysis (a rough sketch of the first three follows the list):

1. Searching for relationships within a population that are as yet unknown. This is the basic level of data analysis and is called Exploratory Analysis.
2. Trying to say something about a larger population by looking at a smaller sample group. This is called Inferential Analysis.
3. Trying to predict outcomes for objects based on the analysis of data for other objects. This is called Predictive Analysis.
4. Investigating the effect of changing one variable on another variable. This is called Causal Analysis.
5. Investigating how exactly changes in variables lead to changes in other variables for individual objects. This is called Mechanistic Analysis.
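
To make the first three levels a bit more tangible, here is a rough sketch on synthetic data; the variables (hours trained, performance) and all numbers are invented for illustration only:

```python
# Minimal sketches of exploratory, inferential and predictive analysis
# on synthetic data. Everything here (sample sizes, the linear
# relationship between the variables) is invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
hours_trained = rng.uniform(0, 10, size=200)
performance = 2.0 * hours_trained + rng.normal(0, 1.5, size=200)

# 1. Exploratory: look for a relationship you did not know was there
print("correlation:", np.corrcoef(hours_trained, performance)[0, 1])

# 2. Inferential: say something about the population from a small sample
sample = rng.choice(performance, size=30, replace=False)
mean = sample.mean()
sem = sample.std(ddof=1) / np.sqrt(len(sample))
print("population mean estimate: %.2f +/- %.2f (approx. 95%%)" % (mean, 1.96 * sem))

# 3. Predictive: fit on observed objects, predict outcomes for new ones
slope, intercept = np.polyfit(hours_trained, performance, deg=1)
print("predicted performance at 7 hours:", slope * 7 + intercept)
```

Causal and mechanistic analysis typically require designed experiments or detailed models of the underlying system, which is part of why they sit deeper in the list.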

The more precise the question, the deeper the analysis has to go. And the deeper the analysis goes, the broader and deeper the capabilities of the data scientist have to be.

In conclusion, I have to say that undertaking a Big Data path requires more than just hoarding data. It requires the deliberate and careful build-up of analytical capabilities, the definition of the business questions that guide the development of insights, and the involvement of subject matter experts who are capable of putting everything into perspective.

Without these, Big Data is nothing!

Fully agree! More data collected = more RANDOM correlations possible. Therefore: the more data you have, the more careful your analysis technique has to be. The current reality is the opposite. The best example is context-sensitive advertising. Yesterday my wife sent me an email with the word "gold" in the title, referring to the trade name of a natural stone worktop material. This morning, a professional website that I regularly visit is swamped with adverts for jewellery, art auction houses, etc. It's not the first time this has happened: I communicate with my solicitor regarding a property purchase, and the next day law firm adverts fill my screen (who needs two lawyers?). I complain about the late delivery of an order, and the big adverts of this particular supplier annoy me further the next morning. Who is willing to pay Google a fee for such low-quality analysis?
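
That point about random correlations is easy to demonstrate with a small simulation: generate enough completely unrelated variables and some pairs will correlate strongly by pure chance. All parameters below are arbitrary, sketch only:

```python
# Simulation of the point above: with enough unrelated variables, some
# pairs correlate strongly by pure chance. All parameters are arbitrary.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n_observations, n_variables = 50, 100
data = rng.normal(size=(n_observations, n_variables))  # pure noise

spurious = 0
for i, j in combinations(range(n_variables), 2):
    r = np.corrcoef(data[:, i], data[:, j])[0, 1]
    if abs(r) > 0.3:
        spurious += 1

print(spurious, "of", n_variables * (n_variables - 1) // 2,
      "pairs of unrelated variables correlate with |r| > 0.3")
```

Running this typically reports over a hundred “significant-looking” pairs, none of which mean anything.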
