Actionable data > big data
Big data: a brief history
In 2011, McKinsey published “Big data: the next frontier for innovation, competition, and productivity”. Overnight, “big data” became the business buzzword on everyone’s lips.
Companies rushed to collect as much data as possible… though often without a clear strategy for how they were going to use it.
By 2016, articles started appearing with titles like “Big Data is Dead”, and the term’s popularity began to wane. The focus had shifted from data collection to data quality.
In 2018, new regulations like GDPR came into effect in response to a wave of data privacy concerns, sparked by scandals like Cambridge Analytica’s microtargeting of Facebook users.
Large-scale, indiscriminate data collection had become less appealing and more risky.
Since 2020 we’ve shifted to using terms like “data-driven decision making” and “AI-powered market insights”.
In 2024, capturing, storing and processing big data is a given. The conversation now centres on how to effectively extract revenue-generating insights from that data.
How big is big data these days?
Could be one terabyte. Maybe 100 gigabytes, according to Jordan Tigani, one of the founding engineers behind Google BigQuery.
It doesn’t matter. Big data is a means to an end. The haystack, not the needle.
Much of it is noise.
From noise to signal
I’ve been chatting with data scientists Vaidotas Zemlys-Balevičius and Povilas Bockus to find out how much data Euromonitor stores and processes.
Euromonitor's raw SKU database is currently 500 terabytes in size.
However, the market intelligence platforms we build from this SKU data are only a few gigabytes in size.
For example, Passport Innovation, which has tracked the success or failure of brand and sub-brand launches monthly since January 2021, is 5 gigabytes.
Passport Innovation currently tracks 12,500+ new brands and 126,000+ new sub-brands.
In the back-end we have 300,000+ brands and more than a million sub-brands.
How do we spot and track what’s new?
Here's an overview showing the size of data needed at each step of the journey:
Again, big data is a given, but it’s mostly noise.
The trick lies in structuring the data, filtering out anything irrelevant, and amplifying whatever is pertinent to your specific use cases.
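The structure–filter–amplify idea can be sketched in a few lines of code. This is a minimal, hypothetical illustration, not Euromonitor's actual pipeline: the record schema, field names, and cutoff date are all assumptions chosen to mirror the Passport Innovation example above (launches tracked monthly since January 2021).

```python
from datetime import date

# Hypothetical SKU records. The real raw database is ~500 TB;
# three rows are enough to show the principle.
skus = [
    {"brand": "Acme", "sub_brand": "Acme Zero", "first_seen": date(2021, 3, 1), "units": 1200},
    {"brand": "Acme", "sub_brand": "Acme Classic", "first_seen": date(2015, 6, 1), "units": 90000},
    {"brand": "Novo", "sub_brand": "Novo Fresh", "first_seen": date(2022, 1, 15), "units": 300},
]

CUTOFF = date(2021, 1, 1)  # assumed tracking start, per the example above

def new_launches(records, cutoff=CUTOFF):
    """Filter: drop the 'noise' (established lines), keeping only
    sub-brands first seen on or after the cutoff date."""
    return [r for r in records if r["first_seen"] >= cutoff]

def monthly_signal(records):
    """Amplify: aggregate unit sales per (brand, launch month),
    turning raw SKU rows into a compact, queryable signal."""
    totals = {}
    for r in new_launches(records):
        key = (r["brand"], r["first_seen"].strftime("%Y-%m"))
        totals[key] = totals.get(key, 0) + r["units"]
    return totals

print(monthly_signal(skus))
# Acme Classic (launched 2015) is filtered out as noise;
# only the post-2021 launches survive into the summary.
```

The output is orders of magnitude smaller than the input, which is the whole point: the haystack stays in the warehouse, and only the needle reaches the analyst.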
Copyright © Mark Omfalos 2024