The Data Scientist is Thriving
An interesting article in Forbes argues that there will be no data scientist titles by 2029. A CTO at Demandbase says, "2019 will be the year of the death of the data scientist." Reading either one, it would be easy to come away thinking that the role isn't worth investing in.
Nothing could be further from the truth.
As the world marches toward artificial intelligence, hype gives way to disillusionment in a pattern Gartner calls a "Hype Cycle." The cycle charts, quite predictably, how expectations of any new technology rise and fall over time.
I have my own way of putting it: "Nothing happens while everyone is busy talking about it; it all happens when we STOP talking about it." The fact that data science is under attack as not really a "thing" tells me that we're at the inflection point where things really take off.
What's Really Happening, Then?
Beyond the hype, there is significant automation happening in data science, just as in any hot profession. A relative scarcity of people with strong data science skills, combined with a massive business need, is causing parts of the data science role to become automated. Software vendors commonly sell this automation as "AutoML."
At Databricks, our own MLflow product automates the following:
- Tracking of parameters rather than storing this data in spreadsheets
- Making results reproducible as easily in production as in the data science lab
- Massive parallel search for the best predictive model (AKA hyperparameter tuning)
- Deployment of predictions into the customer-facing environment
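To make the first and third items concrete, here is a minimal sketch of a parallel hyperparameter search that records the parameters and result of every run. It uses only the Python standard library and is not MLflow's API. The names train_and_score and run_search, the parameter grid, and the toy loss function are all hypothetical stand-ins for a real training job.

```python
import itertools
import json
from concurrent.futures import ThreadPoolExecutor


def train_and_score(params):
    """Hypothetical stand-in for model training; returns a validation loss."""
    lr = params["learning_rate"]
    depth = params["max_depth"]
    # Toy loss surface with its minimum at learning_rate=0.1, max_depth=6.
    return (lr - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def run_search(grid, max_workers=4):
    """Evaluate every combination in the grid in parallel and log each run."""
    names = sorted(grid)
    # Expand the grid into one parameter dict per combination.
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    # Score the candidates concurrently; map() preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(train_and_score, candidates))
    # Keep a structured record of every run instead of a spreadsheet.
    runs = [
        {"run_id": i, "params": params, "loss": loss}
        for i, (params, loss) in enumerate(zip(candidates, losses))
    ]
    best = min(runs, key=lambda run: run["loss"])
    return runs, best


grid = {"learning_rate": [0.01, 0.1, 0.5], "max_depth": [3, 6, 9]}
runs, best = run_search(grid)
print(json.dumps(best, sort_keys=True))
```

Because every run is stored with its parameters, the winning configuration can be rerun later with exactly the same settings, which is the reproducibility point in the list above.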
It would be easy to think this kind of automation proves the death of data science, but it doesn't. Instead, it allows a data scientist to focus more energy on other parts of the job, like sourcing and preparing training data, asking more questions of the data, and staying fully locked on the ever-expanding needs of the business. Those are clearly very high-value requirements that will keep data scientists busy for years to come.