Building a Better Data Science Team

Over 75% of data science and big data projects fail. There are many reasons behind that failure rate, and one of them is the way many of these teams are organized. A team's organizational structure has a large impact on its ability to succeed. Unfortunately, most teams are organized in a way that almost guarantees failure, or at best mediocre success. There are better structures that allow for success; one such model is the distributed team model.

 

Command and Control

Most teams have a command and control model, where there is a fixed order in which things get done and approved. This model tends to suffer from bottlenecks. Many companies enforce this model on data science practices, which tends to reward ego-driven people who find their way to the top of the hierarchy, or into a bottleneck position where they can slow things down for no good reason. Data science is a shiny object in many companies, and people see it as a path to more money and power, which tends to create issues when it comes to getting real work done. Command and control works well in some areas, but not in data science. Exploring and finding answers is not something that command and control helps teams excel at.

 

Command and control also goes against the nature of most people who gravitate to data science. Those who are great at data science tend to collaborate around experiments, communicate about topics that interest them, and have a natural curiosity. Command and control structures tend to stifle these characteristics. Understanding that command and control is not ideal, many teams swing to the opposite extreme.

 

Flat Orgs

Flat orgs are often the answer that teams move towards when command and control fails. The issue with flat org structures is that when they are implemented to the letter, they tend to create a lot of inefficiencies. A flat org works well when everyone is well compensated and has good self-discipline. Let’s be honest: not everyone is well disciplined and not everyone feels well compensated, which tends to lead to a lack of efficiency.

 

What ends up happening is that a team becomes semi-flat: there is a leadership class, and then there are the workers. It often becomes difficult for people to move up, and their lack of motivation tends to create a lot of inefficiencies. It also creates a model where the best person is not the one who makes the decisions. Data science is a complex space, and the person with the most domain expertise should be driving, but in a semi-flat org that is often not the case. Decisions are often committee-driven or leader-driven, which tends to frustrate the person who actually knows the domain when things don’t go right. This creates further inefficiencies. The other problem is that flat orgs drive out motivation. Since people often feel they are not getting paid enough, they don’t work hard. Since there are no advancements or targets to achieve that will enrich their lives, they tend to work at a comfortable pace, not the pace they need to achieve success.

 

Distributed Orgs

When looking at data science, most people tend to focus on math, coding and domain expertise. I have found this view to be limiting and out of date. The new way to approach things is to look at data science as a mix of business skills, technology skills and science skills. Once you start looking at your team this way, it becomes much clearer how to structure and manage it so that your team is set up for success.

 

Teams do need a mixture of the command and control and flat org methods. This combined method is what I call distributed org structuring. Not everyone is a master of business, technology and science, and not all projects need these in equal amounts. Each project tends to have its own mixture of all three. I play heavily on the business side, and I leave a lot of the technology to other players who are more skilled at the nuances of each project’s needs. The science tends to be a domain that many of us share.

 

When I am done with the business side of the project, I often hand over leadership to someone else on the team to drive the technology part. For example, I build the business case and work with teams around a deep learning project. I will then hand over the coding to one person and the streaming work to another to drive and guide the project. I am still a part of it and still have a voice, but I understand my own limitations. Can I do the coding? Yes. But am I the best person to do it? Not in all cases. So I look at everyone on the team, find who has the skills, and let them drive. Also, we work on many projects. I may work on six at a time, whereas someone who is more focused on technology leadership may only do two.

 

When people come to us for progress updates, it is a team effort to talk about what is going on. To many, this method seems disorganized, but I have found that many projects move faster this way and the ROI is better. We often have to explore many answers to find the best one, and this team approach allows people to speak up and have a sense of ownership. This investment in a project helps us get to the best answer faster.

 

Our efficiency is high because we have the flexibility that command and control lacks and the upward mobility that a flat org lacks. This helps us move fast and stay motivated in ways other teams just can’t.

 

Is this set in stone? No. Just like data science, the team structure will evolve over time. But I have had a lot of success with data science following this method. If this is something you find of interest, feel free to contact me. I help companies make a difference through data science.

More articles by Edward Chenard
