I’ve been thinking a lot lately about what it takes to make a successful data science team, or what that even means. There are a ton of blogs, papers, and books on this topic, but the answer still seems elusive. In my mind, a successful data science/AI program is one that is highly engaged with the business, has developed, scaled, and deployed a portfolio of algorithms (>=2 that are not regression, I said what I said) that are not just actively used but championed by business leaders, and has a pipeline of business problems up for research and possible development. Too often, I see data science teams that struggle to find projects, are misunderstood by the business, and struggle to get even a simple regression out the door and into a basic deployment with logging, re-training schedules, basic QA, performance monitoring, and beyond. With everything we have today, why is this so often so hard?
I’m entering my 8th year as a professional data scientist. Early in my career, I received the feedback that my leadership and solutions were too ‘academic.’ I was coached to implement basic, more ‘pragmatic’ solutions first. Looking back, I don’t think this was bad advice, but I’ve learned some things since then, and I’m shifting my opinion. Here’s my three-pronged data science strategy to set up successful data science teams that I’m currently noodling on.
1. Data scientists need data. And they need it rapidly. And they need it clean. They need curated data with nice, thoughtful schemas. They need outdated data deleted rapidly. Even a couple of mediocre data scientists with a carefully curated data warehouse, prod/dev environments, and all of the above are UNSTOPPABLE. I said what I said.
2. Once a data product is out there, it’s out there, and it’s hard to walk it back. This might be a controversial opinion, but in my 8 years, every time I’ve seen a basic statistic or a simple ranking list put out there in lieu of a more complex data science model, it’s been damn near impossible to upgrade that into some sort of neural net or classifier later. Why? Because clients get comfortable with it. The business gets to know it. And then you want to change it to something else? Nope, too risky. So, AI first, yes? And if your business problems don’t need data science solutions, then why do you have a data science team? And if time to market is your worry, data science solutions can be prototyped rapidly. See #1.
3. C-suite leaders need to talk about AI. Talking about data, data literacy, and the use of data needs to become part of the fabric of the organization. There needs to be a comfort and ease to it. Leaders need to talk about it with excitement and promise. We need to talk about the business using graphs, performance metrics, etc. There must be strong data leadership, and the people with ‘Chief’ or ‘VP’ in their titles should be leading those discussions.
I’m beginning to believe organizations that really want successful data science (as defined above) need to have these 3 things in place.
And on the data science side, required business reading for all data scientists:
AI and data science first.