Lindsey Clark

March 22, 2021

Trusting Data and People from the Top Down

Like many during the pandemic, I’ve seen an uptick in spam and robocalls, typically wanting to extend my car warranty (a genre that has spawned a treasure trove of internet memes sure to entertain on a slow Friday night), sell me various items including Medicare insurance (am I that old yet?), or pursue a short list of other revenue-generating schemes. I received one such call today, and although I recognize this is an entitled, first-world-problem attitude, interrupting my work at 1 PM with a needless call is far up on my list of annoying things. I’ll admit I lost my cool on the ‘recorded line.’ The representative wanted to know if I was interested in selling or refinancing my home, as if those were the only options. When I replied ‘neither,’ the conversation quickly went in the wrong direction. She was sure, per the information she had, that I needed to speak to a loan officer. I told her repeatedly that I didn’t know where she got her information, but it was incorrect. After the whole exchange, it struck me: I didn’t trust her data. It was indeed wrong. But I, the data scientist, didn’t trust the data.
 
I recently came across a 2016 Fast Company article on why executives don’t trust their own analytics. In sum, most executives are investing in data and believe it is crucial to decision making, yet two-thirds of them don’t trust it. That’s an interesting and counterintuitive statistic, and I’m not sure how it translates five years later; I’d be willing to hypothesize that it’s not much different today. Why keep investing in something you don’t trust or use? I’ll admit that as a transformer of data, someone who takes data and makes more information from it in hopes of giving helpful insight, I have been incredulous when people don’t trust it. I feel somewhat ashamed that, on the receiving end today of data I didn’t trust, I felt annoyed at best and attacked at worst. Is that what business users of data and analytics feel like when they don’t know what they’re looking at? And why don’t they trust it? I didn’t trust the data today because I knew it was wrong. And how did I know it was wrong? Did they predict I was looking to refinance when I’m not? As an algorithmist, I know that doesn’t make the data or the model wrong; it just makes it wrong in this instance. Do users sometimes know data is wrong, and how? Is it really just the packaging?
 
This translates, somewhat loosely, to trustful relationships at work. In the past, I’ve seen and been part of a breakdown of trust, both between teams and between executives and the people executing the work. This is a common problem, and a hard one to solve once it’s there. But in my opinion, trust issues can be avoided altogether, or at least attenuated, by leaders setting clear expectations, roles, and responsibilities. If those are in place, expectations are likely to be met and trustful working relationships get nurtured. And that all starts at the chief level and works its way down.
 
Perhaps it’s the same with data. I think we as data scientists should get creative about ways to set responsibilities and expectations around raw data, the transformation and abstraction layers, and prediction/classification outputs. It’s all open to interpretation, and it’s our responsibility to ensure users of our transformations know how to make those interpretations. I don’t think any amount of written documentation builds that kind of data trust on its own; we need to somehow get in front of it. Transparency and explainability in data science are not new concepts, but I think they need better solutions. And that has to be met with reciprocal responsibilities on the business side to create a trustful, and useful, data culture.
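
To make that concrete, here’s a minimal sketch, purely illustrative and with hypothetical names throughout (check_raw_contacts, label_refi_leads, and the lead rule itself are all invented for this example), of one way to encode expectations directly into a pipeline so the interpretation travels with the data instead of living in someone’s head:

import pandas as pd

def check_raw_contacts(df: pd.DataFrame) -> pd.DataFrame:
    """Expectations on the raw layer: required columns exist, IDs are unique."""
    required = {"contact_id", "phone", "home_owner"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"raw contacts missing columns: {missing}")
    if df["contact_id"].duplicated().any():
        raise ValueError("raw contacts contain duplicate contact_id values")
    return df

def label_refi_leads(df: pd.DataFrame) -> pd.DataFrame:
    """Transformation layer: derive a lead flag and record the rule that
    produced it, so downstream users can see how to interpret the column."""
    out = df.copy()
    out["refi_lead"] = out["home_owner"] & out["phone"].notna()
    # pandas .attrs is a simple place to stash human-readable metadata
    out.attrs["refi_lead_rule"] = "home_owner AND phone on file (rule v1)"
    return out

raw = pd.DataFrame({
    "contact_id": [1, 2, 3],
    "phone": ["555-0100", None, "555-0102"],
    "home_owner": [True, True, False],
})
leads = label_refi_leads(check_raw_contacts(raw))
print(leads)
print("interpretation:", leads.attrs["refi_lead_rule"])

Nothing fancy, but the checks fail loudly when the raw data breaks its contract, and the derived column carries a note about what it actually means. That’s the kind of expectation-setting I have in mind: had my caller’s pipeline carried its rule along with its lead flag, someone might have noticed why the data put me on the wrong list.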