Jason Turan

June 29, 2023

Machine Readable Frustration - Part 3

When I started writing this series just over a year ago, I certainly didn't think it would take me this long to finish it. Part 1 was written on June 23rd of last year and covered how healthcare practitioners are structured from a billing perspective; Part 2 was written on February 17th of this year and went a level of complexity deeper, discussing the same topic, but for healthcare facilities and organizations.

Shortly after Part 2 was posted, I decided to hold off on Part 3 until the machine-readable file (MRF) mandate from CMS had been live across the country for a full year, so I could reflect on what's worked, what hasn't, and what might lie ahead. Also, and more importantly, I wanted a fresh perspective on the topic, as I recently decided to take a break from healthcare after 15 years. The following is an inside account of nearly a year of interaction I had with CMS and the developers across the industry who were tasked with navigating this massively complex implementation.



Petabyte Pain

Think of the biggest file you've ever downloaded on your computer. I'm guessing it was in the gigabytes, which is fairly manageable these days. If you're a software engineer, think of the biggest file you've ever had to save in cloud storage. I've seen a few instances where files were in the single-digit terabyte range, but services such as S3 usually put a hard cap of 5 TB on a single object. Anything much larger isn't designed to move efficiently across our global network of fiber optic cables, and if you're in the rarefied air of having a requirement that large, then one of your only options is renting a semi-truck to transport your data – yes, this is a thing. Notice that we're still talking about terabytes, so what about files that are a petabyte (1,000 TB) in size?

This scenario might sound like a fun what-if question asked in a Google interview for a principal cloud architect role, which is fair game. But what if I told you this became an alarming scenario in the world of healthcare for every network originator, health insurer, and third-party administrator (TPA) across the country two years ago?

(TiC)Tok

On June 24th, 2019, President Trump issued executive order 13877, otherwise known as the "Executive Order on Improving Price and Quality Transparency in American Healthcare to Put Patients First". The final rules, as implemented by HHS, were posted on November 12th, 2020, with an effective date of January 11th, 2021. One of these rules required health plans and employers to post their contracted rates for all providers in a machine-readable format and to make those files available by 1/1/2022. This initiative became known simply as the "Transparency in Coverage (TiC) Rule".

For advocates of price transparency, this was a huge win, at least on the surface. I consider myself one of those advocates, and had spent the last eight years working at Healthcare Bluebook building software that exposed these prices and armed patients with the right level of info to "shop" for care on elective procedures and drugs. We saved families from bankruptcy and helped facilitate honest conversations with providers on the cost of their care, and we inserted ourselves in every conversation we could with politicians and news outlets to create awareness on just how wide the cost variance was for the same procedure in the same city, often at two different locations just across the street from each other.

We were always champions of the democratization of this type of data, so we welcomed this regulation on a theoretical level... with caveats. Just like the complex financial engineering of credit default swaps and collateralized debt obligations that led to the Great Recession, the fee-schedule engineering in the world of healthcare had grown to absurd levels of complexity, so deconstructing that info into a single file to share with the public would be quite the Rosetta Stone to solve. Like all unnecessarily complex topics, I'll refer to this eternal quote from the Oracle of Omaha:

Investment must be rational; if you can't understand it, don't do it. It's only when the tide goes out that you learn who's been swimming naked.
- Warren Buffett

Substitute "investment" for "provider contracting" and you get the point.

I ran some quick back-of-the-napkin math based on what I knew about the complexities of provider payment contracts, and the results were comically concerning: best case, files for the major network originators would be in the single-digit terabytes, and worst case they would be in the hundreds of terabytes. To clarify, that's for a single file, and there would be hundreds if not thousands of files from each of these insurers posted on a monthly basis. We had quickly reached petabyte territory.
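To make the scale concrete, here's the shape of that napkin math. Every input below is an illustrative assumption about network size, code counts, and record size – not a figure from any actual plan – but it shows how quickly the multiplication gets out of hand:

```python
# Hypothetical napkin math for one payer's in-network MRFs.
# Every input below is an assumption for illustration, not an actual plan figure.

providers = 1_000_000       # practitioners + facilities in a large national network
billing_codes = 10_000      # CPT/HCPCS/DRG codes carrying a negotiated rate
plan_variations = 500       # distinct plan/network files published by one payer
bytes_per_rate = 200        # rough size of one negotiated-rate JSON entry

single_file_bytes = providers * billing_codes * bytes_per_rate
all_files_bytes = single_file_bytes * plan_variations

TB, PB = 1_000 ** 4, 1_000 ** 5
print(f"one file:  ~{single_file_bytes / TB:.0f} TB")   # ~2 TB, the optimistic case
print(f"all files: ~{all_files_bytes / PB:.1f} PB")     # ~1 PB per monthly publication
```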

On April 30th of last year, I posted a comment on the discussion board of the CMS GitHub repository about a concern I had with the technical implementation of the Transparency in Coverage final rules on machine-readable files (MRFs):

Based on the existing discussions I've read on here, the trend seems to be: 1) most developers of these files will go the route of JSON vs XML or YAML, and 2) for larger plans, these files could be massive. That being said, one common issue with massive JSON files is when a system needs to load a single file as one large transaction because the data is stored as a single JSON object. JSONL alleviates this issue by allowing data to be processed one line at a time. Supporting this format might not be necessary if some of the suggested changes are implemented around Group IDs and/or Service Codes, but it might be worth considering by the community to make these files better optimized for consumption.

ELI5: Dear CMS, the structure of these files is going to cause a massive memory and computing issue when parsing and transforming the data of a single file, so can you make this small tweak to the format to alleviate those resource constraints?
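For the non-ELI5 crowd, the difference comes down to how the two formats are parsed: a single JSON object generally has to be materialized (or very carefully stream-parsed) as one unit, while JSON Lines lets a consumer handle one record at a time with roughly constant memory. A minimal sketch of the idea – the file name and field names below are hypothetical, not the actual MRF schema:

```python
import json

# Newline-delimited JSON (JSONL): each line is an independent JSON record,
# so a consumer can stream the file with roughly constant memory.
def stream_records(path):
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:                    # skip blank lines
                yield json.loads(line)  # parse one record at a time

# Hypothetical usage: count negotiated rates for one billing code without
# ever holding the full file in memory.
matches = sum(
    1
    for record in stream_records("in_network_rates.jsonl")
    if record.get("billing_code") == "99213"
)
print(matches)

# Contrast with a single JSON object, where the entire document must be
# loaded at once:
#   with open("in_network_rates.json") as f:
#       data = json.load(f)   # a multi-terabyte allocation for the largest files
```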

CMS responded a month later acknowledging my concerns, but held firm on the requirement that a file must be one JSON object. Undeterred, I kicked off another discussion on storage estimates, approximating a $330k-per-year cost for a mid-sized health plan to store just one of the file types for 150 employer clients. Other developers quickly chimed in with their own cost estimates, leaving a breadcrumb trail of hard-to-swallow numbers on just how costly this project was becoming. Other topics quickly filled the GitHub discussion board around normalizing the data, using other storage-efficient formats, and where/how a table of contents file could be used as a starting point just to understand how to interpret the files from each health plan.
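For a sense of where a figure like that comes from, here's the general shape of the math. Every input is an assumption I'm making for illustration (a rough S3 standard-tier list price, a hypothetical per-employer file size, monthly publication with a year of versions retained), not the exact model from the GitHub thread:

```python
# Illustrative storage-cost estimate; all assumptions are mine, not the
# figures from the actual GitHub discussion.

employers = 150
file_size_tb = 0.65            # assumed size of one employer's monthly file
versions_retained = 12         # a year of monthly snapshots kept online
s3_usd_per_gb_month = 0.023    # rough S3 standard-tier list price

stored_gb = employers * file_size_tb * versions_retained * 1_000
annual_cost = stored_gb * s3_usd_per_gb_month * 12
print(f"~${annual_cost:,.0f} per year")   # lands in the low-$300k range
```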

I don't envy the job of the CMS developers who had to read and respond to all of our feedback, but I do have great respect for them trying to address the tsunami of requests we were throwing their way. After a period of radio silence – likely because the CMS dev team was waiting to hear from their policy-making counterparts – they started responding to our posts at a rapid pace, implementing what they could. Thankfully, the deadline for the MRFs was extended by six months to 7/1/2022, which let everyone breathe a sigh of relief. By the time the requirement took effect, CMS had implemented a number of changes that normalized the data across multiple files and produced outputs often a tenth of what we originally estimated. However, those files were still massive by consumer-laptop standards, so we held our breath to see how the industry would react.

Open The Floodgates

Remember when products like the Segway and Google Glass were in full-on hype mode in the press prior to release? They were supposed to be life-altering, changing the status quo for families across the country. We all know how that played out: beyond the early adopters, both products fizzled and became examples of how over-hype can obscure the inconvenient truth that many of these products simply don't live up to expectations. I felt this way about the MRFs, grabbed my virtual popcorn, and monitored the headlines.

News outlets were quick to post how the MRFs would usher in a new era of price transparency. Startups bolted into the race to claim that, before the files were even released, they would consume every MRF on the internet and offer a new shopping experience for patients overnight. Healthcare in the US was about to be disrupted.

So what was the reality on MRF D-Day: July 1st, 2022? In my opinion, the deployment of the MRFs was an inconsistent and massively confusing shitshow. Part erratic standards by CMS, part opportunity for health plans to make the process confusing for patients – I once heard a plan executive say, "How can we make this as difficult as possible for people to find?" – the launch was more of a whimper than a bang, and many of those same startups clamoring about disruptive innovation the night before suddenly went silent when they realized the integration mess they were dealing with.

Perhaps the best play-by-play take on the MRF rollout was a real-time blog by Turquoise Health. Their witty posts frequently called out the absurd tactics some of the plans took to make the consumption of these files as difficult as possible. Here's an example of Cigna taking the "we're complying, but we're going to make this a miserable experience for you" approach:

[Screenshot: Cigna's machine-readable file listing]


Signal vs Noise

One year later, what's changed? Plans are still posting files, startups and other vendors are still consuming them, CMS has clarified a few additional technical requirements, and congressional hearings have been held as a postmortem review on the rollout. Most importantly, the media has changed its narrative to an honest tune about the reality of dealing with these files: the data is a mess, and doing anything with it is very hard. Some companies, namely Turquoise Health, have claimed to have integrated and cleaned over 700,000 MRFs from health plans across the country, but their former proclamations of using this data as a "one shopping tool to rule them all" have mostly subsided.

My favorite piece of postmortem content is a recent bipartisan letter from Senators Maggie Hassan (D-NH) and Mike Braun (R-IN). Its content is so good that I think it's worth posting in its entirety:

We write to express our concerns about loopholes that are inhibiting the worthy intent of the Centers for Medicare and Medicaid Services (CMS)-issued Transparency in Coverage (TiC) rule. Americans should not struggle with opaque pricing for health care, and we respectfully ask CMS to update its rule to ensure that there is true health plan transparency and compliance.

In July 2022, CMS issued the TiC rule, requiring health insurance companies to publicly post their in- and out-of-network rates for the health care plans that they offer. We were heartened to see the current and prior administration take important steps toward improving health care price transparency. With this newly available data, employers, researchers, and policymakers can identify unreasonable prices and excessive increases, ultimately helping to lower prices for patients. However, we are concerned that remaining technical loopholes have resulted in insurance companies publishing data that does not align with the intent of the CMS rule.

While some insurance companies are complying with CMS’ rule, others may be relying on gaps to evade accountability. According to reports, insurance companies have provided information in an indecipherable structure, omitted important pricing information, and stuffed the information into files too large for anything but a supercomputer to process. The existing system also makes cross-plan comparisons challenging, as plans are formatting and structuring their data differently. CMS also has not created a central repository where the public can find plan data. As a result, employers and researchers have been unable to use the data to assess the drivers of high health care costs and target solutions.

There are a number of administrative actions CMS can take to improve data accessibility, usability, and quality. Experts have highlighted potential solutions, urging CMS to limit file sizes, create a standardized reporting template, reduce the frequency of reporting, and require a clear organizational system and standardized labeling. These changes would allow the public to use the data more effectively, while simplifying the reporting process for plans. Additionally, CMS should consider pairing these reforms with increased enforcement efforts targeting plans that provide low-quality data, or no data at all. Randomly auditing the quality of plan data would result in better usability, and additional enforcement would help ensure that remaining non-compliant plans follow the law.

We urge CMS to consider these and other expert recommendations so we may continue improving price transparency for Americans and ultimately bring down health care costs. We look forward to working with you on this issue.

An Outsider's Perspective

When I started this series, I didn't think I'd finish it as an outside observer in the cheap seats. There are certainly times I miss working in the industry, but it's been a refreshing and often cathartic journey to acknowledge that most people don't even know about these files, let alone how the MRFs were supposed to peel back the layers of an arcane and asymmetrical practice, thus lowering healthcare costs for all of us.

The data is useful, and incrementally helpful, but it's not as disruptive as people think. Creating a file format based on how fee schedules are calculated is like having a list of ingredients for a five-course meal: you might have everything you need, but that doesn't make you an Iron Chef. What I think is more useful is what the final meal looks like – in healthcare, what the final post-adjudicated claim looks like – versus all the indecipherable inputs that go into it. If anything, I hope the exposure of this fee-schedule complexity in the MRFs shines a harsh, disinfecting light on the mathematical gymnastics plans go through on provider reimbursements. If that eventually results in a simplification of how providers are paid, then I think we all win. Until then, I'll remain... frustrated.

About Jason Turan

Technologist. Occasional writer. Geek culture enthusiast. HealthTech / FinTech data deconstruction specialist.