‘It’s shocking.’ How inaccurate California death records obscure pandemic’s true story

When California looks back on the COVID-19 pandemic — the most significant health crisis in modern history, with tens of thousands of deaths so far — medical researchers will find some of the most basic details remarkably incomplete.

Overwhelmed public health departments and front-line workers have for months failed to record accurate health histories for COVID-19 victims, a Sacramento Bee review of the state’s internal pandemic death records found.

The records show a Fresno County man in his 60s died of COVID-19 with an otherwise clean medical history. But that’s not necessarily because he didn’t have underlying conditions that contributed to his death; more likely, no one bothered to enter those conditions into a state disease surveillance database.

At the same time, California has a record of a San Diego County man in his 60s who died of COVID-19, along with a history of high blood pressure and diabetes coupled with kidney disease and a chronic lung condition that made breathing difficult on the best of days.

Why does this matter? From city blocks to rural county outposts, each row of data entered into the California Reportable Disease Information Exchange is supposed to provide a medical snapshot of a person who died. The system, better known as CalREDIE, helps health officials track in real time who is most at risk and tell the story of the disease’s dangerous, disastrous spread.

Information from CalREDIE feeds the graphics on county dashboards to inform the public. Gov. Gavin Newsom’s daily updates are based on the reports. Data from the system is among the most heavily weighted in deciding what businesses could be open and which counties should be ordered to shut down.

It ultimately helped inform how vaccines should be doled out.

As the state slowly moves to recover from a global plague that killed 61,000 Californians, the omissions in the database reveal in detail how the state, and the country as a whole, was caught flat-footed by a pandemic that exploited a fragmented and underfunded public health system.

“There’s a lot of places where the information just gets lost,” said Dr. Lee Riley, professor and chair of the Division of Infectious Diseases and Vaccinology at UC Berkeley. “We’re never going to have a complete story on complete facts about this epidemic. It’s a moving target.”

The Bee’s analysis of more than 25,000 deaths reported to the state through December found there are discrepancies among California’s 58 counties in reporting of what’s known as “comorbidities” — the underlying medical conditions that contributed to a person’s poor health or death.

In some places, primarily affluent counties with better-funded health systems and top-of-the-line hospital networks, nearly all of the reported COVID-19 deaths listed comorbidities in CalREDIE.

For instance, about nine out of every 10 people who died in relatively wealthy San Diego County had at least one pre-existing condition on record.

In Fresno County, fewer than one in three did. The same was true in Riverside County.

What wasn’t captured in predominantly lower-income and disadvantaged places might hamstring what we come to know about precisely why the pandemic killed and sickened certain groups of people, said Dr. Rebecca Wurtz, an infectious disease physician and researcher at the University of Minnesota School of Public Health.

“The infrastructure reflects the population,” Wurtz said. “And the gaps in the infrastructure reflect the inequity in the population.”

Gaping holes were well known internally

Hospitals, local health departments and private labs upload COVID-19 test results and other data into CalREDIE. The state’s health department manages the system, which for years has warehoused tranches of information about emerging health threats, from strange new respiratory illnesses to common sexually transmitted diseases.

The program’s motto — “using technology to improve disease surveillance” — remains splashed across the top of an information page. Throughout the pandemic, CalREDIE provided the most robust real-time disease surveillance snapshot.

In an emailed statement, a California Department of Public Health spokesperson said it was up to local health workers to complete the fields.

The state said it is working with local officials “to improve our ability to collect this data by prioritizing the fields that are most informative and providing resources for case investigation and data collection.” That has included additional information on outbreaks, variants and vaccine administration, the spokesperson said.

California health officials last year denied The Bee’s request for death certificate records, saying they “are exempt from Public Records Act request disclosure.” Though incomplete, the CalREDIE records provide a granular glimpse at death information that has largely been out of public view.

The CalREDIE system captured important details about who tested positive for the disease. And for those who died in 2020, nearly all had information in the system about their race, a field that someone was required to complete before finishing a report.

But there are glaring holes in other places where data entry was optional.

San Diego, Los Angeles, San Joaquin, Ventura and Santa Clara all showed more than 78% of deaths had documented underlying health problems. Fresno, Riverside, San Mateo and Alameda showed fewer than four in 10 did. In San Mateo County, one of the wealthiest in the state, so few comorbidities were reported because the county didn’t input them.

Preston Merchant, a spokesman for San Mateo’s health department, said the county’s internal tally shows some 70% of people who died had underlying health problems.

In many cases, even when a patient had no pre-existing conditions listed in CalREDIE, nobody actually marked the box in the state’s surveillance database indicating they had a clear health history.

State officials said such “data entry error” in widespread disease tracking is “common.”

“These fields are not regularly evaluated for data quality,” an official wrote in an email to The Bee.

The omissions go beyond comorbidities. About 72% of deaths as of mid-December lacked information in the disease surveillance database about what the person did for work. The gaps are the latest in a series of failures in understanding the link between workplaces and COVID-19 outbreaks.

And nearly two-thirds of the cases reviewed by The Bee lacked information about the primary language spoken by the person who died — a major gap that further hurts efforts to understand the pandemic’s toll in non-English-speaking households.

Different priorities in a fragmented system

CalREDIE tracks a person from disease diagnosis to death.

But that information is often collected and entered into the system by different people at different times. Amid the chaos of a pandemic, overworked testing clinic staff, doctors and health department officials often left important data fields blank.

Experts said similar details about comorbidities and other important information also might have been left off of death certificates.

For instance, those who died in San Diego County might have been more likely to have coronavirus tests run by their doctor, who had access to their health history, experts said. Those samples would have copious other records attached to them that someone ultimately entered into CalREDIE.

Meanwhile, those in Fresno County and other places across the Central Valley were more likely to get diagnosed at mass-testing centers or other clinics serving lower-income populations, where huge numbers of people were being processed every day.

“When somebody is in that moment, it might not occur to them that it’s important to capture that information, right?” said Jamie White, epidemiology program manager at Sacramento County Public Health. “They’re just trying to get through the people in front of them. So there’s kind of an issue of both short-term and long-term priorities.”

The Fresno County Department of Public Health did not respond to questions from The Bee.

Any death certificate database has similar problems due to the same sorts of pressures, experts said.

“So the things that we think that are absolute and obvious and that the data can’t be corrupted in any way, like a death certificate, are absolutely fraught,” said Wurtz, the Minnesota researcher.

Some counties, like Los Angeles, have published their own information about pre-existing conditions showing 87% of people hospitalized for COVID-19 had underlying conditions. Researchers elsewhere have studied COVID-19 and comorbidities through various methods, including a review of private insurance claims.

The CDC attempts to monitor illnesses among the pandemic’s victims too.

Experts told The Bee that the holes in the state’s disease tracking database aren’t surprising, and they reflect the nature of the country’s fragmented healthcare system made up of a mishmash of more than 3,000 county health departments, thousands of private and public medical clinics, insurance companies and hospital networks, all working independently.

But the lack of uniform and complete information means our understanding of the worst pandemic in a century might always be imprecise.

“It’s shocking, I think, once most people hear that the gap is so large,” Wurtz said.

Another challenge: The person diagnosed at a mass testing site could have then died at home or in the ER. A coroner or the attending physician likely wouldn’t have access to the person’s entire medical history, and they may not know what language the person spoke or what their profession was, a challenge both in filling out death certificates and in entering the information into CalREDIE.

Public health investigators faced similar, unavoidable challenges in finding out more about patients who were critically ill, said White, the Sacramento County epidemiology program manager.

“It’s really difficult to interview somebody who’s hospitalized and ventilated,” White said.

When they had the time and staff to do it, county health departments often tried to fill in the missing data pieces after a death, requesting medical records from hospitals and death certificates. They made difficult calls to grieving families, who might not have answered their probing questions because they were too busy mourning a death or planning a funeral.

“Comorbidity (data) is important to have for people who are thinking big picture in terms of, ‘How do we characterize this? How do we prevent deaths in the future?’” White said. “But when you’re in the moment, and you’re trying to figure out what’s most important to stop the spread, that’s not necessarily the first question that comes into mind.”

A ‘shoehorned’ system breaks down

By summer, the CalREDIE system ran into well-publicized problems when a backlog of case reports, primarily from private labs, suggested the COVID-19 deaths were dropping.

In reality, it was a reporting lag due to an overwhelmed system.

Before the pandemic, CalREDIE processed some 63,000 records about positive disease tests per month. After COVID-19 spread through California, the system was forced to process 100,000 positive and negative cases per day, records show. The crush of testing information overwhelmed the outdated technology, which officials agreed was not up to the challenge of the COVID-19 pandemic.

“They kind of just shoehorned that into it,” said Michael Weinstein, a UCLA researcher who studies health data systems and who helped the state’s health department solve the backlog.

As the state technology problems were unfolding, The Sacramento Bee and its attorneys were negotiating with the Department of Public Health for the release of its CalREDIE database. Only after 10 months of negotiations with The Bee’s attorneys and a $530.80 payment did the health department provide heavily redacted spreadsheets. The state did not provide the names of people who died of COVID-19.

Despite the complexity of the data set, which runs to more than 25,000 lines, the California Department of Public Health declined multiple interview requests for this story.

Among all the lessons from the past year, the pandemic has highlighted the need for standardized disease data reporting networks, said Vivian Singletary, director of the Public Health Informatics Institute in Atlanta, which studies how public health data is collected and shared among governments and researchers.

“COVID-19 was this big earthquake for public health,” she said. “And now you’re seeing all of the fault lines that have been there for many, many years.”