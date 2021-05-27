Census Bureau's use of 'synthetic data' worries researchers

FILE - Rows of homes are shown in suburban Salt Lake City on April 13, 2019. Utah is one of two Western states known for rugged landscapes and wide-open spaces that are bucking the trend of sluggish U.S. population growth. The boom there and in Idaho is accompanied by healthy economic expansion, but also by concern about strain on infrastructure and soaring housing prices. (AP Photo/Rick Bowmer, File)
MIKE SCHNEIDER

ORLANDO, Fla. (AP) — First came the “noise” — small errors the U.S. Census Bureau decided to introduce into the 2020 census data to protect participants' privacy. Now the bureau is looking into “synthetic data,” manipulating the numbers widely used for economic and demographic research, to obscure the identities of people who provided information.

The moves have some researchers up in arms, worried that the statistical agency could sacrifice accuracy in its zeal to protect privacy.

Census Bureau statisticians disclosed at a virtual conference last week that over the next three years they will work toward developing a method to create “synthetic data” for files on individuals and homes that already are devoid of personalized information. These files, known as American Community Survey microdata, are used by researchers to create customized tables tailored to their research.

Census Bureau statisticians said more privacy protections are needed as technological innovations magnify the threat of people being identified through their survey answers, which are confidential. Computing power is now so vast that it can easily crunch third-party data sets that combine personal information from credit rating and social media companies, purchasing records, voting patterns and public documents, among other things.

“It’s a balancing act. The law requires us to do competing things. We need to release statistics on the nation to allow people to make useful decisions. But we also have to protect the privacy of our respondents,” said Rolando Rodriguez, a Census Bureau statistician, at the conference.

But critics say the proposal, coupled with an ongoing effort to add small inaccuracies to the 2020 census data in order to protect participants' privacy, undermines the Census Bureau's credibility as the go-to provider of precise data about the U.S. population.

University of Minnesota demographer Steven Ruggles said bluntly that synthetic data “will not be suitable for research.”

“The Census Bureau is inventing imaginary threats to confidentiality to sharply reduce public access to data,” Ruggles said. “I do not think this will stand, because society needs information to function.”

The microdata are gathered every year from the American Community Survey with a sample size of 3.5 million households, extrapolated across populations of all sizes, from the entire nation down to neighborhoods. This provides a wide range of estimates on the nation’s demographic makeup and housing characteristics. The microdata are used in the drafting of around 12,000 research papers a year, Ruggles said.

The synthetic data are created by taking variables in the microdata to build models recreating the interrelationships of the variables and then constructing a simulated population based on the models. Scholars would conduct their research using the simulated population — or the synthetic data — and then submit it, if they want, to the Census Bureau for double checking against the real data to make sure their analyses are correct.
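The process described above can be sketched in a few lines of Python. This is a toy illustration only, not the Census Bureau's method, which the bureau has not finalized. The two variables (age group and housing tenure), the record counts and the function names `fit_model` and `synthesize` are all hypothetical. The sketch fits a simple model of how the two variables relate in the "real" records, then draws a simulated population from that model.

```python
import random
from collections import Counter, defaultdict


def fit_model(records):
    """Estimate P(age_group) and P(tenure | age_group) from real records."""
    marginal = Counter(age for age, _ in records)
    conditional = defaultdict(Counter)
    for age, tenure in records:
        conditional[age][tenure] += 1
    return marginal, conditional


def synthesize(model, n, seed=0):
    """Draw n simulated records from the fitted model."""
    marginal, conditional = model
    rng = random.Random(seed)
    ages = list(marginal)
    age_weights = [marginal[a] for a in ages]
    synthetic = []
    for _ in range(n):
        # Sample an age group, then a tenure given that age group.
        age = rng.choices(ages, weights=age_weights)[0]
        tenures = list(conditional[age])
        tenure_weights = [conditional[age][t] for t in tenures]
        tenure = rng.choices(tenures, weights=tenure_weights)[0]
        synthetic.append((age, tenure))
    return synthetic


# Hypothetical "real" microdata: (age_group, tenure) pairs.
real = (
    [("under_35", "rent")] * 30
    + [("under_35", "own")] * 10
    + [("35_plus", "rent")] * 15
    + [("35_plus", "own")] * 45
)

model = fit_model(real)
synthetic = synthesize(model, 1000)
```

No synthetic record corresponds to a real respondent, but the simulated population reflects only the relationships the model was built to capture. Any pattern involving a variable left out of the model would not carry over.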

Ruggles said new discoveries in data will be missed since the models only capture what is already known.

Another problem is that synthetic data can amplify an outlier, said David Swanson, a professor emeritus of sociology at the University of California, Riverside. In a health study, for example, one person may engage in risky behavior multiple times while others don't, and the synthetic data can make the behavior seem more widespread than it actually is.

There are benefits, though, such as the ability to get details about people at really small geographic levels such as neighborhood blocks, said Cornell University economist Lars Vilhuber, who has done research on the method. The synthetic data makes that possible because it protects privacy, he said.

“You can actually get far more detail into the data than with traditional methods,” Vilhuber said.

The Census Bureau said in a statement on Thursday that it hasn't made any final decisions on the use of synthetic data in the American Community Survey and that it welcomed feedback from researchers.

The Census Bureau has taken other recent steps to protect individuals’ privacy, which has gotten harder in the face of a proliferation of outside data sources. This year, the bureau proposed using housing units instead of people when defining an urban area. And it has drawn fierce criticism for using a statistical technique known as “differential privacy” in 2020 census data that will be used for drawing congressional and legislative districts.

Differential privacy adds mathematical “noise,” or intentional errors, to the data to obscure any given individual’s identity while still providing statistically valid information. It has been challenged in court by the state of Alabama, which says its use will result in inaccurate data.
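The idea of adding noise to a count can be illustrated with the textbook Laplace mechanism. This is a minimal sketch under stated assumptions, not the Census Bureau's production method, which the article does not describe. The block names, the counts, the privacy parameter `epsilon` and the function `noisy_count` are hypothetical.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return the count plus Laplace noise with scale 1/epsilon.

    Adding or removing one person changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1/epsilon.
    """
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with the
    # same rate follows a Laplace distribution centered on zero.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(42)

# Hypothetical population counts for three census blocks.
block_counts = {"block_a": 12, "block_b": 340, "block_c": 3}

released = {
    name: noisy_count(count, epsilon=0.5, rng=rng)
    for name, count in block_counts.items()
}
```

A smaller `epsilon` means more noise and stronger privacy. Because the noise averages out to zero, totals over many blocks stay close to the truth, while the count for any single small block can be noticeably off, which is the tradeoff at the center of the dispute.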

“The Census Bureau is saying this is in the tradition of what they have always done” in protecting privacy, said historian Margo Anderson, a professor at the University of Wisconsin-Milwaukee. “There’s an increasingly substantial organization of critics saying this is completely different. They say, ‘You have never made the data intentionally inaccurate.’”

The Census Bureau first floated the idea of using synthetic data three years ago, but concerns over that and differential privacy got shoved aside after the Trump administration tried unsuccessfully to add a citizenship question to the 2020 census questionnaire and the pandemic challenged the nation's head count last year, Anderson said.

For Swanson, the Census Bureau's efforts at privacy remind him of the quote that reporter Peter Arnett attributed to an unnamed U.S. military official during the Vietnam War: “We had to destroy the town in order to save it.”

“I feel they literally would destroy the census data to save it from an uncertain threat,” Swanson said. “If they destroy the data, they are going to destroy the bureau.”

___

Follow Mike Schneider on Twitter at https://twitter.com/MikeSchneiderAP
