Why the U.S. Is Struggling to Track Coronavirus Variants

A scattered and underfunded effort at genomic sequencing has hindered the country’s ability to detect different forms of the virus

An analysis of the genome of the B.1.1.7 variant of the coronavirus overlaid on the CDC's map of different states' genome sequencing rates. Darker-shaded states have processed more genomes (relative to their total case count) than lighter, greener states. CDC; Sebastian Gollnow / Picture Alliance via Getty Images

There’s a reason why scientists in the United Kingdom, and not other nations, were the first to pinpoint a more transmissible variant of the virus that causes Covid-19. It wasn’t because the B.1.1.7 variant had necessarily originated from a patient in their country—scientists still don’t know that. Rather, British researchers had spotted the mutant spreading through London and southeast England because, more than any other country in the world, Britain was actively looking. Thanks to a $27 million government investment at the start of the pandemic, the country has analyzed the entire genetic makeup of more than 210,000 samples of SARS-CoV-2, the virus that causes Covid-19. That’s 43 percent of the total coronavirus genomes sequenced worldwide, and 5 percent of the country’s overall cases.

Meanwhile, the United States, home to a disproportionate one-fourth of the pandemic’s Covid-19 patients, has sequenced only about 96,000—a fraction of 1 percent—of its 27 million (and counting) cases. As of early February, this sequencing rate places the country 34th in the world, according to researchers at the Broad Institute. American scientists and public health authorities have been flying, if not blind, then at least with serious tunnel vision.

The United States’ limited view stems from the absence of a unified national plan and corresponding funds. Some well-prepared states, which had sequencing infrastructure and expertise already in place, have strung together thousands of viral genomes, but others, overwhelmed and under-resourced, have analyzed hardly any. The emergence of new variants in Britain, Brazil, South Africa and elsewhere has made the need to record the viruses’ genomic sequences more urgent than ever. Do enough of it, and scientists will be able to better monitor the prevalence of mutant viruses and scan the horizon for new strains.

While standard PCR tests skim the virus’ genetic code for telltale segments unique to SARS-CoV-2, whole-genome sequencing records all 30,000 letters. A PCR test can tell whether someone is positive for the coronavirus; sequencing spells out the complete genetic makeup of that virus: its genetic fingerprint, including any mutations that might categorize it as a different variant. As the virus spreads, the imperfect process of replication means that every so often, mutations crop up, some inconsequential blips and others, like the 17 accumulated changes that differentiate the B.1.1.7 variant, substantial. Having the entire genome mapped out allows scientists to use these minute changes to build a family tree and decipher how a cluster of cases might have begun with person A and then spread to people B through Z. On a larger scale, genome sequencing clues researchers in to broader patterns, such as the occurrence of new strains.
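To make the idea of a genetic fingerprint concrete, here is a minimal sketch in Python of how single-letter changes could be listed by comparing a sample against a reference genome. The function name and the 14-letter toy sequences are hypothetical and invented for illustration; real pipelines work on roughly 30,000 letters, align the sequences first, and also handle insertions and deletions, which this sketch does not.

```python
from typing import List, Tuple


def find_substitutions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """List single-letter differences as (1-based position, reference letter, sample letter).

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_letter, sample_letter)
        for position, (ref_letter, sample_letter) in enumerate(zip(reference, sample), start=1)
        if ref_letter != sample_letter
    ]


# Toy sequences, invented for illustration only.
reference = "ATGGCTAGCTTACG"
sample_a = "ATGGCTAGCTTACG"  # identical to the reference
sample_b = "ATGACTAGCTTACG"  # one change, at position 4
sample_c = "ATGACTAGCTAACG"  # the same change plus one more, at position 11

print(find_substitutions(reference, sample_a))  # []
print(find_substitutions(reference, sample_b))  # [(4, 'G', 'A')]
print(find_substitutions(reference, sample_c))  # [(4, 'G', 'A'), (11, 'T', 'A')]
```

In this toy case, samples B and C share the change at position 4, and C carries one additional change. A pattern like that is consistent with C descending from a virus like B, which is the kind of inference a family tree is built on.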

Genomic sequencing is a more complicated and time-intensive process than run-of-the-mill Covid-19 testing. Genetic material must be extracted and read, and then the raw data needs to be stitched together and analyzed using high-end computer servers by researchers with a specialized degree. Even in the best-case scenario, it takes most labs 48 hours to piece together a genome. Despite the time, effort, cost and technical expertise required to sequence a viral genome, such surveillance is critical. For example, knowing the real prevalence of the three major formally identified variants, whose increased transmissibility could lead to skyrocketing case counts and more stress on already over-stretched hospitals, has allowed decisionmakers to take preventative action, including the U.K.’s winter lockdown. And pinpointing novel mutations early enables researchers to study them and see whether the variants can evade vaccines.


Important as sequencing is, the New York Times editorial board in late December likened the global surveillance outlook to “a giant canvas where one corner has been painted in extraordinary detail but the rest is blank.” So far, the U.K. has been populating that canvas with an unparalleled number of viral genome readouts, but the data on the variants in other corners of the world has been relatively slim. The U.K.’s data trove started with a call in early March 2020 between microbiologist Sharon Peacock and five other researchers. When the World Health Organization (WHO) declared Covid-19 a pandemic on March 11, a larger team of scientists huddled in London to map out what would become COG-UK, the Covid-19 Genomics UK Consortium. A week later, they’d acquired ample government funding to coordinate a network of public health agencies, hospitals, academic institutions and non-profit research labs that would share best practices and data.

“People need to work together in a cooperative and collective way, setting aside individual priorities,” wrote Peacock in a blog post listing the factors that had contributed to the U.K.’s sequencing success. In the U.S., such national coordination has been lacking. “It’s the Wild West,” virologist Jeremy Kamil told Bloomberg’s Kristen V. Brown. “Every state, city, county is doing its own thing. It’s a bunch of random cats and no one is trying to herd them.”

Part of the U.S.’s sequencing struggle comes from the fact that sequencing infrastructure hasn’t been prioritized as a public health need, both historically and during the present pandemic. Traditionally costly, complicated pathogen sequencing was the domain of research universities; it wasn’t until around 2014 that the Centers for Disease Control and Prevention (CDC) began to fund public labs to do whole-genome sequencing as a tool to track foodborne illness. Since 2017, all 50 states have labs with the ability to sequence, says Kelly Oakeson, who heads the Utah Public Health Lab’s sequencing and bioinformatics work, “but funding has always been a struggle.”

The Utah team’s sequencing epidemiology is underwritten entirely by the CDC; while public health labs typically operate on fees and dollars from both local and federal governments, crisis funding tends to come from D.C., as a recent Association of Public Health Laboratories publication explains. The flood of diagnostic Covid-19 tests has stretched these underfunded labs thin. Focusing resources on telling patients whether they have Covid-19 sometimes left sequencing as an afterthought, especially because the CDC offered little guidance. “There hasn’t been a unified direction from CDC or anybody saying, ‘Okay guys, we’ve got to put a focus here; here’s money; here’s how you do it; go,’” Oakeson said when interviewed in January. As a result, labs have scaled up sequencing unevenly, and for most of 2020, the U.S. was left with a patchwork of academic, commercial and state labs fending for themselves.

Left to their own devices, some states have managed to sequence dramatically more genomes than others. In the year since the U.S.’s first case, Washington had sequenced thousands of genomes, while West Virginia had recorded just 12, according to a CDC dashboard that debuted at the end of January. Still, even the six best-performing states have sequenced only between 1 and 3 percent of their total cases, far short of the 5 to 10 percent threshold experts would like to reach so they can adequately monitor for mutants. Epidemiological data is most useful fresh, yet in the U.S., the median turnaround time between testing and sharing the resulting sequence to a global data repository has been three times as long as in the U.K.

Utah, which has sequenced more than 5,800 viral samples in total and has an average turnaround time of about a week, ranks among the states that have most successfully implemented a surveillance program. That’s due in part to earlier investments in sequencing infrastructure, partnerships and trained staff, says Oakeson. The lab receives samples from testing partners from across the state. They can use leftover material from the regular “do I have Covid-19?” PCR test or take a fresh sample and simultaneously sequence and diagnose in one fell swoop.

For all Utah’s preparedness, challenges abounded. For starters, samples needed to make it to their lab in the Salt Lake area from different corners of the state, an issue researchers address by using a courier service. At first, they encountered personnel issues: Overloaded hospitals and other testing facilities couldn’t always spare staff to locate the leftover samples from tests that had returned a positive and send them to Oakeson. This has become less of a stumbling block with time, according to Oakeson. Limited manpower also created a bottleneck when it came to the sequencing step; for the first six months of the pandemic, one staffer handled all the sequencing; now, the lab has three people on that job. And even 11-plus months into the pandemic, pipette tips are sometimes in short supply, and not having enough of these essential plastic components seriously limits how much sequencing can be done.

As of late January, the lab was decoding the genomes of about 2 percent of all positive cases in Utah, but they want to get to five times that number. To hit that lofty target all across the U.S., Oakeson says, “money has to start flowing.” His team operates on a “shoestring budget” of CDC funds, but recently was able to purchase high-throughput sequencing equipment—expensive machines some labs can’t afford. The Utah lab now has two liquid handling robots—each cost $700,000—that can perform the tedious and time-consuming initial steps, like adding barcodes, pipetting and converting viral RNA to DNA, on 384 samples at a time. And their new sequencing instrument, which can run diagnostic tests on and get the genomes of 3,072 samples every 24 hours, came with a hefty $900,000 price tag that they paid for through the CARES Act. Once the machines are programmed and running, the lab’s sequencing rate should jump.

Other states encountered similar obstacles in getting genomes sequenced. Before the pandemic, Pavitra Roychoudhury, an acting instructor at the University of Washington virology lab, studied herpes and respiratory syncytial viruses.* Now, she and a handful of colleagues, along with the philanthropically funded Brotman Baty Institute, do nearly all of the state’s SARS-CoV-2 sequencing. “Reagents are limited, and people are limited,” she says, referring to the substances used in the chemical reactions that are part of the sequencing process. “We’re just doing as much as we can.” They sequence 100-200 genomes per week, with a turnaround time of about four to seven days. Roychoudhury says her workday often stretches until late at night.

Again, funding is a sore spot. “Nobody is reimbursing us for these sequences,” she explains, although obtaining each viral genome costs the lab anywhere between $80 and $500, depending on the technique used. Securing government funding for sequencing research can be a prolonged process, so her lab got a Fast Grant, a quick-turnaround grant bankrolled by philanthropists and tech CEOs.

The same hurdles—money and logistics—came up in responses from other places. Since the pandemic started, Arkansas has uploaded a total of only 136 SARS-CoV-2 sequences to GISAID, an online repository where researchers across the globe share genomic data. “The main barrier currently is cost,” writes José Romero, Arkansas’ Health Secretary, in an email.

Funding isn’t the only hurdle, however. California has sequenced 11,000 genomes and counting. The state’s Department of Public Health coordinates a sprawling 30-lab network of diagnostic labs, public health groups, academic institutions and philanthropic and private partners such as the Chan Zuckerberg Biohub and genetic testing company Invitae. For these partner research centers, “The rate-limiting step isn’t sequencing; it’s really getting the sample,” microbiologist Charles Chiu told Wired. Samples pass slowly through a gauntlet of local labs without sequencing abilities before they finally make it to his lab at the University of California, San Francisco, and there’s a lot of red tape to contend with.

In terms of creating a sequencing plan and providing money and guidance for states to execute it, the CDC is “well aware that the ball has been dropped,” said Oakeson. Yes, an opt-in CDC consortium known as SPHERES has provided a place for scientists across the States to collaborate and share information via a Slack workspace, but numerous scientists maintain that the lack of national coordination has hampered sequencing efforts.

But this hands-off, free-for-all approach is shifting. In November, the CDC launched a national SARS-CoV-2 surveillance program called NS3, which asks state public health labs to overnight at least five SARS-CoV-2 samples representing different geographic regions and population groups to the CDC every other week for sequencing and other testing. The CDC also requested samples of suspected B.1.351 and P.1 variants, first found in South Africa and Brazil, respectively. The goal is to get a more complete view of the variants circulating throughout the entire country, not just in testing hubs.

NS3 has gradually increased its capacity as worry over variants has mounted. As of late January, it could process about 750 samples each week. Partnerships with private companies, like Illumina (which manufactures sequencing machines) and Helix, should boost that number to 6,000 sequences per week in time for the one-year anniversary of the pandemic in mid-March. Those 6,000 sequences per week would represent around one-sixth of the sequences uploaded to GISAID from the U.S. in January, but that’s still less than 1 percent of the 760,000 new cases forecast for the last week in February. And to get to 5 percent, the CDC and the hodgepodge of labs doing sequencing would need to process upward of 38,000 genomes per week.
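The arithmetic behind those figures can be checked with a short sketch in Python. The function names are hypothetical and chosen for illustration; the inputs are the numbers quoted above (6,000 sequences per week, 760,000 forecast weekly cases, a 5 percent target).

```python
import math


def sequencing_coverage(sequences_per_week: int, cases_per_week: int) -> float:
    """Return the share of a week's cases that get sequenced, as a percentage."""
    return 100 * sequences_per_week / cases_per_week


def sequences_needed(cases_per_week: int, target_percent: float) -> int:
    """Return the weekly sequence count needed to reach the target percentage of cases."""
    return math.ceil(cases_per_week * target_percent / 100)


forecast_cases = 760_000   # new cases forecast for the last week of February
planned_capacity = 6_000   # planned NS3 sequences per week

print(round(sequencing_coverage(planned_capacity, forecast_cases), 2))  # 0.79
print(sequences_needed(forecast_cases, 5))                              # 38000
```

At the planned capacity, coverage works out to roughly 0.79 percent of weekly cases, and reaching 5 percent would take about 38,000 genomes a week, more than six times the planned NS3 capacity.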

Cash-strapped labs may receive more money, too. In September, the CDC gave university sequencing groups roughly $8 million, and in mid-December, the CDC earmarked $15 million for public health labs’ sequencing efforts. President Joe Biden’s original Covid-19 relief bill proposes improving surveillance, though the exact monetary details remain fuzzy.

Experts agree that new variants will likely emerge in the months to come, which makes it all the more essential to get up to speed with surveillance. “If you want to identify anything that is new and is spreading…” says Roychoudhury, “you should sequence everything, because that’s the truth.”

Beyond funding and national guidance, Roychoudhury says, looking to the U.K.’s example in terms of a unified approach to analytics could make data easier to parse. COG-UK created custom software and resources that their labs all use, whereas in the U.S., it’s not so standardized; different groups take distinct approaches to analyzing the trove of genomic data.

If coordinating a response across 50-odd states and territories has been a Sisyphean endeavor, globalized surveillance presents some of the same problems at a more immense scale. Forty countries have yet to submit a single SARS-CoV-2 genome to GISAID. Nonetheless, “improving the geographic coverage of sequencing is critical for the world to have eyes and ears on changes to the virus,” Maria Van Kerkhove, WHO’s Covid-19 technical lead, has said. Mutant viruses—even a vaccine-proof variant—could arise anywhere in the world, especially places where the disease runs rampant, and the past year has only proven how quickly an epidemiological worry in one corner of the globe can become everyone’s problem.

*Editor's Note, February 12, 2021: An earlier version of this story misspelled Pavitra Roychoudhury's first name.
