Cataloging the library of life

What would you do if you held the skeleton key to life itself? Would you use your knowledge to combat the climate crisis? Develop life-saving medicines? Secure our food supply?

For USDA’s Agricultural Research Service (ARS) scientists, the answer may be all of the above and more. In one of science’s most groundbreaking projects, researchers are building a “new foundation for biology” by sequencing, cataloging, and characterizing the genomes of all of Earth’s living organisms.

The project, dubbed the Earth BioGenome Project (EBP), brings ARS researchers together with scientists from around the world to produce an open, public database of over 1.5 million species’ genomes. Launched in 2018, the EBP began with coordination and specimen collection but has now entered a “production stage,” as teams begin to record genome sequences at a meaningful scale. ARS’s contribution centers on the Ag100Pest Initiative, a program that is producing genome assemblies for over 170 arthropod pests of greatest agricultural concern in the United States.

Genomic sequencing is not a new technology; ARS and other scientists have been sequencing plants and animals for decades. What is new, though, is the level of international coordination and the number of species involved.

The genomes are also expected to be of reference quality, a measure, according to project co-lead Anna Childers, of “the level of completeness, and how contiguous the pieces are.”

“When you’re trying to do certain analyses,” explains project co-lead Brian Scheffler, a computational molecular biologist with ARS’s Genomics and Bioinformatics Research Unit in Stoneville, Mississippi, “having that high-quality genome makes all the difference in the world.” Rather than producing lower-quality sequences and then repeating the work when other research questions demand higher quality, the EBP scientists plan to sequence each genome once and get it right, capturing a picture complete enough to support future research.

That level of quality is now possible in part because technological advances have made genome sequencing faster and more efficient. “Before, to get enough DNA, you were often pooling numerous individuals in order to generate a good sequence, which negatively impacted the final assembly,” says Childers, who is an ARS computational biologist in Beltsville, Maryland. “Now we can take one tiny flour beetle, grind it up, and create a genome assembly from that.”

The research is already showing results: Childers’ group was the first to sequence the genome of the Asian Giant Hornet (popularly known as the “Murder Hornet”), an invasive insect that attacks pollinators like bees, at times killing tens of thousands in a matter of hours. Having the hornet’s genome can help researchers understand and manage the threat it poses. These results, though, are just the beginning.

As Childers explains, “The original inventors of things like the internet never fully anticipated how their inventions would evolve and what would be built upon them in the future.

“We can see how genome sequencing has already started affecting how we do biology, how we tackle medical challenges, how we invent new processes,” she adds. “But we really won’t fully appreciate what we’re going to be able to do until we have the genomes of everything.”

“The reality is nothing can get done without the basic genomic information,” says project co-lead and ARS National Program Leader Kevin Hackett. “We’re providing that basic infrastructure and developing the roadmap of all the genomic information for life.”