'Dark matter' in the world of viruses: Researchers have discovered more than 161,000 previously unknown RNA viruses in global sampling data, more than in any single study before. The new discoveries expand the known virosphere by a factor of one and a half and add many new virus groups to it. The discovery was made possible by an artificial intelligence that identified viral fingerprints in genetic data sampled from a wide range of habitats.
Whether in glacial ice, deep in the Earth's crust, or high in the atmosphere: viruses are everywhere and in everything, and we humans also host billions of them. Since the beginning of evolution, viruses have adapted to a wide range of hosts and environments, and can survive even in extreme conditions. But despite their ubiquity, most of the virosphere is “dark matter” to us: we know only a fraction of its true diversity.
Until now, identifying new RNA viruses has been particularly difficult. Because of their high mutation rate and genetic diversity, they are much harder to identify than DNA viruses and require special analysis procedures. Researchers look in particular for the sequence that encodes the viral RNA polymerase, the enzyme that copies the RNA genome, because this characteristic signature makes an RNA virus easier to recognize. Even so, the number of undiscovered RNA viruses remains enormous.
A transformer model for viral RNA signatures
Now, artificial intelligence is giving the hunt for viruses new momentum. A team led by Xin Hou of the State Key Laboratory for Biocontrol at Sun Yat-sen University in Shenzhen has developed an artificial intelligence system specifically designed to detect viral signatures in RNA data. The AI system, called “LucaProt”, is built on a transformer model that rests on the same basic principles as ChatGPT and other large language models.
Unlike text generators, the AI virus detective does not analyze language. Instead, it analyzes RNA sequences and evaluates them for recurring patterns. For their study, Hou and the team first trained LucaProt on about 5,000 known signatures of viral RNA polymerases. They then had the AI system analyze 51 terabytes of RNA data from environmental samples.
The samples came from 1,612 sites around the world and from 32 different habitats and ecosystem types, ranging from deep-sea sediments, Antarctic ice, and hot springs to soil, air, and water samples from temperate latitudes.
New RNA viruses everywhere around the world
The result: the AI-powered search uncovered 161,979 new RNA virus species, more than any study before. “Finding so many new viruses at once is mind-boggling,” says senior author Edward Holmes of the University of Sydney. The new discoveries expand the previously known virosphere by a factor of 1.5 and the viral supergroups by a factor of 8.6, the team reported. “This opens a new window on this hidden part of Earth's living environment,” Holmes says.
The newly discovered RNA viruses are distributed across all 32 ecosystem types examined. “The highest virus diversity was in leaf litter, in wetlands, inland waters, and in sewage,” the researchers reported. “We found the largest numbers of new RNA viruses in Antarctica, marine sediments and in some inland waters.” The team also discovered previously unknown RNA viruses in extreme environments such as hot springs, the atmosphere, and hydrothermal vents.
Of the newly identified RNA virus species, 86% were found in only one type of ecosystem. But there were also viruses that turned up in almost all of the samples. “These appear to be environmental generalists,” the researchers said.
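How a figure like this share of single-ecosystem species is derived can be illustrated with a small calculation. The following is only a minimal sketch, not the researchers' actual pipeline: the function names and the toy records are made up for illustration, and it assumes each detection is available as a (species, ecosystem type) pair.

```python
def habitat_breadth(occurrences):
    """Map each species to the set of ecosystem types it was detected in."""
    breadth = {}
    for species, ecosystem in occurrences:
        breadth.setdefault(species, set()).add(ecosystem)
    return breadth


def single_ecosystem_share(occurrences):
    """Fraction of species that were detected in exactly one ecosystem type."""
    breadth = habitat_breadth(occurrences)
    if not breadth:
        return 0.0
    single = sum(1 for ecosystems in breadth.values() if len(ecosystems) == 1)
    return single / len(breadth)


# Hypothetical toy records, NOT data from the study.
records = [
    ("virus_A", "hot spring"),
    ("virus_B", "marine sediment"),
    ("virus_B", "marine sediment"),  # repeat detections in one habitat count once
    ("virus_C", "soil"),
    ("virus_C", "wetland"),
    ("virus_C", "sewage"),
    ("virus_D", "Antarctic ice"),
]

share = single_ecosystem_share(records)
print(f"{share:.0%} of species occur in only one ecosystem type")
```

In this toy data set, three of the four species are confined to a single ecosystem type, while virus_C, detected in three different habitats, would count as a generalist.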
“We're just scratching the surface.”
But despite the huge number of newly discovered RNA viruses, this is only the tip of an iceberg, the researchers stress: “We're just scratching the surface. There are millions more viruses to discover,” says Holmes. In addition, there are still significant gaps in knowledge about the evolution and ecology of the newly discovered virus species, and it remains largely unknown which hosts these viruses infect.
“The majority of RNA viruses known to date infect eukaryotes,” the scientists explain. In addition to humans, animals, and plants, these organisms also include unicellular organisms that carry a cell nucleus. “But it is also possible that a large proportion of newly discovered viruses are associated with bacteria or archaea as hosts.”
“The next step is to train our AI to discover more of this amazing diversity,” Holmes says. “Who knows what surprises await us?” (Cell, 2024; doi: 10.1016/j.cell.2024.09.027)
Source: Cell, University of Sydney
October 10, 2024 – Nadja Podbregar