Refinement of the Reference Viral Database (RVDB) for improving bioinformatics analysis of virus detection by high-throughput sequencing (HTS)
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | American Society for Microbiology, 2025-07-01 |
Series: | mSphere |
Subjects: | |
Online Access: | https://journals.asm.org/doi/10.1128/msphere.00286-25 |
Summary: | ABSTRACT All biological products are required to demonstrate the absence of adventitious viruses (AVs), which may be inadvertently introduced at different steps of the manufacturing process. The currently recommended in vitro and in vivo virus detection assays have limitations for broad detection and are lengthy and laborious. Additionally, the use of animals is discouraged by the global 3 R’s initiative for replacement, reduction, and refinement. High-throughput or next-generation sequencing (HTS/NGS) technologies can rapidly detect known and novel viruses in biological materials. There are, however, challenges for HTS detection of AVs due to differential abundance of viral sequences in public databases, which led to the creation of a non-redundant Reference Viral Database (RVDB) containing all viral, viral-like, and viral-related sequences, with a reduced cellular sequence content. In this paper, we describe improvements in RVDB, which include the transition of the RVDB production scripts from the original Python 2 codebase to Python 3, updating the semantic pipeline to remove misannotated non-viral sequences and irrelevant viral sequences, use of taxonomy for the removal of phages, and inclusion of a quality-check step for SARS-CoV-2 genomes to exclude low-quality sequences. Additionally, RVDB website updates include search tools for exploring the database sequences and implementation of an automatic pipeline for providing annotation information to distinguish non-viral and viral sequences in the database. These updates for refining RVDB are expected to enhance HTS bioinformatics by reducing the computational time and increasing the accuracy of virus detection.
IMPORTANCE High-throughput sequencing (HTS) has emerged as an advanced technology for demonstrating the safety of biological products. HTS can be used as an alternative adventitious virus detection method for replacing the currently recommended in vivo and PCR assays and supplementing or replacing the in vitro cell culture assays. However, HTS bioinformatics analysis for broad virus detection, including both known and novel viruses, depends on using a comprehensive and accurately annotated database. In this study, we have refined our original comprehensive Reference Viral Database (RVDB) for greater accuracy of virus detection with a reduced computational burden. Additionally, the production script for automating the generation of RVDB was updated to facilitate reliable database production and timely availability. |
---|---|
ISSN: | 2379-5042 |