2025-12-15 Pennsylvania State University (Penn State)
<Related information>
- https://www.psu.edu/news/agricultural-sciences/story/evaluating-evaluators-how-do-plant-virus-genome-analysis-tools-stack
- https://www.microbiologyresearch.org/content/journal/jgv/10.1099/jgv.0.002176
Defective but promising: evaluating the utility of currently available bioinformatic pipelines for detecting defective viral genomes in RNA-Seq data
Anthony Taylor, Cristina Rosa and Marco Archetti
Journal of General Virology Published: 17 November 2025
DOI: https://doi.org/10.1099/jgv.0.002176

ABSTRACT
Defective viral genomes (DVGs) affect viral dynamics, pathogenicity and evolution, have been found in many in vivo viral infections, and in theory can be detected from sequencing data. We explored the utility of the currently available bioinformatic programs ViReMa, DI-tector, DVGfinder, DG-Seq and VODKA2 for identifying junction points in plant virus high-throughput sequencing data, looking at whether the outputs from these bioinformatic tools generally agree and exploring the possibility of using these tools to help us understand whether DVGs are consistently generated and maintained in a specific virus-host combination. We conducted a meta-analysis of eight previously published RNA sequencing datasets utilizing all five programs and compared the degree of output overlap, the most common junctions present in each output and whether these junctions match previously reported junctions for that virus. Our results demonstrate a low degree of agreement regarding identified junctions between programs, including the most frequently identified one, although the most frequently identified junctions typically corresponded to large, disruptive deletions. We found preliminary support for our prevalence hypothesis, although we ultimately conclude that a more robust dataset generated expressly for testing this hypothesis will be required for a convincing answer. Finally, we suggest that when using bioinformatic programs to search for DVGs, it is best to run the same dataset through multiple programs and look at the overlap to inform decisions on downstream characterization.
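
The closing recommendation, to run one dataset through several programs and examine the overlap, can be illustrated with a short sketch. This is not the authors' analysis code. It assumes each program's output has already been parsed into a set of (break position, rejoin position) junction pairs, that junctions are compared by exact coordinates (real outputs may need a positional tolerance), and that the function names `jaccard`, `pairwise_overlap` and `supported_junctions` are hypothetical helpers. The coordinates are toy values invented for illustration, not results from the paper.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two junction sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(calls):
    """Map each pair of program names to the Jaccard index of their junction sets."""
    return {
        (p, q): jaccard(calls[p], calls[q])
        for p, q in combinations(sorted(calls), 2)
    }


def supported_junctions(calls, min_tools=2):
    """Return junctions reported by at least `min_tools` programs."""
    counts = Counter(j for junctions in calls.values() for j in junctions)
    return {j for j, n in counts.items() if n >= min_tools}


# Toy junction calls (assumed, illustrative only): (break, rejoin) coordinates.
toy_calls = {
    "ViReMa": {(120, 2450), (300, 900), (1500, 1720)},
    "DI-tector": {(120, 2450), (310, 900)},
    "DVGfinder": {(120, 2450), (300, 900)},
}

overlap = pairwise_overlap(toy_calls)
consensus = supported_junctions(toy_calls, min_tools=2)

for pair, score in overlap.items():
    print(pair, round(score, 2))
print(sorted(consensus))
```

Junctions in `consensus` are the ones a reader might prioritize for downstream characterization. Exact matching treats (300, 900) and (310, 900) as different junctions, so a tolerance-based comparison could raise the apparent agreement.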


