Using both long and short reads, we demonstrate simpler, higher-quality alignment and easier downstream analysis, including identification of novel isoforms and their translational products, differential expression, and more! For example, by comparing isoforms assembled from a public SIV dataset, we uncover a unique splice donor site and junction (the donor lies upstream of major splice donor 0 in SIV) that is detected exclusively in the AY69 group (CD4+ T cells from a mesenteric lymph node of a SIVmac251-infected macaque), across most of its single cells. DEXSeq also detects significant differences in transcriptional landscapes.
Using public datasets, we demonstrate how HIV Atlas improves analysis: uncovering novel isoforms, condition-specific splicing patterns and more!
We hope this resource supports more accurate and comprehensive HIV research in computational and bench sciences!
06.10.2025 16:30
Our method consistently annotates all included major donor and acceptor sites across thousands of novel isolates. Splice sites are strongly conserved, significantly more so than random genomic sites.
Our methods are specifically designed to enable annotation transfer between genomes with extreme sequence diversity.
To scale beyond single isolates, we built Vira, a toolkit for transferring annotations across highly diverse viral genomes.
Using Vira, we generated transcriptome annotations for 2,080 complete HIV-1 and SIV genomes from the LANL database.
06.10.2025 16:30
Inconsistency in the SD4 donor and SA7 acceptor splice site annotations in the coding sequence of the GenBank HIV-1 89.6 (U39362.2) annotation. The GenBank annotation (shown in red) incorrectly positions the SD4 donor site one base upstream of the canonical GT dinucleotide and the SA7 acceptor site two bases downstream of the canonical AG dinucleotide. This erroneous annotation truncates the encoded protein sequence by one complete amino acid. Our suggested corrected annotation (in orange) is shown below the GenBank annotation.
We manually curated reference annotations for HIV-1 HXB2 and SIVmac239, drawing on decades of published knowledge.
Our work uncovered and corrected significant errors in public HIV-1 annotations, such as the HIV-1 89.6 GenBank entry that truncates the Tat and Rev proteins.
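The kind of error described here (a donor or acceptor boundary shifted off the canonical GT/AG dinucleotides) can be caught with a simple check. Below is a minimal sketch, assuming 1-based inclusive exon coordinates as in GTF/GFF; the function name and the toy sequence are hypothetical and are not taken from HIV Atlas or Vira.

```python
def intron_is_canonical(genome: str, donor_end: int, acceptor_start: int) -> bool:
    """Return True if the intron between two exons starts with GT and ends with AG.

    Coordinates are 1-based and inclusive (GTF/GFF style):
    donor_end is the last base of the upstream exon,
    acceptor_start is the first base of the downstream exon.
    """
    seq = genome.upper()
    # The intron covers donor_end+1 .. acceptor_start-1 in 1-based coordinates,
    # which is seq[donor_end : acceptor_start - 1] with 0-based slicing.
    intron = seq[donor_end:acceptor_start - 1]
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Toy sequence (made up): exon 1 = bases 1-5, intron = bases 6-15, exon 2 = bases 16-20.
toy_genome = "AAACC" + "GTAAATTTAG" + "GGCCC"

correct = intron_is_canonical(toy_genome, donor_end=5, acceptor_start=16)
donor_shifted = intron_is_canonical(toy_genome, donor_end=4, acceptor_start=16)
acceptor_shifted = intron_is_canonical(toy_genome, donor_end=5, acceptor_start=18)
print(correct, donor_shifted, acceptor_shifted)
```

In the toy example, moving the donor one base upstream or the acceptor two bases downstream makes the check fail, mirroring the two shifts described for the 89.6 entry. Non-canonical splice sites do exist, so a failed check is a flag for manual review rather than proof of an error.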
06.10.2025 16:30
Most RNA-seq workflows (alignment, assembly, quantification) depend on accurate reference annotations.
Without them, HIV splicing studies have often been inconsistent, fragmented, and hard to compare across datasets.
06.10.2025 16:30
Subset of the core alternative splicing map for the HIV-1 genome, illustrating the complexity of viral transcription and how the major donor-acceptor pairs determine which protein is translated.
HIV-1 produces all its proteins from a single message using a highly complex splicing program.
This process controls replication, latency, and pathogenesis, yet no centralized, accessible catalogue existed that was compatible with standard RNA-seq tools.
#HIV #Bioinformatics #RNAseq #Genomics
06.10.2025 16:30
The HIV Atlas web interface provides a simple way to explore and download transcriptome annotations for any of the 2,080 currently annotated genomes.
The logo for the HIV Atlas
Decades of HIV research. Thousands of RNA-seq studies. No transcriptome annotation.
Announcing HIV Atlas: the first reference annotation of HIV. With Ela Pertea, @stevensalzberg.bsky.social , Diane Bolton, Mykhaylo Artamonov, Sophia Cheng
www.biorxiv.org/content/10.1...
ccb.jhu.edu/HIV_Atlas
06.10.2025 16:30