#htslib

Latest posts tagged with #htslib on Bluesky

Posts tagged #htslib

SMT (@bioinfhotep@genomic.social) Attached: 4 images #Genomics #Bioinformatics Release of duckhts: #htslib based #Duckdb Extension for High Throughput Sequencing File Formats https://duckdb.org/community_extensions/extensions/duckh...

#Duckdb #htslib #Genomics #Bioinformatics #RStats

duckhts: Read HTS (VCF/BCF/BAM/CRAM/FASTA/FASTQ/GTF/GFF) files in DuckDB via htslib

Rduckhts: 'DuckDB' High Throughput Sequencing File Formats Reader Extension
genomic.social/@bioinfhotep...

Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'

Bundles the 'duckhts' 'DuckDB' extension for reading 'HTS' file formats (VCF/BCF, SAM/BAM/CRAM, FASTA, FASTQ, GFF, GTF, tabix) from 'R' via 'DuckDB'. The extension and its 'htslib' dependency are compiled from vendored sources during package installation.

Authors: Sounkou Mahamane Toure [aut, cre], htslib authors [ctb], DuckDB C Extension API authors [ctb]

# Install 'Rduckhts' in R:
install.packages('Rduckhts', repos = c('https://rgenomicsetl.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/rgenomicsetl/duckhts/issues (0 issues)

On CRAN: no

3.75 score 12 exports 2 dependencies

Last updated 0 hours ago from: e99a28a305. Checks: 7 OK, 1 NOTE, 1 FAIL. Indexed: yes.
Citation

To cite package ‘Rduckhts’ in publications use:

    Toure S (2026). Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'. R package version 0.1.1-0.0.1, https://github.com/rgenomicsetl/duckhts.

Corresponding BibTeX entry:

  @Manual{,
    title = {Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'},
    author = {Sounkou Mahamane Toure},
    year = {2026},
    note = {R package version 0.1.1-0.0.1},
    url = {https://github.com/rgenomicsetl/duckhts},
  }

Rduckhts: DuckDB HTS File Reader Extension for R

Rduckhts provides an R interface to a DuckDB HTS (High Throughput Sequencing) file reader extension. This enables reading common bioinformatics file formats such as VCF/BCF, SAM/BAM/CRAM, FASTA, FASTQ, GFF, GTF, and tabix-indexed files directly from R using SQL queries via duckhts.
How it works

Following RBCFTools, tables are created and returned instead of data frames. VCF/BCF, SAM/BAM/CRAM, FASTA, FASTQ, GFF, GTF, and tabix formats can be queried. We support region queries for indexed files, and we target Linux, macOS, and Windows (Rtools). htslib 1.23 is bundled so build dependencies stay minimal. The extension is built by adapting the generic extension infrastructure to use only makefiles, unlike the submitted community extension duckhts.
Installation

The package can be installed from GitHub:

remotes::install_github(
    "RGenomicsETL/duckhts", subdir = "r/Rduckhts")

System Requirements

Installation requires htslib dependencies such as zlib and libbz2 and, optionally for full functionality, liblzma, libcurl, and openssl. The package requires GNU make. On Windows with Rtools, htslib plugins are not enabled.
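
Before installing from source, it can help to confirm that a make program is visible to R. The following is a minimal sketch using only base R. The executable names `make` and `gmake` are an assumption about how GNU make is installed on a given system, and the check does not cover zlib, libbz2, or the optional libraries.

```r
# Look for a make executable on the PATH.
# Assumption: GNU make is installed under the name `make` or `gmake`.
make_path <- Sys.which(c("make", "gmake"))
found <- make_path[nzchar(make_path)]

if (length(found) == 0) {
  message("No make executable found on PATH; source installation will likely fail.")
} else {
  # Ask the first match for its version banner and show the first line.
  version_line <- system2(found[[1]], "--version", stdout = TRUE)[1]
  message("Found: ", found[[1]], " (", version_line, ")")
}
```

If the reported banner does not mention GNU Make, the executable found may be a different make implementation.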

Quick Start

The extension is loaded with rduckhts_load(con, extension_path = NULL). We can create tables with rduckhts_bcf, rduckhts_bam, rduckhts_fasta, rduckhts_fastq, rduckhts_gff, rduckhts_gtf, and rduckhts_tabix, using the parameters documented in their help pages.

library(DBI)
library(duckdb)
library(Rduckhts)


ext_path <- system.file("extdata", "duckhts.duckdb_extension", package = "Rduckhts")
fasta_path <- system.file("extdata", "ce.fa", package = "Rduckhts")
fastq_r1 <- system.file("extdata", "r1.fq", package = "Rduckhts")
fastq_r2 <- system.file("extdata", "r2.fq", package = "Rduckhts")
con <- dbConnect(duckdb::duckdb(config = list(allow_unsigned_extensions = "true")))
rduckhts_load(con, extension_path = ext_path)
#> [1] TRUE

rduckhts_fasta(con, "sequences", fasta_path, overwrite = TRUE)
rduckhts_fastq(con, "reads", fastq_r1, mate_path = fastq_r2, overwrite = TRUE)

dbGetQuery(con, "SELECT COUNT(*) AS n FROM sequences")
#>   n
#> 1 7
dbGetQuery(con, "SELECT COUNT(*) AS n FROM reads")
#>    n
#> 1 10
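
Since the readers create tables in DuckDB, ordinary DBI calls should apply to them. The sketch below repeats the Quick Start setup so it runs on its own, then inspects the schema with DESCRIBE rather than assuming column names, which this page does not list. It assumes Rduckhts is installed with its bundled example files.

```r
library(DBI)
library(duckdb)
library(Rduckhts)

# Same setup as the Quick Start: bundled extension and example FASTA file.
ext_path <- system.file("extdata", "duckhts.duckdb_extension", package = "Rduckhts")
fasta_path <- system.file("extdata", "ce.fa", package = "Rduckhts")
con <- dbConnect(duckdb::duckdb(config = list(allow_unsigned_extensions = "true")))
rduckhts_load(con, extension_path = ext_path)
rduckhts_fasta(con, "sequences", fasta_path, overwrite = TRUE)

# Discover the schema instead of guessing column names.
schema <- dbGetQuery(con, "DESCRIBE sequences")
print(schema[, c("column_name", "column_type")])

# Pull a small preview into an R data frame.
preview <- dbGetQuery(con, "SELECT * FROM sequences LIMIT 3")
print(preview)

# Release the connection when done.
dbDisconnect(con)
```

Keeping the data in DuckDB and fetching only small results into R avoids materializing whole files as data frames.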

FASTA, BAM, FASTQ, READER

#RStats
Rduckhts: 'DuckDB' 'HTS' File Reader Extension for 'R'

Sitting on the shoulders of the great #htslib API and the duckdb C API

Package : rgenomicsetl.r-universe.dev/Rduckhts

Open BCF/VCF in duckdb Tables in R using a bcf_reader duckdb extension based on htslib

Open BCF/VCF in duckdb Tables in R using a bcf_reader duckdb extension based on htslib while in tidy format

write into minio a parquet file obtained from conversion from VCF

DuckLake Content

Maybe the fastest BCF/VCF to #RStats DataFrames using #htslib and the #duckdb C API. Easily takes the title of fastest BCF/VCF to Parquet converter in #RStats (no other R options :D). This was motivated, among other things, by the idea of trying out #DuckLake in a familiar field
github.com/RGenomicsETL...

Samtools

Release 1.22 of HTSlib, SAMtools, and BCFtools is now available from GitHub. See htslib.org/download/ for links to tarballs and release notes. 🧪

#samtools #bcftools #htslib #bioinformatics
