Daniel Anderson

@danderson123

Bioinformatician @ Basecamp Research

37 Followers
133 Following
3 Posts
Joined 02.12.2024
Latest posts by Daniel Anderson @danderson123

Splicer: Phylogenetic Placement in Sub-Linear Time https://www.biorxiv.org/content/10.64898/2026.02.10.705130v1

12.02.2026 19:47 👍 1 🔁 1 💬 0 📌 0

Huge congratulations @martibartfast.bsky.social and @zaminiqbal.bsky.social on the publication of this fantastic and massive paper. A huge achievement!

11.02.2026 22:01 👍 4 🔁 0 💬 1 📌 0

Embarrassingly_FASTA: Enabling Recomputable, Population-Scale Pangenomics by Reducing Commercial Genome Processing Costs from $100 to less than $1 https://www.biorxiv.org/content/10.64898/2026.02.02.703356v1

04.02.2026 14:46 👍 0 🔁 1 💬 0 📌 0

So anyway:
BiRank & QuadRank: single-cache-miss rank queries that are double the throughput of other Rust crates and fully saturate the memory bandwidth.
Side effect: QuadFm is smaller and 2-4x faster than the next-best FM-index.

github.com/RagnarGrootK...

raw.githubusercontent.com/RagnarGrootK...

04.02.2026 01:24 👍 18 🔁 9 💬 2 📌 0

Very proud to have played a small part in this important work!

15.01.2026 18:50 👍 1 🔁 0 💬 0 📌 0

EDEN: a family of genomic language models trained on up to 9.7 trillion nucleotides from @basecamp-research.bsky.social's BaseData can design large serine recombinases, bridge recombinases, and antimicrobial peptides.

www.biorxiv.org/content/10.6...

Happy to have played a small part in this!

13.01.2026 15:16 👍 18 🔁 5 💬 0 📌 0

Rapid and Consistent Genome Clustering for Navigating Bacterial Diversity with Millions of MAGs and Isolates https://www.biorxiv.org/content/10.64898/2025.12.30.695181v1

31.12.2025 05:46 👍 0 🔁 1 💬 0 📌 0

Rewriting protein alphabets with language models https://www.biorxiv.org/content/10.1101/2025.11.27.690975v1

29.11.2025 02:47 👍 0 🔁 1 💬 0 📌 0

Deciphering enzymatic potential in metagenomic reads through DNA language model https://www.biorxiv.org/content/10.1101/2024.12.10.627786v1

12.12.2024 02:47 👍 1 🔁 1 💬 0 📌 0

A General Transformer-Based Multi-Task Learning Framework for Predicting Interaction Types between Enzyme and Small Molecule https://www.biorxiv.org/content/10.1101/2025.10.09.681419v1

11.10.2025 08:46 👍 2 🔁 2 💬 0 📌 0
RemoteFoldSet: Benchmarking Structural Awareness of Protein Language Models
Protein language models (pLMs) have the capacity to infer structural information from amino acid sequences. Evaluating the extent to which structural information they truly encode is crucial for asses...

RemoteFoldSet: Benchmarking Structural Awareness of Protein Language Models
Zinnia Ma, Neville P. Bethel
bioRxiv 2025.09.23.678152; doi: doi.org/10.1101/2025...

29.09.2025 03:49 👍 10 🔁 2 💬 0 📌 0
High-accuracy SNV calling for bacterial isolates using deep learning with AccuSNV
Accurate detection of mutations within bacterial species is critical for fundamental studies of microbial evolution, reconstructing transmission events, and identifying antimicrobial resistance mutati...

Precisely calling mutations across hundreds of bacterial isolates has been hard, requiring manual filtering and expertise.

Until now, using AccuSNV.

Herui Liao trained an ML model based on our previous meticulously called SNVs.
www.biorxiv.org/content/10.1...

29.09.2025 19:45 👍 72 🔁 34 💬 2 📌 1

Now published in @natcomms.nature.com 🎉

www.nature.com/articles/s41...

With Gillian Rodger, @nstoesser.bsky.social, @samlipworth.bsky.social, @stat-sarah.bsky.social, and many others!

30.09.2025 16:21 👍 21 🔁 14 💬 0 📌 0
Machine learning for biosecurity: A probabilistic framework for invasive species management
By using pre-introduction traits and leveraging ML for early detection, this study presents a scalable, data-driven framework for invasion risk assessment and conservation planning. Our approach enab...

Machine learning for biosecurity: A probabilistic framework for invasive species management. Journal of Applied Ecology, 00, 1–13. doi.org/10.1111/1365...

04.10.2025 12:30 👍 1 🔁 1 💬 0 📌 0
Alice: fast and haplotype-aware assembly of high-fidelity reads based on MSR sketching
We introduce Mapping-friendly Sequence Reduction (MSR) sketches, a sketching method for high-fidelity (HiFi) long reads, and Alice, an assembler that operates directly on these sketches. MSR produces ...

Our preprint on our new metagenomic HiFi assembler Alice is out 🥳 Based on a *new sketching method* (🧵1/6)
👉 Preprint www.biorxiv.org/content/10.1...
👉 Github github.com/rolandfaure/...

03.10.2025 14:51 👍 25 🔁 21 💬 2 📌 0
How to rapidly search the world’s microbial DNA
By making the world’s microbial DNA easier to explore, LexicMap helps researchers track outbreaks, study antibiotic resistance, and understand microbial diversity.

There are millions of openly available microbial genomes, but searching them can be slow.

Until now 🥁

Introducing LexicMap, a new alignment tool that lets scientists search these data in minutes, helping track antibiotic resistance, trace outbreaks, and more.

www.ebi.ac.uk/about/news/r...
🦠

30.09.2025 09:47 👍 41 🔁 16 💬 1 📌 1
Compression of protein secondary structures enables ultra-fast and accurate structure searching
Protein structure prediction has undergone a revolution with the advent of AI-based algorithms, such as AlphaFold and RoseTTAFold. As a result, over 200 million predicted protein structures have been...

"We show that, despite this compression factor, SSEs can be used as a highly effective tertiary structure comparison tool, with accuracy that approaches that of Foldseek, while offering a 200-fold speedup."

www.biorxiv.org/content/10.1...

17.09.2025 18:53 👍 19 🔁 10 💬 0 📌 0
Efficient sequence alignment against millions of prokaryotic genomes with LexicMap - Nature Biotechnology
LexicMap uses a fixed set of probes to efficiently query gene sequences for fast and low-memory alignment.

Sometimes you meet absolutely incredible bioinfo-magicians.
It was a huge privilege when @shenwei356.bsky.social
joined our group for a year on an @embl.org sabbatical.
While here, he developed a new way of aligning to
millions of bacteria, called LexicMap 1/n
www.nature.com/articles/s41...

10.09.2025 09:12 👍 190 🔁 99 💬 5 📌 4

Couldn’t have said it better myself!

20.05.2025 17:47 👍 4 🔁 1 💬 0 📌 0