Noam Teyssier

@noamteyssier

Bioinformatics Scientist at the Arc Institute. Working at the intersection of functional genomics, systems biology, and machine learning. I also build rusty bioinformatics tools https://github.com/noamteyssier

164 Followers · 103 Following · 86 Posts · Joined 14.11.2024

Latest posts by Noam Teyssier @noamteyssier

CRISPR screens in iPSC-derived neurons reveal principles of tau proteostasis
CRISPR screens in iPSC-derived neurons reveal that the E3 ubiquitin ligase CRL5SOCS4 ubiquitinates tau, that CUL5 expression is correlated with resilience in human Alzheimer’s disease, and that electr...

After a long review process, I'm excited that our paper is finally in print: www.cell.com/cell/fulltex...

TL;DR: We use CRISPR screens in iPSC-derived neurons to find a new tau E3 ligase and a relationship between oxidative stress, the proteasome, and tau proteolytic fragments.

More below 👇

28.01.2026 17:12 👍 35 🔁 11 💬 2 📌 1

Arc bioinformatics scientists @noamteyssier.bsky.social and Alex Dobin have just released cyto, an ultra-high throughput processor specifically optimized for @10xgenomics.bsky.social Flex single-cell data.

We are excited to make this resource open source: www.biorxiv.org/content/10.6...

22.01.2026 18:13 👍 7 🔁 3 💬 1 📌 0
GitHub - ArcInstitute/cyto: A mapper for single cell sequencing reads with abstract geometries

cyto is free, open-source, and production-ready. Built in Rust for reliability at scale.

Currently supports 10x Flex GEX and CRISPR, with more modalities coming.

Try it out and let us know how it works for you!

github.com/ArcInstitute...

22.01.2026 17:23 👍 1 🔁 0 💬 0 📌 0
GitHub - ArcInstitute/binseq: A high efficiency binary format for sequencing data

cyto is the first large-scale bioinformatics project built with BINSEQ. Switching to BINSEQ can achieve mapping rates of 50M reads per second and reduce your storage requirements by about 40%.

github.com/ArcInstitute...

22.01.2026 17:23 👍 1 🔁 0 💬 2 📌 0

We also show that we can reproduce the results of CellRanger at a fraction of the resource cost. Our concordance is above 99.85%, as measured via Spearman correlation on matched cell UMIs, and our lower-dimensional representations show perfect overlap with no method-specific clustering.

22.01.2026 17:23 👍 0 🔁 0 💬 1 📌 0
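
For readers unfamiliar with the metric, here is a minimal sketch of a Spearman rank correlation in plain Python. It assumes "matched cell UMIs" means one UMI total per cell from each tool, paired by cell barcode; the post does not spell this out, and the function names and the example counts are hypothetical, not taken from cyto or its benchmarks.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-cell UMI totals from two tools, matched by cell barcode.
tool_a = [120, 85, 301, 42, 97]
tool_b = [118, 85, 305, 40, 99]
rho = spearman(tool_a, tool_b)
```

Because the statistic depends only on rank order, small per-cell count differences between tools do not lower it unless they reorder cells.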

cyto was built from the ground up to be modular and to expose the individual modules to the user. Each step is highly optimized and can be run independently, which is perfect for production-scale workflows because it allows for better parallelization and resource allocation on smaller nodes.

22.01.2026 17:23 👍 0 🔁 0 💬 1 📌 0

Currently, the only tool that supports this data type is CellRanger. We show that cyto runs an order of magnitude faster (16x), uses less than half the memory, dramatically reduces CPU-hours (30x), and reduces total IO by more than 5x.

22.01.2026 17:23 👍 2 🔁 0 💬 1 📌 0

Today I’m happy to release cyto, a tool I’ve developed at @arcinstitute.org to increase our computational throughput for 10x Flex single-cell processing by more than 16x!

22.01.2026 17:23 👍 11 🔁 4 💬 1 📌 0

I've tried this at least 3 times haha. I think honestly the best way to do it is not really to port it but to drastically rework the way that it's written.

21.01.2026 23:07 👍 1 🔁 0 💬 1 📌 0
GitHub - ArcInstitute/binseq: A high efficiency binary format for sequencing data

Maybe this will be the year we start to really question the foundational infrastructure of the field.

Obligatory BINSEQ mention here - keep a lookout the next couple weeks for an update!

github.com/arcInstitute...

09.01.2026 19:06 👍 3 🔁 0 💬 0 📌 0

It is the year 2026 - bioinformaticians are still trying to figure out the best way to handle FASTQ

09.01.2026 19:06 👍 4 🔁 0 💬 1 📌 0
GitHub - ArcInstitute/binseq: A high efficiency binary format for sequencing data

My attempt at the 15th competing standard: github.com/arcInstitute...

09.01.2026 19:00 👍 0 🔁 0 💬 1 📌 0

To be one with the borrow checker one must first be willing to let go

09.01.2026 16:08 👍 1 🔁 0 💬 0 📌 0
GitHub - ArcInstitute/bqtools: A command line utility for working with BINSEQ files

Check out the github and give it a shot!

github.com/arcinstitute...

12.12.2025 19:28 👍 0 🔁 0 💬 0 📌 0

We can make as many pipes as we have threads, each with a fixed record range and with a specified segment (R1 / R2). Then we can connect to each pipe on a reader and treat it as a normal FASTX file.

What's great about this is we can process *either* sequentially or in parallel *without* deadlocks!

12.12.2025 19:28 👍 0 🔁 0 💬 1 📌 0
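
The idea in the post above can be sketched in Python with POSIX named pipes. This is an illustration of the general pattern only, not how `bqtools pipe` is implemented: `RECORDS` is a hypothetical in-memory stand-in for a random-access file, all function names are made up for the example, the R1 / R2 segment selection is left out, and `os.mkfifo` is not available on Windows.

```python
import os
import tempfile
import threading

# Hypothetical stand-in for a random-access record store.
RECORDS = [(f"read{i}", "ACGT"[i % 4] * 8) for i in range(10)]


def split_ranges(n_records: int, n_parts: int) -> list[range]:
    """Split [0, n_records) into n_parts contiguous, near-equal ranges."""
    base, extra = divmod(n_records, n_parts)
    ranges, start = [], 0
    for part in range(n_parts):
        size = base + (1 if part < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def write_range(path: str, rng: range) -> None:
    """Writer side: open() blocks until a reader attaches to this pipe."""
    with open(path, "w") as pipe:
        for i in rng:
            name, seq = RECORDS[i]
            pipe.write(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n")


def read_names(path: str) -> list[str]:
    """Reader side: treat the pipe like an ordinary FASTQ file."""
    with open(path) as pipe:
        lines = pipe.read().splitlines()
    return [line[1:] for line in lines[0::4]]


def pipe_records(n_pipes: int, parallel: bool) -> list[str]:
    """Serve fixed record ranges over n_pipes FIFOs and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"part_{k}.fq") for k in range(n_pipes)]
        for path in paths:
            os.mkfifo(path)
        ranges = split_ranges(len(RECORDS), n_pipes)
        writers = [
            threading.Thread(target=write_range, args=(path, rng))
            for path, rng in zip(paths, ranges)
        ]
        for writer in writers:
            writer.start()

        results: list[list[str]] = [[] for _ in paths]

        def consume(k: int) -> None:
            results[k] = read_names(paths[k])

        if parallel:
            readers = [threading.Thread(target=consume, args=(k,)) for k in range(n_pipes)]
            for reader in readers:
                reader.start()
            for reader in readers:
                reader.join()
        else:
            for k in range(n_pipes):
                consume(k)
        for writer in writers:
            writer.join()
    return [name for part in results for name in part]
```

Each writer depends only on its own fixed range and its own pipe, so a consumer can drain the pipes one after another or all at once without any writer waiting on another.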

This was a fun engineering problem but ultimately was not very difficult because of the way BINSEQ is designed in the first place!

Named pipes can be a headache because they require coordination between readers and writers, but because BINSEQ is random access the implementation is straightforward.

12.12.2025 19:28 👍 1 🔁 0 💬 1 📌 0

New feature in bqtools v0.4.14 that I'm stoked on!

One of the limiting factors to adopting BINSEQ is that it's new and not widely supported by existing tools.

`bqtools pipe` addresses this by transparently creating FASTX named-pipes which can be processed normally by existing tools.

12.12.2025 19:28 👍 6 🔁 1 💬 1 📌 0

If you ever need to fuzzy search some DNA, sassy is your tool.

Please spread the word; I think many people just outside my own circle could benefit from this :)

cc @rickbitloo.bsky.social

github.com/RagnarGrootK...

10.12.2025 15:50 👍 40 🔁 24 💬 4 📌 0
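
To make "fuzzy search" concrete, here is a minimal sketch of approximate substring matching under edit distance, using the textbook dynamic-programming recurrence in which a match may start anywhere in the text. It is not sassy's algorithm or API, and `fuzzy_find` is a hypothetical name chosen for this example.

```python
def fuzzy_find(pattern: str, text: str, max_edits: int) -> list[tuple[int, int]]:
    """Return (end, edits) for every text position `end` (exclusive) where
    some substring ending there is within max_edits edits of the pattern."""
    m = len(pattern)
    # Column for the empty text prefix: matching pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    hits = []
    for j, base in enumerate(text):
        # Row 0 is always 0 because a match may start at any text position.
        col = [0] * (m + 1)
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (pattern[i - 1] != base)
            skip_text_base = prev[i] + 1
            skip_pattern_base = col[i - 1] + 1
            col[i] = min(substitute, skip_text_base, skip_pattern_base)
        if col[m] <= max_edits:
            hits.append((j + 1, col[m]))
        prev = col
    return hits


# "ACCT" differs from the pattern "ACGT" by one substitution.
hits = fuzzy_find("ACGT", "TTACCTGG", max_edits=1)
```

This runs in time proportional to pattern length times text length; dedicated tools rely on much faster techniques for the same problem.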

Some optimization on VBQ with the latest binseq update, especially in lossless mode. Some ways to trim the fat:

1. Reuse zstd decoders for each thread. I was creating a decoder for each vbq block which incurred redundant allocations

2. Zero-copy parsing of blocks, referencing similar to paraseq

09.12.2025 22:12 👍 3 🔁 0 💬 0 📌 0
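
As a rough illustration of the second point only, here is a sketch of zero-copy record parsing in Python with `memoryview`. The length-prefixed block layout and the function names are assumptions made for this example; they are not the VBQ block format or the paraseq API, and decoder reuse is not shown.

```python
import struct
from typing import Iterator


def build_block(records: list[bytes]) -> bytes:
    """Hypothetical layout: each record is a little-endian u32 length
    followed by that many payload bytes."""
    return b"".join(struct.pack("<I", len(rec)) + rec for rec in records)


def iter_records(block) -> Iterator[memoryview]:
    """Yield views into the block instead of copying each payload out."""
    view = memoryview(block)
    offset = 0
    while offset < len(view):
        (length,) = struct.unpack_from("<I", view, offset)
        offset += 4
        # Slicing a memoryview references the same underlying buffer.
        yield view[offset:offset + length]
        offset += length


block = bytearray(build_block([b"ACGT", b"GG", b""]))
records = list(iter_records(block))
```

Since each record is a view and not a copy, a change to the underlying buffer is visible through the record, which is also why the views must not outlive the block they point into.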

ARM64 Linux is, I think, pretty common in cloud computing environments. Might be worth building for it too.

19.11.2025 18:18 👍 1 🔁 0 💬 1 📌 0
Efficient sequence analysis with bqtools Interactive bqtools tutorial: learn to analyse sequence data efficiently with BINSEQ files using a command-line interface in your browser.

Excited to announce a new bqtools tutorial on sandbox.bio by @noamteyssier.bsky.social! Learn about the BINSEQ file format, and how it can replace FASTQ files for better data compression and faster parallel processing: sandbox.bio/tutorials/bq...

18.11.2025 20:35 👍 8 🔁 4 💬 0 📌 0

Built with uv so you don't have to worry about the dependencies or environments. Simple as:

```
uv tool install anntools-bio
anntools --help
```

18.11.2025 20:14 👍 1 🔁 0 💬 0 📌 0
GitHub - noamteyssier/anntools: a cli-driven anndata toolkit

I work with large collections of AnnDatas for single-cell work and got tired of opening notebooks for simple operations. Built a CLI tool to handle some common stuff directly from the terminal.

Quick ops: downsample, concat, pseudobulk, QC, metadata export, etc.

github.com/noamteyssier...

18.11.2025 20:14 👍 12 🔁 4 💬 1 📌 0
sandbox.bio - Interactive bioinformatics tutorials

Side note: sandbox.bio is so cool.

Setting up an environment where you can learn and play around with these tools in the browser is no simple feat and I think it's an excellent educational resource for the bioinformatics community.

I'm very happy and proud to contribute to it!

14.11.2025 18:12 👍 2 🔁 0 💬 0 📌 0
Efficient sequence analysis with bqtools Interactive bqtools tutorial: learn to analyse sequence data efficiently with BINSEQ files using a command-line interface in your browser.

BINSEQ is a high-performance format for sequencing data and bqtools is a CLI tool that lets you create and manipulate these files in the style of samtools.

Excited to release a tutorial with @robert.bio showcasing how to use it to encode, decode, and grep sequences in the browser on sandbox.bio!

14.11.2025 18:12 👍 5 🔁 2 💬 1 📌 0

The pattern counting is something I'm especially stoked about. I was actually very surprised to see that this feature isn't more common on grep-like tools (outside of bioinformatics as well).

I've had this problem for years and I end up writing bespoke tools that do some variation of it.

07.11.2025 01:12 👍 1 🔁 0 💬 0 📌 0
Release bqtools-0.4.8 · ArcInstitute/bqtools
What's Changed: 116 support fuzzy grep with sassy by @noamteyssier in #118; 119 gate fuzzy matching behind feature flag by @noamteyssier in #120; 58 implement a pattern count feature by @noamteyssier...

New bqtools release with some nice new features!

1. Support for fuzzy matching using sassy
2. Multi-Pattern counting (like `grep -c` but the count is for each individual pattern provided)
3. Pattern files (providing large lists of patterns as either regex or literals)

github.com/ArcInstitute...

07.11.2025 01:12 👍 2 🔁 0 💬 1 📌 0
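
The multi-pattern counting described above can be sketched in a few lines of Python. This only mimics the described behavior (one `grep -c`-style count per pattern, with patterns treated as regex or literals); `count_patterns` and its `fixed` parameter are hypothetical names for this example and do not correspond to bqtools flags or internals.

```python
import re
from typing import Iterable


def count_patterns(
    sequences: Iterable[str], patterns: list[str], fixed: bool = False
) -> dict[str, int]:
    """Count, for each pattern, how many sequences contain at least one match.

    With fixed=True the patterns are treated as literal strings, otherwise
    as regular expressions.
    """
    compiled = {p: re.compile(re.escape(p) if fixed else p) for p in patterns}
    counts = {p: 0 for p in patterns}
    for seq in sequences:
        for pattern, regex in compiled.items():
            if regex.search(seq):
                counts[pattern] += 1
    return counts


reads = ["ACGTACGT", "TTTTGGGG", "ACGGGT"]
counts = count_patterns(reads, ["ACGT", "G{3}", "TT"])
```

A single pass over the reads yields all counts at once, which is the advantage over running a separate `grep -c` for every pattern.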

And stay on the lookout over the next couple of weeks (hopefully) for the release of an even bigger project built with binseq!

29.10.2025 20:41 👍 0 🔁 0 💬 0 📌 0
GitHub - ArcInstitute/binseq: A high efficiency binary format for sequencing data

And if you're interested in building with binseq here is the place to start!

github.com/arcinstitute...

29.10.2025 20:41 👍 0 🔁 0 💬 1 📌 0
GitHub - ArcInstitute/bqtools: A command line utility for working with BINSEQ files

I've also added some nice functionality to bqtools including a very useful colored grep!

github.com/arcinstitute...

29.10.2025 20:41 👍 2 🔁 0 💬 1 📌 0