CSHL Meetings (@cshlmeetings)

Welcome to all of our Systems Biology: Global Regulation of Gene Expression attendees! #cshlsysbio

12.03.2026 13:55 👍 0 🔁 0 💬 0 📌 0

#cshlsysbio🧬

12.03.2026 13:54 👍 1 🔁 0 💬 0 📌 0

Regulatory & Non-Coding RNAs
Cold Spring Harbor Laboratory Meetings & Courses -- a private, non-profit institution with research programs in cancer, neuroscience, plant biology, genomics, bioinformatics.

Upcoming Cold Spring Harbor meeting on Regulatory & Non-coding RNAs (April 7-11, 2026). Abstract deadline is fast approaching! @cshlmeetings.bsky.social meetings.cshl.edu/meetings.asp...

22.01.2026 18:09 👍 4 🔁 2 💬 0 📌 0

I will be attending "Gene Expression & Signaling in the Immune System" at @cshlmeetings.bsky.social next week - this is their first meeting of the year and will be an exciting one.

26.02.2026 14:05 👍 2 🔁 1 💬 0 📌 0

First one of 2026! Gene Expression & Signaling in the Immune System starts tonight! #cshlimmune

04.03.2026 22:29 👍 2 🔁 0 💬 0 📌 0

Great meeting!!! Amazing talks and posters today also.

10.12.2025 01:53 👍 6 🔁 2 💬 0 📌 0

Equally important, our meeting’s simple rule is holding throughout the sessions, with graduate students and postdoctoral scholars reliably asking the first 2 questions after each presentation.

A privilege to organize this with Sally Temple and Christine Mummery!

Next edition in December 2027!

10.12.2025 20:50 👍 2 🔁 1 💬 0 📌 0

Remarkable energy at the inaugural @cshlnews.bsky.social conference on Assembloids and complex cell–cell interactions across tissues and systems.
A wonderful series of talks so far, highlighting advances & discoveries being made with these self-organizing systems. A few common themes are emerging.

10.12.2025 20:50 👍 18 🔁 4 💬 1 📌 0

That’s a wrap on 2025! Thanks to everyone who came to #cshlassembloid

12.12.2025 13:44 👍 2 🔁 0 💬 0 📌 0

The inaugural conference on #assembloids and complex cell-cell interactions is starting tonight at @cshlnews.bsky.social! Amazing energy and inspiring opening keynote by Ruslan Medzhitov.

09.12.2025 06:52 👍 15 🔁 2 💬 0 📌 1

Thank you to everyone who attended last week’s Plant Genomes, Systems Biology, and Engineering meeting! #cshlplant 🌱

08.12.2025 15:06 👍 1 🔁 0 💬 0 📌 0

Mapping the Sense of What’s Going On Inside

Great piece on #interoception and brain-body interactions by @carlzimmer.com in the @nytimes.com: www.nytimes.com/2025/11/25/s... What an exciting time to work on this. Join us @cshlmeetings.bsky.social to learn more about this exciting field: meetings.cshl.edu/meetings.asp...

26.11.2025 08:58 👍 42 🔁 14 💬 0 📌 4

Today is the last day of Zebrafish Neurobiology 🐟! #cshlzebrafish

22.11.2025 14:48 👍 1 🔁 0 💬 0 📌 0

Last day of Single Cell Analyses! 🍂 #cshlsca

15.11.2025 12:57 👍 2 🔁 0 💬 0 📌 0

Really excited to see our new work on scaling Mumemto to any size pangenome published in Genome Research this morning. And right on cue, with the great opportunity to present this work at #GI2025 this week.

07.11.2025 21:29 👍 16 🔁 5 💬 1 📌 0

#GI2025 Vikram Shivakumar from Ben Langmead's lab (@benlangmead.bsky.social) presents "MumemtoM - partitioned Multi-MUM finding for scalable pangenomics". Now published in Genome Research @genomeresearch.bsky.social. Read full text here ➡️ tinyurl.com/Genome-Res-2...

07.11.2025 15:08 👍 10 🔁 5 💬 0 📌 1

OCR Ortholog Open Chromatin Status Prediction Framework Overview. a We trained a convolutional neural network (CNN) for predicting brain open chromatin using sequences underlying brain open chromatin region (OCR) orthologs in a small number of species and used the CNN to predict brain OCR ortholog open chromatin status across the species in the Zoonomia Consortium. Specifically, we used the sequences underlying the orthologs for which we have brain open chromatin data to train a CNN for predicting open chromatin. Then, we used the CNN to predict the probability of brain open chromatin for all brain OCR orthologs; predictions are illustrated on the right. Animals for which we do not have open chromatin data are in dark gray instead of black to indicate that their brain open chromatin is imputed. While we cannot evaluate the accuracy of most of our predictions, obtaining open chromatin data from most tissues in most species is infeasible, so predictions might be the best OCR annotations that we can obtain. b To demonstrate that our models can accurately predict whether sequence differences between species are associated with open chromatin differences, in addition to the evaluations described in previous work [57], we evaluated our performance on species-specific open chromatin for a species not used in model training and clade-specific open and closed chromatin for clades not used in model training. Since such regions often comprise a minority of OCR orthologs, models could obtain good overall performance while obtaining poor performance on such regions. We also evaluated our performance on tissue-specific open and closed chromatin for a tissue not used in model training, where we expect models to predict 0 if the model learns sequence signatures related to the tissue used in training. c Full mouse test set and lineage-specific OCR accuracy evaluations for mouse sequence-only brain model, illustrating that, even for the best of these models,

Third day of Genome Informatics #GI2025 began with an exciting session on “AI, ML and Integrative Genomics” chaired by Irene Kaplow & Thomas Pierrot.
The first talk, by Irene Kaplow, focused on Challenges in Predicting Enhancer Activity Differences Between Species
doi.org/10.1186/s12864-022-08450-7

07.11.2025 14:28 👍 14 🔁 2 💬 1 📌 1

https://arxiv.org/abs/2503.17547 Learning Multi-Level Features with Matryoshka Sparse Autoencoders

The second day concluded with a fantastic talk by Cristina Martin Linares on "Minimal reconstruction of SpliceAI using distilled matryoshka sparse autoencoders".

They showed that matryoshka SAEs (arxiv.org/abs/2503.17547) improve upon openSpliceAI (elifesciences.org/reviewed-preprints/107454). #GI2025

07.11.2025 13:08 👍 4 🔁 1 💬 1 📌 0

a, Left—Fasta representation of an individual SARS-CoV-2 genome consists of sample name followed by the entire ≈ 30 kbp genome sequence. Right—MAPLE format records only the differences between the genome under consideration and a reference; columns represent the variant character observed, the position along the genome and (when necessary) the number of consecutive positions for which the character is observed. b, Left—an example likelihood vector at an internal node of a phylogenetic tree (shown by the narrow blue arrow; only a small portion of the tree is shown); for simplicity, we show only ten genome positions. At each position (rows 1–10), each column contains the likelihood for a specific nucleotide. For rows 1–9, the likelihood is concentrated at only one nucleotide (highlighted in green), while for position 10, we show an example with more uncertainty. Right—MAPLE representation of these node likelihoods. Assuming that the reference sequence at the first nine positions matches the most likely nucleotides in the vector (ATTAAAGGT), then for positions 1–9, the likelihood of nonreference nucleotides is negligible and we represent the likelihoods with a single symbol (R). At position 10, due to non-negligible uncertainty, we explicitly calculate and store the four relative likelihoods. c, Examples of likelihood calculation steps in MAPLE. Red arrows represent the flow of information from the tips to the root of the tree. Left—if two child nodes are in reference state R for a region of the genome (here, positions 1–9), then MAPLE assumes that their parent is also in state R. Right—if at a genome position (here, position 10), two child nodes have likelihoods concentrated at different nucleotides, then for their parent, we explicitly calculate the relative likelihoods of all four nucleotides.
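
A minimal Python sketch of the difference-only encoding described in panel a of the caption above. This is not the MAPLE implementation or its exact file format: the function names `encode_diffs` and `decode_diffs`, the tuple layout, and the choice to merge only runs of `n` and `-` are assumptions made for illustration.

```python
def encode_diffs(reference: str, genome: str):
    """Return (character, 1-based position, run length) for each difference.

    Hypothetical helper: assumes the genome is already aligned to the
    reference, so both strings have the same length.
    """
    if len(reference) != len(genome):
        raise ValueError("sequences must be aligned to the same length")
    entries = []
    i = 0
    while i < len(genome):
        ch = genome[i]
        if ch == reference[i]:
            i += 1  # matches the reference: nothing is stored
            continue
        j = i + 1
        # Assumption: only ambiguous ('n') or gap ('-') characters are
        # merged into a single entry with a run length.
        if ch in "n-":
            while j < len(genome) and genome[j] == ch:
                j += 1
        entries.append((ch, i + 1, j - i))
        i = j
    return entries


def decode_diffs(reference: str, entries):
    """Rebuild the full sequence from the reference plus the stored entries."""
    seq = list(reference)
    for ch, pos, length in entries:
        seq[pos - 1 : pos - 1 + length] = ch * length
    return "".join(seq)


reference = "ATTAAAGGTT"
genome = "ATTCAAnnnT"
entries = encode_diffs(reference, genome)
print(entries)
print(decode_diffs(reference, entries) == genome)
```

For a genome that differs from the reference at only a few positions, the entry list stays short regardless of genome length, which is the point of the format.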

Nicola De Maio presented "Maximum likelihood phylogenetics at pandemic scales" and discussed the importance of scalable phylogenetics in genomic epidemiology. #GenomeInformatics #GI2025
MAPLE: nature.com/articles/s41588-023-01368-0

07.11.2025 12:56 👍 1 🔁 1 💬 1 📌 0

#GI2025 Ilias Georgakopoulos-Soares presents "Quadrupia - a comprehensive catalog of G-quadruplexes across genomes from the tree of life". Now published in Genome Research @genomeresearch.bsky.social Read full text here ➡️ tinyurl.com/Genome-Res-2...

06.11.2025 22:41 👍 4 🔁 2 💬 0 📌 0

#GI2025 Chirag Jain presents "Pangenome-based genome inference using integer programming". Now published in Genome Research @genomeresearch.bsky.social Read the full text here ➡️ tinyurl.com/Genome-Res-2...

06.11.2025 22:35 👍 5 🔁 1 💬 0 📌 0

#GI2025 Mile Sikic @msikic.bsky.social presents "Geometric deep learning framework for de novo genome assembly". Now published in Genome Research @genomeresearch.bsky.social Full text here ➡️ tinyurl.com/Genome-Res-2...

06.11.2025 22:33 👍 10 🔁 4 💬 0 📌 0

Abstract: Seed-chain-extend with k-mer seeds is a powerful heuristic technique for sequence alignment used by modern sequence aligners. Although effective in practice for both runtime and accuracy, theoretical guarantees on the resulting alignment do not exist for seed-chain-extend. In this work, we give the first rigorous bounds for the efficacy of seed-chain-extend with k-mers in expectation. Assume we are given a random nucleotide sequence of length ∼n that is indexed (or seeded) and a mutated substring of length ∼m ≤ n with mutation rate θ < 0.206. We prove that we can find a k = Θ(log n) for the k-mer size such that the expected runtime of seed-chain-extend under optimal linear-gap cost chaining and quadratic time gap extension is O(mn^f(θ) log n), where f(θ) < 2.43 · θ holds as a loose bound. The alignment also turns out to be good; we prove that more than 1-o(sqrt(1/m)) fraction of the homologous bases is recoverable under an optimal chain. We also show that our bounds work when k-mers are sketched, that is, only a subset of all k-mers is selected, and that sketching reduces chaining time without increasing alignment time or decreasing accuracy too much, justifying the effectiveness of sketching as a practical speedup in sequence alignment. We verify our results in simulation and on real noisy long-read data and show that our theoretical runtimes can predict real runtimes accurately. We conjecture that our bounds can be improved further, and in particular, f(θ) can be further reduced.
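
A minimal Python sketch of the seeding and chaining steps that the abstract refers to, for readers unfamiliar with the heuristic. It is not the algorithm analyzed in the paper: the function names `find_anchors` and `chain_anchors`, the quadratic-time chaining loop, the non-overlap rule, and the `gap_cost` parameter are all assumptions made for illustration, and the final extension step (aligning the gaps between chained anchors) is left out.

```python
from collections import defaultdict


def find_anchors(ref: str, query: str, k: int):
    """Seed: return sorted (ref_pos, query_pos) pairs where the k-mers match exactly."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i : i + k]].append(i)
    anchors = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j : j + k], ()):
            anchors.append((i, j))
    return sorted(anchors)


def chain_anchors(anchors, k: int, gap_cost: float = 1.0):
    """Chain: pick a colinear, non-overlapping subset of anchors.

    Each anchor scores k; joining two anchors costs gap_cost times the
    difference between their reference gap and query gap (a linear gap cost).
    This simple dynamic program is quadratic in the number of anchors.
    """
    if not anchors:
        return []
    score = [float(k)] * len(anchors)
    back = [-1] * len(anchors)
    for b, (ib, jb) in enumerate(anchors):
        for a in range(b):
            ia, ja = anchors[a]
            if ia + k <= ib and ja + k <= jb:  # a ends before b starts
                gap = abs((ib - ia) - (jb - ja))
                candidate = score[a] + k - gap_cost * gap
                if candidate > score[b]:
                    score[b] = candidate
                    back[b] = a
    best = max(range(len(anchors)), key=score.__getitem__)
    chain = []
    while best != -1:
        chain.append(anchors[best])
        best = back[best]
    return chain[::-1]


ref = "GATTACAGGCTTAACCGTAGGCAT"
query = "ACAGGCTTGACCGTAG"  # ref[4:20] with one substitution
anchors = find_anchors(ref, query, k=4)
print(chain_anchors(anchors, k=4))
```

In this toy input the 4-mer `AGGC` occurs twice in the reference, so seeding produces one spurious anchor; the gap penalty in the chaining step is what keeps the chain on the true diagonal.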

The second day of Genome Informatics #GI2025 began with the session "Genome Assembly and Sequence Algorithms". Yun William Yu presented "Average-case Analysis of Seed-Chain-Extend under Random Mutations"
genome.cshlp.org/content/33/7/1175
providing theoretical guarantees for the popular seed-chain-extend

06.11.2025 14:19 👍 20 🔁 3 💬 1 📌 1

A thread on #GI2025 's first session 👇🏻

06.11.2025 06:03 👍 6 🔁 1 💬 0 📌 0

The first session is PANGENOMES #GI2025. Alexander Schönhuth is delivering the first talk on "Generating synthetic genotypes using diffusion models"
Paper: academic.oup.com/bioinformati...
Code: github.com/TheMody/Gene...

06.11.2025 00:52 👍 2 🔁 1 💬 1 📌 1

@benlangmead.bsky.social kicking off the start of Genome Informatics! #gi2025 @cshlnews.bsky.social

06.11.2025 00:39 👍 26 🔁 3 💬 1 📌 0

Ben Langmead @benlangmead.bsky.social delivers the official opening for this year's Genome Informatics Conference #GI2025 at Cold Spring Harbor Laboratory.
List of talks and posters: meetings.cshl.edu/abstracts.as...

06.11.2025 00:38 👍 34 🔁 7 💬 1 📌 0

#GI2025 is about to start. I hope you'll enjoy this edition!

05.11.2025 14:13 👍 5 🔁 3 💬 0 📌 0

Excited to be at the Genome Informatics conference #GI2025 at Cold Spring Harbor Laboratory this week! I’ll be sharing my current work on using machine learning to improve the reliability of metagenomics classification for analyzing gut microbiota and soil samples. Let's connect!

04.11.2025 20:51 👍 7 🔁 1 💬 0 📌 0

Genome Informatics starts tonight 🧬 #gi2025

05.11.2025 19:26 👍 5 🔁 0 💬 0 📌 0
