Leif Sieben

@leif7ieben

Master's student @ ETH Zurich. Visiting student @ MIT and Broad Institute. Working on machine learning for drug discovery and bringing all of chemistry into the age of big data and AI.

324 Followers
1,426 Following
57 Posts
Joined 18.11.2024

Latest posts by Leif Sieben @leif7ieben

The biggest bottleneck to my personal code productivity:

The fact that OpenAI still hasn't pushed a Codex mobile app.

08.03.2026 17:55 👍 2 🔁 0 💬 0 📌 0

My paper of the year is Andrew Gordon Wilson's "Deep Learning is Not So Mysterious or Different". This year I'll be thinking about which family of functions (support), combined with which prior over parameters (inductive bias), can actually capture drug discovery data well, including activity cliffs.

29.12.2025 16:11 👍 1 🔁 0 💬 0 📌 0

You learn a lot about the underlying system design of your apps when you run them in a low data environment.

26.10.2025 22:00 👍 0 🔁 0 💬 0 📌 0

A fundamental lesson of modern AI is that scale is essential: training bigger models on bigger datasets unlocks new capabilities. A fundamental lesson of AI engineering is that scaling up isn't trivial: it is not just a matter of spending more money and resources.

22.09.2025 12:15 👍 7 🔁 3 💬 1 📌 0

It shows

07.09.2025 07:35 👍 0 🔁 0 💬 0 📌 0
Harnessing the Universal Geometry of Embeddings

Very interesting article here: vec2vec.github.io

Showing how the latent representations of two different vision models can be “translated” into each other via a universal “platonic” representation. As the authors note: interesting cybersecurity implications.

07.09.2025 07:35 👍 0 🔁 0 💬 1 📌 0

Strong Platonic Representation Hypothesis: the universal latent structure of text representations not only exists, but can be learned and, furthermore, harnessed to translate representations from one space to another without any paired data or encoders.

07.09.2025 07:35 👍 1 🔁 0 💬 1 📌 0
Data contamination is all random forest needs
Here's why we believe our Hermes prediction results are real

A friend recommended this Substack from Leash Bio to me.

I think this is a masterclass in how to correctly split the data if there ever was one.

Respect your chemistry folks!

open.substack.com/pub/leashbio...

03.08.2025 18:25 👍 0 🔁 0 💬 0 📌 0
Contrasting photographs of the night-time skylines of Manhattan (left) and Nijmegen (right), with matching genome-wide association plots underneath each.

Not sure who came up with "Manhattan Plot", but in 2014 I coined the alternative term "Nijmegen Plot" (inspired by the Dutch town where I live) to describe underwhelming results from our earliest genome-wide association scans of language/reading traits.

28.07.2025 16:41 👍 109 🔁 21 💬 2 📌 2

Love these maps of "street-text sightings" in the Pudding's latest piece
pudding.cool/2025/07/stre...

28.07.2025 14:21 👍 23 🔁 9 💬 0 📌 2
On N-dimensional Rotary Positional Embeddings
An exploration of N-dimensional rotary positional embeddings (RoPE) for vision transformers.

Great blog post on rotary position embeddings (RoPE) in more than one dimension, with interactive visualisations, a bunch of experimental results, and code!

28.07.2025 14:51 👍 18 🔁 2 💬 0 📌 0

Can an AI model predict perfectly and still have a terrible world model?

What would that even mean?

Our new ICML paper (poster tomorrow!) formalizes these questions.

One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵

14.07.2025 13:49 👍 40 🔁 14 💬 2 📌 6
Please stop saying “The Tanimoto similarity is” – RDKit blog
A simple tip to explain what you actually did

Today's #RDKit blog post is a heartfelt plea for clearer communication.
greglandrum.github.io/rdkit-blog/p...

17.07.2025 11:22 👍 32 🔁 7 💬 2 📌 1

There is a new startup from China called Moonshot.

The original “moonshot” was the Apollo Program.

An AI based moonshot could be referred to as an “AI pollo” program.

“ai pollo” in Italian means something like “to the chicken”.

13.07.2025 14:47 👍 10 🔁 1 💬 2 📌 0

I was recently on a flight with free Wi-Fi for texting but nothing else.

Joke's on them: I can use Llama through WhatsApp now …

26.06.2025 08:31 👍 0 🔁 0 💬 0 📌 0
The impact of molecular size on similarity. – RDKit blog
An exploration of how molecular size influences fingerprint similarity.

The new #RDKit blog post, inspired by a question from @valencekjell.com, looks at the impact of molecular size on similarity thresholds.
greglandrum.github.io/rdkit-blog/p...

20.06.2025 04:24 👍 12 🔁 5 💬 3 📌 1

Yay for @pschwllr.bsky.social and @mlederbauer.bsky.social (and all your co-authors who aren't on BlueSky yet) 🥳

This #dataset is a prime example of #GoodData, and it ties nicely with what @clarakirkvold.bsky.social and @grynova.bsky.social were talking about a few weeks ago in their #JournalClub

04.06.2025 15:43 👍 6 🔁 2 💬 0 📌 1

I've got a joke about Odysseus. I got lost on the way to the punchline...

13.06.2025 20:16 👍 1 🔁 1 💬 0 📌 0
smell rights. in the US, Hasbro has a trademark for the smell of Play Doh.

13.06.2025 01:59 👍 2123 🔁 195 💬 88 📌 41
Online RIS to BibTeX converter
The simple RIS (EndNote) to bib (BibTeX) online conversion app.

change my mind:

The bruot RIS to BibTeX converter is the best website ever built.

www.bruot.org/ris2bib/

05.06.2025 00:30 👍 0 🔁 1 💬 0 📌 0

In case anybody out there is working on antimicrobial resistance (AMR) and needs some motivation on this gloomy New England Monday.

09.06.2025 14:35 👍 0 🔁 0 💬 0 📌 0

I think the ranking of things which are hard to predict goes:

1. The stock market.
2. LaTeX figure placement.
3. The meaning of life.

07.06.2025 13:00 👍 1 🔁 0 💬 0 📌 0

#booksky

06.06.2025 11:25 👍 19082 🔁 3226 💬 377 📌 260

Cheminformatics family businesses be like

06.06.2025 20:23 👍 4 🔁 1 💬 0 📌 0

Just to clarify: I’m washing my hands twice now! Not to cause any concern here.

06.06.2025 16:09 👍 0 🔁 0 💬 0 📌 0

One of the surprising things about working in a microbiology lab is that you become more worried about washing your hands before using the restroom rather than after.

06.06.2025 16:09 👍 0 🔁 0 💬 1 📌 0

I think the thing I'm most excited to see over the next ~10 years of #dataviz is web-based content that interweaves long-form text and modular interactives.

Not as heavy as scrollytelling and not as aimless as a dashboard, but something in between.

This is what I was going for with the QR project!

04.06.2025 14:46 👍 41 🔁 7 💬 3 📌 1

You know volatility is going crazy when sitting down to write a PAC proof about the sampling efficiency of an active-learning algorithm feels like a therapy session.

At least math hasn't changed over the past 12 months ...

28.05.2025 20:40 👍 2 🔁 1 💬 0 📌 0

Not me accidentally typing `squeue` into the Facebook chat.

23.05.2025 01:00 👍 8 🔁 1 💬 0 📌 0