We gratefully acknowledge Professors L. Becus and M. Lenoblus for useless discussions.
I feel attacked
doi.org/10.1140/epje...
Image from their link showing evidence of fabricated references
Holy smoke. What ultimately happened???
A public database of binding-site predictions in the human proteome and a google colab notebook to use the model yourself can be found here: github.com/sokrypton/af...
New paper showing that much of the apparent success of protein language models in predicting mutational effects is a mirage: These models mostly memorize sites. 1/
www.biorxiv.org/content/10.6...
This is fully in line with our experience. One advantage of the pLMs is that, once they have been trained, it's an easy approach to assess positional conservation without having to build alignments.
Community perspective:
Toward a unified framework for determining conformational ensembles of disordered proteins
with framework for experimental data acquisition, computational ensemble generation & validation
Led by @hamidrgh.bsky.social, Silvio Tosatto & Alex Monzon
doi.org/10.1038/s415...
New paper from former PhD student @tkschulze.bsky.social on supervised learning of protein variant effects across large-scale mutagenesis datasets
MAVE/DMS experiments provide large amounts of data for benchmarking variant effect predictors, but may be difficult to use in supervised learning. 1/5
This turned into a rant by accident.
We are always thinking about this and are facing it right now: new molecules and some new routes to them with some new chemistry, some new assays for the biology slashing costs by three orders of magnitude (not this research...but per sample moving forward)
We also see lab-to-lab variation despite using the same approaches (the top three panels are all from Doug's lab)
bsky.app/profile/lind...
As we discuss briefly in the paper, it should in principle be possible to learn the calibration curves from the MAVE data alone, but it's not easy
Thanks. One point of our paper was also to highlight that having calibration curves can be useful both for training and benchmarking, in particular when the latter occurs on a shared and physically meaningful scale.
As for tests on unseen data, I only know of CAGI genomeinterpretation.org
Review from Fia B. Larsen in @rhp-lab.bsky.social with everything you always wanted to know about proteasomal control of transcription factors (but were afraid to ask about)
Proteasomal control of transcription factors: mechanisms, regulation and dysregulation.
doi.org/10.1007/s000...
Supervised learning of protein variant effects across large-scale mutagenesis datasets
onlinelibrary.wiley.com/doi/10.1002/...
@tkschulze.bsky.social, Lasse Blaabjerg, @mcagiada.bsky.social
See also: Effects of residue substitutions on the cellular abundance of proteins
doi.org/10.7554/eLif...
5/5
Thea therefore built an approach to train models that takes this dataset-to-dataset variability into account via specific "standard curves", thus enabling training a model on the high-throughput data while learning to predict on the abundance scale only available in low-throughput experiments. 4/5
This variation makes it hard to perform supervised learning because a VAMP-seq score of, say 0.5, can mean quite different things in different datasets (see paper for a discussion of why that's the case). 3/5
Thea collected VAMP-seq data from the literature on how variants impact protein abundance, and showed that while there is a high correlation between abundance (as measured in low-throughput) and the sequencing-based VAMP-seq scores, the relationship may be non-linear and vary across datasets. 2/5
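To make the point about dataset-specific scales concrete, here is a minimal sketch of how the same score can map to different abundances. Everything in it is an assumption for illustration: the logistic form, the names `standard_curve` and `invert_curve`, and the two parameter sets are hypothetical and are not taken from the paper.

```python
import math

def standard_curve(abundance, midpoint, steepness):
    """Hypothetical monotonic curve: abundance (0..1) -> score (0..1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (abundance - midpoint)))

def invert_curve(score, midpoint, steepness):
    """Abundance at which the hypothetical curve gives this score (0 < score < 1)."""
    return midpoint + math.log(score / (1.0 - score)) / steepness

# Two made-up datasets, each with its own (midpoint, steepness).
datasets = {"dataset_A": (0.3, 10.0), "dataset_B": (0.6, 6.0)}

# The same score of 0.5 corresponds to a different abundance in each dataset.
abundance_at_half = {
    name: invert_curve(0.5, midpoint, steepness)
    for name, (midpoint, steepness) in datasets.items()
}

for name, abundance in abundance_at_half.items():
    print(f"{name}: score 0.5 corresponds to abundance {abundance:.2f}")
```

With curves like these, training directly on raw scores would mix incompatible scales, whereas mapping each dataset through its own curve puts the targets on a shared abundance scale.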
That may be the situation in some cases but not in the one I refer to. I don't think it would have made sense to publish the method separately from this application. But strategically it might have been better. So in that case sort of the opposite of the (very real) issue you describe
I don't know. We tried to describe the new technique with a few equations in the main text as well as a flow-chart and benchmarking using synthetic data, targeting what I thought was the right audience. And we did cut some corners compared to the full Bayesian approach.
But the key idea, a Bayesian framework that "back-propagates" deviations between simulations and experiments to update force-field parameters efficiently, was placed in the Supporting Information. In hindsight, that was a mistake: the main conceptual advance was hidden, so few noticed it.
Don't hide the good stuff in the SI
In 2008, we published a paper on parameterizing force fields for unfolded proteins using NMR data, developing an early HPS model for intrinsically disordered proteins. The paper showed the idea, validation with synthetic data, and applications to real proteins.
FMLWY are wrong, so that's indeed more than half that are correct if we accept the weird locations of the ring heteroatoms, how the OHs are connected and the wrong names for DE
Design of the protein FRET ladder
Fancy a fresh preprint for Friday? When we were first getting involved with single molecule FRET, there weren't any standard protein molecules that suited our applications to help us develop our pipeline. So we built some! A universal protein ladder for FRET. 🧵 1/
We have started a project trying to predict the interactions/structures of all yeast protein pairs using an AlphaFold pooling approach. We are making the current dataset open and we welcome collaborations.
www.evocellnet.com/2026/03/mapp...
The Division of Biological Physics @mpipks.bsky.social seeks a Research Group Leader in #biophysics, #softmatter physics, or related areas. (Further particulars in the ad.)
Apply by the 3rd of April 2026 at pks.mpg.de/bprgl to join us in Dresden!
Have a look at our latest publication:
Cupriavidus necator as an alternative source for 15N/13C isotopic enrichment of proteins expressed in insect cells for NMR - enabling structural studies of disease-relevant targets.
Full text link:
link.springer.com/article/10.1...
10712 cups/12 years ~ 2.5 cups/day
Coffee and Tea Intake, Dementia Risk, and Cognitive Function
doi.org/10.1001/jama...
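A quick check of the cups-per-day estimate above, as a minimal sketch. It assumes an average year of 365.25 days; the variable names are illustrative.

```python
cups = 10712           # total cups, from the post
years = 12             # time span, from the post
days = years * 365.25  # assumption: average year length including leap days

cups_per_day = cups / days
print(f"{cups_per_day:.2f} cups/day")
```

This comes out slightly under the rounded 2.5 cups/day quoted in the post.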
My trusted Jura has made myriad coffee and needed some TLC. Look forward to having it back
First preprint of the @pollyfordyce.bsky.social and @dunnlab.bsky.social collaboration! We used high-throughput microfluidics for sequence-strength mapping at the single-molecule level. Our new tech allowed us to discover a fundamental nonequilibrium property of multivalent systems. 1/13
Silvia's simulations showed that while all epitopes expose Sia for binding, Siglec-6 recognises and binds only GM1 because of a key interaction with the membrane through W127 and K126, which orientates the V-set domain to bind the Sia through Arg122 and the terminal Gal to the C-C' loop