The new conformal prediction book now seems to be final after a bunch of updates: arxiv.org/abs/2411.118...
In my #OTD threads I often try to draw attention to important scientists who historically haven't received the same recognition as their peers.
A science education involves a lot of lore: Stories and anecdotes meant to flesh out a narrative about the development of a field...
I have added a new tutorial on discrete diffusion models:
github.com/gpeyre/ot4ml
Was saddened today to hear of the passing of Joseph Halpern. I knew him from my undergraduate days at Cornell, for part of which he was the department chair.
www.bangsfuneralhome.com/obituaries/j...
If you are interested in all things ✨causal inference✨, please join our multidisciplinary Causal Inference Interest Group (CIIG). We host monthly seminars featuring speakers with various academic backgrounds and research interests.
Links below.
cc @clscohorts.bsky.social @pwgtennant.bsky.social
We propose a method for estimating long-term treatment effects with many short-term proxy outcomes: a central challenge when experimenting on digital platforms. We formalize this challenge as a latent variable problem where observed proxies are noisy measures of a low-dimensional set of unobserved surrogates that mediate treatment effects. Through theoretical analysis and simulations, we demonstrate that regularized regression methods substantially outperform naive proxy selection. We show in particular that the bias of Ridge regression decreases as more proxies are added, with closed-form expressions for the bias-variance tradeoff. We illustrate our method with an empirical application to the California GAIN experiment.
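The core intuition can be illustrated with a toy simulation (this is my own minimal sketch, not the paper's estimator: the data-generating process, sample sizes, and variable names below are all invented for illustration). A latent surrogate S mediates the treatment effect; each observed proxy is S plus noise; a ridge regression of the long-term outcome on many proxies attenuates far less than a regression on a single proxy:

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge(X, y, lam=1.0):
    # Closed-form ridge regression: w = (X'X + lam*I)^{-1} X'y.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def simulate(p, n=2000):
    # Historical data: a latent surrogate S drives both the long-term
    # outcome Y and p noisy short-term proxies X_j = S + noise.
    S = rng.normal(size=n)
    X = S[:, None] + rng.normal(size=(n, p))
    Y = S + 0.1 * rng.normal(size=n)
    w = ridge(X, Y)

    # Experimental data: treatment T shifts the surrogate by 1 unit,
    # so the true long-term effect of T on Y is 1.
    T = rng.integers(0, 2, size=n)
    S_exp = T + rng.normal(size=n)
    X_exp = S_exp[:, None] + rng.normal(size=(n, p))

    # Predict the long-term outcome from proxies, then contrast arms.
    Yhat = X_exp @ w
    return Yhat[T == 1].mean() - Yhat[T == 0].mean()

print("1 proxy   :", simulate(p=1))   # attenuated toward 0.5 by proxy noise
print("50 proxies:", simulate(p=50))  # much closer to the true effect of 1.0
```

With a single proxy whose noise variance equals the surrogate's variance, the population attenuation factor is 1/2; averaging information across 50 proxies shrinks it to roughly 50/51, matching the paper's qualitative claim that bias falls as proxies are added.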
arXiv:
Long-Term Causal Inference with Many Noisy Proxies
By Lal, Imbens, Hull
I've just started reading this new book by Shekhar Khare, a renowned number theorist at UCLA, and I highly recommend it. It takes the reader on an intellectual adventure, with very illuminating analogies and vivid storytelling. It's a book about some brilliant math, but it also has a great heart.
We introduce epiplexity, a new measure of information that provides a foundation for how to select, generate, or transform data for learning systems. We have been working on this for almost 2 years, and I cannot contain my excitement! arxiv.org/abs/2601.03220 1/7
New paper: Inference on Local Variable Importance Measures for Heterogeneous Treatment Effects (with Peter B. Gilbert & @alexluedtke.bsky.social).
Preprint: arxiv.org/abs/2510.18843.
Gradient optimization methods: the benefits of instability β Peter Bartlett, UC Berkeley
www.youtube.com/watch?v=wEgT...
#MathSky #SMRISeminar
Read last night. Very nice. arxiv.org/abs/2512.01868
TL;DR: The PSF has made the decision to put our community and our shared diversity, equity, and inclusion values ahead of seeking $1.5M in new revenue. Please read and share. pyfound.blogspot.com/2025/10/NSF-...
🧵
A nice variant of the kernel two-sample test. arxiv.org/abs/2510.11853
Sketch of the idea: The MMD (maximum mean discrepancy) is the core of a commonly used nonparametric two-sample test. It works by embedding distributions into an RKHS and comparing their mean embeddings. [+]
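The mean-embedding comparison can be sketched in a few lines (a minimal illustration with a Gaussian kernel and invented toy data, not the variant proposed in the paper):

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and the rows of Y.
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2_biased(X, Y, sigma=1.0):
    # Biased (V-statistic) estimate of MMD^2: the squared RKHS distance
    # between the mean embeddings of the two samples.
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))  # sample from P
Y = rng.normal(1.0, 1.0, size=(200, 2))  # sample from Q (shifted mean)
Z = rng.normal(0.0, 1.0, size=(200, 2))  # second sample from P

# Samples from different distributions give a larger MMD than matched ones.
print(mmd2_biased(X, Y) > mmd2_biased(X, Z))
```

In the full test, the observed statistic is compared against a permutation null (recomputing the MMD after shuffling the pooled sample) to get a p-value.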
As a grad student, the biologist Yitzhi "Patrick" Cai helped program E. coli bacteria to become a biosensor for arsenic contamination in drinking water. Today, he is leading a global effort to build the first-ever synthetic eukaryotic genome. www.quantamagazine.org/hes-gleaning...
We have two new mentees who are offering their time via office hours! Please show Sandeep Silwal and Kevin Tian some love and sign up to meet them!
let-all.com/officehours....
The paper this talk is based on is quite impressive (arxiv.org/abs/2507.04441): one of those cases where you see direct, real, actionable insight coming from the categorical hammer.
Today at IAS, I gave a 2-hour-15-minute lecture on why TIME[t] is in SPACE[√(t log t)]. You can watch it here!
www.youtube.com/watch?v=ThLv...
Aligning an AI with human preferences might be hard. But there is more than one AI out there, and users can choose which to use. Can we get the benefits of a fully aligned AI without solving the alignment problem? In a new paper we study a setting in which the answer is yes.
Sergei Drozdov has published his nice proof, using hyperbolic simplices, of the necessary and sufficient condition on the radii of the spheres inscribed in and circumscribed around a Euclidean simplex in any dimension.
www.sciencedirect.com/science/arti...
The workshops focused on (in chronological order):
- Variational Inference (youtube.com/playlist?lis...)
- Optimal Transport (youtube.com/playlist?lis...)
- Parallel Computing (youtube.com/playlist?lis...), and
- Computational Physics (youtube.com/playlist?lis...).
The mathematician Lingrui Ge recently helped find a new way to understand the solutions of almost-periodic operators, important equations that appear in quantum physics. The work has helped cement an intriguing connection between number theory and physics. www.quantamagazine.org/ten-martini-...
Pretty cool: the "Fundamental Examples" of independence structures in Non-Commutative Probability.
Carlos Cinelli, Avi Feller, Guido Imbens, Edward Kennedy, Sara Magliacane, Jose Zubizarreta
Challenges in Statistics: A Dozen Challenges in Causality and Causal Inference
https://arxiv.org/abs/2508.17099
Adi Shamir's advice to young researchers:
1. Read, read, read. Back in the eighties, I read every cryptography paper out there. Once that became impossible, I read the abstract of every paper. Now I read at least every title.
This week's post is about why spherical cows are physics' mascot ⚛️🧪
open.substack.com/pub/nirmalya...
Big fan of this perspective:
This is about one of my greatest inspirations. It would mean a lot to me if you gave it a watch
Regardless of whether you plan to use them in applications, everyone should learn about Gaussian processes and Bayesian methods. They provide a foundation for reasoning about model construction and all sorts of deep learning behaviour that would otherwise appear mysterious.
After a bit of a summer pause, I'm back to making episodes. In this episode, I explain the notion of confounding, and clarify why confounders should not be thought of as alternate explanations of an observed effect.
youtu.be/kAgS7cltBhM
Randomized controlled trials (RCTs) help evaluate whether deploying AI/ML systems actually improves outcomes (e.g., survival rates in a healthcare context).
But AI/ML systems can change: Do we need a new RCT every time we update the model? Not necessarily, as we show in our UAI paper! arxiv.org/abs/2502.09467