Title: LOO-PIT predictive model checking
Authors: Herman Tesso and Aki Vehtari
Abstract: We consider predictive checking for Bayesian model assessment using leave-one-out
probability integral transform (LOO-PIT). LOO-PIT values are conditional cumulative predictive
probabilities given LOO predictive distributions and the corresponding left-out observations. For a
well-calibrated model, LOO-PIT values should be nearly uniformly distributed, but in the finite-sample
case they are not independent, because the LOO predictive distributions are determined by nearly the
same data (all but one observation). We prove that this dependency is non-negligible in the finite-sample
case and depends on model complexity. We propose three testing procedures that can be used for
continuous and discrete dependent uniform values. We also propose an automated graphical method
for visualizing local departures from the null. Extensive numerical experiments on simulated and real
datasets demonstrate that the proposed tests achieve competitive performance overall and have much
higher power than standard uniformity tests based on the independence assumption, which inevitably leads to lower than expected rejection rates.
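As a rough sketch of what a LOO-PIT value is in practice (illustrative only; the function and variable names below are my own, and bayesplot/ArviZ provide their own implementations): given posterior predictive draws and LOO log importance weights (e.g. from PSIS), each LOO-PIT value is the weighted predictive CDF evaluated at the left-out observation.

```python
import numpy as np

def loo_pit(y, y_rep, log_weights):
    """LOO-PIT values from predictive draws and LOO log importance weights.

    y           : (n,)   observed data
    y_rep       : (S, n) posterior predictive draws for each observation
    log_weights : (S, n) (smoothed) LOO log importance weights
    Returns n values in [0, 1]; near-uniform for a well-calibrated model.
    """
    # Normalize the importance weights per observation (stable in log space).
    w = np.exp(log_weights - log_weights.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    # Weighted empirical predictive CDF at the observed value:
    # PIT_i ~ P(y_rep_i <= y_i | all data except observation i).
    return np.sum(w * (y_rep <= y[np.newaxis, :]), axis=0)
```

As the abstract notes, these n values are not independent in finite samples, which is what the proposed tests account for.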
If you have been using LOO-PIT, this is a must-read for you:
"LOO-PIT predictive model checking" by me and @avehtari.bsky.social,
doi.org/10.48550/arX.... 1/4
04.03.2026 12:39
If you have been using LOO-PIT, this is a must-read for you! @herman-tesso.bsky.social has done excellent work with this paper! Thanks to @florencebockting.bsky.social and @aloctavodia.bsky.social for getting this into bayesplot and ArviZ. I'll notify when I have my case studies updated with this.
04.03.2026 12:41
Data Science Best Sellers
Bayesian Analysis with Python is part of Packt's $10 eBook campaign right now. If it's been on your reading or recommendation list, this could be a good time to grab it, along with many other books too: landing.packtpub.com/data-science...
08.01.2026 10:28
Bayesian Data Analysis course - Aalto 2025
All the material for my Bayesian Data Analysis course is available online, including the lectures, which we re-recorded this fall (some of them by @aloctavodia.bsky.social and Noa Kallioinen while I was on vacation). The video links are listed in the schedule at avehtari.github.io/BDA_course_A...
11.12.2025 14:20
Latent projection predictive feature selection
New projpred (projection predictive variable selection for brms and rstanarm) release 2.10.0. Frank Weber added support for censored observations when using the latent projection (see a vignette mc-stan.org/projpred/art...). @aloctavodia.bsky.social fixed bugs and made the release.
10.12.2025 08:44
Now I'm also looking for a research software engineer to implement a pile of research results in the R packages loo, posterior, bayesplot, projpred, priorsense, brms and/or the Python packages ArviZ, Bambi and Kulprit. Apply by email with no specific deadline (see contact info at users.aalto.fi/~ave/)
03.11.2025 11:13
I'm now also looking for a postdoc with strong Bayesian background and interest in developing Bayesian cross-validation theory, methods and software. Apply by email with no specific deadline (see contact information at users.aalto.fi/~ave/).
Others, please share
29.10.2025 14:37
Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison
It is useful to estimate the expected predictive performance of models planned to be used for prediction. We focus on leave-one-out cross-validation (LOO-CV), which has become a popular method for estimating the predictive performance of Bayesian models. Given two models, we are interested in comparing the predictive performances and the associated uncertainty, which can also be used to compute the probability of one model having better predictive performance than the other model. We study the properties of the Bayesian LOO-CV estimator and the related uncertainty quantification for the predictive performance difference, and analyse when a normal approximation of this uncertainty is well calibrated and whether taking into account higher moments could improve the approximation. We provide new results on these properties, both theoretically in the linear regression case and empirically for hierarchical linear, latent linear, and spline models, and discuss the challenges. We show that problematic cases include: comparing models with similar predictions, misspecified models, and small data. In these cases, there is a weak connection between the distributions of the LOO-CV estimator and its error. We show that the problematic skewness of the error distribution for the difference, which occurs when the models make similar predictions, does not fade away in certain situations even when the data size grows to infinity. Based on the results, we also provide some practical recommendations for the users of Bayesian LOO-CV for comparing the predictive performance of models.
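As a hedged illustration of the normal approximation discussed in the abstract (a minimal sketch with my own naming, not the paper's code; in R, the loo package's loo_compare reports the elpd difference and its standard error), the pointwise elpd differences give the estimated difference, its standard error, and an approximate probability that one model predicts better:

```python
import numpy as np
from scipy.stats import norm

def compare_elpd(elpd_a, elpd_b):
    """Normal-approximation comparison of two models' LOO predictive performance.

    elpd_a, elpd_b : (n,) pointwise LOO log predictive densities for models A and B.
    """
    diff = elpd_a - elpd_b                     # pointwise elpd differences
    n = diff.size
    elpd_diff = diff.sum()                     # estimated difference in elpd
    se_diff = np.sqrt(n * diff.var(ddof=1))    # standard error of the summed difference
    # Normal approximation to P(model A has better predictive performance);
    # the paper studies when this approximation is (not) well calibrated.
    p_a_better = norm.cdf(elpd_diff / se_diff)
    return elpd_diff, se_diff, p_a_better
```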
"Uncertainty in Bayesian leave-one-out cross-validation based model comparison" with Tuomas Sivula, Asael Alonzo Matamoros, and @mansmag.bsky.social, has been published in Bayesian Analysis doi.org/10.1214/25-B...
🧵 1/
28.10.2025 13:39
I got mail! I can't wait @vincentab.bsky.social. I'll try to do many of these examples by "hand" (learning by doing).
14.10.2025 12:22
ELLIS PhD Program: Call for Applications 2025
The ELLIS mission is to create a diverse European network that promotes research excellence and advances breakthroughs in AI, as well as a pan-European PhD program to educate the next generation of AI...
I'm looking for a doctoral student with Bayesian background to work on Bayesian workflow and cross-validation (see my publication list users.aalto.fi/~ave/publica... for my recent work) at Aalto University.
Apply through the ELLIS PhD program (deadline October 31) ellis.eu/news/ellis-p...
06.10.2025 09:28
Our faculty is looking for PhD students in artificial intelligence and machine learning! Meet our new Principal Investigators and apply for a PhD position by the end of October: www.ellisinstitute.fi/PIs-2025
All the info about the ELLIS PhD program is below!
02.10.2025 11:30
The Pink Book of #MarginalEffects (aka Model to Meaning) ships next week and I've got a backlog of Zoolander memes.
Hope you're hungry for some spam in your timeline.
#RStats #PyData
22.09.2025 16:52
Our recent media coverage 🧵 with features in @hs.fi, Finland's public broadcaster Yle and the @aalto.fi Keys to Growth series. cc @ellis.eu @okm.fi @csaalto.bsky.social
bsky.app/profile/elli...
22.09.2025 11:29
For data scientists using VS Code: a new resource just dropped to help you easily migrate your setup to Positron.
Check it out here: positron.posit.co/migrate-vsco...
#Python #VSCode #Positron
03.09.2025 13:10
Happy to announce ✨quarto-revealjs-editable✨
This fully supersedes the imagemover extension, as back then I didn't realize its potential. You can now also move, resize, and change the font size and alignment of text in your slides
github.com/EmilHvitfeld...
#quarto #slidecrafting
20.08.2025 17:38
Useless posterior predictive checking bar graphs for Models 1 and 2
Posterior predictive checking of binary, categorical and many ordinal models with bar graphs is useless. Even the simplest models without covariates usually have intercept terms such that the category-specific probabilities are learned perfectly. Can you guess which model, 1 or 2, is misspecified? 1/4
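A toy illustration of the point (my own sketch, not from the thread): fit an intercept-only Bernoulli model to data generated with a strong covariate effect; the posterior predictive category frequencies match the observed ones almost exactly, so a bar-graph check cannot reveal the misspecification.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data from a logistic regression with a strong covariate effect.
n = 500
x = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 2.0 * x))))

# Misspecified intercept-only model: with a Beta(1, 1) prior the posterior
# for the success probability is Beta(1 + sum(y), 1 + n - sum(y)).
theta = rng.beta(1 + y.sum(), 1 + n - y.sum(), size=4000)
y_rep_counts = rng.binomial(n, theta)  # posterior predictive counts of ones

print("observed proportion of ones:", y.mean())
print("posterior predictive mean proportion:", y_rep_counts.mean() / n)
# The two proportions agree closely, so a bar graph of observed vs replicated
# counts looks fine even though the covariate effect is completely ignored.
```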
13.08.2025 14:33
The ArviZ core devs have done tremendous work on an improved API with a lot of novel improvements. They have put together a great migration guide: python.arviz.org/en/stable/us...
If you are an ArviZ user please take a look at it and provide feedback. Open source is all about the community.
09.07.2025 17:29
I wrote a blog post to celebrate 10 years of the loo package (an R package implementing fast Pareto smoothed importance sampling cross-validation and many other useful methods for cross-validation)
26.06.2025 11:01
Some people think R² doesn't belong in Bayesian models
David Kohns disagrees, and he has the math to back it up
Ep. 134: @alex-andorra.bsky.social sits down with economist David Kohns to explore how modern Bayesian methods are reshaping time series modelling
learnbayesstats.com/episode/134-...
13.06.2025 15:27
Mathematics of Machine Learning
Data Science | Packt
Been following Mathematics of Machine Learning since early on, great to see it out!
Most ML math books are either too applied or too abstract. This one hits the middle: rigorous, relevant, and approachable without dumbing things down. And with Python examples!
landing.packtpub.com/mathematics-...
13.06.2025 12:27
For simple estimands, treating everything as Gaussian works unreasonably well! But lots to learn from less simple estimands. @avehtari.bsky.social has a nice case study examining this (part of our forthcoming book on workflow) users.aalto.fi/~ave/casestu...
28.05.2025 18:24
Dynamic Regression Case Study
New to the blog-game, but excited to share a piece I wrote on how to use the ARR2 prior for dynamic regression using cmdstanr: davkoh.github.io/case-studies...
It extends the idea of using R2-type priors to autoregressive state-space models (published in Bayesian Analysis) @avehtari.bsky.social
01.06.2025 12:49
Photo of four persons in front of a sign saying "Mordor". One of the persons is holding flowers and another one is holding an ice cream.
We went to Mordor and all we got were flowers and ice cream.
Bayesian workflow group was a runner-up in the Aalto Open Science Award 2024. The current and past group members, listed in alphabetical order: Alejandro Catalina, Anna Riha, Asael Alonzo Matamoros, David Kohns, ...
20.05.2025 12:29
NeurIPS participation in Europe
We seek to understand if there is interest in being able to attend NeurIPS in Europe, i.e. without travelling to San Diego, US. In the following, assume that it is possible to present accepted papers ...
Would you present your next NeurIPS paper in Europe instead of traveling to San Diego (US) if this was an option? Søren Hauberg (DTU) and I would love to hear the answer through this poll: (1/6)
30.03.2025 18:04
Happy to hear that you find the functions in PreliZ useful! I'm unsure I can help with the R part, but I'd love to hear if you think anything is missing in PreliZ.
05.03.2025 07:03
Abstract
Introduction
A key step in the Bayesian workflow for model building is the graphical assessment of model predictions, whether these are drawn from the prior or posterior predictive distribution. The goal of these assessments is to identify whether the model is a reasonable (and ideally accurate) representation of the domain knowledge and/or observed data. There are many commonly used visual predictive checks which can be misleading if their implicit assumptions do not match reality. Thus, there is a need for more guidance on selecting, interpreting, and diagnosing appropriate visualizations. As a visual predictive check itself can be viewed as a model fit to data, assessing when this model fails to represent the data is important for drawing well-informed conclusions.
Demonstration
We present recommendations for appropriate visual predictive checks for observations that are continuous, discrete, or a mixture of the two. We also discuss diagnostics to aid in the selection of visual methods, specifically for detecting an incorrect assumption of continuously distributed data: identifying when data are likely to be discrete or contain discrete components, detecting and estimating possible bounds in the data, and assessing the goodness-of-fit of density plots made through kernel density estimates.
Conclusion
We offer recommendations and diagnostic tools to mitigate ad-hoc decision-making in visual predictive checks. These contributions aim to improve the robustness and interpretability of Bayesian model criticism practices.
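As a loose illustration of the first diagnostic listed in the Demonstration above (a crude heuristic of my own, not necessarily the check proposed in the paper): data that are actually discrete, or contain a discrete component, can often be flagged before choosing a density-based check by looking at how many distinct values appear.

```python
import numpy as np

def looks_discrete(y, max_unique_frac=0.5):
    """Crude heuristic: flag data as likely discrete (or as containing a
    discrete component) when the share of distinct values is small."""
    y = np.asarray(y)
    return np.unique(y).size / y.size < max_unique_frac

rng = np.random.default_rng(0)
print(looks_discrete(rng.poisson(3, size=200)))   # True: count data
print(looks_discrete(rng.normal(size=200)))       # False: continuous data
```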
New paper Säilynoja, Johnson, Martin, and Vehtari, "Recommendations for visual predictive checks in Bayesian workflow" teemusailynoja.github.io/visual-predi... (also arxiv.org/abs/2503.01509)
04.03.2025 13:15
Field of Play UK | Sports data analytics conference
Ready to level up your sports analytics game? Attend our sports data conference on 18th March run by Field of Play UK
@fonnesbeck.bsky.social (@pymc-labs.bsky.social, @pymc.io) will be at the Field of Play Conference giving a talk on Bayesian modelling in baseball.
Our host, @alex-andorra.bsky.social, will also be attending, don't miss this chance to connect and chat research!
www.fieldofplay.co.uk
28.02.2025 15:00