I am pleased to share that my paper "VCBART: Bayesian Trees for Varying Coefficients" (with Sameer Deshpande, Cecilia Balocchi, Jennifer Starling, and Jordan Weiss) has been published in the latest issue of Bayesian Analysis!
Read it here: doi.org/10.1214/24-B...
10.03.2026 03:23
Seven Major Directions and Trends in Modern Statistics | Ray Bai
New blog post: "Seven Major Directions and Trends in Modern Statistics"! In this post, I summarize a few of the latest trends and prominent areas in the field of statistics.
raybai.net/seven-major-...
03.03.2026 18:44
I often explain deep learning and deep generative models (DGMs) to non-experts & students who are new to the area but interested in exploring it. I find it's very helpful to start by framing linear regression and logistic regression as special cases of neural networks with a single output layer.
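To make that framing concrete, here is a minimal sketch (the data, variable names, and hyperparameters are illustrative, not from the post): logistic regression is exactly one linear layer followed by a sigmoid, trained by gradient descent on the cross-entropy loss.

```python
import numpy as np

# Logistic regression as a one-layer neural network:
# forward pass = linear layer + sigmoid, trained by gradient
# descent on the cross-entropy loss. (Illustrative sketch.)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
true_w = np.array([1.5, -2.0, 0.5])            # made-up "true" weights
y = (sigmoid(X @ true_w) > rng.uniform(size=n)).astype(float)

w = np.zeros(p)     # weights of the single layer
b = 0.0             # bias term
lr = 0.1
for _ in range(2000):
    p_hat = sigmoid(X @ w + b)        # forward pass
    grad_w = X.T @ (p_hat - y) / n    # gradient of cross-entropy wrt w
    grad_b = np.mean(p_hat - y)       # gradient wrt bias
    w -= lr * grad_w
    b -= lr * grad_b

acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(round(float(acc), 2))
```

The same template extends to linear regression (drop the sigmoid, use squared error) and to a deep network (stack more layers before the output).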
25.02.2026 17:56
Happening tomorrow at UMBC! Excited for my visit
19.02.2026 18:33
(2/2) Never thought of myself as much of a probabilist either, but my recent work on DGMs delved into functional inequalities in probability theory to characterize transport maps. You just never know when these things will pop up or when you'll use them!
15.02.2026 19:22
(1/2) It's always a bit wild to me when something I learned many years ago comes up again. I wasn't sure I'd ever use differential equations again, but now that flow matching and diffusion models are the state of the art in generative modeling, I'm reviewing a bit of ODEs.
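As a toy illustration of that ODE connection (a hedged sketch with made-up numbers, not the post's own work): flow-matching samplers generate data by integrating a velocity field with an ODE solver. For Gaussian endpoints, the straight-line flow x_t = (1 - t) x0 + t (m + s x0) pushes N(0, 1) onto N(m, s^2), and its velocity field has a closed form we can integrate with plain forward Euler.

```python
import random
import statistics

# Velocity field of the straight-line (conditional OT) flow from
# N(0, 1) to N(m, s^2), written in terms of the current state x_t:
#     v(x, t) = m + (s - 1) (x - t m) / (1 + t (s - 1)).
# Integrating it with forward Euler is the same kind of ODE solve
# used at sampling time in flow-matching models. (Toy numbers.)

m, s = 3.0, 0.5          # target mean and standard deviation

def velocity(x, t):
    return m + (s - 1.0) * (x - t * m) / (1.0 + t * (s - 1.0))

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(5000)]

steps = 100
dt = 1.0 / steps
for i in range(steps):
    t = i * dt
    samples = [x + dt * velocity(x, t) for x in samples]

print(round(statistics.mean(samples), 2))   # close to m = 3.0
print(round(statistics.stdev(samples), 2))  # close to s = 0.5
```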
15.02.2026 19:21
A bit late but group pic from the Maryland Statistics Symposium at Brin Mathematics Research Center this past Dec! Left to right: Jianhui Zhou, Gemma Moran, Lizhen Lin, Alden Green, Ray Bai, Anindya Roy, Cindy Rush, Yubai Yuan, Yun Yang, Yang Feng, Anderson Ye Zhang, Yanyuan Ma
13.02.2026 15:55
I hope Sinners wins the Academy Award for Best Picture this year. Not just because it was an incredible movie, but because as a longtime horror aficionado, this would signal a broader appetite for horror & other genre-bending films in the Academy (justice for Get Out!).
12.02.2026 14:04
I'm giving a talk "Deep Generative Models for Statistical Problems: Methods, Computation, and Theory" at the UMBC Mathematics and Statistics Dept next Friday, Feb. 20 from 11:00 am-12:00 pm! Come join if you're in the area. mathstat.umbc.edu/events/event...
10.02.2026 16:31
This Super Bowl game is fairly boring, but absolutely loved the Halftime Show and the other musical performances! Green Day, Lady Gaga, Bad Bunny ❤️❤️
09.02.2026 02:21
Congrats to my collaborator and former student Qingyang Liu (I taught him in 2 classes, served on his dissertation committee, and have co-authored several papers with him)! He will be joining @wakeforeststats.bsky.social as an Assistant Professor in July. 🥳 Great department!
03.02.2026 23:44
So maddening what happened at the University of Nebraska-Lincoln
magazine.amstat.org/blog/2026/01...
06.01.2026 17:49
Our paper "Quantifying predictive uncertainty of aphasia severity in stroke patients with sparse heteroscedastic Bayesian high-dimensional regression" was published in the most recent issue of Computational Statistics. Read the paper here: doi.org/10.1007/s001...
02.01.2026 13:57
Will you incorporate LLMs and AI prompting into the course in the future?
No.
Why won't you incorporate LLMs and AI prompting into the course?
These tools are useful for coding (see this post for my personal take).
However, they're only useful if you know what you're doing first. If you skip the learning-the-process-of-writing-code step and just copy/paste output from ChatGPT, you will not learn. You cannot learn. You cannot improve. You will not understand the code.
That post warns that you cannot use it as a beginner:
…to use Databot effectively and safely, you still need the skills of a data scientist: background and domain knowledge, data analysis expertise, and coding ability.
There is no LLM-based shortcut to those skills. You cannot LLM your way into domain knowledge, data analysis expertise, or coding ability.
The only way to gain domain knowledge, data analysis expertise, and coding ability is to struggle. To get errors. To google those errors. To look over the documentation. To copy/paste your own code and adapt it for different purposes. To explore messy datasets. To struggle to clean those datasets. To spend an hour looking for a missing comma.
This isn't a form of programming hazing, like "I had to walk to school uphill both ways in the snow and now you must too." It's the actual process of learning and growing and developing and improving. You've gotta struggle.
This Tumblr post puts it well (it's about art specifically, but it applies to coding and data analysis too):
Contrary to popular belief the biggest beginner's roadblock to art isn't even technical skill it's frustration tolerance, especially in the age of social media. It hurts and the frustration is endless but you must build the frustration tolerance equivalent to a roach's capacity to survive a nuclear explosion. That's how you build on the technical skill. Throw that "won't even start because I'm afraid it won't be perfect" shit out the window. Just do it. Just start. Good luck. (The original post has disappeared, but here's a reblog.)
It's hard, but struggling is the only way to learn anything.
You might not enjoy code as much as Williams does (or I do), but there's still value in maintaining coding skills as you improve and learn more. You don't want your skills to atrophy.
As I discuss here, when I do use LLMs for coding-related tasks, I purposely throw as much friction into the process as possible:
To avoid falling into over-reliance on LLM-assisted code help, I add as much friction into my workflow as possible. I only use GitHub Copilot and Claude in the browser, not through the chat sidebar in Positron or Visual Studio Code. I treat the code it generates like random answers from StackOverflow or blog posts and generally rewrite it completely. I disable the inline LLM-based auto complete in text editors. For routine tasks like generating {roxygen2} documentation scaffolding for functions, I use the {chores} package, which requires a bunch of pointing and clicking to use.
Even though I use Positron, I purposely do not use either Positron Assistant or Databot. I have them disabled.
So in the end, for pedagogical reasons, I don't foresee myself incorporating LLMs into this class. I'm pedagogically opposed to it. I'm facing all sorts of external pressure to do it, but I'm resisting.
You've got to learn first.
Some closing thoughts for my students this semester on LLMs and learning #rstats datavizf25.classes.andrewheiss.com/news/2025-12...
09.12.2025 20:17
Exam question: After you have explained 97% confidence to Bob, he responds, "I see. 97% is pretty good, but it could be great if we can make a 100% confidence interval." What is your response to this?
Student's answer: "Bob, you are a fool amongst fools. Truly, I pity you. A 100% confidence interval would be useless as it would give us a result of all real numbers. That's the only way to be 100% sure our true mean is in the interval; if every number could be included."
Grading my final exams for undergrad probability & statistics, and this response to one of my questions seriously made me laugh out loud for minutes. Should I give extra credit for the student's response? "Bob, you are a fool amongst fools." 😂😂😂
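For what it's worth, the student's logic checks out numerically. A quick sketch (n and sigma here are made-up illustrative values): the half-width of a z-interval for the mean is z * sigma / sqrt(n), and the required z quantile diverges as the confidence level approaches 100%, so an interval with exact 100% coverage must be all of the real line.

```python
import math
from statistics import NormalDist

# Half-width of a z-interval for the mean: z_{alpha/2} * sigma / sqrt(n).
# As confidence -> 100%, the required z quantile -> infinity, so the
# only interval with exact 100% coverage is all real numbers.
# (Illustrative numbers; n and sigma are made up.)

nd = NormalDist()
n, sigma = 25, 1.0

widths = []
for conf in (0.90, 0.97, 0.99, 0.999, 0.999999):
    z = nd.inv_cdf(0.5 + conf / 2)          # two-sided critical value
    half_width = z * sigma / math.sqrt(n)
    widths.append(half_width)
    print(f"{conf:>9}: half-width = {half_width:.3f}")

# Asking for exact 100% confidence means asking for the z quantile at
# p = 1.0, which does not exist as a finite number:
try:
    nd.inv_cdf(1.0)
except Exception as err:
    print("100% confidence:", type(err).__name__)
```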
12.12.2025 18:16
Our R package VCBART, for fitting BART-based varying coefficient models, is now available on CRAN! Useful for flexible regression modeling + can be used to estimate heterogeneous treatment effects in causal inference by specifying X and Z appropriately. Check it out: cran.r-project.org/web/packages...
10.12.2025 04:55
Colleges Are Preparing to Self-Lobotomize
The skills that students will need in an age of automation are precisely those that are eroded by inserting AI into the educational process.
Yes. "... the skills that future graduates will most need in the AI eraβcreative thinking, the capacity to learn new things, flexible modes of analysisβare precisely those that are likely to be eroded by inserting AI into the educational process."
08.12.2025 06:58
I'm at the Brin Mathematics Research Center today for the Maryland Statistics Symposium! Presenting my work on generative quantile regression w/ former PhD student Dr. Shijie Wang (U. South Carolina '24) and Dr. Minsuk Shin of Yonsei U. (published in JCGS last year).
05.12.2025 14:20
Maryland Statistical Symposium | Brin Mathematics Research Center
The Maryland Statistical Symposium looks awesome! brinmrc.umd.edu/fall25-mss/
So honored to be invited to speak at this event alongside many outstanding researchers, some of whose work I have followed and admired for years!
24.11.2025 17:12
University of Nebraska-Lincoln Department of Statistics seminar "The Metrics" on November 6, 2025
YouTube video by Chris Bilder
If you're following the #UNL #statistics saga (proposed for elimination based on bad stats), you might find the seminar we gave yesterday interesting... youtu.be/fUk2R0UYWpA
It was weird to rail against someone for an hour, but strangely cathartic, and the #datavis seems to have been effective?
07.11.2025 14:33
Congrats to my student Leah Wood for successfully defending her senior honors thesis "Spatiotemporal Modeling of Maternal Mortality in South Carolina 2018-2023"! Leah will pursue a Master's in Biostatistics next.
This was on par with an excellent Master's thesis, tbh. Great job!
05.11.2025 20:06
nailed it!
04.11.2025 06:36
One week till my trip to Columbia, SC to see my Honors student Leah defend her senior thesis! She did an excellent job on Bayesian spatiotemporal modeling of maternal mortality in South Carolina from 2018 to 2023. She coded up the model in Stan & R and produced some very nice maps!
29.10.2025 00:40
Excited to give a talk at the Maryland Statistical Symposium at the Brin Mathematics Research Center this December! Looking forward to connecting with many outstanding statistics researchers in the DMV area and the mid-Atlantic region.
21.10.2025 15:15
Today is my 40th birthday, and I had a very special treat for it: getting to meet one of my idols, Dr. Jianqing Fan! Dr. Fan's papers on the SCAD penalty and sure independence screening for high-dimensional data were among the first papers I read as a PhD student. So inspiring!
26.09.2025 20:15