Correction! @jamditis.bsky.social 's skills collection is at skills.amditis.tech (Not presented at #NICAR26 as far as I know but a worthwhile resource for journalists interested in skills!)
Interested in using AI skills for Claude/Codex in journalism? Check out @akesslerdc.bsky.social 's #NICAR26 session repo: github.com/amkessler/ni...
Another great resource: @jamditis.bsky.social 's skills collection for journalists/researchers/academics github.com/amkessler/ni...
This #rstats #package, #negligible, examines negligible-effect / #equivalence testing in #SEM models fitted with #lavaan. A few of the functions include:
1) #neg.semfit (CFI, RMSEA, SRMR)
2) #neg.normal
github.com/cribbie/negl...
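A minimal, unverified sketch of how this might be used after fitting a model in lavaan; neg.semfit() is named above, but its exact arguments are an assumption here, so check the package documentation before relying on it:

```r
# Sketch only: assumes the {negligible} and {lavaan} packages are installed,
# and that neg.semfit() accepts a fitted lavaan object (check ?neg.semfit).
library(lavaan)
library(negligible)

model <- 'visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6'
fit <- cfa(model, data = HolzingerSwineford1939)

# Equivalence-style test of model fit based on CFI/RMSEA/SRMR bounds
neg.semfit(fit)
```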
Lots to say on this, but one thing that recently came to mind is the isomorphism between “wide format” & “long format” for panel datasets. The latter treats time as a variable, which confuses the issue, at least conceptually. DiD with the wide format is clearer; see here: journals.sagepub.com/doi/10.1177/...
And if you want to hear more about this, you can attend my seminar at @lshtm-dash.bsky.social on 26 February - both IN PERSON and ONLINE!
A screenshot of the mode effects database with example survey item of "Did not always wear a seat belt (%)"
What to do if you don't know what the size of the mode effect is likely to be?
We've got you covered!
We have a database of mode effect estimates that can be used to inform this decision: cls-data.github.io/mode-effects...
👉🏼 Stay tuned for a step-by-step tutorial on how to apply QBA.
What to do instead?
With some information on the likely size of the mode effect, we can do quantitative bias analysis:
• calibrate ('correct') measures in one of the modes, as if obtained by the other
or
• estimate the likely impact mode effects would have on our substantive conclusions.
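The calibration idea above can be sketched with toy numbers; the 5-point mode effect and the prevalences below are invented for illustration, not taken from the database:

```r
# Toy quantitative bias analysis for a binary survey item.
# Assumed bias parameter: interviewer-administered (telephone) mode
# under-reports the behaviour by 5 percentage points (illustrative only).
p_web       <- 0.30    # observed prevalence, self-completion (web) mode
p_phone     <- 0.24    # observed prevalence, telephone mode
mode_effect <- -0.05   # assumed effect of telephone mode on reporting

# Calibrate the telephone estimate as if it had been collected by web
p_phone_calibrated <- p_phone - mode_effect

# Pooled prevalence before and after calibration (equal-sized mode groups)
naive_pooled      <- mean(c(p_web, p_phone))
calibrated_pooled <- mean(c(p_web, p_phone_calibrated))
```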
DAGs depicting an example of collider bias introduced by conditioning on mode due to it being a common consequence of the latent exposure and another unobserved variable.
As is often the case, collider bias can occur in multiple ways. For example, via an unobserved common cause of the outcome and mode.
We can therefore only use conditioning where we can plausibly assume no mode selection exists, or where we can condition on all such common causes.
DAGs depicting an example of confounding introduced by mode, due to the presence of mode effects on the measures of both the exposure and outcome.
The exact consequences will depend on the scenario. Generally:
Mode effects on the outcome -> uncertainty in the estimate
Mode effects on the exposure -> regression dilution bias
Mode effects on both -> confounding
In all of these, however, simply conditioning on mode can resolve the problem.
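A quick simulation of the exposure case (all parameter values invented): a mode effect shifts the measured exposure, attenuating the naive estimate, while conditioning on mode recovers the true coefficient:

```r
# Simulated mode effect on the exposure measure: regression dilution
# in the naive model, resolved by conditioning on mode.
set.seed(1)
n     <- 10000
mode  <- rbinom(n, 1, 0.5)       # 0 = self-completion, 1 = interviewer
x     <- rnorm(n)                # latent exposure
x_obs <- x + 1.5 * mode          # mode shifts the measured exposure
y     <- 0.5 * x + rnorm(n)      # true effect of x on y is 0.5

b_naive <- coef(lm(y ~ x_obs))["x_obs"]         # attenuated estimate
b_adj   <- coef(lm(y ~ x_obs + mode))["x_obs"]  # close to 0.5
```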
A measured version of a variable (X*), which is caused by its latent version (X) and Mode
Variables can be measured differently based on the mode used. These "mode effects" can be conceptualised as a type of systematic measurement error.
For example, sensitive information may be under-reported when provided to a human interviewer, compared to in a self-completion questionnaire.
The way data are collected (e.g. in person, online, telephone) is referred to as a "mode". More and more surveys are transitioning to mixed-mode designs.
This creates two interesting challenges: "mode effects" and "mode selection". As usual, DAGs are very helpful for understanding such phenomena.
Users of survey data, lovers of DAGs, and general methodological enthusiasts, gather round!
I'm so excited to share this new paper, joint work with my brilliant colleagues @rjsilverwood.bsky.social, @pwgtennant.bsky.social, and Liam Wright.
🧵
Somewhat fittingly, here's a super recent paper discussing survey mode effects with causal graphs: bsky.app/profile/geor...
Two scenarios discussed with causal graphs: Survey mode causally affects the gender gap in life satisfaction and Survey mode is confounded with the gender gap in life satisfaction
This could either be a gender-specific survey mode effect, or just a reflection of selection effects, or a mix of the two. What we consider more likely determines how we should analyze the data, though.
Mean life satisfaction in the survey years 2010, 2015 and 2023, separately for female and male gender. In general, girls have lower scores but in 2023, they are drifting even further down
The city of Leipzig in Germany conducts large-scale school surveys of adolescents in secondary education schools. Following the regular surveys in 2010 and 2015, the 2020 survey had to be rescheduled to 2023 due to the COVID-19 pandemic. In this latest survey wave, the gender gap in general life satisfaction has significantly grown. While in 2010 and 2015 girls were somewhat less satisfied than boys (0.26 to 0.33 SD), in 2023 this gender gap had doubled (with girls 0.57 SD less satisfied). Why? Here, we probe various explanations, aiming to provide a template for researchers who are asking reverse causal questions (“What caused this?”). First, we find that the widening of the gender gap is much more pronounced among students with a migration background. This could plausibly be explained by a shift in the composition of the underlying population, with a strong increase of Syrian students, and a relative decrease of Vietnamese ones. Second, among students without a migration background, part of the increasing gender gap could potentially be attributed to survey mode: In 2023, for the first time, the survey was conducted on tablets—and unexpectedly, girls (but not boys) reported significantly lower satisfaction when surveyed on tablet rather than on paper. Third, beyond these two patterns, we still find significantly widening gender gaps in satisfaction with leisure time activities and relationships to friends. Thus, there may be a substantive increase in the gender gap in satisfaction in those two domains that is not readily attributable to changes in population and survey mode.
New preprint 🥳
The city of Leipzig conducts large-scale surveys of adolescents. In 2023, the gender gap in life satisfaction widened significantly, with girls declining more steeply than boys. What's up with that?
(work with @rmcelreath.bsky.social and @gregork.bsky.social)
There's possible reverse causality, there's potential reverse causality, and then there's the fear that young people living with their parents will hurt their job prospects.
Very nice agentic uses in academia! By @gvrkiran.bsky.social
m.youtube.com/playlist?lis...
DAG representing the causal structure of a standard difference-in-differences design with two locations and two time periods—units in one location in the post-period receive treatment. $L$ = group or location indicator (treated vs. untreated location); $T$ = time indicator (pre vs. post period); $U$ = unobserved time-invariant confounders (e.g., GDP per capita, general health status, public health infrastructure). $X \leftarrow T \rightarrow Y$ represents a common time trend affecting both locations equally. The causal effect of $X$ on $Y$ is identified by conditioning on $\{L, T\}$, which corresponds to using location and time indicator variables in a regression like `y ~ location * period`.
DAG representing the causal structure of a standard difference-in-differences design, but with explicit pre- and post-treatment outcomes. $L$ = group or location indicator (treated vs. untreated location); $T_\text{post}$ = post-period measurement (indicator that the observation occurs after the intervention); $X_\text{post}$ = treatment (which only occurs for treated locations in the post period); $Y_\text{pre}$ and $Y_\text{post}$ = outcome measured before and after the intervention. $U$ = unobserved time-invariant confounders (e.g., GDP per capita, general health status, public health infrastructure). $Y_\text{pre} \rightarrow Y_\text{post}$ represents outcome persistence (e.g. autocorrelation or slow-moving changes); $X_\text{post} \leftarrow T_\text{post} \rightarrow Y_\text{post}$ represents a common time trend affecting both locations equally. The causal effect of $X_\text{post}$ on $Y_\text{post}$ is identified by conditioning on $\{L, T_\text{post}\}$, which corresponds to using location and time indicator variables in a regression like `y ~ location * period`.
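The identification claim in both alt-texts can be checked with a short simulation (all parameter values invented): the interaction term in `y ~ location * period` recovers the treatment effect despite the location difference and the common time trend:

```r
# Minimal two-location, two-period DiD simulation matching the DAG:
# a time-invariant location difference (U), a common time trend (T),
# and a treatment that hits only the treated location post-period.
set.seed(42)
cells <- expand.grid(location = c(0, 1), period = c(0, 1))
d     <- cells[rep(1:4, each = 500), ]
tau   <- 2  # true treatment effect

d$y <- 1.0 * d$location +                 # location fixed effect (U)
       0.5 * d$period +                   # common time trend (T -> Y)
       tau * d$location * d$period +      # treatment effect (X -> Y)
       rnorm(nrow(d))

fit <- lm(y ~ location * period, data = d)
did <- coef(fit)["location:period"]       # ~ tau
```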
spending my sunday evening once again attempting to draw a DAG for diff-in-diff
This thread is interesting, but just wanna propose that academic philosophers are also thoroughly in the "think it's normal to reason with/about counterfactuals, and especially think they are vital for causal inference" crowd. Easily doubling the number of voters who appreciate that to the low 100s.
In a way, counterfactual thinking *is* arcane. Most people - including policy-makers and many who took a causal inference course in grad school and should know better - just work backwards from the conclusion they want to reach.
What the BAT chief scientist had to say about causality half a century ago
E-cigarette advocates, global warming deniers and social media companies continue to sidestep the implications of research showing dangers of their products by, among other things, retreating behind the high walls of claims…
Causal AI in Clinical Trials: Vin Singh of BullFrog AI
open.substack.com/pub/afurther...
A big difference between the 21st century evolution of two social sciences, economics and psychology is that economics got very serious about causal inference, and psychology... didn't.
Partly because economists are so focused on observational data, partly because they're better at math.
On Causality
A History of How Economics Learned to Think About Cause and Effect carloschavezp29.substack.com/p/on-causali...
New blog post about the age-period-cohort identification problem!
In which, for the first time ever, I ask "What's the mechanism?" and also suggest that sometimes you may actually *not* be interested in causal inference.
www.the100.ci/2026/02/13/o...
#statstab #485 Bayesian ANCOVA and the ATE
Thoughts: Still grappling with the implications of using the causal inference approach to randomized experiments. But it's interesting.
#ATE #causalinference #ancova #ANOVA #rstats #estimand #counterfactuals
solomonkurz.netlify.app/blog/2025-07...
A while back, I wrote a thing. If you like experiments and causal inference, you should read it:
On this page:
• What’s the difference between statistical significance and substantial significance?
• Can we measure substantial significance with statistics?
• What are all the different ways we can look at model coefficients?
 – Print the object name
 – Use summary()
 – Use tidy() from the {broom} package
 – Use model_parameters() and model_details() from the {parameters} and {performance} packages
 – Make nice polished side-by-side regression tables with {modelsummary}
 – Make automatic coefficient plots with modelplot() from {modelsummary}
 – Plot model predictions and marginal effects
 – Automatic interpretation with {report}
Posted a helpful little set of FAQs about regression for my causal inference class, including illustrations of statistical vs. substantive significance and all the different things you can do with #rstats model objects
evalsp26.classes.andrewheiss.com/news/2026-02...
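The base-R options in that list can be tried in one short session with built-in data; the {broom}, {parameters}, and {modelsummary} calls are left as comments since those packages may not be installed:

```r
# A few ways to inspect a fitted model object, using built-in data
fit <- lm(mpg ~ wt + hp, data = mtcars)

fit            # print method: call + coefficients
summary(fit)   # coefficients, SEs, t-tests, R^2
coef(fit)      # named vector of point estimates
confint(fit)   # 95% confidence intervals

# With add-on packages (assumed installed):
# broom::tidy(fit)                     # coefficients as a tidy data frame
# parameters::model_parameters(fit)    # similar, with more detail
# modelsummary::modelsummary(fit)      # polished regression table
```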
Happy to share our recent article on causal inference in science studies. It aims to introduce causal thinking to the science of science community with an example from Open Science.