Not fully randomized, but sort of quasi-experimental. Looking at verb choice during an elicited narration. There were three different possible verbs for the grammatical construct under study. If this is close enough I can probably find the data.
A picture of a slide titled "The estimand" with the subtitle (The most important word you'll never hear).
Invited talk to a doctoral seminar tomorrow, and we're taking time to talk about the estimand. I feel like senior researchers generally have an implicit understanding of this concept, but I've found it helps people a lot to talk about it explicitly at the beginning of statistical training.
In data cleaning this week, I learned that ePrime often defaults to entering a value of '0' when there is no response. This is pretty bonkers, and it's crazy that dedicated experimental software you pay a lot of money for does this. Needless to say, I have to re-write some scripts...
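A minimal sketch of the kind of recode this forces (pure Python; the column names "Response" and "RT" are hypothetical stand-ins, not ePrime's actual export format):

```python
# Sketch: recode ePrime's default RT of 0 on no-response trials as truly missing.
# Column names ("Response", "RT") are assumptions; adjust to your own export.

def clean_trial(trial: dict) -> dict:
    """Return a copy of the trial with RT set to None when no response was made."""
    cleaned = dict(trial)
    if cleaned.get("Response", "") == "" and cleaned.get("RT") == 0:
        cleaned["RT"] = None  # 0 here means "no response", not a 0 ms reaction time
    return cleaned

trials = [
    {"Subject": 1, "Response": "j", "RT": 523},
    {"Subject": 1, "Response": "", "RT": 0},  # timeout: logged as RT = 0
]
cleaned = [clean_trial(t) for t in trials]
```

The key design choice is to only recode when both the response is empty and RT is 0, so a genuine (if implausible) fast response isn't silently deleted.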
I was hired to analyze some data a while back and the professors were surprised at the end when I sent a folder with all of the scripts + a README file. Apparently the consultant before me never shared scripts because he said "then the next time you'll just use the script and not have to hire me".
The numbers on sharing these things later "upon reasonable request" are pretty bleak, so if you don't share it publicly in a way that is linked with the paper somehow, that information is/will be gone forever. If you can't share the evidence for a claim, the claims made by the paper are weaker.
I always go through scripts shared during peer review and make sure they are clear, run properly, and check them for common statistical errors. I have found everything from silly coding mistakes to serious errors in data cleaning that invalidated the claims of the paper.
b) Unless there is a valid reason for the data to NOT be shared (and there are several), it should be the default because it's hard to verify what was done with a script and no data. Or at least it's harder to tell if what was done was reasonable.
IMO data cleaning scripts also need to be in the open materials, although I've had spirited discussions about this with co-authors before who think that only the bare minimum code to replicate the numbers in the paper needs to be shared.
a) It stands as a record of what you did, including details that will never make it into a paper. As analyses get larger and more unwieldy, fewer details can be reported in papers that sometimes have strict word limits.
An exception: you want cheddar in Spain and all the locally-made stuff is white and the KerryGold is yellow.
I honestly think that I've had to cite this paper in more than half of the reviews I've ever done. At least in linguistics, it seems to be a super common misinterpretation.
My son tried a bagel for the first time recently. My wife taught him to correct my pronunciation. Now, when I say "do you want a [bægəl]?", his response is "No, [beɪgəl]!".
Yeah, I taught a class in a new department during the pandemic when masks were mandatory. When the mandate lifted and they all stopped wearing them, I realized that I couldn't recognize most students: I had mentally filled in the lower half of their faces and wasn't accurate even once.
To be fair I built and troubleshot the model/priors on simulated data a few months ago before we built the survey, but I always feel more comfortable when something goes wrong and I dig around to fix it. Now I feel like a cartoon character that's walked off a cliff but hasn't looked down yet.
DW from Arthur looking super suspicious
Me when my Bayesian SEM model samples properly and quickly the first time around:
I'm still very novice in terms of Stan code (simple models I can code up - hierarchical models kick my butt), but if your target audience is brms people looking to learn more Stan then that's me and I'd be happy to work through drafts and give feedback.
The idea that human cognition is, or can be understood as, a form of computation is a useful conceptual tool for cognitive science. It was a foundational assumption during the birth of cognitive science as a multidisciplinary field, with Artificial Intelligence (AI) as one of its contributing fields. One conception of AI in this context is as a provider of computational tools (frameworks, concepts, formalisms, models, proofs, simulations, etc.) that support theory building in cognitive science. The contemporary field of AI, however, has taken the theoretical possibility of explaining human cognition as a form of computation to imply the practical feasibility of realising human(-like or -level) cognition in factual computational systems; and, the field frames this realisation as a short-term inevitability. Yet, as we formally prove herein, creating systems with human(-like or -level) cognition is intrinsically computationally intractable.
🚨 Our paper "Reclaiming AI as a theoretical tool for cognitive science" is now forthcoming in the journal Computational Brain & Behavior. (Preprint: osf.io/preprints/ps...)
Below is a thread summary 🧵 1/n
#metatheory #AGI #AIhype #cogsci #theoreticalpsych #criticalAIliteracy
Happy US book release day to me, my amazing co-author @alexhanna.bsky.social and everyone else who can now open up their own copy!!
All the details on ordering, events, and news coverage at thecon.ai
I've recently spent lots of time on the Discourse pages of different academic software packages, and it made me appreciate the Stan site so much. I found horrendous amounts of condescension and general unhelpfulness out there, which is not something I see much of over at discourse.mc-stan.org
No, YOU just wasted an hour checking whether or not the list randomization in your experiment was working before realizing you'd copied and pasted the same file into all branches of the ifelse statement...
Tagging @dingdingpeng.the100.ci and @vincentab.bsky.social because I told them I'd keep them in the loop. Also, I'll be writing up the code and examples used in this class into a blog post directed at linguists who want to start using the marginaleffects package!
For the few it didn't work for, I'm trying to keep in mind that the previous version had been honed over 4 years, while this was my first time teaching this. I'm sure I have lots of room for improvement in walking others through this material. Hopefully this can mitigate the confusion in future years.
Overall, I think I will pivot to teaching marginaleffects for model interpretation. It seems like it was positive for many students, neutral for quite a few, and potentially discouraging for a small minority.
A couple students said they understand LMMs and GLMs worse than before the course. This was hard to read, because I want to motivate students to learn stats, not discourage them. I'm hoping this is a "I'm confused because I'm paying attention" thing, but maybe this change didn't work for everyone.
We also used non-aggregated predictions and comparisons to help us understand what is actually happening in a mixed-effects model (i.e., varying intercepts and slopes). Of those who already had experience with these, more than half said that doing this made them understand the models better.
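The intuition behind looking at non-aggregated predictions can be sketched in a few lines (a toy logistic model with made-up coefficients and covariate values, not the course data): because the inverse link is nonlinear, the average of the observation-level predictions and the prediction at the average covariate need not match, so summarizing too early can hide what the model is doing.

```python
import math

def inv_logit(x: float) -> float:
    """Inverse logit: map a linear predictor onto a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Toy logistic model: eta = -1 + 2 * x, with made-up covariate values.
xs = [-2.0, -0.5, 0.0, 1.5, 3.0]
eta = [-1 + 2 * x for x in xs]

# (1) Average of the observation-level (non-aggregated) predictions.
avg_of_preds = sum(inv_logit(e) for e in eta) / len(eta)

# (2) Single prediction at the average covariate value.
pred_at_avg = inv_logit(-1 + 2 * (sum(xs) / len(xs)))

# Because inv_logit is nonlinear, (1) and (2) generally differ.
```

The same logic is why per-group predictions in a mixed-effects model (each group getting its own intercept and slope) tell you more than one prediction at "average" values.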
For predictions/comparisons, I asked how comfortable people felt from "1" (I don't feel comfortable using this at all) to "5" (I could use this in my own research). Most answered between 2-4. Those with previous stats knowledge answered higher. predictions() were rated higher than comparisons().
25% of students took this course previously. Of that 25%, half thought that model interpretation was more straightforward with marginaleffects compared to summary tables. The other half said both were equally hard to understand. No one thought summary tables were easier, or that both were easy.
Some context: 1/3 of my students were learning about regression for the first time. Another third took an MA course with me where we learned to fit models and interpret summary tables, along with visualizing model predictions. The final third were PhD students with previous knowledge of regression.
The past 4 weeks I've taught a regression crash course to graduate students in linguistics. This is my fifth year teaching the course, and I decided to switch to teaching model interpretation using the marginaleffects package. Here are my thoughts along with info from an anonymous student survey 🧵