Not fully randomized, but sort of quasi-experimental. Looking at verb choice during an elicited narration. There were three different possible verbs for the grammatical construct under study. If this is close enough I can probably find the data.
A picture of a slide titled "The estimand" with the subtitle (The most important word you'll never hear).
Invited talk to a doctoral seminar tomorrow, and we're taking time to talk about the estimand. I feel like senior researchers generally have an implicit understanding of this concept, but I've found it helps people a lot to talk about it explicitly at the beginning of statistical training.
In data cleaning this week, I learned that ePrime often defaults to entering a value of '0' when there is no response. This is pretty bonkers, and it's crazy that dedicated experimental software you pay a lot of money for does this. Needless to say, I have to re-write some scripts...
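A minimal sketch of the kind of recode this forces (pure Python; the column names "Response" and "RT" are hypothetical stand-ins, not ePrime's actual export format):

```python
# Sketch: recode ePrime's default RT of 0 on no-response trials as truly missing.
# Column names ("Response", "RT") are assumptions; adjust to your own export.

def clean_trial(trial: dict) -> dict:
    """Return a copy of the trial with RT set to None when no response was made."""
    cleaned = dict(trial)
    if cleaned.get("Response", "") == "" and cleaned.get("RT") == 0:
        cleaned["RT"] = None  # 0 here means "no response", not a 0 ms reaction time
    return cleaned

trials = [
    {"Subject": 1, "Response": "j", "RT": 523},
    {"Subject": 1, "Response": "", "RT": 0},  # timeout: logged as RT = 0
]
cleaned = [clean_trial(t) for t in trials]
```

The key design choice is to only recode when both the response is empty and RT is 0, so a genuine (if implausible) fast response isn't silently deleted.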
I was hired to analyze some data a while back and the professors were surprised at the end when I sent a folder with all of the scripts + a README file. Apparently the consultant before me never shared scripts because he said "then the next time you'll just use the script and not have to hire me".
The numbers on sharing these things later "upon reasonable request" are pretty bleak, so if you don't share it publicly in a way that is linked with the paper somehow, that information is/will be gone forever. If you can't share the evidence for a claim, the claims made by the paper are weaker.
I always go through scripts shared during peer review and make sure they are clear, run properly, and check them for common statistical errors. I have found everything from silly coding mistakes to serious errors in data cleaning that invalidated the claims of the paper.
b) Unless there is a valid reason for the data to NOT be shared (and there are several), it should be the default because it's hard to verify what was done with a script and no data. Or at least it's harder to tell if what was done was reasonable.
IMO data cleaning scripts also need to be in the open materials, although I've had spirited discussions about this with co-authors before who think that only the bare minimum code to replicate the numbers in the paper needs to be shared.
a) It stands as a record of what you did, including details that will never make it into a paper. As analyses get larger and more unwieldy, fewer details can be reported in papers that sometimes have strict word limits.
An exception: you want cheddar in Spain and all the locally-made stuff is white and the KerryGold is yellow.
I honestly think that I've had to cite this paper in more than half of the reviews I've ever done. At least in linguistics, it seems to be a super common misinterpretation.
My son tried a bagel for the first time recently. My wife taught him to correct my pronunciation. Now, when I say "do you want a [bægəl]?", his response is "No, [beɪgəl]!".
Yeah, I taught a class in a new department during the pandemic when masks were mandatory. When the mandate lifted and they all stopped wearing them, I realized that I couldn't recognize most students: I had mentally filled in the lower half of their faces and wasn't accurate even once.
To be fair I built and troubleshot the model/priors on simulated data a few months ago before we built the survey, but I always feel more comfortable when something goes wrong and I dig around to fix it. Now I feel like a cartoon character that's walked off a cliff but hasn't looked down yet.
DW from Arthur looking super suspicious
Me when my Bayesian SEM model samples properly and quickly the first time around:
I'm still very novice in terms of Stan code (simple models I can code up - hierarchical models kick my butt), but if your target audience is brms people looking to learn more Stan then that's me and I'd be happy to work through drafts and give feedback.
The idea that human cognition is, or can be understood as, a form of computation is a useful conceptual tool for cognitive science. It was a foundational assumption during the birth of cognitive science as a multidisciplinary field, with Artificial Intelligence (AI) as one of its contributing fields. One conception of AI in this context is as a provider of computational tools (frameworks, concepts, formalisms, models, proofs, simulations, etc.) that support theory building in cognitive science. The contemporary field of AI, however, has taken the theoretical possibility of explaining human cognition as a form of computation to imply the practical feasibility of realising human(-like or -level) cognition in factual computational systems; and, the field frames this realisation as a short-term inevitability. Yet, as we formally prove herein, creating systems with human(-like or -level) cognition is intrinsically computationally intractable.
🚨 Our paper "Reclaiming AI as a theoretical tool for cognitive science" is now forthcoming in the journal Computational Brain & Behavior. (Preprint: osf.io/preprints/ps...)
Below is a thread summary 🧵 1/n
#metatheory #AGI #AIhype #cogsci #theoreticalpsych #criticalAIliteracy
Happy US book release day to me, my amazing co-author @alexhanna.bsky.social and everyone else who can now open up their own copy!!
All the details on ordering, events, and news coverage at thecon.ai
I've recently spent lots of time on the Discourse pages of different academic software packages, and it made me appreciate the Stan site so much. I found horrendous amounts of condescension and general unhelpfulness out there, which is not something I see much of over at discourse.mc-stan.org
No, YOU just wasted an hour checking whether or not the list randomization in your experiment was working before realizing you'd copied and pasted the same file into all branches of the ifelse statement...
Tagging @dingdingpeng.the100.ci and @vincentab.bsky.social because I told them I'd keep them in the loop. Also, I'll be writing up the code and examples used in this class into a blog post directed at linguists who want to start using the marginaleffects package!
For the few it didn't work for, I'm trying to keep in mind that the previous version had been honed over 4 years, while this was my first time teaching this. I'm sure I have lots of room for improvement in walking others through this material. Hopefully this can mitigate the confusion in future years.
Overall, I think I will pivot to teaching marginaleffects for model interpretation. It seems like it was positive for many students, neutral for quite a few, and potentially discouraging for a small minority.
A couple students said they understand LMMs and GLMs worse than before the course. This was hard to read, because I want to motivate students to learn stats, not discourage them. I'm hoping this is a "I'm confused because I'm paying attention" thing, but maybe this change didn't work for everyone.
We also used non-aggregated predictions and comparisons to help us understand what is actually happening in a mixed-effects model (i.e., varying intercepts and slopes). Of those who already had experience with these, more than half said that doing this made them understand the models better.
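The intuition behind looking at non-aggregated predictions can be sketched in a few lines (a toy logistic model with made-up coefficients and covariate values, not the course data): because the inverse link is nonlinear, the average of the observation-level predictions and the prediction at the average covariate need not match, so summarizing too early can hide what the model is doing.

```python
import math

def inv_logit(x: float) -> float:
    """Inverse logit: map a linear predictor onto a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Toy logistic model: eta = -1 + 2 * x, with made-up covariate values.
xs = [-2.0, -0.5, 0.0, 1.5, 3.0]
eta = [-1 + 2 * x for x in xs]

# (1) Average of the observation-level (non-aggregated) predictions.
avg_of_preds = sum(inv_logit(e) for e in eta) / len(eta)

# (2) Single prediction at the average covariate value.
pred_at_avg = inv_logit(-1 + 2 * (sum(xs) / len(xs)))

# Because inv_logit is nonlinear, (1) and (2) generally differ.
```

The same logic is why per-group predictions in a mixed-effects model (each group getting its own intercept and slope) tell you more than one prediction at "average" values.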
For predictions/comparisons, I asked how comfortable people felt from "1" (I don't feel comfortable using this at all) to "5" (I could use this in my own research). Most answered between 2-4. Those with previous stats knowledge answered higher. predictions() were rated higher than comparisons().
25% of students took this course previously. Of that 25%, half thought that model interpretation was more straightforward with marginaleffects compared to summary tables. The other half said both were equally hard to understand. No one thought summary tables were easier, or that both were easy.
Some context: 1/3 of my students were learning about regression for the first time. Another third took an MA course with me where we learned to fit models and interpret summary tables, along with visualizing model predictions. The final third were PhD students with previous knowledge of regression.
The past 4 weeks I've taught a regression crash course to graduate students in linguistics. This is my fifth year teaching the course, and I decided to switch to teaching model interpretation using the marginaleffects package. Here are my thoughts along with info from an anonymous student survey 🧵