Katie Keith

@katakeith

NLP and computational social science (CSS) researcher. Assistant Professor in Computer Science at Williams College. AI2 and UMass Amherst alum. she/her. https://kakeith.github.io/

4,091 Followers · 290 Following · 35 Posts · Joined 22.12.2023

Latest posts by Katie Keith @katakeith

NLP+CSS Workshops https://www.pexels.com/photo/group-hand-fist-bump-1068523/

✨The NLP+CSS workshop is returning to ACL 2026!✨

And this year, we have a new shared task with prizes!

Website/CfP: sites.google.com/site/nlpandc...
Deadlines: March 5 (direct), March 24 (pre-reviewed ARR)

#NLProc #CompSocialSci #ComputationalSocialScience #ACL2026NLP
@aclmeeting.bsky.social

18.12.2025 12:38 👍 18 🔁 12 💬 0 📌 3
Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts | Political Analysis | Cambridge Core

Very excited that my paper with @katakeith.bsky.social is now out in @polanalysis.bsky.social. We investigate whether LLMs actually follow the instructions/definitions provided in codebooks, propose some diagnostics, and release a new evaluation dataset.
www.cambridge.org/core/journal...

19.09.2025 13:45 👍 30 🔁 14 💬 0 📌 2

Whoa...!! If social-science leaning at all maybe try other preprint servers? SocArXiv for example? We put one of our preprints there: osf.io/preprints/so...

27.08.2025 19:02 👍 3 🔁 0 💬 1 📌 0

Yes! I agree. It's so rare these days to see a keynote that is so thorough and full of new conceptualizations.

12.08.2025 02:12 👍 4 🔁 0 💬 0 📌 0

5300 attendees in person here at #acl2025 😮

30.07.2025 15:31 👍 4 🔁 0 💬 0 📌 0

The #ACL2025 #ACL2025NLP feed is up and running! It matches both hashtags and any posts from or mentions of @aclmeeting.bsky.social

Pin it to your home 📌 and enjoy!

bsky.app/profile/did:...

17.07.2025 11:15 👍 48 🔁 14 💬 2 📌 0

Topic @adeldaoud.bsky.social and I were discussing today at lunch at #ic2s2 and want to ask here:

What are the “known facts” in the social sciences? Which relationships between at least two social variables have been empirically found to have large effects and replicated by multiple groups?

24.07.2025 12:57 👍 5 🔁 2 💬 2 📌 0

Under review! Happy to share a draft if you email me. Thanks!

23.07.2025 19:14 👍 0 🔁 0 💬 0 📌 0

Thanks:)

23.07.2025 14:39 👍 0 🔁 0 💬 0 📌 0

Highlighting this thread. Based on what I'm seeing at #ic2s2 this week, this line of work is hot (if a bit crowded), but I predict it will only be more widely adopted by social scientists in the future.

23.07.2025 13:07 👍 11 🔁 3 💬 1 📌 0

Not as recent, but still LLM-based

"WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation." GPT-3 composes new examples with similar patterns to challenging examples.

aclanthology.org/2022.finding...

23.07.2025 13:05 👍 3 🔁 0 💬 0 📌 0

I thought this was a clever and useful paper from Xiong, ... Hovy, El-Assady, Ash "Co-DETECT: Collaborative Discovery of Edge Cases in Text Classification." Using LLMs to help humans refine their codebooks (before codebooks are fixed for the true annotation stage) arxiv.org/pdf/2507.05010

23.07.2025 13:00 👍 7 🔁 0 💬 0 📌 0

We used active learning to create a human-annotated dataset of 1050 instances from FOMC transcripts, labeled for FOMC members’ opinions and directional stance towards monetary policy. Preprint and dataset should be released publicly by the end of the summer but email me for an advance copy.

23.07.2025 12:52 👍 2 🔁 0 💬 0 📌 0

Congrats to Alisa Kanganis (Williams College ’25) for presenting her thesis work at #ic2s2 today!

23.07.2025 12:52 👍 11 🔁 0 💬 3 📌 0

Yay! I'm there as well. Let's sync up.

20.07.2025 11:31 👍 2 🔁 0 💬 0 📌 0
U.S. college is first to decline federal science grants because of new DEI language: Williams College says NSF and NIH requirement related to discrimination “undermines” academic freedom

This was a top-down decision and Williams faculty have yet to formally discuss it. Unclear whether it is resistance or capitulation.
www.science.org/content/arti...

06.06.2025 21:48 👍 2 🔁 0 💬 1 📌 0
A Co-op for Computing: Faculty are diving into the exciting, data-crunching, AI world of GPMoo.

Honored by the feature on my research, grant, and GPU cluster by the Williams magazine. today.williams.edu/magazine/a-c...

28.05.2025 01:41 👍 9 🔁 1 💬 0 📌 0

Personally, I find I have to burn a day answering all the questions (particularly for a dataset release). I think it should be condensed to the 5 most important ones.

20.05.2025 18:27 👍 2 🔁 0 💬 1 📌 0

A full room for @katakeith.bsky.social's talk on proximal causal inference with text data ✨✨✨

27.01.2025 23:19 👍 17 🔁 1 💬 1 📌 0

Mark your calendars for these upcoming events tied to SCI and its One-U Responsible AI Initiative! Visit rai.utah.edu/events for details.

@parasharmanish.bsky.social @katakeith.bsky.social @anamarasovic.bsky.social @freiling.bsky.social

24.01.2025 22:30 👍 7 🔁 4 💬 0 📌 0

Our semi-synthetic experiments use MIMIC-III clinical notes and two open-weight LLMs and show that our method produces estimates with low bias.

11.12.2024 01:10 👍 2 🔁 0 💬 0 📌 0

For settings with an unobserved (but known) confounding variable, we propose a new causal inference method that uses two instances of pre-treatment text data, infers two proxies using two zero-shot models on the separate instances, and applies these proxies in the proximal g-formula.

11.12.2024 01:10 👍 1 🔁 0 💬 1 📌 0

Check out our #NeurIPS2024 poster (presented by my collaborators Jacob Chen and Rohit Bhattacharya) about “Proximal Causal Inference With Text Data” at 5:30pm tomorrow (Weds)!

neurips.cc/virtual/2024...

11.12.2024 01:10 👍 12 🔁 4 💬 1 📌 0
Details - Assistant/Associate Professor - Natural Language Processing (NLP) | Human Resources | UMass Amherst

We're hiring new #nlp faculty this year!

Asst or Assoc Professors in NLP at UMass CICS --
careers.umass.edu/amherst/en-u...

19.11.2024 14:33 👍 66 🔁 34 💬 1 📌 0

I'm excited to share that we've released v1.0 of our podcast corpus, SPoRC, led by my PhD student Ben Litterer! This first dataset is a slice of time, comprising over one million episodes from May and June 2020, including transcripts, diarization, and extracted audio features.

15.11.2024 15:03 👍 51 🔁 15 💬 1 📌 4
All - Bluesky Directory: A curated collection of all things relating to the Blue Sky social media platform.

Starter packs are genius, but I was surprised there wasn't a list of them for people to find.

So I built it:
blueskydirectory.com/starter-pack...

The website monitors the packs being shared and adds the ones it finds to the database.

Missed your starter pack? Message me and I'll get it added.

11.11.2024 16:13 👍 6547 🔁 2964 💬 1111 📌 430

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

09.11.2024 09:13 👍 553 🔁 212 💬 67 📌 55

🫠🫶

06.11.2024 20:35 👍 1 🔁 0 💬 0 📌 0

In my NLP class (www.cs.williams.edu/~kkeith/teac...) next week, we're talking about eval.

I'd like to have a large section of the lecture focus on contamination. Crowd-sourcing--please send me your favorite contamination papers! Thanks! 🙏

06.11.2024 20:27 👍 16 🔁 3 💬 6 📌 0

go.bsky.app/PCckf3C

05.11.2024 21:39 👍 17 🔁 11 💬 1 📌 0