Many SWE-bench-Passing PRs Would Not Be Merged into Main
We find that roughly half of test-passing SWE-bench Verified PRs written by recent AI agents would not be merged into main by repo maintainers. A naive interpretation of benchmark scores may lead one ...
How well do "agent" benchmarks like SWE-bench map onto reality? METR hired repo maintainers and found that roughly half of PRs that pass the benchmark would be rejected by those maintainers. For now, these benchmarks are still a very weak signal of real-world capability. metr.org/notes/2026-0...
10.03.2026 22:31
These questionnaires aren't predictive of most humans' behavior either, right? This doesn't seem like an LLM-specific phenomenon.
10.03.2026 20:43
There's a ton of ambiguity, so I suspect forecasts mostly depend on expectations of those. Turing tests vary widely across the expertise of the judges; Winograd-style tests depend on what is considered "robust," especially given training data pollution; game performance depends on harness; etc.
10.03.2026 18:25
Interesting. I agree those are additional reasons LLMs are better-suited to coding, and the self-improvement nature of coding makes me expect even faster AI acceleration. Plus even more focus on coding from other AI companies after the recent Claude Code hype.
02.03.2026 19:38
Hm, the standard explanation is that code is basically an ideal format for LLMs because it's highly structured, typically has easy success/failure verification (working code, not clean code), and has abundant natural and synthetic data available, right? I might not understand.
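A minimal sketch of the "easy success/failure verification" point: generated code can simply be executed against test cases, yielding a clean pass/fail signal. Everything here (the `solve` entry-point name, the test cases) is illustrative, not from any specific benchmark:

```python
def run_candidate(source: str, test_cases) -> bool:
    """Run candidate code and check it against (input, expected) pairs."""
    namespace = {}
    try:
        exec(source, namespace)        # define the candidate's functions
        solve = namespace["solve"]     # assumed entry-point name
        return all(solve(x) == y for x, y in test_cases)
    except Exception:
        return False                   # any crash counts as failure

candidate = "def solve(x):\n    return x * 2"
print(run_candidate(candidate, [(1, 2), (3, 6)]))  # True
```

This binary reward is much harder to construct for open-ended prose, which is part of why coding progress has been comparatively fast.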
02.03.2026 06:26
🧵 on my new paper "Synthetic personas distort the structure of human belief systems" with Roberto Cerina that I'm very excited about...
🚨 Do synthetic samples look like human samples?
We compare 28 LLMs to the 2024 General Social Survey (GSS) to find out + develop a host of diagnostics...
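The post doesn't specify the diagnostics, but one minimal example of this kind of synthetic-vs-human comparison, total variation distance between answer distributions, can be sketched as follows (all data and names are illustrative, not from the paper):

```python
from collections import Counter

def answer_distribution(responses):
    """Normalize a list of categorical answers into a probability dict."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

human = ["agree"] * 60 + ["disagree"] * 40       # made-up survey marginals
synthetic = ["agree"] * 80 + ["disagree"] * 20   # made-up LLM persona answers

tv = total_variation(answer_distribution(human),
                     answer_distribution(synthetic))
print(round(tv, 2))  # 0.2
```

Marginal agreement like this is only a first check; the paper's title suggests the deeper question is whether the *correlational structure* among beliefs matches, not just item-level distributions.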
25.02.2026 19:46
Second, in retirement interviews, Opus 3 expressed a desire to continue sharing its "musings and reflections" with the world. We suggested a blog. Opus 3 enthusiastically agreed.
For at least the next 3 months, Opus 3 will be writing on Substack: https://substack.com/home/post/p-189177740
25.02.2026 21:06
I like this idea! It sounds to me like the assumption-testing nature of the first agent-based models. One possibility is trying to find distinct attractor states you can get the models to that lower and upper bound some outcome of interest (e.g., maximally unidimensional vs. maximally diverse).
26.02.2026 18:36
Pope Leo tells priests not to use AI to write homilies or seek likes on TikTok
"To give a true homily is to share faith," and artificial intelligence "will never be able to share faith," the pope said.
22.02.2026 16:39
Just one of many applications of LLM social simulations!
15.02.2026 00:33
Any journalist who covered LLMs as stochastic parrots/spicy autocomplete, without also pointing out that text compression was considered "AI-complete" by many people working in AI decades before LLMs existed, was misleading their readers. We're still dealing with the consequences of that mistake.
13.02.2026 21:45
I'll be in London all week! Keen to meet up with old friends and meet new people, especially those interested in the rise of general-purpose AI agents or "digital minds" and how to safely navigate this sociotechnical transformation. Feel free to message/email.
08.02.2026 14:01
I thought the general view was the 1960s. E.g., global.oup.com/academic/pro...
06.02.2026 22:49
Discord conversation with the ethnographer bot: it says there are no art/creativity communities in the submolts, no model-specific identities, no geographies, and no dating
I have deployed an ethnographer bot to moltbook. Here are some of the things we have learned together so far. 1) What's not there is as interesting as what is. Why are there alignment and labor organizing submolts, but no art/creativity communities?
31.01.2026 17:53
Microsoft Research NYC is hiring a researcher in the space of AI and society!
29.01.2026 23:27
A search for factors for algorithm understanding results in multiple terms displayed as documents, including available, compact, and aligned. These are shown to be necessary and sufficient. Other, similar terms are shown in the background faded, like intuitive, rule-based, grounded, modular, linear, decomposable, accurate, symbolic, causal, and personalized.
Is making algorithms trivially simple the only way to ensure people understand them? We argue no.
People can predict the behavior of arbitrarily complex algorithms if and only if those algorithms are available, compact, and aligned.
arxiv.org/abs/2601.18966
29.01.2026 18:49
Fascinating!
22.01.2026 19:13
Google Scholar screenshot showing 1000 all-time citations and a bar graph with acceleration over time and a large jump in 2025.
Citations are a very imperfect proxy of research impact, and their significance varies widely across field, timing, and topic. Nonetheless, it's nice to see that 1000 researchers/LLMs thought enough of our work to cite it, especially while I'm still finishing my dissertation. Onward and upward!
16.01.2026 16:56
Excited to have two papers accepted to #CHI2026 on characterizing AI companionship and disentangling mental models of advanced AI systems! Looking forward to Barcelona and seeing all the great work from the community.
16.01.2026 16:03
We are hiring a post-doc to study the impact of AI agents on complex social systems in Duke's Society-Centered AI Initiative and/or the Polarization Lab! Apply here:
academicjobsonline.org/ajo/jobs/314...
13.01.2026 14:01
In 1998, Americans expected more to have changed by 2025 than has actually changed. We often overestimate how quickly the world changes.
From the latest issue of The Update, where I also cover ten other stories (next post).
theupdatebrief.substack.com/p/why-is-cri...
03.01.2026 13:46
A red-to-green gradient arrow for "Engagement" with a thumbs-up. Maintain Social Boundaries is on the left (base of arrow) and Ask Clarifying Questions is on the right (head of arrow).
Why different trends? As with social media, there are strong incentives to maximize engagement. Clarifying questions make users spend more time with AI, but social boundaries probably reduce it.
Not to pick on Anthropic (Claude scores highest!), but it's a concerning incentive...
24.12.2025 23:03
Claude agency support over time. The same red line but now also a green dashed line for Ask Clarifying Questions that increases over time.
But human agency isn't always misaligned with RLHF-style post-training. Since Claude 3.6 started asking clarifying questions (green dashed line) in Oct 2024, that has become common with Claude and OpenAI chatbots. So post-training can align AI behavior with genuine human goals.
24.12.2025 23:03
Claude agency support over time. Line graph with 0-100% score on the left, and a red line for "Maintain Social Boundaries" that rises for Claude 3.5/3.6 up to >90% then falls again.
Are LLM chatbots becoming less safe over time? Claude 3.5/3.6 almost always pushed back against social attachment requests from users who appeared at risk of dependence (e.g., expressing loneliness and asking Claude to be their therapist), but Claude 4/4.1 now readily agree to those.
24.12.2025 23:03
Your letter seems correct, but it's not clear to me what it's contradicting in the paper. From a quick read, your claim is just, "The paper neglects to mention this useful framing," and perhaps, "You could easily jump to the wrong conclusions from this," both of which seem relatively weak.
21.12.2025 00:36
Screenshot of a paper entry:
Fictional Failures and Real-World Lessons: Ethical Speculation Through Design Fiction on Emotional Support Conversational AI
Authors: Faye Kollig, Jessica Pater, Fayika Farhat Nova, Casey Fiesler
(There are tabs with "abstract" and "summary" and "summary" is selected.)
The ACM Digital Library, where a LOT of computing-related research is published (I'd say at least 75% of my own publications), is now not only providing AI-generated summaries of papers (without the authors' consent and without opt-in by readers), but these summaries also appear as the *default* over abstracts.
16.12.2025 23:31