The blog actually discussed different types of reviews. If AI reviewing helps authors produce better science, I do not see why one needs to be so hostile toward AI. It actually helps authors slow down to produce better-quality articles.
It is open, and it can and will be improved! Feedback like this is highly appreciated.
The issue is here: github.com/ChicagoHAI/O...
Try it, read the blog, or contribute:
- openaireview.github.io
- openaireview.github.io/blog.html
- github.com/ChicagoHAI/O...
We also don't have good evaluations for AI-generated reviews yet. We're working on it and welcome collaborators. Feedback welcome, especially from conference organizers and journal editors who want to think seriously about the future of peer review.
There are two types of reviewing. Reviewing for quality (improving the work), which is what Refine and OpenAIReview do, is very different from gatekeeping (accept/reject), which is what Stanford Agentic Reviewer targets. We think automating gatekeeping requires much more care.
Our progressive approach finds issues at 87% of locations flagged by Refine, for the price of a coffee per paper. @joehsu.bsky.social added a Claude skill making it essentially free for Claude subscribers.
3/7 The only intervention that can stabilize the system is improving review precision, the ability to distinguish good papers from weak ones. AI production tools lower submission costs; only AI review tools can raise the signal. That asymmetry is why we built OpenAIReview.
2/7 The review death spiral: more submissions → overloaded reviewers → noisier reviews → more random acceptance → even more submissions. Bergstrom & Gross already warned about this. AI production tools make it worse by lowering submission costs and pushing the system toward collapse faster.
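The feedback loop above can be sketched as a toy simulation. The model and every parameter in it (reviewer capacity, the growth rate, how noise scales with overload) are illustrative assumptions I'm making up for the sketch, not estimates from any data:

```python
def simulate(rounds, precision, capacity=100, subs0=150, growth=0.3):
    """Toy death-spiral model: overload makes reviews noisier, and noisy
    (more random) acceptance invites yet more submissions. `precision` in
    [0, 1] is how well review separates good papers from weak ones."""
    subs = float(subs0)
    history = [subs]
    for _ in range(rounds):
        overload = max(0.0, subs / capacity - 1.0)  # load beyond reviewer capacity
        noise = overload * (1.0 - precision)        # high precision damps the noise
        subs *= 1.0 + growth * noise                # noisier review -> more submissions
        history.append(round(subs, 1))
    return history

# With perfect precision the overloaded system holds steady;
# with no precision, submissions compound every round.
stable = simulate(10, precision=1.0)
spiral = simulate(10, precision=0.0)
```

In this sketch, raising `precision` is the only knob that arrests the growth, which is the asymmetry the thread argues for.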
Peer review is facing a death spiral, and AI production tools are speeding it up. AI-assisted reviewing is necessary and should be open. We built OpenAIReview: open AI reviewing for everyone, for the cost of a coffee.
openaireview.github.io/blog.html 🧵
Local ballot measures are now on CivicChats! Local elections happen year-round, and 10+ states have measures coming up in the next few months. Check your ballot and think through what you'll be voting on: civicchats.org
We have been developing automatic evaluation based on checklists. We are also planning to run a study at the same time. Learn more at the end of this blog: cichicago.substack.com/p/civicchats...
Check out our effort to think about how AI can help with democratic processes!
Can anyone help review an ACL submission on parameter-efficient fine-tuning today?
Sorry the timeline is very tight.
📖 ≠ 🧪 The Story is Not the Science.
Code is submitted but rarely executed during peer review, an issue likely to worsen with research agents. 🧑‍🔬
We introduce an execution-grounded evaluation of narrative + execution. Verify the science, not just the story.
1/n
Mark Yatskar will be speaking this Friday!
You can tune in either on
Zoom: uchicago.zoom.us/j/9897879984...
YouTube: www.youtube.com/@AIScientifi...
Hannes Stark will be speaking this Friday on BoltzGen!
You can tune in either on
Zoom: uchicago.zoom.us/j/9897879984...
YouTube: www.youtube.com/@AIScientifi...
Happening in three hours!
Microsoft Research NYC is hiring a researcher in the space of AI and society!
@profbuehlermit.bsky.social from MIT will be speaking this Friday!
You can tune in either on
Zoom: uchicago.zoom.us/j/9897879984...
YouTube: www.youtube.com/@AIScientifi...
Happening in two hours!
Peter Clark from @ai2.bsky.social will be speaking on Friday!
You can tune in either on
Zoom: uchicago.zoom.us/j/9897879984...
YouTube: www.youtube.com/@AIScientifi...
We study how radiologists use AI to diagnose pulmonary embolism (PE), tracking over 100,000 scans interpreted by nearly 400 radiologists during the staggered rollout of an FDA-approved diagnostic platform. When AI flags PE, radiologists agree 84% of the time; when AI predicts no PE, they agree 97%. Disagreement evolves substantially: radiologists initially reject AI-positive PEs in 30% of cases, dropping to 12% by year two. Despite a 16% increase in scan volume, diagnostic speed remains stable while per-radiologist monthly volumes nearly double, with no change in patient mortality, suggesting AI improves workflow without compromising outcomes. We document significant heterogeneity in AI collaboration: some radiologists reject AI-flagged PEs half the time while others accept nearly always; female radiologists are 6 percentage points less likely to override AI than male radiologists. Moderate AI engagement is associated with the highest agreement, whereas both low and high engagement show more disagreement. Follow-up imaging reveals that when radiologists override AI to diagnose PE, 54% of subsequent scans show both agreeing on no PE within 30 days.
Posted a very early-stage draft with rock-star collaborators.
Key question: when we actually roll out AI tools, how do people use them? Do they just defer completely? Does it improve productivity and ability?
We look at this in the medical setting of pulmonary embolism:
paulgp.com/papers/Radio...
I've often joked that as faculty I program in a high-level language called "graduate student". Having tried out Claude Code this morning, I (i) feel extremely at home, (ii) am realizing that research-by-graduate-student is perhaps the original vibe-coding. 1/2
I've seen this message and similar echoes for other writing, and I want to strongly push back on this narrative. It's not that you shouldn't use ChatGPT, but that you shouldn't *use ChatGPT to write it for you*. ChatGPT, and AI in general, is not a monolith. How you use it matters.
Very much enjoyed this talk by @yisongyue.bsky.social ! The measurement challenge deserves a lot more attention from the AI community!
Happening in two hours!
Title + abstract of the preprint
Excited to share a new preprint with @nkgarg.bsky.social presenting usage statistics and observational findings from Paper Skygest in its first six months of deployment!
arxiv.org/abs/2601.04253
Emergent misalignment made it into @nature.com! The key insight is that models fine-tuned to write insecure code exhibit a wide range of misaligned behavior in other contexts.
I think it would be useful to attract researchers in industry to the platform as well.