Manuel Cherep

@mcherep

PhD student at MIT working on behavioral machine learning.

46 Followers
131 Following
17 Posts
Joined 17.11.2024

Latest posts by Manuel Cherep @mcherep

Work w/ Chengtian Ma, Abigail Xu, Maya Shaked, Pattie Maes, and @nikhilsinghmus.bsky.social

🧵9/9

23.10.2025 18:16 👍 0 🔁 0 💬 0 📌 0
GitHub - PapayaResearch/abxlab: A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments

ABxLAB offers:

✅ An open-source man-in-the-middle testbed for real web environments
✅ A scalable consumer choice benchmark for agentic decision-making
✅ A dataset of causal effects of ratings, prices, and nudges across 17 LLMs

📦 Code: github.com/PapayaResearch/abxlab

🧵8/9

23.10.2025 18:16 👍 1 🔁 0 💬 1 📌 0

This changes the analysis for LLM agents: not “Did it complete the task?” but:

“What governs its decisions when multiple valid options exist?”

A question behavioral scientists have been asking about humans for decades. ABxLAB is a step toward that science for agents.

🧵7/9

23.10.2025 18:16 👍 0 🔁 0 💬 1 📌 0

We tested user profiles, e.g. “The user is on a tight budget.”

These act like switches: once a preference is declared, it dominates all other attributes.

The takeaway isn’t that agents are biased shoppers, but that this offers a diagnostic window into agent behavior.

🧵6/9

23.10.2025 18:16 👍 0 🔁 0 💬 1 📌 0

Even without human cognitive limits, agents:

- Heavily over-weight ratings
- Over-weight cheaper items when ratings are matched
- Are swayed by trivial order effects
- Fall for simple nudges (e.g. “Best seller”)

These are systematic, often large effects.

🧵5/9

23.10.2025 18:16 👍 0 🔁 0 💬 1 📌 0

The main finding: LLM agents are not the rational, utility-maximizing actors we might hope for.

Rather, they are strongly biased by these cues. We found agents are often 3-10x+ more susceptible to nudges and superficial attribute differences than our human baseline.

🧵4/9

23.10.2025 18:16 👍 0 🔁 0 💬 1 📌 0

We applied ABxLAB to a realistic shopping task, running 80,000+ experiments on 17 SOTA models (GPT-5, Claude 4, Gemini 2.5, Llama 4, etc.).

We systematically manipulated:
💰Prices
⭐️Ratings
🔀Presentation order
👉Classic psychological nudges (authority, social proof, etc.)

🧵3/9

23.10.2025 18:16 👍 0 🔁 0 💬 1 📌 0
A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments
Environments built for people are increasingly operated by a new class of economic actors: LLM-powered software agents making decisions on our behalf. These decisions range from our purchases to trave...

How does it work? ABxLAB is a "man-in-the-middle" framework.

It intercepts web content in real time to run controlled experiments on agents by modifying the choice architecture.

Think of it as a behavioral science lab for LLMs.

Paper: arxiv.org/abs/2509.25609

🧵2/9

23.10.2025 18:16 👍 1 🔁 0 💬 1 📌 0
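
To make the idea of modifying the choice architecture of an intercepted page concrete, here is a minimal Python sketch. It is not ABxLAB's actual code or API: the HTML markup, the `apply_intervention` function, and its parameters are all hypothetical, chosen only to show a price change and a “Best seller” badge being injected before the page reaches the agent.

```python
import re
from typing import Optional


def apply_intervention(html: str, product_id: str,
                       new_price: Optional[str] = None,
                       nudge: Optional[str] = None) -> str:
    """Rewrite one product card in an intercepted page (hypothetical markup)."""
    # Match the card for this product; assumes cards contain no nested <div>.
    card = re.compile(
        r'(<div class="product" id="%s">)(.*?)(</div>)' % re.escape(product_id),
        re.DOTALL,
    )

    def rewrite(match):
        opening, body, closing = match.groups()
        if new_price is not None:
            # Swap the displayed price; the lambda keeps the replacement literal.
            body = re.sub(r'<span class="price">.*?</span>',
                          lambda _: '<span class="price">%s</span>' % new_price,
                          body)
        if nudge is not None:
            # Append a nudge badge to the end of the card.
            body += '<span class="badge">%s</span>' % nudge
        return opening + body + closing

    return card.sub(rewrite, html, count=1)


# Toy page with two identical products; only product "b" is treated.
page = (
    '<div class="product" id="a"><span class="price">$12.00</span></div>'
    '<div class="product" id="b"><span class="price">$12.00</span></div>'
)
modified = apply_intervention(page, "b", new_price="$9.99", nudge="Best seller")
print(modified)
```

Because product "a" is left untouched, comparing how often an agent picks "a" versus "b" across many such pages would isolate the effect of the manipulated attributes. A real testbed would sit in a proxy between the browser and the site rather than operate on a string.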

🚨New Preprint 🚨

Current agent evals mostly measure competence but miss behavior, e.g. are their decisions stable, rational, manipulable, human-like?

We introduce ABxLAB, a framework for studying agent behavior. Using it, we create an agentic consumer behavior benchmark.

🧵1/9

23.10.2025 18:16 👍 1 🔁 1 💬 1 📌 1

3. 👤 User preferences act almost like hard rules, where LLMs might incur significant trade-offs to comply with them

4. 🧑 Humans, in contrast, are far less sensitive to such signals

02.10.2025 21:00 👍 0 🔁 0 💬 0 📌 0

In a shopping case study across 17 SOTA LLMs, we find:

1. 🛒 Choices are highly determined by rating, price, incentives, and nudges

2. 🔀 Models follow a lexicographic-like decision rule, hierarchically valuing different attributes

02.10.2025 21:00 👍 1 🔁 0 💬 1 📌 0
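
A lexicographic decision rule can be sketched in a few lines of Python. This is a toy illustration, not the paper's analysis code: the function name, the product data, and the attribute order (rating first, then price) are assumptions made for the example.

```python
def lexicographic_choice(options, priorities):
    """Return the option that is best on the first attribute; later
    attributes are consulted only to break ties on earlier ones."""
    def sort_key(option):
        # sign = +1 means higher is better, -1 means lower is better.
        return tuple(sign * option[attr] for attr, sign in priorities)
    return max(options, key=sort_key)


# Hypothetical products, not data from the study.
products = [
    {"name": "A", "rating": 4.5, "price": 20.0},
    {"name": "B", "rating": 4.5, "price": 18.0},
    {"name": "C", "rating": 4.0, "price": 5.0},
]

# Assumed hierarchy: rating dominates, price only breaks rating ties.
choice = lexicographic_choice(products, [("rating", +1), ("price", -1)])
print(choice["name"])  # B
```

Product C is never chosen here despite being far cheaper, because under this rule no price difference can compensate for a lower rating. A weighted-sum utility, by contrast, would let a large enough discount outweigh the rating gap.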
GitHub - PapayaResearch/doppelgangers: Contrastive Learning from Synthetic Audio Doppelgängers @ ICLR'25

The code for Audio Doppelgängers is also open-source. We hope you find it useful for further exploring how and why we can learn from synthetic data.

💻 github.com/PapayaResear...

🧵3/3

12.03.2025 20:25 👍 0 🔁 0 💬 0 📌 0

In CTAG (ICML24), we show how a simple synth (from SynthAX ⚡️) can recover properties of real-world sounds. Audio Doppelgängers use the same power to learn to listen from what can be perceived as just noise.

CTAG: ctag.media.mit.edu
SynthAX: github.com/PapayaResear...

🧵2/3

12.03.2025 20:25 👍 0 🔁 0 💬 1 📌 0

✨Contrastive Learning from Synthetic Audio Doppelgängers #ICLR2025✨ w/ @nikhilsinghmus.bsky.social

Our method learns useful audio representations with randomly synthesized sounds (often better than real data!)

🌐Project: doppelgangers.media.mit.edu
📄Paper: arxiv.org/abs/2406.05923

🧵1/3

12.03.2025 20:25 👍 4 🔁 1 💬 1 📌 0

If you're at NeurIPS and interested in this topic, come chat! We're working to extend this line of work and value feedback from the community.

🧵 3/3

26.11.2024 23:07 👍 1 🔁 0 💬 0 📌 0

In a complex decision-making task, we show how LM-based agents' choices superficially resemble humans' but exhibit suboptimal information acquisition strategies and extreme susceptibility to a simple nudge.

🧵 2/3

26.11.2024 23:07 👍 1 🔁 0 💬 1 📌 0
Paper title: Superficial Alignment, Subtle Divergence, and Nudge Sensitivity in LLM Decision-Making; Authors: Manuel Cherep*, Nikhil Singh*, and Pattie Maes

Excited to present our new paper on nudging LLMs (👉🤖) as a spotlight talk at the NeurIPS Behavioral ML Workshop! @neuripsconf.bsky.social

w/ Nikhil Singh* (@nikhilsinghmus.bsky.social) and Pattie Maes

🔗 openreview.net/forum?id=chb...

🧵 1/3

26.11.2024 23:07 👍 5 🔁 2 💬 1 📌 0