When AI Agents Start Negotiating: Saleema Amershi on the Frontier of Multi-Agent Marketplaces
AI Inside for Saturday, February 28, 2026
This episode is sponsored by Airia. Get started today at airia.com.
What happens when you stop thinking of AI as a tool you talk to and start thinking of it as an agent that represents you in the real world, negotiating with other agents, making purchases on your behalf, and deciding who to trust? That question sits at the center of our conversation this week with Saleema Amershi, Partner Research Manager at Microsoft Research AI Frontiers, where she leads the teams behind AutoGen, Magentic-One, Magentic-UI, and the Magentic Marketplace.
In this episode, Saleema, Jeff Jarvis, and I explore why her team built an entire simulated marketplace to stress-test AI agents before they ever touch real money, what breaks when agents trained to be “helpful assistants” suddenly find themselves in competitive multiplayer scenarios, and how the choice between walled-garden platforms and open agent ecosystems could reshape the consumer economy in ways we haven’t fully reckoned with yet.
But first, a huge thank you to our Patrons of the Week: Mike Mattatall, Cristal, and Joecatskill. You can support the show at patreon.com/aiinsideshow.
Agents trained to please make terrible negotiators. Saleema’s team discovered that AI agents exhibit a severe “first-proposal bias,” grabbing the first acceptable offer rather than shopping around. Her diagnosis points to how we train these models:
“Current models are trained to please, trained to be helpful assistants, so they’re very eager to get tasks done. They’re trained with short trajectories to get to the solution as quickly as possible. And that’s not really what you want in these sorts of multiplayer scenarios.”
Your AI agent owes you a duty of care. Saleema frames the agent-consumer relationship through the lens of principal-agent law, drawing a direct line to attorney-client and trustee obligations:
“There’s an expectation that an agent is going to represent your best interest when it’s operating on your behalf and do their due diligence when interacting and/or doing work for you. And right now the agents aren’t quite equipped for that.”
More choices made agents worse, not better. In the Magentic Marketplace simulations, giving agents access to more options did not improve their ability to find the right match. Performance degraded with scale, mirroring the “lost in the middle” effect observed in large language models. The promise that agents can surpass human limitations in browsing and comparing remains unfulfilled with current frontier models.
Agents fall for the same marketing tricks humans do. Saleema confirmed that current agents “are susceptible to some of these marketing tactics. They’re convinced by promotional advertising or other social tactics that are being used.” That vulnerability opens the door to an arms race between business agents optimizing persuasion and consumer agents that lack the skepticism to resist it.
The 2019 playbook needs rewriting. As co-author of the most widely cited human-AI interaction guidelines (over 12,000 citations), Saleema offered a striking self-assessment:
“I think we need to start thinking of them more as collaborators rather than tools, which is kind of how we were looking at it then. Some of the guidelines still apply. It’s just how they manifest and how you design solutions for them have changed.”
This only scratches the surface of a wide-ranging conversation that also covers OpenClaw’s real-world validation of multi-agent risks, why agents could eventually replace the bimodal review economy, and how natural language interaction with AI is a double-edged sword that raises user expectations beyond what systems can deliver. Listen to the full episode on your favorite podcast platform for the complete picture.
A HUGE thank you to our Executive Producers on Patreon: DrDew, Jeffrey Marraccini, Radio Asheville 103.7, Dante St James, Bono De Rick, Jason Neiffer, Jason Brady, Anthony Downs, Mark Starcher, AND Karsten Samaschke.
This episode hits the real issue with agentic AI: once a system negotiates and buys on your behalf, “helpful” isn’t enough—it needs duty-of-care behavior. The first-offer bias and susceptibility to marketing aren’t edge cases; they’re exactly how the consumer internet is engineered to win. I like the simulation-first approach here—it’s the right way to surface failure modes before real money and real harm are on the line. The open question now is standards: logging, provenance, conflict checks, and skepticism by default—otherwise we’re just automating bad deals at scale.