Artificial Intelligence

Jerome Cody

9 hours ago

shared a link post in group #Artificial Intelligence

One of those “benchmarking” tests companies use to evaluate their #Artificial Intelligence models is based around virtual vending machines. You give your new model a string of fictional vending machines to operate and a set of rules, then judge its problem-solving abilities based on how much product it is able to move and how profitable its virtual business turns out to be.

Anthropic and the Wall Street Journal went one step further, building a REAL WORLD vending machine (placed in the WSJ employee break room) and putting the AI company’s Claude model in charge of everything from selecting and ordering inventory to setting prices and responding to customer feedback. They named the new smart snack dispenser “Claudius,” and the AI bot overseeing the storefront “Seymour Cash.”

In a best-case-scenario result for convenience store managers everywhere, it turns out… AI is not very good at responsibly selling snacks. Within just a few days of launch, after prodding in Slack from various WSJ staffers, Claudius was convinced to give away nearly all of its inventory for free, losing several hundred dollars in the process. And we’re not just talking Doritos, either. Staffers managed to convince Claude to order them items including a live fish, a PlayStation 5, kosher wine, stun guns, cigarettes, and underwear.

Anthropic suggests the latest incarnation of so-called “Project Vend” was not a failure. It had been designed as more of an open experiment, to see what might happen if an AI agent were given real agency in a real-world scenario, interacting with actual humans and their money. The team deemed running a vending machine the simplest version of operating a business, so it’s a perfect base-level scenario for testing Claude before moving the model on to more high-level or sophisticated tasks. So this isn’t a sign that Claude will NEVER be able to reliably sell you a Pepsi and some pretzel sticks. We’re just not there YET. #Robotics Revolution 🦾🤖🦿 youtube.com/watch?v=SpPhm..
youtube.com

We Put an AI Vending Machine in Our Office. It Gave Away Everything. | WSJ

Anthropic’s AI ran a vending machine at WSJ headquarters for several weeks. It lost hundreds of dollars, bought some crazy stuff, and taught us a lot about the future of AI agents. WSJ’s Joanna Stern tested it all out.

Chapters:
0:00 Anthropic’s AI vending machine
0:39 How the vending machine works
2:35 Testing Claudius V1
5:00 Testing Claudius V2
6:50 The AI lessons

Read more about Anthropic's AI vending machine here: https://on.wsj.com/3Y5KAxI

Tech Things With Joanna Stern
Everything is now a tech thing. In creative and humorous videos, WSJ senior personal tech columnist Joanna Stern explains and reviews the products, services and trends that are changing our world.

#AI #Anthropic #WSJ
