OpenAI may be close to releasing an AI tool that can take control of your computer and perform actions on your behalf.
Tibor Blaho, a software engineer with a reputation for accurately leaking future AI products, claims to have discovered evidence of OpenAI’s long-rumored Operator tool. Publications including Bloomberg have previously reported on Operator, which is said to be an “agent” system capable of autonomously handling tasks such as writing code and booking travel.
According to The Information, OpenAI is targeting January as the release month of Operator. The code revealed by Blaho this weekend adds credence to that report.
OpenAI’s ChatGPT client for macOS has gained hidden options to define keyboard shortcuts for “Toggle Operator” and “Force Quit Operator,” according to Blaho. And OpenAI has added references to Operator on its website, Blaho said, though those references aren’t yet publicly visible.
Confirmed – ChatGPT macOS desktop app has hidden options to define desktop launcher shortcuts to “Toggle Operator” and “Force Quit Operator” https://t.co/rSFobi4iPN pic.twitter.com/j19YSlexAS
— Tibor Blaho (@btibor91) January 19, 2025
According to Blaho, OpenAI’s site also contains not-yet-public charts comparing Operator’s performance to that of other computer-using AI systems. The charts could well be placeholders. But if the numbers are accurate, they suggest that Operator isn’t 100% reliable, depending on the task.
The OpenAI website already has references to Operator/OpenAI CUA (Computer Usage Agent) – “Operator System Card Table”, “Operator Request Evaluation Table” and “Operator Rejection Rate Table”
Including comparison with Claude 3.5 Sonnet computer use, Google Mariner, etc.
(table preview… pic.twitter.com/OOBgC3ddkU
— Tibor Blaho (@btibor91) January 20, 2025
On OSWorld, a benchmark that tries to mimic a real computing environment, OpenAI’s Computer Use Agent (CUA), perhaps the AI model that powers Operator, scores 38.1%, ahead of Anthropic’s computer-controlling model but well short of the 72.4% that humans score. OpenAI CUA surpasses human performance on WebVoyager, which evaluates an AI’s ability to navigate and interact with websites. But the model falls short of human-level results on another web-based benchmark, WebArena, according to the leaked charts.
Operator also struggles with tasks a human could easily perform, if the leak is to be believed. In a test that tasked Operator with signing up for a cloud provider and launching a virtual machine, Operator succeeded only 60% of the time. Tasked with creating a Bitcoin wallet, it succeeded just 10% of the time.
OpenAI’s entry into the AI agent space comes as rivals, including the aforementioned Anthropic, Google, and others, make plays for the nascent segment. AI agents may be risky and speculative, but tech giants are already touting them as the next big thing in AI. According to analytics firm Markets and Markets, the market for AI agents could be worth $47.1 billion by 2030.
Agents today are quite primitive. But some experts have raised concerns about their safety if the technology improves rapidly.
One of the leaked charts shows Operator performing well on selected safety evaluations, including tests that try to coax the system into performing “illegal activities” and surrendering “sensitive personal data.” Safety testing is said to be among the reasons for Operator’s long development cycle. In a recent X post, OpenAI co-founder Wojciech Zaremba criticized Anthropic for releasing an agent that he claims lacks safety mitigations.
“I can only imagine the backlash if OpenAI made a similar release,” Zaremba wrote.
It’s worth noting that OpenAI has been criticized by AI researchers, including former staff, for allegedly deprioritizing safety work in favor of quickly shipping its technology.