OpenAI released its first artificial intelligence (AI) agent, Operator, on Thursday. Available as a research preview, the general-purpose agent comes with a dedicated web browser and can autonomously perform online tasks based on prompts given by the user. The AI firm said the tool can be used to book tickets, reserve a table at a restaurant, or buy a product online. Operator is currently available only in the US to ChatGPT Pro subscribers, but the company plans to expand it to other subscription tiers in the future.
OpenAI Introduces Operator AI Agent
In a live stream, OpenAI CEO Sam Altman introduced the company’s first AI agent. Explaining what agents are, Altman said, “AI agents are AI systems that do work for you independently. You give them a task, and they go off and do it. We think it will be a big trend in AI.”
Operator is powered by the Computer-Using Agent (CUA), an AI model that combines vision capabilities from GPT-4o with advanced reasoning, an OpenAI blog post explained. The AI agent was post-trained using reinforcement learning. It can interact with graphical user interfaces (GUIs) including buttons, menus, and text fields on the screen. With its dedicated browser, the agent can perform tasks behind the scenes while freeing up the screen for the user.
The AI agent accepts both text and images as input. To complete tasks, the CUA processes raw pixel data of the screen and uses a virtual keyboard and mouse to execute actions. OpenAI claims it can navigate multi-step tasks, handle errors, and adapt to unexpected changes.
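The screen-in, action-out cycle described above can be pictured as a simple loop. The following Python sketch is illustrative only and is not OpenAI's implementation: `Action`, `run_agent`, and the three callbacks (`capture_screen`, `choose_action`, `execute`) are hypothetical names, and the model itself is stood in for by whatever function the caller passes as `choose_action`.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One virtual keyboard/mouse action (hypothetical structure)."""
    kind: str        # "click", "type", or "done"
    x: int = 0       # screen coordinates, used by "click"
    y: int = 0
    text: str = ""   # text to enter, used by "type"


def run_agent(
    task: str,
    capture_screen: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Repeat: look at the screen, pick an action, carry it out."""
    history: List[Action] = []
    for _ in range(max_steps):
        pixels = capture_screen()                      # raw pixels of the current screen
        action = choose_action(task, pixels, history)  # stand-in for the model's decision
        history.append(action)
        if action.kind == "done":                      # the policy signals the task is complete
            break
        execute(action)                                # virtual mouse or keyboard input
    return history
```

Because each step starts from a fresh capture of the screen, a policy in a loop like this can notice a failed action or an unexpected page and choose a different next step, which is one plausible way to read the error-handling claim. The `max_steps` cap is an assumption added here so the sketch always terminates.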
Use Cases of the Operator AI Agent
Rowan Cheung, founder of the AI newsletter The Rundown AI, had early access to Operator and highlighted some of its use cases in a series of posts on X (formerly known as Twitter). In one, the AI agent planned a weekend trip based on advice from Reddit, a specific budget, and the user's interests. Interestingly, when the agent was blocked from accessing Reddit, it completed the task by running a Bing search with Reddit as a keyword.
In the post, dated January 23, 2025, Cheung (@rowancheung) wrote: “Planning a weekend trip based on hidden gems off Reddit, my budget and interests. Notice how at 0:06, ChatGPT Operator was blocked from Reddit but then decided to just do a Bing search with ‘Reddit’ at the end. Very impressive decision-making.”
In another instance, Cheung asked Operator to find cryptocurrency tokens worth looking into. During its research, the agent got stuck on an “Are you human” CAPTCHA and immediately prompted the user to take control and confirm. Once Cheung confirmed, the AI agent resumed control and continued with the task.
The user can jump in and take control at any time to edit or change the task, and can hand control back to the agent once done. This ensures that the user has control over the AI agent at all times.
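The handoff behaviour described above resembles a small state machine in which control sits with either the agent or the user. The Python sketch below is a guess at the general shape, not a description of how Operator works internally: the `Session` class, its method names, and the simple text match used to spot a CAPTCHA are all assumptions made for illustration.

```python
from enum import Enum
from typing import List


class Controller(Enum):
    AGENT = "agent"
    USER = "user"


class Session:
    """Tracks who is currently driving the browser (hypothetical model)."""

    def __init__(self) -> None:
        self.controller = Controller.AGENT
        self.log: List[str] = []

    def agent_step(self, page_text: str) -> str:
        """Let the agent act, unless the user holds control or a CAPTCHA appears."""
        if self.controller is not Controller.AGENT:
            return "waiting"                    # the agent stays idle while the user drives
        if "are you human" in page_text.lower():
            self.controller = Controller.USER   # hand over for the human check
            self.log.append("handoff: CAPTCHA")
            return "handoff"
        self.log.append("agent acted")
        return "acted"

    def take_control(self) -> None:
        """The user may step in at any time."""
        self.controller = Controller.USER
        self.log.append("user took control")

    def return_control(self) -> None:
        """The user hands the task back to the agent."""
        self.controller = Controller.AGENT
        self.log.append("control returned to agent")
```

In this sketch, a page containing “Are you human” moves control to the user, further agent steps return "waiting", and the agent resumes only after `return_control()` is called.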
OpenAI also stated that it is collaborating with companies such as DoorDash, eBay, Instacart, and Uber to ensure that Operator respects the terms of service agreements of these businesses while accessing the platforms.
Operator’s Safety Risks and Mitigation
On safety, the AI firm claimed that it has run extensive testing and implemented mitigations against three classes of risk: misuse, model mistakes, and frontier risks.
To reduce the risk of misuse, OpenAI has trained the CUA model to refuse harmful tasks and illegal or regulated activities. The company has also blocked gambling and adult entertainment websites, as well as drug and gun retailers. In addition, it has implemented automated and human-based reviews of user interactions.
For model mistakes or hallucinations, the AI agent is trained to ask for user confirmation before finalising tasks with external side effects. The CUA also declines to help with tasks such as banking transactions, and it requires active user supervision while accessing sensitive websites.
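Taken together, the misuse and model-mistake mitigations amount to a series of checks applied before an action goes ahead. The Python sketch below shows one way such a gate could be ordered. It is a simplified illustration under stated assumptions: the category labels, the `Request` fields, and the `decide` function are hypothetical, and OpenAI describes these behaviours as trained into the model and backed by reviews rather than as a fixed rule table.

```python
from dataclasses import dataclass

# Hypothetical labels for the blocked site types and declined tasks named in the article.
BLOCKED_CATEGORIES = frozenset({"gambling", "adult", "drugs", "guns"})
DECLINED_TASKS = frozenset({"banking_transaction"})


@dataclass
class Request:
    site_category: str
    task_type: str
    has_side_effects: bool = False   # e.g. placing an order or sending a message
    sensitive_site: bool = False
    user_confirmed: bool = False
    user_supervising: bool = False


def decide(req: Request) -> str:
    """Return what the agent should do next for this request."""
    if req.site_category in BLOCKED_CATEGORIES:
        return "refuse"                   # blocked website category
    if req.task_type in DECLINED_TASKS:
        return "decline"                  # task type the agent will not help with
    if req.sensitive_site and not req.user_supervising:
        return "pause_for_supervision"    # user must actively watch
    if req.has_side_effects and not req.user_confirmed:
        return "ask_confirmation"         # confirm before finalising
    return "proceed"
```

The ordering here is a design assumption: outright refusals come first, so a confirmation from the user cannot unlock a blocked site or a declined task.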
Frontier risks are unexpected actions that a state-of-the-art AI model may take because such models are generally not tested exhaustively. OpenAI said the CUA model has been evaluated against its Preparedness Framework, and that the Operator System Card provides full details of the safety approach and ongoing improvements.
Currently, Operator is only available via the operator.chatgpt.com URL to ChatGPT Pro subscribers in the US. The company has stated that it plans to integrate the AI agent with all ChatGPT clients in the future. Notably, a ChatGPT Pro subscription is priced at $200 (roughly Rs. 17,200) a month.