Date: 2025-01-24

OpenAI has introduced a new feature called Operator, a digital assistant that can complete tasks on a computer in the same way humans do. Powered by a model named Computer-Using Agent (CUA), this AI is designed to use a mouse and keyboard, interact with screens, and handle tasks like browsing websites or filling out forms without needing any special programming or shortcuts.

CUA’s standout capability is its flexibility. Instead of relying on pre-built tools or coded instructions, it interacts directly with the buttons, menus, and forms on your screen, just like a human. This makes it much more adaptable, enabling it to tackle a wide variety of tasks across different applications and websites.

How it works

Operator combines vision and reasoning to get things done. First, it “sees” the screen by processing screenshots to understand what’s happening. Then, it plans the next steps, like deciding what to click or type. Finally, it takes action, completing tasks using a virtual mouse and keyboard.

For example, if you ask Operator to fill out an online form, it will look at the form, figure out the correct fields to fill, and type in the necessary information step by step. If something goes wrong—like a button not working—it can adapt and try a different approach.
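The see, plan, act cycle described above can be sketched as a simple loop. This is a toy illustration only, not OpenAI's implementation: `FakeForm`, `Action`, `plan_next_action` and `run_agent` are hypothetical names, and the "screenshot" here is a small dictionary rather than real pixels interpreted by a vision model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # hypothetical field or button name
    text: str = ""     # text to type, if any


@dataclass
class FakeForm:
    """Stand-in for a real screen: a form with two fields and a submit button."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns structured state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Simulates the virtual keyboard and mouse.
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def plan_next_action(observation: dict, data: dict) -> Action:
    """Decide the next step from what is currently on screen."""
    if observation["submitted"]:
        return Action("done")
    for name, value in observation["fields"].items():
        if value == "":
            return Action("type", target=name, text=data[name])
    return Action("click", target="submit")


def run_agent(screen: FakeForm, data: dict, max_steps: int = 10) -> list:
    """Loop: observe the screen, plan one action, act, repeat."""
    history = []
    for _ in range(max_steps):
        observation = screen.screenshot()
        action = plan_next_action(observation, data)
        if action.kind == "done":
            break
        screen.apply(action)
        history.append(action)
    return history
```

Calling `run_agent(FakeForm(), {"name": "Ada", "email": "ada@example.com"})` types into both fields and then clicks submit. Because the plan is recomputed from a fresh observation on every step, the loop reacts to whatever the screen actually shows instead of following a fixed script, which is the property that lets an agent of this kind try something else when a step fails.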

However, it’s cautious about sensitive tasks. For actions like entering passwords or confirming purchases, it pauses and asks for your approval before moving forward.
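A pause-and-ask step like this can be pictured as a gate in front of certain actions. The sketch below is an assumption about how such a gate might look, not a description of Operator's internals: `SENSITIVE_KINDS`, `execute_with_approval` and the action names are all hypothetical.

```python
from typing import Callable

# Hypothetical set of action kinds treated as sensitive in this sketch.
SENSITIVE_KINDS = {"enter_password", "confirm_purchase"}


def execute_with_approval(
    kind: str,
    perform: Callable[[], str],
    ask_user: Callable[[str], bool],
) -> str:
    """Run `perform`, but ask the user first when `kind` is sensitive."""
    if kind in SENSITIVE_KINDS:
        prompt = f"Allow the assistant to {kind.replace('_', ' ')}?"
        if not ask_user(prompt):
            return "skipped"  # the user declined, so nothing is performed
    return perform()
```

Ordinary actions run straight away, while a sensitive one runs only if `ask_user` returns `True`. In a real product the prompt would be shown in the interface; here it is just a callback so the behaviour is easy to test.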

Why is it a big deal?

CUA represents a big step forward for AI. Most AI tools today rely on specialised code to interact with apps and websites, which limits what they can do. CUA skips this step entirely by working directly with the same interfaces humans use. This approach allows it to adapt to nearly any situation, even in environments where AI traditionally struggles.

According to OpenAI, CUA performed well in tests of online tasks such as navigating e-commerce websites or managing accounts. On WebVoyager, a benchmark for browsing live websites like Amazon and GitHub, it succeeded 87% of the time.

That said, it’s still early days. On OSWorld, a benchmark of full computer-use tasks, its success rate is only 38.1%, well behind the 72.4% that humans achieve. OpenAI acknowledges these limitations and plans to improve the system over time.

Because Operator can take real actions online, OpenAI has made safety a top priority. The system has been trained to avoid harmful or illegal tasks and blocks access to risky websites, like gambling platforms or adult content. It also double-checks actions that could have serious consequences, such as sending an email or making a purchase.

There are also safeguards in place to prevent mistakes. For instance, Operator asks for user confirmation before finalising critical tasks and avoids handling higher-risk activities like banking transactions for now. OpenAI has also implemented systems to monitor and stop suspicious behaviour, such as phishing attempts or unsafe commands.

What next?

Operator is currently available as a research preview for ChatGPT Pro users in the United States. OpenAI says its goal is to gather feedback from real users, refine the technology, and expand its capabilities. The company also plans to make the underlying model, CUA, available to developers in the future, opening the door to custom AI-powered assistants.