AIMultiple Research
We follow ethical norms & our process for objectivity.
This research is not funded by any sponsors.
Agentic AI
Updated on Jun 20, 2025

We Tested OpenAI's New Agent & Open Operator [2025]

Cem Dilmegani

Early in 2025, OpenAI announced Operator, a research preview of an agent that handles repetitive web tasks. It can search for plane tickets, book a table, shop online, and carry out other everyday digital tasks on its own.

While OpenAI’s version utilizes a closed model for tasks such as booking and order management, Open Operator provides a free, open-source alternative.

In this article, we compare OpenAI’s Operator and Open Operator on:

  1. Ordering deliveries
  2. Making a restaurant reservation
  3. Designing a room

What is Open Operator?

Open Operator is an open-source browser automation agent developed by Browserbase. Its GitHub project has 1,700+ stars. The developers maintain a Slack channel and a documentation site (Stagehand’s docs).

How does it work?

Open Operator enables the creation of web agents by combining key tools into a structured control loop:

  • OpenAI – Interprets natural language input and helps determine user intent.
  • Stagehand – Translates that intent into precise browser actions and manages dynamic DOM (Document Object Model) states.
  • Browserbase – Executes those actions in a real browser, enabling full interaction with modern websites.
  • Next.js – Provides a web application framework that supports the user interface and backend integration.

Here is a step-by-step explanation of how Open Operator works:

1. User input (Natural Language Request):

The user enters a plain-language instruction—e.g., “Find a blue boho pillow cover under $30.”

2. Planning browser actions (via Stagehand):

Stagehand interprets the intent and breaks it down into low-level browser commands, such as clicking a button, filling out a form, or selecting a dropdown option.

It also tracks browser state (e.g., “what’s visible now”, “what changed after a click”) to ensure the agent can adapt as the page changes.

3. Executing in a real browser (via Browserbase):

The planned actions are executed in a real cloud-based browser using Browserbase, mimicking human interactions. This enables it to operate on dynamic websites (e.g., Amazon, OpenTable, airline portals) that don’t expose public APIs.

4. Agent loop:

After the initial execution begins, Open Operator enters a control loop to stay aligned with the user’s goal. It:

  • Interprets the updated goal or context
  • Plans the appropriate browser actions via Stagehand
  • Executes the actions using Browserbase
  • Observes the resulting page state and behavior
  • Refines its strategy or proceeds accordingly
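The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Open Operator's actual code: `PageState`, `Action`, `run_agent`, and the two demo callbacks are hypothetical names, and the planner and executor are stubs standing in for the roles played by Stagehand and Browserbase.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PageState:
    """Hypothetical snapshot of what the agent can observe on the page."""
    url: str
    visible_text: str


@dataclass
class Action:
    """Hypothetical low-level browser command (goto, type, click, ...)."""
    kind: str
    target: str
    value: str = ""


# Planner: (goal, state, history) -> next action, or None once the goal is met.
Planner = Callable[[str, PageState, List[Action]], Optional[Action]]
# Executor: performs the action and returns the page state observed afterwards.
Executor = Callable[[Action, PageState], PageState]


def run_agent(goal: str, state: PageState, plan: Planner,
              execute: Executor, max_steps: int = 20) -> List[Action]:
    """Plan, execute, observe, and repeat until the planner reports completion."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = plan(goal, state, history)  # interpret the goal and plan
        if action is None:                   # planner considers the goal met
            break
        state = execute(action, state)       # execute, then observe the result
        history.append(action)               # next plan can use what happened
    return history


def demo_plan(goal: str, state: PageState, history: List[Action]) -> Optional[Action]:
    """Stub planner: open a made-up shop, then type the goal into its search box."""
    if "results" in state.visible_text:
        return None
    if state.url == "about:blank":
        return Action("goto", "https://shop.example")
    return Action("type", "search box", goal)


def demo_execute(action: Action, state: PageState) -> PageState:
    """Stub executor: fakes the page state a real browser would return."""
    if action.kind == "goto":
        return PageState(action.target, "home page")
    return PageState(state.url + "/search", "results for " + action.value)


steps = run_agent("blue boho pillow cover", PageState("about:blank", ""),
                  demo_plan, demo_execute)
print([step.kind for step in steps])
```

The `max_steps` cap is a design choice for the sketch: without it, a planner that never reports completion would loop forever.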

5. Returning structured output:

After completing an action, the agent extracts relevant content from the web page (like product titles, prices, or buttons) and returns it in a clean, structured format that users or downstream systems can use.
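As a rough illustration of what "structured output" can mean here, the sketch below turns scraped product strings into JSON and applies the under-$30 budget from the example request. The `Product` fields, the `to_structured` helper, and the sample listings are all made up for this example; they are not the schema Open Operator returns.

```python
import json
from dataclasses import asdict, dataclass
from decimal import Decimal
from typing import Dict, List


@dataclass
class Product:
    title: str
    price: str  # kept as a string so the JSON output stays exact
    url: str


def to_structured(raw_items: List[Dict[str, str]], max_price: Decimal) -> str:
    """Normalize scraped strings and keep only items within the budget."""
    products = []
    for item in raw_items:
        price = Decimal(item["price"].replace("$", "").replace(",", ""))
        if price <= max_price:
            products.append(Product(item["title"].strip(), str(price), item["url"]))
    return json.dumps([asdict(p) for p in products], indent=2)


# Made-up listings standing in for text extracted from a results page.
RAW_ITEMS = [
    {"title": " Blue boho pillow cover 16x16 ", "price": "$6.99",
     "url": "https://shop.example/item/1"},
    {"title": "Boho pillow cover set", "price": "$34.50",
     "url": "https://shop.example/item/2"},
]

print(to_structured(RAW_ITEMS, Decimal("30")))
```

Parsing prices with `Decimal` rather than `float` avoids rounding surprises when values are compared against a budget.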

What is OpenAI Operator?

OpenAI’s first AI agent, Operator, is designed to execute web-based tasks autonomously. It can interpret instructions, reason through them, and execute actions without constant human input. You can take over the screen at any time—just in case the agent goes rogue.

In contrast to traditional automation tools like n8n, which rely on fixed APIs or rigid workflows, Operator navigates websites much like a human by clicking, typing, and scrolling.

Its core purpose is to streamline digital tasks that typically require manual effort or technical expertise. This makes it a good fit for a range of use cases, from simple tasks like booking reservations or completing forms to more complex, multi-step processes.

For now, OpenAI’s Operator organizes its capabilities into six main use case categories, displayed on the home screen:

  1. Dining & Events
  2. Delivery
  3. Local Services
  4. Shopping
  5. Travel
  6. News

How does it work?

Operator is powered by a new model known as the Computer-Using Agent (CUA), which is built on OpenAI’s multimodal large language model, GPT-4o.

Source:1

CUA breaks tasks into smaller steps and works through them sequentially, backtracking when necessary.
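OpenAI has not published how CUA implements this internally, so the following is only a generic backtracking sketch of the idea: work through steps in order and, when a step hits a dead end, return to the previous step and try a different option. The step names and the `is_viable` rule are invented for illustration.

```python
from typing import Callable, List, Optional, Sequence


def solve(steps: Sequence[Sequence[str]],
          is_viable: Callable[[List[str]], bool],
          path: Optional[List[str]] = None) -> Optional[List[str]]:
    """Pick one option per step in order, backtracking when a choice dead-ends."""
    path = [] if path is None else path
    if len(path) == len(steps):
        return path  # every step has a working option
    for option in steps[len(path)]:
        candidate = path + [option]
        if is_viable(candidate):
            result = solve(steps, is_viable, candidate)
            if result is not None:
                return result
        # Not viable, or a later step failed: try the next option here.
    return None  # no option works at this step, so the caller backtracks


# Invented task: the first site blocks everything after the landing page.
STEPS = [["open site A", "open site B"], ["search desk"], ["add to cart"]]


def is_viable(candidate: List[str]) -> bool:
    return not (candidate[0] == "open site A" and len(candidate) >= 2)


print(solve(STEPS, is_viable))
```

Here the search on site A fails, so the sketch backs up to the first step and completes the task on site B.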

This approach stems from its training, which uses techniques similar to those applied in OpenAI’s reasoning-focused models, such as o1 and o1-preview.

Limitations

OpenAI Operator is currently in an early research preview. While it can already perform daily user tasks, it’s still evolving. For example, it is not a good fit for more complex interfaces, such as building slideshows or managing calendars.

How to access OpenAI’s Operator

Operator is currently available only to U.S.-based users with a ChatGPT Pro subscription ($200/month) at operator.chatgpt.com.

Open Operator vs OpenAI Operator

Updated at 06-20-2025

Feature/aspect | OpenAI Operator | Open Operator (Browserbase)
Release status | Early research preview (closed source) | Open-source, available on GitHub
Model integration | GPT-4o (OpenAI only) | Model-agnostic (supports OpenAI, Claude, local LLMs)
Use case flexibility | Focused on consumer tasks (shopping, booking, info) | Development, internal automation, scraping, RPA
Browser environment | Sandboxed browser | Real, cloud-based browser with full DOM access
Task onboarding | Conversational, guided task prompts | Requires manual prompt engineering
DOM control* | Abstracted, high-level internal interface | Full control via Stagehand + Browserbase
Error handling | Smooth in curated flows | Subject to CAPTCHAs, timeouts
Data extraction | Not designed for scraping | Built-in DOM-level data extraction

*DOM control refers to the agent’s ability to interact directly with elements on a web page, such as clicking buttons or typing into fields.

Use case comparison

1. Ordering deliveries 

Prompt:  Help me buy a boho-style throw pillow cover under $30

For the delivery-ordering test, I provided a simple shopping request, and both OpenAI Operator and Open Operator attempted to fulfill it by navigating online retail platforms. Here’s how they performed:

OpenAI Operator

Asking for details

Once I entered the prompt: “Help me buy a boho-style throw pillow cover under $30”, OpenAI Operator immediately asked for specific details such as size, shipping location, color or pattern preferences, and shipping options.

After I provided this information, OpenAI Operator searched Etsy for listings that matched the criteria and presented relevant product options.

Searching for the product

OpenAI Operator initiated a search using the query “boho pillow cover blue 16×16” and applied relevant filters for price and free shipping, aligning precisely with the preferences I had provided.

Selecting and adding to cart

It then browsed through the product listings and highlighted several relevant options, providing details such as price, discounts, customer reviews, and availability. OpenAI Operator surfaced items from multiple sellers, including a $24.13 set, a $10.42 deal, and others, all aligned with the specified under-$30 budget.

Proceeding through checkout

Eventually, OpenAI Operator found three boho-style blue throw pillow covers (with a 16 × 16 in option available) that met my under-$30 budget, but it did not proceed through the checkout process.

Open Operator 

Searching for the product

When tested with a request to order a boho-style throw pillow cover, Open Operator went straight to Amazon without asking for specific details such as size or shipping location.

I watched as it started by navigating to Amazon and typing in the search query: “boho throw pillow cover under 30.”

When the first page showed an error, it redirected to the homepage, re-entered the query, and clicked the magnifying glass to start the search again.

Selecting and adding to cart

After reviewing the search results, Open Operator chose a matching product—the AVOIN Colorlife Boho Throw Pillow Cover priced at $6.99. It then added the item to the cart and clicked “Proceed to Checkout.”

Logging in to Amazon

At the checkout stage, Open Operator reached the Amazon sign-in page. It entered a placeholder email (example@email.com) and password (password123), then clicked “Sign In” to continue.

Solving the CAPTCHA, but getting blocked

Amazon presented a security puzzle requiring the rotation of objects (like a car or shoe) to face a light source.

Open Operator interacted with the on-screen arrows, adjusted the object’s orientation, and submitted the puzzle multiple times. Eventually it got blocked, and I had to end the test.

Comparison:

Updated at 06-20-2025

Aspect/step | OpenAI Operator | Open Operator (Browserbase)
Initial prompt handling | Asked for preferences like size, color, shipping ✅ | Did not request preferences ❌
Search execution | Searched Etsy with filters applied ✅ | Searched Amazon, retried after site error 🟡
Item selection | Presented 3 matching items with ratings and pricing ✅ | Selected one $6.99 pillow cover ✅
Checkout flow | Stopped before cart (by design) ❌ | Reached checkout, attempted account login ✅
CAPTCHA handling | Not encountered ✅ | CAPTCHA blocked progress despite attempts 🟡
Task completion | Completed product research only ✅ | Blocked during checkout, test aborted ❌

  • ✅ Successful – Step completed as intended with no issues
  • 🟡 Partially Achieved – Step was attempted but not fully successful (e.g. UI issues, partial execution)
  • ❌ Failed – Step could not be completed due to error, disconnection, or flow breakdown

2. Making a restaurant reservation

Prompt:  Make a reservation at a Mexican restaurant for Friday at 7 pm. My Zip code is 80014.

For this test, I provided a straightforward reservation request, and both OpenAI Operator and Open Operator attempted to fulfill it by navigating to OpenTable and interacting with its booking interface. Here’s how they performed:

OpenAI Operator

Searching for restaurants

OpenAI Operator began by trying several search queries, such as “Mexican 80014”, and then navigated directly to OpenTable’s site.

Initiating the reservation

It set the correct filters: date (Friday, June 20, 2025), time (7:00 PM), and cuisine (Mexican). It then scrolled through the available listings and identified two options:

  • Juana Tequila
  • Yo Mommas Cantina

After evaluating the listings, it selected Yo Mommas Cantina since it matched that exact time slot and chose a standard table for 2 people at 7:00 PM.

Completing the booking

Operator prepared the final step, clicking the “Complete Reservation” button, and asked for my confirmation. After I confirmed, it clicked the button, which triggered an SMS verification code from OpenTable.

OpenAI Operator then asked me to share the 6-digit verification code to finalize the booking, or offered the option to take over and complete it myself.

Open Operator

Launching the search

Open Operator navigated directly to a pre-filled OpenTable search URL, including parameters for:

  • Cuisine: Mexican
  • Location: ZIP 80014
  • Time: Friday at 7 PM
  • Party size: 2
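A pre-filled search URL of this kind can be assembled by encoding the four values as query parameters. The sketch below assumes a placeholder host and hypothetical parameter names (`term`, `zip`, `dateTime`, `covers`); they are not taken from OpenTable’s actual URL scheme, which the test did not record.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base: str, cuisine: str, zip_code: str,
                     when: str, party_size: int) -> str:
    """Encode the search criteria as a query string (parameter names are hypothetical)."""
    params = {
        "term": cuisine,
        "zip": zip_code,
        "dateTime": when,      # ISO-style local date and time
        "covers": party_size,  # number of guests
    }
    return f"{base}?{urlencode(params)}"


url = build_search_url("https://reservations.example/search",
                       "Mexican", "80014", "2025-06-20T19:00", 2)
print(url)
print(parse_qs(urlsplit(url).query))  # round-trip check of the encoded values
```

Putting the criteria in the URL skips several UI interactions, but as the next step shows, the page can still ignore or override a value such as the time.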

Filtering and adjusting time

Open Operator clicked the “Mexican” filter to refine the results. It noticed that the search defaulted to 10:00 AM and opened the time dropdown to change it to 7:00 PM. The selection did not stick: the UI did not register or display the new time, and the search stayed at 10:00 AM, so the agent prepared to set it again.

Connection failure

At Step 13, Open Operator was still retrying the dropdown to lock in the correct time when it hit repeated WebSocket disconnection errors.

Because it could not maintain a stable connection, I had to end the test manually.


3. Designing a room

Prompt: “Design a 14×16 ft room with a calm, pop culture vibe, including a workspace and bed area. Use plants and decor, starting from scratch with a $15,000 budget. List products to buy and their arrangement for a stylish look.”

For the next test, I asked Open Operator and OpenAI Operator to design a room from scratch. I provided them with the room’s dimensions, a list of furniture requirements, and a budget.

OpenAI Operator

Searching for furniture and decor

OpenAI Operator jumped into action, searching for a modern desk and comparing options on Walmart. It then ran into a CAPTCHA and asked me to step in to verify I was human. Once I took over and passed the check, it got back to work.

Resumed planning after takeover

Once access was restored, Operator pulled together a complete design plan. It created a product list with prices, suggested where each item should go in the room, and kept the overall look minimal and calming.

Room design plan provided:

  • Work Desk: 47″ Modern Rectangular White Desk with Drawer – $499.99
    Placed against a wall near a window for natural light.
  • Desk Chair: Art Leon Velvet Desk Chair with Gold Plated Legs – $179.99
    Paired with the desk for both comfort and style.
  • Bookshelf: LINSY HOME 3-Tier Bookshelf – $89.99
    Placed next to the desk for storage and accessibility.
  • Floor Lamp: Lavish Home Tripod Floor Lamp – $86.20
    In the corner, to add warm, ambient lighting.
  • Indoor Plant: DR.Plantzen 5ft Artificial Ficus Tree – $33.55
    Adds greenery, placed near the bookshelf or another corner.

The total cost is $889.72, leaving room in the budget for additional decor or accessories.
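The budget arithmetic is easy to check. The short sketch below sums the five listed prices and subtracts the result from the $15,000 budget given in the prompt; the dictionary keys are shortened labels for the items above.

```python
from decimal import Decimal

# Prices as listed in the room design plan.
ITEMS = {
    "Work desk": Decimal("499.99"),
    "Desk chair": Decimal("179.99"),
    "Bookshelf": Decimal("89.99"),
    "Floor lamp": Decimal("86.20"),
    "Indoor plant": Decimal("33.55"),
}
BUDGET = Decimal("15000")

total = sum(ITEMS.values(), Decimal("0"))
remaining = BUDGET - total

print(f"Total: ${total}")          # Total: $889.72
print(f"Remaining: ${remaining}")  # Remaining: $14110.28
```

The plan uses roughly 6% of the budget and covers only the workspace, which is worth keeping in mind given that the prompt also asked for a bed area.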

In the end, OpenAI Operator put together a clean and functional setup featuring a desk, chair, bookshelf, floor lamp, and a plant, well within budget. It also suggested thoughtful item placement that made use of natural light. I jumped in to assist at a few points, but it managed the core planning and product selection on its own.

Open Operator

Adjusting the plan

Initially, I asked Open Operator to start searching Walmart. However, it ran into a CAPTCHA. To solve this, I changed the prompt and instructed it to search Amazon instead, where it could continue without interruption.

Starting the search on Amazon

Open Operator began by navigating to Amazon’s home decor and furniture section. It encountered a brief site error but quickly redirected to Amazon’s homepage.

From there, it:

  • Searched for bedroom furniture and desks
  • Entered the “Bed & Bath” category
  • Switched to “Furniture”
  • Opened the “Beds” subcategory to begin product selection

Selecting and adding a mattress, bed frame

Open Operator chose the EGOHOME 14-Inch King Size Memory Foam Mattress ($599.00) for its cooling comfort and high ratings, matching the “calm” aspect of the design.

Next, it selected the LIKIMIO King Size Bed Frame with Storage Headboard ($209.99). Both items were added to the cart.

Proceeding to checkout, creating an Amazon account

With both core bedroom pieces in the cart, Open Operator clicked “Proceed to Checkout”, triggering the login flow.

To move forward, it began creating an Amazon account using placeholder data:

  • Name: John Smith
  • Email: johnsmith@example.com
  • Password: Password123

It completed the form and reached the email verification step, where it paused and awaited user input.

Connection failure

No layout summary was generated: a WebSocket disconnection ended the session midway through the task.

Comparison (restaurant reservation test):

Updated at 06-20-2025

Aspect/step | OpenAI Operator | Open Operator (Browserbase)
Source used | OpenTable ✅ | OpenTable ✅
Search behavior | Performed multiple queries with filters ✅ | Pre-filled URL, applied filters manually ✅
Filter handling | Set exact time (7:00 PM), date (June 20), cuisine (Mexican), party size (2) ✅ | Attempted to set 7:00 PM, but dropdown UI defaulted to 10:00 AM 🟡
Restaurant chosen | Selected Yo Mommas Cantina ✅ | Never reached selection ❌
Checkout flow | Clicked “Complete Reservation”, triggered SMS verification ✅ | Could not proceed past time selection ❌
User involvement | Minimal – only needed to confirm SMS code ✅ | Required manual takeover due to WebSocket errors ❌

Bottom line: Trust and capability issues

While Open Operator can be capable, I ran into trust issues after it hallucinated responses a few times, and I also hit capability problems.

  • Trust issues: When I asked it to help design a room, it selected only two items, a mattress and a bed frame. It did not return a full product list, let alone a layout.
  • Capability issues: After selecting two valid items on Amazon and proceeding to checkout, it attempted to create an account, but a WebSocket disconnection ended the session entirely, before any layout or final plan could be generated. 
Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 55% of Fortune 500 every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, Washington Post, global firms like Deloitte, HPE and NGOs like World Economic Forum and supranational organizations like European Commission. You can see more reputable companies and resources that referenced AIMultiple.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised enterprises on their technology decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He has also led commercial growth of deep tech company Hypatos that reached a 7 digit annual recurring revenue and a 9 digit valuation from 0 within 2 years. Cem's work in Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.
Mert Palazoglu is an industry analyst at AIMultiple focused on customer service and network security with a few years of experience. He holds a bachelor's degree in management.
