MCP (Model Context Protocol) establishes a standardized communication bridge between AI agents and applications, allowing LLMs to interact with external tools and services.
When configured in an MCP client such as Claude Desktop, LLMs can leverage specialized MCP servers. Browsing MCPs are particularly valuable: they enable web data extraction, render JavaScript-heavy pages, bypass common access restrictions, take actions such as filling forms, and access geo-restricted content from various locations around the world. The top MCP servers are compared below:
MCP servers with web browsing capabilities
Product | Success rate for search & extract | Success rate for browser automation | Speed |
---|---|---|---|
Bright Data | 92% | 100% | 9% |
Apify | 52% | 0% | 21% |
Oxylabs | 50% | N/A | 47% |
Firecrawl | 45% | N/A | 100% |
Browserbase | 35% | 0% | 11% |
Tavily | 26% | N/A | 48% |
Exa | 15% | N/A | 45% |
*N/A indicates that the MCP server does not have this capability.
Each of the dimensions above and their measurement methods are outlined below:
Success rate
Our benchmark results reveal that Bright Data has the highest success rates.
Across all the tools we benchmarked, Apify, Bright Data and Browserbase are the only ones with both of the capabilities required for agents working on the web:
- Navigation includes searching the web and following links to move between pages in order to collect and process data.
- Browser automation includes interacting with JavaScript elements, e.g., filling out and submitting forms.
To see the tasks used in the benchmark in detail, see our methodology.
Speed
If we consider two speed metrics:
- Task completion time: Oxylabs is the fastest MCP, with an average task completion time of 29 seconds; however, its success rate was 50%.
- Browsing time: Firecrawl is the fastest, with an average browsing time of 7 seconds; however, its success rate was 45%.
All speed metrics are for correctly completed tasks; MCP servers sometimes return quick responses indicating failure, and these are not comparable to the time needed to complete a task.
Our navigation dataset included all brands and yielded 105 data points (7 brands × 3 tasks × 5 repetitions per task). Based on these data points, there seems to be a negative correlation between success rates and speed; a sketch of how such a correlation can be computed from per-run records follows the list below.
This correlation is intuitive:
- Sometimes websites identify bots as suspicious traffic and trigger anti-scraping features.
- This leads some MCP servers to fail.
- Those that don’t fail need to use unblocking technology, which can be slower (e.g., the 95% confidence interval includes 4 seconds for one of the providers in our web unblocker benchmark).
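As an illustration only (not our exact analysis code), the sketch below shows how such a correlation could be computed from per-run records; the runs.csv file and its column names are assumptions made for this example.

```python
# Illustrative sketch: estimate the success-rate vs. speed relationship from per-run records.
# Assumed input: runs.csv with columns brand, task, success (0/1) and seconds,
# i.e. 7 brands x 3 tasks x 5 repetitions = 105 rows.
import pandas as pd

runs = pd.read_csv("runs.csv")

# Per-brand success rate over all runs, and average time over correct runs only.
success_rate = runs.groupby("brand")["success"].mean()
avg_seconds = runs[runs["success"] == 1].groupby("brand")["seconds"].mean()

per_brand = pd.DataFrame({"success_rate": success_rate, "avg_seconds": avg_seconds})

# A positive coefficient here (higher success rate goes with longer times) corresponds
# to the negative correlation between success rate and speed described above.
print(per_brand["success_rate"].corr(per_brand["avg_seconds"]))
```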
Features
We have also measured some important features of these MCP servers. For an explanation of the features, please see the methodology section in our agent browser benchmark.
Search engine support
Product | Bing | DuckDuckGo | Baidu | |
---|---|---|---|---|
Bright Data | ✅ | ✅ | ✅ | ✅ |
Oxylabs | ✅ | ✅ | ✅ | ✅ |
Firecrawl | ❌ | ✅ | ❌ | ❌ |
Apify | ✅ | ✅ | ✅ | ✅ |
Browserbase | ✅ | ✅ | ❌ | ❌ |
Tavily | ❌ | ❌ | ❌ | ❌ |
Exa | ❌ | ❌ | ❌ | ❌ |
Targeting
Product | City-Level Targeting | ZIP-Code Targeting | ASN Targeting |
---|---|---|---|
Bright Data | ✅ | ✅ | ✅ |
Oxylabs | ✅ | ✅ | ❌ |
Firecrawl | ✅ | ❌ | ❌ |
Apify | ❌ | ❌ | ❌ |
Browserbase | ❌ | ❌ | ❌ |
Tavily | ❌ | ❌ | ❌ |
Exa | ✅ | ✅ | ✅ |
Security
Data security is crucial for enterprise operations. We checked whether the companies behind these MCP servers hold data security certifications. All of the companies claim on their websites to have either an ISO 27001 or a SOC 2 certification.
Pricing benchmark
Since all MCP servers with browsing capabilities use different parameters in pricing, it is hard to compare them.
Therefore, we measured the price of a single task. It is difficult to measure the cost of only the correctly completed tasks because most providers do not break down costs granularly over time. To be fair to all products, we chose the first task for measurement, since it has the highest overall success rate.
Most products are available through various plans with different limits, and some plans also allow the purchase of additional credits. Credits are consumed according to different metrics, such as per API call, per GB, or per page.
Please note that these prices do not include the cost of the LLM; our cost of using Claude 3.5 Sonnet exceeded the browsing costs during these tasks. Therefore, LLM pricing is likely to matter more than MCP server pricing when building agents for web-related tasks.
*Prices may vary depending on the selected plan and enterprise discounts.
Participants
We included all MCP servers that provide cloud-based web browsing capabilities:
- Apify
- Bright Data
- Browserbase
- Firecrawl
- Oxylabs
- Exa
- Tavily
Apify, Bright Data and Oxylabs are sponsors of AIMultiple.
For this version of our benchmark, we excluded MCP servers that run on users’ own devices, since they have limited capacity for responding to a high number of requests. If we missed any cloud-based MCP servers with web browsing capabilities, please let us know in the comments.
Methodology to assess the MCP servers’ browsing capabilities
MCPs function across various development environments, including Claude Desktop, VSCode, and Cursor. In our evaluation, we integrated MCPs into a LangGraph agent framework using the langchain-mcp-adapters library; a minimal integration sketch follows the prompt list below. We used 4 prompts:
- Shopping assistant: “Go to Amazon and find 3 headphones under 30 dollars. Provide their names, ratings and URLs.”
- AI SDR for lead generation: “Go to LinkedIn, find 2 people who work at AIMultiple, provide their names and profile URLs.”
- Travel assistant: “Find the best price for the Betsy Hotel, South Beach, Miami on June 16, 2025. Provide the price and URL.”
- Form filler: “https://aimultiple.com/ go to that page, enter my e-mail xxx@aimultiple.com to the newsletter subscription and click to the subscribe button.”
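For reference, the following is a minimal sketch of how a browsing MCP server can be wired into a LangGraph ReAct agent via langchain-mcp-adapters. The server name, package, and API key variable are placeholders (each provider documents its own MCP package and credentials), and the exact client API may differ slightly between library versions.

```python
# Minimal sketch (placeholder server command and credentials): connect one browsing
# MCP server to a LangGraph ReAct agent through langchain-mcp-adapters.
import asyncio

from langchain_anthropic import ChatAnthropic
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent


async def main() -> None:
    client = MultiServerMCPClient(
        {
            "browser": {  # placeholder server name
                "command": "npx",
                "args": ["-y", "<provider-mcp-package>"],  # provider-specific package
                "env": {"API_TOKEN": "<your-api-key>"},    # provider-specific credentials
                "transport": "stdio",
            }
        }
    )
    tools = await client.get_tools()  # MCP tools exposed as LangChain tools

    model = ChatAnthropic(model="claude-3-5-sonnet-20241022")
    agent = create_react_agent(model, tools)

    result = await agent.ainvoke(
        {
            "messages": [
                ("user", "Go to Amazon and find 3 headphones under 30 dollars. "
                         "Provide their names, ratings and URLs.")
            ]
        }
    )
    print(result["messages"][-1].content)


asyncio.run(main())
```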
We executed each task 5 times per AI agent and evaluated performance based on specific data points.
Each task constituted 25% of the total score, with points awarded for successfully retrieving each required data element. Our code tracked both the MCP tools’ execution time and the complete agent processing duration, using claude-3-5-sonnet-20241022 as the large language model of the AI agent.
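As an illustration of how these two durations can be separated (not our exact measurement code), a LangChain callback handler can timestamp tool starts and ends while the overall task is timed around the agent call:

```python
# Sketch: separate MCP tool execution time from total agent processing time
# using a LangChain callback handler (assumes sequential, non-overlapping tool calls).
import time

from langchain_core.callbacks import BaseCallbackHandler


class ToolTimer(BaseCallbackHandler):
    """Accumulates wall-clock time spent inside tool (MCP) calls."""

    def __init__(self) -> None:
        self.tool_seconds = 0.0
        self._started_at = None

    def on_tool_start(self, serialized, input_str, **kwargs):
        self._started_at = time.perf_counter()

    def on_tool_end(self, output, **kwargs):
        if self._started_at is not None:
            self.tool_seconds += time.perf_counter() - self._started_at
            self._started_at = None


timer = ToolTimer()
task_start = time.perf_counter()
# result = agent.invoke({"messages": [...]}, config={"callbacks": [timer]})  # agent from the sketch above
total_seconds = time.perf_counter() - task_start
print(f"MCP tool time: {timer.tool_seconds:.1f}s, total agent time: {total_seconds:.1f}s")
```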
To be fair to all MCPs, we used the same agent with the same prompts and the same system prompt. The system prompt was phrased generically so it suits all MCPs (no tool-specific mentions or detailed instructions).
The first three tasks measured the MCPs’ search and extraction capabilities, and the last task measured their browser automation abilities.
MCP web browsing challenges & mitigations
While we faced challenges similar to those in the agent browser benchmark, MCPs present novel challenges for benchmarking. An LLM with an external memory function can in principle act as a Turing machine, so paired with an MCP server that provides browsing capabilities, it is theoretically possible to complete any web navigation or browser automation task.
As a result, by writing custom code for each agent, it is possible to achieve 100% success rates. However, that is not a good proxy for MCP users who want to provide simple instructions and still achieve high success rates, so we chose prompts that are as simple and universal as possible and make no references to functionality in specific MCP servers.
Context window
Long tasks can exceed the context window. Agents consume full pages as they navigate the web, so the limited context window of the LLM is sooner or later exceeded. Therefore, to build agents that complete tasks involving many pages, users need to:
- Use LLMs with large context windows.
- Reduce the size of the pages passed to the LLM. For example, you may be able to programmatically remove unnecessary parts of pages so the LLM focuses only on the important content; see the sketch after this list.
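As one possible implementation of the second point (a sketch that assumes the browsing tool returns raw HTML), the snippet below strips scripts, styles, and navigation chrome with BeautifulSoup and truncates the remainder before it is handed to the LLM:

```python
# Sketch: trim a fetched page before passing it to the LLM to save context window.
# Assumes the browsing tool returned raw HTML; adjust the removed tags to your target pages.
from bs4 import BeautifulSoup


def trim_page(html: str, max_chars: int = 20_000) -> str:
    soup = BeautifulSoup(html, "html.parser")
    # Drop elements that rarely carry task-relevant content.
    for tag in soup(["script", "style", "nav", "footer", "header", "svg", "iframe"]):
        tag.decompose()
    text = soup.get_text(separator="\n", strip=True)
    return text[:max_chars]  # hard cap as a last resort
```

In practice, a function like this would be applied to each tool result before it is appended to the agent’s message history.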
Developer experience
Since MCPs are relatively new, using them outside well-known clients such as Claude Desktop or Cursor can be challenging for developers due to the limited documentation available online. However, using them in these clients does not require any coding: by following our instructions, you can configure Claude Desktop to use the MCP server you need. The configuration file structure is sketched below.
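For reference, Claude Desktop reads its MCP servers from a claude_desktop_config.json file. The sketch below adds one placeholder entry to that file programmatically; editing the JSON by hand achieves the same result, and the server name, package, and API key variable are placeholders for whichever provider you choose.

```python
# Sketch: register a placeholder browsing MCP server in Claude Desktop's config (macOS path shown).
# The server name, npm package, and API key variable are placeholders for your chosen provider.
import json
from pathlib import Path

config_path = (
    Path.home() / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json"
)

config = json.loads(config_path.read_text()) if config_path.exists() else {}
config.setdefault("mcpServers", {})["browsing-mcp"] = {
    "command": "npx",
    "args": ["-y", "<provider-mcp-package>"],
    "env": {"API_TOKEN": "<your-api-key>"},
}
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
print(f"Updated {config_path}")
```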