AIMultiple Research

Proxy Benchmark: Scraping 10k Webpages with Top 5 Proxies in '24

Gulbahar Karatas
Updated on Feb 16

AIMultiple’s proxy benchmark report provides an in-depth analysis of five proxy service providers, featuring industry leaders such as Oxylabs and Smartproxy. Our evaluation encompasses a detailed review of five different products focusing on residential proxy networks covering:

  • Effectiveness as measured by success rate
  • Scalability as measured by:
    • Scraping time per page, which is critical for high-scale scraping operations
    • Variation in scraping time per page

How to choose the right proxy for your web data task?

If you aim to rapidly (within a couple of hours) retrieve every page in a large (>10,000 URLs) target list that includes URLs protected by anti-scraping techniques, work with market-leading proxy companies like Bright Data, Oxylabs and Smartproxy that offer unblocker services.

Even if you don’t need this data rapidly, our recommendation doesn’t change. Web unblockers are necessary to access pages that attempt to block scrapers.

If you are targeting easy-to-crawl websites and are OK with not getting results from all pages, you can leverage any proxy provider based on their pricing. For more, check out AIMultiple’s proxy pricing guide.

What are the benchmark results?

  • Regardless of the brand, residential proxies either work or don’t work for a specific site, depending on that website’s anti-scraping approach:
    • For websites that rely heavily on anti-scraping techniques, the success rate was ~0% (just a handful of successful results out of 1,700 pages) for all 5 providers.
    • For lightly protected websites, the success rate ranged from 90% to 99%.
  • There are significant speed differences between brands in parameters that impact scalability:
    • Average response time ranges from 2 to 6 seconds.
    • Standard deviation of response time ranges from 1 to 7 seconds.

What are the best residential proxy providers?

Effectiveness

Effectiveness is measured by success rate: the percentage of connection requests that are successfully processed by the proxy server (i.e., that return a 200 response).
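As a rough illustration of this definition, the sketch below computes a success rate from a list of recorded status codes. The function name `success_rate`, the sample data, and the use of `None` for requests that received no response are assumptions made for the example, not part of the benchmark's tooling.

```python
def success_rate(status_codes):
    """Return the percentage of requests that came back with HTTP 200."""
    if not status_codes:
        raise ValueError("no requests recorded")
    # Only an exact 200 counts as a success, matching the definition above.
    successes = sum(1 for code in status_codes if code == 200)
    return 100.0 * successes / len(status_codes)


# Hypothetical sample; None marks a request that never received a response.
sample = [200, 200, 403, 200, None, 200, 429, 200, 200, 200]
print(f"Success rate: {success_rate(sample):.1f}%")
```

Counting only 200 responses is the strictest reading of the definition; a team that also treats other 2xx codes as successful would need to adjust the comparison.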

On lightly protected websites, all benchmarked providers achieved a success rate above 90%.

Since the results are relatively close, we are not sharing exact percentages. It may not make a major difference if a brand achieved 97% success rate while another achieved 94%. In both cases, another crawl of unsuccessful pages is necessary.

Scalability

Smartproxy and Oxylabs lead the pack in terms of average time to extract data and the standard deviation of time to extract data.

Proxy provider | Average Successful Response Time (s) | Standard Deviation of Successful Response Time (s)
Oxylabs        | 2 | 1
Smartproxy     | 2 | 2
Webshare       | 4 | 6
IPRoyal        | 4 | 5
Rayobyte       | 6 | 7

  • Average Successful Response Time (s): The average time it takes to obtain a successful response from the proxy network, measured in seconds.
  • Standard Deviation of Successful Response Time (s): Indicates the variability of the response time. A lower standard deviation reflects more consistent proxy performance.
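
The two metrics defined above can be computed with Python's standard library. This is a minimal sketch: the helper name `summarize` and the sample timings are hypothetical, and it assumes the sample standard deviation, since the report does not state whether the sample or the population formula was used.

```python
import statistics


def summarize(times_s):
    """Return (mean, standard deviation) of successful response times in seconds."""
    mean = statistics.mean(times_s)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    # Use statistics.pstdev instead for the population formula.
    stdev = statistics.stdev(times_s)
    return mean, stdev


# Hypothetical timings, in seconds, for successful responses only.
timings = [1.8, 2.1, 2.4, 1.9, 6.3]
mean_s, stdev_s = summarize(timings)
print(f"average: {mean_s:.2f} s, standard deviation: {stdev_s:.2f} s")
```

A single slow response (6.3 s in the sample) raises the standard deviation much more than the average, which is why the two columns are reported separately.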

The data collection occurred in November 2023.

Methodology

The AIMultiple team used each proxy brand to extract data from pre-selected URLs. Requests were processed sequentially.

The target URL set included 10,200 URLs from 6 websites in these domains:

  • E-commerce: 3 
  • Travel: 1
  • HR: 1
  • Real estate: 1

Limitations and next steps

Success rate was measured based on the HTTP status code returned by the webpage. One anti-scraping technique is to return a success code to crawlers while serving incorrect information. AIMultiple’s benchmark did not account for this technique since verifying the correctness of webpage data was challenging.

Unsuccessful crawl attempts were not retried. In practice, a user could then try to scrape the same page with an unblocker. AIMultiple’s next benchmark will incorporate unblockers as well.
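
The follow-up step described here, which the benchmark itself did not perform, might look like the sketch below. The names `recrawl_failures` and `fallback_fetch` are hypothetical, and `fallback_fetch` stands in for whatever unblocker client a user has available.

```python
def recrawl_failures(first_pass, fallback_fetch):
    """Retry every URL that did not return 200, using a fallback fetcher.

    first_pass maps each URL to the status code from the initial crawl.
    fallback_fetch(url) is assumed to return a status code (or None).
    """
    final = dict(first_pass)
    for url, status in first_pass.items():
        if status != 200:
            final[url] = fallback_fetch(url)
    return final


# Hypothetical first-pass results and a stand-in unblocker that always succeeds.
first_pass = {
    "https://shop.example/p/1": 200,
    "https://shop.example/p/2": 403,
    "https://shop.example/p/3": None,
}
final = recrawl_failures(first_pass, lambda url: 200)
print(sum(1 for s in final.values() if s == 200), "of", len(final), "pages retrieved")
```

Retrying only the failed URLs keeps the cost of the second pass proportional to the failure rate rather than to the size of the full URL list.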

Proxy server FAQ

1. What is a proxy?

A proxy server is an intermediary between a user’s device and the target website. When you use a proxy server, your internet traffic is routed through it, so the target website sees the proxy server’s IP address instead of yours.

2. What are the different types of proxies?

Proxies are typically classified into two main categories, residential and datacenter proxies:

  • Residential proxies: Residential proxies are associated with an IP address provided by an Internet Service Provider (ISP).
    • Static residential proxies (ISP): ISP proxies are residential IP addresses that remain consistent over time. 
    • Mobile proxies: These are residential proxies that use IP addresses assigned to mobile devices, such as phones and tablets, by mobile network operators. 
  • Datacenter proxies: Datacenter IP addresses are provided by data centers rather than internet service providers. 

3. Is it legal to use a proxy server?

Utilizing a proxy server is not inherently illegal. However, the legality depends on how the proxy is used and on the laws in your specific country.

4. How do I choose the right proxy provider?

It’s essential to determine the type of proxy suitable for your requirements. For instance, datacenter proxies typically provide faster speeds compared to residential proxies, making them suitable for high-speed tasks. If selecting IPs from particular countries is important, verify that the proxy provider supplies proxies in those regions. Consult user reviews on third party review platforms for unbiased opinions. Additionally, see if the provider allows a trial period or offers a money-back guarantee, enabling you to evaluate their service before fully committing.

5. What is the difference between a VPN and a proxy?

VPNs (Virtual Private Networks) and proxies both route internet traffic through a server, concealing your IP address in the process. A VPN encrypts all data transmitted between your device and the VPN server, including your browsing activity, and it covers all of your internet traffic rather than just your browser. In contrast, a proxy server only hides your IP address and redirects requests from specific applications or your browser. Generally, proxies offer faster speeds than VPNs.

Transparency statement

AIMultiple serves numerous emerging tech companies, including Smartproxy and Oxylabs.

Gulbahar Karatas
Gülbahar is an AIMultiple industry analyst focused on web data collections and applications of web data.
