
Best Instagram Scrapers: Apify vs Python vs Bright Data

Gulbahar Karatas
updated on Nov 25, 2025

Instagram is one of the most aggressive platforms when it comes to blocking automated scraping. We show how to scrape Instagram reliably using Python and Instagram scraper APIs. In this guide, you’ll learn how to collect different types of Instagram data, including profiles, posts, and comments.

Key takeaways: Instagram scraper, Python code & APIs

  • Basic Python scraping doesn’t work on Instagram due to strong anti-bot systems, so we rely on scraper APIs that handle proxies, browser simulation, and rate limits.
  • We built three scrapers in Python: profiles, posts, and comments, each using snapshot-based API jobs and clean CSV outputs.
  • We used Google Search to reliably discover Instagram post URLs matching our keyword and date filters.
  • Our polling system handles snapshot states, fallback downloads, JSON-line parsing, and 15-minute timeouts.

The best Instagram scrapers

| Scrapers | Supported pages | Scrape options | Formats | Scraper type |
|---|---|---|---|---|
|  | Comments, Posts, Profiles, Reels | Query, URL | CSV, JSON, NDJSON, JSON lines | Specialized API |
|  | Comments, Posts, Profiles, Hashtags, Reels | Query, URL | Table, JSON | Specialized API |
| Decodo | Posts, Profiles, Hashtags, Reels | Query | Table, JSON | Specialized API |
| Nimble | No preset templates for IG | Query | JHTML, JSON | General-purpose |

The vendors with links are AIMultiple’s sponsors.

  • Specialized API: Instagram-specific scraper API tailored for gathering data exclusively from Instagram. For example, Bright Data offers templates tailored to specific Instagram data points, such as “instagram-comments-collect by URL.”

  • General-purpose: Offers a versatile scraper that isn’t specialized for Instagram but can be modified to handle Instagram web scraping tasks.

  • Supported page types: The page types for which the Instagram scraping tool delivers data in a structured format.

Instagram scraper APIs benchmark results

Compare providers’ median response time and the average number of fields that they returned in our benchmark:

[Chart: median response time and average field count per provider]

Pricing of the best Instagram scraper APIs

Monthly pricing options for these providers are listed below.

[Chart: monthly pricing by provider]

How to scrape Instagram profiles with Python

Step 1: Setup and configuration

This step:

  • Imports the Python libraries for HTTP requests, JSON, and pandas.
  • Sets your API token and the Instagram profiles dataset ID.
  • Defines profile_urls, the list of Instagram accounts you want to scrape (here it’s just langchain.ai, but you can add as many as you like).
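The original code is not reproduced here, so the following is a minimal sketch of this setup step. The token, dataset ID, and endpoint base URL are placeholders, and the endpoint layout is an assumption modeled on snapshot-style scraper APIs:

```python
# Setup sketch: libraries, credentials, and the profiles to scrape.
import json            # JSON handling
import requests        # HTTP requests
import pandas as pd    # tabular processing

API_TOKEN = "YOUR_API_TOKEN"                # your scraper API token (placeholder)
PROFILES_DATASET_ID = "gd_xxxxxxxxxxxx"     # Instagram profiles dataset ID (placeholder)
BASE_URL = "https://api.example-scraper.com/datasets/v3"  # assumed endpoint layout

HEADERS = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json",
}

# Accounts to scrape; add as many profile URLs as you like
profile_urls = ["https://www.instagram.com/langchain.ai/"]
```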

Step 2: Submitting profile URLs to the web scraper

Here you start the profile scraping job:

  • Each profile URL is wrapped as an object in data and sent to the profiles dataset.
  • The API responds with a snapshot_id representing this job; you’ll use it in the next step to fetch the scraped profile data.
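A sketch of how that job submission might look. The trigger endpoint and parameter names are assumptions; only the request shape (a batch of `{"url": ...}` objects returning a `snapshot_id`) follows the article:

```python
import requests

def build_payload(urls):
    """Wrap each profile URL as an object, as the dataset API expects."""
    return [{"url": u} for u in urls]

def trigger_job(urls, api_token, dataset_id,
                base_url="https://api.example-scraper.com/datasets/v3"):
    """Submit all URLs in one batch and return the job's snapshot_id."""
    resp = requests.post(
        f"{base_url}/trigger",
        headers={"Authorization": f"Bearer {api_token}"},
        params={"dataset_id": dataset_id, "format": "json"},
        json=build_payload(urls),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["snapshot_id"]
```

Calling `trigger_job(profile_urls, API_TOKEN, PROFILES_DATASET_ID)` would return the `snapshot_id` that the polling step consumes.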

Step 3: Polling the API until profile data is ready

This loop:

  • Checks the snapshot status every 10 seconds, up to a 15-minute timeout.
  • Handles both “ready with download_url” and “items embedded in the response” formats, plus a fallback download endpoint.
  • Collects all returned profile records into the items list before moving on.
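The loop above can be sketched as follows. The endpoint paths and the "ready"/"failed" status values are assumptions modeled on snapshot-style scraper APIs; adapt them to your provider's documentation:

```python
import time
import requests

POLL_INTERVAL = 10      # seconds between status checks
TIMEOUT = 15 * 60       # give up after 15 minutes

def parse_items(payload):
    """Handle both response shapes: a bare list of records,
    or records embedded under an 'items' key."""
    if isinstance(payload, list):
        return payload
    return payload.get("items", [])

def poll_snapshot(snapshot_id, api_token,
                  base_url="https://api.example-scraper.com/datasets/v3"):
    headers = {"Authorization": f"Bearer {api_token}"}
    deadline = time.time() + TIMEOUT
    while time.time() < deadline:
        resp = requests.get(f"{base_url}/progress/{snapshot_id}",
                            headers=headers, timeout=30)
        status_info = resp.json()
        if status_info.get("status") == "ready":
            if "download_url" in status_info:
                # "ready with download_url" format
                data = requests.get(status_info["download_url"],
                                    headers=headers, timeout=60)
            else:
                # fallback download endpoint
                data = requests.get(f"{base_url}/snapshot/{snapshot_id}",
                                    headers=headers,
                                    params={"format": "json"}, timeout=60)
            return parse_items(data.json())
        if status_info.get("status") == "failed":
            raise RuntimeError("Scraping job failed")
        time.sleep(POLL_INTERVAL)
    raise TimeoutError("Snapshot not ready within 15 minutes")
```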

Step 4: Processing and saving Instagram profile data

Finally, you turn the raw API records into a clean dataset:

  • Safely parses numeric fields like followers, posts_count, and avg_engagement.
  • Keeps useful profile attributes: account IDs, business/professional flags, verification status, bio, full name, and external URL.
  • Stores everything in a pandas DataFrame and writes it to instagram_profiles_data.csv for further analysis or reporting.
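A sketch of that post-processing step. The record keys mirror the fields listed above but are assumptions about the API's actual response schema:

```python
import pandas as pd

def to_int(value):
    """Safely parse numeric fields that may be missing or arrive as strings."""
    try:
        return int(float(value))
    except (TypeError, ValueError):
        return None

def records_to_frame(items):
    """Keep the useful profile attributes and return them as a DataFrame."""
    rows = []
    for rec in items:
        rows.append({
            "account_id": rec.get("id"),
            "full_name": rec.get("full_name"),
            "biography": rec.get("biography"),
            "external_url": rec.get("external_url"),
            "is_verified": rec.get("is_verified"),
            "is_business_account": rec.get("is_business_account"),
            "followers": to_int(rec.get("followers")),
            "posts_count": to_int(rec.get("posts_count")),
            "avg_engagement": rec.get("avg_engagement"),
        })
    return pd.DataFrame(rows)

# Usage after polling completes:
# records_to_frame(items).to_csv("instagram_profiles_data.csv", index=False)
```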

How to scrape Instagram posts with Python

Step 1: Setup and configuration

In this example, we’ll use the Instagram dataset API plus proxies to collect Instagram posts that match a keyword within a date range.

This block:

  • Imports the Python libraries for URL parsing, HTTP requests, JSON handling, and data analysis with pandas.
  • Sets your API token and Instagram dataset ID.
  • Configures the proxy for Instagram scraping.
  • Defines the search parameters: KEYWORD, the number of posts to fetch (NUM_POSTS), and the date window (DATE_START → DATE_END).
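A sketch of that configuration block. The keyword, proxy address, and dataset ID are placeholders, not real values:

```python
import re              # extracting post URLs from HTML
import time            # delays between search pages
import requests        # HTTP requests
import pandas as pd    # data analysis

API_TOKEN = "YOUR_API_TOKEN"
POSTS_DATASET_ID = "gd_xxxxxxxxxxxx"       # Instagram posts dataset ID (placeholder)

# Rotating residential proxy used for the Google Search requests (placeholder)
PROXIES = {
    "http": "http://USER:PASS@gateway.proxy-provider.example:7777",
    "https": "http://USER:PASS@gateway.proxy-provider.example:7777",
}

KEYWORD = "example keyword"                # search term (placeholder)
NUM_POSTS = 20                             # how many post URLs to collect
DATE_START, DATE_END = "2024-01-01", "2024-12-31"  # date window
```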

Step 2: Finding Instagram posts via Google search

We use Google Search to find relevant Instagram posts that match our criteria within a specific date range. The script:

  • Builds a query like site:instagram.com/p/ "{KEYWORD}" after:DATE_START before:DATE_END and paginates through Google results.
  • Uses regex patterns to extract Instagram post URLs from the HTML, normalizes them (www.instagram.com vs instagram.com), and removes duplicates.
  • Stops when it has collected NUM_POSTS unique URLs or when it reaches the maximum number of Google result pages.
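The discovery step above can be sketched like this. Google's result HTML changes often, so the regex and the `start` pagination parameter are assumptions rather than a guaranteed-stable interface:

```python
import re
import time
import requests

# Matches instagram.com/p/<shortcode> with or without the www. prefix
POST_RE = re.compile(r"https?://(?:www\.)?instagram\.com/p/([A-Za-z0-9_-]+)")

def extract_post_urls(html):
    """Pull Instagram post URLs out of raw HTML, normalized and deduplicated."""
    urls, seen = [], set()
    for shortcode in POST_RE.findall(html):
        if shortcode not in seen:
            seen.add(shortcode)
            urls.append(f"https://www.instagram.com/p/{shortcode}/")
    return urls

def find_posts(keyword, num_posts, date_start, date_end,
               proxies=None, max_pages=10):
    """Paginate through Google results until enough unique URLs are found."""
    query = f'site:instagram.com/p/ "{keyword}" after:{date_start} before:{date_end}'
    found = []
    for page in range(max_pages):
        resp = requests.get(
            "https://www.google.com/search",
            params={"q": query, "start": page * 10},
            headers={"User-Agent": "Mozilla/5.0"},
            proxies=proxies,
            timeout=30,
        )
        for url in extract_post_urls(resp.text):
            if url not in found:
                found.append(url)
        if len(found) >= num_posts:
            break
        time.sleep(2)  # be polite between result pages
    return found[:num_posts]
```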

Step 3: Sending Instagram post URLs to API for scraping

This step kicks off the actual scraping job:

  • It sends all collected Instagram URLs to the Instagram dataset in a single batch request.
  • The API returns a snapshot_id that identifies this scraping job and is used in the next step to fetch the results once processing is complete.

Step 4: Polling for results and saving data
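Once the snapshot reports that it is ready (the same 10-second polling pattern described for profiles), the post records can be flattened and written to CSV. A minimal sketch, with assumed field names:

```python
import pandas as pd

def posts_to_frame(items):
    """Flatten raw post records into a table; the record keys are assumptions."""
    return pd.DataFrame([{
        "url": rec.get("url"),
        "user_posted": rec.get("user_posted"),
        "caption": rec.get("description"),
        "likes": rec.get("likes"),
        "num_comments": rec.get("num_comments"),
        "date_posted": rec.get("date_posted"),
    } for rec in items])

# Usage after polling completes:
# posts_to_frame(items).to_csv("instagram_posts_data.csv", index=False)
```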

How to scrape Instagram comments with Python

Step 1: Setup and configuration

This step:

  • Imports libraries for URL handling, regular expressions, HTTP requests, and pandas.
  • Sets your comments dataset ID and your API_TOKEN.
  • Configures the proxy to use and defines the search parameters: keyword, the number of posts to pull comments from, and the date window.

Step 2: Finding Instagram posts via Google search

Here you:

  • Use Google Search with the site:instagram.com/p/ query plus your keyword and date filters to find relevant posts.
  • Extract and normalize Instagram post URLs with regex, deduplicate them, and stop once you have NUM_POSTS posts.
  • Store the final list in instagram_urls, which will feed into the comments scraper.

Step 3: Submitting post URLs to comments scraper API

This step:

  • Sends all Instagram URLs to the Instagram comments dataset in a single batch.
  • Each URL is wrapped as {"url": ...} so the API knows which post to scrape comments from.
  • The API returns a snapshot_id that identifies this comments scrape job.

Step 4: Polling for results and saving comment data

We continuously check if the scraping is complete, then process and save the comment data.

This section polls the API every 10 seconds until the scraping is complete. Once ready, it retrieves all comment data, extracting key information like the commenter’s username, comment text, likes, replies, hashtags used, and tagged users. The data is structured into a pandas DataFrame and saved as a CSV file.
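A minimal sketch of that flattening step; the record keys are assumptions about the API's response schema:

```python
import pandas as pd

def comments_to_frame(items):
    """Extract the key fields for each comment into a flat table."""
    rows = []
    for rec in items:
        rows.append({
            "commenter": rec.get("comment_user"),
            "comment_text": rec.get("comment"),
            "likes": rec.get("likes_number"),
            "replies": rec.get("replies_number"),
            "hashtags": rec.get("hashtag_comment"),
            "tagged_users": rec.get("tagged_users_in_comment"),
            "post_url": rec.get("post_url"),
        })
    return pd.DataFrame(rows)

# Usage after polling completes:
# comments_to_frame(items).to_csv("instagram_comments_data.csv", index=False)
```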

How Instagram detects scrapers (why basic Python scripts fail)

Simple Python scripts using requests fail immediately because they lack real browser behavior and rely on a single IP that gets blocked within minutes. The platform can detect Instagram web scrapers instantly through multiple layers of defenses:

  • No JavaScript execution: Instagram loads much of the page dynamically, and Python scripts cannot execute JavaScript, so pages appear empty. This instantly reveals non-human behavior.
  • Rate limiting: Human users do not make 50 requests per second. Basic scrapers retry with predictable timing, and Instagram blocks this immediately.
  • IP reputation: Instagram maintains real-time IP trust scores and flags datacenter IPs and addresses shared by many users. Do not use free proxies; they get blocked after a few requests.

We used a web scraper API that handles browser simulation, IP rotation, JavaScript, rate limits, and captcha solving.

Proxies, rate limits & running your Instagram scraper at scale

Instagram will ban any script that reuses the same IP repeatedly. To scrape Instagram at scale, use rotating residential proxies, respect rate limits, introduce delays, and avoid sending requests to Instagram directly.
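A minimal sketch of that proxy setup. The gateway address and credentials are placeholders for a rotating residential proxy service; requests routes traffic through whatever proxies mapping you pass it:

```python
import requests

# Rotating residential proxy gateway (placeholder credentials and host)
PROXIES = {
    "http": "http://USER:PASS@gateway.proxy-provider.example:7777",
    "https": "http://USER:PASS@gateway.proxy-provider.example:7777",
}

def fetch_via_proxy(url, params=None):
    """Route a request through the rotating proxy; each request can exit
    from a different residential IP, which avoids single-IP bans."""
    return requests.get(
        url,
        params=params,
        proxies=PROXIES,
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=30,
    )
```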

At scale, Instagram performs velocity checks (too many requests too fast) and concurrency checks (too many requests at once). Our tutorial avoids this by:

  • sleeping between Google Search pages (time.sleep(2)),
  • polling APIs every 10 seconds,
  • never hitting Instagram directly.

Instagram scrapers used in the benchmark

Our benchmark tested the dedicated Instagram scraper API solutions listed below. To learn more, see the benchmark methodology for web scraping APIs.

See which major web infrastructure companies offer dedicated scrapers for Instagram pages in the comparison table above.

FAQs about Instagram scraper (Python & API)

Gulbahar Karatas, Industry Analyst
Gülbahar is an AIMultiple industry analyst focused on web data collection, applications of web data, and application security.

Comments 1

Alyaa Anter
Mar 15, 2023 at 14:03

Could you help me in collecting data from Instagram

Bardia Eshghi
Sep 11, 2023 at 05:52

Hello Alyaa, doesn't the article help you with that?