AIMultiple

TikTok Scraper: How to Scrape TikTok (Python Tutorial + Legal Tips)

Sedat Dogan
updated on Nov 10, 2025

This guide covers how to scrape TikTok data safely, efficiently, and within TikTok’s legal boundaries.

Discover the top TikTok scrapers, including details on their data coverage, pricing, and regional support.

  • UI: User interface
  • Dedicated: A scraper API built specifically for collecting data from TikTok.
  • General-Purpose: A scraper that is not designed specifically for TikTok but can be adapted for TikTok web scraping.
  • Supports: Pages for which the scraper returns structured data.

Choosing the right TikTok scraper API

Bright Data

Bright Data’s TikTok Scraper API offers direct endpoints for collecting profile, video, and hashtag data. The platform supports:

  • Profile endpoint: Collect public fields, such as nickname, follower count, video count, and total likes.
  • Post endpoint: Extract video URLs, hashtags, view counts, captions, and engagement metrics.
  • Comment endpoint: Retrieve top-level comments and engagement ratios.
  • Response formats: JSON, CSV, or streaming output for real-time data pipelines.

Bright Data handles IP rotation, browser emulation, and rate-limit control automatically. It’s best suited for teams needing large-scale, structured data feeds.

Apify

Apify provides a modular TikTok scraper actor that allows developers to gather public TikTok data via API or Node.js scripts. Here is how the Apify TikTok scraper actor works:

  1. Generate an API token from your Apify account.
  2. Install the apify-client package.
  3. Call the TikTok Scraper Actor with parameters such as:
    • region (e.g., “US”)
    • type: “HASHTAG”, “USER”, or “MUSIC”
    • url: The target TikTok hashtag or profile URL
    • limit: Number of videos to extract
  4. Export results via the Dataset API in JSON or CSV.
  5. Download videos using the video.play_addr.url_list[0] path.
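The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not Apify's reference code: ACTOR_ID is a hypothetical placeholder, the APIFY_TOKEN environment variable name is an assumption, and the input field names simply mirror the list above, so the actor you use may expect a different schema. The run_actor function relies on the apify-client package, which is imported lazily so the helper functions work without it.

```python
import os
from typing import Any, Dict, List, Optional

# Hypothetical placeholder: substitute the ID of the TikTok actor you actually use.
ACTOR_ID = "<your-tiktok-actor-id>"


def build_run_input(region: str, scrape_type: str, url: str, limit: int) -> Dict[str, Any]:
    """Assemble the actor input from the parameters listed in the article.

    The field names are assumptions; check the actor's own input schema.
    """
    allowed = {"HASHTAG", "USER", "MUSIC"}
    if scrape_type not in allowed:
        raise ValueError(f"type must be one of {sorted(allowed)}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return {"region": region, "type": scrape_type, "url": url, "limit": limit}


def first_video_url(item: Dict[str, Any]) -> Optional[str]:
    """Follow video.play_addr.url_list[0]; return None if any level is missing."""
    node: Any = item
    for key in ("video", "play_addr", "url_list"):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node[0] if isinstance(node, list) and node else None


def run_actor(run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed variable name
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A typical call would be run_actor(build_run_input("US", "HASHTAG", "<hashtag URL>", 50)), followed by first_video_url(item) for each returned item. Items that lack the nested video path yield None rather than raising an error.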

TikTok-specific strengths:

  • Automatically handles dynamic JavaScript loading and pagination.
  • Allows retrieval of engagement metrics, hashtags, and music IDs.
  • Works with Python, Node.js, or cURL, supporting multi-language integration.

Nimbleway

Nimble’s web scraping API offers proxy rotation and fingerprint evasion features that improve TikTok scraping reliability. While not TikTok-exclusive, its residential proxy network and anti-bot bypass logic make it a strong choice for accessing public TikTok endpoints from different regions.

Decodo

Decodo offers a TikTok post scraper that focuses on collecting comment threads and search results by country or keyword. The API supports XHR-only mode, which filters raw network responses to provide developers with precise JSON payloads. This mode helps integrate TikTok post data into dashboards or NLP pipelines.

Octoparse

Octoparse offers multiple pre-built TikTok scraper templates for collecting post, profile, and comment data directly from TikTok’s public pages.

Unlike API-based tools such as Bright Data or Apify, Octoparse utilizes visual automation that replicates real user interactions through its browser emulator. Each template supports configuration for:

  • Batch input (up to 10,000 TikTok URLs)
  • Custom page size (50–200 results)
  • Export options (Excel, CSV, JSON, or Google Sheets)
  • Pricing tiers (Free: $0.4/1,000 lines – $2/1,000 lines for detailed video metadata)

How to build a TikTok profile scraper in Python

If you prefer coding your own TikTok data scraper instead of using no-code tools, Python gives you complete control over what data you collect and how you process it. In this tutorial, you’ll learn how to scrape TikTok data such as usernames, captions, and engagement metrics using Python libraries.

Note: Always comply with TikTok’s robots.txt [3] and Terms of Service when collecting public data.

This tutorial shows you how to scrape TikTok profile data using Bright Data’s TikTok scraper API and extract detailed post information.

Step 1: Set up Your Python TikTok scraper

To start TikTok scraping with Python, you first need to import the required libraries and configure your API credentials. This setup step prepares your environment for running the scraper script.

In this step, you’re importing essential Python packages used for sending HTTP requests, handling JSON responses, and managing data with Pandas. These libraries form the base of any Python TikTok scraper.

The script needs your API token and TikTok dataset ID to authenticate and connect to the platform. You can find both values inside your API dashboard under the TikTok scraper section.

Set the profile URL you want to analyze. This example uses a single TikTok profile URL, but you can easily extend it to multiple competitor profiles for large-scale TikTok data scraping.
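A minimal sketch of this setup step is shown below. It is not the article's original script: it uses only the standard library so it runs without extra installs, whereas the script described above also relies on an HTTP client and Pandas. The environment variable names, the ScraperConfig class, the Bearer authorization header, and the example profile URL are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, List, Mapping


@dataclass
class ScraperConfig:
    """Holds the credentials and targets the later steps need."""
    api_token: str
    dataset_id: str
    profile_urls: List[str] = field(default_factory=list)

    def headers(self) -> Dict[str, str]:
        # Bearer-token authorization is assumed here; check your dashboard docs.
        return {
            "Authorization": f"Bearer {self.api_token}",
            "Content-Type": "application/json",
        }

    def payload(self) -> List[Dict[str, str]]:
        # One input object per profile URL.
        return [{"url": url} for url in self.profile_urls]


def load_config(env: Mapping[str, str] = os.environ) -> ScraperConfig:
    """Read credentials from the environment instead of hard-coding them."""
    token = env.get("BRIGHTDATA_API_TOKEN", "")        # assumed variable name
    dataset_id = env.get("BRIGHTDATA_DATASET_ID", "")  # assumed variable name
    if not token or not dataset_id:
        raise RuntimeError("Set BRIGHTDATA_API_TOKEN and BRIGHTDATA_DATASET_ID first")
    # Hypothetical profile; append more URLs to analyze several competitors.
    return ScraperConfig(token, dataset_id, ["https://www.tiktok.com/@example_profile"])
```

Reading the token from the environment keeps credentials out of the source file, which matters if the script is shared or committed to version control.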

Step 2: Trigger TikTok Scraping with the scraper API

This step activates the TikTok scraping job and begins retrieving the data from your selected profiles.

This step sends a POST request to Bright Data’s trigger endpoint using your API token and TikTok dataset ID. The call tells the scraper to start collecting data from the specified TikTok profile URL.

Once the request is successful, the scraper returns a snapshot_id, which uniquely identifies this TikTok scraper job. You’ll use this ID in the next step to check the scraping status and retrieve the collected TikTok data.

If the request fails, the script exits safely with an error message. This ensures that your Python TikTok scraper stops running if authentication or endpoint issues occur.
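The trigger step described above might look like the following sketch. The endpoint URL, the dataset_id query parameter, and the request body shape are assumptions based on Bright Data's public dataset API and should be confirmed against the current API reference. The standard library's urllib is used here in place of a third-party HTTP client.

```python
import json
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import List

# Assumed endpoint; confirm against the provider's current API reference.
TRIGGER_URL = "https://api.brightdata.com/datasets/v3/trigger"


def build_trigger_request(api_token: str, dataset_id: str,
                          profile_urls: List[str]) -> urllib.request.Request:
    """Build the POST request that starts a scraping job."""
    query = urllib.parse.urlencode({"dataset_id": dataset_id})
    body = json.dumps([{"url": url} for url in profile_urls]).encode("utf-8")
    return urllib.request.Request(
        f"{TRIGGER_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def parse_snapshot_id(raw: bytes) -> str:
    """Extract snapshot_id from the response body, or raise ValueError."""
    data = json.loads(raw)
    snapshot_id = data.get("snapshot_id") if isinstance(data, dict) else None
    if not snapshot_id:
        raise ValueError("response did not contain a snapshot_id")
    return snapshot_id


def trigger_scrape(request: urllib.request.Request) -> str:
    """Send the request; exit with a message on auth or endpoint errors."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return parse_snapshot_id(response.read())
    except (urllib.error.URLError, ValueError) as exc:
        sys.exit(f"Trigger request failed: {exc}")
```

Separating request construction from sending makes the payload easy to inspect before any network call is made.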

Step 3: Retrieve & save the scraped TikTok data

Once the scraping job is complete, it’s time to retrieve your TikTok data and export it for analysis. The following Python script waits for Bright Data’s API to finish processing, then downloads and saves the results into a structured dataset.

The script polls the snapshot status endpoint until the scraping job is complete, then retrieves the data file and saves it locally.
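A polling loop of this kind can be sketched as follows. The progress and snapshot URLs, the format=ndjson parameter, and the "failed" status value are assumptions to verify against the provider's documentation. The 10-second interval, the 15-minute cap, and the "ready" and "done" statuses come from the description in this tutorial.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumed endpoints; confirm against the provider's current API reference.
PROGRESS_URL = "https://api.brightdata.com/datasets/v3/progress/{snapshot_id}"
SNAPSHOT_URL = "https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=ndjson"


def _get(url: str, api_token: str) -> bytes:
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_status(snapshot_id: str, api_token: str) -> str:
    """Return the job status string reported by the progress endpoint."""
    raw = _get(PROGRESS_URL.format(snapshot_id=snapshot_id), api_token)
    return json.loads(raw).get("status", "")


def download_snapshot(snapshot_id: str, api_token: str) -> str:
    """Download the finished snapshot as NDJSON text."""
    return _get(SNAPSHOT_URL.format(snapshot_id=snapshot_id), api_token).decode("utf-8")


def wait_until_ready(get_status: Callable[[], str], interval: float = 10.0,
                     timeout: float = 900.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> str:
    """Poll get_status every `interval` seconds for at most `timeout` seconds."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("ready", "done"):
            return status
        if status == "failed":  # assumed failure status
            raise RuntimeError("Scraping job failed")
        if clock() + interval > deadline:
            raise TimeoutError(f"Job not finished after {timeout:.0f} seconds")
        sleep(interval)
```

In a real run you would call wait_until_ready(lambda: fetch_status(snapshot_id, token)) and then download_snapshot(snapshot_id, token). Passing the status function as an argument keeps the loop independent of the network and easy to test.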

Here’s how it works:

  • Polling with timeout: The scraper checks for completion every 10 seconds with a 15-minute cap.
  • Data retrieval: Once the API status returns “ready” or “done”, the script downloads the collected TikTok post data.
  • NDJSON parsing: Each record is processed line by line into Python dictionaries.
  • Data organization: The code extracts post IDs, engagement metrics (likes, comments, shares, plays), hashtags, and descriptions.
  • Export: The data is structured into a Pandas DataFrame and saved as tiktok_competitor_analysis.csv.
  • Error handling: Try-except blocks prevent crashes when encountering unexpected or missing fields.
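The parsing and export steps in the list above can be illustrated with the sketch below. The record field names (post_id, digg_count, comment_count, share_count, play_count, hashtags, description) are assumptions about the response schema and may need adjusting. The standard csv module stands in for the Pandas DataFrame used by the script described here, so the example runs without extra installs.

```python
import csv
import json
from typing import Any, Dict, List

FIELDS = ["post_id", "likes", "comments", "shares", "plays", "hashtags", "description"]


def flatten_post(record: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the fields of interest, tolerating missing keys."""
    hashtags = record.get("hashtags") or []
    if not isinstance(hashtags, list):
        hashtags = [hashtags]
    return {
        "post_id": record.get("post_id"),
        "likes": record.get("digg_count", 0),        # assumed field name
        "comments": record.get("comment_count", 0),  # assumed field name
        "shares": record.get("share_count", 0),      # assumed field name
        "plays": record.get("play_count", 0),        # assumed field name
        "hashtags": ", ".join(str(tag) for tag in hashtags),
        "description": record.get("description", ""),
    }


def parse_ndjson(text: str) -> List[Dict[str, Any]]:
    """Parse NDJSON line by line, skipping blank or malformed lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip the bad line instead of crashing the export
        if isinstance(record, dict):
            rows.append(flatten_post(record))
    return rows


def save_csv(rows: List[Dict[str, Any]],
             path: str = "tiktok_competitor_analysis.csv") -> None:
    """Write the flattened rows to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling save_csv(parse_ndjson(ndjson_text)) produces the CSV file named in the list above. To work with a DataFrame instead, the same rows can be passed to pandas.DataFrame(rows).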

TikTok terms of service, robots.txt and scraping policy

When building a TikTok scraper, whether using a managed scraper API or Playwright, it’s essential to understand TikTok’s Terms of Service (ToS) and robots.txt rules.

1. TikTok terms of service and scraping restrictions

TikTok’s Terms of Service explicitly prohibit automated access or scraping of non-public content [4]. This includes:

  • Logging in programmatically to view private or restricted accounts
  • Circumventing CAPTCHA or authentication mechanisms
  • Copying or redistributing TikTok’s code or media assets

However, collecting publicly visible metadata (such as usernames, captions, like counts, and hashtags) for research or analytics is generally treated as acceptable when done respectfully and without disrupting the service.

2. TikTok robots.txt and crawling policy

The robots.txt file is a small text document that tells web crawlers which parts of a website they can or cannot access. TikTok’s robots.txt includes disallow rules for paths such as /login, /ads, and other internal endpoints. A responsible TikTok data scraper should:

  • Check robots.txt before crawling
  • Respect rate limits (introduce delays between requests)
  • Avoid restricted endpoints listed under Disallow
  • Use APIs or browser-based renderers that fetch content exactly as a regular user would
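The first three checks in the list above can be sketched with Python's standard urllib.robotparser module. The sample rules below are illustrative only, built from the paths named in this section rather than copied from TikTok's actual file. The user agent string, the example URLs, and the 2-second delay are assumptions.

```python
import time
import urllib.robotparser
from typing import Any, Callable, Iterable, List

USER_AGENT = "example-research-bot"  # hypothetical user agent

# Illustrative rules only; not TikTok's actual robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /login
Disallow: /ads
"""


def load_rules(robots_text: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed_urls(parser: urllib.robotparser.RobotFileParser,
                 urls: Iterable[str], user_agent: str = USER_AGENT) -> List[str]:
    """Keep only the URLs the rules allow for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def crawl_politely(urls: Iterable[str], fetch: Callable[[str], Any],
                   delay_seconds: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> List[Any]:
    """Fetch each URL in turn, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index:
            sleep(delay_seconds)  # fixed delay between consecutive requests
        results.append(fetch(url))
    return results
```

To check the live file instead of the sample, create a RobotFileParser, call set_url("https://www.tiktok.com/robots.txt") and then read() before using can_fetch.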

3. Scraping TikTok data: what’s allowed and what’s not

Allowed:

  • Gathering public metadata (captions, usernames, view counts, hashtags)
  • Analyzing aggregated trends (without re-publishing individual videos)
  • Using data for market research or AI model training with anonymization

Not Allowed:

  • Accessing private user data, DMs, or login-only endpoints
  • Scraping for commercial resale or content republishing
  • Circumventing security layers or rate-limit enforcement

A useful rule of thumb:

  • If a regular visitor can see it without logging in, it’s generally safe to scrape data with moderation.

Final Thoughts

Based on internal tests across multiple TikTok pages (profiles, hashtags, and comment threads), browser-based scraping approaches proved significantly more reliable than static request methods. Tools like Bright Data and Python Playwright maintained access for longer periods, while lightweight HTTP-based scrapers frequently failed to capture dynamic content.

  • Browser-based scraping is the most reliable approach: The Python script utilizes Playwright to render dynamic JavaScript content, enabling you to accurately capture videos, captions, and engagement metrics as viewed by real users.
  • Polling and error handling enhance scraper stability: The code waits for completion, verifies response status, and manages errors such as timeouts, invalid JSON, or missing data. These strategies ensure that TikTok web scrapers remain resilient against TikTok’s changing frontend.
  • Ethical scraping ensures long-term sustainability: The tutorial’s design adheres to best practices, including scraping only publicly visible data, incorporating delay logic, and avoiding endpoints blocked by TikTok’s robots.txt or Terms of Service.

CTO
Sedat Dogan
Sedat is a technology and information security leader with experience in software development, web data collection and cybersecurity. Sedat:
- Has 20 years of experience as a white-hat hacker and development guru, with extensive expertise in programming languages and server architectures.
- Is an advisor to C-level executives and board members of corporations with high-traffic and mission-critical technology operations like payment infrastructure.
- Has extensive business acumen alongside his technical expertise.
Researched by
Gulbahar Karatas
Industry Analyst
Gülbahar is an AIMultiple industry analyst focused on web data collection, applications of web data and application security.