In 2026, TikTok moved its U.S. operations to the TikTok USDS Joint Venture, managed by Oracle. This changed how the platform handles data and anti-bot measures.
To understand how well different tools handle TikTok data, we tested the leading TikTok scrapers by scraping 500 unique TikTok video URLs with each provider.
Best TikTok scraper tools: Feature & pricing comparison
- UI: User interface
- Dedicated: Provides a scraper API designed specifically for collecting data from TikTok.
- General-Purpose: Offers a scraper that is not designed specifically for TikTok but can be adapted for TikTok web scraping.
- Supports: Pages that return structured data.
TikTok scrapers benchmark results
Comparison of the best TikTok scrapers: Bright Data, Apify & more
Bright Data’s TikTok Scraper API offers direct endpoints for collecting profile, video, and hashtag data. The platform supports:
- Profile endpoint: Collect public fields, such as nickname, follower count, video count, and total likes.
- Post endpoint: Extract video URLs, hashtags, view counts, captions, and engagement metrics.
- Comment endpoint: Retrieve top-level comments and engagement ratios.
- Response formats: JSON, CSV, or streaming output for real-time data pipelines.
Bright Data handles IP rotation, browser emulation, and rate-limit control automatically. It’s best suited for teams needing large-scale, structured data feeds.
Get 25% off Bright Data’s TikTok Scraping APIs by entering the promo code API25.
Decodo offers a TikTok post scraper that collects comment threads and search results by country or keyword. The API supports XHR-only mode, which filters raw network responses to deliver precise JSON payloads to developers. This mode helps integrate TikTok post data into dashboards or NLP pipelines.
Save 30% with code: SCRAPE30
Apify provides a modular TikTok scraper actor that allows developers to gather public TikTok data via API or Node.js scripts. Here is how the Apify TikTok scraper actor works:
- Generate an API token from your Apify account.
- Install the apify-client package.
- Call the TikTok Scraper Actor with parameters such as:
  - region: The target region (e.g., “US”)
  - type: “HASHTAG”, “USER”, or “MUSIC”
  - url: The target TikTok hashtag or profile URL
  - limit: Number of videos to extract
- Export results via the Dataset API in JSON or CSV.
- Download videos using the video.play_addr.url_list[0] path.
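The steps above can be sketched in Python. This is a minimal sketch, not the actor's official usage: it only assembles the actor input from the parameters listed above and reads the download URL from a result item. The helper names `build_run_input` and `first_download_url` are hypothetical, the example hashtag URL is illustrative, and the actor ID and exact input schema are not given here, so the `apify-client` call is shown only as a comment and should be checked against the actor's own documentation.

```python
ALLOWED_TYPES = {"HASHTAG", "USER", "MUSIC"}


def build_run_input(region, scrape_type, url, limit):
    """Assemble the actor input from the parameters described above."""
    if scrape_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return {"region": region, "type": scrape_type, "url": url, "limit": limit}


def first_download_url(item):
    """Follow the video.play_addr.url_list[0] path; None if any part is missing."""
    try:
        return item["video"]["play_addr"]["url_list"][0]
    except (KeyError, IndexError, TypeError):
        return None


run_input = build_run_input("US", "HASHTAG", "https://www.tiktok.com/tag/cooking", 50)

# With apify-client installed, the call would look roughly like this.
# ACTOR_ID and API_TOKEN are placeholders; take them from your Apify account.
#   from apify_client import ApifyClient
#   client = ApifyClient(API_TOKEN)
#   run = client.actor(ACTOR_ID).call(run_input=run_input)
#   for item in client.dataset(run["defaultDatasetId"]).iterate_items():
#       print(first_download_url(item))
```

Validating the input before calling the actor keeps a typo in `type` from consuming a paid run.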
TikTok-specific strengths:
- Automatically handles dynamic JavaScript loading and pagination.
- Allows retrieval of engagement metrics, hashtags, and music IDs.
- Works with Python, Node.js, or cURL, supporting multi-language integration.
Nimble’s web scraping API offers proxy rotation and fingerprint evasion, improving the reliability of TikTok scraping. While not TikTok-exclusive, its residential proxy network and anti-bot bypass logic make it a strong choice for accessing public TikTok endpoints from different regions.
Octoparse offers multiple pre-built TikTok scraper templates for collecting post, profile, and comment data directly from TikTok’s public pages.
Unlike API-based tools such as Bright Data or Apify, Octoparse utilizes visual automation that replicates real user interactions through its browser emulator. Each template supports configuration for:
- Batch input (up to 10,000 TikTok URLs)
- Custom page size (50–200 results)
- Export options (Excel, CSV, JSON, or Google Sheets)
- Pricing tiers (Free: $0.4/1,000 lines – $2/1,000 lines for detailed video metadata)
Smarter bot detection and the rise of AI scrapers
Basic scraping scripts on platforms like GitHub are becoming ineffective. TikTok now uses ‘device integrity’ checks to distinguish real devices from automated bots. In response, the industry is adopting new approaches:
- AI Scrapers (Agentic): These tools use artificial intelligence to navigate TikTok like human users, automatically adjusting to changes in the website’s layout.
- AI-ready browsers (MCP): Emerging protocols such as the Model Context Protocol (MCP) enable AI models, including Claude and GPT, to control scrapers directly and automatically manage complex website restrictions.
Is it legal? Understanding TikTok’s 2026 scraping rules
It is usually legal to scrape public data, such as hashtags or view counts, for research, as long as you do not bypass login screens or access private information.
- US data rules: The USDS framework protects U.S. user data and prohibits sending it to servers outside the U.S. that lack compliance standards.
- Music restrictions: After a 2026 dispute with Universal Music Group (UMG), it has become harder to access music metadata, and many audio fields are now empty.
1. TikTok terms of service and scraping restrictions
TikTok’s Terms of Service explicitly prohibit automated access or scraping of non-public content. This includes:
- Logging in programmatically to view private or restricted accounts
- Circumventing CAPTCHA or authentication mechanisms
- Copying or redistributing TikTok’s code or media assets
However, collecting publicly visible metadata (such as usernames, captions, like counts, and hashtags) for research or analytics is generally considered permissible if it is done respectfully and without disrupting the platform.
2. TikTok robots.txt and crawling policy
The robots.txt file is a small text document that tells crawlers which parts of the TikTok website they can or cannot access. TikTok’s robots.txt includes disallow rules for paths such as /login, /ads, and other internal endpoints. A responsible TikTok data scraper should:
- Check robots.txt before crawling
- Respect rate limits (introduce delays between requests)
- Avoid restricted endpoints listed under Disallow
- Use APIs or browser-based renderers that fetch content exactly as a regular user would
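The checklist above can be sketched with Python's standard `urllib.robotparser`. This is a minimal sketch under stated assumptions: `SAMPLE_ROBOTS` is a shortened stand-in built only from the paths mentioned above (not TikTok's real file), and the `crawl` helper, the `research-bot` user agent, and the 2-second delay are illustrative choices.

```python
import time
import urllib.robotparser

# Assumption: a shortened stand-in for TikTok's robots.txt. In practice, load
# the live file with parser.set_url("https://www.tiktok.com/robots.txt")
# followed by parser.read().
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /login
Disallow: /ads
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(SAMPLE_ROBOTS.splitlines())


def crawl(urls, fetch, user_agent="research-bot", delay_seconds=2.0, sleep=time.sleep):
    """Fetch only the URLs robots.txt allows, pausing after each request."""
    results = {}
    for url in urls:
        if not parser.can_fetch(user_agent, url):
            continue  # skip paths listed under Disallow
        results[url] = fetch(url)
        sleep(delay_seconds)  # simple rate limiting between requests
    return results
```

Passing `fetch` and `sleep` as arguments keeps the robots.txt check separate from the HTTP client, so the same loop works with an API client or a browser-based renderer.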
3. Scraping TikTok data / What’s allowed and what’s not
Allowed:
- Gathering public metadata (captions, usernames, view counts, hashtags)
- Analyzing aggregated trends (without re-publishing individual videos)
- Using data for market research or AI model training with anonymization
Not Allowed:
- Accessing private user data, DMs, or login-only endpoints
- Scraping for commercial resale or content republishing
- Circumventing security layers or rate-limit enforcement
How to build a TikTok profile scraper in Python
If you prefer coding your own TikTok data scraper instead of using no-code tools, Python gives you complete control over what data you collect and how you process it. In this tutorial, you’ll learn how to scrape TikTok data such as usernames, captions, and engagement metrics using Python libraries.
Note: Always comply with TikTok’s robots.txt and Terms of Service when collecting public data.
This TikTok scraping tutorial shows you how to scrape TikTok profile data using Bright Data TikTok scraper to extract detailed post information.
Step 1: Set up Your Python TikTok scraper
To start TikTok scraping with Python, you first need to import the required libraries and configure your API credentials. This setup step prepares your environment for running the TikTok scraper script.
In this step, you’re importing essential Python packages used for sending HTTP requests, handling JSON responses, and managing data with Pandas. These libraries form the base of any Python TikTok scraper.
The script needs your API token and TikTok dataset ID to authenticate and connect to the platform. You can find both values inside your API dashboard under the TikTok scraper section.
Set the profile URL you want to analyze. This example utilizes a single TikTok profile scraper URL; however, you can easily modify it to include multiple competitor profiles for large-scale TikTok data scraping.
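A setup along these lines could look as follows. This is a minimal sketch, not the tutorial's original code: the environment variable name, the placeholder dataset ID, the example profile URL, and the helper names `auth_headers` and `request_payload` are all assumptions for illustration. It sticks to the standard library so it runs without extra installs; `requests` and `pandas` can be added on top as described above.

```python
import json
import os

# Assumptions: replace these placeholders with the values shown in your API
# dashboard. The variable name BRIGHTDATA_API_TOKEN is illustrative.
API_TOKEN = os.environ.get("BRIGHTDATA_API_TOKEN", "YOUR_API_TOKEN")
DATASET_ID = "YOUR_TIKTOK_DATASET_ID"

# One profile here; add more competitor profile URLs to the list as needed.
PROFILE_URLS = ["https://www.tiktok.com/@example_profile"]


def auth_headers(token=API_TOKEN):
    """Headers sent with every API call (bearer-token authentication assumed)."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def request_payload(urls=PROFILE_URLS):
    """Serialize one input object per profile URL as a JSON array."""
    return json.dumps([{"url": url} for url in urls])
```

Reading the token from the environment keeps credentials out of the script and out of version control.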
Step 2: Trigger TikTok Scraping with the scraper API
This step activates the TikTok scraping job and begins retrieving the data from your selected profiles.
Here, you’re making a POST request to Bright Data’s trigger endpoint using your API token and TikTok dataset ID. This API call tells your custom TikTok scraper to start scraping the specified TikTok profile URL.
Once the request is successful, the scraper returns a snapshot_id, which uniquely identifies this TikTok scraper job. You’ll use this ID in the next step to check the scraping status and retrieve the collected TikTok data.
If the request fails, the script exits safely with an error message. This ensures that your Python TikTok scraper stops running if authentication or endpoint issues occur.
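The trigger step described above might be sketched like this. Treat it as an illustration under assumptions rather than the tutorial's original script: the endpoint URL and query parameters are assumed from Bright Data's dataset trigger API and should be verified against the current API reference, and `build_trigger_request` and `trigger_job` are hypothetical helper names. It uses the standard library's `urllib` in place of `requests`.

```python
import json
import sys
import urllib.error
import urllib.parse
import urllib.request

# Assumption: trigger endpoint path; confirm it in the provider's API docs.
TRIGGER_URL = "https://api.brightdata.com/datasets/v3/trigger"


def build_trigger_request(api_token, dataset_id, profile_urls):
    """Build the POST request that starts a scraping job for the given URLs."""
    query = urllib.parse.urlencode({"dataset_id": dataset_id, "format": "json"})
    body = json.dumps([{"url": url} for url in profile_urls]).encode("utf-8")
    return urllib.request.Request(
        f"{TRIGGER_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def trigger_job(request, send=urllib.request.urlopen):
    """Send the request and return the snapshot_id, exiting on any failure."""
    try:
        with send(request, timeout=30) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, ValueError) as exc:
        # URLError covers HTTP and network errors; ValueError covers bad JSON.
        sys.exit(f"Trigger request failed: {exc}")
    snapshot_id = payload.get("snapshot_id") if isinstance(payload, dict) else None
    if not snapshot_id:
        sys.exit(f"No snapshot_id in response: {payload}")
    return snapshot_id
```

A call such as `trigger_job(build_trigger_request(API_TOKEN, DATASET_ID, PROFILE_URLS))` would return the ID used in the next step. The `send` parameter exists only so the function can be exercised without network access.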
Step 3: Retrieve & save the scraped TikTok data
Once the scraping job is complete, it’s time to retrieve your TikTok data and export it for analysis. The following Python script waits for Bright Data’s API to finish processing, then downloads and saves the results into a structured dataset.
The script uses a polling loop to check the snapshot status through the TikTok Scraper API until the dataset is ready, then retrieves the data file and saves it locally.
Here’s how it works:
- Polling with timeout: The scraper checks for completion every 10 seconds with a 15-minute cap.
- Data retrieval: Once the API status returns “ready” or “done”, the script downloads the collected TikTok post data.
- NDJSON parsing: Each record is processed line by line into Python dictionaries.
- Data organization: The code extracts post IDs, engagement metrics (likes, comments, shares, plays), hashtags, and descriptions.
- Export: The data is structured into a Pandas DataFrame and saved as tiktok_competitor_analysis.csv.
- Error handling: Try-except blocks catch exceptions when unexpected or missing fields are encountered.
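The behavior listed above can be sketched as follows. This is a minimal sketch, not the original script: the record field names (`post_id`, `digg_count`, `comment_count`, `share_count`, `play_count`, `hashtags`, `description`) and the `"failed"` status are assumptions and may differ from the real API output, and the helper names are hypothetical. It uses the standard `csv` module instead of Pandas so it runs without extra installs.

```python
import csv
import json
import time

COLUMNS = ["post_id", "likes", "comments", "shares", "plays", "hashtags", "description"]


def wait_until_ready(get_status, interval=10, timeout=15 * 60, sleep=time.sleep):
    """Poll get_status() every `interval` seconds, giving up after `timeout`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("ready", "done"):
            return status
        if status == "failed":  # assumption: the API reports failures this way
            raise RuntimeError("Scraping job failed")
        sleep(interval)
    raise TimeoutError("Snapshot was not ready within the time limit")


def parse_ndjson(text):
    """Parse one JSON object per line, skipping blank or malformed lines."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            records.append(record)
    return records


def to_row(record):
    """Pick the analysis fields; missing fields fall back to defaults."""
    try:
        hashtags = ", ".join(str(tag) for tag in (record.get("hashtags") or []))
    except TypeError:
        hashtags = ""
    return {
        "post_id": record.get("post_id"),
        "likes": record.get("digg_count", 0),
        "comments": record.get("comment_count", 0),
        "shares": record.get("share_count", 0),
        "plays": record.get("play_count", 0),
        "hashtags": hashtags,
        "description": record.get("description", ""),
    }


def write_csv(rows, fileobj):
    """Write rows with a header; pandas' DataFrame(rows).to_csv() is equivalent."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Usage, given raw_text downloaded from the snapshot endpoint:
#   with open("tiktok_competitor_analysis.csv", "w", newline="", encoding="utf-8") as f:
#       write_csv([to_row(r) for r in parse_ndjson(raw_text)], f)
```

Here `get_status` is any function that returns the current status string, for example one that queries the snapshot status endpoint with the `snapshot_id` from the previous step.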
TikTok scraper Python GitHub vs. agentic AI solutions
Based on internal tests across multiple TikTok pages (profiles, hashtags, and comment threads), browser-based scraping approaches proved significantly more reliable than static request methods.
Tools like Bright Data and Python’s Playwright maintained access for longer periods, while lightweight HTTP-based scrapers frequently failed to capture dynamic content.
Browser-based scraping is the most reliable approach:
A Playwright-based Python script renders dynamic JavaScript content, so it can capture videos, captions, and engagement metrics as real users see them.
Polling and error handling enhance scraper stability:
The code waits for completion, verifies the response status, and manages errors such as timeouts, invalid JSON, or missing data. These strategies help TikTok web scrapers remain resilient against the platform’s frequently changing frontend.
Ethical scraping ensures long-term sustainability:
The tutorial’s design adheres to best practices, including scraping only publicly visible data, incorporating delay logic, and avoiding endpoints blocked by TikTok’s robots.txt or Terms of Service.
TikTok scrapers methodology
We benchmarked web data scrapers to evaluate their ability to scrape TikTok video data. We ran 500 video URLs per provider, with each video tested once.
- Dataset: We used a curated list of 500 TikTok video URLs spanning diverse content categories and engagement levels.
- Target: Each provider scraped individual video metadata, including descriptions, creation times, video durations, comment counts, and other engagement metrics.
- Runs: We performed 1 run per video.
Success rates:
We defined three levels of success:
Submission success: We considered a submission successful if the API accepted our initial request (HTTP 200/202) without authentication or rate limit errors.
Execution success: We considered an execution successful if the scraping job completed without timeout or system errors.
Validation success: We applied a set of rules to ensure data quality and usability. We considered a result VALID only if it met at least 60% of the validation criteria below, with at least 3 of 5 criteria passing.
A trial that fails at any earlier stage cannot proceed to later stages and is recorded as a failed trial in the final validation calculation. For example, if a request fails during submission, it receives a validation score of 0. The final validation success rate includes all trials across all stages.
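The three-stage funnel can be illustrated with a short calculation. This is a sketch under assumptions: the trial record layout and the `success_rates` helper are hypothetical, and dividing every stage by the total number of trials is an assumption extended from the rule stated above for the final validation rate.

```python
def success_rates(trials):
    """Compute per-stage success rates over all trials.

    Each trial is a dict with boolean keys 'submitted', 'executed', 'valid'.
    A trial only counts at a stage if it passed every earlier stage.
    """
    total = len(trials)
    if total == 0:
        return {"submission": 0.0, "execution": 0.0, "validation": 0.0}
    submitted = sum(1 for t in trials if t["submitted"])
    executed = sum(1 for t in trials if t["submitted"] and t["executed"])
    valid = sum(1 for t in trials if t["submitted"] and t["executed"] and t["valid"])
    return {
        "submission": submitted / total,
        "execution": executed / total,
        "validation": valid / total,
    }


example = [
    {"submitted": False, "executed": False, "valid": False},  # rejected at submission
    {"submitted": True, "executed": False, "valid": False},   # timed out
    {"submitted": True, "executed": True, "valid": False},    # failed validation
    {"submitted": True, "executed": True, "valid": True},     # fully successful
]
rates = success_rates(example)
```

With these four example trials, the rates come out to 0.75 for submission, 0.5 for execution, and 0.25 for validation.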
Validation Criteria
We validated five key fields to ensure data accuracy and completeness:
1. URL Validation
- Video ID must match exactly between the requested and scraped URLs
- Example: Extract 7557884684533910815 from both URLs and verify match
2. Description Validation
- At least 3 common words are required between the ground truth and scraped text
- Skipped if ground truth has fewer than 3 words
- Method: Tokenize (lowercase, alphanumeric only) and count matches
3. Create Time Validation
- Within ±2 minutes OR ±24 hours
- Accounts for timing discrepancies and timezone differences
4. Video Duration Validation
- Within ±2 seconds tolerance
- Tight tolerance suitable for TikTok’s typical 15-180 second videos
5. Comment Count Validation
- Logarithmic + 5% tolerance: max(count × 0.05, log₁₀(count + 1) × 5, 3)
- Relative tolerance is wider for small counts, where the floor of 3 and the logarithmic term dominate, and settles at 5% for large counts
- Examples: 2 → [0, 5] | 100 → [90, 110] | 1000 → [950, 1050]
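Four of the criteria above translate directly into small checks. This is a minimal sketch, not the benchmark's actual code: the function names are hypothetical, the `/video/<digits>` URL pattern is an assumption about TikTok video URLs, and counting common words as distinct tokens is one reading of "count matches". The create-time check is left out because its exact rule is not specified precisely enough here.

```python
import math
import re


def video_id(url):
    """Extract the numeric ID after /video/ (assumed URL shape), or None."""
    match = re.search(r"/video/(\d+)", url)
    return match.group(1) if match else None


def url_ok(requested, scraped):
    """IDs must match exactly between the requested and scraped URLs."""
    requested_id = video_id(requested)
    return requested_id is not None and requested_id == video_id(scraped)


def tokens(text):
    """Lowercase the text and keep distinct alphanumeric words only."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def description_ok(truth, scraped, min_common=3):
    """Return None (skipped) if the ground truth has fewer than 3 words."""
    if len(truth.split()) < 3:
        return None
    return len(tokens(truth) & tokens(scraped)) >= min_common


def duration_ok(truth_seconds, scraped_seconds, tolerance=2):
    """Durations must agree within +/-2 seconds."""
    return abs(truth_seconds - scraped_seconds) <= tolerance


def comment_tolerance(count):
    """max(count x 0.05, log10(count + 1) x 5, 3), as defined above."""
    return max(count * 0.05, math.log10(count + 1) * 5, 3)


def comment_count_ok(truth, scraped):
    """The scraped count must fall within the tolerance around the truth."""
    return abs(truth - scraped) <= comment_tolerance(truth)
```

For a ground-truth count of 100 the tolerance is about 10.02, which rounds to the [90, 110] range given in the examples.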
A result is VALID if at least 3 out of 5 non-null criteria pass (60% threshold). Criteria are skipped only when the ground truth is null. If ground truth exists for a criterion but the scraped value is null, that criterion is marked as failed and counted in the validation calculation.
A video scrape result is considered VALID if:
- At least 3 out of 5 criteria pass, OR
- At least 60% of non-null criteria pass
This approach accounts for cases where certain fields may be legitimately unavailable while still requiring majority accuracy across available data points.
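The validity rule can be expressed as a single function. This is a sketch under assumptions: `is_valid` is a hypothetical name, each criterion is represented as `True` (pass), `False` (fail), or `None` (skipped because the ground truth is null), and treating a result with no checkable criteria as not valid is an assumption the methodology does not state.

```python
def is_valid(results):
    """Apply the rule: at least 3 passes, or at least 60% of non-null criteria.

    `results` holds one entry per criterion: True, False, or None (skipped).
    """
    checked = [r for r in results if r is not None]
    if not checked:
        return False  # assumption: nothing verifiable counts as not valid
    passed = sum(1 for r in checked if r)
    return passed >= 3 or passed / len(checked) >= 0.6


print(is_valid([True, True, True, False, False]))   # 3 of 5 pass
print(is_valid([True, True, False, None, None]))    # 2 of 3 non-null pass
print(is_valid([True, False, None, None, None]))    # 1 of 2 non-null pass
```

The second case shows why the percentage branch matters: two passes out of three available criteria is about 67%, so the result counts as valid even though fewer than three criteria passed.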
Broken URL detection
We automatically skipped videos with broken or unavailable URLs. Detection included:
- HTTP 404 errors
- “Video not found” or “Video removed” messages
- “Video unavailable” or “Content removed” errors
- TikTok-specific errors (e.g., “aweme not found”)
However, there were no broken URLs in our dataset, so we did not need to exclude any videos.
Available metadata
We counted the number of structured data fields returned by each provider, including:
- Core fields: video ID, description, create time, duration, comment count
- Engagement metrics: likes, shares, views, play count
- Author information: username, nickname, follower count
- Additional metadata: hashtags, music info, video quality, captions