How to Scrape TikTok Comments with Python (2026): API Approach That Actually Works

12 min read

Scrape TikTok comments with Python in ~40 lines: one GET request, cursor pagination, CSV export. Real endpoint, real response shapes, no Selenium.

The fastest way to scrape TikTok comments with Python is a single GET request to a comments API — no browser automation, no reverse-engineered mobile endpoints, no session cookies that expire mid-run. Here's the whole thing:

import requests

resp = requests.get(
    "https://www.socialcrawl.dev/v1/tiktok/post/comments",
    params={"url": "https://www.tiktok.com/@charlidamelio/video/7321485815660738859"},
    headers={"x-api-key": "sc_your_api_key_here"},
)
comments = resp.json()["data"]["items"]

That returns one page of comments (typically 10–30, the upstream decides) with text, author, like count, reply count, and timestamp — normalized into one schema. The rest of this tutorial covers what the response looks like, how to paginate through every comment on a video, how to export to CSV with pandas, how to handle errors without losing credits, and what the math looks like when you scale to hundreds of videos.

Stack: Python 3.10+ · requests · pandas (optional, for CSV) · a SocialCrawl API key (free tier: 100 credits)


Why doesn't the official TikTok API give you comments?

TikTok has two official APIs, and neither works for this job.

The TikTok Research API does expose comment data — but access is limited to academic researchers at qualifying non-profit institutions in supported regions, with an application process that reviews your research proposal. If you're a developer building a product, a marketer running comment analysis, or a data engineer feeding a pipeline, you don't qualify. That's not a loophole problem; it's the stated scope of the program.

The Display API and Commercial Content API, which regular developers can register for, cover profile display, video publishing, and ad library data. Comments are not in the surface area.

That leaves two unofficial routes: scrape TikTok's web frontend yourself (Selenium/Playwright against an aggressively bot-defended SPA whose internal endpoints change without notice), or use a third-party API that maintains that scraping infrastructure for you and sells the output as a stable contract. The first option is a maintenance subscription paid in your own evenings — covered in detail in how to get TikTok data without the API. This tutorial takes the second route.

Setting up: one package, one key

Install requests, the only required dependency, plus the optional pandas:

pip install requests pandas

(pandas is just for the CSV step — skip it if you're writing rows yourself.)

Get an API key at socialcrawl.dev — sign up, then Dashboard → API Keys. Keys are formatted sc_ plus 32 random bytes and shown in full once, so copy it immediately. New accounts get 100 free credits, and the comments endpoint costs 1 credit per page, so the free tier covers roughly 100 pages — call it 1,000–3,000 comments — before you pay anything. The quickstart covers key management in more detail.

Keep the key out of your source:

export SOCIALCRAWL_API_KEY=sc_your_api_key_here
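
If the variable is missing, a bare os.environ lookup fails with a KeyError that says little about the cause. A minimal sketch of a friendlier loader follows; load_api_key is a hypothetical helper name, and the sc_ prefix check relies only on the key format described above:

```python
import os


def load_api_key(env_var: str = "SOCIALCRAWL_API_KEY") -> str:
    """Read the API key from the environment and fail early with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running the scraper")
    if not key.startswith("sc_"):
        # Keys are described as starting with "sc_"; anything else is likely a paste error.
        raise RuntimeError(f"{env_var} does not look like a SocialCrawl key (expected an sc_ prefix)")
    return key
```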

Fetching comments for a TikTok video

The endpoint is GET /v1/tiktok/post/comments. It takes exactly one required parameter — url, the full TikTok video URL — and two optional ones: cursor for pagination and trim for a slimmer response.

import os
import requests

API_KEY = os.environ["SOCIALCRAWL_API_KEY"]
BASE = "https://www.socialcrawl.dev/v1"

VIDEO_URL = "https://www.tiktok.com/@charlidamelio/video/7321485815660738859"

resp = requests.get(
    f"{BASE}/tiktok/post/comments",
    params={"url": VIDEO_URL},
    headers={"x-api-key": API_KEY},
    timeout=30,
)
payload = resp.json()

Every SocialCrawl response — this endpoint and the other 189 — arrives in the same envelope:

{
  "success": true,
  "platform": "tiktok",
  "endpoint": "/v1/tiktok/post/comments",
  "data": {
    "items": [
      {
        "id": "7321501234567890123",
        "url": null,
        "parent_id": null,
        "post_id": "7321485815660738859",
        "text": "the dunkin lore continues",
        "author": {
          "username": "commenter_handle",
          "display_name": "Commenter Name",
          "avatar_url": "https://p16-sign-va.tiktokcdn.com/...",
          "verified": false
        },
        "engagement": {
          "likes": 1043,
          "replies": 12
        },
        "flags": {
          "pinned": false,
          "deleted": false
        },
        "published_at": "2026-05-30T14:22:08Z"
      }
    ],
    "next_cursor": "20",
    "total": 8412
  },
  "credits_used": 1,
  "credits_remaining": 99,
  "request_id": "req-abc123",
  "cached": false
}

Three things worth noting before you write the loop:

  1. The comment shape is canonical. text, author.username, engagement.likes, engagement.replies, flags.pinned, published_at — these field names are identical on the Instagram, YouTube, Facebook, and Reddit comment endpoints too. If you later expand beyond TikTok, your parser doesn't change.
  2. data.total is an upstream estimate. TikTok's count and the number of comments you can actually paginate to frequently disagree. Never use total as a loop condition.
  3. Cache hits are free. If someone (including you, five minutes ago) requested the same video's comments recently, the response comes back with X-Cache: HIT and credits_used: 0.
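
To make those three notes concrete, here is a small sketch that pulls the bookkeeping fields out of one response. summarize_page is a hypothetical helper; it reads only the fields shown in the envelope above plus the X-Cache header, and it deliberately reports total as an estimate instead of using it for control flow:

```python
from typing import Optional


def summarize_page(payload: dict, headers: Optional[dict] = None) -> dict:
    """Collect the paging, credit, and cache fields from one response envelope."""
    headers = headers or {}
    data = payload.get("data") or {}
    return {
        "items": len(data.get("items", [])),
        # Only next_cursor says whether another page exists.
        "has_more": bool(data.get("next_cursor")),
        # Upstream estimate; informational only, never a loop condition.
        "estimated_total": data.get("total"),
        "credits_used": payload.get("credits_used"),
        "credits_remaining": payload.get("credits_remaining"),
        "cache_hit": payload.get("cached") is True or headers.get("X-Cache") == "HIT",
    }
```

With requests you would call it as summarize_page(resp.json(), resp.headers); the headers object there is case-insensitive, whereas a plain dict is not.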

Paginating through every comment

The response carries data.next_cursor. You pass it back as the cursor query parameter on the next request — verbatim, no decoding or arithmetic — and stop when next_cursor is missing or null. That's the entire contract, documented in full on the pagination page.

def scrape_tiktok_comments(video_url: str, max_pages: int = 50) -> list[dict]:
    """Fetch all comments on a TikTok video, one page per credit."""
    comments: list[dict] = []
    cursor = None

    for _ in range(max_pages):
        params = {"url": video_url}
        if cursor is not None:
            params["cursor"] = cursor  # pass next_cursor back verbatim

        resp = requests.get(
            f"{BASE}/tiktok/post/comments",
            params=params,
            headers={"x-api-key": API_KEY},
            timeout=30,
        )
        payload = resp.json()

        if not payload.get("success"):
            raise RuntimeError(
                f"{payload['error']['type']}: {payload['error']['message']} "
                f"(request_id={payload['request_id']})"
            )

        data = payload["data"]
        comments.extend(data["items"])

        cursor = data.get("next_cursor")
        if not cursor:
            break  # no more pages

    return comments


comments = scrape_tiktok_comments(VIDEO_URL)
print(f"Fetched {len(comments)} comments")

The max_pages guard matters more than it looks. Viral videos can carry six-figure comment counts, and each page costs 1 credit — a runaway loop on a 400,000-comment video is a 13,000+ credit afternoon. Decide your per-video budget up front and encode it.
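
The arithmetic behind that warning is easy to sketch. The helper names below are hypothetical, and the 10 to 30 comments-per-page range is the typical figure quoted earlier, not a guarantee:

```python
def pages_needed(comment_count: int, per_page: int) -> int:
    """Ceiling division: pages (and therefore credits) needed at a given page size."""
    return (comment_count + per_page - 1) // per_page


def credit_range(comment_count: int, min_per_page: int = 10, max_per_page: int = 30) -> tuple:
    """Best-case and worst-case credit cost for paginating one video to the end."""
    best = pages_needed(comment_count, max_per_page)   # every page is full
    worst = pages_needed(comment_count, min_per_page)  # every page is small
    return best, worst


best, worst = credit_range(400_000)
print(f"400,000 comments: between {best:,} and {worst:,} credits")
```

Even the best case is more than 13,000 credits, which is why a per-video max_pages is worth setting before the first run.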

Exporting to CSV with pandas

The nested comment objects flatten cleanly:

import pandas as pd

df = pd.DataFrame(
    {
        "comment_id": c["id"],
        "text": c["text"],
        "username": c["author"]["username"],
        "verified": c["author"]["verified"],
        "likes": c["engagement"]["likes"],
        "replies": c["engagement"]["replies"],
        "pinned": c["flags"]["pinned"],
        "published_at": c["published_at"],
    }
    for c in comments
)

df = df.sort_values("likes", ascending=False)
df.to_csv("tiktok_comments.csv", index=False)
print(df.head(10)[["username", "likes", "text"]])

One data-quality note: comments TikTok has removed arrive with text: null and flags.deleted: true rather than a "[deleted]" placeholder string — the API collapses tombstone sentinels to null before you see them. If you're feeding the text into sentiment analysis or an LLM, filter on df["text"].notna() first.
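
If you skipped pandas, the same export works with the standard library. This is a sketch under the assumption that every comment follows the shape shown earlier; flatten_comment and write_comments_csv are hypothetical names, and removed comments are dropped by default:

```python
import csv

FIELDS = ["comment_id", "text", "username", "verified",
          "likes", "replies", "pinned", "published_at"]


def flatten_comment(c: dict) -> dict:
    """Flatten one nested comment object into a single CSV row."""
    author = c.get("author") or {}
    engagement = c.get("engagement") or {}
    flags = c.get("flags") or {}
    return {
        "comment_id": c.get("id"),
        "text": c.get("text"),
        "username": author.get("username"),
        "verified": author.get("verified"),
        "likes": engagement.get("likes"),
        "replies": engagement.get("replies"),
        "pinned": flags.get("pinned"),
        "published_at": c.get("published_at"),
    }


def write_comments_csv(comments, fileobj, skip_deleted: bool = True) -> int:
    """Write comments to an open text file; return the number of rows written."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for c in comments:
        deleted = c.get("text") is None or (c.get("flags") or {}).get("deleted")
        if skip_deleted and deleted:
            continue  # removed comments arrive with text set to null
        writer.writerow(flatten_comment(c))
        written += 1
    return written
```

Open the target with open("tiktok_comments.csv", "w", newline="", encoding="utf-8") so the csv module controls line endings.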

Handling errors and rate limits

Failures use the same envelope with success: false and a typed error object:

{
  "success": false,
  "error": {
    "type": "UPSTREAM_ERROR",
    "message": "TikTok returned an error for this request.",
    "status": 502,
    "doc_url": "https://www.socialcrawl.dev/docs/errors/upstream-error"
  },
  "credits_remaining": 87,
  "request_id": "req-def456"
}

The error types you'll actually encounter scraping comments, and what to do with each:

| Error type | Status | What it means | Your move |
| --- | --- | --- | --- |
| RESOURCE_NOT_FOUND | 404 | Video deleted, private, or URL typo'd | Skip the video. Credit refunded on empty upstream. |
| UPSTREAM_ERROR | 502 | TikTok hiccuped (or rejected a stale cursor) | Retry with backoff. Credit refunded. |
| SERVICE_UNAVAILABLE | 503 | Circuit breaker open for TikTok | Honor the Retry-After: 30 header. |
| CONCURRENCY_LIMIT | 429 | More than 50 concurrent requests on one key | Cap your worker pool below 50. |
| INSUFFICIENT_CREDITS | 402 | Balance below the 1-credit cost | Top up; the loop above raises before retrying. |

Note what's not in that table: a requests-per-minute quota. There isn't one — the only throughput ceiling is 50 concurrent requests per key. And the refund behavior is worth designing around: 502s, 503s, and empty-upstream 404s automatically refund the credit, so a retry wrapper doesn't double-spend. The full matrix lives in the error handling docs.

A minimal retry wrapper:

import time

def get_with_retry(url: str, params: dict, retries: int = 3) -> dict:
    for attempt in range(retries):
        resp = requests.get(url, params=params,
                            headers={"x-api-key": API_KEY}, timeout=30)
        payload = resp.json()
        if payload.get("success"):
            return payload
        if payload["error"]["status"] in (502, 503):
            time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
            continue  # credit was refunded; safe to retry
        raise RuntimeError(payload["error"]["message"])
    raise RuntimeError("retries exhausted")
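
In a batch job it can help to turn the error table into a single decision function, so the caller knows whether to retry, skip the video, or stop the run. A sketch using only the error types listed above; next_action and its return labels are hypothetical:

```python
RETRYABLE = {"UPSTREAM_ERROR", "SERVICE_UNAVAILABLE"}  # credit refunded, safe to retry
SKIPPABLE = {"RESOURCE_NOT_FOUND"}                     # video is gone or the URL is wrong


def next_action(payload: dict) -> str:
    """Map a response envelope to one of: ok, retry, skip, slow_down, stop."""
    if payload.get("success"):
        return "ok"
    error_type = (payload.get("error") or {}).get("type")
    if error_type in RETRYABLE:
        return "retry"
    if error_type in SKIPPABLE:
        return "skip"
    if error_type == "CONCURRENCY_LIMIT":
        return "slow_down"  # shrink the worker pool, then try again
    # INSUFFICIENT_CREDITS and anything unrecognized: stop rather than loop.
    return "stop"
```

Treating unknown error types as "stop" is a conservative choice: an unexpected failure ends the run instead of consuming retries.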

Scraping TikTok comments at scale

Single-video scraping is the demo. The real workload is usually "comments on every video this creator posted last month" or "comments on the top 50 videos for this hashtag." Both are two-endpoint pipelines on the same TikTok platform:

def video_urls_for_profile(handle: str, max_pages: int = 3) -> list[str]:
    """List a creator's video URLs via /v1/tiktok/profile/videos."""
    urls, cursor = [], None
    for _ in range(max_pages):
        params = {"handle": handle}
        if cursor:
            params["max_cursor"] = cursor  # note: this endpoint's param is max_cursor
        payload = get_with_retry(f"{BASE}/tiktok/profile/videos", params)
        urls += [v["url"] for v in payload["data"]["items"] if v.get("url")]
        cursor = payload["data"].get("next_cursor")
        if not cursor:
            break
    return urls


all_rows = []
for url in video_urls_for_profile("charlidamelio"):
    for c in scrape_tiktok_comments(url, max_pages=5):
        all_rows.append({**c, "video_url": url})

One subtlety the snippet encodes: the pagination response field is always next_cursor, but the request parameter name varies by endpoint — cursor on /tiktok/post/comments, max_cursor on /tiktok/profile/videos. The pagination table lists the exact parameter for all 59 paginatable endpoints.
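
One way to keep that difference out of your loops is a small lookup table feeding a single paginator. A sketch: CURSOR_PARAM covers only the two endpoints named above, and fetch is a hypothetical callable taking (endpoint, params) and returning the parsed envelope, for example a thin wrapper around get_with_retry:

```python
from typing import Callable, Iterator

# Request-side cursor parameter per endpoint; the response field is always next_cursor.
CURSOR_PARAM = {
    "/tiktok/post/comments": "cursor",
    "/tiktok/profile/videos": "max_cursor",
}


def paginate(endpoint: str, params: dict,
             fetch: Callable[[str, dict], dict],
             max_pages: int = 5) -> Iterator[dict]:
    """Yield items page by page, using the right cursor parameter for the endpoint."""
    cursor_param = CURSOR_PARAM[endpoint]
    cursor = None
    for _ in range(max_pages):
        page_params = dict(params)  # copy so the caller's dict is never mutated
        if cursor:
            page_params[cursor_param] = cursor  # pass next_cursor back verbatim
        data = fetch(endpoint, page_params)["data"]
        yield from data["items"]
        cursor = data.get("next_cursor")
        if not cursor:
            break
```

With the helpers from this tutorial, fetch could be lambda endpoint, p: get_with_retry(f"{BASE}{endpoint}", p).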

The credits math is flat enough to budget on a napkin. Both endpoints are standard tier — 1 credit per page:

| Workload | Requests | Credits |
| --- | --- | --- |
| One video, first page (~10–30 comments) | 1 | 1 |
| One video, ~2,000 comments (~20/page) | ~100 | ~100 |
| 50 videos × 5 pages of comments each | 250 + ~3 video pages | ~253 |
| 1,000 videos × 5 pages each | 5,000 + ~40 | ~5,040 |

The free tier's 100 credits cover the first two rows. Past that, pricing is pay-as-you-go and credits never expire — 5,000 credits is £14, and the Growth plan's 32,000 credits (£49) funds the 1,000-video row six times over. Repeated requests for the same video within the cache window cost 0 credits, which in practice discounts iterative development substantially: you'll re-run your parser against cached pages all afternoon for free.
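
The napkin math fits in a few lines if you would rather script it. The function names are hypothetical, and the inputs are the approximate figures from the table above:

```python
def workload_credits(videos: int, comment_pages_per_video: int, listing_pages: int) -> int:
    """Credits for a two-endpoint run: comment pages plus video-listing pages, 1 credit each."""
    return videos * comment_pages_per_video + listing_pages


def runs_funded(credit_balance: int, credits_per_run: int) -> int:
    """How many complete runs a balance covers."""
    return credit_balance // credits_per_run


small = workload_credits(videos=50, comment_pages_per_video=5, listing_pages=3)
large = workload_credits(videos=1_000, comment_pages_per_video=5, listing_pages=40)
print(small, large, runs_funded(32_000, large))
```

Cache hits are free, so real spend during development tends to land below these numbers.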

If you need the reply threads under specific comments, there's a sibling endpoint: GET /v1/tiktok/video/comment/replies takes the comment_id (the id from the comments response) plus the video url, paginates the same way, and costs the same 1 credit per page.
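
Because each replies page costs a credit, it is worth requesting threads only for comments that actually have replies. A sketch that builds the request parameters from a page of comments; it assumes the query parameters are literally named comment_id and url, as the description above suggests, and reply_requests is a hypothetical helper:

```python
REPLIES_PATH = "/tiktok/video/comment/replies"


def reply_requests(comments: list, video_url: str, min_replies: int = 1) -> list:
    """Build query parameters for each comment with at least min_replies replies."""
    params_list = []
    for c in comments:
        reply_count = (c.get("engagement") or {}).get("replies") or 0
        if reply_count >= min_replies:
            # Assumed parameter names: comment_id and url.
            params_list.append({"comment_id": c["id"], "url": video_url})
    return params_list
```

Each entry can then go through the same retry wrapper, for example get_with_retry(f"{BASE}{REPLIES_PATH}", params).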

Frequently asked questions

Is it legal to scrape TikTok comments?

Scraping publicly accessible data has repeatedly survived legal challenge in US courts (the hiQ v. LinkedIn line of cases), but ToS exposure, GDPR/CCPA obligations on the personal data in comments, and copyright in comment text are real considerations that depend on your jurisdiction and use case. The honest answer is longer than a FAQ: read the legal and technical guide to social media scraping before shipping anything commercial, and talk to a lawyer if revenue depends on the answer.

Do I need TikTok's approval to use a TikTok comments API?

No. The approval gate exists for TikTok's own Research API, which requires academic affiliation. A third-party comments API operates its own data collection and sells you the output — you sign up with the API provider, not with TikTok. There's no app review, no research proposal, and the key works in minutes rather than weeks.

How fresh is the comment data?

Requests hit TikTok live unless a recent identical request was cached. The response tells you which: cached: false means the data was fetched at request time; X-Cache: HIT (with credits_used: 0) means it came from the recent-request cache. For comment streams on actively viral videos, re-requesting after the cache window gives you the new state — and your earlier cached re-reads cost nothing.

What are the rate limits for scraping TikTok comments?

There is no requests-per-minute or daily quota. The two real constraints are 50 concurrent requests per API key (exceeding it returns a 429 CONCURRENCY_LIMIT envelope, no credits charged) and your credit balance. A ThreadPoolExecutor(max_workers=40) against the comments endpoint is comfortably inside the limit.
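
A sketch of what that pool might look like. scrape_many is a hypothetical helper, and scrape_one stands for any function that takes a video URL and returns its comments, such as the paginated scraper from earlier in this tutorial:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scrape_many(video_urls, scrape_one, max_workers: int = 40):
    """Run scrape_one over many URLs concurrently; return (results, errors) keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_one, url): url for url in video_urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failed video should not abort the whole batch.
                errors[url] = exc
    return results, errors
```

Each worker issues one request at a time, so max_workers bounds the number of in-flight requests, provided nothing else is using the same key.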

Can I get TikTok comments in JavaScript or TypeScript instead of Python?

Yes — it's one fetch. The endpoint, parameters, and envelope are identical:

const res = await fetch(
  "https://www.socialcrawl.dev/v1/tiktok/post/comments?url=" +
    encodeURIComponent("https://www.tiktok.com/@charlidamelio/video/7321485815660738859"),
  { headers: { "x-api-key": process.env.SOCIALCRAWL_API_KEY! } },
);
const { data } = await res.json();
console.log(data.items.length, "comments, next cursor:", data.next_cursor);

The same pagination loop applies: pass data.next_cursor back as cursor until it disappears.

Can the same code scrape comments from other platforms?

Almost. Instagram, YouTube, Facebook, and Reddit each have a comment-list endpoint returning the same canonical comment object (text, author, engagement.likes, engagement.replies, published_at), so your parsing and CSV code transfers unchanged. The differences are the input parameter (video URL vs. post URL vs. video ID) and the pagination parameter name per platform — both listed in the pagination docs.


Where to go from here

You now have a TikTok comment scraper that costs 1 credit per page and survives TikTok's frontend changes because maintaining that compatibility is the API's job, not yours. Three natural next steps:

  • Browse the other 17 TikTok endpoints — profiles, video search, hashtags, transcripts, followers — on the TikTok platform page.
  • Feed the comments to an LLM — the AI agent monitoring tutorial builds a Claude-powered classification loop on the same API in under 150 lines.
  • Start with the free tier: 100 credits, no card. That is about a hundred pages of comments, which is more than enough to find out whether the data answers your question.