100 free credits, no credit card required. Start building

Integrate once. Every platform returns the same shape.

Every social platform speaks a different language. SocialCrawl normalizes 42 platforms and 325 endpoints into one canonical schema, validated on every response, so you write one integration instead of a dozen.

The problem

Why is social data so hard to work with?

Reddit calls it score, TikTok nests it under aweme_info.statistics, and Instagram returns reels flat in one endpoint and nested in another. A developer who wants the engagement on a post across five platforms normally writes five integrations and five sets of types.

TikTok: aweme_info.statistics.digg_count
Instagram: edge_media.reels[].nodes
Reddit: data.score / ups
+ dozens more platforms, each with its own raw shape

How it works

How does SocialCrawl normalize every platform into one shape?

Every raw payload runs through the same five-stage pipeline before it reaches you. Nothing is guessed, and nothing undeclared slips through.

  1. Strip the envelope

     Unwrap each platform's raw payload and normalize lists to one { items, next_cursor, total } shape.

  2. Field map

     A declarative source-to-target rename. Subtractive by design, so junk and surprises never reach the customer.

  3. Enrich hook

     An additive per-platform step for what the map cannot express: absolute-URL construction, carousel flattening, boolean coercion.

  4. Normalize + null backstop

     Null backstops, ID-prefix stripping, tombstone collapse. We never substitute a zero for we-do-not-know.

  5. Validate (Zod gate)

     Every response is checked against the canonical Zod schema at the wire. Invalid rows are dropped, never passed on.
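
The five stages above can be sketched in a few small functions. This is a minimal illustration under stated assumptions, not SocialCrawl's implementation: the raw envelope layout, the field names in FIELD_MAP, the "t<digit>_" ID prefix, the tombstone strings, and the example.com base URL are all hypothetical, and a hand-written check stands in for the Zod schema so the snippet has no dependencies.

```typescript
// Hypothetical canonical subset; the real PostObject fields are not listed in the source.
interface SketchPost {
  id: string;
  text: string | null;
  like_count: number | null;
  url: string | null;
}
interface SketchPage {
  items: SketchPost[];
  next_cursor: string | null;
  total: number | null;
}
type RawRow = { [key: string]: unknown };

// Stage 1: strip the envelope. The { data: { children, after } } wrapper is an assumed layout.
function stripEnvelope(raw: RawRow): { rows: RawRow[]; next: string | null } {
  const data = (raw["data"] || {}) as RawRow;
  const children = data["children"];
  const after = data["after"];
  return {
    rows: Array.isArray(children) ? (children as RawRow[]) : [],
    next: typeof after === "string" ? after : null,
  };
}

// Stage 2: declarative source-to-target map. Keys that are not listed are dropped.
const FIELD_MAP: Array<[string, string]> = [
  ["name", "id"],
  ["body", "text"],
  ["score", "like_count"],
  ["permalink", "url"],
];
function applyFieldMap(row: RawRow): RawRow {
  const out: RawRow = {};
  for (const pair of FIELD_MAP) {
    if (pair[0] in row) out[pair[1]] = row[pair[0]];
  }
  return out;
}

// Stage 3: enrich hook. Only absolute-URL construction is shown; example.com is a placeholder.
function enrich(row: RawRow): RawRow {
  const url = row["url"];
  if (typeof url === "string" && url.indexOf("/") === 0) {
    return { ...row, url: "https://example.com" + url };
  }
  return row;
}

// Stage 4: strip an assumed ID prefix, collapse tombstones, backstop unknowns with null.
const TOMBSTONES = ["[deleted]", "[removed]"];
function normalize(row: RawRow): RawRow {
  const id = row["id"];
  const text = row["text"];
  const likes = row["like_count"];
  const url = row["url"];
  return {
    id: typeof id === "string" ? id.replace(/^t\d_/, "") : null,
    text: typeof text === "string" && TOMBSTONES.indexOf(text) === -1 ? text : null,
    like_count: typeof likes === "number" ? likes : null, // unknown stays null, never 0
    url: typeof url === "string" ? url : null,
  };
}

// Stage 5: validation gate. A hand-written check stands in for the Zod schema.
function isSketchPost(row: RawRow): boolean {
  return (
    typeof row["id"] === "string" &&
    row["id"] !== "" &&
    (row["text"] === null || typeof row["text"] === "string") &&
    (row["like_count"] === null || typeof row["like_count"] === "number")
  );
}

function runPipeline(raw: RawRow): SketchPage {
  const stripped = stripEnvelope(raw);
  const items: SketchPost[] = [];
  for (const row of stripped.rows) {
    const candidate = normalize(enrich(applyFieldMap(row)));
    // Rows that fail the gate are dropped rather than passed on.
    if (isSketchPost(candidate)) items.push(candidate as unknown as SketchPost);
  }
  return { items: items, next_cursor: stripped.next, total: null };
}

const samplePage = runPipeline({
  data: {
    after: "cursor_abc",
    children: [
      { name: "t3_abc", body: "hello", score: 12, permalink: "/r/x/1", tracking_blob: "junk" },
      { name: "t1_def", body: "[deleted]", permalink: "/r/x/2" },
      { body: "no id, so this row is dropped" },
    ],
  },
});
```

In this sketch the unlisted tracking_blob key never reaches the output, the deleted body collapses to null, the missing score stays null instead of becoming 0, and the row without an ID is dropped at the gate.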

Computed intelligence

What does SocialCrawl add on top of the raw data?

Every response carries a deterministic computed block: an engagement_rate normalized to the range 0 to 1 and comparable across platforms, a language detected from 33 ISO codes, a content_category, and an estimated_reach.

"computed": {
  "engagement_rate": 0.043,
  "language": "en",
  "content_category": "entertainment",
  "estimated_reach": 128400
}

These values come from deterministic arithmetic, not machine learning. Below a confidence floor we return null rather than a guess, and we never substitute a zero for we-do-not-know.
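
To make the null-versus-zero rule concrete, here is a small sketch of how an engagement_rate could be computed. The formula (interactions divided by views, clamped to the range 0 to 1) and the input field names are assumptions for illustration; the source does not state which counters feed the real value.

```typescript
// Hypothetical input counters; the source does not list which fields feed engagement_rate.
interface SketchCounters {
  likes: number | null;
  comments: number | null;
  shares: number | null;
  views: number | null;
}

// Assumed formula: (likes + comments + shares) / views, clamped to [0, 1], 3 decimals.
// Returns null, never 0, when any input is unknown or the denominator is unusable.
function sketchEngagementRate(c: SketchCounters): number | null {
  if (c.likes === null || c.comments === null || c.shares === null || c.views === null) {
    return null;
  }
  if (c.views <= 0) return null;
  const raw = (c.likes + c.comments + c.shares) / c.views;
  const clamped = Math.min(1, Math.max(0, raw));
  return Math.round(clamped * 1000) / 1000;
}

const knownRate = sketchEngagementRate({ likes: 4000, comments: 200, shares: 100, views: 100000 });
const unknownRate = sketchEngagementRate({ likes: 4000, comments: 200, shares: 100, views: null });
const measuredZero = sketchEngagementRate({ likes: 0, comments: 0, shares: 0, views: 500 });
```

Here knownRate is 0.043, unknownRate is null because the view count is missing, and measuredZero is a genuine 0. Keeping the last two distinct is the point of the rule.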

LLMs narrate, code computes.

The proprietary edge

How do you keep the schema and the docs from drifting apart?

One canonical schema, written once in Zod, is the source of truth for every endpoint on every platform. Rename a single field and the docs and validation update themselves.

One source of truth

The canonical schema lives once, in Zod. Every endpoint on every platform conforms to it, from PostObject to QuoteObject.

Mechanical cascade

One function walks the schema tree and drives the normalizer, the CI coverage gate, and the OpenAPI docs from the same source.

Validated on every response

Strict in CI, so drift fails the build. Forgiving in production, so a bad row is dropped and the customer never receives a malformed record.

Tombstones rejected

Deleted and removed sentinels fail a schema check, so a deleted comment can never masquerade as content in your response.
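
The idea of deriving validation and docs from one definition can be illustrated with a tiny schema descriptor. This is a dependency-free sketch, not the Zod-based setup described above: the descriptor format, the field list, the tombstone strings, and the function names are all hypothetical.

```typescript
// Hypothetical mini schema descriptor; the source uses Zod, which this sketch does not import.
type SketchRow = { [key: string]: unknown };
interface SketchField {
  key: string;
  kind: "string" | "number";
  nullable: boolean;
  doc: string;
}

// The single source of truth for this sketch.
const COMMENT_SCHEMA: SketchField[] = [
  { key: "id", kind: "string", nullable: false, doc: "Canonical comment ID" },
  { key: "text", kind: "string", nullable: false, doc: "Comment body" },
  { key: "like_count", kind: "number", nullable: true, doc: "Likes, or null when unknown" },
];

// Assumed sentinel strings for deleted or removed content.
const TOMBSTONE_VALUES = ["[deleted]", "[removed]"];

// Derived artifact 1: a validator. An empty list means the row is valid.
function schemaProblems(row: SketchRow): string[] {
  const problems: string[] = [];
  for (const field of COMMENT_SCHEMA) {
    const value = row[field.key];
    if (value === null || value === undefined) {
      if (!field.nullable) problems.push(field.key + ": required");
    } else if (typeof value !== field.kind) {
      problems.push(field.key + ": expected " + field.kind);
    } else if (typeof value === "string" && TOMBSTONE_VALUES.indexOf(value) !== -1) {
      problems.push(field.key + ": tombstone sentinel");
    }
  }
  return problems;
}

// Derived artifact 2: documentation rows generated from the same descriptor.
function schemaDocs(): string[] {
  return COMMENT_SCHEMA.map(
    (f) => f.key + " (" + f.kind + (f.nullable ? " | null" : "") + "): " + f.doc
  );
}

// Strict mode throws, as a CI gate might; lenient mode drops the bad row.
function gateRows(rows: SketchRow[], strict: boolean): SketchRow[] {
  const kept: SketchRow[] = [];
  for (const row of rows) {
    const problems = schemaProblems(row);
    if (problems.length === 0) {
      kept.push(row);
    } else if (strict) {
      throw new Error("schema drift: " + problems.join("; "));
    }
  }
  return kept;
}

const sampleRows: SketchRow[] = [
  { id: "c1", text: "nice", like_count: null },
  { id: "c2", text: "[deleted]", like_count: 3 },
];
const lenientResult = gateRows(sampleRows, false);
```

Because schemaProblems and schemaDocs both read COMMENT_SCHEMA, renaming a key there changes the validator and the generated docs together. In lenient mode the tombstoned row is dropped and one row remains; in strict mode the same input throws.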

“The documentation cannot drift from the implementation, because both derive from the same source.”

One schema, every object

Does the unified schema cover more than social posts?

The same discipline extends across every object we return. A Google Play app and an App Store app deserialize into the same AppObject. QuoteObject spans stocks, ETFs, crypto, and forex in one shape.
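
As a rough illustration of two stores landing in one object, the sketch below maps two differently shaped raw records into a single type. The canonical fields shown and both raw layouts are assumptions made for this example; the real AppObject field list is not given here.

```typescript
// Hypothetical canonical subset; the real AppObject fields are not given in the source.
interface SketchApp {
  app_id: string;
  title: string;
  rating: number | null;
  store: "google_play" | "app_store";
}

// Both raw layouts are assumed for illustration.
interface RawPlayApp {
  appId: string;
  title: string;
  score?: number;
}
interface RawAppStoreApp {
  trackId: number;
  trackName: string;
  averageUserRating?: number;
}

function fromPlay(raw: RawPlayApp): SketchApp {
  return {
    app_id: raw.appId,
    title: raw.title,
    rating: typeof raw.score === "number" ? raw.score : null,
    store: "google_play",
  };
}

function fromAppStore(raw: RawAppStoreApp): SketchApp {
  return {
    app_id: String(raw.trackId),
    title: raw.trackName,
    rating: typeof raw.averageUserRating === "number" ? raw.averageUserRating : null,
    store: "app_store",
  };
}

// One consumer serves both stores because it only ever sees SketchApp.
function describeApp(app: SketchApp): string {
  const rating = app.rating === null ? "unknown" : app.rating.toFixed(1);
  return app.title + " [" + app.store + "] rating=" + rating;
}

const playLine = describeApp(fromPlay({ appId: "com.example.notes", title: "Notes", score: 4.5 }));
const appStoreLine = describeApp(fromAppStore({ trackId: 123, trackName: "Notes" }));
```

The consumer function never branches on the source store, which is the practical benefit of a shared object.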

PostObject, CommentObject, AuthorObject, ProductObject, ReviewObject, SellerObject, PlaceObject, AppObject, NewsArticleObject, QuoteObject, JobObject

Once you build typed consumers against these objects across 42 platforms, you never rewrite five integrations again.

The shape of it

One schema, holding across everything we return.

42 Platforms: one schema across all
325 Endpoints: each validated at the wire
11 Canonical objects: Post to Quote to Job
1 Schema: written once, in Zod

Counts are read live from the registry, so this page can never quote a stale number.

One key, one schema

SocialCrawl versus wiring up scrapers yourself

The difference is not the raw data. It is the normalization, validation, and trust layer on top of it.

Integration
  SocialCrawl: One key, one schema
  Wiring up scrapers yourself: N vendors, N schemas

Field naming
  SocialCrawl: One canonical shape across every platform
  Wiring up scrapers yourself: A different JSON shape per platform

Pagination
  SocialCrawl: One opaque cursor everywhere
  Wiring up scrapers yourself: A different pagination model per source

Data quality
  SocialCrawl: Validated on every response, tombstones rejected
  Wiring up scrapers yourself: Raw blobs, deleted rows leak through

Docs
  SocialCrawl: Cannot drift, CI-enforced
  Wiring up scrapers yourself: Hand-maintained, drifts silently
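
With one opaque cursor everywhere, a single loop can page through any list endpoint. The sketch below assumes the { items, next_cursor, total } shape described earlier and that a null next_cursor marks the last page. The fetchPage callback and the in-memory pages are hypothetical stand-ins, kept synchronous so the example stays self-contained.

```typescript
interface CursorPage<T> {
  items: T[];
  next_cursor: string | null;
  total: number | null;
}

// Walks pages by handing next_cursor back unchanged until it is null.
// maxPages is a safety cap in case a cursor never terminates.
function collectAllPages<T>(
  fetchPage: (cursor: string | null) => CursorPage<T>,
  maxPages: number
): T[] {
  const all: T[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: CursorPage<T> = fetchPage(cursor);
    for (const item of page.items) all.push(item);
    if (page.next_cursor === null) break;
    cursor = page.next_cursor;
  }
  return all;
}

// In-memory stand-in for an API; the cursor values are arbitrary opaque strings.
const fakePages: { [cursor: string]: CursorPage<string> } = {
  start: { items: ["a", "b"], next_cursor: "p2", total: 3 },
  p2: { items: ["c"], next_cursor: null, total: 3 },
};
const collectedItems = collectAllPages<string>(
  (cursor) => fakePages[cursor === null ? "start" : cursor],
  10
);
```

The loop never inspects the cursor's contents, so the same function works regardless of how a given platform paginates underneath. In real use the callback would be an asynchronous HTTP call.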

Frequently asked questions


The unified schema is one canonical response shape that every endpoint on every platform conforms to. Instead of learning TikTok's aweme_info, Instagram's edge_media, and Reddit's score separately, you read one PostObject with the same fields everywhere. SocialCrawl normalizes 42 platforms and 325 endpoints into that single shape.

One schema, every platform

Start free.

Get an API key and see the same validated shape come back from every platform you call.

curl https://www.socialcrawl.dev/v1/tiktok/profile \
  -G --data-urlencode "handle=nasa" \
  -H "x-api-key: $SC_KEY"
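
For callers working in TypeScript, the same request can be assembled as plain data before handing it to an HTTP client. This sketch only mirrors the curl command above. The buildProfileRequest helper is hypothetical, and reusing the /v1/<platform>/profile path for platforms other than tiktok is an assumption.

```typescript
interface SketchRequest {
  url: string;
  headers: { [header: string]: string };
}

// Builds the GET request from the curl example without sending it.
// encodeURIComponent plays the role of curl's --data-urlencode for the handle.
function buildProfileRequest(platform: string, handle: string, apiKey: string): SketchRequest {
  return {
    url:
      "https://www.socialcrawl.dev/v1/" +
      encodeURIComponent(platform) +
      "/profile?handle=" +
      encodeURIComponent(handle),
    headers: { "x-api-key": apiKey },
  };
}

const nasaRequest = buildProfileRequest("tiktok", "nasa", "YOUR_API_KEY");
```

The resulting url and headers can be passed to fetch or any other HTTP client available in your runtime.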