Social Media Monitoring API: Build Your Own in 5 Steps

Social media monitoring API guide: build a multi-platform mention tracker with sentiment, dedup, and Slack alerts in Python. Five steps, all runnable code.

A social media monitoring API is a unified data layer that collects brand mentions, post text, and engagement signals from dozens of social platforms through a single authenticated endpoint, so your application works on normalized data instead of managing platform-by-platform integrations. Monitoring social media at scale without one means calling dozens of platform APIs, normalizing just as many schemas, and watching all of them for 429 rate-limit responses. There is a faster way.

This guide shows you how to build a working social media monitoring API pipeline from scratch in Python. By the end, you will have five components running: multi-platform mention search, scheduled polling, sentiment classification, deduplication with SQLite storage, and Slack alerting on spikes. Every step includes runnable code.

What you are building: a lightweight monitoring daemon that uses the SocialCrawl API as its data layer, covering 42 platforms through a single authenticated endpoint. The five steps follow in order: (1) search for mentions, (2) poll on a schedule, (3) classify sentiment, (4) deduplicate and store, (5) alert on spikes.

Why build with a monitoring API instead of packaged social media monitoring software?

The social media listening market reached $11.91 billion in 2026 and is growing at a 13.9% CAGR. That growth reflects real practitioner demand. But most of the investment goes to packaged social media monitoring software, not developer tooling, and the difference matters if you are building something custom.

Packaged tools price on seats. Enterprise monitoring platforms run from $249/month (Brand24) to $800+/month (Talkwalker), with predictive tiers starting at $30,000+/year. Cost is the most frequently cited frustration among practitioners who need programmatic data access.

The data restrictions are more fundamental than the price. Brandwatch's developer API, despite enterprise pricing, strips full text from X posts, redacts Reddit content entirely, and provides no snippet or full text for LinkedIn due to licensing restrictions. You pay for enterprise access and still cannot get full-text data from three of the most monitored platforms.

And then there is the infrastructure reality. As one practitioner noted in Brand24's 2026 research: "Building your own system is getting impossible. Social platforms have gotten so good at anti-bot detection that even their workarounds get blocked." A unified API layer that handles platform-level complexity for you collapses that problem to a single credential.

A functional social media monitoring API pipeline has five components: platform coverage, scheduled polling, sentiment classification, deduplication, and alerting. This guide builds each one.

What social media monitoring tools and libraries do you need?

Unlike packaged social media monitoring software, an API-based setup lets you choose your own social media monitoring tools and libraries. Here is what this guide uses:

  • Python 3.9+ with requests and schedule installed:
    pip install requests schedule
  • A SocialCrawl API key: every account starts with 400 free credits, enough to follow all five steps. Get yours at socialcrawl.dev/dashboard.
  • Basic REST API familiarity: you know what a GET request and a JSON response look like.
  • curl: for the raw HTTP snippets in Steps 1, 2, and 5.
  • SQLite: included in Python's standard library, needed for Step 4.

Set your key as an environment variable before running anything:

export SOCIALCRAWL_API_KEY="sc_your_key_here"

You can preview results in the visual Explorer before writing a single line of Python, which is useful for validating keyword coverage and estimating credit spend.

Step 1: How does brand mention monitoring work across multiple platforms?

The core operation in any brand mention monitoring system is a multi-platform keyword search: one request that fans out across platforms and returns normalized results, not one request per platform.

The /v1/search/everywhere endpoint does this. It fans out a single query across up to 15 sources in parallel, fuses the results, reranks them, and returns a single response. Here is the basic curl:

curl -s -H "x-api-key: $SOCIALCRAWL_API_KEY" \
  "https://www.socialcrawl.dev/v1/search/everywhere?query=YourBrand&lookback_days=7"

A trimmed version of what comes back:

{
  "success": true,
  "platform": "search",
  "endpoint": "/v1/search/everywhere",
  "data": {
    "items": [
      {
        "id": "tt_7234567890",
        "platform": "tiktok",
        "url": "https://www.tiktok.com/@techreviewer99/video/7234567890",
        "content": {
          "text": "Just tried YourBrand and honestly impressed with the data freshness..."
        },
        "author": {
          "username": "techreviewer99",
          "display_name": "Tech Reviewer",
          "followers": 48200,
          "verified": false
        },
        "engagement": {
          "views": 128400,
          "likes": 3100,
          "comments": 142,
          "shares": 89
        },
        "published_at": "2026-06-27T14:22:00Z"
      },
      {
        "id": "rd_abc123def",
        "platform": "reddit",
        "url": "https://reddit.com/r/SaaS/comments/abc123/",
        "content": {
          "text": "Has anyone compared YourBrand to the competition for real-time social data?"
        },
        "author": {
          "username": "saas_builder",
          "display_name": "saas_builder"
        },
        "engagement": {
          "likes": 47,
          "comments": 23
        },
        "published_at": "2026-06-26T09:15:00Z"
      }
    ],
    "sources_succeeded": ["tiktok", "reddit", "instagram", "youtube", "hackernews"],
    "sources_failed": [],
    "clusters": [
      { "label": "Product reviews", "count": 31 },
      { "label": "Questions & comparisons", "count": 14 }
    ]
  },
  "credits_used": 20,
  "credits_remaining": 380,
  "request_id": "req-abc123xyz"
}

Now wrap it in a reusable Python function:

import requests
import os

API_KEY = os.environ["SOCIALCRAWL_API_KEY"]
BASE_URL = "https://www.socialcrawl.dev"


def search_mentions(keyword, lookback_days=7, sources=None):
    """
    Fan out a keyword search across social platforms.

    `sources` is optional. Omit it to search all 42 platforms.
    Pass a list like ["reddit", "hackernews", "tiktok"] to scope the search.
    Each call costs 20 credits.
    """
    params = {
        "query": keyword,
        "lookback_days": lookback_days,
    }
    if sources:
        params["sources"] = ",".join(sources)

    response = requests.get(
        f"{BASE_URL}/v1/search/everywhere",
        headers={"x-api-key": API_KEY},
        params=params,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


# Run a search
result = search_mentions("YourBrand", lookback_days=7)
mentions = result["data"]["items"]
succeeded = result["data"].get("sources_succeeded", [])
print(f"Found {len(mentions)} mentions across {len(succeeded)} platforms")

A few things worth knowing about how this endpoint behaves:

The sources parameter is optional. Omit it to fan out to all supported platforms. Pass a comma-separated list to scope the search when you care about specific platforms. exclude is the inverse: block specific platforms and search everything else.

Pagination uses the next_cursor key. Pass it back verbatim as a cursor parameter to fetch the next page.
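
A minimal sketch of that cursor loop. It assumes next_cursor sits under data next to items (this guide does not say where the key lives) and that a missing or empty value means the last page. fetch_page is a hypothetical callable standing in for an HTTP call such as search_mentions with a cursor parameter added; the fake pages below exist only so the loop runs offline.

```python
def iter_all_items(fetch_page, max_pages=5):
    """Yield items page by page until no cursor comes back or max_pages is hit.

    fetch_page(cursor) must return the parsed JSON body of one page.
    ASSUMPTION: the cursor is returned as data["next_cursor"].
    """
    cursor = None
    for _ in range(max_pages):
        body = fetch_page(cursor)
        data = body.get("data") or {}
        yield from data.get("items") or []
        cursor = data.get("next_cursor")
        if not cursor:
            break


# Offline stand-in for the API so the loop can be exercised without credits.
_FAKE_PAGES = {
    None: {"data": {"items": [{"id": "a"}, {"id": "b"}], "next_cursor": "p2"}},
    "p2": {"data": {"items": [{"id": "c"}], "next_cursor": None}},
}


def fake_fetch(cursor):
    return _FAKE_PAGES[cursor]


all_ids = [item["id"] for item in iter_all_items(fake_fetch)]
first_page_ids = [item["id"] for item in iter_all_items(fake_fetch, max_pages=1)]
```

The max_pages cap is a deliberate guard: if each page is billed like a fresh call, an unbounded loop on a busy keyword could drain a credit balance quickly.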

Credit cost. Each call to /v1/search/everywhere costs 20 credits flat. For high-frequency polling, that accumulates, which is why Step 2 covers interval strategy and a lower-cost native alternative.

The unified schema. Every item in data.items shares the same shape regardless of which platform it came from: platform, url, content.text, author.username, engagement (views, likes, comments, shares), and published_at. Same shape across every platform: write the processing code once.
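
To make "write the processing code once" concrete, here is a small sketch that flattens one item into a single-level dict. The field names come from the sample response above; the helper name flatten_mention and the choice to default missing counts to 0 are assumptions of this example, not part of the API.

```python
def flatten_mention(item):
    """Turn one unified-schema item into a flat dict with safe defaults."""
    # `or {}` covers both a missing key and an explicit null value.
    content = item.get("content") or {}
    author = item.get("author") or {}
    engagement = item.get("engagement") or {}
    return {
        "platform": item.get("platform"),
        "url": item.get("url"),
        "text": content.get("text") or "",
        "username": author.get("username"),
        "views": engagement.get("views") or 0,
        "likes": engagement.get("likes") or 0,
        "comments": engagement.get("comments") or 0,
        "shares": engagement.get("shares") or 0,
        "published_at": item.get("published_at"),
    }


# The Reddit item from the sample response: no views or shares reported.
sample = {
    "id": "rd_abc123def",
    "platform": "reddit",
    "url": "https://reddit.com/r/SaaS/comments/abc123/",
    "content": {"text": "Has anyone compared YourBrand to the competition?"},
    "author": {"username": "saas_builder", "display_name": "saas_builder"},
    "engagement": {"likes": 47, "comments": 23},
    "published_at": "2026-06-26T09:15:00Z",
}
row = flatten_mention(sample)
```

Defaulting a missing count to 0 keeps downstream arithmetic simple, but it blurs "not reported" with "zero". If that distinction matters for your reporting, return None instead.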

Step 2: How do you schedule and poll for new mentions in real time?

Polling means running the same search on a defined interval and processing only the mentions that appeared since the last check. The Python schedule library handles the timing; a last_seen_at cursor prevents processing the same mention twice.

import schedule
import time
from datetime import datetime, timezone


last_seen_at = None


def process_mentions(new_mentions):
    """Replace this with your downstream handler (Step 3 classifier, Step 4 storage)."""
    for m in new_mentions:
        print(f"  [{m['platform']}] {m.get('url', 'no-url')} ({m.get('published_at', '')})")


def poll_mentions():
    global last_seen_at

    result = search_mentions("YourBrand", lookback_days=1)
    items = result["data"]["items"]

    # Filter to only mentions newer than our last run.
    # ISO 8601 UTC strings in one consistent format compare correctly as text;
    # `or ""` guards against a null published_at.
    new_mentions = [
        m for m in items
        if last_seen_at is None or (m.get("published_at") or "") > last_seen_at
    ]

    if new_mentions:
        ts = datetime.now(timezone.utc).isoformat()
        print(f"[{ts}] {len(new_mentions)} new mention(s)")
        process_mentions(new_mentions)

    # Advance the cursor to the most recent published_at we have seen,
    # and never move it backwards
    if items:
        newest = max((m.get("published_at") or "") for m in items)
        last_seen_at = max(last_seen_at or "", newest)


# Poll every 30 minutes
schedule.every(30).minutes.do(poll_mentions)

print("Polling started. Press Ctrl+C to stop.")
while True:
    schedule.run_pending()
    time.sleep(1)

On polling interval: Brandwatch's best-practices documentation notes that a 30-second interval is appropriate for real-time mention streams, while longer intervals (hourly, daily) are sufficient for aggregate reporting. At 20 credits per call, a 30-second interval means 120 calls and 2,400 credits per hour. A 30-minute interval is 2 calls and 40 credits per hour (roughly 960 credits per day), which fits most monitoring budgets.
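
The daily cost of an interval is simple arithmetic, sketched here so you can plug in your own numbers. The only input taken from this guide is the flat 20 credits per call; the function name is made up for the example.

```python
SECONDS_PER_DAY = 86_400


def credits_per_day(interval_seconds, credits_per_call=20):
    """Daily credit spend for one keyword polled at a fixed interval."""
    calls_per_day = SECONDS_PER_DAY // interval_seconds
    return calls_per_day * credits_per_call


for label, seconds in [("30 seconds", 30), ("30 minutes", 1800), ("1 hour", 3600)]:
    print(f"every {label}: {credits_per_day(seconds):,} credits/day")
```

Multiply the result by the number of keywords you poll independently, since each keyword is its own call.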

The native alternative for real-time social media monitoring: SocialCrawl Monitors. If you want to skip the polling loop entirely, Monitors run any recipe on a cadence server-side and push each result to a signed webhook. No cron job, no process to keep alive, no last_seen_at to manage:

curl -X POST "https://www.socialcrawl.dev/v1/monitors" \
  -H "x-api-key: $SOCIALCRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Brand mention tracker",
    "recipe": "prism/brand-mentions",
    "params": {
      "keyword": "YourBrand",
      "date_from": "2026-06-01"
    },
    "cadence": "daily",
    "webhook_url": "https://your-server.com/hooks/mentions"
  }'

This creates a monitor that runs the prism/brand-mentions recipe daily and delivers each result to your webhook endpoint. Managing monitors costs 0 credits; each scheduled run bills the recipe's normal cost plus 1 scheduling credit.
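
Because the webhook is signed, your receiver should check the signature before trusting a payload. This guide does not describe the signing scheme, so the sketch below assumes a common one: a hex-encoded HMAC-SHA256 of the raw request body, computed with a shared secret and sent in a request header. The secret value and both function names are hypothetical; check the Monitors documentation for the real algorithm and header name before relying on this.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (ASSUMED scheme, not confirmed)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


# Offline round trip with made-up values.
secret = b"example-shared-secret"
body = b'{"monitor": "Brand mention tracker", "items": []}'
good_signature = sign_body(secret, body)
```

Verify against the raw bytes you received, not a re-serialized JSON object, because re-serializing can change whitespace or key order and break the comparison.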

Step 3: How do you add social media sentiment tracking to your monitoring feed?

The goal is to route each mention into one of three buckets (positive, neutral, negative) so you can feed the negative bucket into the alert queue in Step 5.

The search results from Step 1 include content.text for each mention. A local keyword-based classifier is the fastest pre-filter, and handles the most common production concern: null content values from image-only posts or platforms that do not return text.

def classify_mention(mention):
    """
    Quick local sentiment classification on a mention's text content.
    Returns: "positive" | "neutral" | "negative"

    For aggregate social media sentiment tracking across the full web
    (news, blogs, message boards, ecommerce), use the
    /v1/content_analysis/sentiment endpoint (5 credits) instead.
    """
    # `or {}` covers a null content object; `or ""` covers null or missing text
    text = (mention.get("content") or {}).get("text") or ""
    if not text:
        # Handle None gracefully: default to neutral rather than crashing
        return "neutral"

    text_lower = text.lower()

    negative_signals = [
        "broken", "hate", "terrible", "disappointed", "refund",
        "scam", "worst", "fraud", "unusable", "bug",
    ]
    positive_signals = [
        "love", "great", "excellent", "recommend", "amazing",
        "helpful", "fantastic", "best", "impressed", "solved",
    ]

    neg_score = sum(1 for w in negative_signals if w in text_lower)
    pos_score = sum(1 for w in positive_signals if w in text_lower)

    if neg_score > pos_score:
        return "negative"
    # Require a strict majority so mixed mentions (equal scores) stay neutral
    if pos_score > neg_score:
        return "positive"
    return "neutral"


# Classify all mentions and route negatives to the alert queue
alert_queue = []
for mention in mentions:
    label = classify_mention(mention)
    mention["_sentiment"] = label  # annotate for Step 4 storage

    if label == "negative":
        alert_queue.append(mention)
        text_preview = (mention.get("content", {}).get("text") or "")[:80]
        print(f"Negative on {mention['platform']}: {text_preview}")

Three things worth noting here:

Always guard against None content. The or "" fallback on the first line of the function body is not optional. Some platforms return null for image-only posts, carousel covers, and very short captions. Without it, your pipeline crashes on a perfectly normal mention.

Modern monitoring systems classify seven emotions, not three. Tools like Brand24 now detect admiration, anger, disgust, fear, joy, sadness, and surprise, and can explain why a spike happened rather than just that it did. The local keyword approach is a starting point. For statistically grounded sentiment analysis across the web, the /v1/content_analysis/sentiment endpoint gives you a full connotation breakdown across news, blogs, message boards, and ecommerce.

Watch the ratio, not just the count. Brand24's original research on the "social media monitoring" topic found a 12.5:1 positive-to-negative baseline. If your brand's rolling ratio drops below 5:1 in a given window, something is probably happening. That ratio check is the threshold logic you will wire into Step 5.
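
A minimal sketch of that ratio check, working on a list of sentiment labels such as the ones classify_mention returns. The 5:1 floor comes from the paragraph above; the function names and the handling of the zero-negative case are choices made for this example.

```python
from collections import Counter


def positive_negative_ratio(labels):
    """Positive-to-negative ratio, or None when there is no signal at all."""
    counts = Counter(labels)
    pos, neg = counts["positive"], counts["negative"]
    if neg == 0:
        # No negatives: the ratio is unbounded if there are positives,
        # and undefined if the window holds only neutral mentions.
        return float("inf") if pos else None
    return pos / neg


def ratio_breached(labels, floor=5.0):
    """True when the ratio has dropped below the floor."""
    ratio = positive_negative_ratio(labels)
    return ratio is not None and ratio < floor


healthy = ["positive"] * 25 + ["negative"] * 2 + ["neutral"] * 10
troubled = ["positive"] * 8 + ["negative"] * 4
```

Neutral mentions are ignored on purpose: the ratio measures the balance of opinion, not the share of all traffic, which is why Step 5 uses a separate negative-share threshold.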

Step 4: How do you deduplicate mentions and store them?

Fanning out across 42 platforms means the same content can surface on multiple sources with slightly different URLs. Without deduplication, a single viral post counted 40 times looks like a genuine spike in Step 5.

The fix is a hash keyed on (platform, url), stored as the table's PRIMARY KEY, with INSERT OR IGNORE skipping any row whose hash already exists:

import sqlite3
import hashlib
from datetime import datetime, timezone


def init_db(db_path="mentions.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS mentions (
            hash         TEXT PRIMARY KEY,
            platform     TEXT NOT NULL,
            url          TEXT,
            content      TEXT,
            sentiment    TEXT,
            author       TEXT,
            published_at TEXT,
            ingested_at  TEXT DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    return conn


def mention_hash(mention):
    """
    Deterministic dedup key.
    Prefers (platform, url). Falls back to (platform, content[:100], published_at)
    for platforms that do not return stable URLs.
    """
    platform = mention.get("platform", "")
    url = mention.get("url", "")
    if url:
        raw = f"{platform}:{url}"
    else:
        # `or {}` guards against a null content object
        text = ((mention.get("content") or {}).get("text") or "")[:100]
        ts = mention.get("published_at", "")
        raw = f"{platform}:{text}:{ts}"
    return hashlib.sha256(raw.encode()).hexdigest()


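def store_mention(conn, mention, sentiment):
    """
    Attempt to insert the mention. Returns True if new, False if duplicate.
    `INSERT OR IGNORE` silently skips duplicates without error.
    """
    h = mention_hash(mention)
    # `or {}` guards against platforms that return null for content or author
    content = mention.get("content") or {}
    author = mention.get("author") or {}
    cur = conn.execute(
        """INSERT OR IGNORE INTO mentions
           (hash, platform, url, content, sentiment, author, published_at)
           VALUES (?, ?, ?, ?, ?, ?, ?)""",
        (
            h,
            mention.get("platform"),
            mention.get("url"),
            (content.get("text") or "")[:1000],
            sentiment,
            author.get("username"),
            mention.get("published_at"),
        ),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when it was ignored.
    # conn.total_changes is cumulative for the whole connection, so it
    # cannot tell this insert apart from earlier ones.
    return cur.rowcount > 0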


# Wire it up
conn = init_db()
for mention in mentions:
    label = mention.get("_sentiment") or classify_mention(mention)
    is_new = store_mention(conn, mention, label)
    if is_new:
        print(f"Stored [{mention['platform']}]: {mention.get('url', 'no-url')}")
    else:
        print(f"Duplicate skipped: {mention.get('url', 'no-url')}")

Storage guidance. SQLite works fine for under a million mentions, with zero infrastructure overhead. For higher-volume production pipelines (continuous polling on many keywords), move to Postgres with a B-tree index on (platform, url), or to a Redis sorted set scored by published_at for fast time-window reads. The fields worth persisting: hash, platform, url, content, sentiment, author, published_at, ingested_at.

A clean, deduplicated store is what makes the threshold logic in Step 5 accurate. Without dedup, a single viral post counted 40 times looks like a crisis; with it, you see the actual mention volume.
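
One gap worth closing: a hash over the raw URL treats two links to the same post as different rows when they differ only in tracking parameters, a "www." prefix, or a trailing slash. Below is a sketch of a normalizer you could run on the URL before hashing. The list of tracking parameters is illustrative rather than exhaustive, and dropping "www." and the trailing slash is a heuristic that may not suit every platform.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# ASSUMPTION: these are treated as tracking noise; extend for your platforms.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def canonical_url(url):
    """Normalize a post URL so trivially different links hash the same."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_KEYS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    # Sort the remaining parameters so their order does not affect the hash;
    # the fragment is dropped entirely.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(sorted(kept)), ""))


a = canonical_url("https://www.reddit.com/r/SaaS/comments/abc123/?utm_source=share&fbclid=x")
b = canonical_url("https://reddit.com/r/SaaS/comments/abc123")
c = canonical_url("https://www.youtube.com/watch?v=abc&utm_medium=social")
```

Query parameters that identify the content, such as the video ID in the last example, are kept. Only the parameters you list as tracking noise are removed.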

Step 5: How do you trigger alerts when mentions spike or sentiment turns negative?

The alerting step queries your stored mentions for a rolling time window, checks volume and negative ratio against thresholds, and posts to Slack if either trips:

import sqlite3
from datetime import datetime, timezone, timedelta
import requests as req_lib

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
VOLUME_THRESHOLD = 50       # alert at 50 or more mentions in the window
NEGATIVE_THRESHOLD = 0.20  # alert when 20% or more of mentions are negative
COOLDOWN_MINUTES = 15      # suppress repeat alerts within this window

_last_alerted_at = None


def check_alerts(db_path="mentions.db", window_minutes=60):
    global _last_alerted_at

    # Skip if within the cooldown window (prevents alert storms)
    if _last_alerted_at:
        elapsed = datetime.now(timezone.utc) - _last_alerted_at
        if elapsed < timedelta(minutes=COOLDOWN_MINUTES):
            return

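    conn = sqlite3.connect(db_path)
    # ingested_at is filled by SQLite's CURRENT_TIMESTAMP, which is stored as
    # "YYYY-MM-DD HH:MM:SS" in UTC. Format the cutoff the same way so the
    # text comparison is valid; an isoformat() value ("...T...+00:00") is not.
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=window_minutes)
    since = cutoff.strftime("%Y-%m-%d %H:%M:%S")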

    rows = conn.execute(
        "SELECT sentiment FROM mentions WHERE ingested_at >= ?", (since,)
    ).fetchall()
    conn.close()

    if not rows:
        return

    total = len(rows)
    negative = sum(1 for (s,) in rows if s == "negative")
    negative_ratio = negative / total

    volume_alert = total >= VOLUME_THRESHOLD
    sentiment_alert = negative_ratio >= NEGATIVE_THRESHOLD

    if not (volume_alert or sentiment_alert):
        return

    reasons = []
    if volume_alert:
        reasons.append(f"{total} mentions in {window_minutes} min (threshold: {VOLUME_THRESHOLD})")
    if sentiment_alert:
        reasons.append(f"{negative_ratio:.0%} negative (threshold: {NEGATIVE_THRESHOLD:.0%})")

    payload = {
        "text": ":warning: *Brand mention alert*\n" + "\n".join(f"- {r}" for r in reasons)
    }
    req_lib.post(SLACK_WEBHOOK_URL, json=payload, timeout=10)
    _last_alerted_at = datetime.now(timezone.utc)
    print(f"Alert sent: {reasons}")


# Call this at the end of each poll_mentions() cycle
check_alerts(window_minutes=60)

On alert fatigue: the COOLDOWN_MINUTES = 15 guard prevents a cascade of identical alerts during a sustained spike. If the process restarts between polls, store _last_alerted_at in your database rather than in memory; otherwise the cooldown resets on restart.
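
A sketch of what persisting the cooldown could look like, using a one-row key/value table in the same SQLite file. The table name alert_state and the helper names are hypothetical; the demo uses an in-memory database so it runs without touching mentions.db.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def init_alert_state(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS alert_state ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()


def get_last_alerted_at(conn):
    row = conn.execute(
        "SELECT value FROM alert_state WHERE key = 'last_alerted_at'"
    ).fetchone()
    return datetime.fromisoformat(row[0]) if row else None


def set_last_alerted_at(conn, when):
    # INSERT OR REPLACE overwrites the single row on every alert.
    conn.execute(
        "INSERT OR REPLACE INTO alert_state (key, value) "
        "VALUES ('last_alerted_at', ?)",
        (when.isoformat(),),
    )
    conn.commit()


def in_cooldown(conn, now, cooldown_minutes=15):
    last = get_last_alerted_at(conn)
    return last is not None and now - last < timedelta(minutes=cooldown_minutes)


demo = sqlite3.connect(":memory:")
init_alert_state(demo)
t0 = datetime(2026, 6, 27, 12, 0, tzinfo=timezone.utc)
before_any_alert = in_cooldown(demo, t0)
set_last_alerted_at(demo, t0)
ten_minutes_later = in_cooldown(demo, t0 + timedelta(minutes=10))
fifteen_minutes_later = in_cooldown(demo, t0 + timedelta(minutes=15))
```

In check_alerts you would call in_cooldown at the top and set_last_alerted_at after a successful Slack post, in place of the module-level variable.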

The native alternative. Monitors accept alert_rules that evaluate server-side metrics like negative_share against a percentage change threshold, so you skip the query-and-compare logic entirely:

curl -X POST "https://www.socialcrawl.dev/v1/monitors" \
  -H "x-api-key: $SOCIALCRAWL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Brand sentiment alert",
    "recipe": "prism/brand-mentions",
    "params": { "keyword": "YourBrand", "date_from": "2026-06-01" },
    "cadence": "daily",
    "webhook_url": "https://your-server.com/hooks/alerts",
    "alert_rules": [
      {
        "metric": "negative_share",
        "op": "pct_change_gt",
        "value": 25,
        "window": "1w"
      }
    ]
  }'

This fires your webhook only when negative share increases more than 25% week-over-week, with no threshold code to maintain. AI-powered systems can also explain why a spike happened (viral hashtag versus genuine PR issue) before you have read a single mention.

What could go wrong?

Symptom | Fix
Concurrency limit (429) | You have more than 50 in-flight requests on one key. Add exponential backoff with jitter: time.sleep(2 ** attempt + random.random()) in a retry loop. Wait for active requests to complete before starting new ones.
Empty results window | Some platforms return zero items for narrow time windows. Widen lookback_days or use a rolling 24-hour window instead of a fixed start date.
Duplicate alerts firing | The cooldown guard is resetting on process restart. Persist last_alerted_at to your database, not to a module-level variable.
None content crashing the classifier | Platforms return null content on image-only posts and short captions. The or "" fallback in Step 3 handles this; confirm it is present in every code path that reads mention["content"]["text"].
schedule library drift | The schedule library is not precise on loaded machines: tasks drift over time. For production, use a system cron (crontab -e) or SocialCrawl Monitors instead.
Platform-level 502s | Individual platform upstreams go down. Credits are auto-refunded on 502 and 503 responses. Retry after 30 seconds. A single platform outage does not affect the rest of the fan-out.
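
The backoff advice for 429s can be sketched as a small wrapper. RetryableError and with_backoff are names invented for this example; in practice you would raise RetryableError from your request function when the status code is one you want to retry. The sleep function is injectable so the demo finishes instantly.

```python
import random
import time


class RetryableError(Exception):
    """Raised by the wrapped call for responses worth retrying (for example 429)."""


def with_backoff(call, max_attempts=5, sleep=time.sleep):
    """Run call(), retrying with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(2 ** attempt + random.random())


# Demo: fail twice, then succeed, recording the delays instead of sleeping.
outcomes = iter([RetryableError(), RetryableError(), "ok"])
delays = []


def flaky_call():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = with_backoff(flaky_call, sleep=delays.append)
```

The random component spreads retries out so that several workers hitting the limit at the same moment do not all retry in lockstep.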

What is next?

The five-step pipeline covers the core brand mention monitoring loop. Here is where to take it further:

Add more keywords. Create a separate monitor (or a separate polling schedule) for each competitor brand, campaign hashtag, and product name you care about. Independent monitors run in parallel with independent alert rules.

Prototype new queries first. Before adding a keyword to the full pipeline, use the visual Explorer at socialcrawl.dev/explorer to preview results and estimate credit spend. You can see the data before committing to a polling interval.

Use the prism/brand-mentions composite. The /v1/prism/brand-mentions endpoint (20 credits) returns a pre-aggregated brand health report: mention volume time-series, sentiment split, top sources, and recent mentions in one call. Useful when you want the dashboard view alongside the raw feed.

Track your credit balance. GET /v1/credits/balance costs 0 credits and returns your current balance. Run it before any high-volume polling run.

Frequently asked questions

Is there a free social media monitoring API?

Yes. Every SocialCrawl account starts with 400 free credits, which covers all five steps in this guide and several rounds of keyword prototyping. The visual Explorer lets you see your data before writing a single line. Production polling across many keywords at short intervals will consume credits faster; see SocialCrawl pricing for one-time credit pack options.

How do I monitor multiple social platforms at once?

The /v1/search/everywhere endpoint handles multi-platform fan-out in a single request. You do not send one request per platform. Omit the sources parameter to search all 42 supported platforms. Pass a comma-separated sources value (for example, sources=reddit,tiktok,instagram) to scope the search. For ongoing multi-platform tracking on a schedule, Monitors re-run any recipe on a cadence and push results to your webhook automatically.

What data does a social media monitoring API track?

A social media monitoring API tracks keyword occurrences in posts and comments, author metadata (username, follower count, verified status), engagement metrics (views, likes, shares, comments), the post URL, the platform of origin, and the published timestamp. The exact field set varies by platform: some platforms do not expose share counts, others do not return text for video content. The SocialCrawl unified schema normalizes all of this into one consistent shape across 42 platforms, so the processing code you write in Step 3 works regardless of which platform the mention came from.

How much does a social media monitoring API cost?

Packaged social media monitoring software runs from $249/month to $800+/month, with enterprise tiers at $30,000+/year. API-based monitoring with SocialCrawl uses a credit model: you pay per call, not per seat. A daily prism/brand-mentions monitor costs 21 credits per run (20 for the recipe plus 1 scheduling credit). A 30-minute polling setup using /v1/search/everywhere costs 20 credits per call, or roughly 960 credits per day. Credit packs are one-time purchases with no subscription. See SocialCrawl pricing for current pack sizes.

Can you get real-time alerts from a social media monitoring API?

Yes. The fastest path is the native Monitor approach from Step 5: create a Monitor with an alert_rules block and it fires a POST to your webhook when a defined metric (such as negative sentiment share) crosses your threshold. No polling process to keep alive, no cron job to maintain. For a self-hosted approach, the polling loop in Step 2 combined with the Slack alerting code in Step 5 achieves the same result: the scheduler runs on your defined interval, queries the stored mention window, and sends a Slack message if volume or negative ratio trips the threshold. The Monitor approach supports week-over-week percentage-change comparisons; the DIY loop gives you more control over the alert logic.

What is the difference between social media monitoring and social media listening?

Monitoring is reactive and keyword-specific: you define what to watch, and the system tells you what was said, when, and by whom. Social media listening is strategic and pattern-based: you look at aggregate data over time to understand what conversations mean, which audiences are driving them, and how they shift. The pipeline in this guide is a monitoring pipeline. To add the listening layer, take the stored mention data from Step 4 and run NLP topic clustering or rolling trend analysis on top of it. The distinction matters operationally because monitoring is real-time and action-oriented, while listening is analytical and long-term.
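
As a starting point for that listening layer, here is a sketch of a rolling trend query: mentions per day, split by sentiment. It assumes the mentions table from Step 4 and ISO 8601 published_at values, so the first ten characters are the date. The rows inserted below are made up for the demo.

```python
import sqlite3


def daily_sentiment_counts(conn):
    """Return {day: {sentiment: count}} from the Step 4 mentions table."""
    rows = conn.execute(
        """
        SELECT substr(published_at, 1, 10), sentiment, COUNT(*)
        FROM mentions
        WHERE published_at IS NOT NULL
        GROUP BY substr(published_at, 1, 10), sentiment
        ORDER BY 1
        """
    ).fetchall()
    trend = {}
    for day, sentiment, count in rows:
        trend.setdefault(day, {})[sentiment] = count
    return trend


# In-memory demo with a cut-down version of the Step 4 schema.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE mentions (hash TEXT PRIMARY KEY, platform TEXT, "
    "sentiment TEXT, published_at TEXT)"
)
demo.executemany(
    "INSERT INTO mentions VALUES (?, ?, ?, ?)",
    [
        ("h1", "reddit", "negative", "2026-06-26T09:15:00Z"),
        ("h2", "tiktok", "positive", "2026-06-27T14:22:00Z"),
        ("h3", "youtube", "positive", "2026-06-27T18:05:00Z"),
    ],
)
trend = daily_sentiment_counts(demo)
```

From here, comparing each day against a trailing average is one simple way to spot shifts in conversation, rather than reacting to individual mentions.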
