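Social Data Glossary
The concepts, error codes, and rules you run into when working with social media data, each answered briefly and up front. Every entry also shows the API call you would actually use.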
- 429 Error: A 429 error is the HTTP status code an API returns when you have sent too many requests in a given window and hit its rate limit. The response usually carries a Retry-After header telling you how long to wait before trying the request again.
- Instagram Rate Limits: Instagram's Graph API rate limit is dynamic: an app can make up to 4,800 calls multiplied by the number of impressions an account received, measured over a rolling 24-hour window. Higher-traffic accounts get a higher ceiling, so the cap moves as engagement changes rather than sitting at a fixed number.
- TikTok API: TikTok has official APIs, but none serve commercial data collection well: the Research API is limited to approved academic researchers at non-profit universities, and the Content Posting and Login Kit APIs cover publishing and authentication, not reading public data at scale.
- Reddit API Keys: Reddit API keys are the OAuth 2.0 client ID and secret you create by registering an app at reddit.com/prefs/apps. You exchange them for a bearer token, then send that token on every request. New commercial apps must also be approved under Reddit's Responsible Builder Policy before they get access.
- Web Scraping Legality: Scraping public data is generally lawful in the US: the Ninth Circuit's hiQ v. LinkedIn ruling held that collecting publicly accessible pages likely does not violate the Computer Fraud and Abuse Act. It is not a blanket permission, though, because platform Terms of Service and data-protection laws like GDPR still apply to what you collect and how you use it.
- Engagement Rate: Engagement rate is a percentage that measures how much an audience interacts with content: total interactions such as likes, comments, shares, and saves, divided by follower count or reach, times 100. It normalizes engagement across accounts of different sizes so a small creator and a large brand can be compared fairly.
- Cursor Pagination: Cursor pagination is a way to page through API results using an opaque token, the cursor, that marks your position, instead of an offset or page number. You pass the cursor from the previous response back on the next request. Because it points at a record rather than a numeric position, results stay stable even as new items arrive.
- Social Listening: Social listening is the practice of tracking public conversations across social platforms to understand what people say about a brand, topic, or competitor, then acting on the sentiment and trends behind those mentions. It goes a step past monitoring: monitoring counts mentions, listening interprets them.
- Social Media Scraping: Social media scraping is the automated collection of public data such as profiles, posts, follower counts, and engagement from social platforms, either by parsing pages directly or by calling an API that does it for you. A scraping API absorbs the anti-bot, proxy, and rate-limit complexity and returns clean structured data.
- Unified Social Media API: A unified social media API is a single API that returns data from many social platforms in one consistent schema, behind one authentication model. Instead of learning each platform's field names, you get the same shape everywhere: followers means followers on TikTok, Instagram, and YouTube alike, with one key.
- Webhooks vs Polling: Polling and webhooks are two ways to learn about new data. Polling means your code asks the API on a fixed schedule whether anything changed. Webhooks invert that: the provider sends your server an HTTP request the moment something changes, so you skip the wasted empty checks between updates.
- API Rate Limiting: API rate limiting is the practice of capping how many requests a client can make in a given time window, to protect a service from overload and abuse. Common models include the fixed window, sliding window, token bucket, and credit-based quotas. Exceed the cap and the API answers with a 429 error.
- User-Agent: A User-Agent is an HTTP request header that identifies the client software making the request, such as a browser, bot, or app. APIs use it for analytics, debugging, and access control. Reddit, for instance, rejects requests that do not send a descriptive, unique User-Agent string.
- Residential vs Datacenter Proxies: Residential proxies route your requests through IP addresses that ISPs assign to real homes, so they look like ordinary users and are hard to block. Datacenter proxies use IPs owned by hosting providers: they are cheaper and faster but easier for sites to detect. The trade-off is cost and speed versus block rate.
- Reciprocal Rank Fusion: Reciprocal rank fusion, or RRF, is a method for combining several ranked lists into one final ranking. Each item's score is the sum of 1 / (k + rank) over every list it appears in, where rank is the item's 1-based position in that list and k is a small constant, often 60. Items ranked highly in multiple lists rise to the top.
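
For the 429 Error entry, here is a minimal sketch of how a client might decide how long to wait. It assumes the response headers arrive as a plain dict keyed by `Retry-After`, and that the header holds either a number of seconds or an HTTP date. The function name `retry_delay_seconds`, the exponential fallback, and the `base` and `cap` defaults are illustrative choices, not part of any particular API.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(headers, attempt, now=None, base=1.0, cap=60.0):
    """Return how many seconds to wait before retrying after a 429.

    headers: dict of response headers (assumed to use the exact key
             "Retry-After"; real HTTP headers are case-insensitive).
    attempt: zero-based retry count, used only for the fallback backoff.
    """
    value = headers.get("Retry-After")
    if value is not None:
        value = value.strip()
        # Form 1: a non-negative integer number of seconds.
        if value.isdigit():
            return float(value)
        # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            now = now or datetime.now(timezone.utc)
            return max(0.0, (when - now).total_seconds())
    # No usable header: fall back to capped exponential backoff.
    return min(cap, base * 2 ** attempt)
```

A caller would typically sleep for the returned number of seconds and then resend the same request, giving up after a fixed number of attempts.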
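
For the Reddit API Keys and User-Agent entries, this sketch builds (but does not send) the request that exchanges a client ID and secret for a bearer token. The endpoint `https://www.reddit.com/api/v1/access_token`, the `client_credentials` grant, and the `platform:app-id:version (by /u/username)` User-Agent shape reflect Reddit's OAuth 2.0 documentation as I understand it, so treat them as assumptions to check against the current docs. The credentials and app name below are placeholders.

```python
import base64
import urllib.parse
import urllib.request

# Assumed token endpoint; verify against Reddit's current OAuth 2.0 docs.
TOKEN_URL = "https://www.reddit.com/api/v1/access_token"


def build_token_request(client_id, client_secret, user_agent):
    """Build the token-exchange request without sending it."""
    # HTTP Basic auth carries the client ID and secret.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    # Form-encoded body asking for an application-only token.
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    request = urllib.request.Request(TOKEN_URL, data=body, method="POST")
    request.add_header("Authorization", "Basic " + basic)
    # A descriptive, unique User-Agent identifies the app and its owner.
    request.add_header("User-Agent", user_agent)
    return request


# Placeholder values for illustration only.
example = build_token_request(
    "my-client-id",
    "my-client-secret",
    "python:com.example.glossarydemo:v0.1 (by /u/example_user)",
)
```

Passing the result to `urllib.request.urlopen` would perform the exchange. The JSON response should then contain the bearer token to send in an `Authorization: Bearer ...` header, alongside the same User-Agent, on later requests.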
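
The formula in the Engagement Rate entry can be sketched as a small function. The parameter names are illustrative, and `audience` stands for whichever denominator you choose, follower count or reach. The sample numbers are made up.

```python
def engagement_rate(likes, comments, shares, saves, audience):
    """Return engagement as a percentage of the chosen audience size.

    audience is follower count or reach, depending on the convention
    you adopt; keep it consistent when comparing accounts.
    """
    if audience <= 0:
        raise ValueError("audience must be positive")
    interactions = likes + comments + shares + saves
    return interactions * 100 / audience


# Made-up example: 200 interactions on an audience of 10,000.
rate = engagement_rate(likes=120, comments=30, shares=10, saves=40, audience=10_000)
```

Here `rate` comes out to 2.0, meaning 2% of the audience interacted with the post.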
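
The loop described in the Cursor Pagination entry might look like the sketch below. The response shape, a dict with `items` and `next_cursor`, is an assumption, as is the in-memory `make_fake_api` helper that stands in for a real endpoint. In this toy version the cursor is simply the ID of the last record seen; a real API would usually hand back an opaque string.

```python
def iterate_all(fetch_page):
    """Yield every item by following cursors until none is returned."""
    cursor = None  # No cursor on the first request.
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


def make_fake_api(records, page_size=2):
    """Return a hypothetical fetch_page(cursor) over records sorted by id."""
    def fetch_page(cursor):
        # The cursor marks a record, so items inserted later with higher
        # ids do not shift the position the way an offset would.
        remaining = [r for r in records if cursor is None or r["id"] > cursor]
        items = remaining[:page_size]
        has_more = len(remaining) > page_size
        next_cursor = items[-1]["id"] if has_more else None
        return {"items": items, "next_cursor": next_cursor}
    return fetch_page


posts = [{"id": i} for i in range(1, 6)]
all_ids = [post["id"] for post in iterate_all(make_fake_api(posts))]
```

With five records and a page size of two, the loop makes three requests and `all_ids` ends up as `[1, 2, 3, 4, 5]`.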
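
Of the models named in the API Rate Limiting entry, the token bucket is the easiest to show in a few lines. This is a minimal single-threaded sketch, not how any specific provider implements it. The class name, the `allow` method, and the injectable `clock` (there to make the behavior testable) are all illustrative.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # Start with a full bucket.
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.updated
        # Add the tokens earned since the last call, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A server using this approach would answer a request with 429 whenever `allow()` returns False. A client can use the same class to pace its own calls and stay under a known limit.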
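
The scoring rule in the Reciprocal Rank Fusion entry can be sketched directly. The function name and the sample lists are illustrative. Ranks are counted from 1 and `k` defaults to 60, the commonly used value mentioned above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one, best item first.

    rankings: iterable of lists, each ordered from best to worst.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank) to the item's score.
            scores[item] += 1.0 / (k + rank)
    # Highest score first; ties keep first-seen order because sorting is stable.
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# Made-up result lists from three hypothetical retrievers.
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
])
```

In this example `b` is ranked first in two lists and second in the third, so it leads the fused ranking, followed by `a`, `c`, and `d`.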
