Rate Limiting
Restricting how often a single user or IP can call an API endpoint, to prevent abuse, protect downstream systems, and control cost.
Also known as: throttling, rate limits
What is rate limiting?
Rate limiting caps the number of requests a given actor (user, IP address, API key) can make against your API in a time window. Common limits: 60 requests per minute for a dashboard API, 5 per minute for a password reset.
Without rate limiting, a single script can:
- Brute force passwords or OTP codes
- Exhaust your LLM API quota in minutes
- Rack up Supabase egress charges
- Hammer a payment provider into rate-limiting your whole IP
Why AI-built apps ship without it
Rate limiting requires extra infrastructure: middleware, a shared store (Redis, a database table), or a hosted service. AI tools generate the route's business logic and move on, leaving the rate limit for "later". In 89% of FinishKit scans, later never came.
Baseline rate limits
A starter policy that covers most apps:
| Endpoint type | Limit |
|---|---|
| Authenticated read API | 60 req/min per user |
| Authenticated write API | 30 req/min per user |
| Login and signup | 10 req/min per IP |
| Password reset and OTP | 5 req/min per IP |
| Newsletter signup | 5 req/min per IP |
| LLM proxy endpoints | 20 req/min per user |
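The table above can be expressed as a config map so every route reads its limit from one place. This is a sketch; the rule names and the `RateLimitRule` shape are illustrative, not any particular library's API:

```typescript
// Baseline rate-limit policy from the table above.
// Shape and names are illustrative.
type RateLimitRule = {
  limit: number;        // max requests per window
  windowMs: number;     // window length in milliseconds
  key: "user" | "ip";   // what the counter is scoped to
};

const RATE_LIMITS: Record<string, RateLimitRule> = {
  readApi:       { limit: 60, windowMs: 60_000, key: "user" },
  writeApi:      { limit: 30, windowMs: 60_000, key: "user" },
  auth:          { limit: 10, windowMs: 60_000, key: "ip" },
  passwordReset: { limit: 5,  windowMs: 60_000, key: "ip" },
  newsletter:    { limit: 5,  windowMs: 60_000, key: "ip" },
  llmProxy:      { limit: 20, windowMs: 60_000, key: "user" },
};
```

Keeping the policy in one map makes it easy to audit, and to tighten a single endpoint without touching route code.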
Implementation patterns
- In-memory: works for a single Node instance, fails at scale
- Redis / Upstash: the standard for serverless
- Database table: slower but works everywhere
- Provider features: Vercel, Cloudflare, and Supabase all ship built-in options
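To make the first pattern concrete, here is a minimal in-memory fixed-window limiter. It works for a single Node process only, as noted above; each serverless instance would keep its own counters, so use Redis or a database in those environments. All names are illustrative:

```typescript
// Fixed-window counter per key. Single-process only:
// counters live in this process's memory.
type Window = { count: number; resetAt: number };

const windows = new Map<string, Window>();

function allowRequest(key: string, limit: number, windowMs: number): boolean {
  const now = Date.now();
  const w = windows.get(key);
  if (!w || now >= w.resetAt) {
    // No window yet, or the old one expired: start fresh.
    windows.set(key, { count: 1, resetAt: now + windowMs });
    return true;
  }
  if (w.count < limit) {
    w.count += 1;
    return true;
  }
  return false; // over the limit for this window
}
```

In middleware, `key` would typically combine the actor and the endpoint (e.g. `"ip:1.2.3.4:login"`); when `allowRequest` returns `false`, respond with HTTP 429 and a `Retry-After` header.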