Handling Rate Limits and Retries When Scraping

A scraper that ignores rate limits fails twice: it gets blocked, and it burdens the server it depends on. When a site returns HTTP 429 (Too Many Requests) or 503 (Service Unavailable), it is asking you to slow down, and the correct response is to wait the right amount of time and retry — not to repeat the request immediately or abandon the run. This page shows how to detect a rate-limit response, read and honor the Retry-After header, fall back to exponential backoff with jitter when that header is absent, and throttle concurrency so you stay under the limit in the first place. It is the detailed walkthrough beneath Anti-Bot Defenses & Rate Limiting, part of Web Scraping & Data Extraction.

Retry-After wins when present; otherwise exponential backoff with jitter sets the wait, and the number of attempts is capped.

Root cause: the server is rate limiting you

A 429 means you crossed a request budget the server enforces over some window: per second, per minute, or per token. A 503 often means the server is shedding load. Both are usually transient, and both tell you to back off. Two mistakes turn a recoverable situation into a failed run. The first is retrying immediately, which keeps you over the limit and can extend a temporary throttle into a long block. The second is retrying in a tight, perfectly periodic loop across many workers, which produces a thundering herd: every worker hits the server the instant the window resets. The fix is to wait as long as the server asks, and when it does not specify, to grow the wait exponentially with randomized jitter. This is the same transient-versus-deterministic reasoning that drives Flaky Test Management: retry what is transient, fail fast on what is not.

Minimal reproducible example

The helper below uses Playwright's request API to fetch a URL, honors Retry-After when present, and otherwise backs off exponentially with jitter, capping the attempts.

import { test, expect, type APIRequestContext } from '@playwright/test';

async function fetchWithRetry(
  request: APIRequestContext,
  url: string,
  maxAttempts = 5,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await request.get(url);
    // Success: return the body immediately.
    if (res.status() === 200) return res.text();

    // Only 429 and 503 are retryable; everything else fails fast.
    if (res.status() !== 429 && res.status() !== 503) {
      throw new Error(`Non-retryable status ${res.status()} for ${url}`);
    }

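    // Prefer the server's instruction when it gives one; otherwise back off with jitter.
    const retryAfter = res.headers()['retry-after'];
    let waitMs = Math.min(2 ** attempt * 500, 30_000) + Math.random() * 500; // backoff + jitter
    if (retryAfter) {
      // Retry-After is either delta-seconds or an HTTP date.
      const seconds = Number(retryAfter);
      const headerMs = Number.isNaN(seconds)
        ? Date.parse(retryAfter) - Date.now()
        : seconds * 1000;
      // An unparseable value yields NaN; keep the backoff delay in that case.
      if (!Number.isNaN(headerMs)) waitMs = Math.max(0, headerMs);
    }

    await new Promise((r) => setTimeout(r, waitMs));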
  }
  throw new Error(`Exhausted ${maxAttempts} attempts for ${url}`);
}

test('retries a rate-limited endpoint', async ({ request }) => {
  // The relative path resolves against baseURL from the Playwright config.
  const body = await fetchWithRetry(request, '/api/items');
  expect(body.length).toBeGreaterThan(0);
});

Step-by-step fix

1. Detect the rate-limit status. After each request, inspect res.status(). Treat 429 and 503 as retryable signals to back off. Return immediately on 200, and fail fast on non-retryable codes like 403 or 404 so you do not waste attempts on errors a retry cannot fix.
2. Read Retry-After and honor it exactly. Check res.headers()['retry-after']. When present it is the server's explicit instruction: a number of seconds (multiply by 1000 for milliseconds) or an HTTP date (compute the delay to that time). Waiting exactly this long is both polite and the fastest path back to success.
3. Fall back to exponential backoff. When no Retry-After is given, compute the wait as a base interval times two to the power of the attempt number, so successive waits grow 0.5s, 1s, 2s, 4s, and so on. Cap it at a maximum (for example 30 seconds) so a stubborn endpoint does not produce absurd delays.
4. Add jitter. Add a random component to every computed delay so multiple workers do not retry in lockstep. Jitter spreads retries across the window and prevents a thundering herd from re-overloading the server the instant its limit resets.
5. Cap attempts and surface failure. Bound the loop with a maximum attempt count and throw a clear error when it is exhausted. A scrape that fails loudly after reasonable retries is better than one that loops forever or silently drops records.
6. Throttle concurrency to avoid 429s in the first place. Limit in-flight requests with a small pool and pace each worker, isolating them with Browser Contexts & Isolation. Prevention beats recovery: staying under the limit means fewer backoffs and a faster overall run.
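
The last step, a small pool with per-worker pacing, could be sketched roughly as follows. The function name mapWithConcurrency and its limit and paceMs parameters are illustrative assumptions, not part of Playwright; this is one simple way to bound in-flight requests, not the only one.

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `limit`
// calls in flight, and pauses `paceMs` after each call within a worker.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  paceMs: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim an item; safe because JS is single-threaded
      results[index] = await worker(items[index]);
      if (paceMs > 0) {
        // Steady-state pacing so each runner stays under the server's budget.
        await new Promise((r) => setTimeout(r, paceMs));
      }
    }
  }

  // Start at most `limit` runners; each pulls items until none remain.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

With the helper from the example above, a call might look like mapWithConcurrency(urls, 3, 250, (url) => fetchWithRetry(request, url)), where the pool size of 3 and the 250 ms pause are placeholder values to tune against the target site's actual limit.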

Troubleshooting variants

Retry-After is an HTTP date, not a number

The header can hold either a delta-seconds integer or an absolute HTTP-date. Detect which: if Number(value) is NaN, parse it with Date.parse(value) and wait parsed - Date.now() milliseconds (clamped to at least zero). Honoring the date form is just as important as the numeric form.
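
A minimal sketch of that detection, assuming a hypothetical helper named parseRetryAfterMs that returns undefined when the header is missing or unparseable so the caller can fall back to backoff. The nowMs parameter is an assumption added only to make the function easy to test.

```typescript
// Hypothetical helper: converts a Retry-After value to a delay in milliseconds.
// Returns undefined when the header is absent or cannot be parsed.
function parseRetryAfterMs(
  value: string | undefined,
  nowMs: number = Date.now(),
): number | undefined {
  if (value === undefined || value.trim() === '') return undefined;

  // Delta-seconds form, for example "120".
  const seconds = Number(value);
  if (!Number.isNaN(seconds)) return Math.max(0, seconds * 1000);

  // HTTP-date form, for example "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return undefined;

  // Clamp to zero in case the date is already in the past.
  return Math.max(0, dateMs - nowMs);
}
```

A caller would typically write something like parseRetryAfterMs(res.headers()['retry-after']) ?? computedBackoff, so a malformed header never produces a NaN delay.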

Retries keep failing and the block gets longer

You are likely still over the limit because concurrency is too high or jitter is missing, so workers retry together. Reduce the concurrency pool, increase the base backoff, and confirm jitter is applied to every delay. If a host repeatedly throttles you, lower the steady-state pacing for the whole run rather than relying on retries to absorb it.

The same request runs twice and double-submits

A retry must be idempotent. For GET requests this is automatic, but for any request with a side effect, include an idempotency key or guard so a retried request does not create a duplicate record. When you need to inspect or adjust the outgoing request, combine this with Intercepting and Modifying Network Requests.
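
One way to guard a retried write is to generate a key once and send the same key on every attempt. The sketch below assumes the server deduplicates on an Idempotency-Key header, which is a common convention but not something every API supports; the Send type and sendWithIdempotencyKey name are hypothetical.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical transport: sends the request with the given headers and
// resolves to the HTTP status code.
type Send = (headers: Record<string, string>) => Promise<number>;

async function sendWithIdempotencyKey(
  send: Send,
  maxAttempts = 3,
  baseMs = 500,
): Promise<number> {
  // Generate the key once, outside the loop, so every retry carries the
  // same value and the server can recognize a repeat of the same request.
  const headers = { 'Idempotency-Key': randomUUID() };

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await send(headers);
    // Anything other than 429 or 503 is final: return it to the caller.
    if (status !== 429 && status !== 503) return status;

    // Exponential backoff with jitter, as in the main example.
    const waitMs = Math.min(2 ** attempt * baseMs, 30_000) + Math.random() * baseMs;
    await new Promise((r) => setTimeout(r, waitMs));
  }
  throw new Error(`Exhausted ${maxAttempts} attempts`);
}
```

In a Playwright test, send could wrap request.post(url, { headers, data }) and return res.status(). The point of the design is only that the key is created before the first attempt, not inside the retry loop.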

Verification

Confirm the retry logic three ways. First, point it at a test endpoint that returns 429 with a Retry-After for the first two calls and 200 after — the helper should succeed on the third attempt having waited the header's interval. Second, log each attempt's status and computed delay and confirm the waits grow exponentially with jitter when no header is present, and match Retry-After when it is. Third, capture a trace with --trace on and review the timing of the retried requests in the Playwright Trace Viewer to verify the spacing between attempts is what you intended.
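
For the first check, a throwaway local endpoint is usually enough. The sketch below uses Node's built-in http module; planResponse and startFlakyServer are hypothetical names, and the defaults (two failures, a one-second Retry-After) are arbitrary test values.

```typescript
import { createServer, type Server } from 'node:http';

interface PlannedResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Decide the response for the n-th call (1-based): 429 with Retry-After
// for the first `failures` calls, then 200.
function planResponse(
  callNumber: number,
  failures = 2,
  retryAfterSeconds = 1,
): PlannedResponse {
  if (callNumber <= failures) {
    return {
      status: 429,
      headers: { 'Retry-After': String(retryAfterSeconds) },
      body: 'slow down',
    };
  }
  return { status: 200, headers: {}, body: 'ok' };
}

// Start a server that counts calls and answers according to planResponse.
function startFlakyServer(port: number): Server {
  let calls = 0;
  const server = createServer((_req, res) => {
    calls += 1;
    const plan = planResponse(calls);
    res.writeHead(plan.status, plan.headers);
    res.end(plan.body);
  });
  return server.listen(port);
}
```

Start it on a free port, point the retry helper at that address, and call close() on the returned server when the check finishes. With these defaults the helper should return the body on its third attempt after roughly two seconds of waiting.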

Frequently Asked Questions

How do I read and use the Retry-After header?

Read it from res.headers()['retry-after']. It is either a number of seconds, which you multiply by 1000 to get milliseconds, or an HTTP date, which you parse and subtract from the current time. Wait exactly that long before retrying — it is the server's explicit instruction and the fastest route back to a successful response.

What is jitter and why does exponential backoff need it?

Jitter is a small random amount added to each backoff delay. Without it, many workers that failed at the same moment retry at the same moment, creating a thundering herd that re-overloads the server the instant its limit resets. Adding randomness spreads the retries across the window so recovery is smooth.
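
As a rough illustration, the delay formula from the example on this page can be pulled into a function with an injectable random source. The backoffDelayMs name and the random parameter are assumptions made so the arithmetic can be checked deterministically; the defaults mirror the 500 ms base, 30 second cap, and up to 500 ms of additive jitter used earlier.

```typescript
// Hypothetical helper: exponential backoff capped at `capMs`, plus up to
// `jitterMs` of random delay. `random` must return a value in [0, 1).
function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs = 500,
  capMs = 30_000,
  jitterMs = 500,
): number {
  // 500, 1000, 2000, 4000, ... until the cap is reached.
  const exponential = Math.min(2 ** attempt * baseMs, capMs);
  // The random term is what keeps simultaneous failures from retrying together.
  return exponential + random() * jitterMs;
}
```

Two workers that both fail on attempt 3 share the same 4000 ms exponential term, but their random terms differ, so their retries land up to 500 ms apart instead of at the same instant.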

Should I retry every failed request?

No. Retry only transient failures — 429, 503, and network errors. Fail fast on deterministic ones like 403 and 404, because a retry cannot fix them and only wastes the server's capacity and your time. Distinguishing transient from deterministic failures is the core of a reliable retry policy.
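
That policy can be written down as a small decision function. This is a sketch under assumptions: the Outcome and Decision types and the decide name are hypothetical, and it treats any 2xx status as success, which is slightly broader than checking for 200 alone.

```typescript
// Hypothetical model of what one attempt produced: either an HTTP response
// with a status code, or a network-level failure with no response at all.
type Outcome =
  | { kind: 'response'; status: number }
  | { kind: 'network-error' };

type Decision = 'done' | 'retry' | 'fail';

function decide(outcome: Outcome): Decision {
  // Connection resets, timeouts, and DNS failures are treated as transient.
  if (outcome.kind === 'network-error') return 'retry';
  // Assumption: any 2xx counts as success.
  if (outcome.status >= 200 && outcome.status < 300) return 'done';
  // Explicit back-off signals from the server.
  if (outcome.status === 429 || outcome.status === 503) return 'retry';
  // Everything else (403, 404, ...) is deterministic: fail fast.
  return 'fail';
}
```

To feed network errors into a function like this, the request call has to be wrapped in try/catch. Without that, an exception thrown by the request propagates out of the retry loop and is never retried.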