Skill

scraperapi-async

Reference for ScraperAPI's Async Jobs API: submit background scraping jobs, poll results, use webhooks, and handle batch jobs up to 50k URLs.

automation

api-development

Invocation

Slash command: /scraperapi:scraperapi-async

SKILL.md

283 lines · ~2.3k tokens

Stats

Language: Python

Stars: 9

Maintenance: Excellent

Last commit: Jun 17, 2026

ScraperAPI Async Jobs API

The Async API submits scraping jobs in the background and retries them for up to 24 hours to maximize success. Results are retrieved by polling a status URL or received automatically via webhook.

When NOT to use Async

Single URL, result needed immediately → use the Standard API (api.scraperapi.com) — simpler and returns inline.
Need to follow links across a site → use the Crawler.
Need recurring scheduled scraping → use DataPipeline.

Use Async when: scraping 20+ URLs, the target site is slow or flaky, you want webhook delivery, or you need to scrape PDFs/images.

Endpoints

Action	Method	URL
Submit single job	POST	`https://async.scraperapi.com/jobs`
Submit batch (up to 50k)	POST	`https://async.scraperapi.com/batchjobs`
Check / retrieve job	GET	`https://async.scraperapi.com/jobs/<jobId>`
Cancel job	DELETE	`https://async.scraperapi.com/jobs/<jobId>`

Auth: pass apiKey in the JSON request body. Note the camelCase spelling: the Async API expects apiKey, whereas the Standard API uses api_key.
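As a minimal sketch of the auth convention above, the helper below assembles a single-job request body with the camelCase key. The name `build_job_payload` is hypothetical and not part of any ScraperAPI client; only the `apiKey`, `url`, and `apiParams` field names come from this document.

```python
from typing import Any, Optional

ASYNC_JOBS_URL = "https://async.scraperapi.com/jobs"


def build_job_payload(
    api_key: str,
    url: str,
    api_params: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build the JSON body for one async job (hypothetical helper)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    # The Async API reads the key from the body as "apiKey" (camelCase).
    payload: dict[str, Any] = {"apiKey": api_key, "url": url}
    if api_params:
        payload["apiParams"] = dict(api_params)  # copy so the caller's dict is untouched
    return payload
```

The result can be passed as `json=` to `requests.post(ASYNC_JOBS_URL, ...)`.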

Single Job

import os
import time

import requests

API_KEY = os.environ["SCRAPERAPI_API_KEY"]

# Submit the job; the response describes the queued job, not the scraped page
r = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": API_KEY,
        "url":    "https://example.com/product/123",
        "apiParams": {
            "render":       True,
            "country_code": "us",
        }
    },
    timeout=30,
)
r.raise_for_status()  # surface submission errors (401, 403, 429) early
job = r.json()
# {"id": "...", "status": "running", "statusUrl": "...", "url": "..."}

# Poll the status URL until the job finishes, fails, or max_wait seconds elapse
def poll(status_url, interval=5, max_wait=120):
    deadline = time.time() + max_wait
    while time.time() < deadline:
        data = requests.get(status_url, timeout=30).json()
        if data["status"] == "finished":
            return data["response"]["body"]
        if data["status"] == "failed":
            raise RuntimeError(f"Job failed: {data.get('failReason')}")
        time.sleep(interval)
    raise TimeoutError("Job did not finish in time")

html = poll(job["statusUrl"])

Finished job response shape:

{
  "id": "...",
  "status": "finished",
  "statusUrl": "...",
  "url": "https://example.com/product/123",
  "response": {
    "headers": { "content-type": "text/html", "sa-final-url": "...", "sa-statuscode": "200" },
    "body": "<!doctype html>...",
    "statusCode": 200
  }
}
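To make the shape above concrete, here is a small sketch that pulls the target's status code and body out of a job payload. `extract_result` is a hypothetical name; the field names are the ones shown in the example response.

```python
from typing import Any, Optional


def extract_result(job: dict[str, Any]) -> Optional[tuple[int, str]]:
    """Return (statusCode, body) for a finished job, or None if it is not finished."""
    if job.get("status") != "finished":
        return None
    response = job["response"]
    # statusCode is the target site's status, not the status of the Async API call.
    return response["statusCode"], response["body"]
```

Returning the status code alongside the body lets the caller tell a finished job with a 200 apart from a finished job where the target answered 403.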

Batch Jobs (up to 50,000 URLs)

jobs = requests.post(
    "https://async.scraperapi.com/batchjobs",
    json={
        "apiKey": API_KEY,
        "urls": [
            "https://example.com/page/1",
            "https://example.com/page/2",
            # ... up to 50,000
        ],
        "apiParams": {"country_code": "us"}
    }
).json()
# Returns a list of {id, status, statusUrl, url} — one per submitted URL

results = [poll(job["statusUrl"]) for job in jobs]

For workloads over 50,000 URLs, split into multiple batch requests. Use webhooks (below) instead of polling when batches are large — polling 10,000 status URLs serially is slow.
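One way to follow the advice above is sketched here: split the URL list into chunks no larger than the 50,000 limit, and poll status URLs from a thread pool rather than serially. The names `chunk_urls` and `poll_all` are hypothetical, and the worker count of 8 is an arbitrary assumption, not a documented concurrency limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, Sequence

MAX_BATCH_SIZE = 50_000  # per-request limit stated in this document


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs, one per batch request."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


def poll_all(
    status_urls: Sequence[str],
    poll_fn: Callable[[str], str],
    max_workers: int = 8,  # assumption: tune to your plan's limits
) -> list[str]:
    """Run poll_fn over every status URL concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() re-raises the first exception from poll_fn when results are consumed.
        return list(pool.map(poll_fn, status_urls))
```

Each chunk would be posted to the `batchjobs` endpoint, and `poll_all([j["statusUrl"] for j in jobs], poll)` would reuse the `poll` function from the Single Job section.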

Webhook Callbacks

Use webhooks to receive results without polling. ScraperAPI POSTs the completed job payload to your URL when the scrape finishes.

requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": API_KEY,
        "url":    "https://example.com/",
        "callback": {
            "type": "webhook",
            "url":  "https://yourapp.com/scraperapi/callback"
        }
    }
)

Webhook mechanics:

By default, only successful jobs trigger the callback.
Set "expectUnsuccessReport": true to also receive failed job payloads.
ScraperAPI retries delivery 3 times; if all fail, the job is cancelled.
Webhook URL must be publicly accessible.
For testing without a server, use webhook.site.

Failed job callback payload:

{
  "id": "...",
  "attempts": 50,
  "status": "failed",
  "failReason": "failed_due_to_timeout",
  "url": "https://example.com/"
}
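A minimal receiver for these callbacks could look like the sketch below, using only the Python standard library. The handler name, the port, and the choice to answer 200 on success are assumptions; this document does not specify which response ScraperAPI treats as a successful delivery.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any


def summarize_callback(payload: dict[str, Any]) -> str:
    """Produce a one-line summary of a finished or failed job payload."""
    job_id = payload.get("id")
    if payload.get("status") == "finished":
        return f"{job_id}: finished"
    return f"{job_id}: {payload.get('status')} ({payload.get('failReason')})"


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_response(400)  # body was not a JSON object
            self.end_headers()
            return
        print(summarize_callback(payload))
        self.send_response(200)  # assumption: a 2xx reply acknowledges delivery
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block forever, handling callbacks on the given port (hypothetical default)."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling `serve()` starts the listener. Failed-job payloads only arrive if the job was submitted with `"expectUnsuccessReport": true`.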

All Request Body Parameters

{
  "apiKey":               "YOUR_KEY",
  "url":                  "https://example.com",
  "urls":                 ["url1", "url2"],
  "method":               "GET",
  "headers":              { "Accept-Language": "en-US" },
  "body":                 "foo=bar",
  "callback":             { "type": "webhook", "url": "https://..." },
  "expectUnsuccessReport": false,
  "timeoutSec":           600,
  "meta":                 { "jobLabel": "batch-42" },
  "apiParams": {
    "autoparse":          false,
    "country_code":       "us",
    "keep_headers":       false,
    "device_type":        "desktop",
    "follow_redirect":    true,
    "premium":            false,
    "ultra_premium":      false,
    "render":             false,
    "wait_for_selector":  ".content",
    "screenshot":         false,
    "retry_404":          false,
    "output_format":      "html",
    "max_cost":           10
  }
}

Async-exclusive parameters

Parameter	Type	Purpose
`expectUnsuccessReport`	boolean	Receive webhook payload for failed jobs too
`timeoutSec`	integer	Override default job timeout (seconds)
`meta`	object	Custom metadata — echoed back in every response/callback for correlation

meta is especially useful for tracking which batch or workflow a job belongs to:

{ "meta": { "batchId": "run-2024-06", "sourceFile": "urls.csv" } }
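As an illustration of that correlation, the sketch below groups job payloads by a `meta` field. It assumes the echoed metadata appears under a top-level `meta` key in each payload, which this document implies but does not show; `group_by_meta` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Optional


def group_by_meta(
    jobs: list[dict[str, Any]],
    key: str = "batchId",
) -> dict[Optional[str], list[str]]:
    """Map each meta[key] value to the ids of the jobs that carry it."""
    groups: dict[Optional[str], list[str]] = defaultdict(list)
    for job in jobs:
        # Jobs submitted without meta are collected under None.
        label = (job.get("meta") or {}).get(key)
        groups[label].append(job["id"])
    return dict(groups)
```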

Passing a POST request to the target site

requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey":  API_KEY,
        "url":     "https://api.example.com/search",
        "method":  "POST",
        "headers": {"content-type": "application/x-www-form-urlencoded"},
        "body":    "query=scraperapi&page=1",
    }
)
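Hand-writing the form body is error-prone once values contain spaces or ampersands. A small sketch, assuming the target expects `application/x-www-form-urlencoded`, builds the same payload with `urllib.parse.urlencode`; `build_post_job` is a hypothetical helper name.

```python
from typing import Any
from urllib.parse import urlencode


def build_post_job(api_key: str, url: str, fields: dict[str, Any]) -> dict[str, Any]:
    """Build an async job body that forwards a form-encoded POST to the target."""
    return {
        "apiKey": api_key,
        "url": url,
        "method": "POST",
        "headers": {"content-type": "application/x-www-form-urlencoded"},
        # urlencode percent-escapes reserved characters in keys and values.
        "body": urlencode(fields),
    }
```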

Binary Responses (PDFs and Images)

When the target URL returns binary content, the response body is Base64-encoded in response.base64EncodedBody.

import base64

r = requests.post(
    "https://async.scraperapi.com/jobs",
    json={"apiKey": API_KEY, "url": "https://example.com/report.pdf"}
)
job = r.json()

# ... wait or poll ...
result = requests.get(job["statusUrl"]).json()
pdf_bytes = base64.b64decode(result["response"]["base64EncodedBody"])
with open("report.pdf", "wb") as f:
    f.write(pdf_bytes)
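When a pipeline mixes HTML and binary targets, a single accessor keeps the calling code uniform. The sketch below assumes that text responses carry `body` and binary responses carry `base64EncodedBody`, as described in this document; `body_bytes` is a hypothetical name.

```python
import base64
from typing import Any


def body_bytes(response: dict[str, Any]) -> bytes:
    """Return the scraped payload as bytes, whether it arrived as text or Base64."""
    encoded = response.get("base64EncodedBody")
    if encoded is not None:
        return base64.b64decode(encoded)
    # Assumption: text bodies can be treated as UTF-8.
    return response["body"].encode("utf-8")
```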

Retention Policy

Job results are stored for up to 72 hours (24 hours guaranteed) after the job finishes. After that, the data is deleted — resubmit the job if you need it again.

Retrieve results before the retention window closes. For long pipelines, prefer webhooks so results are pushed to your system immediately upon completion.
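For polling-based pipelines, the window can be tracked explicitly. This sketch computes the guaranteed and maximum retrieval deadlines from the time the caller first saw the job as finished; that timestamp is recorded by the caller, since the response shape in this document does not include one. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

GUARANTEED_RETENTION = timedelta(hours=24)
MAX_RETENTION = timedelta(hours=72)


def retrieval_window(finished_at: datetime) -> tuple[datetime, datetime]:
    """Return (guaranteed_until, possibly_until) for a job finished at finished_at."""
    return finished_at + GUARANTEED_RETENTION, finished_at + MAX_RETENTION


def is_guaranteed_available(finished_at: datetime, now: datetime) -> bool:
    """True while the result is still inside the guaranteed 24-hour window."""
    return now <= finished_at + GUARANTEED_RETENTION
```

Planning around the 24-hour figure rather than the 72-hour one is the conservative choice, because only the shorter window is guaranteed.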

Error Handling

Status	Meaning	Action
Job `finished`, `statusCode: 200`	Success	Use `response.body`
Job `finished`, `statusCode: 403`	Target blocked the scrape	Retry with `premium: true` in `apiParams`
Job `failed`, `failReason: failed_due_to_timeout`	Timed out after 24h retries	Check if target is reachable; try `render: false`
HTTP 401 on submission	Bad API key	Check `SCRAPERAPI_API_KEY`
HTTP 403 on submission	Out of credits or plan limit	Check dashboard
HTTP 429 on submission	Too many concurrent submissions	Back off and re-submit in batches

Use max_cost in apiParams to cap per-request credit spend — requests that would exceed the cap return a 403 rather than consuming more credits than expected.
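The job-level rows of the table can be encoded as a small dispatcher, sketched below. The action labels (`use_body`, `retry_premium`, and so on) are made up for illustration, and any case the table does not list falls through to `inspect`.

```python
from typing import Any


def next_action(job: dict[str, Any]) -> str:
    """Map a job payload to a suggested next step, following the table above."""
    status = job.get("status")
    if status == "finished":
        code = job["response"]["statusCode"]
        if code == 200:
            return "use_body"
        if code == 403:
            return "retry_premium"  # resubmit with premium: true in apiParams
        return "inspect"  # status code not covered by the table
    if status == "failed":
        if job.get("failReason") == "failed_due_to_timeout":
            return "check_target"
        return "inspect"
    return "keep_polling"  # still running
```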

Credit Cost

The Async API uses the same credit costs as the Standard API:

Request type	Credits
Standard	1
`render: true`	10
`premium: true`	10
`ultra_premium: true`	30
Failed requests	0

Async jobs that fail after exhausting all retries are not charged.
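For budgeting a batch, the table translates into a simple upper-bound estimate. The sketch below covers only the single-flag rows listed above; combined flags are not priced in this document, so the hypothetical helper raises instead of guessing.

```python
from typing import Any

CREDITS = {"standard": 1, "render": 10, "premium": 10, "ultra_premium": 30}


def credits_per_request(api_params: dict[str, Any]) -> int:
    """Return the per-request credit cost for the flags set in api_params."""
    flags = [name for name in ("render", "premium", "ultra_premium") if api_params.get(name)]
    if len(flags) > 1:
        # Combined pricing is not given in the table, so refuse to estimate.
        raise ValueError(f"no listed price for combined flags: {flags}")
    return CREDITS[flags[0]] if flags else CREDITS["standard"]


def estimate_batch(n_urls: int, api_params: dict[str, Any]) -> int:
    """Upper bound on credits for a batch; failed jobs cost 0, so actual spend may be lower."""
    return n_urls * credits_per_request(api_params)
```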
