apify-mcpc
Finds, evaluates, and runs Apify Actors using the mcpc CLI. Searches Apify Store, compares Actors by stats and ratings, reads input schemas to build correct inputs, runs Actors via call-actor, and retrieves results. Covers 8 marketing intelligence use cases (audience analysis, brand monitoring, competitor intelligence, content analytics, influencer discovery, lead generation, market research, trend analysis) plus multi-actor workflow patterns with domain-specific Actor suggestions and gotchas. Use when user wants to scrape data, extract information from websites, run automation, find tools on Apify platform, or search Apify/Crawlee documentation. Do NOT use for developing new Actors (use apify-actor-development skill instead).
Slash command: /apify-mcpc:apify-mcpc
Files in this skill:
- README.md
- references/issue-reporting.md
- references/use-cases/audience-analysis.md
- references/use-cases/brand-monitoring.md
- references/use-cases/competitor-intelligence.md
- references/use-cases/content-analytics.md
- references/use-cases/influencer-discovery.md
- references/use-cases/lead-generation.md
- references/use-cases/market-research.md
- references/use-cases/multi-actor-workflows.md
- references/use-cases/trend-analysis.md
- scripts/check_apify.sh
Run before first use in a session:
bash ${CLAUDE_PLUGIN_ROOT}/skills/apify-mcpc/scripts/check_apify.sh
The script checks: mcpc CLI installed, OAuth profile for mcp.apify.com, active @apify and @apify-docs sessions. If it fails:
npm i -g @apify/mcpc
mcpc mcp.apify.com login
mcpc @apify restart
Run these at the start of every session to check mcpc status:
mcpc --version
mcpc
ALWAYS include the -H header in ALL mcpc @apify and mcpc @apify-docs calls for Apify usage analytics (same pattern as run_actor.js User-Agent in other Apify skills):
-H "User-Agent: apify-agent-skills/apify-mcpc-1.4.1/<action>"
Replace <action> with a short label describing the call: search, fetch_details, call_actor, get_output, get_run, rag_browser, search_docs, fetch_docs.
Example:
mcpc -H "User-Agent: apify-agent-skills/apify-mcpc-1.4.1/search" @apify tools-call search-actors keywords:="instagram"
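A minimal sketch of a helper that builds the header value, so the version string lives in one place. The function name `apify_ua` and the variable `SKILL_UA_PREFIX` are hypothetical and not part of mcpc. Rejecting labels outside the list above is a design choice of this sketch, not a documented requirement.

```shell
# Hypothetical helper (not part of mcpc): builds the analytics header value.
SKILL_UA_PREFIX="apify-agent-skills/apify-mcpc-1.4.1"

apify_ua() {
  # $1 = action label describing the call
  case "$1" in
    search|fetch_details|call_actor|get_output|get_run|rag_browser|search_docs|fetch_docs)
      printf 'User-Agent: %s/%s\n' "$SKILL_UA_PREFIX" "$1"
      ;;
    *)
      # Unknown label: report it on stderr and fail
      echo "unknown action label: $1" >&2
      return 1
      ;;
  esac
}
```

Usage, assuming the helper is defined in the current shell: `mcpc -H "$(apify_ua search)" @apify tools-call search-actors keywords:="instagram"`.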
Available tools and arguments (for building correct mcpc calls):
mcpc --help
When possible, match the user's intent to a use case. Each file has suggested Actors, search keywords, and domain-specific gotchas.
| Use Case | File | When |
|---|---|---|
| Audience Analysis | audience-analysis.md | Demographics, follower behavior, engagement quality |
| Brand Monitoring | brand-monitoring.md | Reviews, ratings, sentiment, brand mentions |
| Competitor Intelligence | competitor-intelligence.md | Competitor strategies, ads, pricing, positioning |
| Content Analytics | content-analytics.md | Engagement metrics, campaign ROI, content performance |
| Influencer Discovery | influencer-discovery.md | Find influencers, verify authenticity, partnerships |
| Lead Generation | lead-generation.md | B2B/B2C leads, contact enrichment, prospecting |
| Market Research | market-research.md | Market conditions, pricing, geographic opportunities |
| Trend Analysis | trend-analysis.md | Emerging trends, viral content, content strategy |
| Multi-Actor Workflows | multi-actor-workflows.md | Chaining Actors, handle discovery via SERP, enrichment |
Important: Actor tables in use-case files are suggestions, not closed lists. If a suggested Actor fails or does not fit, use search-actors to find the latest options.
All tool arguments use := (no spaces around it). Values are auto-parsed: valid JSON becomes that type, otherwise treated as string.
keywords:="web scraper" # string
limit:=5 # number (valid JSON)
enabled:=true # boolean (valid JSON)
input:='{"startUrls":[...]}' # object — entire JSON as single-quoted string
If the first argument starts with { or [, the whole input is parsed as one JSON structure (not key-value pairs). Pipe JSON via stdin as alternative.
mcpc 'mcp.apify.com?tools=docs' connect @apify-docs
Tools: search-apify-docs, fetch-apify-docs — for searching Apify/Crawlee documentation.
Authenticate via OAuth (opens browser):
mcpc mcp.apify.com login
Then create a persistent session:
mcpc mcp.apify.com connect @apify
Tools: search-actors, fetch-actor-details, call-actor, get-actor-output, get-actor-run, apify-slash-rag-web-browser + docs tools.
If the task requires Actors and user is not authenticated:
mcpc mcp.apify.com login (opens browser for OAuth)
For one-off web searches or URL fetching, use the dedicated tool directly. There is no need for the full search → call workflow:
mcpc @apify tools-call apify-slash-rag-web-browser query:="san francisco weather" maxResults:=10
mcpc @apify tools-call apify-slash-rag-web-browser query:="https://www.example.com"
Parameters: query (required), maxResults (default 3), outputFormats (array: "markdown" default, "text", "html"). Use outputFormats:='["text"]' to save tokens.
Use when: user wants immediate data ("get me info about X", "fetch this URL"), not a specialized scraper.
This workflow is the core of this skill. ALWAYS follow it. NEVER skip steps. NEVER run an Actor without completing the preceding steps.
ALWAYS determine scope BEFORE doing anything else. Wrong scope = wasted tokens and money.
Mandatory gates — these are HARD STOPS, not suggestions:
- [ ] Step 1: Search for Actors (ALWAYS at least 2 searches: broad then narrow)
- [ ] Step 2: Fetch details + input schema (ALWAYS read README)
- [ ] GATE: STOP — show user the Actor choice + planned input. Wait for confirmation.
- [ ] Step 3: Build input from schema + README
- [ ] Step 4: Validate inputs (ALWAYS challenge assumptions about identifiers)
- [ ] GATE: STOP — if ANY input value is guessed, tell user. Do NOT run with unverified inputs.
- [ ] Step 5: Run the Actor (test run with limit=1 FIRST when uncertain)
- [ ] Step 6: ALWAYS verify results — zero results = wrong input, not "no data"
- [ ] Step 7: Get full results and present
# First search: broad (platform name only)
mcpc @apify tools-call search-actors keywords:="instagram"
# Second search: narrow if needed (platform + data type)
mcpc @apify tools-call search-actors keywords:="instagram profile"
- Search broad before narrow: "TikTok" before "TikTok posts".
- Identify each Actor by its fullName field (e.g., apify/website-content-crawler).
- search-actors parameters: keywords (string), limit (1–100, default 10), offset (default 0), category (string; leave empty for all categories).
fetch-actor-details returns the description, README, all stats, AND the full input schema in one call. The parameter is actor (not actorId):
mcpc @apify tools-call fetch-actor-details actor:="apify/website-content-crawler"
Use output parameter to select specific sections and save tokens:
mcpc @apify tools-call fetch-actor-details actor:="apify/website-content-crawler" output:='{"inputSchema": true}'
mcpc @apify tools-call fetch-actor-details actor:="apify/website-content-crawler" output:='{"readme": true}'
Valid output fields (additionalProperties: false — typo = error -32602):
| Field | Returns |
|---|---|
| description | Actor description text |
| stats | actorInfo.stats: totalUsers, monthlyUsers, successRate, bookmarks |
| pricing | actorInfo.pricing: model (FREE/PAY_PER_EVENT/PRICE_PER_DATASET_ITEM/FLAT_PRICE_PER_MONTH), isFree, and for PAY_PER_EVENT: events[] with per-tier costs (FREE/BRONZE/SILVER/GOLD/PLATINUM/DIAMOND tiers) |
| rating | actorInfo.rating: average (out of 5), count |
| metadata | actorInfo: developer, categories, modifiedAt, isDeprecated |
| inputSchema | Full JSON Schema of input parameters |
| readme | README summary (or full if no summary) |
| outputSchema | Inferred output schema from recent runs |
| mcpTools | MCP tools list (only for MCP server Actors) |
Default (no output): all fields except mcpTools. With output: only true fields return data — false flags are ignored (same as omitting them). output:='{}' returns the default.
Always-present fields in actorInfo (no flag needed): title, url, fullName, pictureUrl.
For structured access, use --json + jq:
# Success rate
mcpc @apify tools-call fetch-actor-details actor:="apify/website-content-crawler" output:='{"stats": true}' --json | jq '.structuredContent.actorInfo.stats.successRate'
# Stats + rating together
mcpc @apify tools-call fetch-actor-details actor:="apify/website-content-crawler" output:='{"stats": true, "rating": true}' --json | jq '.structuredContent.actorInfo | {stats, rating}'
- Check isDeprecated: never use deprecated Actors.
- Read properties, required, and prefill for each field.
The schema alone is not enough. ALWAYS cross-reference it with the README to understand what values the Actor actually expects:
- Check the required array.
- Read each field's description: it often specifies exact formats, allowed values, or constraints not captured by type alone (e.g., "URL must include protocol", "comma-separated list", "ISO country code").
- Check prefill values: use them as a starting point and adapt to the user's actual needs.
- Read the README (output:='{"readme": true}'): it typically has usage examples with realistic inputs.
- When showing the planned input to the user, if an input field matches /(cookie|token|password|secret|auth|key)/i, replace the value with <REDACTED> and confirm only the type/format needed. Never display credentials in chat. Wait for explicit approval.
Before running the Actor, critically evaluate whether your input values make sense in the context of how the target service actually works.
If you cannot confidently verify an input value, tell the user what you're uncertain about and suggest how to resolve it before spending money on a scrape.
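The credential-redaction pattern from the input-building step can be sketched as a small shell check. This is an illustration under one stated assumption: the pattern is applied to input field names (the source does not say whether it also applies to values). The function names `is_sensitive_key` and `redact_value` are hypothetical.

```shell
# Assumption: the pattern is matched against field names, case-insensitively.
is_sensitive_key() {
  # $1 = input field name; returns 0 when it matches the redaction pattern
  printf '%s\n' "$1" | grep -Eiq 'cookie|token|password|secret|auth|key'
}

redact_value() {
  # $1 = field name, $2 = value; prints <REDACTED> for sensitive fields
  if is_sensitive_key "$1"; then
    printf '%s\n' '<REDACTED>'
  else
    printf '%s\n' "$2"
  fi
}
```

Note that the pattern is deliberately broad: a field named `keywords` contains `key` and is therefore redacted too. Over-redaction only hides a harmless value, whereas a narrower pattern could leak a credential.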
mcpc @apify tools-call call-actor actor:="apify/website-content-crawler" \
input:='{"startUrls":[{"url":"https://example.com"}],"proxyConfiguration":{"useApifyProxy":true}}' \
callOptions:='{"timeout": 300}'
callOptions accepts timeout (seconds, 0 = infinite) and memory (MB, power of 2, 128–32768).
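The memory constraint above (a power of 2 between 128 and 32768 MB) can be checked before the call rather than discovered through a validation error. A minimal sketch; the function name `valid_memory_mb` is hypothetical and not part of mcpc.

```shell
valid_memory_mb() {
  # $1 = memory in MB; returns 0 when it is a power of 2 within 128..32768
  case "$1" in
    ''|*[!0-9]*|0*) return 1 ;;   # reject empty, non-numeric, or zero-prefixed input
  esac
  # A power of 2 has exactly one bit set, so n & (n - 1) is 0
  [ "$1" -ge 128 ] && [ "$1" -le 32768 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}
```

For example, `valid_memory_mb 1024` succeeds, while `valid_memory_mb 1000` and `valid_memory_mb 64` fail.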
Use previewOutput:=false to skip inline items and save tokens — then fetch selectively with get-actor-output:
mcpc @apify tools-call call-actor actor:="<actor>" input:='...' previewOutput:=false --json \
| jq '.structuredContent | {runId, datasetId, itemCount}'
Sync response (structuredContent): runId, datasetId, itemCount, items[], instructions.
Async response (async:=true): runId, actorName, status, startedAt, input. No cost data — _meta is only returned for sync runs.
Cost tracking (sync only): --json response has _meta at the JSON top-level (not inside structuredContent): jq '._meta.usageTotalUsd' returns total run cost in USD.
On input validation error, go back to Step 3.
By default, call-actor runs synchronously (waits for completion). For long-running Actors, use async:=true — poll status with get-actor-run.
Save runId immediately when using async:=true — sessions can reset mid-run:
mcpc ... async:=true --json | jq -r '.structuredContent.runId' | tee run.id
mcpc @apify tools-call get-actor-run runId:="$(cat run.id)" # resume/check
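Polling an async run could look roughly like the sketch below. The terminal status names (SUCCEEDED, FAILED, ABORTED, TIMED-OUT) are taken from the Apify platform's run lifecycle and are an assumption here, since this document does not list them. The function names and the 10-second default interval are hypothetical.

```shell
is_terminal_status() {
  # $1 = run status string; returns 0 when the run will not change state again
  case "$1" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) return 0 ;;
    *) return 1 ;;
  esac
}

wait_for_run() {
  # $1 = run ID, $2 = seconds between polls (default 10)
  while :; do
    status=$(mcpc -H "User-Agent: apify-agent-skills/apify-mcpc-1.4.1/get_run" \
      @apify tools-call get-actor-run runId:="$1" --json \
      | jq -r '.structuredContent.status')
    # Stop instead of looping forever if the call or the parse failed
    if [ -z "$status" ] || [ "$status" = "null" ]; then
      echo "could not read status for run $1" >&2
      return 1
    fi
    echo "run $1: $status" >&2
    if is_terminal_status "$status"; then
      printf '%s\n' "$status"
      return 0
    fi
    sleep "${2:-10}"
  done
}
```

With the run ID saved as above, `wait_for_run "$(cat run.id)"` prints the final status once the run finishes.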
Cost tracking: auto-logged to ~/.apify-costs.log per call-actor --json run. Check with /apify-status.
Before presenting results to the user or fetching the full dataset, ALWAYS do a sanity check on the initial output:
When uncertain about inputs, run a cheap test first — get 1 result and verify before full scrape:
mcpc @apify tools-call fetch-actor-details actor:="<actor>" output:='{"inputSchema": true}' --json \
| jq '[.structuredContent.inputSchema.properties | to_entries[] | select(.key | test("max|limit"; "i")) | {key, title: .value.title}]'
mcpc @apify tools-call call-actor actor:="<actor>" \
input:='{"<maxResultsField>": 1, ...}' \
callOptions:='{"timeout": 120}'
callOptions.timeout (seconds) auto-aborts the run if the Actor ignores its limit. There is no abort tool in mcpc — for manual abort use Apify REST API: POST https://api.apify.com/v2/actor-runs/{runId}/abort?token={APIFY_TOKEN}.
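The manual abort call above can be wrapped in a small function. A sketch that assumes `curl` is installed and `APIFY_TOKEN` is exported; the function names are hypothetical, and the URL is the one given above.

```shell
apify_abort_url() {
  # $1 = run ID, $2 = API token; prints the abort endpoint URL
  printf 'https://api.apify.com/v2/actor-runs/%s/abort?token=%s\n' "$1" "$2"
}

abort_run() {
  # $1 = run ID; requires APIFY_TOKEN in the environment
  : "${APIFY_TOKEN:?APIFY_TOKEN is not set}"
  # -f: fail on HTTP errors, -sS: quiet but still print errors
  curl -fsS -X POST "$(apify_abort_url "$1" "$APIFY_TOKEN")"
}
```

Usage: `abort_run "$(cat run.id)"`. Because the token travels in the query string, avoid echoing the full URL into logs or chat.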
NEVER load full datasets into context. Large responses flood the context and degrade performance. Always follow the three-step pattern below.
mcpc @apify tools-call get-actor-output datasetId:="<id>" limit:=1 --json \
| jq '.structuredContent | {totalItemCount, fields: (.items[0] | keys)}'
This tells you: how many items exist and what fields are available — without loading any real data.
Before saving or presenting, spot-check a few items:
mcpc @apify tools-call get-actor-output datasetId:="<id>" limit:=5
Read the output. Verify the data matches the expected entity (see Step 6). If it looks wrong, re-run from Step 3 — don't save bad data.
Once the sample looks correct, save the full dataset to a local file. The > redirect sends data to disk; nothing enters the context.
# All items, all fields
mcpc @apify tools-call get-actor-output datasetId:="<id>" --json \
| jq '.structuredContent.items' > results.json
# Selected fields only (reduces file size)
mcpc @apify tools-call get-actor-output datasetId:="<id>" \
fields:="title,url,price" --json \
| jq '.structuredContent.items' > results.json
# Large dataset — paginate into one file
for offset in 0 100 200; do
mcpc @apify tools-call get-actor-output datasetId:="<id>" \
limit:=100 offset:=$offset --json \
| jq -c '.structuredContent.items[]' >> results.jsonl
done
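The hardcoded offsets in the loop above only cover 300 items. A sketch of deriving the offsets from totalItemCount instead (as reported by the limit:=1 probe); the function name `page_offsets` is hypothetical.

```shell
page_offsets() {
  # $1 = total item count, $2 = page size; prints one offset per line
  if [ "$2" -le 0 ]; then
    echo "page size must be positive" >&2
    return 1
  fi
  offset=0
  while [ "$offset" -lt "$1" ]; do
    echo "$offset"
    offset=$((offset + $2))
  done
}
```

Usage: `for offset in $(page_offsets "$total" 100); do ...; done`, where `$total` holds the totalItemCount value. Delete any existing results.jsonl first, because `>>` appends and a second run would duplicate items.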
Then confirm the save and summarize to the user:
jq 'length, .[0:2]' results.json # item count + 2 examples
Other formats — get-actor-output is JSON only. For CSV/Excel/XML download directly: https://api.apify.com/v2/datasets/{datasetId}/items?format=csv&clean=true&token=${APIFY_TOKEN} (formats: csv, xlsx, jsonl, xml).
fields gotcha: dot notation flattens keys — fields:="crawl.httpStatusCode" returns {"crawl.httpStatusCode": 200}, not {"crawl": {"httpStatusCode": 200}}. In jq use ."crawl.httpStatusCode" (quoted).
get-actor-output response (structuredContent): datasetId, items[], itemCount, totalItemCount, offset, limit.
If the run was async or is still in progress, check status first:
mcpc @apify tools-call get-actor-run runId:="<runId>" --json | jq '.structuredContent | {status, datasetId: .dataset.datasetId, itemCount: .dataset.itemCount}'
get-actor-run response (structuredContent): runId, actorName, status, startedAt, finishedAt, stats (runTimeSecs, computeUnits, memMaxBytes, ...), dataset (datasetId, itemCount, schema, previewItems[]).
Check the use-case file for suggested follow-up workflows after presenting results.
- The Actor parameter is actor, not actorId.
- To authenticate and connect: mcpc mcp.apify.com login, then mcpc mcp.apify.com connect @apify.
The input schema from fetch-actor-details is standard JSON Schema, but with Apify-specific conventions:
- prefill: example values provided by the Actor author. Use as a starting point for building input.
- sectionCaption / sectionDescription: group related fields; read these to understand field purpose in context.
- editor: hints at expected format (e.g., "requestListSources" means the field expects [{"url": "..."}] objects, not plain strings).
startUrls pattern: most scrapers use startUrls, which is NOT an array of strings. It's an array of request objects:
// Correct
{"startUrls": [{"url": "https://example.com"}, {"url": "https://example.com/page2"}]}
// Wrong — plain strings don't work
{"startUrls": ["https://example.com"]}
Field naming — Actors don't follow a single convention. Common variants:
- maxItems, maxResults, maxCrawlPages, resultsLimit: always check the schema
- searchTerms, queries, keywords, search: varies per Actor
Social handles: typically without @: "apify" not "@apify"
Proxy — when needed: "proxyConfiguration": {"useApifyProxy": true}
search-actors already returns stats, rating, pricing, isDeprecated, developer per actor — enough for comparison without extra calls. Use fetch-actor-details only when you need inputSchema, readme, outputSchema, or modifiedAt.
Decision order (eliminate top-down):
- isDeprecated = true → eliminate immediately
- successRate < 80% → red flag, skip unless only option
- pricing.model → prefer FREE > PAY_PER_EVENT > PRICE_PER_DATASET_ITEM (cheapest that works)
- totalUsers > 100 → battle-tested, safe bet
- apify/ namespace → official, often more reliable
- modifiedAt more recent → maintained (only via fetch-actor-details with metadata flag)
Selection rule: Pick the cheapest Actor that meets functional requirements.
Fallback chain: If the chosen Actor fails (error, bad data, deprecated), try the next candidate. Don't ask the user before trying an alternative — just report what happened. Only stop and ask if all candidates fail.
For Apify/Crawlee documentation questions without needing Actors:
mcpc @apify-docs tools-call search-apify-docs docSource:="apify" query:="proxy configuration" limit:=5
mcpc @apify-docs tools-call fetch-apify-docs url:="https://docs.apify.com/platform/proxy"
- docSource: "apify", "crawlee-js", or "crawlee-py"
- query: keywords only, not sentences
- limit: 1–20 (default 5), offset: for pagination (default 0)
mcpc responses can be large. Use these techniques to get only what you need:
| Situation | Technique |
|---|---|
| Need only input schema, not README/stats | fetch-actor-details ... output:='{"inputSchema": true}' |
| Need only README | fetch-actor-details ... output:='{"readme": true}' |
| Dataset has many fields, need only some | get-actor-output ... fields:="title,url,price" |
| Large dataset, need a sample | get-actor-output ... limit:=10 |
| Paginate through results | get-actor-output ... offset:=10 limit:=10 |
Client-side filtering (--json + jq): use when server-side filtering isn't enough or for scripting:
# Extract actor names from search
mcpc @apify tools-call search-actors keywords:="instagram scraper" --json | \
jq -r '.structuredContent.actors[].fullName'
# Quick comparison from search (no extra API calls)
mcpc @apify tools-call search-actors keywords:="instagram scraper" limit:=5 --json | \
jq '[.structuredContent.actors[] | {fullName, successRate: .stats.successRate, rating: .rating.average, users: .stats.totalUsers, pricing: .pricing.model}]'
# Deep comparison with README (only when needed)
mcpc @apify tools-call search-actors keywords:="instagram scraper" --json | \
jq -r '.structuredContent.actors[].fullName' | while read -r name; do
mcpc @apify tools-call fetch-actor-details actor:="$name" output:='{"stats": true, "rating": true, "metadata": true, "pricing": true}' --json | \
jq '.structuredContent'
done
JSON responses wrap tool output in {content, structuredContent} — always access data via .structuredContent.
- Omit output to get an overview. Follow up with output:='{"inputSchema": true}' or output:='{"readme": true}' to dig deeper without re-fetching everything.
- Use fields if you know which columns matter. Use limit for a first peek, then fetch more if needed.
- Use --json + jq to extract stats programmatically when comparing 3+ candidates.
This skill handles one-shot runs. For recurring/scheduled scraping:
This skill does NOT create schedules programmatically — that requires Apify API, not mcpc.
npx claudepluginhub chocholous/apify-mcpc --plugin apify-mcpc