From apify-pack
Sets up local Apify Actor development with CLI and Crawlee: create projects, configure actor.json/inputs, test via apify run emulating platform storage.
Slash command: /apify-pack:apify-local-dev-loop
Build and test Apify Actors on your local machine before deploying to the platform. Uses the Apify CLI (`apify run`) which emulates the platform environment locally, creating local storage directories for datasets, key-value stores, and request queues.
Prerequisites: the Apify CLI installed globally (npm install -g apify-cli) and apify login completed with a valid token.
my-actor/
├── .actor/
│ ├── actor.json # Actor metadata and config
│ └── INPUT_SCHEMA.json # Input schema (auto-generates UI on platform)
├── src/
│ └── main.ts # Entry point
├── storage/ # Created by apify run (git-ignored)
│ ├── datasets/default/
│ ├── key_value_stores/default/
│ └── request_queues/default/
├── package.json
└── tsconfig.json
# Create from template (interactive)
apify create my-actor
# Or create from specific template
apify create my-actor --template project_cheerio_crawler_ts
# Templates: project_empty, project_cheerio_crawler_ts,
# project_playwright_crawler_ts, project_puppeteer_crawler_ts
Configure .actor/actor.json:
{
"actorSpecification": 1,
"name": "my-actor",
"title": "My Actor",
"description": "Scrapes data from example.com",
"version": "0.1",
"meta": {
"templateId": "project_cheerio_crawler_ts"
},
"input": "./INPUT_SCHEMA.json",
"dockerfile": "./Dockerfile",
"storages": {
"dataset": {
"actorSpecification": 1,
"title": "Scraped items",
"views": {
"overview": {
"title": "Overview",
"transformation": { "fields": ["url", "title", "h1"] },
"display": {
"component": "table",
"properties": {
"url": { "label": "URL", "format": "link" },
"title": { "label": "Title" },
"h1": { "label": "Heading" }
}
}
}
}
}
}
}
Define the input in .actor/INPUT_SCHEMA.json:
{
"title": "My Actor Input",
"type": "object",
"schemaVersion": 1,
"properties": {
"startUrls": {
"title": "Start URLs",
"type": "array",
"description": "URLs to crawl",
"editor": "requestListSources",
"prefill": [{ "url": "https://example.com" }]
},
"maxPages": {
"title": "Max pages",
"type": "integer",
"description": "Maximum number of pages to crawl",
"default": 10,
"minimum": 1,
"maximum": 1000
}
},
"required": ["startUrls"]
}
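The schema drives the platform UI, but the same constraints can also be checked in code before the crawler starts. The following is a minimal sketch, not part of the Apify SDK: `normalizeInput` and `ActorInput` are hypothetical names, and the bounds simply mirror the `default`, `minimum`, and `maximum` values declared above.

```typescript
// Hypothetical helper: mirrors the constraints declared in INPUT_SCHEMA.json.
interface ActorInput {
    startUrls: { url: string }[];
    maxPages: number;
}

function normalizeInput(raw: unknown): ActorInput {
    const input = (raw ?? {}) as { startUrls?: unknown; maxPages?: unknown };

    // "required": ["startUrls"]
    if (!Array.isArray(input.startUrls) || input.startUrls.length === 0) {
        throw new Error('startUrls is required');
    }
    const startUrls = input.startUrls.map((entry) => {
        const url = (entry as { url?: unknown } | null)?.url;
        if (typeof url !== 'string' || url.length === 0) {
            throw new Error('each startUrls entry needs a "url" string');
        }
        return { url };
    });

    // "default": 10, "minimum": 1, "maximum": 1000
    const maxPages = input.maxPages === undefined ? 10 : input.maxPages;
    if (typeof maxPages !== 'number' || !Number.isInteger(maxPages) || maxPages < 1 || maxPages > 1000) {
        throw new Error('maxPages must be an integer between 1 and 1000');
    }
    return { startUrls, maxPages };
}
```

In src/main.ts the result of `Actor.getInput()` could be passed through a function like this, so that a malformed local INPUT.json fails early with a clear message.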
// src/main.ts
import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';
await Actor.init();
const input = await Actor.getInput<{
startUrls: { url: string }[];
maxPages?: number;
}>();
if (!input?.startUrls?.length) {
throw new Error('startUrls is required');
}
const crawler = new CheerioCrawler({
maxRequestsPerCrawl: input.maxPages ?? 10,
async requestHandler({ request, $, enqueueLinks }) {
const title = $('title').text().trim();
const h1 = $('h1').first().text().trim();
await Actor.pushData({
url: request.url,
title,
h1,
timestamp: new Date().toISOString(),
});
// Enqueue links on the same domain
await enqueueLinks({ strategy: 'same-domain' });
},
});
await crawler.run(input.startUrls.map(s => s.url));
await Actor.exit();
# Run with default input from storage/key_value_stores/default/INPUT.json
apify run
# Run with input from command line
apify run --input='{"startUrls":[{"url":"https://example.com"}],"maxPages":5}'
# View results
cat storage/datasets/default/*.json | jq '.'
# Or list dataset files
ls storage/datasets/default/
Create storage/key_value_stores/default/INPUT.json:
{
"startUrls": [{ "url": "https://example.com" }],
"maxPages": 5
}
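Creating the file by hand works; for repeatable setups the same step can be scripted. A minimal sketch using only Node's standard library follows. `writeLocalInput` is a hypothetical helper name, and the path is the one given above.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper: writes the input where `apify run` reads it by default.
function writeLocalInput(projectDir: string, input: unknown): string {
    const storeDir = path.join(projectDir, 'storage', 'key_value_stores', 'default');
    // Create storage/key_value_stores/default/ if this is the first run.
    fs.mkdirSync(storeDir, { recursive: true });
    const inputPath = path.join(storeDir, 'INPUT.json');
    fs.writeFileSync(inputPath, `${JSON.stringify(input, null, 2)}\n`);
    return inputPath;
}
```

Calling `writeLocalInput(process.cwd(), { startUrls: [{ url: 'https://example.com' }], maxPages: 5 })` from the project root would produce the file shown above.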
apify run creates a storage/ directory that mirrors platform storage:
| Platform Storage | Local Path | Access via SDK |
|---|---|---|
| Default dataset | storage/datasets/default/ | Actor.pushData() |
| Default KV store | storage/key_value_stores/default/ | Actor.setValue() / Actor.getValue() |
| Default request queue | storage/request_queues/default/ | Managed by crawler |
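Because local storage is plain files, results can be inspected from a script as well as with `cat` and `jq`. The sketch below is an illustration under two assumptions that are not stated in the text above: each pushed item lands in its own JSON file, and item files have numeric, zero-padded names such as 000000001.json. `readLocalDataset` is a hypothetical helper, not an SDK function.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper: loads every item file from a local dataset directory.
// Assumption: one item per file, with numeric file names like 000000001.json.
function readLocalDataset(datasetDir: string = path.join('storage', 'datasets', 'default')): unknown[] {
    if (!fs.existsSync(datasetDir)) return [];
    return fs
        .readdirSync(datasetDir)
        .filter((name) => /^\d+\.json$/.test(name)) // skip any non-item files
        .sort() // zero-padded names sort in the order items were pushed
        .map((name) => JSON.parse(fs.readFileSync(path.join(datasetDir, name), 'utf8')));
}
```

After an `apify run`, `readLocalDataset().length` gives a quick count of scraped items without leaving Node.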
Add scripts to package.json:
{
"scripts": {
"start": "tsx src/main.ts",
"dev": "tsx watch src/main.ts",
"test": "vitest"
}
}
# Direct tsx execution (faster iteration than apify run)
npx tsx src/main.ts
# With environment variables emulating platform
APIFY_IS_AT_HOME=0 APIFY_LOCAL_STORAGE_DIR=./storage npx tsx src/main.ts
// tests/main.test.ts
import { describe, it, expect, vi } from 'vitest';
import { Actor } from 'apify';
describe('Actor', () => {
it('should process input correctly', async () => {
vi.spyOn(Actor, 'getInput').mockResolvedValue({
startUrls: [{ url: 'https://example.com' }],
maxPages: 1,
});
const pushSpy = vi.spyOn(Actor, 'pushData').mockResolvedValue(undefined);
// Run actor logic...
// Assert pushData was called with expected shape
expect(pushSpy).toHaveBeenCalledWith(
expect.objectContaining({ url: 'https://example.com' })
);
});
});
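The test above leaves "Run actor logic..." open. One way to make that step callable is to move the per-page logic into a function that receives its dependencies as arguments. This is only a sketch of that design: `handlePage`, `PageContext`, and `PushFn` are hypothetical names, not Apify or Crawlee APIs.

```typescript
// Hypothetical shapes: a trimmed-down stand-in for the crawler context
// and for a function with the role of Actor.pushData.
interface PageContext {
    url: string;
    title: string;
    h1: string;
}
type PushFn = (item: Record<string, unknown>) => Promise<void>;

// Per-page logic with its dependencies passed in, so a test can call it directly.
async function handlePage(
    ctx: PageContext,
    push: PushFn,
    now: () => Date = () => new Date(),
): Promise<void> {
    await push({
        url: ctx.url,
        title: ctx.title.trim(),
        h1: ctx.h1.trim(),
        timestamp: now().toISOString(),
    });
}
```

In src/main.ts the request handler would build the context from `request` and `$` and pass a function that forwards to `Actor.pushData`; in a test, a spy or a plain array-collecting function takes its place, and a fixed `now` keeps the timestamp deterministic.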
| Error | Cause | Solution |
|---|---|---|
| apify: command not found | CLI not installed | npm i -g apify-cli |
| INPUT.json not found | No input provided | Create storage/key_value_stores/default/INPUT.json |
| Cannot find module 'apify' | SDK not installed | npm install apify crawlee |
| Dockerfile not found | Missing actor config | Run apify create or create .actor/actor.json |
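The file-related rows of the table can be checked before a run. The sketch below is a hypothetical preflight helper (the name `preflight` and its messages are illustrative); it only tests whether the paths mentioned in this document exist and does not detect a missing CLI.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical preflight check: reports which expected files are missing
// and repeats the matching fix from the table above.
function preflight(projectDir: string): string[] {
    const mustExist: [string, string][] = [
        ['.actor/actor.json', 'run apify create or create .actor/actor.json'],
        ['storage/key_value_stores/default/INPUT.json', 'create the file or pass --input to apify run'],
        ['node_modules/apify', 'run npm install apify crawlee'],
    ];
    const problems: string[] = [];
    for (const [relativePath, fix] of mustExist) {
        if (!fs.existsSync(path.join(projectDir, relativePath))) {
            problems.push(`${relativePath} is missing: ${fix}`);
        }
    }
    return problems;
}
```

An empty array from `preflight(process.cwd())` means none of these three causes applies.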
See apify-sdk-patterns for production-ready Actor code patterns.
Install the plugin with: npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin apify-pack