Writes, fixes, or updates Python scripts using shub_workflow base classes for Scrapy Cloud operations: scheduling spiders, querying jobs, aggregating stats, or running as crawl managers/monitors.
Slash command: `/shub-workflow-toolkit:shub-workflow-scripts`
shub-workflow scripts subclass a base class in `shub_workflow/script.py` and get, for free: argument parsing (plus reusable `-g`/`-v` "programs"), project-id resolution, the `ScrapinghubClient`, job scheduling with flow/name tagging, paginated and retrying job queries, a Scrapy stats collector, and an `FSHelper`. The deep reference is the wiki: Appendix B: Script Classes.
A brand-new file isn't yet registered in the project's `setup.py`, so the repo alone can't tell you whether it is meant for Scrapy Cloud.
When asked to "create a script", confirm with the user whether it will deploy to or operate on
Scrapy Cloud (run as an SC job, schedule spiders/scripts, scan/query SC jobs, aggregate stats,
etc.). If yes, use these base classes. If it's just a local utility with no SC interaction, a plain
script is fine and this skill doesn't apply. When editing an existing file that already imports
`shub_workflow.script`, this skill applies directly.
| Base class | Use when | Template |
|---|---|---|
| `BaseScript` | one-shot: parse args, do work in `run()`, exit (the default) | `examples/plain_script.py` |
| `BaseLoopScript` | must repeat work on an interval / run continuously until stopped | `examples/loop_script.py` |
| `BaseLoopScriptAsyncMixin` (+ `BaseLoopScript`) | the loop cycle is asyncio-based (schedules/awaits many things at once) | `examples/async_loop_script.py` |
| `ArgumentParserScript` | only argparse + PROGRAMS, no SC access (rare; the base the others build on) | none |
Projects usually add a shared base mixin (common CLI options/helpers) that every concrete script inherits — see examples/project_base_mixin.py. Check whether the project already has one and build on it rather than re-adding shared options.
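As a rough illustration of the shared-mixin idea, the sketch below uses a stand-in base class built on `argparse` so that it runs without `shub_workflow` installed. `StandInBaseScript`, `ExportScript`, the `argv` constructor parameter, and the `--dry-run`/`--limit` options are all hypothetical and are not part of the library. The mixin is a plain class here for brevity; it only shows how each `add_argparser_options()` calls `super()` first so that options accumulate along the MRO.

```python
import argparse


class StandInBaseScript:
    """Hypothetical stand-in for a framework base class (NOT shub_workflow's BaseScript)."""

    def __init__(self, argv=None):
        self.argparser = argparse.ArgumentParser(description=self.description)
        self.add_argparser_options()
        self.args = self.argparser.parse_args(argv)

    @property
    def description(self):
        return "stand-in script"

    def add_argparser_options(self):
        # Framework-level flag that subclasses must not lose.
        self.argparser.add_argument("--project-id", type=int)


class ProjectMixin:
    """Hypothetical shared mixin: options every script in the project accepts."""

    def add_argparser_options(self):
        super().add_argparser_options()  # keep the framework flags
        self.argparser.add_argument("--dry-run", action="store_true")


class ExportScript(ProjectMixin, StandInBaseScript):  # mixin first, concrete base last
    @property
    def description(self):
        return "Export something (illustrative only)."

    def add_argparser_options(self):
        super().add_argparser_options()  # runs the mixin, then the base
        self.argparser.add_argument("--limit", type=int, default=10)


script = ExportScript(["--project-id", "123", "--dry-run"])
print(script.args.project_id, script.args.dry_run, script.args.limit)  # 123 True 10
```

If any of the `super()` calls were dropped, the options defined further along the MRO (here `--project-id` and `--dry-run`) would not be registered and parsing the arguments above would fail.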
1. Define `description` (a property) and add your arguments in `add_argparser_options()`; always call `super()` first so `--project-id`/`-g`/`-v`/etc. survive.
2. Implement `run()` for `BaseScript`; for loop scripts, implement `workflow_loop()` (returns bool; `async def` for the async mixin) plus optional `on_start()`/`on_close()`.
3. Add the `__main__` boilerplate (below).
4. Register the script in `setup.py`. Deployment itself is handled by the scrapy-cloud-deployment skill, not here.

- Call `super().add_argparser_options()` before adding arguments, or you lose the framework flags (`--project-id`, `--flow-id`, `-g`/`-v`, loop flags).
- Declare `class X(ProjectMixin, BaseScript)`: mixin first, concrete base last. The mixin must inherit the typing-only `BaseScriptProtocol` (never `BaseScript`), so the implementation isn't duplicated in the MRO.
- A `BaseLoopScriptAsyncMixin` script's `run()` is a coroutine: launch it with `asyncio.run(script.run())`, and make `workflow_loop` an `async def`.
- `self.project_id` is the target (where you schedule/query, from `--project-id`); the script's own running project can differ. Don't hardcode ids; pass `--project-id` or set `default_project_id`.
- `workflow_loop()` returns bool: True keeps looping (with `--loop-mode`/`loop_mode`); False stops immediately. A loop with `loop_mode = 0` runs its body once.
- Set `project_required = False` only for scripts that genuinely never touch SC.
- For the `-g`/`-v` PROGRAMS mechanism (predefined command-line shortcuts), see the scanjobs-programs skill; `scanjobs.py` is the canonical example of a heavily PROGRAMS-driven script.

The `__main__` boilerplate:

```python
if __name__ == "__main__":
    import logging

    from shub_workflow.utils import get_kumo_loglevel

    logging.basicConfig(format="%(asctime)s %(name)s [%(levelname)s]: %(message)s", level=get_kumo_loglevel())
    script = MyScript()
    script.run()  # ... or asyncio.run(script.run()) for a BaseLoopScriptAsyncMixin script
```
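The loop rules described above (a boolean return from `workflow_loop()`, `loop_mode = 0` running the body once, and launching a coroutine `run()` with `asyncio.run`) can be sketched with a small stand-in driver. This is a minimal sketch of those stated semantics only, not the library's implementation: `StandInAsyncLoop`, `CountdownScript`, and `OneShot` are hypothetical names, and the real base classes do considerably more (argument parsing, project resolution, `on_start()`/`on_close()` hooks).

```python
import asyncio


class StandInAsyncLoop:
    """Hypothetical stand-in for the loop behaviour described above (not library code)."""

    loop_mode = 0  # seconds to sleep between cycles; 0 means "run the body once"

    async def workflow_loop(self) -> bool:
        raise NotImplementedError

    async def run(self) -> None:
        while True:
            keep_going = await self.workflow_loop()
            # Stop when the body returns False, or when no loop interval is set.
            if not keep_going or not self.loop_mode:
                break
            await asyncio.sleep(self.loop_mode)


class CountdownScript(StandInAsyncLoop):
    loop_mode = 0.01  # short interval so the example finishes quickly

    def __init__(self, cycles: int):
        self.remaining = cycles
        self.cycles_run = 0

    async def workflow_loop(self) -> bool:
        self.cycles_run += 1
        self.remaining -= 1
        return self.remaining > 0  # False stops the loop immediately


class OneShot(CountdownScript):
    loop_mode = 0  # body runs once even though workflow_loop() returns True


looping = CountdownScript(cycles=3)
asyncio.run(looping.run())  # run() is a coroutine, so it is launched via asyncio.run

single = OneShot(cycles=3)
asyncio.run(single.run())

print(looping.cycles_run, single.cycles_run)  # 3 1
```

Calling `looping.run()` without `asyncio.run` would only create a coroutine object and never execute the loop body, which is why the launch line differs for async scripts.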