Skill

cheerio-consistency

Write, review, refactor, or debug Node.js code that parses HTML with Cheerio (cheerio.load, $ selections, scraping, server-side DOM manipulation) using one canonical idiom set. Use this skill whenever code extracts data from HTML strings in Node, builds a scraper on fetched pages, transforms markup server-side, or when the user hits "$(...).click is not a function", .map returning a cheerio object instead of an array, undefined attr results, ESM/CJS import confusion, or expects JavaScript on the page to run. Trigger it even when the user just says "parse this HTML in Node" or "get all the links from this page" — without saying the word "Cheerio."

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/cheerio-consistency:cheerio-consistency

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Cheerio is stable, but it occupies a confusing niche: a jQuery-*like* API over a static

SKILL.md

98 lines · ~1.5k tokens

Stats

Stars0

MaintenanceGood

Last CommitJun 12, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

Cheerio — consistent idioms

Cheerio is stable, but it occupies a confusing niche: a jQuery-like API over a static parse tree, with no browser underneath. Generated code drifts across that boundary — calling .click(), waiting for JS-rendered content, mishandling the raw elements that each/map callbacks receive, and tripping over the 1.0 ESM/CJS import split. This skill pins the canonical idiom set for Cheerio 1.x.

Canonical idioms — always X, never Y

Always	Never	Why
`import * as cheerio from "cheerio"` then `const $ = cheerio.load(html)`	`const $ = require("cheerio")(html)` default-call style	There is no default export; `load` is the entry point in both module systems.
treat the tree as static data	`.click()`, `.trigger()`, waiting for content	No events, no rendering, no script execution exist; JS-rendered content needs a browser tool.
rewrap in callbacks: `$items.each((i, el) => $(el).find(...))`	calling cheerio methods on the bare `el`	Callbacks receive raw parse-tree nodes; only `$(el)` has the API.
`$items.map((i, el) => $(el).text()).get()`	forgetting `.get()` (`.toArray()`)	`.map` returns a cheerio collection; `.get()` unwraps to a plain array.
`$el.attr("href") ?? fallback`	assuming attr returns null	Missing attributes yield `undefined`; empty selections too.
`$el.text()` for all matched text, `$el.html()` knowingly first-element-only inner HTML	confusing the two	`.text()` concatenates across the set; `.html()` reads only the first match.
scope with `$container.find("a")` or `$("a", container)`	re-querying the whole doc inside loops	Context queries express intent and avoid cross-card bleed.
`$.html()` to serialize the document	`$("html").html()` reconstruction	`$.html()` is the canonical full-document serializer.
`cheerio.load(xml, { xml: true })` for XML	parsing XML in HTML mode	HTML mode lowercases tags and "fixes" structure — XPath-ish queries silently miss.
check `$sel.length` before single-element logic	`$sel.first().text() === ""` ambiguity	Empty selections don't throw; they return empty strings/undefined that flow onward.

House style for a scrape:

import * as cheerio from "cheerio";

const $ = cheerio.load(html);

const products = $("ul.products > li.product")
  .map((_, el) => {
    const $card = $(el);
    return {
      name: $card.find("a.title").text().trim(),
      url: $card.find("a.title").attr("href"),
      price: $card.find(".price").text().trim(),
      sku: $card.attr("data-sku"),
    };
  })
  .get();

Pitfalls that produce silently wrong results

Cheerio sees the served HTML. If fetched markup lacks the data (SPA shells), every selector "works" and returns nothing. Inspect the raw string first; switch to Playwright/Puppeteer when the data is rendered client-side.
.map without .get() passes a cheerio object to JSON.stringify/array code — often serializing as {} instead of erroring.
each early exit is return false (jQuery convention), not break.
Selector support is css-select, not full jQuery: :visible, :hidden don't exist (no layout); :contains("text") is supported; case-sensitive in XML mode.
.text() includes <script>/<style> contents when present in the subtree — remove them ($("script, style").remove()) before whole-page text extraction.
Relative URLs come back relative; resolve with new URL(href, baseUrl).href before storing.
Entities: text is decoded (& → &); when serializing back out, cheerio re-encodes — don't double-decode with extra libraries.
Whitespace: .text() preserves source whitespace; .trim()/normalize per field, or join structured pieces deliberately rather than splitting big text blobs.

Version notes

Target Cheerio 1.x stable (1.0+). The 1.0 release (after the long rc series) reorganized entry points: named exports (load, fromURL), ESM+CJS dual support, parse5 default HTML parsing with htmlparser2 available via xml/options. Code from the 0.x/rc era (require("cheerio").load, default-import styles) mostly still runs in CJS — but write the modern form. fromURL exists for convenience fetching; in scrapers prefer explicit fetch + load so headers/errors are yours.

Workflow

Confirm the data exists in the HTML string (log/inspect) before writing selectors.
load once per document; select with CSS, scope per item card, rewrap raw elements.
Extract to plain JS data early (.map(...).get()); resolve URLs; trim text fields.
For transformation tasks, mutate (addClass, attr, remove, replaceWith) then serialize with $.html().
When reviewing, hunt: browser-API calls, bare el method calls, missing .get(), doc-wide queries in loops, XML parsed in HTML mode, unresolved relative URLs.

For the selector-support matrix, traversal/manipulation API reference, and load-option details, read references/cheerio-patterns.md.

cheerio-consistency

Invocation

Context Preview

SKILL.md

cheerio-consistency

Invocation

Context Preview

SKILL.md

Cheerio — consistent idioms

Canonical idioms — always X, never Y

Pitfalls that produce silently wrong results

Version notes

Workflow

Similar Skills

Cheerio — consistent idioms

Canonical idioms — always X, never Y

Pitfalls that produce silently wrong results

Version notes

Workflow

Similar Skills