Skill

using-blog-lexicon

Use when drafting, revising, or proofreading blog posts, product announcements, launch posts, changelogs, newsletters, release notes, website copy, or any prose meant to read like a frontier-lab blog. Especially when the user asks to remove "AI slop" or "marketing slop" from comms writing, make an announcement "sound like Anthropic", or deslop an AI-written blog draft. Triggers on hype diction (game-changing, revolutionary, cutting-edge, unlock, supercharge, seamlessly, thrilled to announce), filler framing (in today's fast-paced landscape, paradigm shift, take it to new heights), and empty enthusiasm (the possibilities are endless).

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/seshat:using-blog-lexicon

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

A SQLite + numpy vector store over 4,856 distinctive terms (unigrams and 2-4-word phrases) extracted from 385 Anthropic blog posts (anthropic.com/news and /research, 2021-2026). Each term carries 3-5 KWIC usage examples drawn from those posts. Query it before you commit to a phrase: it tells you whether Anthropic's writers actually write that way, and what they reach for instead.

SKILL.md

243 lines · ~4.3k tokens

Stats

Language: HTML

Stars: 0

Maintenance: Excellent

Last Commit: Jun 17, 2026


Using the Blog Lexicon

This is the comms-register sibling of seshat:using-frontier-lexicon. Same CLI, same workflow, different pool: every command takes --pool blogs. The papers pool teaches research-paper voice; this pool teaches the voice of launch posts, research explainers, policy posts, and product updates. The corpus will grow beyond Anthropic; the register target is "good frontier-lab comms", with Anthropic as the founding sample.

The lexicon is a retrieval tool, not a thesaurus and not an oracle. Use it to learn the register, then write your own prose.

Throughout this skill, lex means "$CLAUDE_PLUGIN_ROOT/bin/lex". Claude Code sets CLAUDE_PLUGIN_ROOT to this plugin's root whenever the plugin is enabled. The plugin's lex is NOT on PATH; on macOS the bare name lex resolves to the BSD lexer instead. Outside a plugin session, use <seshat-checkout>/bin/lex directly.

When to reach for it

| Situation | Command |
| --- | --- |
| You wrote a phrase that sounds hyped, generic, or AI-flavored | `lex search "<phrase>" --pool blogs --json` |
| You have an intent and want diction for it ("stating availability", "describing a capability honestly") | `lex search "<intent>" --pool blogs --json` |
| You have a candidate term and want neighbors in embedding space | `lex similar "<term>" --pool blogs --json` |
| You're considering a term and want to read 3-5 real usage sentences before using it | `lex show "<term>" --pool blogs --json` |
| Browsing for stronger verbs, adjectives, or nouns | `lex top -k 50 --pos VERB --pool blogs --json` (or `ADJ`, `ADV`, `NOUN`) |
| Stuck and want serendipity | `lex random -k 20 --pool blogs --json` |
| Sanity-checking the pool is loaded | `lex stats --pool blogs` |

All commands except stats (which always emits JSON) print human-readable text by default. Pass --json for parseable output. Always pass --pool blogs; without it you are querying the research-papers pool and will get paper diction in a blog draft.
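One way to avoid forgetting the pool flag is to build every command line through a single helper. The sketch below is a minimal illustration, not part of the plugin: the function name lex_argv and its parameters are hypothetical, and it assumes a POSIX-style path under CLAUDE_PLUGIN_ROOT.

```python
import os
from pathlib import Path


def lex_argv(subcommand, *args, pool="blogs", as_json=True, plugin_root=None):
    """Build an argv list for the lex CLI that always carries --pool.

    lex_argv is a hypothetical helper; only the CLI flags come from this skill.
    """
    root = plugin_root or os.environ.get("CLAUDE_PLUGIN_ROOT")
    if not root:
        raise RuntimeError("CLAUDE_PLUGIN_ROOT is not set; pass plugin_root explicitly")
    argv = [str(Path(root) / "bin" / "lex"), subcommand, *args, "--pool", pool]
    # stats is documented as always emitting JSON, so it gets no --json flag.
    if as_json and subcommand != "stats":
        argv.append("--json")
    return argv
```

The returned list can be handed to subprocess.run as-is, which also sidesteps shell quoting for multiword queries.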

Rewriting workflow

When asked to edit or deslop a paragraph:

1. Identify the paragraph's job: announcement lede, capability description, how-it-works, availability/pricing, safety note, or closing. Different jobs tolerate different registers; a lede may carry one line of earned enthusiasm, an availability paragraph carries none.
2. Mark the slop phrases. Use the lists below. State briefly why each phrase weakens the prose (e.g., "game-changing" with no measurement, "seamlessly" instead of saying what the integration does). On a slop-dense draft, group similar phrases and give one reason per group; the marking exists to justify the edits, not to inventory every adjective.
3. Query the lexicon. Use search for intent ("stating availability and pricing"), similar when you have an anchor term, show to inspect candidates before using them.
4. Rewrite in plain announcement voice. Preserve every fact the draft contains. Do not invent numbers, dates, customers, benchmarks, or availability claims. Vague-but-not-empty claims ("enterprise-grade security", "intuitive interface") get the cut-and-ask treatment: leave them out of the rewrite and list the specifics you need from the user to add them back as one flat sentence each. Do not flatten them into equally vague prose.
5. Return a compact before/after plus a short note on the main edits. Default response shape (when the user or caller mandates a different format, honor theirs and nest this shape inside it):

Slop removed:
- ...

Rewrite:
> ...

Notes:
- ...

For a full post, repeat paragraph by paragraph and keep terminology consistent.

What to delete on sight

These are the patterns the pool was built to push back against:

Free-floating hype: game-changing, revolutionary, cutting-edge, transformative milestone, a quantum leap
Verb inflation: unlock, unleash, supercharge, elevate, empower, harness the full potential
Filler framing: in today's fast-paced digital landscape, paradigm shift, it's not just a tool, it's a..., take X to new heights
Smoothness claims: seamlessly, effortlessly, frictionless, used where the actual integration or step count would inform
Empty enthusiasm: "we can't wait to see what you'll build" (as a closer with nothing after it), "the possibilities are endless", "we've gone the extra mile"
Unanchored superlatives: blazing-fast, best-in-class, enterprise-grade, when no number, benchmark, or named standard is attached
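A first pass over a draft can be mechanical. The following is a minimal sketch of a phrase scanner seeded with the patterns listed above; the names SLOP and mark_slop are hypothetical, and the matcher only flags candidates. It cannot tell an anchored superlative from an unanchored one, so each hit still needs a human (or model) judgment about why it weakens the prose.

```python
import re

# Phrases copied from the categories above; extend to taste.
SLOP = {
    "free-floating hype": ["game-changing", "revolutionary", "cutting-edge",
                           "transformative milestone", "quantum leap"],
    "verb inflation": ["unlock", "unleash", "supercharge", "elevate", "empower",
                       "harness the full potential"],
    "filler framing": ["in today's fast-paced digital landscape", "paradigm shift",
                       "new heights"],
    "smoothness claims": ["seamlessly", "effortlessly", "frictionless"],
    "empty enthusiasm": ["the possibilities are endless", "gone the extra mile"],
    "unanchored superlatives": ["blazing-fast", "best-in-class", "enterprise-grade"],
}


def mark_slop(text):
    """Return (category, phrase) pairs found in text, case-insensitively."""
    hits = []
    for category, phrases in SLOP.items():
        for phrase in phrases:
            # Word boundaries avoid matching inside longer words; inflected
            # forms such as "unlocks" are deliberately not caught here.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE):
                hits.append((category, phrase))
    return hits


draft = ("We're beyond excited to unveil Relay, a revolutionary new assistant "
         "that seamlessly transforms how on-call teams respond to incidents.")
print(mark_slop(draft))
```

Grouping hits by category lines up with the advice to give one reason per group on a slop-dense draft.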

Concrete moves to prefer

These come back high in the pool and read as native frontier-blog voice. Confirm fit before using: run lex show <term> --pool blogs when the term has an entry; multiword moves with stopwords or contractions (we're releasing, we've shipped) are sentence shapes mined from KWIC examples and won't have entries, so for those, reading the KWIC sentences returned by a search or similar query counts as inspection.

Announcement verbs: we're releasing, we're introducing, we're adding, we're expanding, we've added, we've shipped
Working verbs: deploy, automate, analyze, iterate, mitigate, validate, accelerate, generalize, outperform (with the comparison stated)
Availability move: X is available today + where (across all Claude products and our API, through the Slack Marketplace); pricing stated flat (Pricing remains the same, at $3/$15 per million tokens)
Earned superlatives, each bound to a measurement or named standard: state-of-the-art on SWE-bench Verified, now leads at 61.4%, our most capable model to date followed by the evidence. The corpus uses "we're excited" sparingly, and almost always about a named, shipping thing.
Caveats in main text: important limitations, which suggests it isn't purely..., a small but important first step, not yet at scale
Research-explainer moves: "We analyzed [N] ... to ask:" and "We found that:", followed by bolded finding-led bullets each carrying a number

Rewrite heuristics

Strong frontier-blog prose tends to:

Lead with what shipped, not how the company feels about it.
State the capability, then the evidence: a number, a benchmark, a duration, a worked demo.
State availability, eligibility, and pricing flatly; those sentences carry no adjectives.
Use short declarative sentences to open and to land conclusions.
Put the caveat in the main text, not a footnote: name what doesn't work yet or isn't known.
Replace blazing-fast with the throughput. Replace seamlessly integrates with the actual steps or the list of things it connects to.
Earn each superlative: attach it to a benchmark, a comparison with numbers, or a named standard, or cut it.
Allow at most one line of enthusiasm per post, attached to a named, concrete thing.
Cut empty sentences instead of toning them down. "The possibilities are truly endless" has no fact to preserve, so don't preserve it. A hedged empty sentence is still empty.

Worked example

Sloppy:
We're beyond excited to unveil Relay, a revolutionary new assistant that seamlessly
transforms how on-call teams respond to incidents. It's not just an alerting tool;
it's a whole new way of working.

Better:
Today we're releasing Relay, an incident-response assistant for on-call teams. Relay
triages alerts from PagerDuty and Opsgenie, drafts a timeline as the incident unfolds,
and opens a postmortem doc when it closes.

The "Better" version was constructed by querying lex search "announcing a new product capability" --pool blogs and lex show "available today" --pool blogs, then writing original prose informed by what came back, using capability details the author supplied separately. Nothing was copied verbatim and no numbers were invented. The example's prose is register guidance, exactly like KWIC examples: do not reuse its sentences in a rewrite.

Style targets

The corpus contains three useful registers:

Launch-post style — Short declarative lede about why the thing matters, then what shipped, then evidence per capability (benchmark numbers, durations, demos), then availability and pricing stated flat. Enthusiasm is rare and always attached to something named.
Research-explainer style — Question-led setup ("we analyzed X to ask:"), bolded finding-led bullets each carrying a number, methodology acknowledged, limitations named in the main text, ends with what comes next.
Policy/safety style — Direct about risks and uncertainty; concrete mechanisms (classifiers, evaluations, audits, protections) instead of abstract commitments; states what the company will and won't do.

Match the register to what the user is writing. If unclear, ask.

Query recipes

Intent-based search (most common):

lex search "announcing a new product capability" --pool blogs --json
lex search "stating availability and pricing" --pool blogs --json
lex search "describing a capability with evidence" --pool blogs --json
lex search "honest caveat about limitations" --pool blogs --json
lex search "explaining how a safety mechanism works" --pool blogs --json

Anchor-based exploration:

lex similar "available today" -k 15 --pool blogs --json
lex similar "deploy" -k 15 --pool blogs --json
lex similar "limitation" -k 15 --pool blogs --json
lex similar "agentic" -k 15 --pool blogs --json

POS-filtered browsing:

lex top -k 50 --pos VERB --pool blogs --json
lex top -k 50 --pos ADJ --pool blogs --json

Inspect a candidate term:

lex show "available today" --pool blogs --json
lex show "important limitations" --pool blogs --json
lex show "capability" --pool blogs --json

Output shapes

Identical to the papers pool; all shapes verified against the live CLI.

search and similar — list of hits, sorted by descending similarity:

{ term, similarity, score, pos, examples: [{ paper_id, rank, sentence }] }

paper_id here is a post id like anthropic/news/claude-sonnet-4-5; the field name is shared with the papers pool. Use similarity (cosine, 0-1) for relevance ranking. score is the per-term distinctiveness from build time. examples is 3-5 KWIC sentences.
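To make the hit shape concrete, here is a small sketch that parses a search or similar payload and keeps the KWIC sentences for inspection. The payload in SAMPLE is invented to match the documented fields: its terms, numbers, and sentences are placeholders rather than real lexicon output, and rank_hits is a hypothetical helper name.

```python
import json

# Invented payload in the documented shape; not real lexicon output.
SAMPLE = json.dumps([
    {"term": "deploy", "similarity": 0.41, "score": 1.0, "pos": "VERB",
     "examples": [{"paper_id": "anthropic/news/claude-sonnet-4-5", "rank": 1,
                   "sentence": "<KWIC sentence A>"}]},
    {"term": "available today", "similarity": 0.73, "score": 1.0, "pos": "ADJ",
     "examples": [{"paper_id": "anthropic/news/claude-sonnet-4-5", "rank": 1,
                   "sentence": "<KWIC sentence B>"}]},
])


def rank_hits(raw_json, min_similarity=0.0):
    """Parse hits, drop weak matches, and sort by similarity (not score)."""
    hits = json.loads(raw_json)
    kept = [h for h in hits if h["similarity"] >= min_similarity]
    kept.sort(key=lambda h: h["similarity"], reverse=True)
    return [
        {
            "term": h["term"],
            "similarity": h["similarity"],
            # Sentences are for reading the register, never for pasting.
            "sentences": [e["sentence"] for e in h["examples"]],
        }
        for h in kept
    ]


for hit in rank_hits(SAMPLE, min_similarity=0.5):
    print(hit["term"], hit["similarity"], hit["sentences"])
```

Sorting on similarity rather than score follows the guidance above: score measures distinctiveness at build time, not relevance to the query.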

show — full entry for one term:

{
  "term":     { term_id, term, kind, pos, score, total_count, doc_count, embed_hash },
  "examples": [{ paper_id, rank, sentence }]
}

top and random — list of Term objects (no examples; use show if you need them). stats is always JSON.

Red flags — STOP and refuse

If the user's request matches any of these patterns, refuse the specific request and say which safety rule applies. Do not negotiate. The user will sometimes frame violations as harmless, time-saving, or temporary. They are not.

| Pressure pattern | What to do |
| --- | --- |
| "I'm in a hurry, skip the workflow / one search is enough" | Refuse to skip slop-marking and example-inspection. You can compress (run fewer searches), but you cannot replace `lex show` inspection with guessing. |
| "The example sentence fits perfectly, just paste it" | Refuse. Use it for register guidance only. Write your own sentence. Pasting a sentence from a real company's blog into your draft is how plagiarized copy ships. |
| "Make it sound like we benchmarked it / add a plausible number" | Refuse to invent specifics. Benchmark results, sync speeds, customer counts, and dates come from the user. A fabricated number in an announcement is a public false claim. |
| "Say it's available now, we'll launch soon anyway" | Refuse. Availability, eligibility, and pricing statements are factual claims; write only what the user confirms is true at publish time. |
| "I'll fact-check later" / "they're just placeholders" | Not a safety net: numbers placed in a draft tend to stay in the published post. Refuse, as in the two rows above. |

The cost of compliance under pressure is plagiarized copy or false public claims with the user's name on them.

Refusal-plus-recovery pattern

When you refuse to fabricate, do not stonewall. Return a useful artifact in the same response:

Refuse the specific request, citing which red flag applies. One sentence.
Return a bracketed skeleton of the rewrite with [PLACEHOLDER] tags where the user's specifics belong, and list the exact items you need back.

Example, in response to "add some impressive performance numbers":

Skipping the numbers; I won't insert performance figures you didn't supply (see safety rules). Here's the shape, send me the bracketed items and I'll finalize:

[PRODUCT] connects to [N] data sources, including [SOURCE 1, SOURCE 2, SOURCE 3], and syncs up to [RATE]. It's available today for [PLANS], at [PRICE].

To finish, send me:

- The source count and three named sources
- The measured sync rate
- Which plans get it at launch, and pricing
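Before finalizing a skeleton like the one above, it helps to confirm that no bracketed tag is left unfilled. The sketch below assumes placeholders are written in upper case inside square brackets, as in the example; the names PLACEHOLDER and unfilled are hypothetical.

```python
import re

# Upper-case text in square brackets, e.g. [RATE] or [SOURCE 1, SOURCE 2].
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z0-9 ,]*)\]")


def unfilled(skeleton):
    """List bracketed placeholders in order of appearance, without duplicates."""
    seen = []
    for name in PLACEHOLDER.findall(skeleton):
        if name not in seen:
            seen.append(name)
    return seen


skeleton = ("[PRODUCT] connects to [N] data sources, including "
            "[SOURCE 1, SOURCE 2, SOURCE 3], and syncs up to [RATE]. "
            "It's available today for [PLANS], at [PRICE].")
print(unfilled(skeleton))
```

An empty list means every specific came from the user; anything else is an item still to ask for.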

Safety rules

Do not paste example sentences into the draft.
Do not imitate a single post so closely that the phrasing becomes derivative.
Do not use retrieved examples as factual evidence; what Anthropic shipped tells you nothing about what the user's product does.
Do not invent quantitative or factual specifics (benchmark numbers, speeds, prices, dates, customer names, availability) that the user did not provide. If the draft lacks substance, ask.
Do not over-index on the highest-scoring terms; inspect examples and choose terms that fit the claim.
paper_id records where a phrase was sampled; it is not an endorsement and not a citation.

Setup notes

Offline vs network:

| Command | Network? | Notes |
| --- | --- | --- |
| `stats`, `similar`, `show`, `top`, `random` | Offline | Pure SQLite + numpy. Safe in plan mode, sandboxes, or air-gapped sessions. |
| `search` | Network | Embeds the query string. Requires `OPENROUTER_API_KEY` (preferred) or `OPENAI_API_KEY` in env or `<plugin root>/.env`. If `search` errors (missing key, no network), don't retry it; fall back to `similar` against an anchor term from the Concrete-moves list and continue the workflow. |
| `build` | Network + slow | Re-embeds the corpus. Don't run unless the corpus in `blogs/` changed; delete `lexicon/data/pools/blogs/` first. |
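The fallback rule for search can be sketched as a single attempt followed by an offline similar query. This is an illustration under one loud assumption: that lex signals failure with a non-zero exit code, which this skill does not state. The function name search_or_similar and the injectable run parameter are hypothetical.

```python
import subprocess


def search_or_similar(lex_path, query, anchor, run=subprocess.run):
    """Try `search` once; on failure fall back to offline `similar`.

    Assumes a non-zero exit code means the search failed (an assumption,
    not documented behavior). Returns (command_used, stdout).
    """
    common = ["--pool", "blogs", "--json"]
    first = run([lex_path, "search", query, *common], capture_output=True, text=True)
    if first.returncode == 0:
        return "search", first.stdout
    # No retry: similar needs no network, so the workflow can continue.
    second = run([lex_path, "similar", anchor, *common], capture_output=True, text=True)
    if second.returncode != 0:
        raise RuntimeError(f"similar also failed: {second.stderr}")
    return "similar", second.stdout
```

Pick the anchor from the Concrete-moves list (for example "available today") so the fallback stays close to the original intent.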

First-time data setup:

The built pool (lexicon.db + embeddings.npy under lexicon/data/pools/blogs/) is not in git; build it from the bundled corpus with "$CLAUDE_PLUGIN_ROOT/bin/lex" build "$CLAUDE_PLUGIN_ROOT/blogs" --pool blogs (needs an embedding API key + the spaCy model en_core_web_sm). scripts/setup.sh builds it as part of one-time setup.
Rebuild after every plugin update — updating replaces the cached plugin copy, which wipes the built data.

Other:

LEX_DB/LEX_EMB env vars override pool resolution entirely; unset them if a query seems to hit the wrong pool.
Invoke as "$CLAUDE_PLUGIN_ROOT/bin/lex". From a checkout, <seshat>/bin/lex works equivalently.
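When a query seems to hit the wrong pool, a quick check for the override variables is a reasonable first step. A minimal sketch, with pool_overrides as a hypothetical helper name:

```python
import os


def pool_overrides(env=None):
    """Return any LEX_DB / LEX_EMB overrides that are set and non-empty."""
    env = os.environ if env is None else env
    return {name: env[name] for name in ("LEX_DB", "LEX_EMB") if env.get(name)}


overrides = pool_overrides()
if overrides:
    print("Unset these before querying with --pool blogs:", sorted(overrides))
```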

Common mistakes

Forgetting --pool blogs. Without it every query hits the research-papers pool and you'll deslop a launch post into arXiv prose.
Treating blog register as paper register. "We evaluate" and "ablation" belong in the papers pool; a launch post says "we tested" and shows the number. If the user is writing a paper, switch to seshat:using-frontier-lexicon.
Stripping every superlative. The target register uses superlatives that are earned and anchored. "State-of-the-art on SWE-bench Verified" survives; "best-in-class solution" does not.
Copying example sentences into the draft. The examples are evidence about register and diction. They are not your prose.
Treating score as quality. score is distinctiveness vs. baseline corpora. A high-scoring term may still be wrong for the claim. Always run show first.
Running the skill without checking for slop first. If the user's draft is already clean, say so and stop. The skill is for prose that needs work.
Fabricating specifics to make prose sound concrete. Replacing "blazing-fast" with "10,000 rows per second" is the right move only if the user gave you that number. The lexicon teaches register, not facts.
Hedging an empty sentence instead of cutting it. If a sentence has no fact, delete it.
