Use when turning a sparq contribution into an academic paper — identifying a genuinely-novel contribution (the re-runnable intake), classifying it and picking a venue (ISWC/ESWC Resource for the engine, PVLDB/SIGMOD/EDBT for DB-systems, arXiv/workshop for not-yet-sound ZK/MPC), drafting a single-source Typst (.typ) paper whose eval section binds to live benchmark data via --input/sys.inputs, building both a PDF and an in-site HTML page from that one source, and reviewing claims↔evidence under sparq's empirical-honesty mandate (canonical vs indicative numbers; the ZK/MPC not-yet-sound disclaimer). The paper-factory for epic sq-gum8.
Slash command: /sparq:academic-paper
This is the repeatable PROCESS for producing one paper from a sparq contribution. It is
factory machinery, not prose advice: it enforces sparq's empirical-honesty mandate at
the data layer, binds every number to live benchmark data, and emits a PDF + an in-site
HTML page from one Typst source. Design record: research/paper-factory-design.md
(consolidates the contribution inventory + the auto-gen/PDF research). Epic sq-gum8.
Two non-negotiable constraints govern every step. They are the point of the factory; do not soften them.
Indicative (EC2 -aws) wall-clock numbers are NON-CANONICAL. Only deterministic
integer metrics are canonical today (W3C/OGC conformance floors, byte-identity, recall
floors, gate/round/byte counts, differential-fuzz pass). A speed/memory claim needs the
canonical runner (research/ci-ec2-design.md). Indicative numbers are never
co-tabulated with canonical ones and never feed an aggregate figure-of-merit.
The factory has six stages. Stages 0/1/3/5 are the methodology this skill encodes; 2/4/6 are
scripted (see research/paper-factory-design.md §1–§2 for the script/route layout).
Run before claiming anything is publishable. This is the re-runnable identification process;
its current ranked output is research/paper-contributions-inventory.md — read it first and
diff against it rather than starting from scratch.
Scan six sources: crates (crates/sparq-*/src, READMEs — what is IMPLEMENTED vs
designed-only; NotYetImplemented/#[ignore] markers); research/*.md (claimed technique +
prior-art + the author's own honesty hedges + negative results); beads
(.beads/issues.jsonl — open audit/gating beads, esp. sq-qhy4/sq-9hrn/sq-1gir);
benchmark deltas (research/BENCHMARKS.md, bench/ — which numbers exist, canonical vs
work-box); the conformance scoreboard (crates/sparq-conformance/src/scoreboard.rs — the
only fully-canonical evidence); skills (skills/*/SKILL.md — distilled durable findings).
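The implemented-versus-designed part of the crates scan can be partly mechanised by counting the two markers named above. The following is a minimal sketch, assuming the source files have already been read into memory; `countMarkers` and its input shape are hypothetical and not part of sparq.

```javascript
// Hypothetical helper: flag files that still carry "not implemented yet" markers.
// Input: array of { path, text }. The two patterns are the markers named in the scan list.
function countMarkers(files) {
  const patterns = {
    notYetImplemented: /NotYetImplemented/g,
    ignoredTest: /#\[ignore\b/g, // matches #[ignore] and #[ignore = "reason"]
  };
  return files
    .map(({ path, text }) => {
      const counts = {};
      for (const [name, re] of Object.entries(patterns)) {
        counts[name] = (text.match(re) || []).length;
      }
      return { path, ...counts };
    })
    // Keep only files with at least one marker, so the output is the review list.
    .filter((r) => r.notYetImplemented + r.ignoredTest > 0);
}
```

A non-empty result is a prompt to read the file, not a verdict: a marker in one code path does not mean the whole feature is designed-only.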
Score four criteria — a candidate must pass all four:
Assign a readiness verdict:
- PUBLISHABLE-NOW: sound claim + canonical evidence today, or a design/methodology contribution needing no benchmark.
- NEEDS-CANONICAL-BENCHMARKS: honest + real, but the headline rests on work-box numbers; publishable once re-gathered on the canonical runner.
- NOT-YET-SOUND / NEEDS-EXTERNAL-AUDIT: ZK/MPC; honest design / negative result only; cite sq-qhy4.
Rank by readiness × impact (a PUBLISHABLE-NOW candidate with a real audience outranks a high-impact unaudited crypto claim).
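The readiness × impact ranking can be sketched as a simple product. This is an illustration only: the numeric weights and the 1..5 impact scale are assumptions introduced here, since the text above fixes the ordering of the verdicts but not any numbers, and `rankCandidates` is a hypothetical name.

```javascript
// ASSUMED weights: only their ordering is implied by the verdict definitions.
const READINESS_WEIGHT = {
  "PUBLISHABLE-NOW": 3,
  "NEEDS-CANONICAL-BENCHMARKS": 2,
  "NOT-YET-SOUND": 1,
};

// candidates: [{ id, verdict, impact }], where impact is an assumed 1..5 score.
function rankCandidates(candidates) {
  return candidates
    .map((c) => {
      const weight = READINESS_WEIGHT[c.verdict];
      if (weight === undefined) throw new Error(`unknown verdict: ${c.verdict}`);
      return { ...c, score: weight * c.impact };
    })
    .sort((a, b) => b.score - a.score); // highest readiness × impact first
}
```

With these assumed weights, a PUBLISHABLE-NOW candidate of moderate impact (3 × 3) outranks a NOT-YET-SOUND candidate of maximal impact (1 × 5), which matches the parenthetical rule above.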
Re-run triggers: a new crate/feature lands; a benchmark improves or a canonical run
completes (re-evaluate every NEEDS-CANONICAL-BENCHMARKS candidate); a conformance floor rises;
an audit bead changes state (a sq-qhy4 pass moves the ZK candidate out of NOT-YET-SOUND —
until then it cannot move); a new *-measured.md/negative-result doc appears. The re-run is
cheap: re-scan, re-score, re-verdict, re-rank, diff the inventory, open/close paper beads.
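The "diff the inventory" step of a re-run can be sketched as a comparison of two verdict maps. The inventory shape used here (candidate id mapped to verdict) is an assumption for illustration; the real inventory is the Markdown document named above, and `diffInventory` is a hypothetical helper.

```javascript
// ASSUMED shape: { [candidateId]: verdict }. Returns the candidates whose verdict changed,
// appeared (before === null) or disappeared (after === null).
function diffInventory(previous, current) {
  const changes = [];
  const ids = new Set([...Object.keys(previous), ...Object.keys(current)]);
  for (const id of ids) {
    const before = previous[id] ?? null;
    const after = current[id] ?? null;
    if (before !== after) changes.push({ id, before, after });
  }
  return changes;
}
```

Each entry in the result corresponds to one paper bead to open, close, or re-label.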
Map the contribution to its family, then to a venue + track + Typst template. Verify every page limit / deadline / blinding policy against the current-year CFP before submitting — they drift yearly. Maintainer precedent (jeswr, Oxford / ODI / W3C) centres on ISWC/ESWC; the engine-as-artifact maps exactly to their Resource track.
| Family | Best venues | Track / notes |
|---|---|---|
| A — DB/engine performance (indexes, parallel/streaming/mmap exec, ingest, compression) | PVLDB (rolling monthly, single-blind, EA&B track for measurement-heavy papers); SIGMOD (fixed rounds, double-blind); ICDE; EDBT (most accessible first DB paper) | systems/empirical; reproducibility/artifact tracks near-universal |
| B — Semantic Web / RDF / SPARQL (federation, SHACL, GeoSPARQL, RSP-QL, inference, GenAI-over-RDF, RDFC-1.0) | ISWC (home venue; Resource track purpose-built for a reusable engine); ESWC (European sibling; note its strict no-preprint window conflicts with arXiv-first); TheWebConf; SEMANTiCS; JWS (journal) | Research / Resource / In-Use + posters/demos/workshops |
| C — Security/privacy (ZK/MPC), NOT-yet-sound | arXiv preprint + workshop NOW (HotPETs / ZKProof / WPES / ISWC-ESWC privacy workshops); PoPETs = best real target once sound (journal, 4 rolling deadlines/yr); USENIX/CCS/S&P aspirational | preprint/WIP today; graduate to a real venue only when sq-qhy4 (and sq-9hrn) close |
Template: charged-ieee (IEEE 2-col) | arkheion (arXiv-style) | para-lipics (LIPIcs;
accepted papers need a Typst→LaTeX conversion for the publisher) | an acmart-style template.
Double-blind venues (SIGMOD/VLDB/ICDE/WWW/CCS/S&P) require the anonymized build (Stage 4);
ISWC/ESWC blinding varies by track. Rolling cadence (PVLDB monthly, PoPETs quarterly,
SIGMOD multi-round) → target submit-when-ready, not one annual crunch.
Paper-bound numbers live in a dedicated evidence file, site/src/data/paper-evidence.json
— distinct from site/src/data/benchmarks.generated.json. The two MUST NOT be conflated:
- paper-evidence.json: the canonical-only paper evidence. Every record is a deterministic, machine-INDEPENDENT metric (a recall floor, an answer-safety invariant, a cost-model crossover) lifted from a named test, so each is environment: "canonical" and carries a source field tracing it to that test/dataset. [OPUS-4.8: as-built, PR #336.]
- benchmarks.generated.json: per-commit timing on the dev work-box, all environment: "indicative" (folded in by site/scripts/sync-benchmarks.mjs from the benchmark-data branch). It feeds the site's benchmark widgets, never a paper headline.

A paper-evidence record:
{ "value": 0.90, "unit": "recall@10", "environment": "canonical",
"kind": "deterministic-floor",
"source": "crates/sparq-vectors/tests/filtered.rs::filtered_traversal_recall_vs_exact_on_broad_mask",
"note": "machine-independent: a deterministic seed + an assertion threshold, not a timing." }
The honesty gate (two layers, both as-built in PR #336):
- site/scripts/build-papers.mjs::runHonestyGate() runs FIRST, before any compile: it schema-checks paper-evidence.json (every record needs a valid environment ∈ {canonical, indicative}, a source, and a value) so a malformed/untraceable record is a clear early build failure, not a Typst stack trace.
- headline(key) in site/papers/_lib/bench.typ panics the Typst compile if a record's environment != "canonical". headline() is the ONLY accessor allowed inside a headline result table/figure; the ungated ev(key) is for clearly-labelled indicative callouts only.

So no non-canonical (work-box / indicative) number can appear as a paper's headline
evidence. Indicative and canonical numbers are never in the same table and never feed an
aggregate. Until the canonical latency runner exists (research/ci-ec2-design.md, blocked
on one IAM step), publish only the deterministic metrics (conformance counts, recall
floors, byte-identity / answer-safety invariants, gate/round/byte counts) — those are
canonical today and are exactly what paper-evidence.json holds.
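The two layers of the gate can be illustrated in JavaScript as follows. This is a sketch under stated assumptions, not the contents of build-papers.mjs or bench.typ: it assumes the evidence file is a flat object keyed by strings such as "filtered_ann.recall_at_10_floor", and the function names `checkEvidence` and `headline` here are illustrative stand-ins for the real schema check and the Typst accessor.

```javascript
const ENVIRONMENTS = new Set(["canonical", "indicative"]);

// Layer 1 (schema check): returns a list of problems; an empty list means pass.
// ASSUMES a flat { key: record } layout for the evidence file.
function checkEvidence(evidence) {
  const problems = [];
  for (const [key, rec] of Object.entries(evidence)) {
    if (!ENVIRONMENTS.has(rec.environment)) problems.push(`${key}: invalid environment`);
    if (typeof rec.source !== "string" || rec.source === "") problems.push(`${key}: missing source`);
    if (rec.value === undefined || rec.value === null) problems.push(`${key}: missing value`);
  }
  return problems;
}

// Layer 2 (gated accessor): throws where the Typst helper would panic the compile.
function headline(evidence, key) {
  const rec = evidence[key];
  if (!rec) throw new Error(`no evidence record: ${key}`);
  if (rec.environment !== "canonical") {
    throw new Error(`${key} is ${rec.environment}; headline numbers must be canonical`);
  }
  return rec.value;
}
```

Running the schema check first means a record with a misspelt environment fails with a readable message naming the key, instead of surfacing later as a failed accessor call.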
Writing methodology (the order matters):
foo/bar; active voice; self-contained figure captions that tell
the reader what to notice (reviewers skim).
The live-data recipe (the load-bearing factory mechanism). Author one .typ; the build
injects paper-evidence.json via --input data=.... Numbers are read through the shared
site/papers/_lib/bench.typ helpers, never hard-coded:
#import "_lib/bench.typ": headline, ev, provenance, authors, anon
bench.typ binds #let evidence = json(bytes(sys.inputs.data)) once and exposes the
accessors. The eval section uses #headline(key) for any headline table/figure (it panics
the compile on a non-canonical record — the honesty gate), and the ungated #ev(key) only
inside an explicitly-labelled indicative callout:
The filtered top-k matches the exact-filtered ground truth at recall@10
>= #headline("filtered_ann.recall_at_10_floor").
#provenance("filtered_ann.recall_at_10_floor")
Run the self-check before handing off to review:
rustc + Cargo.lock +
flags, baseline-system versions); dataset name+version+SHA-256+seed; report absolutes not
ratios-only; never summarize 4/6/7/49% as "up to 49%"; never use a competitor's sub-optimal
config (Heiser: that "probably constitutes scientific misconduct").
site/scripts/build-papers.mjs (wired into prebuild + dev) drives both artifacts for
every paper in papers.ts, injecting the SAME paper-evidence.json so they cannot diverge.
The two compile invocations it runs (paths relative to site/, --root site):
# PDF (the download — static asset under public/papers/, served by the Next.js export):
typst compile papers/<source>.typ public/papers/<slug>.pdf \
--root . --input data="$(cat src/data/paper-evidence.json)"
# In-site HTML — Typst NATIVE HTML export (NOT typst.ts):
typst compile papers/<source>.typ src/generated/papers/<slug>.html \
--root . --format html --features html --input data="$(cat src/data/paper-evidence.json)"
# anonymized build for a double-blind venue (the .typ's authors()/anon honour this):
typst compile papers/<source>.typ /tmp/<slug>-anon.pdf \
--root . --input data="$(cat src/data/paper-evidence.json)" --input anon=true
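A minimal sketch of how a Node script might assemble the argument lists for these three invocations. `typstArgs` and its option names are hypothetical and are not taken from build-papers.mjs; only the flags shown in the commands above are used.

```javascript
// Build the argv for one `typst compile` call (paths relative to site/).
// evidenceJson is the already-read text of paper-evidence.json.
function typstArgs({ source, out, evidenceJson, html = false, anon = false }) {
  const args = ["compile", `papers/${source}.typ`, out, "--root", "."];
  if (html) args.push("--format", "html", "--features", "html"); // native HTML export
  args.push("--input", `data=${evidenceJson}`); // same evidence for every artifact
  if (anon) args.push("--input", "anon=true"); // double-blind build
  return args;
}
```

Passing such an array to something like `execFile` from `node:child_process` would avoid the shell quoting that the `$(cat ...)` form depends on, and building all variants from one function keeps the injected evidence identical across the PDF, HTML, and anonymized builds.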
[OPUS-4.8: as-built, PR #336.] The in-site HTML page at /papers/<slug> uses Typst's
native HTML export — not typst.ts/@myriaddreamin/typst.react. The build extracts the
<body> inner HTML and the React route (components/papers/paper-html.tsx) injects it as a
static fragment into a scoped .paper-prose block (no WASM compiler shipped to the browser).
The PDF and the HTML are built from the same .typ + the same injected evidence, so the
numbers can't diverge. Trade-off: native HTML export is experimental, so layout-only
constructs (centring, alignment, page rules / horizontal rules) drop in the HTML view — the
expected --features html "experimental feature" + page-rule warnings on stderr are harmless;
those constructs are preserved in the PDF. Author for both: rely on semantic structure
(headings, tables, paragraphs) for meaning, treat alignment as PDF-only polish. If Typst is
not on PATH, the script emits an honest placeholder fragment and warns (CI installs Typst so
real artifacts always build). A benchmark-data/evidence change re-triggers next build and
both artifacts regenerate. For a venue that rejects Typst→LaTeX, the camera-ready fallback is
Pandoc/LaTeX via Tectonic against acmart/IEEEtran/lipics. Do not author the PDF with
@react-pdf/renderer — it duplicates the numbers (breaks single-source).
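The body-extraction step described above (take the inner HTML of `<body>` from Typst's exported document and hand it to the React route as a static fragment) can be sketched as follows. The regular-expression approach and the name `extractBody` are assumptions for illustration; the actual build may use a different method, such as an HTML parser.

```javascript
// Return the inner HTML of the first <body> element in a full HTML document string.
function extractBody(html) {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  if (!match) throw new Error("no <body> element found");
  return match[1].trim();
}
```

Throwing on a missing `<body>` makes an empty or malformed export a build failure instead of a silently blank page.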
For every claim in the intro/contributions list, identify its evidence (analysis / theorem / measurement / case study) and confirm the forward-reference resolves. The gate (run as subagent section-reviewers + one cross-cutting honesty/repro reviewer) blocks on:
build-papers.mjs schema-check + the headline() compile panic — enforces this
mechanically; the reviewer also checks no indicative/canonical co-tabulation).
Resolve all findings before publish. Final check: the claims↔evidence loop is closed and every headline number is canonical.
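The claims↔evidence check can be pictured as a filter over the contributions list. This is a sketch with an assumed data shape: the claim objects, the label set, and the name `openClaims` are hypothetical, and only the four evidence kinds come from the text above.

```javascript
const EVIDENCE_KINDS = new Set(["analysis", "theorem", "measurement", "case study"]);

// claims: [{ id, evidenceKind, ref }]; labels: Set of section/figure/table labels
// that actually exist in the paper. Returns the claims that are still open.
function openClaims(claims, labels) {
  return claims.filter(
    (c) => !EVIDENCE_KINDS.has(c.evidenceKind) || !labels.has(c.ref)
  );
}
```

An empty result corresponds to a closed claims↔evidence loop; any remaining entry is a claim whose evidence is missing or whose forward reference does not resolve.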
Merge → next build serves /papers/<slug> + /papers/<slug>.pdf (static export, under the
/sparq basePath). A paper-evidence.json refresh re-runs the build → both artifacts
regenerate with current numbers. Each headline number carries its provenance inline (the
provenance(key) helper prints the record's source test + environment). A venue
camera-ready triggers the optional Typst→LaTeX export. To register a new paper, add an
entry to site/src/data/papers.ts (slug + source) and a site/papers/<source>.typ (e.g.
filtered-ann.typ) — the index, the per-paper route, the nav, and the PDF build
are all data-driven off papers.ts.
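To make the data-driven registration concrete, the sketch below derives each artifact location from one registry entry. It is written in JavaScript for consistency with the build script, although papers.ts itself is TypeScript; the entry shown (including the slug value) and the helper `artifactsFor` are hypothetical, while the path patterns are the ones given in the build and publish steps above.

```javascript
// Hypothetical registry entry: the source states only that each has a slug and a source.
const papers = [{ slug: "filtered-ann", source: "filtered-ann" }];

// Derive every path and route for one paper from its registry entry.
function artifactsFor({ slug, source }) {
  return {
    typ: `site/papers/${source}.typ`, // the single Typst source
    pdf: `site/public/papers/${slug}.pdf`, // static download
    htmlFragment: `site/src/generated/papers/${slug}.html`, // native HTML export
    route: `/papers/${slug}`, // in-site page
    pdfRoute: `/papers/${slug}.pdf`,
  };
}
```

Because everything is computed from the entry, adding a paper touches only the registry and the new .typ file.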
environment: canonical
records, reported with variability. Indicative (EC2 -aws) numbers live only in clearly
labelled "indicative development measurement" callouts that name the instance type + -aws
kernel, state they are not the basis of any claim, and are never co-tabulated with canonical
numbers. Never quote a speedup blending the two.
The factory's first pilot is A1 + A2 together (both PUBLISHABLE-NOW, no canonical runner
or external audit needed): A1 = RDF-native filtered-ANN ("Filter-as-Query" — exact
transitive-BGP filter over the engine's own dict-ids + answer-safety; canonical recall
evidence today; frame as integration, not an ANN-algorithm novelty); A2 = the honest
same-box benchmark methodology (a methods/reproducibility contribution that strengthens A1's
eval). See research/paper-contributions-inventory.md (Part 2) and
research/paper-factory-design.md (§6).
skills/SKILL.md public-surface router (that router lists code-integration surfaces).
Keep this skill in sync with research/paper-factory-design.md if the pipeline changes.
npx claudepluginhub jeswr/sparq --plugin sparq