From fgcz-infrastructure
Promote a one-off R Markdown analysis into a SUSHI-shaped folder on gstore so it can be chained as a parent dataset by downstream SUSHI apps. Use when delivering a custom analysis (a hand-written Rmd, not a SUSHI app) to a user, when an analysis output needs to appear in the SUSHI lineage tree, or when the input came from an upstream SUSHI dataset and the result should link back to it. Triggers on "custom analysis", "register analysis in SUSHI", "Hubert blueprint", "promote Rmd to SUSHI", "dataset.tsv parameters.tsv input_dataset.tsv", "wrap manual analysis for B-Fabric", "make this Rmd look like a SUSHI app output".
Slash command: /fgcz-infrastructure:deprecated-custom-analysis-register
Wrap a hand-written R Markdown analysis so it produces the same on-disk shape a SUSHI app would. The output drops cleanly into the SUSHI dataset graph: any downstream SUSHI app (ScSeurat, exploreSC, etc.) can chain off it, and the B-Fabric + SUSHI registrations (Steps 9–10) make it visible in the SUSHI UI and the B-Fabric audit trail.
Based on Hubert's blueprint at gitlab.bfabric.org/Genomics/hubert-scripts-2026/p40992-Alithea-FlashSeq/example-run.R. The B-Fabric side is live via Ronald's register_custom_analysis.py (pending merge into btools main). The production-SUSHI side is not wired into that script — Step 10 does it directly via a MySQL insert (the script's built-in --register-sushi only writes to dev SUSHI; see warning in Step 9).
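Use when you have hand-written an .Rmd, rendered it locally, and the user now wants the output delivered to gstore.
Do NOT use for:
- Building a real SUSHI app: use fgcz-sushi-app-dev.
- Self-correcting Rmd rendering with retry logic: use autonomous-render.
- Direct B-Fabric reads/writes: use bfabric.
Confirm aloud with the user: echo back project + order before writing anything (the existing "confirm order numbers" rule applies here):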
| Field | Example | Notes |
|---|---|---|
| project | p40992 | The pXXXXX directory under /srv/gstore/projects/. |
| order_id | o41017 | The oXXXXX prefix matching the upstream order. |
| analysis_name | SC-FlashSeq_QC_Evaluation | Short, no spaces. Becomes part of the folder name and filenames. |
| timestamp | 2026-05-15--12-00-00 | Generate fresh with format(Sys.time(), "%Y-%m-%d--%H-%M-%S"); don't reuse a prior one. |
| Rmd path | ~/git/<your-scripts>/pXXXXX_*/QC.Rmd | The source .Rmd. |
| Upstream SUSHI dataset path | /srv/gstore/projects/pXXXXX/oYYYYY_FeatureCounts_YYYY-MM-DD--... | Used in Step 4 to copy input_dataset.tsv for provenance. Without it, the SUSHI lineage tree has nothing to link back to. |
| Upstream SUSHI dataset ID | e.g. 109531 | The numeric ID from production SUSHI's data_sets table (not the dev SUSHI ID, which is different — look it up via the URL fgcz-sushi.uzh.ch/data_set/pXXXXX/<id> or query the DB via the sushi-framework skill). Needed in Step 10 to set parent_id and build the lineage edge. |
If the upstream SUSHI dataset path is unknown, ask the user — don't guess. Provenance without a real parent is worse than no provenance.
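The inputs in the table can be sanity-checked before anything is written. The following is a minimal sketch in POSIX shell; `matches`, `check_inputs` and `fresh_timestamp` are hypothetical helpers, and the patterns are assumptions inferred from the example values in the table rather than rules taken from SUSHI.

```shell
# Sketch only: the patterns below are assumptions inferred from the example
# values in the table (p40992, o41017, 2026-05-15--12-00-00).

# Succeeds (exit 0) if $1 matches the extended regex $2 in full.
matches() {
    printf '%s\n' "$1" | grep -Eq "^($2)\$"
}

# Validate the four identifiers; print the first problem and return 1.
check_inputs() {
    project=$1
    order_id=$2
    analysis_name=$3
    timestamp=$4
    matches "$project" 'p[0-9]+' \
        || { echo "bad project: $project"; return 1; }
    matches "$order_id" 'o[0-9]+' \
        || { echo "bad order_id: $order_id"; return 1; }
    matches "$analysis_name" '[^[:space:]/]+' \
        || { echo "bad analysis_name: $analysis_name"; return 1; }
    matches "$timestamp" '[0-9]{4}-[0-9]{2}-[0-9]{2}--[0-9]{2}-[0-9]{2}-[0-9]{2}' \
        || { echo "bad timestamp: $timestamp"; return 1; }
    echo "ok: ${project}/${order_id}_${analysis_name}_${timestamp}"
}

# Fresh timestamp in the same format as the R call in the table.
fresh_timestamp() {
    date '+%Y-%m-%d--%H-%M-%S'
}

check_inputs p40992 o41017 SC-FlashSeq_QC_Evaluation "$(fresh_timestamp)"
```

A check like this does not replace echoing the project and order back to the user; it only catches malformed values early.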
/srv/gstore/projects/{project}/{order_id}_{analysis_name}_{timestamp}/
├── {analysis_name}.Rmd # source (copied from your repo)
├── {analysis_name}.html # rendered report
├── dataset.tsv # output dataset — SUSHI schema
├── parameters.tsv # analysis parameters
├── input_dataset.tsv # provenance — copied from upstream SUSHI dataset
├── *.qs2 # any cached objects the Rmd writes
└── scripts/
    ├── {order_id}_run-{analysis}.sh # vanilla bash launcher
    ├── {order_id}_run-{analysis}_o.log
    └── {order_id}_run-{analysis}_e.log
The folder name and the three TSVs are the exact contract every SUSHI app honours — that's what makes the output chainable.
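The contract above can be checked mechanically before delivery. This is a rough sketch, not part of SUSHI: `check_sushi_folder` is a hypothetical helper, and the "starts with o followed by digits" test is an assumption based on the `{order_id}_` prefix in the layout.

```shell
# Sketch only: check_sushi_folder is a hypothetical helper, not part of SUSHI.

# Return 0 if directory $1 is named {order_id}_... and holds the three TSVs.
check_sushi_folder() {
    dir=$1
    name=$(basename "$dir")
    rc=0
    case $name in
        o[0-9]*_*) ;;
        *) echo "folder name does not start with an order id: $name"; rc=1 ;;
    esac
    for f in dataset.tsv parameters.tsv input_dataset.tsv; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing $f"
            rc=1
        fi
    done
    return $rc
}

# Dry run against a scratch directory, never against /srv/gstore/.
scratch=$(mktemp -d)
demo="$scratch/o41017_SC-FlashSeq_QC_Evaluation_2026-05-15--12-00-00"
mkdir -p "$demo"
touch "$demo/dataset.tsv" "$demo/parameters.tsv"
check_sushi_folder "$demo" || echo "not chainable yet"
touch "$demo/input_dataset.tsv"
check_sushi_folder "$demo" && echo "folder matches the contract"
rm -rf "$scratch"
```

The check only looks at names and file presence; it says nothing about whether the TSV contents follow the SUSHI schema.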
Work locally in /srv/GT/analysis/{project}/Analyses_Paul/ (or a /tmp/ scratch dir for dry runs), then g-req the finished folder to gstore at the end. Never write into /srv/gstore/ directly.
library(ezRun)
library(stringr)
project <- "p40992"
order_id <- "o41017"
analysis_name <- "SC-FlashSeq_QC_Evaluation"
analysis_version <- format(Sys.time(), "%Y-%m-%d--%H-%M-%S")
gstore_folder <- paste0(project, "/", order_id, "_", analysis_name, "_", analysis_version)
setwdNew(basename(gstore_folder)) # creates and chdir into the timestamped folder
dir.create("scripts")
setwdNew is from ezRun; it creates the directory if missing and setwd()s into it. The folder name must start with {order_id}_ so SUSHI's lineage parser picks it up.
dataset.tsv (output schema)
rmd_file <- paste0(analysis_name, ".Rmd")
html_file <- str_replace(rmd_file, "\\.Rmd$", ".html")
output_dataset <- ezFrame(
  "Name"             = analysis_name,
  "Html [File,Link]" = file.path(gstore_folder, html_file),
  "Rmd [File]"       = file.path(gstore_folder, rmd_file)
)
ezWrite.table(output_dataset, file = "dataset.tsv", row.names = FALSE)
The square-bracket suffixes ([File], [File,Link]) are the SUSHI column-type tags — they tell the SUSHI UI to render the path as a downloadable link. Don't drop them.
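To make the tag convention concrete, here is a small sketch that splits a header into its bare name and its tag list. `column_name` and `column_tags` are hypothetical helpers written for illustration; how SUSHI itself parses these headers is not shown in this document.

```shell
# Sketch only: split a SUSHI-style header such as "Html [File,Link]" into its
# bare name and its tag list. Helper names are hypothetical.

# Print the column name without the bracketed suffix.
column_name() {
    printf '%s\n' "$1" | sed -E 's/[[:space:]]*\[[^]]*\]$//'
}

# Print the comma-separated tags, or nothing if the header has none.
column_tags() {
    printf '%s\n' "$1" | sed -nE 's/^.*\[([^]]*)\]$/\1/p'
}

for header in "Name" "Html [File,Link]" "Rmd [File]"; do
    printf '%s -> name="%s" tags="%s"\n' \
        "$header" "$(column_name "$header")" "$(column_tags "$header")"
done
```

Running the same two helpers over the header row of a written dataset.tsv is a quick way to spot a column whose tag was dropped by accident.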
parameters.tsv
params <- list(
  analysis_name = analysis_name,
  analysis_version = analysis_version
)
paramFrame <- ezFrame(Value = sapply(params, as.character))
ezWrite.table(paramFrame, file = "parameters.tsv", col.names = FALSE)
Keep this honest — add every parameter that materially affected the result (gene annotation version, thresholds, reference paths). Future-you will thank present-you.
input_dataset.tsv (provenance)
upstream <- "/srv/gstore/projects/p40992/o41017_FeatureCounts_2026-02-18--12-45-59"
file.copy(file.path(upstream, "dataset.tsv"), "input_dataset.tsv")
This is the file-level provenance link. The DB-level link is the parent_id column on the data_sets row that Step 10 inserts — both are belt-and-braces so the lineage is visible in the file tree on gstore and in the SUSHI UI's parent-child navigation.
# Copy the source Rmd into the delivery folder (adjust the path to wherever the source lives)
file.copy(file.path("..", rmd_file), rmd_file)
# Launcher template: the %s placeholder is filled with the Rmd file name
bash_commands <- sprintf('#!/bin/bash
set -eux
set -o pipefail
umask 0002
source /usr/local/ngseq/etc/lmod_profile
module add Dev/R/4.6.0
R --vanilla --slave <<EOT
rmarkdown::render(
  input = "%s",
  envir = new.env(),
  output_dir = ".",
  quiet = FALSE
)
EOT
', rmd_file)
# Write the launcher into scripts/ and make it executable
launcher <- sprintf("scripts/%s_run-%s.sh", order_id, tolower(analysis_name))
writeLines(bash_commands, con = launcher)
Sys.chmod(launcher, mode = "0755")
Why R --vanilla --slave: no .Rprofile, no .Renviron, no .RData is read. The render runs with only what the heredoc loads. That means anyone landing on this folder in gstore six months from now can rerun the script and reproduce the result without guessing what env produced it.
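o_log <- sub("\\.sh$", "_o.log", launcher)
e_log <- sub("\\.sh$", "_e.log", launcher)
# system2() returns the launcher's exit status; stop here if the render failed
status <- system2("bash", args = launcher, stdout = o_log, stderr = e_log)
stopifnot(status == 0)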
The two log files sit next to the launcher in scripts/ — same level as the script that produced them.
g-req to gstore
system2("/usr/local/ngseq/bin/g-req",
        args = c("copynow", ".", dirname(gstore_folder)))
The subtle bit: g-req copynow . X/ copies the current directory as a subdirectory of X/. Because gstore_folder already contains the basename (the timestamped folder name), dirname(gstore_folder) resolves to the project's gstore root, and the copy lands at exactly the right place. This is the only pattern where the "g-req creates a subdirectory" gotcha works in your favour — see CLAUDE.md "g-req Commands" for the general rule (always copy individual files, not directories).
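The path arithmetic behind that call can be traced without touching g-req. The sketch below only mirrors the R `dirname()` / `basename()` logic with the shell tools of the same name, using the example values from this document; it prints the command instead of running it.

```shell
# Sketch only: reproduce the path arithmetic from the R code with dirname and
# basename. Nothing here calls g-req.

project=p40992
order_id=o41017
analysis_name=SC-FlashSeq_QC_Evaluation
timestamp=2026-05-15--12-00-00

gstore_folder="${project}/${order_id}_${analysis_name}_${timestamp}"

# Destination handed to g-req: the parent of the timestamped folder.
dest=$(dirname "$gstore_folder")
# Name of the current working directory, which g-req recreates under $dest.
leaf=$(basename "$gstore_folder")

echo "g-req copynow . $dest"
echo "expected landing path: $dest/$leaf"
```

Because `$dest/$leaf` reassembles to exactly `gstore_folder`, the subdirectory that g-req creates is the one the dataset.tsv paths already point at. The real copy is still the g-req copynow call above.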
After it succeeds, surface the web URL. The HTML on gstore is reachable via the SUSHI proxy (works in any FGCZ browser session without re-auth):
https://fgcz-sushi.uzh.ch/projects/{project}/{order_id}_{analysis_name}_{timestamp}/{analysis_name}.html
The same file is also served at https://fgcz-gstore.uzh.ch/projects/... — that URL works too but is Basic-auth-walled, so the fgcz-sushi.uzh.ch form is friendlier for sharing.
The SUSHI page (once Step 10 below has run) lives at:
https://fgcz-sushi.uzh.ch/data_set/p{project_number}/{sushi_dataset_id}
Note the URL pattern: /data_set/ (underscore, not /datasets/) and the p prefix on the project number.
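Both URL patterns can be assembled from the same inputs, which avoids typing them by hand. A minimal sketch follows; `report_url` and `dataset_url` are hypothetical helper names, and the dataset id in the demo call is only the example number used earlier in this document.

```shell
# Sketch only: build the two URLs from the patterns above.
# report_url and dataset_url are hypothetical helper names.

# Args: project order_id analysis_name timestamp
report_url() {
    printf 'https://fgcz-sushi.uzh.ch/projects/%s/%s_%s_%s/%s.html\n' \
        "$1" "$2" "$3" "$4" "$3"
}

# Args: project sushi_dataset_id. Accepts the project with or without "p".
dataset_url() {
    project_number=${1#p}
    printf 'https://fgcz-sushi.uzh.ch/data_set/p%s/%s\n' "$project_number" "$2"
}

report_url p40992 o41017 SC-FlashSeq_QC_Evaluation 2026-05-15--12-00-00
dataset_url p40992 109531
```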
register_custom_analysis.py (currently at /home/rdomi/btools/btools/, pending merge into btools main) creates a B-Fabric workunit and dataset and registers all the gstore files as resources. Do not pass --register-sushi if you want a production-visible SUSHI dataset — see the warning below.
If you don't already have a working btools env, the easiest setup is to copy the script + dependent src/ files into your own btools clone and invoke via uv run:
# One-time setup: bring Ronald's script + sync the src/ helpers it imports
cp /home/rdomi/btools/btools/register_custom_analysis.py ~/git/btools/btools/
cp /home/rdomi/btools/btools/src/{paths,bfabric_utils,tsv_utils,resource_utils}.py \
~/git/btools/btools/src/
# Per-run invocation: B-Fabric ONLY (no --register-sushi)
cd ~/git/btools && \
BFABRICPY_CONFIG_ENV=PRODUCTION uv run python btools/register_custom_analysis.py \
/srv/gstore/projects/{project}/{order_id}_{analysis_name}_{timestamp} \
--generated-using Claude_Agent \
--generated-for {user} \
--verbose
Capture the output. It ends with the two IDs:
B-Fabric workunit_id: <WU>
B-Fabric dataset_id: <BFDS>
Both are needed for Step 10 (link them into the SUSHI row).
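If the output is saved to a variable or file, the two IDs can be pulled out with sed. This sketch assumes the lines look exactly like the two shown above; `extract_id` is a hypothetical helper, and the sample text and numbers are made up for the demo, not real script output.

```shell
# Sketch only: assumes the script output contains lines shaped exactly like
#   B-Fabric workunit_id: <WU>
#   B-Fabric dataset_id: <BFDS>
# The sample text and the numbers below are invented for the demo.

# Read text on stdin and print the integer following "B-Fabric <label>:".
extract_id() {
    sed -nE "s/^.*B-Fabric $1:[[:space:]]*([0-9]+).*\$/\1/p" | tail -n 1
}

sample_output='(earlier log lines)
B-Fabric workunit_id: 123456
B-Fabric dataset_id: 65432'

WU=$(printf '%s\n' "$sample_output" | extract_id workunit_id)
BFDS=$(printf '%s\n' "$sample_output" | extract_id dataset_id)
echo "WU=$WU BFDS=$BFDS"
```

An empty `WU` or `BFDS` means the expected line was not found, in which case the registration output should be read by eye before going further.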
⚠️ Why NOT --register-sushi — and what to use instead:
The script's --register-sushi flag POSTs to http://fgcz-h-083:4071/projects/{project}/datasets/register, which is the DEV SUSHI Python API on fgcz-h-083. Production SUSHI (fgcz-sushi.uzh.ch / fgcz-h-082) has no equivalent Python API on any port — only Rails on :8880 behind auth. So --register-sushi writes to dev only; the dataset will not be visible at https://fgcz-sushi.uzh.ch/data_set/.... Use Step 10 below to do a direct production SUSHI MySQL insert and link it to the B-Fabric IDs from Step 9.
Other notes:
- --generated-using Claude_Agent is the governance tag for LLM-origin runs.
- Do not set BFABRICPY_CONFIG_ENV=PRODUCTION without explicit user confirmation.
Step 10, the direct MySQL insert into production SUSHI, is the only way to make the analysis visible at https://fgcz-sushi.uzh.ch/data_set/p{project_number}/{id} today. The full recipe (column names, gotchas, Ruby-hash syntax for samples.key_value) is in the sushi-framework skill under "Registering Results in SUSHI Database". The short version, parameterised for this skill's outputs:
# Needs: $SUSHI_DB_PASSWORD env var (ask admin), ssh access to fgcz-h-082
ssh fgcz-h-082 'mysql -u sushilover -p"$SUSHI_DB_PASSWORD" sushi' <<SQL
-- Identifiers
SET @project_id = (SELECT id FROM projects WHERE number = {project_number});
SET @user_id = (SELECT id FROM users WHERE login = '{user}');
SET @parent_id = {upstream_sushi_dataset_id}; -- from the SUSHI URL of the upstream job
SET @bfabric_id = {BFDS}; -- from Step 9
SET @workunit_id = {WU}; -- from Step 9
INSERT INTO data_sets
  (project_id, parent_id, name, created_at, updated_at,
   num_samples, completed_samples, user_id, child,
   sushi_app_name, order_id, bfabric_id, workunit_id)
VALUES
  (@project_id, @parent_id,
   '{order_id}_{analysis_name}_{timestamp}',
   NOW(), NOW(), 1, 1, @user_id, 1,
   'CustomAnalysis', {order_number},
   @bfabric_id, @workunit_id);
SET @new_id = LAST_INSERT_ID();
-- One sample row matching dataset.tsv columns (Ruby hash-rocket syntax, NOT JSON colons)
INSERT INTO samples (key_value, data_set_id, created_at, updated_at) VALUES
  ('{"Name"=>"{analysis_name}", '
   '"Html [File,Link]"=>"p{project_number}/{order_id}_{analysis_name}_{timestamp}/{analysis_name}.html", '
   '"Rmd [File]"=>"p{project_number}/{order_id}_{analysis_name}_{timestamp}/{analysis_name}.Rmd"}',
   @new_id, NOW(), NOW());
SELECT id, name, parent_id, bfabric_id, workunit_id FROM data_sets WHERE id = @new_id;
SQL
Result: the SUSHI page is now live at https://fgcz-sushi.uzh.ch/data_set/p{project_number}/<id_from_LAST_INSERT_ID>, the B-Fabric icon shows up in the SUSHI UI, and the lineage edge to the upstream parent is in place.
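The key_value string from the samples insert is easy to get wrong by hand, so it can help to generate it. The sketch below builds the same three-key string with printf; `sample_key_value` is a hypothetical helper, and it does no escaping, so it assumes none of the values contain quote characters.

```shell
# Sketch only: build the samples.key_value string in Ruby hash-rocket syntax.
# sample_key_value is a hypothetical helper and does NOT escape quotes.

# Args: project_number order_id analysis_name timestamp
sample_key_value() {
    folder="p${1}/${2}_${3}_${4}"
    printf '{"Name"=>"%s", "Html [File,Link]"=>"%s/%s.html", "Rmd [File]"=>"%s/%s.Rmd"}\n' \
        "$3" "$folder" "$3" "$folder" "$3"
}

sample_key_value 40992 o41017 SC-FlashSeq_QC_Evaluation 2026-05-15--12-00-00
```

The printed string keeps Name as the first key and uses `=>` throughout, matching the literal in the SQL above once the adjacent quoted pieces are concatenated.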
Gotchas inherited from the sushi-framework skill (worth re-reading there if anything looks off):
- samples.key_value uses Ruby => hash rockets, not JSON colons. Pipe the SQL via stdin to avoid shell quoting hell.
- The production SUSHI Rails app lives on fgcz-h-082 and is owned by trxcopy. You can run mysql read/write as your own user; only the Rails log requires trxcopy.
- Write the key_value hash with Name first so the leftmost column matches the dataset.tsv schema.
Rules:
- Surface the web URL after g-req succeeds. CLAUDE.md mandates this.
- Never write into /srv/gstore/ directly. Work in /srv/GT/analysis/{project}/Analyses_Paul/<scratch>/, then g-req.
- Do not pass --register-sushi to register_custom_analysis.py (Step 9 warning): it only writes to dev. Production SUSHI registration goes through MySQL (Step 10).
Reference:
- references/example-run-annotated.R: Hubert's original example-run.R with inline commentary on each block. Read it once before your first scaffolding to see the recipe end-to-end.
Related skills:
- fgcz-sushi-app-dev: for the real SUSHI app build path (when this analysis becomes routine and deserves to be a button in the SUSHI UI).
- autonomous-render: for self-correcting Rmd rendering with retry logic.
- bfabric: for direct B-Fabric reads/writes once Step 9 lands.
Open items:
- Merge register_custom_analysis.py + src/ updates into main btools so users don't have to copy from /home/rdomi/.
- Give production SUSHI an equivalent of the dev Python API (fgcz-h-083:4071) so Step 10 can become an HTTP call instead of a raw MySQL insert. Until then, Step 10 is the canonical path.
- Add evals/evals.json with 2–3 test prompts once the workflow has stabilised.