From torch-xpu-skills
Analyze CI nightly test failures and fix XPU test cases. Use when the user provides CI failure reports, nightly status emails, or lists of failing test cases. Covers triaging failures, reproducing locally, identifying root causes, and applying fixes.
Slash command: /torch-xpu-skills:xpu-nightly-ci-fix
Analyze CI nightly test failure reports and fix failing XPU test cases on PyTorch.
Trigger: User provides a CI failure report (email content, test case list, or log snippets).
Prerequisites:
- A .env file configured for the oneAPI environment.
- Run source .env before any Python/torch command.

Extract commit_id and report_date from the report, then check out the matching commit:
git fetch origin main
git checkout <commit_id> # or origin/main if no commit_id
Create a working branch named fix-<report_date>, then rebuild:
source .env
python setup.py clean
pip install -e . -v --no-build-isolation
Reproduce each failing test:
source .env && python <test_file> -k <test_name> 2>&1 | tail -80
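The reproduce command above can be generated for a batch of failures. The following is a minimal sketch, assuming each failure is available as a (test_file, test_name) pair; the helper name build_repro_command and the sample test path are hypothetical and not part of the skill.

```python
import shlex


def build_repro_command(test_file: str, test_name: str, tail_lines: int = 80) -> str:
    """Return the shell command that reruns one failing test.

    Mirrors: source .env && python <test_file> -k <test_name> 2>&1 | tail -80
    """
    # Quote the report-supplied values so unusual characters cannot break the shell line.
    return (
        f"source .env && python {shlex.quote(test_file)} "
        f"-k {shlex.quote(test_name)} 2>&1 | tail -{tail_lines}"
    )


# Hypothetical failure entries, standing in for lines taken from a report.
failures = [
    ("test/inductor/test_example.py", "test_add_xpu"),
]
commands = [build_repro_command(test_file, test_name) for test_file, test_name in failures]
```

Keeping the command as a single string preserves the source .env prefix, so every run picks up the oneAPI environment.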
For each failure, determine the root cause:
First: Use git log to check when the test was added. If recently added, check the introducing commit/PR to see if XPU support is required — then skip to Step 4.
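One way to check when a test was added is git's pickaxe search (-S), which lists commits that changed the number of occurrences of a string. The following is a minimal sketch, assuming the test is defined as def <test_name>( in the given file; the helper names and any arguments passed to them are hypothetical.

```python
import subprocess


def build_history_command(test_file: str, test_name: str) -> list[str]:
    """Build a git command listing commits that added or removed the test definition."""
    return [
        "git", "log",
        "--reverse",               # oldest first, so the introducing commit is listed first
        "--date=short",
        "--format=%h %ad %s",      # short hash, author date, subject
        f"-Sdef {test_name}(",     # pickaxe: commits that change the count of this string
        "--", test_file,
    ]


def first_commit_touching_test(test_file: str, test_name: str) -> str:
    """Run the command inside a git checkout and return the earliest matching line."""
    result = subprocess.run(
        build_history_command(test_file, test_name),
        capture_output=True, text=True, check=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""
```

If the first commit listed is recent, its commit message or linked PR is the place to check whether XPU support is required. A test that was renamed or moved between files will not be traced past that change by this simple search.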
Otherwise, categorize:
| Category | Description | Typical Fix Location |
|---|---|---|
| XPU backend bug | Backend implementation issue | torch/_inductor/ or third_party/torch-xpu-ops/ |
| Tolerance too tight | Numeric precision mismatch | Increase atol/rtol to match CUDA |
| Skip decorator stale | Test now passes on XPU | Remove @skipIfXpu or @expectedFailure |
| Upstream regression | New upstream code broke XPU | Add XPU-specific workaround |
| Test infrastructure | Environment, import, or setup issue | Test setup/config files |
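For the "Tolerance too tight" category, it helps to see how atol and rtol interact. The sketch below reimplements, in plain Python for scalars, the closeness rule documented for torch.testing.assert_close: |actual - expected| <= atol + rtol * |expected|. The function name is_close and the sample values are illustrative assumptions; a real test would pass atol and rtol to its existing comparison call.

```python
def is_close(actual: float, expected: float, *, atol: float, rtol: float) -> bool:
    """Scalar version of the rule |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Illustrative values: a result that differs from the reference by about 3e-5.
expected = 1.0
actual = 1.00003

# Allowed error is 1e-5 + 1.3e-6, smaller than the difference, so this is False.
tight = is_close(actual, expected, atol=1e-5, rtol=1.3e-6)

# Allowed error is 1e-4 + 1e-4, larger than the difference, so this is True.
loose = is_close(actual, expected, atol=1e-4, rtol=1e-4)
```

Because the two tolerances are summed, raising either one can make a failing comparison pass; matching the values already used for CUDA keeps the relaxation bounded.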
Run the linter:
spin fixlint
Analyze the output and fix any linting issues.
Use the following title and description template:
[xpu][fix] <short description>
## Motivation
<why this fix is needed>
## Solution
<what was changed>
## Test plan
<how it was verified>
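The template above can also be filled in programmatically. The following is a minimal sketch; the function name format_fix_message and the sample text are hypothetical, and the blank lines between sections are an assumption about layout.

```python
def format_fix_message(
    short_description: str, motivation: str, solution: str, test_plan: str
) -> str:
    """Render the [xpu][fix] title and the three template sections as one string."""
    return "\n".join([
        f"[xpu][fix] {short_description}",
        "",  # blank separator lines are an assumption, not part of the template
        "## Motivation",
        motivation,
        "",
        "## Solution",
        solution,
        "",
        "## Test plan",
        test_plan,
    ])


# Hypothetical example values.
message = format_fix_message(
    "relax tolerance in an example test",
    "The test fails on XPU because of a numeric precision mismatch.",
    "Increase atol/rtol to match CUDA.",
    "Reran the failing test locally after rebuilding.",
)
```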
- After git rebase, git checkout, or any commit-base change, rebuild before running tests. Without rebuilding, C++ extensions and generated code are stale, and test results will be completely unreliable (segfaults, wrong pass/fail, masked issues).
- … git rebase origin/main) instead.
- … torch/include/. After editing a C++ header, manually copy it to the installed include path.
- Clear the precompiled-header cache (/tmp/torchinductor_<user>/precompiled_headers/) after modifying headers under torch/csrc/inductor/cpp_wrapper/.
- CppCompileError in AOT Inductor generated code: read the .wrapper.cpp error. The root cause is usually codegen ordering in cpp_wrapper_cpu.py (a function used before its definition is emitted). Check the ordering of write_wrapper_decl() and generate_input_output_runtime_checks().
- … agent_space/ (git-ignored)
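The note on precompiled headers appears to call for clearing that cache directory after header edits. The following is a minimal sketch under that assumption; the helper names are hypothetical, and the directory layout is taken from the path quoted in the notes, /tmp/torchinductor_<user>/precompiled_headers/.

```python
import getpass
import shutil
from pathlib import Path


def precompiled_header_dir(user: str, tmp_root: str = "/tmp") -> Path:
    """Return <tmp_root>/torchinductor_<user>/precompiled_headers."""
    return Path(tmp_root) / f"torchinductor_{user}" / "precompiled_headers"


def clear_precompiled_headers(user: str, tmp_root: str = "/tmp") -> bool:
    """Delete the directory if it exists; return True when something was removed."""
    target = precompiled_header_dir(user, tmp_root)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False


if __name__ == "__main__":
    # Assumes the cache lives under /tmp and is keyed by the current login name.
    removed = clear_precompiled_headers(getpass.getuser())
    print("removed" if removed else "nothing to remove")
```

Returning a boolean makes it easy to log whether a stale cache was actually present before the next test run.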
npx claudepluginhub etaf/torch-xpu-skills --plugin torch-xpu-skills