Use when working with the UChicago ATLAS Analysis Facility, submitting HTCondor batch jobs at UChicago, accessing JupyterLab at af.uchicago.edu, using XCache or Rucio for ATLAS data at UChicago, deploying ML models on Triton at UChicago AF, setting up ServiceX or Coffea Casa, or troubleshooting SSH access to login.af.uchicago.edu
Slash command: /analysis-facilities:uchicago-af
The UChicago ATLAS Analysis Facility (UChicago AF) provides ATLAS physicists with interactive nodes (SSH + JupyterLab), HTCondor batch computing, GPU resources, and advanced data services (XCache, Rucio, ServiceX, Coffea Casa, Triton). It is part of the MWT2 Tier-2 center supporting US ATLAS computing.
| Resource | Value |
|---|---|
| SSH Login | login.af.uchicago.edu |
| JupyterLab | https://af.uchicago.edu |
| XCache | root://xcache.af.uchicago.edu:1094// |
| ServiceX (Uproot) | uproot-atlas.servicex.af.uchicago.edu |
| ServiceX (xAOD) | xaod.servicex.af.uchicago.edu |
| Coffea Casa | https://coffea.af.uchicago.edu |
| Triton gRPC | triton-traefik.triton.svc.cluster.local:8001 |
| Triton S3 | s3://triton-models/<username>/ at s3.af.uchicago.edu |
| Support Email | [email protected] |
| Discourse | atlas-talk.sdcc.bnl.gov |
| Home Directory | /home/<username> (19TB total, shared) |
| Data Directory | /data/<username> (4.5PB, shared) |
| Scratch Directory | /scratch/<username> (temporary, purged) |
| LOCALGROUPDISK | MWT2_UC_LOCALGROUPDISK (15TB via Rucio) |

1. Email [email protected] with your CERN username and home institute
2. Generate an SSH key: ssh-keygen -t ed25519 -C "[email protected]"
3. Upload the public key to your CERN account settings
4. Connect to login.af.uchicago.edu:
   ssh <username>@login.af.uchicago.edu
5. Initial login may take ~15 minutes for home directory sync
6. Verify that /home/<username>, /data/<username>, and /scratch/<username> exist

All ATLAS software is available via CVMFS:
# Setup ATLAS environment
setupATLAS
# Setup specific tool (e.g., Rucio, ROOT, AnalysisBase)
lsetup rucio
lsetup root
lsetup "views LCG_105 x86_64-el9-gcc13-opt"
# Setup specific Athena/AnalysisBase release
asetup AnalysisBase,24.2.38
| Path | Quota | Backed Up | Purged | Best For |
|---|---|---|---|---|
| $HOME (/home/<username>) | Shared 19TB pool | Yes | No | Code, configs, small files |
| $DATA (/data/<username>) | 4.5PB shared pool | No | No | Analysis outputs, ntuples |
| $SCRATCH (/scratch/<username>) | No quota | No | Yes (weekly) | Temporary job outputs |
LOCALGROUPDISK: UChicago provides MWT2_UC_LOCALGROUPDISK (15TB capacity)
accessible via Rucio. This is a local grid storage element for dataset
replication.
Best Practices:
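- Keep code, configs, and small files in $HOME
- Write analysis outputs and ntuples to $DATA
- Use $SCRATCH only for temporary files
- Monitor your $HOME quota: exceeding it will block SSH login

UChicago AF uses HTCondor for batch job submission. Two queues are available, long and short. The short queue (+queue="short") has a maximum runtime of 4 hours and faster scheduling.

Basic Submit File (job.sub):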
universe = vanilla
executable = job.sh
output = logs/out.$(ClusterId).$(ProcId)
error = logs/err.$(ClusterId).$(ProcId)
log = logs/log.$(ClusterId)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = input_data/
transfer_output_files = output/
+queue = "short"
request_cpus = 1
request_memory = 2GB
request_disk = 1GB
x509userproxy = /home/<username>/x509up_u$(UID)
queue 10
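The submit file above names job.sh as its executable but does not show it. A minimal sketch of such a script follows. The output file name, the placeholder work, and the process-index argument are assumptions for illustration; the argument would only be set if job.sub also contained `arguments = $(ProcId)`, which the submit file above does not.

```shell
#!/bin/bash
# job.sh: hypothetical payload for job.sub (a sketch, not a facility template).
set -eu

# Assumption: the process index arrives as the first argument
# (requires `arguments = $(ProcId)` in job.sub). Defaults to 0.
proc_id="${1:-0}"

# job.sub lists output/ in transfer_output_files, so it must exist at exit.
mkdir -p output

# Placeholder work: record which job this was and where and when it ran.
{
  echo "proc_id=${proc_id}"
  echo "host=$(uname -n)"
  echo "started=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
} > "output/result_${proc_id}.txt"

echo "wrote output/result_${proc_id}.txt"
```

Replace the placeholder block with the real analysis command, and keep all files that should come back to the submit node under output/.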
Docker/Singularity Support: Specify container image with
+SingularityImage = "/cvmfs/..." or similar directives.
Key Commands:
| Command | Purpose |
|---|---|
| condor_submit job.sub | Submit job |
| condor_q | View your jobs |
| condor_q -analyze <ClusterId> | Diagnose why job is idle |
| condor_rm <ClusterId> | Remove job |
| condor_status | View available slots |
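
These commands are usually chained into a short submit-and-watch routine. The sketch below wraps the first step in a hypothetical helper, submit_job, which is not an HTCondor command. It assumes the submit file writes under logs/ as job.sub does, and creates that directory first because HTCondor does not create it for you.

```shell
# submit_job: hypothetical wrapper around condor_submit (not part of HTCondor).
# Usage: submit_job [submit_file]
submit_job() {
  local sub_file="${1:-job.sub}"

  if [ ! -f "$sub_file" ]; then
    echo "submit file not found: $sub_file" >&2
    return 1
  fi

  # job.sub writes to logs/out.*, logs/err.*, and logs/log.*;
  # the directory has to exist before submission.
  mkdir -p logs

  condor_submit "$sub_file"
}

# Typical follow-up, using the commands from the table:
#   condor_q                        # list your jobs
#   condor_q -analyze <ClusterId>   # if a job stays idle
#   condor_rm <ClusterId>           # remove it if needed
```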
Access JupyterLab at https://af.uchicago.edu/jupyterlab (requires ATLAS
AuthZ).
Resource Limits:
GPU Specs:
Docker Images:
- ml_platform: TensorFlow, PyTorch, scikit-learn
- AB-stable: AnalysisBase stable release
- AB-dev: AnalysisBase development release

XCache is a high-performance caching layer for ATLAS data. Prefix data paths with:
root://xcache.af.uchicago.edu:1094//
Set the SITE_NAME=AF_200 environment variable for optimal XCache performance.
Example:
export SITE_NAME=AF_200
xrdcp root://xcache.af.uchicago.edu:1094//atlas/rucio/... output.root
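Prefixing paths by hand is easy to get wrong when many files are involved. As a sketch, the hypothetical helper below builds an XCache URL from a data path. The function name and the example file path are illustrative assumptions; only the prefix itself comes from the text above.

```shell
# Prefix taken from the text above.
XCACHE_PREFIX="root://xcache.af.uchicago.edu:1094//"

# xcache_url: hypothetical helper that prepends the XCache prefix to a data path.
# A single leading slash on the path is dropped so the result keeps exactly
# the double slash that the prefix already provides.
xcache_url() {
  local path="${1#/}"
  printf '%s%s\n' "$XCACHE_PREFIX" "$path"
}

# Example (the file path is made up for illustration):
#   export SITE_NAME=AF_200
#   xrdcp "$(xcache_url /atlas/rucio/user/example.root)" output.root
```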
Rucio is the ATLAS distributed data management system.
Setup:
lsetup rucio
voms-proxy-init -voms atlas
List Replicas:
rucio list-file-replicas <scope>:<dataset_name>
Download Dataset:
rucio download <scope>:<dataset_name>
CERN EOS is mounted on login nodes at /eos but NOT available on worker nodes.
Access on Login Nodes:
kinit <username>@CERN.CH
ls /eos/atlas/...
Access on Worker Nodes: Use xrdcp with full EOS URL:
xrdcp root://eosatlas.cern.ch//eos/atlas/... output.root
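Because /eos exists only on login nodes, a script that runs in both places has to choose between a plain copy and xrdcp. A minimal sketch, assuming the file lives under /eos/atlas; eos_url and fetch_from_eos are hypothetical helper names, not existing tools.

```shell
# eos_url: turn an /eos/atlas/... path into the XRootD URL form shown above.
eos_url() {
  case "$1" in
    /eos/atlas/*) printf 'root://eosatlas.cern.ch/%s\n' "$1" ;;
    *) echo "not an /eos/atlas path: $1" >&2; return 1 ;;
  esac
}

# fetch_from_eos: copy through the mount when the file is readable
# (login nodes), otherwise fall back to xrdcp (worker nodes).
#   $1: source path   $2: destination path
fetch_from_eos() {
  local src="$1" dest="$2" url
  if [ -r "$src" ]; then
    cp "$src" "$dest"
  else
    url="$(eos_url "$src")" || return 1
    xrdcp "$url" "$dest"
  fi
}
```

The same call, fetch_from_eos <eos_path> output.root, then works unchanged in an interactive session and inside a batch job, provided a valid credential for EOS is available on the worker.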
An X509 (VOMS) proxy is required for grid operations (Rucio, XRootD, CERN services).
Generate Proxy:
voms-proxy-init -voms atlas -valid 96:00
Check Proxy:
voms-proxy-info
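Store the proxy in $HOME so that batch jobs can reference it through x509userproxy. Note that voms-proxy-init writes to /tmp/x509up_u<uid> by default; use -out or the X509_USER_PROXY environment variable to place it elsewhere, and confirm the path with voms-proxy-info.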
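Long batch campaigns often fail because the proxy expired partway through. The sketch below checks the remaining lifetime before submitting. It assumes voms-proxy-info -timeleft prints the remaining lifetime in seconds; the 12-hour threshold and both function names are arbitrary choices for illustration.

```shell
# proxy_needs_renewal: succeed (exit 0) when fewer than MIN_HOURS remain.
#   $1: seconds left on the proxy
#   $2: minimum hours required (default 12, an arbitrary threshold)
proxy_needs_renewal() {
  local seconds_left="$1" min_hours="${2:-12}"
  [ "$seconds_left" -lt $((min_hours * 3600)) ]
}

# ensure_proxy: renew the proxy with the command from the text when it is
# missing or close to expiry.
ensure_proxy() {
  local left
  # Treat a missing proxy (non-zero exit) as zero seconds left.
  left="$(voms-proxy-info -timeleft 2>/dev/null)" || left=0
  if proxy_needs_renewal "${left:-0}"; then
    voms-proxy-init -voms atlas -valid 96:00
  fi
}
```

Calling ensure_proxy just before condor_submit keeps the check next to the step that depends on it.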
Two ServiceX instances for ATLAS data delivery:
- Uproot: uproot-atlas.servicex.af.uchicago.edu
- xAOD: xaod.servicex.af.uchicago.edu

Setup:
servicex.yaml config:
api_endpoints:
  - name: uchicago-uproot
    endpoint: https://uproot-atlas.servicex.af.uchicago.edu
    token: <your_token>
  - name: uchicago-xaod
    endpoint: https://xaod.servicex.af.uchicago.edu
    token: <your_token>
Coffea Casa at https://coffea.af.uchicago.edu provides Dask + HTCondor
autoscaling for columnar analysis with Coffea.
Features:
Deploy ML models for inference at scale with Triton.
gRPC Endpoint: triton-traefik.triton.svc.cluster.local:8001
Model Repository: S3 bucket at s3.af.uchicago.edu
- User prefix: s3://triton-models/<username>/
- Model path: s3://triton-models/<username>/<model_name>/

Supported Backends:
Model Polling: Triton polls the S3 repository every 60 seconds for new models.
Model Directory Structure:
<model_name>/
  config.pbtxt
  1/
    model.onnx (or model.plan, model.pt, etc.)
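
The layout above can be staged locally before uploading. A minimal sketch, assuming an ONNX model: the helper name stage_triton_model, the model name, and the two-line config.pbtxt are illustrative assumptions, and the upload command in the trailing comment is one possible S3 client invocation rather than a documented facility procedure.

```shell
# stage_triton_model: hypothetical helper that builds
#   <model_name>/config.pbtxt
#   <model_name>/1/model.onnx
# from an existing ONNX file.
#   $1: model name   $2: path to the .onnx file
stage_triton_model() {
  local name="$1" model_file="$2"

  if [ ! -f "$model_file" ]; then
    echo "model file not found: $model_file" >&2
    return 1
  fi

  mkdir -p "$name/1"
  cp "$model_file" "$name/1/model.onnx"

  # Minimal config; real models usually also declare inputs, outputs,
  # and batching settings here.
  cat > "$name/config.pbtxt" <<EOF
name: "$name"
backend: "onnxruntime"
EOF
}

# Possible upload with the AWS CLI (endpoint scheme and credentials are assumptions):
#   aws s3 sync my_model "s3://triton-models/<username>/my_model/" \
#     --endpoint-url https://s3.af.uchicago.edu
```

Given the 60-second polling interval mentioned above, a newly uploaded model may take about a minute to appear.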
Connect local VSCode to JupyterLab kernels:
- https://af.uchicago.edu/jupyterlab

| Resource | Quantity |
|---|---|
| Interactive SSH Nodes | 4 |
| Interactive Jupyter Nodes | 4 |
| HTCondor Long Queue Cores | 1520 |
| HTCondor Short Queue Cores | 1280 |
| GPUs (A100 40GB) | 2x4 = 8 |
| GPUs (V100 32GB) | 1x4 = 4 |
| GPUs (RTX 2080 Ti) | 3x8 = 24 |
| GPUs (GTX 1080 Ti) | 1x8 = 8 |
| Data Storage | 4.5PB |
| Cold Storage | 4.5PB |
| Home Storage | 19TB (shared) |
| Issue | Cause | Solution |
|---|---|---|
| SSH login rejected | RSA key type not allowed | Generate ed25519 or ecdsa key |
| Grid commands fail | Expired X509 proxy | Regenerate with voms-proxy-init -voms atlas |
| Cannot write files | $HOME quota exceeded | Clean up old files or use $DATA |
| Job outputs lost | Used $SCRATCH | Use $DATA for persistent storage |
| EOS mount not found on worker | EOS only on login nodes | Use xrdcp with full EOS URL on workers |
| JupyterLab session killed | Exceeded resource limits or 72h timeout | Reduce resource request or split work |

- Email: [email protected]
- Discourse: atlas-talk.sdcc.bnl.gov (ATLAS community forum)
- Documentation: https://usatlas.github.io/af-docs/
npx claudepluginhub usatlas/marketplace --plugin analysis-facilities