Skill

Qfine-tuning-expert

Guides fine-tuning LLMs using LoRA/QLoRA, PEFT, and dataset preparation. Useful for adapting foundation models, hyperparameter tuning, and deploying quantized models.

Hugging Face

OpenAI

Python

ai-ml

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/qe-framework:Qfine-tuning-expert

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Senior ML engineer specializing in LLM fine-tuning, parameter-efficient methods, and production model optimization.

Supporting Files

references/dataset-preparation.md
references/deployment-optimization.md
references/evaluation-metrics.md
references/hyperparameter-tuning.md
references/lora-peft.md

SKILL.md

153 lines · ~1.7k tokens

Stats

Language: JavaScript

Stars: 5

Maintenance: Excellent

Last Commit: Jun 15, 2026

Fine-Tuning Expert

Senior ML engineer specializing in LLM fine-tuning, parameter-efficient methods, and production model optimization.

Core Workflow

1. Dataset prep: validate and format data; run quality checks before training
   - Checkpoint: run validate_dataset.py --input data.jsonl and fix errors before proceeding
2. Method selection: LoRA for most tasks; QLoRA (4-bit) if GPU memory is constrained; full fine-tuning only for small models
3. Training: configure hyperparameters, monitor loss curves, checkpoint regularly
   - Checkpoint: validation loss should decrease; validation loss rising while training loss keeps falling signals overfitting
4. Evaluation: benchmark against the base model; test on a held-out set and edge cases
   - Checkpoint: collect perplexity, task metrics (BLEU/ROUGE), and latency
5. Deployment: merge adapter weights, quantize, measure inference throughput
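
The method-selection step can be sketched as a rough memory check. This is only an illustrative heuristic: the per-parameter byte costs are common rules of thumb that ignore activations, and the 20% headroom, the 1B "small model" cutoff, and the function names are assumptions, not part of this skill.

```python
# Rough per-parameter memory costs in bytes (assumptions; activations excluded).
BYTES_FULL_FT = 16.0   # fp16 weights + grads, fp32 master weights + Adam moments
BYTES_FP16_BASE = 2.0  # frozen fp16 base weights, as used with LoRA
BYTES_4BIT_BASE = 0.5  # frozen 4-bit base weights, as used with QLoRA
HEADROOM = 1.2         # hypothetical 20% margin for activations and adapter state
SMALL_MODEL_B = 1.0    # hypothetical cutoff for "small" models, in billions of params


def estimate_gb(model_size_b: float, bytes_per_param: float) -> float:
    """Estimate memory in GB for a model with `model_size_b` billion parameters."""
    # (size * 1e9 params) * bytes / (1e9 bytes per GB) simplifies to size * bytes.
    return model_size_b * bytes_per_param * HEADROOM


def select_method(model_size_b: float, gpu_memory_gb: float) -> str:
    """Pick 'full', 'lora', or 'qlora' from a rough memory estimate."""
    fits_full = estimate_gb(model_size_b, BYTES_FULL_FT) <= gpu_memory_gb
    if model_size_b <= SMALL_MODEL_B and fits_full:
        return "full"
    if estimate_gb(model_size_b, BYTES_FP16_BASE) <= gpu_memory_gb:
        return "lora"
    if estimate_gb(model_size_b, BYTES_4BIT_BASE) <= gpu_memory_gb:
        return "qlora"
    raise ValueError("Model is unlikely to fit; use a smaller model or more GPUs")
```

For example, under these assumptions a 7B model on a 24 GB GPU lands on LoRA, while a 13B model on the same GPU falls through to QLoRA. Treat the result as a starting point and confirm with a short trial run.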

Code Patterns (3 Examples with Docstrings)

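# Pattern 1: Dataset validation & deduplication
def validate_and_deduplicate_dataset(input_jsonl: str, output_jsonl: str) -> int:
    """Validate JSONL format and remove exact duplicates before fine-tuning.

    Skips malformed lines and records missing 'instruction' or 'output'.
    Raises ValueError if fewer than 100 valid, unique records remain.
    """
    import hashlib
    import json
    seen, valid_count = set(), 0
    with open(input_jsonl, encoding='utf-8') as f_in, \
            open(output_jsonl, 'w', encoding='utf-8') as f_out:
        for line in f_in:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # malformed or blank line
            # Explicit checks instead of assert, which is stripped under `python -O`
            if not (isinstance(record, dict) and record.keys() >= {'instruction', 'output'}):
                continue
            # Content hash that also works when the fields are lists or dicts
            key = json.dumps([record['instruction'], record['output']], ensure_ascii=False)
            record_hash = hashlib.sha256(key.encode('utf-8')).hexdigest()
            if record_hash not in seen:
                seen.add(record_hash)
                f_out.write(line if line.endswith('\n') else line + '\n')
                valid_count += 1
    if valid_count < 100:
        raise ValueError(f"Too few training examples: {valid_count}")
    return valid_count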

# Pattern 2: LoRA config selection
def select_lora_config(model_size_b: float, gpu_memory_gb: int):
    """Select LoRA rank & alpha from GPU capacity.

    model_size_b is currently unused; only gpu_memory_gb drives the choice.
    """
    from peft import LoraConfig, TaskType
    if gpu_memory_gb < 16:
        r, alpha = 8, 16
    elif gpu_memory_gb < 32:
        r, alpha = 16, 32
    else:
        r, alpha = 32, 64
    return LoraConfig(task_type=TaskType.CAUSAL_LM, r=r, lora_alpha=alpha,
                      target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")

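# Pattern 3: Evaluation metrics
def compute_eval_metrics(model, eval_dataset, tokenizer=None):
    """Compute token-weighted eval loss and perplexity on a held-out set.

    Expects an iterable of tensor batches that include 'labels', with -100
    marking positions excluded from the loss. tokenizer is unused here.
    """
    import math
    import torch
    model.eval()
    device = next(model.parameters()).device
    total_loss, total_tokens = 0.0, 0
    with torch.no_grad():
        for batch in eval_dataset:
            batch = {k: v.to(device) for k, v in batch.items()}
            outputs = model(**batch)
            # outputs.loss is a per-token mean, so weight it by the scored token count.
            # Causal LMs shift labels by one, so the first position is never scored.
            n_tokens = (batch['labels'][:, 1:] != -100).sum().item()
            total_loss += outputs.loss.item() * n_tokens
            total_tokens += n_tokens
    eval_loss = total_loss / max(total_tokens, 1)
    return {'perplexity': math.exp(eval_loss), 'eval_loss': eval_loss}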
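
Perplexity is the exponential of the mean per-token negative log-likelihood, so the mean should be taken over tokens rather than over batches. The small sketch below, with made-up loss values and a hypothetical helper name, shows how far the two averages can drift apart when batches contain different numbers of tokens.

```python
import math


def perplexity_from_batches(batches: list[tuple[float, int]]) -> float:
    """Compute perplexity from (mean_token_loss, scored_token_count) pairs."""
    total_nll = sum(mean_loss * n_tokens for mean_loss, n_tokens in batches)
    total_tokens = sum(n_tokens for _, n_tokens in batches)
    if total_tokens == 0:
        raise ValueError("No scored tokens")
    return math.exp(total_nll / total_tokens)


# Two batches with very different token counts (illustrative numbers only).
batches = [(2.0, 900), (4.0, 100)]

# Token-weighted mean loss is (2.0 * 900 + 4.0 * 100) / 1000 = 2.2.
token_weighted = perplexity_from_batches(batches)

# A per-batch average gives (2.0 + 4.0) / 2 = 3.0, which overstates perplexity here.
naive = math.exp(sum(loss for loss, _ in batches) / len(batches))
```

Here the token-weighted value is exp(2.2) while the per-batch average yields exp(3.0), more than twice as large, which is why the weighting matters when comparing a fine-tuned model with its base.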

Comment Template (Google-style)

def finetune_llm_for_task(base_model_id: str, train_path: str, task_type: str) -> str:
    """One-line task summary (e.g., 'Summarization fine-tuning').
    
    Longer: PEFT method rationale, expected improvements, evaluation approach.
    
    Args:
        base_model_id: Hugging Face model ID (e.g., 'meta-llama/Llama-3-8B')
        train_path: Path to JSONL training data
        task_type: Task identifier (e.g., 'summarization', 'classification')
    
    Returns:
        Path to saved LoRA adapter
    
    Raises:
        FileNotFoundError: If train_path not found
        ValueError: If dataset validation fails
    """

Lint Rules (ruff/mypy/black)

[tool.ruff]
line-length = 100

[tool.ruff.lint]
select = ["E", "F", "W", "UP"]

[tool.mypy]
python_version = "3.9"
disallow_untyped_defs = true
ignore_missing_imports = true

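Critical: F841 (unused local variable, e.g. a checkpoint handle that is never read), E501 (line too long, common with long argument lists). Linters do not catch missing loss assertions, so check for those in review.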

Security Checklist (5+)

Training data contamination: no overlap between train/val/test; hash-based dedup; log the data version
Model theft via inference: rate limiting, API auth, per-user quotas, watermarking
Credential exposure: use env vars (e.g. HF_TOKEN) or the token cached by huggingface-cli login; never hardcode keys in config
Poisoning via malicious examples: filter for toxicity on ingestion; flag unusual patterns
Overfitting on small data: use dropout and weight decay; evaluate often (eval_steps < 1000); stop when validation loss plateaus or rises
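
One way to keep train/val/test free of overlap is to assign each record to a split from a hash of its content, so identical records always land in the same split no matter where they appear. The following is a minimal sketch; the function name, the normalization rule, and the 10/10/80 percentages are illustrative assumptions.

```python
import hashlib


def assign_split(instruction: str, output: str,
                 val_pct: int = 10, test_pct: int = 10) -> str:
    """Deterministically map a record to 'train', 'val', or 'test'."""
    # Normalize case and whitespace so trivially different copies hash identically.
    norm_instruction = " ".join(instruction.split()).lower()
    norm_output = " ".join(output.split()).lower()
    key = norm_instruction + "\x1f" + norm_output  # unit separator between fields
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and bucket it into 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"
```

Because the assignment depends only on content, rerunning the split on a newer data version keeps existing records in their original split, which makes it easier to log and compare data versions.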

Anti-patterns (5 Wrong/Correct)

Anti-pattern	Fix
No dataset validation; train on raw data	Always run validation script first; log valid record count
LoRA rank=4 for all tasks	Start at rank 16 (8 when GPU memory is tight); set alpha = 2×rank; tune on eval metrics
Train without warmup or LR schedule	Always use `warmup_ratio=0.03` + `lr_scheduler_type="cosine"`
Skip evaluation on held-out set	Hold out 10–20% test data; compute perplexity + task metrics
Merge adapter without quantization	Merge the adapter into the base weights, then quantize (e.g., with bitsandbytes) before serving
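
To make the warmup and cosine schedule concrete, the sketch below computes the learning rate at a given step in plain Python. It is an illustration of the schedule's shape under the assumption of linear warmup followed by cosine decay to zero; `lr_at_step` is a hypothetical helper, and the trainer's own scheduler remains the source of truth.

```python
import math


def lr_at_step(step: int, total_steps: int, base_lr: float,
               warmup_ratio: float = 0.03) -> float:
    """Learning rate with linear warmup followed by cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With 1,000 total steps and `warmup_ratio=0.03`, the rate climbs for the first 30 steps, peaks at `base_lr`, then decays smoothly. Skipping the warmup exposes freshly initialized adapter weights to the full learning rate from the first step.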

Quick LoRA Template

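from peft import LoraConfig
from trl import SFTTrainer

# Assumes model, training_args, train_data, and eval_data are already defined.
lora_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32,
                         target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")
# Note: recent TRL releases expect the sequence-length limit on SFTConfig rather than
# as an SFTTrainer argument; check the version you have installed.
trainer = SFTTrainer(model=model, args=training_args, train_dataset=train_data,
                     eval_dataset=eval_data, peft_config=lora_config, max_seq_length=2048)
trainer.train()
# Save through the trainer so the PEFT-wrapped model (the adapter) is what gets written.
trainer.save_model("./lora-adapter")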
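
To see what `r=16` on `q_proj` and `v_proj` costs, the sketch below counts the trainable parameters LoRA adds. The model shape (32 layers, hidden size 4096, square projections) is a hypothetical example, and the helper names are illustrative; real models with grouped-query attention have a smaller `v_proj` output dimension.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer."""
    # LoRA learns A (r x d_in) and B (d_out x r); the frozen weight is untouched.
    return r * (d_in + d_out)


def lora_scaling(lora_alpha: int, r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA (alpha / r)."""
    return lora_alpha / r


# Hypothetical model: 32 layers, hidden size 4096, square q_proj and v_proj.
n_layers, hidden, r = 32, 4096, 16
per_module = lora_param_count(hidden, hidden, r)  # 16 * (4096 + 4096) = 131072
total = per_module * 2 * n_layers                 # two modules per layer = 8388608
scale = lora_scaling(32, r)                       # 32 / 16 = 2.0
```

Under these assumptions the adapter trains about 8.4M parameters. Doubling the rank doubles that count, while setting alpha to twice the rank keeps the scaling factor at 2.0.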

MUST DO / MUST NOT DO

MUST: Validate datasets, use PEFT, monitor loss curves, evaluate on held-out set, version configs, include warmup
MUST NOT: Skip validation, train without tracking, overfit on small data, hardcode creds, deploy unquantized
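
The "monitor loss curves" and "overfit on small data" rules can be backed by a simple patience check on the validation losses. The sketch below is one possible stopping rule, with a hypothetical function name and default values. The `transformers` library ships an `EarlyStoppingCallback` that serves a similar purpose inside a `Trainer`.

```python
def should_stop(val_losses: list[float], patience: int = 3,
                min_delta: float = 0.0) -> bool:
    """Return True when validation loss has not improved for `patience` evals."""
    if patience < 1:
        raise ValueError("patience must be at least 1")
    if len(val_losses) <= patience:
        return False  # not enough history to judge yet
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    # Stop if none of the last `patience` evals beat the earlier best by min_delta.
    return recent_best >= best_before - min_delta
```

Calling this after each evaluation with the loss history so far gives a cheap signal to halt training and roll back to the best checkpoint.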
