Write, review, or integrate code that uses Apple's on-device FoundationModels framework for iOS 26.0+ and macOS 26.0+. Covers generative AI features, structured extraction, tool calling, and streaming text on Apple Silicon.
Slash command: `/apple-foundation-models:apple-foundation-models-skill`
- Consult `references/_index.md` at the start of every task to navigate the framework's capabilities.
- […] (`SystemLanguageModel.default.availability`) before attempting to initialize a session; gracefully handle unsupported hardware or missing downloads.
- `LanguageModelSession` must be explicitly isolated (`@State`, `@MainActor`, or `actor`), and `Tool` conformances must be `Sendable`.
- […] `@Generable` for structured output instead of asking the model to write raw JSON.
- […] `Tool` capabilities, and avoid using it for complex mathematical reasoning or authoritative world-knowledge.
- […] `prewarm()` intelligently during idle time to minimize first-token latency.
- […] `LanguageModelSession` instantiation.
- […] `.exceededContextWindowSize` is explicitly caught and handled.
- […] `LanguageModelSession` is not declared locally inside a function or `Task` (which breaks statefulness).
- […] `Tool` implementations to ensure errors are propagated (`throws`) and not silenced with `try?`.
- […] `.default` for conversational prose, `.contentTagging` for classification/extraction.
- […] `respond(to:)` or real-time `streamResponse(to:)`.
- […] `@Generable` structs for structured data extraction, utilizing `@Guide` to strictly constrain output token by token.
- […] `Tool` protocol conformance, ensuring a clear, concise description for the model.
- […] `Arguments` using `@Generable` types.
- […] `call(arguments:)` method, ensuring network calls or database queries are properly awaited and errors are correctly thrown back to the model.

Consult the reference file for each topic relevant to the current task:
| Topic | Reference |
|---|---|
| Core Models & Availability | references/system-language-model.md |
| Session & Transcript Lifecycle | references/session-lifecycle.md |
| Structured Output & @Generable | references/guided-generation.md |
| Expanding capabilities (Tool) | references/tool-calling.md |
| Temperature & Token Limits | references/generation-options.md |
| Real-time UI & Streams | references/streaming.md |
| Context Overflow & Fallbacks | references/error-handling.md |
| Actor Isolation & Sendable | references/concurrency.md |
| Memory, Prewarming & Optimization | references/performance.md |
| Prompt Design & Iteration | references/prompting-techniques.md |
| Framework Terminology | references/glossary.md |
| Prompting Techniques | references/prompting-techniques.md |
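
The session guidance listed before the table (availability check, actor-owned session, idle-time prewarm, context-overflow recovery) could be sketched roughly as follows. `SummaryService`, `makeServiceIfAvailable`, and the instruction text are hypothetical names chosen for illustration, and the sketch assumes a deployment target of iOS 26.0 / macOS 26.0 or later; check the framework calls against the SDK you build with.

```swift
import FoundationModels

/// Hypothetical wrapper (not part of the framework) that owns one long-lived session.
/// Keeping the session inside an actor gives it explicit isolation.
actor SummaryService {
    // Illustrative instruction text; an assumption, not taken from the skill.
    private static let instructions = "Summarize the user's text in one sentence."

    private var session = LanguageModelSession(instructions: SummaryService.instructions)

    /// Call during idle time (for example from a SwiftUI `.task`),
    /// not immediately before `summarize(_:)`.
    func warmUp() {
        session.prewarm()
    }

    func summarize(_ text: String) async throws -> String {
        do {
            return try await session.respond(to: text).content
        } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
            // The transcript no longer fits in the context window:
            // replace the session with a fresh instance, then retry once.
            session = LanguageModelSession(instructions: SummaryService.instructions)
            return try await session.respond(to: text).content
        }
    }
}

/// Returns a service only when the on-device model can actually be used.
func makeServiceIfAvailable() -> SummaryService? {
    let availability = SystemLanguageModel.default.availability
    guard case .available = availability else {
        // Unsupported hardware, Apple Intelligence disabled, or model not downloaded yet.
        print("On-device model unavailable: \(availability)")
        return nil
    }
    return SummaryService()
}
```

The caller is expected to store the returned `SummaryService` somewhere long-lived (for example in `@State`), so the transcript survives between requests.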
These are hard rules — violations will cause runtime crashes, deadlocks, or broken state:
- `SystemLanguageModel.default.availability` is checked before creating any session.
- `LanguageModelSession` is explicitly owned by `@State` or an `actor` (never instantiated locally inside a function).
- `LanguageModelSession.GenerationError.exceededContextWindowSize` is explicitly caught in all `do`/`catch` blocks interacting with the session.
- […] `.exceededContextWindowSize` error (a fresh instance must be created).
- `prewarm()` is called during idle time (e.g., view `.task`), never immediately preceding a `respond(to:)` call.
- […] `Tool.call(arguments:)` are explicitly thrown and never silenced with `try?`.
- `@Generable` properties are ordered logically top-to-bottom, with summary/dependent properties placed last.
- `PartiallyGenerated` types are never instantiated manually, only consumed from `streamResponse`.

Reference files:

- `references/_index.md`: Read first for quick navigation. Index of all documentation.
- `references/system-language-model.md`: Availability states, adapter types (`.default`, `.contentTagging`), and hardware requirements.
- `references/session-lifecycle.md`: Initialization, system instructions, transcript management, and statefulness.
- `references/guided-generation.md`: `@Generable` macros, `@Guide` token constraints, and dynamic schema building.
- `references/tool-calling.md`: `Tool` protocol design, `Sendable` conformance, and `ToolExecutionDelegate`.
- `references/generation-options.md`: Temperature tuning, `.greedy` vs `.random` sampling, and response token capping.
- `references/streaming.md`: `streamResponse(to:)` logic, `PartiallyGenerated` handling for SwiftUI incremental updates.
- `references/error-handling.md`: Mandatory recovery strategies for context overflow and unsupported locales.
- `references/concurrency.md`: Strict Swift 6 isolation invariants, `@MainActor` UI patterns, and cross-actor session usage.
- `references/performance.md`: KV-cache limits, 4096-token budgets, 1.2 GB RAM footprint, and latency reduction via `prewarm()`.
- `references/prompting-techniques.md`: On-device prompt design (clarity, roles, few-shot examples, reasoning fields, and code-side branching).
- `references/glossary.md`: Canonical definitions for terms like "LoRA", "adapter", and "transcript".

npx claudepluginhub alessiorubicini/apple-foundation-models-agent-skill
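
As an illustration of the hard rules on `@Generable` property order and on error propagation in tools, a sketch along these lines may help. `TripPlan`, `WeatherTool`, `TripPlanner`, `WeatherLookupError`, and `fetchTemperature(for:)` are all hypothetical, and the sketch assumes an SDK in which `call(arguments:)` may return a `String`; earlier betas used a dedicated output type, so verify the signature against your SDK.

```swift
import FoundationModels

@Generable
struct TripPlan {
    @Guide(description: "Destination city")
    var city: String

    @Guide(description: "Suggested activities", .count(3))
    var activities: [String]

    // Declared last so it is generated after the fields it summarizes.
    @Guide(description: "One-sentence summary of the plan")
    var summary: String
}

/// Hypothetical error type for the placeholder lookup below.
struct WeatherLookupError: Error {
    let city: String
}

/// Placeholder standing in for a real network call (an assumption for this sketch).
func fetchTemperature(for city: String) async throws -> Double {
    guard city == "Lisbon" else { throw WeatherLookupError(city: city) }
    return 21.5
}

struct WeatherTool: Tool {
    let name = "getWeather"
    let description = "Returns the current temperature for a city."

    @Generable
    struct Arguments {
        @Guide(description: "The city to look up")
        var city: String
    }

    func call(arguments: Arguments) async throws -> String {
        // `try await` lets a lookup failure propagate; it is not swallowed with `try?`.
        let celsius = try await fetchTemperature(for: arguments.city)
        return "It is \(celsius) °C in \(arguments.city)."
    }
}

/// Hypothetical owner of the session; the actor keeps the session long-lived and isolated.
actor TripPlanner {
    private let session = LanguageModelSession(
        tools: [WeatherTool()],
        instructions: "Plan short city trips."
    )

    func plan(for request: String) async throws -> TripPlan {
        let response = try await session.respond(to: request, generating: TripPlan.self)
        return response.content
    }
}
```

Because `WeatherTool` stores only constant strings, it satisfies the `Sendable` requirement without extra annotations.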