apple-foundation-models-skill | apple-foundation-models

Apple Foundation Models Skill

Operating Rules

Consult references/_index.md at the start of every task to navigate the framework's capabilities.
Always check model availability (SystemLanguageModel.default.availability) before attempting to initialize a session; gracefully handle unsupported hardware or missing downloads.
Strictly adhere to Swift 6 concurrency rules: LanguageModelSession must be explicitly isolated (@State, @MainActor, or actor), and Tool conformances must be Sendable.
Respect the strict 4096-token context limit per session; design proactive recovery strategies for long conversations.
Prefer @Generable for structured output instead of asking the model to write raw JSON.
Never use the model for real-time data retrieval without injecting Tool capabilities, and avoid relying on it for complex mathematical reasoning or authoritative world knowledge.
Focus on hardware performance: use prewarm() intelligently during idle time to minimize first-token latency.
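
The availability rule above can be sketched as a small helper. This is a minimal sketch, not a definitive implementation: the function name `unavailabilityMessage(for:)` and the message strings are hypothetical, and the unavailability reasons listed are the ones assumed to be exposed by the current SDK.

```swift
import FoundationModels

/// Hypothetical helper: returns nil when the model can be used,
/// otherwise a user-facing explanation of why it cannot.
func unavailabilityMessage(for model: SystemLanguageModel = .default) -> String? {
    switch model.availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        // Unsupported hardware: hide or replace the feature.
        return "This device does not support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use this feature."
    case .unavailable(.modelNotReady):
        // The model assets may still be downloading.
        return "The model is not ready yet. Try again later."
    case .unavailable:
        // Catch-all for any reason not listed above.
        return "The model is currently unavailable."
    }
}
```

A caller would create a LanguageModelSession only when this helper returns nil, and show the message otherwise.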

Task Workflow

Review existing AI integration code

Read the code under review and identify which topics apply.
Flag any missing availability checks before LanguageModelSession instantiation.
Validate that .exceededContextWindowSize is explicitly caught and handled.
Ensure LanguageModelSession is not declared locally inside a function or Task (which breaks statefulness).
Check Tool implementations to ensure errors are propagated (throws) and not silenced with try?.
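
As a point of comparison during review, the ownership and overflow checks above might look like the following sketch. The `ChatModel` type, its instructions string, and the retry-once policy are assumptions made for illustration; a real app may prefer to carry a summary of the old transcript into the new session.

```swift
import FoundationModels

@MainActor
final class ChatModel {
    // Developer-controlled instructions, reused whenever the session is rebuilt.
    private static let instructions = "You are a concise assistant for a note-taking app."

    // The session is a stored property, so conversation state survives across calls.
    private var session = LanguageModelSession(instructions: ChatModel.instructions)

    /// Sends a prompt and retries once on a fresh session if the context window overflows.
    func send(_ prompt: String) async throws -> String {
        do {
            return try await session.respond(to: prompt).content
        } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
            // The overflowed session is discarded and never reused.
            // Note: earlier conversation history is lost in this sketch.
            session = LanguageModelSession(instructions: ChatModel.instructions)
            return try await session.respond(to: prompt).content
        }
    }
}
```

Any other error, including a second overflow on the fresh session, propagates to the caller instead of being silenced.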

Implement new Foundation Models feature

Determine the correct model adapter: .default for conversational prose, .contentTagging for classification/extraction.
Design data flow: choose between a single respond(to:) call that returns the complete response and streamResponse(to:) for incremental, real-time output.
Define @Generable structs for structured data extraction, utilizing @Guide to strictly constrain output token by token.
Build instructions safely, keeping them developer-controlled and strictly separated from user input.
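
The steps above can be sketched as follows. The `RecipeSummary` type, its fields, the `recipeInstructions` constant, and `summarizeRecipe(_:using:)` are hypothetical names chosen for illustration; the session is passed in by the caller so that it is not created locally inside the function.

```swift
import FoundationModels

// Developer-controlled instructions, kept separate from anything the user types.
let recipeInstructions = "Extract a structured summary from the recipe text the user provides."

@Generable
struct RecipeSummary {
    @Guide(description: "A short, descriptive title for the recipe")
    var title: String

    @Guide(description: "Up to five key ingredients", .maximumCount(5))
    var keyIngredients: [String]

    // Depends on the fields above, so it is declared last.
    @Guide(description: "Difficulty from 1 (easy) to 5 (hard)", .range(1...5))
    var difficulty: Int
}

/// Asks the model for a RecipeSummary instead of raw JSON.
/// `session` is assumed to be owned by the caller and created with `recipeInstructions`.
func summarizeRecipe(
    _ userText: String,
    using session: LanguageModelSession
) async throws -> RecipeSummary {
    // User input goes into the prompt only, never into the instructions.
    let response = try await session.respond(to: userText, generating: RecipeSummary.self)
    return response.content
}
```

The caller would create the session once, for example `LanguageModelSession(instructions: recipeInstructions)`, and reuse it across requests.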

Implement Tool Calling

Define the Tool protocol conformance, ensuring a clear, concise description for the model.
Define Arguments using @Generable types.
Implement the call(arguments:) method, ensuring network calls or database queries are properly awaited and errors are correctly thrown back to the model.
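
A minimal sketch of these three steps, under stated assumptions: `OrderStatusTool`, `LookupError`, and `fetchStatus(for:)` are hypothetical, and the stand-in lookup replaces a real network or database call. Returning a String from call(arguments:) assumes the current SDK, where the output only needs to be representable in a prompt; earlier betas used a different return type.

```swift
import FoundationModels

struct OrderStatusTool: Tool {
    // A short name and description help the model decide when to call the tool.
    let name = "lookUpOrderStatus"
    let description = "Looks up the shipping status of an order by its ID."

    @Generable
    struct Arguments {
        @Guide(description: "The order identifier, for example A1234")
        var orderID: String
    }

    // Hypothetical error type for this sketch.
    enum LookupError: Error {
        case orderNotFound(String)
    }

    func call(arguments: Arguments) async throws -> String {
        // `try await` without `try?`: failures propagate instead of being silenced.
        let status = try await fetchStatus(for: arguments.orderID)
        return "Order \(arguments.orderID) is \(status)."
    }

    // Stand-in for a real network or database request.
    private func fetchStatus(for orderID: String) async throws -> String {
        guard orderID.hasPrefix("A") else {
            throw LookupError.orderNotFound(orderID)
        }
        return "out for delivery"
    }
}
```

The tool is made available by passing it at session creation, for example `LanguageModelSession(tools: [OrderStatusTool()], instructions: "...")`. Because the struct stores only constant strings, it satisfies the Sendable requirement without extra annotations.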

Topic Router

Consult the reference file for each topic relevant to the current task:

Topic	Reference
Core Models & Availability	`references/system-language-model.md`
Session & Transcript Lifecycle	`references/session-lifecycle.md`
Structured Output & `@Generable`	`references/guided-generation.md`
Expanding capabilities (`Tool`)	`references/tool-calling.md`
Temperature & Token Limits	`references/generation-options.md`
Real-time UI & Streams	`references/streaming.md`
Context Overflow & Fallbacks	`references/error-handling.md`
Actor Isolation & Sendable	`references/concurrency.md`
Memory, Prewarming & Optimization	`references/performance.md`
Prompt Design, Techniques & Iteration	`references/prompting-techniques.md`
Framework Terminology	`references/glossary.md`

Correctness Checklist

These are hard rules: violating them can cause runtime crashes, deadlocks, or broken state.

SystemLanguageModel.default.availability is checked before creating any session.
LanguageModelSession is explicitly owned by @State, a @MainActor type, or an actor (never instantiated locally inside a function).
LanguageModelSession.GenerationError.exceededContextWindowSize is explicitly caught in all do/catch blocks interacting with the session.
A session is never reused after throwing an .exceededContextWindowSize error (a fresh instance must be created).
prewarm() is called during idle time (e.g., view .task), never immediately preceding a respond(to:) call.
Errors inside Tool.call(arguments:) are explicitly thrown and never silenced with try?.
Instructions strings are strictly hardcoded or developer-controlled, never built directly from user input.
@Generable properties are ordered logically top-to-bottom, with summary/dependent properties placed last.
PartiallyGenerated types are never instantiated manually, only consumed from streamResponse.
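
Several of the checklist items can be seen together in a small SwiftUI sketch. `AssistantView`, `assistantInstructions`, and the UI layout are hypothetical; the sketch also assumes a parent view has already confirmed that the model is available before showing this view.

```swift
import SwiftUI
import FoundationModels

// Developer-controlled instructions; never built from user input.
private let assistantInstructions = "You answer questions about the user's notes in one short paragraph."

struct AssistantView: View {
    // Owned by @State so the session and its transcript persist across view updates.
    @State private var session = LanguageModelSession(instructions: assistantInstructions)
    @State private var question = ""
    @State private var answer = ""

    var body: some View {
        VStack {
            TextField("Ask a question", text: $question)
            Button("Ask") {
                Task { await ask() }
            }
            .disabled(session.isResponding)
            Text(answer)
        }
        .task {
            // Prewarm while the view is idle, well before the first request.
            session.prewarm()
        }
    }

    private func ask() async {
        do {
            answer = try await session.respond(to: question).content
        } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
            // Replace the overflowed session; it must not be reused.
            session = LanguageModelSession(instructions: assistantInstructions)
            answer = "The conversation was too long and has been reset. Please ask again."
        } catch {
            answer = "Something went wrong: \(error.localizedDescription)"
        }
    }
}
```

Resetting the conversation on overflow is only one possible recovery strategy; the point of the sketch is that the error is handled explicitly and the old session is dropped.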

References

references/_index.md — Read first for quick navigation. Index of all documentation.
references/system-language-model.md — Availability states, adapter types (.default, .contentTagging), and hardware requirements.
references/session-lifecycle.md — Initialization, system instructions, transcript management, and statefulness.
references/guided-generation.md — @Generable macros, @Guide token constraints, and dynamic schema building.
references/tool-calling.md — Tool protocol design, Sendable conformance, and ToolExecutionDelegate.
references/generation-options.md — Temperature tuning, .greedy vs .random sampling, and response token capping.
references/streaming.md — streamResponse(to:) logic, PartiallyGenerated handling for SwiftUI incremental updates.
references/error-handling.md — Mandatory recovery strategies for context overflow and unsupported locales.
references/concurrency.md — Strict Swift 6 isolation invariants, @MainActor UI patterns, and cross-actor session usage.
references/performance.md — KV-cache limits, 4096-token budgets, 1.2 GB RAM footprint, and latency reduction via prewarm().
references/prompting-techniques.md — On-device prompt design: clarity, roles, few-shot examples, reasoning fields, and code-side branching.
references/glossary.md — Canonical definitions for terms like "LoRA", "adapter", and "transcript".