From oracle-ai-data-platform-workbench-engineer-agent
Run LLM functions inside Spark SQL on AIDP via ai_generate(). Use when the user wants to summarize/classify/extract/enrich rows with an LLM directly in SQL, generate narratives over aggregated results, or do grounded RAG-style analysis in the lakehouse. Signature is model-first; available models must be confirmed live before relying on it.
Slash command: /oracle-ai-data-platform-workbench-engineer-agent:aidp-ai-sql
aidp-ai-sql: LLM-in-SQL with ai_generate()
Call an LLM directly inside Spark SQL on AIDP to summarize, classify, extract, or narrate over lakehouse data without leaving SQL. A signature differentiator: most competitor agents can't do this inline.
This is a SQL-helper skill. Interactive Spark SQL runs through the bundled helper
scripts/aidp_sql.py (it mints a UPST from the api_key DEFAULT profile, auto-creates a scratch notebook,
and returns JSON). No aidp MCP and no AIDP_SESSION required.
ai_generate('<model>', '<prompt>')
Example: ai_generate('openai.gpt-5.4', 'Summarize this supplier spend: ...').
The model-first (model, prompt) signature has been live-verified with openai.gpt-5.4, openai.gpt-4o, and xai.grok-4.
Verify before relying on it (no-fabrication): confirm the exact signature and the available model names live on the target cluster before treating this as guaranteed. Run a trivial SELECT ai_generate('<model>', 'hello') cell first (see smoke test below). Model availability varies by environment. If a model name fails, list or ask for the correct one rather than guessing.
Don't gate on the /models REST catalog. ai_generate resolves the model at the Spark engine level, so it can work even when aidp-models-catalog's GET /models?modelType=GENERATIVE_AI returns an empty list. The smoke test (not the catalog endpoint) is the source of truth for whether ai_generate works.
python "$PLUGIN_DIR/scripts/aidp_sql.py" \
--region <region> --datalake <DATALAKE_OCID> --workspace <ws> --cluster <cluster-key> \
--code "<python/spark code>"
Returns JSON: {"status":"ok|error","outputs":[...],"spark_job_ids":[...]}. Exit 0 on success, 1 on
cell error. See scripts/aidp_sql.py for full flags (--profile,
--session-profile, --notebook, --timeout).
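The JSON shape and exit codes described above can be consumed from Python roughly as follows. This is a minimal sketch, not part of the plugin: `interpret_result` and `run_cell` are hypothetical names, and it assumes the helper prints only the JSON document on stdout.

```python
import json
import subprocess
import sys


def interpret_result(returncode: int, stdout: str) -> dict:
    """Parse the helper's JSON and cross-check it against the exit code."""
    result = json.loads(stdout)  # assumption: stdout holds only the JSON document
    ok = returncode == 0 and result.get("status") == "ok"
    return {
        "ok": ok,
        "outputs": result.get("outputs", []),
        "spark_job_ids": result.get("spark_job_ids", []),
    }


def run_cell(helper_path: str, base_args: list, code: str, timeout_s: int = 600) -> dict:
    """Run one cell through the helper (hypothetical wrapper, not shipped with the plugin)."""
    proc = subprocess.run(
        [sys.executable, helper_path, *base_args, "--code", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return interpret_result(proc.returncode, proc.stdout)
```

Passing the cell as a list element to `subprocess.run` avoids a layer of shell quoting around the `--code` argument, which is where nested quotes in SQL tend to go wrong.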
python "$PLUGIN_DIR/scripts/aidp_sql.py" --region <region> --datalake <ocid> --workspace <ws> --cluster <key> \
--code "spark.sql(\"SELECT ai_generate('openai.gpt-5.4', 'hello')\").show(truncate=False)"
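The smoke test above can be repeated for each model name you intend to use. The sketch below builds the same one-line cell per model and records which names pass. `smoke_test_code` and `check_models` are hypothetical helpers, and `run_cell` stands for any callable that submits a cell and returns the helper's parsed JSON (a dict with a `status` key). Only pass names you have a reason to expect on the cluster; a failure means asking for the correct name, not guessing another.

```python
def smoke_test_code(model: str) -> str:
    """Build the one-line PySpark cell used as the smoke test for one model."""
    # Reject characters that would break out of the SQL or Python string literals.
    if "'" in model or '"' in model or "\\" in model:
        raise ValueError(f"suspicious model name: {model!r}")
    return f"spark.sql(\"SELECT ai_generate('{model}', 'hello')\").show(truncate=False)"


def check_models(models, run_cell):
    """Return {model: bool}; run_cell(code) must return a dict with a 'status' key."""
    return {m: run_cell(smoke_test_code(m)).get("status") == "ok" for m in models}
```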
oci raw-request; see references/no-mcp-rest-map.md).
ai_generate('<model>', '<grounded prompt>').
For per-row enrichment, call it as a column expression over a bounded set.
Pass the cell to --code:
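# Aggregate first so the prompt stays small; toPandas() pulls the whole result to the driver.
ctx = spark.sql("SELECT ... FROM gold.supplier_spend ...").toPandas().to_string()
# Escape backslashes and single quotes so the data cannot break the SQL string literal (assumes Spark's default backslash escaping).
safe_ctx = ctx.replace("\\", "\\\\").replace("'", "\\'")
res = spark.sql(f"SELECT ai_generate('openai.gpt-5.4', 'As a finance analyst, summarize: {safe_ctx}') AS summary")
res.show(truncate=False)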
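For the per-row case mentioned above, the query can be generated so that the row bound is always explicit. This is a sketch under assumptions: `enrichment_sql` is a hypothetical helper, the `description` column in the usage line is invented for illustration, and it assumes `ai_generate` accepts a non-literal prompt expression such as `concat(...)`, which should be confirmed on the target cluster before relying on it.

```python
def enrichment_sql(table: str, text_col: str, model: str, instruction: str, limit: int = 100) -> str:
    """Build a bounded per-row ai_generate() query; each returned row is one LLM call."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Escape for a Spark SQL string literal (assumes default backslash escaping).
    safe_instruction = instruction.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"SELECT {text_col}, "
        f"ai_generate('{model}', concat('{safe_instruction} ', {text_col})) AS enriched "
        f"FROM (SELECT {text_col} FROM {table} LIMIT {limit}) AS bounded"
    )


# Hypothetical usage: the 'description' column is an assumption, not from the source.
sql = enrichment_sql("gold.supplier_spend", "description", "openai.gpt-5.4",
                     "Classify the supplier's category:", limit=50)
```

The LIMIT sits in the inner subquery so the bound is applied before `ai_generate` is evaluated, which keeps the number of model calls, and so cost and latency, predictable.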
ai_generate (cost + latency).
aidp-analyzing-data