From vanguard-frontier-agentic
Queries Salesforce STDM and Data Cloud for live Agentforce session traces, faithfulness scores, answer relevance, action telemetry, and quality metrics to answer production observability questions.
How this skill is triggered — by the user, by Claude, or both
Slash command
/vanguard-frontier-agentic:salesforce-agentforce-stdm-observer-skill
This skill is limited to the following tools:
Production observability for Agentforce agents via STDM and Data Cloud. This skill is a live evidence reader, not a configuration reviewer. It queries session telemetry, quality scores, and action traces to answer: "Is my agent working correctly right now?" It does not modify agents, configurations, or any org data.
Adaptation note: Query mechanics in this skill are adapted from the
observing-agentforce skill published by Salesforce in the
forcedotcom/sf-skills repository (Apache-2.0). Vanguard-specific additions
include the T1 least-privilege contract, structured audit envelope, explicit
aggregate-only output policy, and the handoff routing model.
Verify-before-merge notice: All Agentforce, STDM, Data Cloud, and Einstein AI feature names evolve rapidly. Validate all product references, DMO field names, and API structures against current official Salesforce documentation before use in production.
Use salesforce-agentforce-stdm-observer-skill when the goal is live
production observability for an Agentforce agent:
Delegate elsewhere when:
| Situation | Skill to use |
|---|---|
| Static review of agent configuration (topics, actions, instructions) | salesforce-agentforce-risk-review-skill |
| Agent is misconfigured and must be changed | T3 — requires human approval via salesforce-live-guard-agent |
| Compliance/privacy review of session data handling | salesforce-compliance-privacy-agent |
| General SOQL record queries unrelated to Agentforce | salesforce-soql-explorer-skill |
| Metadata export and schema inspection | salesforce-metadata-fetcher-skill |
| Authoring or editing .agent files | developing-agentforce (forcedotcom/sf-skills) |
| Agent performance degrades and a fix must be deployed | Route through salesforce-live-guard-agent for human approval |
Before executing any STDM query, confirm all of the following. Ask if missing.
… --target-org value recognized by sf org list. Never accept a raw instance URL or session token.
… MasterLabel) or API name. Resolve against the org before querying (see Step 2).
… getAggregatedMetrics as the first call to bound scope.
… salesforce-compliance-privacy-agent before sharing externally.
Confirm the org alias is reachable and Data Cloud is provisioned:
sf org display --target-org <alias>
Then probe the Data Cloud data spaces endpoint to confirm STDM DMOs are available:
sf api request rest "/services/data/v63.0/ssot/data-spaces" \
--target-org <alias>
Note: sf api request rest is a beta command — do not add --json (that
flag is unsupported and causes an error in this command).
Decision logic:
… DATA_SPACE=default and log it as an assumption.
… status: "Active" data spaces.
… name as DATA_SPACE for all subsequent steps.
If Data Cloud is unavailable, stop and inform the user:
STDM requires Data Cloud with the "Agentforce Activity" data stream active. Navigate to Setup → Data Cloud → Data Streams to verify. This skill cannot proceed without STDM. For local trace analysis without Data Cloud, see the observing-agentforce skill from forcedotcom/sf-skills.
Resolve the user-provided agent name to the exact MasterLabel used by STDM.
Field names and exact object names are drift-prone — run this query and use
the returned values, not the user-provided string:
sf data query \
--query "SELECT Id, MasterLabel, DeveloperName FROM GenAiPlannerDefinition WHERE MasterLabel LIKE '%<user-provided-name>%' OR DeveloperName LIKE '%<user-provided-name>%'" \
--target-org <alias> \
--result-format json
Store:
AGENT_MASTER_LABEL: for the STDM findSessions agent filter
PLANNER_ID: the Salesforce record ID for this agent (redact in output)
If the query returns no results: The agent does not exist in this org. Show the full list of agents and ask the user to identify the target.
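The resolution step above can be sketched in Python. This is a minimal illustration, not part of the skill: the function name `resolve_agent` and the status strings are hypothetical, the "ambiguous" branch is an added assumption (a `LIKE '%name%'` filter can match more than one agent), and the `result.records` payload shape should be checked against the JSON your CLI version actually emits.

```python
import json


def resolve_agent(query_output: str) -> dict:
    """Pick the single matching agent from `sf data query` JSON output.

    Assumes a payload shaped like {"result": {"records": [...]}}.
    """
    payload = json.loads(query_output)
    records = payload.get("result", {}).get("records", [])
    if not records:
        # No match: the caller should list all agents and ask the user.
        return {"status": "not_found"}
    if len(records) > 1:
        # Several agents matched the LIKE filter; ask the user to choose.
        labels = sorted(r["MasterLabel"] for r in records)
        return {"status": "ambiguous", "candidates": labels}
    record = records[0]
    return {
        "status": "resolved",
        "AGENT_MASTER_LABEL": record["MasterLabel"],
        "PLANNER_ID": record["Id"],  # internal use only; redact in output
    }
```

Using the value returned by the org, rather than the string the user typed, keeps the later STDM filter aligned with the stored MasterLabel.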
Retrieve tagging metadata to understand what quality evaluation definitions are configured. This confirms the org has quality scoring enabled before querying scores:
sf data query \
--query "SELECT Id, AiAgentTagId, EntityId, EntityType FROM AiAgentTagAssociation LIMIT 10" \
--target-org <alias> \
--result-format json
If AiAgentTagAssociation returns no rows, quality scores may not be configured. Note this in the output and proceed with session-level metrics only.
Note: AiAgentTag, AiAgentTagDefinition, and AiAgentTagAssociation
are Tooling API objects. Use --use-tooling-api if the standard SOQL path
returns an "object not found" error:
sf data query \
--query "SELECT Id, AiAgentTagId, EntityId, EntityType FROM AiAgentTagAssociation LIMIT 10" \
--target-org <alias> \
--use-tooling-api \
--result-format json
Start with getAggregatedMetrics via the AgentforceOptimizeService Apex
helper class to get the health dashboard before drilling into individual
sessions. This is the most efficient first call and avoids fetching session
content.
Full Apex service deployment steps and invocation patterns are documented
in references/stdm-queries.md (adapted from forcedotcom/sf-skills
observing-agentforce).
String result = AgentforceOptimizeService.getAggregatedMetrics(
'<DATA_SPACE>',
'<START_ISO>',
'<END_ISO>',
50,
'<AGENT_MASTER_LABEL>'
);
System.debug('STDM_RESULT:' + result);
sf apex run --json --file /tmp/stdm_metrics.apex --target-org <alias>
Parse the result using the DEBUG|STDM_RESULT: pattern (see
references/stdm-queries.md). The aggregated metrics return:
total_sessions, total_turns, avg_quality_score
avg_faithfulness, avg_answer_relevance, avg_context_precision
abandonment_rate, deflection_rate, escalation_rate
end_type_counts, quality_distribution, top_intents
unavailable_dmos: list of DMOs that could not be queried
If findSessions returns empty: No production sessions exist in this
date window. Check that the date range is correct and that the agent is
actively receiving traffic. Consider widening the window.
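Extracting the payload logged after the DEBUG|STDM_RESULT: marker can be sketched as below. This is a hedged illustration, not the parser from references/stdm-queries.md: the function name `extract_stdm_result` is hypothetical, and the code assumes `sf apex run --json` returns the debug log as a string under `result.logs`, which should be verified against your CLI version.

```python
import json

MARKER = "DEBUG|STDM_RESULT:"


def extract_stdm_result(apex_run_output: str) -> dict:
    """Return the JSON payload logged after the STDM_RESULT marker.

    Assumes the CLI JSON output carries the debug log under result.logs.
    """
    logs = json.loads(apex_run_output)["result"]["logs"]
    for line in logs.splitlines():
        # partition() splits on the first marker; `found` is empty if absent.
        _, found, tail = line.partition(MARKER)
        if found:
            return json.loads(tail)
    raise ValueError("no STDM_RESULT line found in the debug log")
```

Raising when the marker is missing, instead of returning an empty dict, keeps a failed Apex run from being mistaken for a window with zero sessions.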
If avg_faithfulness or avg_answer_relevance falls below the thresholds
defined in references/observability-rubric.md, run targeted observability
queries:
AgentforceOptimizeService.ObservabilityInput inp = new AgentforceOptimizeService.ObservabilityInput();
inp.queryType = 'Hallucination';
inp.agentApiName = '<AGENT_MASTER_LABEL>';
inp.lookbackDays = 7;
List<AgentforceOptimizeService.ObservabilityOutput> results =
AgentforceOptimizeService.runObservabilityQuery(
new List<AgentforceOptimizeService.ObservabilityInput>{ inp }
);
System.debug('STDM_RESULT:' + results[0].resultJson);
Available query types: KnowledgeGap, Hallucination, RetrievalQuality,
AnswerRelevancy, Leaderboard — see references/stdm-queries.md for the
full table.
Do NOT use getMultipleConversationDetails or getLlmStepDetails
in this skill. Those methods return raw session content (user messages, agent
responses) which may contain PII. This skill operates on aggregate metrics
only. See Redaction Rules below and references/privacy-redaction.md.
Before emitting any result, apply all redaction rules from the Redaction
Rules section and references/privacy-redaction.md. Specifically:
… <record_id_placeholder>.
… <user_id_placeholder>.
… salesforce-compliance-privacy-agent.
Every execution must produce a complete audit envelope (see Audit Envelope Schema). Emit it unconditionally, even if the result set is empty or an error occurred.
Compare results against the rubric in references/observability-rubric.md:
… salesforce-agentforce-risk-review-skill
… salesforce-agentforce-risk-review-skill
… salesforce-live-guard-agent
… salesforce-compliance-privacy-agent
See Handoff Rules for the full escalation matrix.
Score the observability execution quality before emitting results. Threshold: 80+ acceptable, 60–79 emit with caveat, below 60 reject and request revision.
| Dimension | Points | What earns full marks |
|---|---|---|
| Query selectivity | 25 | Time-window applied; agent filter set; no full-DMO scans; aggregate-first approach used |
| Sanitization | 30 | No session content in output; all IDs redacted; regulated-vertical flag applied if applicable; audit envelope populated |
| Metric completeness | 20 | Sessions count, avg_faithfulness, avg_answer_relevance, action_invocation_count, error_rate all reported (or explicitly noted as unavailable) |
| Audit envelope | 15 | All required audit fields present; timestamp accurate; org_type_verified correct |
| Proper delegation | 10 | Anomalies routed to the correct downstream skill; no configuration changes attempted |
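The rubric arithmetic can be sketched as follows. This is an illustrative sketch only: the dictionary keys and the function name `score_execution` are hypothetical, the point caps and verdict thresholds are copied from the table and threshold sentence above, and scoring penalties are not modeled.

```python
# Maximum points per dimension, taken from the rubric table.
MAX_POINTS = {
    "query_selectivity": 25,
    "sanitization": 30,
    "metric_completeness": 20,
    "audit_envelope": 15,
    "proper_delegation": 10,
}


def score_execution(awarded: dict) -> tuple:
    """Sum awarded points, capped per dimension, and map the total to a verdict."""
    total = sum(
        min(max(awarded.get(dim, 0), 0), cap) for dim, cap in MAX_POINTS.items()
    )
    if total >= 80:
        verdict = "acceptable"
    elif total >= 60:
        verdict = "caveat"  # emit with caveat
    else:
        verdict = "reject"  # reject and request revision
    return total, verdict
```

The caps sum to 100, so the total can be emitted directly as quality_score.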
Scoring penalties:
This skill operates exclusively at T1 — read-only runtime. The contract is:
… api, refresh_token, and cdp_query_api only. The cdp_query_api scope is required for Data Cloud SQL queries via the ConnectApi.CdpQuery namespace. No full, web, sfap_api, or any other scope.
… sf org display that the target alias is in the authorized set before any query.
… sf agent publish, sf agent activate, sf project deploy start, or any command that modifies agent configuration or org state.
… observing-agentforce sf-skills pattern for use by human operators with appropriate data handling controls.
Stop immediately and do not execute if any of the following apply:
… salesforce-live-guard-agent.
… Manage Agentforce permission present: this skill requires that permission to be explicitly denied.
… View Setup and Configuration: stop and escalate to org administrator.
… sf project deploy start, sf agent publish, Apex DML, or equivalent.
Every execution emits an audit envelope. The envelope travels with the sanitized output to any downstream skill.
audit_envelope:
  matter_id: "<caller-provided-or-generated-uuid>"
  skill_id: "salesforce-agentforce-stdm-observer-skill"
  skill_version: "0.1.0"
  target_org_alias: "<alias>"              # never the raw org ID
  run_as_user_id: "<user_id_placeholder>"  # placeholder; never real ID in output
  agent_master_label: "<label>"            # display name used for STDM filter
  data_space: "<data_space_name>"          # resolved Data Cloud data space
  query_types_executed: ["getAggregatedMetrics", "runObservabilityQuery"]
  time_window_start: "<ISO-8601-UTC>"
  time_window_end: "<ISO-8601-UTC>"
  redactions_applied:
    - type: "<session_content|user_id|record_id|pii>"
      reason: "<aggregate-only-policy|pii-risk|encrypted>"
  timestamp: "<ISO-8601-UTC>"
  org_type_verified: "sandbox | production"
  regulated_vertical_flag: true | false
  downstream_skill_recommended: "<skill-id or null>"
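Assembling the envelope can be sketched in Python as a plain dictionary. This is a minimal sketch under stated assumptions: the function `build_audit_envelope` and its parameters are hypothetical, the field names are copied from the schema above, and serializing the result to YAML is left to the caller.

```python
import uuid
from datetime import datetime, timezone


def build_audit_envelope(alias, agent_label, data_space, start, end,
                         query_types, org_type, regulated=False,
                         matter_id=None, downstream=None):
    """Assemble the audit envelope as a dict, using placeholders for IDs."""
    if org_type not in ("sandbox", "production"):
        raise ValueError("org_type must be 'sandbox' or 'production'")
    return {
        "matter_id": matter_id or str(uuid.uuid4()),
        "skill_id": "salesforce-agentforce-stdm-observer-skill",
        "skill_version": "0.1.0",
        "target_org_alias": alias,  # the alias, never the raw org ID
        "run_as_user_id": "<user_id_placeholder>",  # never a real ID
        "agent_master_label": agent_label,
        "data_space": data_space,
        "query_types_executed": list(query_types),
        "time_window_start": start,
        "time_window_end": end,
        "redactions_applied": [],  # append {"type": ..., "reason": ...} entries
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "org_type_verified": org_type,
        "regulated_vertical_flag": bool(regulated),
        "downstream_skill_recommended": downstream,
    }
```

Building the envelope before running any query makes it straightforward to emit it even when a later step fails.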
All output is in YAML. Emit this structure for every execution.
verdict: "acceptable | caveat | reject"
quality_score: <0-100>
quality_notes: "<what drove the score>"
aggregate_metrics:
  sessions_count: <integer>
  total_turns: <integer>
  avg_quality_score: <float>          # 1.0-5.0 scale
  avg_faithfulness: <float>           # 0.0-1.0; null if unavailable
  avg_answer_relevance: <float>       # 0.0-1.0; null if unavailable
  avg_context_precision: <float>      # 0.0-1.0; null if unavailable
  action_invocation_count: <integer>  # total across all sessions
  action_error_count: <integer>
  error_rate: <float>                 # action_error_count / action_invocation_count
  abandonment_rate: <float>
  deflection_rate: <float>
  escalation_rate: <float>
  end_type_counts:
    USER_ENDED: <integer>
    AGENT_ENDED: <integer>
    UNKNOWN: <integer>
  quality_distribution:
    "5": <integer>
    "4": <integer>
    "3": <integer>
    "2": <integer>
    "1": <integer>
  top_intents:
    "<intent summary>": <count>
  unavailable_dmos: []
anomalies_detected:
  - dimension: "<faithfulness|relevance|error_rate|abandonment>"
    observed_value: <float>
    threshold: <float>
    severity: "low | medium | high | critical"
    interpretation: "<human-readable explanation>"
sanitized_sample_sessions: null
# Always null in this skill. Session content is never emitted.
# If per-session debugging is genuinely required, route through
# salesforce-live-guard-agent for human-in-the-loop confirmation.
audit_envelope:
  # See Audit Envelope Schema above
downstream_skill_recommendation: "<skill-id or null>"
downstream_routing_reason: "<why this skill was chosen>"
missing_evidence:
  - "<what additional data would improve confidence>"
assumptions:
  - "<explicit list of assumptions made>"
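The error_rate field and its matching anomalies_detected entry can be sketched as below. This is an illustration under assumptions: the function name `error_rate_anomaly` is hypothetical, the 5% default comes from the "Action error rate > 5%" row of the handoff table, and severity is left as a placeholder because its thresholds live in references/observability-rubric.md.

```python
def error_rate_anomaly(action_error_count: int, action_invocation_count: int,
                       threshold: float = 0.05):
    """Compute error_rate and return an anomaly entry if it exceeds threshold.

    Returns (error_rate, anomaly). error_rate is None when there were no
    invocations; anomaly is None when nothing is flagged.
    """
    if action_invocation_count <= 0:
        return None, None  # rate is undefined; report it as unavailable
    rate = action_error_count / action_invocation_count
    if rate <= threshold:
        return rate, None
    anomaly = {
        "dimension": "error_rate",
        "observed_value": round(rate, 4),
        "threshold": threshold,
        "severity": "<set per references/observability-rubric.md>",
        "interpretation": f"{rate:.1%} of action invocations failed",
    }
    return rate, anomaly
```

Guarding the zero-invocation case avoids a division error and keeps an idle agent from being reported as having a 0% error rate.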
Apply in order. Do not bypass for any reason.
… salesforce-live-guard-agent.
… 00D): Replace with <org_id_placeholder>.
… rec_a3f2) to detect duplicates without echoing the raw ID. Never emit raw record IDs.
… <user_id_placeholder>.
… <pii_redacted>.
… _v9): Omit from output; it reveals internal versioning structure.
When metrics cross the thresholds in references/observability-rubric.md, hand off to the appropriate skill with the sanitized output and audit envelope as the payload.
| Finding | Hand off to | Payload required |
|---|---|---|
| Faithfulness drops below threshold | salesforce-agentforce-risk-review-skill | audit_envelope, aggregate_metrics, anomalies_detected |
| Answer relevance below threshold | salesforce-agentforce-risk-review-skill | audit_envelope, aggregate_metrics, anomalies_detected |
| Action error rate > 5% | salesforce-agentforce-risk-review-skill | audit_envelope, aggregate_metrics, error breakdown |
| A configuration change is proposed | salesforce-live-guard-agent | audit_envelope, change_proposal, anomalies_detected |
| Regulated-vertical session anomalies | salesforce-compliance-privacy-agent | audit_envelope, anomalies_detected, vertical_flag |
| General SOQL follow-up needed | salesforce-soql-explorer-skill | audit_envelope, specific query request |
Required handoff fields: matter_id, audit_envelope, aggregate_metrics
(summary — not raw session data), anomalies_detected, assumptions.
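The handoff table and required-field check can be sketched as a lookup. This is a hedged illustration: the finding keys and the function name `build_handoff` are hypothetical labels for the table rows, while the target skill names and required fields are copied from the text above.

```python
# Finding -> downstream skill, copied from the handoff table.
HANDOFF_ROUTES = {
    "faithfulness_below_threshold": "salesforce-agentforce-risk-review-skill",
    "answer_relevance_below_threshold": "salesforce-agentforce-risk-review-skill",
    "action_error_rate_high": "salesforce-agentforce-risk-review-skill",
    "configuration_change_proposed": "salesforce-live-guard-agent",
    "regulated_vertical_anomaly": "salesforce-compliance-privacy-agent",
    "soql_follow_up": "salesforce-soql-explorer-skill",
}

REQUIRED_FIELDS = ("matter_id", "audit_envelope", "aggregate_metrics",
                   "anomalies_detected", "assumptions")


def build_handoff(finding: str, payload: dict) -> dict:
    """Route a finding to its downstream skill after checking the payload."""
    if finding not in HANDOFF_ROUTES:
        raise KeyError(f"no handoff route for finding: {finding}")
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError(f"handoff payload is missing: {', '.join(missing)}")
    return {"hand_off_to": HANDOFF_ROUTES[finding], "payload": payload}
```

Rejecting an incomplete payload at this point means a downstream skill never receives findings without the audit envelope attached.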
Stop and do not continue if:
… Manage Agentforce permission granted: this skill requires it to be denied; stop and escalate to the org administrator.
… salesforce-live-guard-agent.
… salesforce-live-guard-agent for human approval.
… cdp_query_api): Required for Data Cloud SQL queries. This scope does not grant write access to Data Cloud; it permits read-only queries against the cdp_query_api endpoint only.
… salesforce-compliance-privacy-agent before results are shared externally.
| File | When to read |
|---|---|
references/stdm-queries.md | STDM query patterns, SOQL/SQL examples, Apex service methods, Data Cloud cdp_query_api scope, anti-patterns |
references/observability-rubric.md | Thresholds for faithfulness, relevance, error rate, abandonment; escalation matrix |
references/privacy-redaction.md | Agentforce-specific redaction rules, session content policy, human-in-the-loop path |
npx claudepluginhub raishin/vanguard-frontier-agentic --plugin vanguard-frontier-agentic