Skill

assistant

Use when creating or editing Kodexa assistants — YAML configuration for event-driven processing pipelines including step definitions, connections to stores, event subscriptions, conditionals, and module orchestration

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/kodexa-metadata-skills:assistant

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Assistants are event-driven processing pipelines in Kodexa. They connect to stores and channels, subscribe to events (new documents, status changes), and execute modules in sequence to process documents. Assistants can be reactive (triggered by events), schedulable (run on cron), or both.

SKILL.md

265 lines · ~2.3k tokens

Stats

Stars: 0

Maintenance: Excellent

Last Commit: May 15, 2026

Kodexa Assistant Authoring

Overview

Assistants vs Activity Plans (post-2026-05-02)

The activity refactor introduced activity-plan (a graph of orchestrated steps) and trigger (event-driven launches of an activity). The two abstractions overlap with assistants — pick the right one:

Use case	Pick
Reactive document-processing pipeline (parse → extract → label → write)	Assistant (this skill) — designed for streaming document events
Orchestrated workflow with human review, approvals, branches, multi-step gating	`activity-plan` + `trigger` (see `activity-plan` and `project-template` skills)
Cron-driven module run (no event chain)	Scheduled job in project-template, or a `schedulable: true` assistant
Wiring a service-bridge HTTP call into a pipeline	Either: an assistant module that does it, or an `activity-plan` step with `kind: BRIDGE_CALL`

In short: assistants are the right choice for document-event-driven pipelines (the platform's bread and butter); activity-plans are the right choice for human-in-the-loop workflows and explicit step graphs. They can coexist — an assistant can process a document, and the resulting task can fire a trigger that starts an activity-plan.

Terminology note. Where this skill (or older code) refers to "plans" or "planned items" produced by an assistant, the platform now persists those as Activity and Step rows. The runtime field formerly known as Activity.status is now Activity.lifecycleState. Subscription expressions that filter on activity-related events should use the new field name.

When to Use

Creating a new assistant for document processing
Configuring event subscriptions and store connections
Setting up multi-step processing pipelines
Adding conditional logic to processing flows
Debugging assistant event handling or execution issues

Interactive Wizard

Purpose — What does the assistant do? (extraction, classification, validation, transformation, routing)
Trigger — What triggers it? (new document uploaded, status change, schedule, manual)
Stores — Which stores does it read from / write to?
Steps — What processing steps? (which modules, in what order)
Conditions — Any conditional logic? (skip if already processed, route by type)
Error handling — What happens on failure? (retry, move to exceptions, notify)

Once these questions are answered, generate the assistant YAML configuration.

Extension Pack Assistant Definition

name: "My Assistant"
slug: my-assistant
description: "What this assistant does"
type: assistant
reactive: true                         # Triggered by events
schedulable: false                     # Can run on schedule
publicAccess: true                     # Available to all orgs
template: true                         # Is a template

assistant:
  package: my_package                  # Python package name
  class: MyAssistant                   # Python class name

options:                               # User-configurable options
  - name: confidence_threshold
    label: "Confidence Threshold"
    type: number
    default: 0.85
    required: false
    description: "Minimum confidence for auto-processing"
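
How declared option defaults combine with values supplied at the project level is not spelled out here. The sketch below assumes the simplest rule: start from each option's `default`, let supplied values override it, and reject a missing `required` option. `resolve_options` is a hypothetical helper for illustration, not a Kodexa API.

```python
def resolve_options(declared: list, supplied: dict) -> dict:
    """Merge declared defaults with supplied values (assumed rule: supplied wins)."""
    resolved = {}
    for option in declared:
        name = option["name"]
        if name in supplied:
            resolved[name] = supplied[name]
        elif "default" in option:
            resolved[name] = option["default"]
        elif option.get("required", False):
            raise ValueError(f"required option '{name}' was not supplied")
    return resolved


# Mirrors the option declared in the definition above.
declared = [
    {"name": "confidence_threshold", "type": "number", "default": 0.85, "required": False},
]

print(resolve_options(declared, {}))
print(resolve_options(declared, {"confidence_threshold": 0.9}))
```

The first call falls back to the declared default; the second shows a supplied value taking precedence under the assumed rule.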

Project-Level Assistant Configuration

When used in a project or project template:

assistants:
  - name: "Document Processor"
    slug: doc-processor
    description: "Processes incoming documents"

    # Module reference
    assistantDefinitionRef: kodexa/pdf-extractor

    # Execution settings
    priorityHint: 10                   # Higher = more priority
    loggingEnabled: true               # Detailed execution logs
    chatEnabled: false                 # Enable chat interface
    assistantRole: "extractor"         # Role identifier

    # Store access
    stores:
      - "${orgSlug}/${project.id}-intake"
      - "${orgSlug}/${project.id}-output"

    # Event connections
    connections:
      - sourceType: STORE
        sourceRef: "${orgSlug}/${project.id}-intake"
        subscription: "!hasMixins('processed')"

    # Module options
    options:
      use_ocr: true
      confidence_threshold: 0.85

    # Optional scheduling
    schedules:
      - cronExpression: "0 0 8 * * *"  # 8 AM daily
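
The store and connection refs above rely on the `${orgSlug}` and `${project.id}` template variables. To show what a resolved ref could look like, the sketch below assumes plain textual substitution. The `expand_ref` helper and the sample values `acme` and `1234` are hypothetical and only illustrate the ref shape.

```python
import re

# Matches ${name}, including dotted names such as ${project.id}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand_ref(ref: str, values: dict) -> str:
    """Replace each ${name} in ref with values[name]; fail on unknown names."""
    def _substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder ${{{key}}}")
        return str(values[key])

    return _PLACEHOLDER.sub(_substitute, ref)


# Hypothetical values, chosen only for illustration.
values = {"orgSlug": "acme", "project.id": "1234"}
print(expand_ref("${orgSlug}/${project.id}-intake", values))
```

A regular expression is used rather than `string.Template` because `project.id` contains a dot, which `string.Template` does not accept in a placeholder name.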

Connection Types

connections:
  # Store connection - triggered when documents arrive or change
  - sourceType: STORE
    sourceRef: "${orgSlug}/${project.id}-documents"
    subscription: "!hasMixins('processed')"

  # Channel connection - triggered by upstream assistant output
  - sourceType: CHANNEL
    targetRef: "${orgSlug}/${project.id}/upstream-assistant"
    subscription: "status == 'completed'"

  # Document family connection
  - sourceType: DOCUMENT_FAMILY
    sourceRef: "${orgSlug}/${project.id}-store"

  # Workspace connection
  - sourceType: WORKSPACE
    sourceRef: "${orgSlug}/${project.id}/workspace-slug"

ConnectableType Values

Type	Trigger	Use Case
`STORE`	Document added/changed in store	Primary input processing
`CHANNEL`	Upstream assistant completes	Pipeline chaining
`DOCUMENT_FAMILY`	Document family events	Fine-grained document processing
`WORKSPACE`	Workspace events	User-initiated processing
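
When generating or reviewing configuration in code, it can help to reject unknown `sourceType` values early. The sketch below mirrors the four values in the table as a Python enum. The enum and the `parse_source_type` helper are hypothetical, and the platform may accept values not listed here.

```python
from enum import Enum


class ConnectableType(Enum):
    """The sourceType values listed in the table above."""
    STORE = "STORE"
    CHANNEL = "CHANNEL"
    DOCUMENT_FAMILY = "DOCUMENT_FAMILY"
    WORKSPACE = "WORKSPACE"


def parse_source_type(connection: dict) -> ConnectableType:
    """Return the connection's type; raise ValueError if it is not in the table."""
    return ConnectableType(connection["sourceType"])


print(parse_source_type({"sourceType": "STORE"}))
```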

Subscription Expressions

Filter which events trigger the assistant:

# Process only unprocessed documents
subscription: "!hasMixins('processed')"

# Process only PDFs
subscription: "contentType == 'application/pdf'"

# Process documents with specific status
subscription: "status == 'new'"

# Combine conditions
subscription: "!hasMixins('processed') AND contentType == 'application/pdf'"

# Process completed upstream results
subscription: "status == 'completed'"
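
To make the intent of these filters concrete, the sketch below restates three of them as plain Python predicates over a hypothetical event payload. The dictionary keys (`mixins`, `contentType`) and the reading of `hasMixins` as a membership test are assumptions for illustration. The platform evaluates the expression strings itself, and its semantics may differ.

```python
def has_mixins(event: dict, *names: str) -> bool:
    """Assumed semantics: true when every named mixin is present."""
    return all(name in event.get("mixins", []) for name in names)


def unprocessed(event: dict) -> bool:
    # Restates: !hasMixins('processed')
    return not has_mixins(event, "processed")


def is_pdf(event: dict) -> bool:
    # Restates: contentType == 'application/pdf'
    return event.get("contentType") == "application/pdf"


def unprocessed_pdf(event: dict) -> bool:
    # Restates the combined expression joined with AND.
    return unprocessed(event) and is_pdf(event)


# Hypothetical payload shape, for illustration only.
event = {"mixins": ["parsed"], "contentType": "application/pdf"}
print(unprocessed_pdf(event))
```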

Multi-Step Pipelines

Chain assistants together using channels:

assistants:
  # Step 1: OCR/Parsing
  - name: Document Parser
    slug: parser
    assistantDefinitionRef: kodexa/pdf-parser
    connections:
      - sourceType: STORE
        sourceRef: "${orgSlug}/${project.id}-intake"
        subscription: "!hasMixins('parsed')"

  # Step 2: Extraction (triggered by parser completion)
  - name: Data Extractor
    slug: extractor
    assistantDefinitionRef: kodexa/data-extractor
    connections:
      - sourceType: CHANNEL
        targetRef: "${orgSlug}/${project.id}/parser"
        subscription: "status == 'completed'"

  # Step 3: Validation (triggered by extractor completion)
  - name: Data Validator
    slug: validator
    assistantDefinitionRef: kodexa/validator
    connections:
      - sourceType: CHANNEL
        targetRef: "${orgSlug}/${project.id}/extractor"
        subscription: "status == 'completed'"
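
The chain above is implied by the channel refs rather than by the order of the list. As a quick way to check a chain before deploying it, the sketch below derives the execution order from the `connections` entries. It assumes each channel ref ends in the upstream assistant's slug, as in the example. The function names are hypothetical.

```python
def upstream_slug(connection: dict):
    """Return the upstream assistant slug for a CHANNEL connection, else None."""
    if connection.get("sourceType") != "CHANNEL":
        return None
    # Assumes the ref ends in "/<assistant-slug>", as in the config above.
    return connection["targetRef"].rsplit("/", 1)[-1]


def pipeline_order(assistants: list) -> list:
    """Order assistant slugs so each one follows the assistant it listens to."""
    upstream = {}
    for assistant in assistants:
        slugs = [upstream_slug(c) for c in assistant.get("connections", [])]
        upstream[assistant["slug"]] = next((s for s in slugs if s), None)

    ordered = [slug for slug, parent in upstream.items() if parent is None]
    remaining = {s: p for s, p in upstream.items() if p is not None}
    while remaining:
        ready = [s for s, p in remaining.items() if p in ordered]
        if not ready:
            raise ValueError(f"unresolved channel refs: {sorted(remaining)}")
        for slug in ready:
            ordered.append(slug)
            del remaining[slug]
    return ordered


# The three assistants from the config above, deliberately listed out of order.
assistants = [
    {"slug": "validator", "connections": [
        {"sourceType": "CHANNEL", "targetRef": "${orgSlug}/${project.id}/extractor"}]},
    {"slug": "parser", "connections": [
        {"sourceType": "STORE", "sourceRef": "${orgSlug}/${project.id}-intake"}]},
    {"slug": "extractor", "connections": [
        {"sourceType": "CHANNEL", "targetRef": "${orgSlug}/${project.id}/parser"}]},
]
print(pipeline_order(assistants))
```

A channel ref that names no assistant in the list raises `ValueError`, which catches a misspelled slug before any document is processed.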

Extension Pack Structure

For creating assistant packages:

name: "My Extension Pack"
slug: my-extension-pack
type: extensionPack
description: "Collection of processing assistants"
source:
  type: docker
  location: kodexa://extension-core:{version}
services:
  - slug: my-assistant
    name: "My Assistant"
    type: assistant
    assistant:
      package: my_package
      class: MyAssistant
    schedulable: true
    reactive: true
    options:
      - name: option_name
        label: "Option Label"
        type: string
        required: true

Python Assistant Implementation

from kodexa import Assistant, PipelineContext

class MyAssistant(Assistant):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.confidence = kwargs.get('confidence_threshold', 0.85)

    def process(self, document, context: PipelineContext):
        """Main processing method."""
        # Process the document
        context.log.info(f"Processing {document.uuid}")

        # Access options
        use_ocr = context.get_option('use_ocr', True)

        # Update status
        context.status_reporter.update("Extracting data", status_type="analyzing")

        return document

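    def handle_event(self, event, document, context):
        """Handle platform events (if reactive)."""
        # Read the event type so the handler can branch on it; this stub
        # does not act on it yet and returns the document unchanged.
        event_type = event.get('type')
        return document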

Common Mistakes

Mistake	Fix
No connections defined	At least one connection needed to trigger processing
Wrong `sourceRef` format	Must be `orgSlug/storeSlug` or use template vars `${orgSlug}/${project.id}-slug`
Channel ref without project context	Channel refs need project: `${orgSlug}/${project.id}/assistant-slug`
Missing stores list	Add all stores the assistant reads from or writes to
Subscription syntax errors	Use `hasMixins('label')` not `hasMixin('label')`
Scheduling without `schedulable: true`	Extension pack definition must set `schedulable: true`
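
Most of the mistakes in the table can be caught mechanically before deployment. The sketch below is a minimal lint pass over an assistant entry that has already been loaded into a Python dict. `lint_assistant` and its `definition_schedulable` argument are hypothetical, and the ref checks only count `/`-separated segments, following the formats shown in the table.

```python
def lint_assistant(assistant: dict, definition_schedulable: bool = False) -> list:
    """Return warnings for the common mistakes listed in the table above."""
    problems = []
    connections = assistant.get("connections", [])
    if not connections:
        problems.append("no connections defined")

    for conn in connections:
        ref = conn.get("sourceRef") or conn.get("targetRef") or ""
        segments = ref.split("/")
        kind = conn.get("sourceType")
        if kind == "CHANNEL" and len(segments) != 3:
            problems.append(f"channel ref '{ref}' should be org/project/assistant-slug")
        elif kind == "STORE" and len(segments) != 2:
            problems.append(f"store ref '{ref}' should be org/store-slug")
        if "hasMixin(" in conn.get("subscription", ""):
            problems.append("use hasMixins(...) rather than hasMixin(...)")

    if not assistant.get("stores"):
        problems.append("missing stores list")
    if assistant.get("schedules") and not definition_schedulable:
        problems.append("schedules set but definition is not schedulable: true")
    return problems


# A deliberately broken entry, for illustration.
broken = {
    "connections": [
        {"sourceType": "CHANNEL", "targetRef": "acme/parser",
         "subscription": "hasMixin('parsed')"},
    ],
    "schedules": [{"cronExpression": "0 0 8 * * *"}],
}
for problem in lint_assistant(broken):
    print(problem)
```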
