From dora-skills
Guides building ML/vision pipelines with dora-rs: camera capture, YOLO detection, SAM segmentation, VLM inference, depth estimation, and visualization via Rerun.
How this skill is triggered: by the user, by Claude, or both
Slash command
/dora-skills:domain-vision
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
> Building ML/Vision applications with dora-rs
Building ML/Vision applications with dora-rs
Dora supports vision and ML pipelines through prebuilt, pip-installable nodes that are wired together in a dataflow YAML file:
- id: camera
  build: pip install opencv-video-capture
  path: opencv-video-capture
  inputs:
    tick: dora/timer/millis/33 # ~30 FPS
  outputs:
    - image
  env:
    CAPTURE_PATH: "0" # Webcam index or video path
    IMAGE_WIDTH: "640"
    IMAGE_HEIGHT: "480"
- id: yolo
  build: pip install dora-yolo
  path: dora-yolo
  inputs:
    image: camera/image
  outputs:
    - bbox
  env:
    MODEL: yolov8n.pt # or yolov8s, yolov8m, yolov8l
    DEVICE: cuda # or cpu
- id: plot
  build: pip install dora-rerun
  path: dora-rerun
  inputs:
    image: camera/image
    boxes2d: yolo/bbox # 2D bounding boxes
    # Optional:
    # points2d: detector/points
    # points3d: depth/points
nodes:
  # Camera input
  - id: camera
    build: pip install opencv-video-capture
    path: opencv-video-capture
    inputs:
      tick: dora/timer/millis/33
    outputs:
      - image
    env:
      CAPTURE_PATH: "0"
      IMAGE_WIDTH: "640"
      IMAGE_HEIGHT: "480"
  # Object detection
  - id: detector
    build: pip install dora-yolo
    path: dora-yolo
    inputs:
      image: camera/image
    outputs:
      - bbox
    env:
      MODEL: yolov8n.pt
  # Optional: Segmentation
  - id: segmenter
    build: pip install dora-sam2
    path: dora-sam2
    inputs:
      image: camera/image
      bbox: detector/bbox
    outputs:
      - mask
  # Visualization
  - id: plot
    build: pip install dora-rerun
    path: dora-rerun
    inputs:
      image: camera/image
      boxes2d: detector/bbox
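Every input in the pipeline above names its producer as `node_id/output_id`, so a typo in either half leaves an input with nothing feeding it. The following is a minimal sketch of a wiring check, assuming the dataflow has already been loaded into a plain dict (for example by a YAML loader). The helper name `find_dangling_inputs` is hypothetical and not part of dora.

```python
def find_dangling_inputs(dataflow):
    """Return (node_id, input_name, source) for inputs with no matching producer.

    Hypothetical helper for illustration; not part of the dora API.
    """
    # Map each node id to the set of outputs it declares.
    outputs = {n["id"]: set(n.get("outputs", [])) for n in dataflow["nodes"]}
    dangling = []
    for node in dataflow["nodes"]:
        for name, spec in node.get("inputs", {}).items():
            # An input is either "node/output" or a mapping with a "source" key.
            source = spec["source"] if isinstance(spec, dict) else spec
            if source.startswith("dora/"):
                continue  # built-in sources such as dora/timer/millis/33
            producer, _, output = source.partition("/")
            if output not in outputs.get(producer, set()):
                dangling.append((node["id"], name, source))
    return dangling


# The pipeline above, written as the dict a YAML loader would produce.
pipeline = {
    "nodes": [
        {"id": "camera", "inputs": {"tick": "dora/timer/millis/33"}, "outputs": ["image"]},
        {"id": "detector", "inputs": {"image": "camera/image"}, "outputs": ["bbox"]},
        {
            "id": "segmenter",
            "inputs": {"image": "camera/image", "bbox": "detector/bbox"},
            "outputs": ["mask"],
        },
        {"id": "plot", "inputs": {"image": "camera/image", "boxes2d": "detector/bbox"}},
    ]
}

print(find_dangling_inputs(pipeline))  # []
```

An empty list means every input has a producer. Renaming `detector` without updating the `segmenter` and `plot` inputs would make both of them show up in the result.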
- id: depth
  build: pip install dora-vggt
  path: dora-vggt
  inputs:
    image: camera/image
  outputs:
    - depth
    - points3d
- id: tracker
  build: pip install dora-cotracker
  path: dora-cotracker
  inputs:
    image: camera/image
    points: source/points
  outputs:
    - tracked_points
- id: vlm
  build: pip install dora-qwen2-5-vl
  path: dora-qwen2-5-vl
  inputs:
    image: camera/image
    prompt: user/question
  outputs:
    - response
  env:
    MODEL: Qwen/Qwen2.5-VL-7B
- id: pose
  build: pip install dora-mediapipe
  path: dora-mediapipe
  inputs:
    image: camera/image
  outputs:
    - landmarks
    - pose
# custom_detector.py
import numpy as np
import pyarrow as pa
from dora import Node
from ultralytics import YOLO

node = Node()
model = YOLO("yolov8n.pt")

for event in node:
    if event["type"] == "INPUT" and event["id"] == "image":
        # event["value"] is a flat Arrow array; rebuild the (H, W, C) frame.
        # Assumes the sender sets "width" and "height" in the metadata and uses
        # a raw 3-channel uint8 encoding, as opencv-video-capture does.
        metadata = event["metadata"]
        image = (
            event["value"]
            .to_numpy()
            .astype(np.uint8)
            .reshape((metadata["height"], metadata["width"], 3))
        )

        # Run detection
        results = model(image, verbose=False)

        # Convert to structured output
        boxes = []
        for r in results:
            for box in r.boxes:
                boxes.append({
                    "xyxy": box.xyxy[0].tolist(),
                    "confidence": float(box.conf[0]),
                    "class_id": int(box.cls[0]),
                    "class_name": model.names[int(box.cls[0])],
                })

        # Send as Arrow array
        node.send_output("bbox", pa.array(boxes))
    elif event["type"] == "STOP":
        break
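The node above forwards every detection the model returns. Downstream nodes often only need confident detections of a few classes, so a filter step before `send_output` can cut traffic. This is a minimal sketch under that assumption. `filter_boxes`, its default threshold of 0.5, and the sample detections are hypothetical, and the dicts follow the structure built in the loop above.

```python
def filter_boxes(boxes, min_confidence=0.5, allowed_classes=None):
    """Keep boxes at or above min_confidence, optionally restricted by class name.

    Hypothetical helper for illustration; the 0.5 default is an arbitrary choice.
    """
    kept = []
    for box in boxes:
        if box["confidence"] < min_confidence:
            continue
        if allowed_classes is not None and box["class_name"] not in allowed_classes:
            continue
        kept.append(box)
    return kept


# Made-up detections in the same dict layout the detector node emits.
boxes = [
    {"xyxy": [10.0, 20.0, 110.0, 220.0], "confidence": 0.91, "class_id": 0, "class_name": "person"},
    {"xyxy": [5.0, 5.0, 40.0, 30.0], "confidence": 0.32, "class_id": 16, "class_name": "dog"},
    {"xyxy": [200.0, 80.0, 260.0, 140.0], "confidence": 0.77, "class_id": 16, "class_name": "dog"},
]

print([b["class_name"] for b in filter_boxes(boxes)])  # ['person', 'dog']
print(len(filter_boxes(boxes, allowed_classes={"person"})))  # 1
```

In the node, the call would sit between building `boxes` and sending them, for example `pa.array(filter_boxes(boxes))`.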
Dora uses Apache Arrow for efficient image transfer:
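# Image as numpy array
image = np.zeros((480, 640, 3), dtype=np.uint8)  # HWC format, RGB
# Receiving image: event["value"] is a flat Arrow array, so reshape it with
# the sender's metadata (assumes "width" and "height" are set, 3 channels)
metadata = event["metadata"]
image = event["value"].to_numpy().reshape((metadata["height"], metadata["width"], 3))
height, width, channels = image.shape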
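HWC layout means the channel values of one pixel sit next to each other in memory, pixels follow each other along a row, and rows follow each other top to bottom. The sketch below shows that arithmetic on a flat byte buffer using only the standard library. `pixel_offset` is a hypothetical helper, and the 640x480x3 frame size is taken from the example above.

```python
def pixel_offset(x, y, width, channels=3):
    """Offset of the first channel of pixel (x, y) in a flat row-major HWC buffer."""
    return (y * width + x) * channels


width, height, channels = 640, 480, 3
frame = bytearray(width * height * channels)  # all-zero (black) frame

# Paint pixel (x=2, y=1) with channel values (255, 128, 0).
start = pixel_offset(2, 1, width, channels)
frame[start:start + channels] = bytes([255, 128, 0])

print(len(frame))  # 921600
print(start)  # 1926
```

The buffer length, 640 * 480 * 3 = 921600 bytes per frame, is also what each image message costs to move between nodes.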
Standard bbox format used by dora vision nodes:
bbox = {
    "xyxy": [x1, y1, x2, y2],  # Top-left and bottom-right corners
    "confidence": 0.95,        # Detection confidence
    "class_id": 0,             # Class index
    "class_name": "person",    # Class label (optional)
}
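Because `xyxy` stores two corners rather than a corner plus a size, width, height, and overlap all come from simple subtraction. The following sketch shows two common operations on that format. `xyxy_to_xywh` and `iou` are hypothetical helper names, and the coordinates are made up for illustration.

```python
def xyxy_to_xywh(xyxy):
    """Convert [x1, y1, x2, y2] corners to [x, y, width, height]."""
    x1, y1, x2, y2 = xyxy
    return [x1, y1, x2 - x1, y2 - y1]


def iou(a, b):
    """Intersection over union of two xyxy boxes, in the range 0.0 to 1.0."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Overlap along each axis, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


bbox = {
    "xyxy": [100.0, 50.0, 300.0, 250.0],
    "confidence": 0.95,
    "class_id": 0,
    "class_name": "person",
}

print(xyxy_to_xywh(bbox["xyxy"]))  # [100.0, 50.0, 200.0, 200.0]
print(round(iou(bbox["xyxy"], [200.0, 150.0, 400.0, 350.0]), 3))  # 0.143
```

The second box overlaps the first in a 100x100 region, giving 10000 / 70000, or about 0.143.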
Use appropriate timer frequency
dora/timer/millis/33 (~30 FPS)
dora/timer/millis/66 (~15 FPS)
Use queue_size: 1 for real-time
inputs:
  image:
    source: camera/image
    queue_size: 1 # Drop old frames
Use CUDA when available
env:
  DEVICE: cuda
Resize images for faster processing
env:
  IMAGE_WIDTH: "640"
  IMAGE_HEIGHT: "480"
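The first two tips come down to simple arithmetic and a bounded queue. The sketch below derives the timer string from a target frame rate and uses a `deque` with `maxlen=1` as a rough model of what `queue_size: 1` does when frames arrive faster than a node can process them. `timer_source` is a hypothetical helper, and the deque only imitates the drop-oldest behaviour; it is not how dora implements its queues.

```python
from collections import deque


def timer_source(fps):
    """Build a dora timer input for a target frame rate.

    Hypothetical helper; the period is truncated to whole milliseconds,
    which matches the 33 and 66 values used in this document.
    """
    return f"dora/timer/millis/{int(1000 / fps)}"


print(timer_source(30))  # dora/timer/millis/33
print(timer_source(15))  # dora/timer/millis/66

# Rough model of queue_size: 1. Five frames arrive while the node is busy,
# and only the newest one is still waiting when the node is ready again.
queue = deque(maxlen=1)
for frame_id in range(5):
    queue.append(frame_id)

print(list(queue))  # [4]
```

With a larger queue the node would work through stale frames first, which adds latency without improving the result shown to the user.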
| Node | Install command | Purpose |
|---|---|---|
| opencv-video-capture | pip install opencv-video-capture | Camera/video input |
| dora-yolo | pip install dora-yolo | YOLO detection |
| dora-sam2 | pip install dora-sam2 | SAM2 segmentation |
| dora-rerun | pip install dora-rerun | Visualization |
| dora-vggt | pip install dora-vggt | Depth estimation |
| dora-cotracker | pip install dora-cotracker | Point tracking |
| dora-mediapipe | pip install dora-mediapipe | Pose estimation |
| dora-qwen2-5-vl | pip install dora-qwen2-5-vl | Vision language |
npx claudepluginhub zhanghandong/dora-skills --plugin dora-skills