From media-ingest
Ingest, catalog, and semantically tag media assets (photos and video) from camera imports. Use this skill whenever the user wants to process, tag, describe, catalog, archive, or organize media files — even if they just say things like "process my footage", "tag these photos", "catalog this SD card dump", "what's in these clips", or "help me organize my camera roll". Also trigger when the user mentions keywords like: ingest, media, footage, clips, photos, archive, keyframes, visual analysis, or media catalog. This skill handles the full pipeline from raw camera files to a searchable, keyword-tagged media library.
Slash command: /media-ingest:media-ingest
This skill processes raw media files (photos and video) and produces a rich, searchable catalog with semantic keyword tags, natural language descriptions, and technical metadata.
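Given a folder of media files, this skill will scan it for supported files, extract technical metadata, pull keyframes from video, generate descriptions and tags, write a sidecar JSON next to each file, and update a SQLite catalog.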
The skill needs these tools available on the system:
Pillow (image handling), exiftool or Pillow for EXIF
If ffmpeg is not installed, the skill will attempt to install it and inform the user.
"Ingest this clip and tag it" (with file uploaded or path provided)
"Process all the footage in /path/to/folder"
"Ingest everything on my SD card at /Volumes/EOS_DIGITAL/DCIM"
"Find all clips tagged with 'sunset'"
"Show me everything from the beach trip"
"What footage do I have of the city skyline?"
Scan the target folder (recursively) for supported file types. Build a manifest of files to process. Skip files that already have a sidecar JSON (unless the user asks to re-process).
python3 ${CLAUDE_PLUGIN_ROOT}/scripts/scan_media.py /path/to/media/folder
This produces a manifest.json listing all discovered files with basic filesystem metadata.
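As a rough illustration of the scan step, the sketch below walks a folder recursively, keeps files with a supported extension, and skips anything that already has a .meta.json sidecar. It is not the actual scan_media.py: the extension sets, the build_manifest name, and the manifest field names are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical extension sets; the real scan_media.py may support a different list.
PHOTO_EXTS = {".jpg", ".jpeg", ".png", ".heic", ".cr2", ".arw", ".dng"}
VIDEO_EXTS = {".mp4", ".mov", ".mxf", ".avi", ".mts"}


def build_manifest(root, reprocess=False):
    """Recursively list supported media under root, skipping already-processed files."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Ignore hidden files and folders such as .keyframes/ (an assumption).
        if any(part.startswith(".") for part in path.relative_to(root).parts):
            continue
        ext = path.suffix.lower()
        if ext in PHOTO_EXTS:
            kind = "photo"
        elif ext in VIDEO_EXTS:
            kind = "video"
        else:
            continue
        # Sidecars sit next to the original: DJI_0042.MP4 -> DJI_0042.MP4.meta.json
        sidecar = path.with_name(path.name + ".meta.json")
        if sidecar.exists() and not reprocess:
            continue
        stat = path.stat()
        entries.append({
            "filepath": str(path),
            "filename": path.name,
            "type": kind,
            "size_bytes": stat.st_size,
            "modified": stat.st_mtime,
        })
    return {"root": str(root), "files": entries}
```

Passing reprocess=True mirrors the "unless the user asks to re-process" case by ignoring existing sidecars.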
For each file, extract technical metadata:
ffprobe to get duration, resolution, codec, frame rate, bitrate
For video files, extract representative keyframes using ffmpeg:
Store keyframes in a temporary .keyframes/ directory next to the video file.
python3 ${CLAUDE_PLUGIN_ROOT}/scripts/extract_keyframes.py /path/to/video.mp4 --output-dir /tmp/keyframes/
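The metadata and keyframe steps above could be wired up roughly as follows. This is a sketch, not the bundled extract_keyframes.py: the function names are hypothetical, the evenly spaced sampling strategy is an assumption (the real script may pick frames differently), and ffprobe and ffmpeg must be on PATH for the subprocess call to work.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe call that prints container and stream info as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        str(path),
    ]


def parse_rate(rate):
    """Turn an ffprobe rational such as '30000/1001' into a float (None if unusable)."""
    num, _, den = rate.partition("/")
    try:
        return float(num) / float(den or 1)
    except (ValueError, ZeroDivisionError):
        return None


def summarize_probe(probe):
    """Pick the fields the catalog stores out of ffprobe's JSON output."""
    streams = probe.get("streams", [])
    video = next((s for s in streams if s.get("codec_type") == "video"), {})
    fmt = probe.get("format", {})
    return {
        "duration_seconds": float(fmt["duration"]) if "duration" in fmt else None,
        "resolution_width": video.get("width"),
        "resolution_height": video.get("height"),
        "codec": video.get("codec_name"),
        "frame_rate": parse_rate(video.get("avg_frame_rate", "")),
        "bitrate": int(fmt["bit_rate"]) if "bit_rate" in fmt else None,
    }


def keyframe_commands(path, duration, out_dir, count=5):
    """Build ffmpeg calls that grab `count` evenly spaced frames (spacing is an assumption)."""
    step = duration / (count + 1)
    return [
        ["ffmpeg", "-y", "-ss", f"{step * i:.2f}", "-i", str(path),
         "-frames:v", "1", f"{out_dir}/frame_{i:02d}.jpg"]
        for i in range(1, count + 1)
    ]


def probe(path):
    """Run ffprobe and return the summarized metadata."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarize_probe(json.loads(result.stdout))
```

Keeping the parsing in summarize_probe separate from the subprocess call makes it possible to check the field mapping against saved ffprobe output without touching real footage.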
This is the core intelligence step. For each media file:
{
  "file": "DJI_0042.MP4",
  "type": "video",
  "description": "Aerial drone shot flying over a coastal town at golden hour. The camera slowly pans from the harbor with small fishing boats to the hillside covered in white and pastel-colored houses. Ocean waves break against a stone seawall. A church steeple is visible on the hilltop. Light cloud cover with warm sunset tones.",
  "tags": [
    "aerial", "drone", "coastal", "town", "golden-hour", "sunset",
    "harbor", "boats", "fishing-boats", "hillside", "houses",
    "ocean", "waves", "seawall", "church", "steeple", "clouds",
    "mediterranean", "cinematic", "slow-pan", "establishing-shot"
  ],
  "scene_type": "exterior/aerial",
  "mood": ["warm", "serene", "cinematic"],
  "time_of_day": "golden hour / sunset",
  "weather": "partly cloudy",
  "motion": "slow pan left-to-right",
  "shot_type": "wide establishing shot",
  "notable_elements": [
    "fishing boats in harbor",
    "pastel hillside houses",
    "church steeple on hilltop",
    "breaking waves on seawall"
  ]
}
Tags should be:
Aim for 10-30 tags per asset. More tags = better searchability. Don't be shy.
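One way to keep tags consistent before they reach the catalog is a small normalization pass. The sketch below assumes the lowercase, hyphenated style seen in the example output (golden-hour, slow-pan) and the 10-30 range mentioned above; normalize_tags is a hypothetical helper, not part of the skill's scripts.

```python
import re


def normalize_tags(raw_tags, min_tags=10, max_tags=30):
    """Lowercase, hyphenate, and de-duplicate tags, keeping first-seen order.

    Returns the cleaned list plus a flag saying whether the minimum was reached.
    Note: non-ASCII characters are treated as separators in this sketch.
    """
    seen = set()
    tags = []
    for raw in raw_tags:
        # Collapse any run of non-alphanumeric characters into a single hyphen.
        tag = re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    tags = tags[:max_tags]
    return tags, len(tags) >= min_tags
```

For instance, "Golden Hour" and "golden-hour" collapse into a single golden-hour tag, so the per-asset count reflects distinct concepts.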
Descriptions should read like a script supervisor's notes or a stock footage description:
For each processed file, write a .meta.json sidecar file alongside the original:
DJI_0042.MP4
DJI_0042.MP4.meta.json
The sidecar contains the full analysis output plus technical metadata.
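Writing the sidecar might look like the sketch below. The naming rule (original filename plus .meta.json) comes from the example above; nesting the technical metadata under a "technical" key and the write-then-rename step are assumptions for illustration, since the exact sidecar layout is not spelled out here.

```python
import json
import os
import tempfile
from pathlib import Path


def sidecar_path(media_path):
    """DJI_0042.MP4 -> DJI_0042.MP4.meta.json, in the same directory."""
    return Path(str(media_path) + ".meta.json")


def write_sidecar(media_path, analysis, technical):
    """Write analysis plus technical metadata next to the media file."""
    target = sidecar_path(media_path)
    # Hypothetical layout: analysis fields at the top level, metadata nested.
    payload = {**analysis, "technical": technical}
    # Write to a temp file first so an interrupted run never leaves a
    # half-written sidecar that would make the scan skip this file.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2, ensure_ascii=False)
    os.replace(tmp_name, target)
    return target
```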
Append/update the central SQLite catalog at the root of the media folder:
/path/to/media/folder/media_catalog.db
Schema:
CREATE TABLE IF NOT EXISTS assets (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filepath TEXT UNIQUE NOT NULL,
    filename TEXT NOT NULL,
    file_type TEXT NOT NULL,       -- 'photo' or 'video'
    file_size_bytes INTEGER,
    description TEXT,
    scene_type TEXT,
    mood TEXT,                     -- JSON array
    time_of_day TEXT,
    weather TEXT,
    motion TEXT,
    shot_type TEXT,
    notable_elements TEXT,         -- JSON array
    -- Technical metadata
    duration_seconds REAL,         -- video only
    resolution_width INTEGER,
    resolution_height INTEGER,
    codec TEXT,
    frame_rate REAL,
    camera_model TEXT,
    lens TEXT,
    iso INTEGER,
    aperture TEXT,
    shutter_speed TEXT,
    gps_lat REAL,
    gps_lon REAL,
    date_taken TEXT,
    -- Processing metadata
    processed_at TEXT,
    keyframe_count INTEGER
);

CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    asset_id INTEGER NOT NULL,
    tag TEXT NOT NULL,
    FOREIGN KEY (asset_id) REFERENCES assets(id),
    UNIQUE(asset_id, tag)
);

CREATE INDEX IF NOT EXISTS idx_tags_tag ON tags(tag);
CREATE INDEX IF NOT EXISTS idx_assets_filepath ON assets(filepath);
Use the catalog_db.py script to upsert assets:
python3 ${CLAUDE_PLUGIN_ROOT}/scripts/catalog_db.py upsert /path/to/media/folder/media_catalog.db /path/to/asset.meta.json
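For readers curious what an upsert against this schema can look like, here is a minimal sketch using Python's sqlite3 module. It is not the actual catalog_db.py: it uses a trimmed-down subset of the columns, the upsert_asset name is hypothetical, and the ON CONFLICT clause assumes SQLite 3.24 or newer.

```python
import json
import sqlite3

# Reduced subset of the catalog schema, enough to show the upsert pattern.
SCHEMA = """
CREATE TABLE IF NOT EXISTS assets (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filepath TEXT UNIQUE NOT NULL,
    filename TEXT NOT NULL,
    file_type TEXT NOT NULL,
    description TEXT,
    mood TEXT
);
CREATE TABLE IF NOT EXISTS tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    asset_id INTEGER NOT NULL,
    tag TEXT NOT NULL,
    FOREIGN KEY (asset_id) REFERENCES assets(id),
    UNIQUE(asset_id, tag)
);
"""


def upsert_asset(conn, filepath, meta):
    """Insert or update one asset row, then replace its tags."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """
            INSERT INTO assets (filepath, filename, file_type, description, mood)
            VALUES (?, ?, ?, ?, ?)
            ON CONFLICT(filepath) DO UPDATE SET
                filename = excluded.filename,
                file_type = excluded.file_type,
                description = excluded.description,
                mood = excluded.mood
            """,
            (filepath, meta["file"], meta["type"], meta.get("description"),
             json.dumps(meta.get("mood", []))),  # arrays are stored as JSON text
        )
        asset_id = conn.execute(
            "SELECT id FROM assets WHERE filepath = ?", (filepath,)
        ).fetchone()[0]
        # Drop stale tags so a re-processed file does not keep old ones.
        conn.execute("DELETE FROM tags WHERE asset_id = ?", (asset_id,))
        conn.executemany(
            "INSERT OR IGNORE INTO tags (asset_id, tag) VALUES (?, ?)",
            [(asset_id, tag) for tag in meta.get("tags", [])],
        )
    return asset_id
```

Keying the upsert on filepath matches the UNIQUE constraint in the schema, so re-processing a file updates its row instead of creating a duplicate.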
When the user wants to search their media, use the catalog_db.py search command:
python3 ${CLAUDE_PLUGIN_ROOT}/scripts/catalog_db.py search /path/to/media/folder/media_catalog.db "search terms"
Or query the SQLite catalog directly:
-- Find assets by tag
SELECT a.* FROM assets a
JOIN tags t ON a.id = t.asset_id
WHERE t.tag LIKE '%sunset%';
-- Find assets by description
SELECT * FROM assets WHERE description LIKE '%beach%';
-- Combine tag and description search
SELECT DISTINCT a.* FROM assets a
LEFT JOIN tags t ON a.id = t.asset_id
WHERE t.tag LIKE '%ocean%' OR a.description LIKE '%ocean%';
Present results in a readable format with the description, top tags, and file path.
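When running the combined tag and description query from a script, it is safer to bind the search term as a parameter than to paste it into the SQL string. The sketch below shows one way to do that; search_assets is a hypothetical helper (not the catalog_db.py search command), and it only relies on the assets and tags columns used in the queries above.

```python
import sqlite3


def search_assets(conn, term, limit=20):
    """Return (filepath, description) rows whose tags or description contain `term`."""
    # Escape LIKE wildcards so input such as "50%" is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    return conn.execute(
        """
        SELECT DISTINCT a.filepath, a.description
        FROM assets a
        LEFT JOIN tags t ON a.id = t.asset_id
        WHERE t.tag LIKE ? ESCAPE '\\' OR a.description LIKE ? ESCAPE '\\'
        ORDER BY a.filepath
        LIMIT ?
        """,
        (pattern, pattern, limit),
    ).fetchall()
```

A call such as search_assets(sqlite3.connect("media_catalog.db"), "ocean") would return matching paths and descriptions, which can then be formatted for the user.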
After processing, present a summary to the user:
✅ Media Ingest Complete
━━━━━━━━━━━━━━━━━━━━━
📁 Folder: /Volumes/EOS_DIGITAL/DCIM
📊 Processed: 47 files (32 video, 15 photos)
⏭️ Skipped: 3 files (already cataloged)
🏷️ Total tags generated: 892
💾 Catalog: /Volumes/EOS_DIGITAL/DCIM/media_catalog.db
Top tags across this batch:
#outdoor (28) #daylight (24) #nature (19) #wide-shot (15) #handheld (12)
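The "top tags" line in a summary like this can be derived by counting tags across the sidecars from the batch. A minimal sketch, assuming each sidecar has been loaded into a dict with a "tags" list (top_tags is a hypothetical helper):

```python
from collections import Counter


def top_tags(sidecars, n=5):
    """Format the n most common tags as '#tag (count)', counting each tag once per asset."""
    counts = Counter(
        tag
        for meta in sidecars
        # dict.fromkeys de-duplicates within one asset while keeping order.
        for tag in dict.fromkeys(meta.get("tags", []))
    )
    return " ".join(f"#{tag} ({count})" for tag, count in counts.most_common(n))
```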
For detailed schema information and helper scripts, see:
${CLAUDE_PLUGIN_ROOT}/scripts/scan_media.py: Media file discovery and manifest building
${CLAUDE_PLUGIN_ROOT}/scripts/extract_keyframes.py: FFmpeg-based keyframe extraction
${CLAUDE_PLUGIN_ROOT}/scripts/catalog_db.py: SQLite catalog management
${CLAUDE_PLUGIN_ROOT}/skills/media-ingest/references/tag_taxonomy.md: Detailed tagging taxonomy and guidelines
Install: npx claudepluginhub michaeldboyd/media-ingest --plugin media-ingest