From aria-ark
Use when working with projectaria_tools (PAT) — the official Python/C++ library for reading Aria VRS recordings, accessing sensor data, device calibration, loading MPS results (eye gaze, hand tracking, SLAM trajectory), time domain mapping for multi-device alignment, and data visualization. Use whenever the user imports projectaria_tools, works with Aria VRS files, or asks about Aria sensor data processing.
Slash command: /aria-ark:projectaria-tools
PAT is the foundational open-source library for accessing and processing data recorded by Project Aria glasses. It provides Python and C++ interfaces for loading VRS files, accessing calibration, processing sensor data, loading MPS results, and visualizing Aria data.
This skill is a navigation map — it tells you where the key APIs and source files are, what the gotchas are, and where to find the truth. Never guess API names from this skill alone. Read the source file it points to before writing any API call.
… --help instead of guessing from this skill.

GitHub: https://github.com/facebookresearch/projectaria_tools
Docs: https://facebookresearch.github.io/projectaria_tools/gen2/research-tools/projectariatools/overview
Install (pip in a virtual environment):
python3 -m venv $HOME/projectaria_gen2_python_env
source $HOME/projectaria_gen2_python_env/bin/activate
python3 -m pip install 'projectaria-tools[all]'
Supported: Linux x64 (Ubuntu/Fedora), macOS Apple Silicon (M1+). Python 3.9–3.12.
Build from source: See CMakeLists.txt at repo root and the Advanced Installation docs.
After cloning the repo, these are the source-of-truth files for API details:
| What | Path |
|---|---|
| C++ pybind: VrsDataProvider API | core/python/VrsDataProviderPyBind.h |
| C++ pybind: SensorData accessors | core/python/SensorDataPyBind.h |
| C++ pybind: Calibration | core/python/DeviceCalibrationPyBind.h |
| C++ pybind: MPS data | core/python/MpsPyBind.h |
| C++ pybind: SE3/Sophus | core/python/sophus/ |
| MPS types (eye gaze, hand, trajectory) | core/mps/ |
| TimeDomain enum | core/data_provider/TimeTypes.h |
| Python package (pip-installed API) | projectaria_tools/core/ |
| Visualization tools | projectaria_tools/tools/ |
| Gen2 tutorials | examples/Gen2/python_notebooks/ |
The Python package (projectaria_tools/core/) exposes the compiled C++ modules. The C++ pybind headers (core/python/) are the definitive API reference — read them when you need exact method signatures, return types, or parameter semantics.
from projectaria_tools.core import data_provider
from projectaria_tools.core.sensor_data import SensorDataType
provider = data_provider.create_vrs_data_provider("recording.vrs")
Key constraint: create_vrs_data_provider requires a local file path. Does NOT accept URLs or remote URIs.
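Because of the local-path constraint above, it can help to reject remote URIs before the provider is created. The following is a minimal sketch using only the standard library; `require_local_vrs_path` is a hypothetical helper and not part of projectaria_tools, and treating single-letter schemes as Windows drive letters is an assumption of this sketch.

```python
from pathlib import Path
from urllib.parse import urlparse


def require_local_vrs_path(path_str: str) -> Path:
    """Return a resolved local path, or raise if the input looks remote or is missing.

    Hypothetical helper for illustration; not part of projectaria_tools.
    """
    scheme = urlparse(path_str).scheme
    # A multi-character scheme such as "https" or "s3" signals a URI.
    # Single-letter schemes are treated as Windows drive letters ("C:\...").
    if len(scheme) > 1:
        raise ValueError(f"expected a local file path, got a {scheme}: URI")
    path = Path(path_str).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"no such VRS file: {path}")
    return path.resolve()
```

The returned path can then be converted with `str(...)` and passed to `data_provider.create_vrs_data_provider`. Remote recordings would need to be downloaded first.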
VrsDataProvider offers methods for stream discovery, indexed and time-based data access, calibration, and image configuration. Read core/python/VrsDataProviderPyBind.h for the complete typed API.
Access sensor data by index or by timestamp. Time-based queries take a TimeDomain and TimeQueryOptions.
Each SensorData object has typed accessors for: image, IMU, magnetometer, barometer, GPS, audio, eye tracking, WiFi, Bluetooth. Read core/python/SensorDataPyBind.h for all accessors and return types.
Gotchas:
- … MotionData type with IMU; check validity flags to distinguish.
- … TimeDomain argument.

Each VRS stream is identified by a StreamId = RecordableTypeId-InstanceId (e.g. 214-1). PAT maps these to human-readable labels. Docs: https://facebookresearch.github.io/projectaria_tools/gen2/technical-specs/vrs/streamid-label-mapper
| StreamId | Label | Sensor |
|---|---|---|
| 1201-1 / 1201-2 | camera-slam-left / camera-slam-right | SLAM cameras (grayscale) |
| 1201-3 / 1201-4 | SLAM cameras 3/4 | Additional SLAM cameras |
| 214-1 | camera-rgb | RGB camera (12MP) |
| 211-1 / 211-2 | camera-et-left / camera-et-right | Eye tracking cameras |
| 1202-1 / 1202-2 | imu-left / imu-right | IMUs |
| 1203-1 | mag0 | Magnetometer |
| 247-1 | baro0 | Barometer |
| 281-2 | gps | GPS sensor |
| 281-1 | gps-app | GPS from companion app |
| 231-1 | mic | 7-channel microphone |
| 246-1 | temperature | Temperature sensor |
| 500-1 | als | Ambient light sensor |
| 248-1 | ppg | Photoplethysmography |
| 373-1 | eyegaze | On-device eye gaze |
| 282-1 | wps | Wi-Fi |
| 283-1 | bluetooth | Bluetooth |
Dynamic StreamIds: handtracking, vio, and vio_high_frequency streams have runtime-determined IDs that depend on whether hand tracking is present in the recording.
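To make the `RecordableTypeId-InstanceId` format concrete, here is a small pure-Python sketch that splits a StreamId string and looks up a label. It is illustrative only: `ParsedStreamId`, `parse_stream_id`, `label_for`, and `STATIC_LABELS` are hypothetical names, the dictionary copies just a few rows of the table above, and PAT's own label mapper remains the authority (especially for the dynamic streams).

```python
from typing import NamedTuple


class ParsedStreamId(NamedTuple):
    recordable_type_id: int
    instance_id: int


# A few rows copied from the table above, for illustration only.
STATIC_LABELS = {
    "1201-1": "camera-slam-left",
    "1201-2": "camera-slam-right",
    "214-1": "camera-rgb",
    "1202-1": "imu-left",
    "1202-2": "imu-right",
}


def parse_stream_id(text: str) -> ParsedStreamId:
    """Split 'RecordableTypeId-InstanceId' into two integers."""
    type_part, sep, instance_part = text.partition("-")
    if not sep or not type_part.isdigit() or not instance_part.isdigit():
        raise ValueError(f"not a StreamId string: {text!r}")
    return ParsedStreamId(int(type_part), int(instance_part))


def label_for(text: str) -> str:
    """Look up a label; IDs missing from the static table are flagged, not guessed."""
    parse_stream_id(text)  # validate the format first
    return STATIC_LABELS.get(text, "<not in static table; query the provider>")
```

For example, `parse_stream_id("214-1")` yields type 214 and instance 1, and `label_for("214-1")` returns `camera-rgb`. Streams with runtime-determined IDs fall through to the placeholder string, which is a reminder to ask the data provider instead of hard-coding them.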
Each data record contains sensor readout values, timestamps, and acquisition parameters (e.g. exposure/gain for cameras). Image records store one frame + metadata per record; audio records group 4096 samples per chunk. VRS data format docs: https://facebookresearch.github.io/projectaria_tools/gen2/technical-specs/vrs/data-format
from projectaria_tools.core.sensor_data import TimeDomain
| Value | Description |
|---|---|
| RECORD_TIME | VRS file order |
| DEVICE_TIME | Hardware clock; use for single-device workflows |
| HOST_TIME | Wall clock on companion host |
| TIME_CODE | External sync signal (Gen1) |
| SUBGHZ | Broadcaster's device time; use for multi-device alignment (Gen2) |
TimeQueryOptions: BEFORE (latest sample ≤ query time), AFTER (earliest sample ≥ query time), CLOSEST (nearest by absolute difference).
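The three query options can be illustrated with a pure-Python sketch over a sorted list of timestamps. This is not PAT's implementation: `query_index` is a hypothetical function, the options are plain strings rather than the real enum, and both the `None` result for out-of-range queries and the tie-breaking toward the earlier sample are assumptions of the sketch.

```python
from bisect import bisect_left, bisect_right
from typing import Optional, Sequence


def query_index(timestamps_ns: Sequence[int], t_ns: int, option: str) -> Optional[int]:
    """Return the index selected by 'BEFORE', 'AFTER' or 'CLOSEST', or None.

    timestamps_ns must be sorted in ascending order.
    """
    if not timestamps_ns:
        return None
    if option == "BEFORE":
        i = bisect_right(timestamps_ns, t_ns) - 1  # latest sample <= t_ns
        return i if i >= 0 else None
    if option == "AFTER":
        i = bisect_left(timestamps_ns, t_ns)  # earliest sample >= t_ns
        return i if i < len(timestamps_ns) else None
    if option == "CLOSEST":
        i = bisect_left(timestamps_ns, t_ns)
        # Compare the neighbours on either side of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps_ns)]
        return min(candidates, key=lambda j: abs(timestamps_ns[j] - t_ns))
    raise ValueError(f"unknown option: {option}")
```

With timestamps `[100, 200, 300]` and a query time of 250, BEFORE selects index 1 and AFTER selects index 2. A query time of 240 with CLOSEST selects index 1 because 200 is nearer than 300.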
Device and per-sensor calibration — focal length, principal point, distortion models, image undistortion. Read core/python/DeviceCalibrationPyBind.h for the full API and core/calibration/ for camera model details.
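As a reminder of what focal length and principal point mean, here is a sketch of an ideal pinhole projection with no distortion. It is a textbook formula, not PAT's camera model: `project_pinhole` and its parameter names are hypothetical, and real Aria cameras need the distortion model from the device calibration, so use this only to reason about the linear part.

```python
from typing import Optional, Sequence, Tuple


def project_pinhole(
    point_cam: Sequence[float], fx: float, fy: float, cx: float, cy: float
) -> Optional[Tuple[float, float]]:
    """Project a 3D point in the camera frame to pixel coordinates.

    Ideal pinhole model only (distortion is ignored). Returns None for
    points at or behind the camera plane.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)
```

A point on the optical axis, such as `(0, 0, 1)`, lands exactly on the principal point `(cx, cy)`.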
6-DOF rigid body transforms (rotation + translation):
from projectaria_tools.core.sophus import SE3
Supports translation, rotation, matrix conversion, inversion, composition, point transformation. Read core/python/sophus/ for the full API.
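The operations listed above follow standard rigid-body algebra, sketched here in pure Python so the conventions are explicit. This does not use or reproduce the `SE3` class: `rot_z`, `apply`, `compose`, and `inverse` are hypothetical helpers, and a transform is represented as a plain `(R, t)` pair that maps `p` to `R p + t`.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]
Transform = Tuple[Matrix, Vector]  # (R, t), mapping p -> R p + t


def rot_z(angle_rad: float) -> Matrix:
    """Rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(T: Transform, p: Sequence[float]) -> Vector:
    """Transform a 3D point: R p + t."""
    R, t = T
    return [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]


def compose(T_ab: Transform, T_bc: Transform) -> Transform:
    """Chain two transforms so that T_ac maps frame-c points into frame a."""
    R_ab, _ = T_ab
    R_bc, t_bc = T_bc
    R_ac = [
        [sum(R_ab[i][k] * R_bc[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
    return R_ac, apply(T_ab, t_bc)


def inverse(T: Transform) -> Transform:
    """Inverse of a rigid transform: (R^T, -R^T t)."""
    R, t = T
    R_t = [[R[j][i] for j in range(3)] for i in range(3)]
    return R_t, [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
```

Under this naming, `compose(T_world_device, T_device_camera)` would give `T_world_camera`, and `inverse(T_world_device)` would give `T_device_world`. Check the pybind sources for how the real class spells these operations.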
Gen1 vs Gen2 can be distinguished from the device_type field in the image configuration. Read core/python/VrsDataProviderPyBind.h for how to access it.
Load cloud MPS results alongside VRS data:
from projectaria_tools.core import mps
Read core/python/MpsPyBind.h for all reader functions and return types.
Key concepts:
- … T_world_device)
- … uid.
- … from projectaria_tools.core.mps import hand_tracking
- … .csv.gz files are auto-detected from file extension.

Eye gaze gotcha: There is no .gaze_direction attribute. Always use the helper functions from MpsPyBind.h to convert yaw/pitch to unit vectors and 3D points.
PAT provides time domain mapping to temporally align data across multiple Aria Gen2 devices. The underlying mechanism uses sub-GHz radio hardware for sub-millisecond accuracy.
- … DEVICE_TIME IS the reference.
- … (broadcaster_time, local_time) pairs into a SubGHz stream in its VRS. PAT uses these to map between device clocks.
- … DEVICE_TIME
- … TimeDomain.SUBGHZ: PAT handles the clock mapping automatically.

Tutorial_6 demonstrates the complete multi-device time domain mapping workflow with Rerun visualization.
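To build intuition for how `(broadcaster_time, local_time)` pairs can relate two clocks, here is a conceptual sketch using piecewise-linear interpolation. This is not PAT's algorithm, which is applied for you when you query with `TimeDomain.SUBGHZ`; `local_to_broadcaster_ns` is a hypothetical function, and both the interpolation scheme and the extrapolation from the end segments are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def local_to_broadcaster_ns(pairs: Sequence[Tuple[int, int]], local_ns: int) -> float:
    """Map a receiver-local timestamp to broadcaster time by linear interpolation.

    pairs: (broadcaster_time_ns, local_time_ns) samples with strictly
    increasing local times.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two sync samples")
    local_times = [local for _, local in pairs]
    # Pick the segment containing local_ns; clamp so that queries outside the
    # sampled range extrapolate from the first or last segment.
    i = min(max(bisect_right(local_times, local_ns), 1), len(pairs) - 1)
    (b0, l0), (b1, l1) = pairs[i - 1], pairs[i]
    return b0 + (local_ns - l0) * (b1 - b0) / (l1 - l0)
```

With samples `[(1000, 0), (2000, 1000), (3100, 2000)]`, a local time of 500 maps to 1500.0 on the broadcaster clock, and a local time of 1500 maps to 2550.0 because the second segment reflects a slightly faster broadcaster clock.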
| Protocol | Generation | Accuracy | Notes |
|---|---|---|---|
| SubGHz time domain mapping | Gen2 | Sub-millisecond | Hardware radio, broadcaster/receiver setup |
| TICSync | Gen1 | Moderate | Software-based, client/server over network |
| TimeCode | Gen1 | Varies | External timecode signal (SMPTE) |
Path: examples/Gen2/python_notebooks/
| Tutorial | Content |
|---|---|
| Tutorial_1 | VrsDataProvider basics — read multimodal sensor data |
| Tutorial_2 | Device calibration |
| Tutorial_3 | Sequential multi-sensor access (queued data streaming) |
| Tutorial_4 | Eye tracking + hand tracking (on-device) |
| Tutorial_5 | On-device VIO |
| Tutorial_6 | Timestamp alignment (multi-device SubGHz) |
| Tutorial_7 | MPS DataProvider basics |
Tutorial_4 covers on-device eye gaze (lower accuracy, immediate). Cloud MPS provides higher accuracy but requires upload + processing.
Tools in projectaria_tools/tools/:
| Tool | Purpose |
|---|---|
| aria_rerun_viewer | Rerun-based viewer for VRS + optional MPS overlay |
| viewer_mps | MPS-specific viewer (trajectory, point cloud, eye gaze, hands) |
| vrs_to_mp4 | Convert VRS camera streams to MP4 video |
| gen2_mp_csv_exporter | Export sensor data to CSV |
See the Visualization Tools docs for usage.
| Acronym | Meaning |
|---|---|
| PAT | ProjectAriaTools — this library |
| VRS | Vision Record Stream — file format for timestamped multi-stream sensor data |
| MPS | Machine Perception Services — cloud processing for SLAM, eye gaze, hand tracking |
| VIO | Visual-Inertial Odometry — real-time 6DoF pose from cameras + IMU |
| SLAM | Simultaneous Localization and Mapping — trajectory in MPS |
| CPF | Central Pupil Frame — reference coordinate frame for eye gaze |
| SE3 | Special Euclidean group in 3D — 6DoF rigid body transforms |
| SubGHz | Sub-GHz radio hardware used for time domain mapping across Gen2 devices |
npx claudepluginhub facebookresearch/projectaria-plugins --plugin aria-ark