The normalization layer for robotics data. Convert, inspect, visualize, score, and discover datasets across every major format.

RLDS  ===\                 /===> LeRobot
Zarr  ===|  Episode/Frame  |===> RoboDM
HDF5  ===/                 \===> RLDS
$ pip install -e ".[all]"

Everything you need for robotics data

One toolkit to convert, inspect, score, filter, segment, and browse robotics datasets.

Format Conversion
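Convert between RLDS, LeRobot, Zarr, HDF5, MCAP, Rosbag, and RoboDM with a single command. The hub-and-spoke architecture needs one reader or writer per format, so the work grows as O(n) in the number of formats rather than O(n²) pairwise converters.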
$ forge convert hf://lerobot/pusht ./output --format rlds
Dataset Inspection
Auto-detect format, list episodes, cameras, action/state dimensions, FPS, and schema. Works with local paths and HuggingFace URIs.
$ forge inspect hf://lerobot/aloha_sim_cube
Quality Scoring
Score every episode 0-10 with 8 research-backed metrics. Detect jerky demos, dead actions, gripper chatter, and idle periods from proprioception alone.
$ forge quality ./my_dataset --export report.json
Episode Filtering
Filter datasets by quality score, flags, or episode IDs. Supports dry-run previews and pre-computed quality reports.
$ forge filter ./dataset ./filtered --min-quality 6.0
Episode Segmentation
PELT changepoint detection on proprioception signals. Automatically split episodes at regime changes into sub-skills and idle periods.
$ forge segment ./dataset --label --plot timeline.png
Web Visualizer
Browser-based viewer with multi-camera support, action/state charts, timeline scrubber, and segment overlay. Zero extra dependencies.
$ forge visualize pusht --segment
Dataset Registry
Curated catalog of 23+ prominent robotics datasets. Search, filter, and download by name. Use dataset IDs directly in any command.
$ forge inspect droid # resolves via registry

Real output from real datasets

Every output below was generated by running Forge on the pusht dataset.

forge inspect
$ forge inspect hf://lerobot/pusht

Dataset: lerobot/pusht
Format: lerobot-v3 (v3.0)
Episodes: 206
Total frames: 25,650

Observation Schema
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┓
┃ Field             ┃ Type    ┃ Shape ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━┩
│ observation.state │ float32 │ (2,)  │
│ next.success      │ bool    │ (1,)  │
│ next.reward       │ float32 │ ()    │
│ next.done         │ bool    │ ()    │
└───────────────────┴─────────┴───────┘

Action: float32 (2,)
Cameras:
  image: 96x96 (rgb)
FPS: 10
Language: yes (100% coverage)
Sample: "Push the T-shaped block onto the T-shaped target."
forge quality
$ forge quality hf://lerobot/pusht

Analyzing episodes... ━━━━━━━━━━━━━━━━━━ 206/206

╭────────── Quality Report: pusht (206 episodes) ──────────╮
  Overall Quality Score: 8.5 / 10

  Smoothness (LDLJ)      ███████░░░  0.75  OK
  Dead Actions           █████████   0.99  OK
  Gripper Health         ██████████  1.00  OK
  Static Detection       ██████████  1.00  OK
  Timestamp Regularity   ██████████  1.00  OK
  Action Saturation      ████████░░  0.87  OK
  Action Diversity       ███░░░░░░░  0.30  OK
╰───────────────────────────────────────────────────────╯
forge segment
$ forge segment pusht --sample 8 --label --penalty aic

Resolved from registry: PushT (lerobot)
Format: lerobot-v3 | Signal: observation.state | Penalty: aic

Segmentation Results
┏━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Episode        ┃ Frames ┃ Segments ┃ Changepoints    ┃ Labels                ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━┩
│ episode_000000 │    161 │        6 │ 20, 33, 63...   │ moving -> fine_m...   │
│ episode_000001 │    118 │        6 │ 13, 34, 49...   │ fine_m -> fine_m...   │
│ episode_000002 │    141 │        7 │ 12, 27, 56...   │ moving -> fine_m...   │
│ episode_000003 │    159 │        7 │ 28, 42, 60...   │ fine_m -> fine_m...   │
│ episode_000004 │    159 │        8 │ 12, 22, 45...   │ moving -> fine_m...   │
│ episode_000005 │    157 │        6 │ 30, 47, 83...   │ moving -> moving...   │
│ episode_000006 │     69 │        4 │ 14, 46, 57      │ fine_m -> moving...   │
│ episode_000007 │    169 │        7 │ 12, 43, 59...   │ moving -> fine_m...   │
└────────────────┴────────┴──────────┴─────────────────┴───────────────────────┘

╭────────────────── Summary ──────────────────╮
  Episodes: 8
  Mean segments/episode: 6.38
  Range: 4 - 8
  Total changepoints: 43
╰───────────────────────────────────────────╯
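To make the changepoint step concrete, here is a minimal pure-Python sketch of PELT on a one-dimensional signal. It is an illustration under stated assumptions, not Forge's implementation: the function name `pelt_changepoints` is hypothetical, the segment cost is a plain least-squares (change-in-mean) cost, and the penalty is passed as a number rather than derived from a criterion such as AIC.

```python
from typing import List, Sequence


def pelt_changepoints(signal: Sequence[float], penalty: float) -> List[int]:
    """Return changepoint indices for a 1-D signal (hypothetical helper).

    Assumption: each segment is modelled by its mean, so the segment cost
    is the sum of squared deviations from that mean.
    """
    n = len(signal)
    # Prefix sums give O(1) segment costs.
    s1, s2 = [0.0], [0.0]
    for x in signal:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(a: int, b: int) -> float:
        """Least-squares cost of the segment signal[a:b]."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)   # best[t]: optimal penalised cost of signal[:t]
    best[0] = -penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(1, n + 1):
        scored = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(scored)
        # PELT pruning: drop any s that can never be optimal for later t.
        candidates = [s for value, s in scored if value - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts.
    changepoints = []
    t = n
    while last[t] > 0:
        changepoints.append(last[t])
        t = last[t]
    return sorted(changepoints)


if __name__ == "__main__":
    step = [0.0] * 50 + [5.0] * 50
    print(pelt_changepoints(step, penalty=1.0))  # [50]
```

A larger penalty yields fewer segments, which is the trade-off a penalty option controls.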
forge filter
$ forge filter ./dataset ./filtered --min-quality 6.0

Filtering episodes... ━━━━━━━━━━━━━━━━━━ 206/206

┏━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Episode        ┃ Score ┃ Flags              ┃ Status ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ episode_000000 │   8.7 │                    │ KEEP   │
│ episode_000001 │   9.1 │                    │ KEEP   │
│ episode_000002 │   8.4 │                    │ KEEP   │
│ episode_000003 │   5.2 │ jerky, hesitant    │ EXCL   │
│ episode_000004 │   7.8 │                    │ KEEP   │
│ episode_000005 │   3.1 │ mostly_static      │ EXCL   │
│ episode_000006 │   8.9 │                    │ KEEP   │
  ... 199 more episodes ...
└────────────────┴───────┴────────────────────┴────────┘

╭────────────── Filter Results ──────────────╮
  Episodes kept: 189 / 206
  Episodes excluded: 17
  Written to: ./filtered/
╰─────────────────────────────────────────────╯
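The keep/exclude decision in the output above can be sketched in a few lines of Python. This is only an illustration: `filter_by_quality` is a hypothetical helper, the scores are the seven rows shown above, and treating the threshold as inclusive (score >= min quality) is an assumption rather than documented Forge behavior.

```python
from typing import Dict, List, Tuple


def filter_by_quality(
    scores: Dict[str, float], min_quality: float
) -> Tuple[List[str], List[str]]:
    """Split episode IDs into (kept, excluded) by quality score.

    Assumption: an episode is kept when its score is >= min_quality.
    """
    kept = [ep for ep, score in scores.items() if score >= min_quality]
    excluded = [ep for ep, score in scores.items() if score < min_quality]
    return kept, excluded


# Scores copied from the first seven rows of the sample output.
sample_scores = {
    "episode_000000": 8.7,
    "episode_000001": 9.1,
    "episode_000002": 8.4,
    "episode_000003": 5.2,
    "episode_000004": 7.8,
    "episode_000005": 3.1,
    "episode_000006": 8.9,
}

kept, excluded = filter_by_quality(sample_scores, min_quality=6.0)
print(f"kept {len(kept)}, excluded {excluded}")
```

On these seven rows the split matches the KEEP and EXCL statuses in the table.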
forge segment pusht --label --plot timeline.png: semantic phase labels via proprioception
[Figure: segmentation timeline visualization]
forge visualize pusht --segment: browser-based viewer with segment overlay
[Screenshot: Forge Viewer, lerobot-v3, 206 episodes · 25,650 frames · 10 fps. Camera: image (96×96). Episode 0 at frame 56 / 161 in segment fine_manipulation, with Pause and 1x playback controls. Segment legend: moving, fine_manipulation, idle. Charts: Actions, States.]
Keyboard shortcuts: Space: Play/Pause   ←→: Frame   ↑↓: Episode   [/]: Speed
forge visualize droid_100 --backend rerun --samples 2: Rerun viewer with cameras, time-series, and segment labels on one timeline
[Figure: Rerun viewer showing camera stream, action and state time series]
Install: pip install forge-robotics[rerun]

Hub-and-spoke, not N×M

Add a reader, get all writers for free. Add a writer, get all readers for free.

Readers: RLDS / Open-X, LeRobot v2/v3, GR00T, Zarr, HDF5, MCAP, Rosbag, RoboDM
Intermediate Representation: Episode / Frame
Writers: RLDS, LeRobot v2/v3, RoboDM
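The hub-and-spoke idea can be sketched with two toy formats. Everything in this example is hypothetical and chosen for illustration (the `Episode` and `Frame` fields, the `toy_rows` and `toy_columns` formats, and the `convert` function); it is not Forge's actual API. The point is that each format only talks to the intermediate representation, so 8 readers and 3 writers need 11 components instead of 24 reader/writer pairs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Frame:
    action: List[float]
    state: List[float]


@dataclass
class Episode:
    episode_id: str
    frames: List[Frame] = field(default_factory=list)


# Toy format A: {episode_id: [(action, state), ...]}
def read_rows(raw: Dict[str, list]) -> List[Episode]:
    return [
        Episode(eid, [Frame(list(a), list(s)) for a, s in rows])
        for eid, rows in raw.items()
    ]


def write_rows(episodes: List[Episode]) -> Dict[str, list]:
    return {ep.episode_id: [(f.action, f.state) for f in ep.frames] for ep in episodes}


# Toy format B: {episode_id: {"actions": [...], "states": [...]}}
def read_columns(raw: Dict[str, dict]) -> List[Episode]:
    return [
        Episode(eid, [Frame(list(a), list(s)) for a, s in zip(c["actions"], c["states"])])
        for eid, c in raw.items()
    ]


def write_columns(episodes: List[Episode]) -> Dict[str, dict]:
    return {
        ep.episode_id: {
            "actions": [f.action for f in ep.frames],
            "states": [f.state for f in ep.frames],
        }
        for ep in episodes
    }


READERS: Dict[str, Callable[[Any], List[Episode]]] = {
    "toy_rows": read_rows,
    "toy_columns": read_columns,
}
WRITERS: Dict[str, Callable[[List[Episode]], Any]] = {
    "toy_rows": write_rows,
    "toy_columns": write_columns,
}


def convert(raw: Any, src: str, dst: str) -> Any:
    """Any registered reader composes with any registered writer via Episode/Frame."""
    return WRITERS[dst](READERS[src](raw))


def component_counts(n_readers: int, n_writers: int) -> Tuple[int, int]:
    """(hub-and-spoke components, direct reader-to-writer pairs)."""
    return n_readers + n_writers, n_readers * n_writers


print(component_counts(8, 3))  # (11, 24)
```

Registering one more entry in `READERS` immediately makes that format convertible to every entry in `WRITERS`, with no change to `convert`.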

Format support matrix

Read, write, and visualize across every major robotics data format.

Format          Notes
RLDS            Open-X, TensorFlow Datasets
LeRobot v2/v3   HuggingFace, Parquet + MP4
GR00T           NVIDIA Isaac, LeRobot v2 with embodiment metadata
RoboDM          Berkeley's .vla format, up to 70x compression
Zarr            Diffusion Policy, UMI
HDF5            robomimic, ACT/ALOHA
MCAP            ROS2 CDR + Foxglove Protobuf, no ROS install required
Rosbag          ROS1 .bag, ROS2 SQLite3

8 research-backed metrics

Score every episode 0-10 from proprioception data alone. No video processing needed.

Smoothness (LDLJ): Jerk-based smoothness
Dead Actions: Zero/constant detection
Gripper Chatter: Rapid open/close cycles
Static Detection: Idle period detection
Timestamp Regularity: Dropped frames & jitter
Action Saturation: Time at hardware limits
Action Entropy: Diversity vs repetition
Path Length: Wandering & hesitation
$ forge quality hf://lerobot/aloha_sim_cube --export report.json
$ forge filter ./dataset ./filtered --min-quality 6.0 # remove bad demos
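As an illustration of how a proprioception-only metric can work, here is a sketch of the log dimensionless jerk (LDLJ) for a one-dimensional speed profile, using one common definition: the negative log of duration³ / peak_speed² times the integral of squared jerk. The function name, the finite-difference scheme, and the sample data are assumptions for this example; how Forge normalizes LDLJ into its 0-10 score is not shown here.

```python
import math
from typing import Sequence


def log_dimensionless_jerk(speed: Sequence[float], dt: float) -> float:
    """LDLJ of a uniformly sampled speed profile (hypothetical helper).

    Higher (less negative) values mean smoother motion.
    """
    if len(speed) < 3 or dt <= 0:
        raise ValueError("need at least 3 samples and a positive dt")
    peak = max(abs(v) for v in speed)
    if peak == 0:
        raise ValueError("speed profile contains no motion")

    duration = (len(speed) - 1) * dt
    # Jerk is the second derivative of speed (central finite differences);
    # the integral of jerk squared is approximated by a Riemann sum.
    jerk_sq_integral = 0.0
    for i in range(1, len(speed) - 1):
        jerk = (speed[i + 1] - 2.0 * speed[i] + speed[i - 1]) / (dt * dt)
        jerk_sq_integral += jerk * jerk * dt

    if jerk_sq_integral == 0:
        return math.inf  # no measurable jerk at all
    return -math.log(duration ** 3 / peak ** 2 * jerk_sq_integral)


# A smooth bell-shaped profile versus the same profile with sample-to-sample chatter.
n, dt = 101, 0.1  # 0.1 s matches the 10 FPS of the pusht example
smooth = [30.0 * (i / (n - 1)) ** 2 * (1.0 - i / (n - 1)) ** 2 for i in range(n)]
jerky = [v + 0.05 * (i % 2) for i, v in enumerate(smooth)]

print(log_dimensionless_jerk(smooth, dt) > log_dimensionless_jerk(jerky, dt))  # True
```

Because the measure is dimensionless, demonstrations of different lengths and speeds can be compared on the same scale.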

23+ curated robotics datasets

Browse, search, and download by name. Use dataset IDs directly in any command.

Dataset             Size                  Format
DROID               ~76,000 episodes      rlds
Bridge V2           ~60,096 episodes      rlds
Open-X Embodiment   ~2,200,000 episodes   rlds
AgiBot World        ~1,000,000 episodes   lerobot-v2
RoboSet             ~100,000 episodes     hdf5
ALOHA               Bi-manual demos       lerobot

Up and running in 60 seconds

1. Install
Clone the repo and install with pip.
$ git clone https://github.com/arpitg1304/forge.git
$ cd forge
$ pip install -e ".[all]"

2. Try a demo dataset
Download, inspect, and score a small dataset in one command.
$ forge demo
# Downloads pusht, runs inspect + quality

3. Convert and go
Convert to any format you need for training.
$ forge convert droid ./output --format lerobot-v3