Metadata-Version: 2.4
Name: robotdataset
Version: 0.1.5
Summary: A TorchRL wrapper for robot learning datasets.
Author-email: Vaishnavahari Seenivasan <vaishnavahari.seenivasan@rwth-aachen.de>
License-Expression: MIT
Project-URL: Homepage, https://github.com/robotics-action-group/robotdataset
Project-URL: Repository, https://github.com/robotics-action-group/robotdataset.git
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ipykernel>=6.16.2
Requires-Dist: numpy>=1.21.6
Requires-Dist: tensordict>=0.2.1
Requires-Dist: torch>=1.13.1
Requires-Dist: torchrl>=0.2.0
Requires-Dist: tqdm>=4.64.0
Requires-Dist: imageio[ffmpeg]
Provides-Extra: oxe
Requires-Dist: tensorflow>=2.11.1; extra == "oxe"
Requires-Dist: tensorflow-datasets>=4.8.2; extra == "oxe"
Provides-Extra: hf
Requires-Dist: datasets>=2.14.0; extra == "hf"
Requires-Dist: huggingface_hub>=0.19.0; extra == "hf"
Requires-Dist: Pillow>=9.0; extra == "hf"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Provides-Extra: all
Requires-Dist: tensorflow>=2.11.1; extra == "all"
Requires-Dist: tensorflow-datasets>=4.8.2; extra == "all"
Requires-Dist: datasets>=2.14.0; extra == "all"
Requires-Dist: huggingface_hub>=0.19.0; extra == "all"
Requires-Dist: Pillow>=9.0; extra == "all"
Dynamic: license-file

# robotdataset

[![Publish to PyPI](https://github.com/svaichu/robotdataset/actions/workflows/publish.yml/badge.svg)](https://github.com/svaichu/robotdataset/actions/workflows/publish.yml)

`robotdataset` is a Python library for loading robot learning datasets from multiple
sources (Google Cloud Storage, HuggingFace Hub) into the [TorchRL](https://github.com/pytorch/rl)
**TED format** (TorchRL Episode Data), backed by memory-mapped tensors for efficient
use in deep-learning training pipelines.

## Documentation

| Page | Covers |
|---|---|
| [Overview & TED format](doc/overview.md) | Library scope, architecture, the TED format, caching |
| [OXE datasets](doc/datasets/oxe.md) | `OXEDataset`, `list_datasets`, `validate_dataset_name`, `dataset2path` |
| [OXE JAX datasets](doc/datasets/oxe_jax.md) | `OXEJAXDataset` — NumPy/JAX batch path |
| [Table30v2](doc/datasets/table30v2.md) | `Table30v2Dataset` (RoboChallenge/Table30v2, HuggingFace) |
| [AgiBotWorld-Beta](doc/datasets/agibot.md) | `AgiBotWorldBetaDataset`, `list_agibot_tasks` |
| [Samplers](doc/samplers.md) | `TemporalSampler`, `EpisodeTubeletSampler`, `JAXTemporalSampler` |
| [Visualization](doc/visualization.md) | `batchViz`, `itemViz`, `episodeViz` |

## Quick start

```bash
# All dataset backends (recommended)
pip install "robotdataset[all]"

# Or install only what you need:
pip install "robotdataset[oxe]"   # OXE / GCS datasets (requires TensorFlow)
pip install "robotdataset[hf]"    # HuggingFace datasets (Table30v2, AgiBotWorld-Beta)
```

```python
from robotdataset import OXEDataset, TemporalSampler, batchViz

# Load two episodes of an OXE dataset (streams from GCS, caches locally)
dataset = OXEDataset(
    dataset_name="cmu_playing_with_food",
    episodes=[0, 2],
    batch_size=6,
)

# Attach a temporal sampler: each sample carries a 10-frame image history
sampler = TemporalSampler(
    delta_timestamps={"observation/image": [-0.9, -0.8, -0.7, -0.6, -0.5,
                                            -0.4, -0.3, -0.2, -0.1, 0.0]},
    control_frequency=10,
)
dataset.set_sampler(sampler)

batch = dataset.sample()
batch["observation/image"].shape   # (6, 10, 480, 640, 3) — (B, T, H, W, C)

# Visualize the batch as an animated mosaic
batchViz(batch, key="observation/image", fps=8)
```

![batchViz output — 6 items from cmu_playing_with_food, observation/image, 10-frame window at 8 fps](agent/batchviz_example.gif)

## Dataset status

All loaders are in **alpha** — APIs may change without notice between releases.

| Dataset | Class | Source | Status |
|---|---|---|---|
| Open X-Embodiment (OXE) | `OXEDataset` | `gs://gresearch/robotics` | **Alpha (usable, not stable)** |
| Open X-Embodiment (JAX path) | `OXEJAXDataset` | `gs://gresearch/robotics` | **Alpha (JAX path)** |
| Table30 v2 | `Table30v2Dataset` | `RoboChallenge/Table30v2` (HF) | **Alpha** |
| AgiBotWorld-Beta | `AgiBotWorldBetaDataset` | `agibot-world/AgiBotWorld-Beta` (HF) | **Alpha** |
| LIBERO | — | `openvla/modified_libero_rlds` (HF) | Planned |

## Public API

Everything below is importable directly from the top-level package:

```python
from robotdataset import (
    # Dataset classes
    OXEDataset, OXEJAXDataset, Table30v2Dataset, AgiBotWorldBetaDataset,
    # Samplers
    TemporalSampler, EpisodeTubeletSampler, JAXTemporalSampler,
    # Discovery helpers
    list_datasets, validate_dataset_name, dataset2path, list_agibot_tasks,
    # Visualization
    batchViz, itemViz, episodeViz,
)
```

`OXEJAXDataset` and `JAXTemporalSampler` are `None` when JAX is not installed —
they are optional-dependency imports.

## License

MIT — see [LICENSE](LICENSE).
