No Lock-In: A Round Trip Between LeRobotDataset v3 and RLDS with FoodforThought
A practical guide to moving robot-demonstration datasets between formats: importing Hugging Face LeRobotDataset v3 and Open X-Embodiment (RLDS) into FoodforThought, and exporting labeled recordings back out to LeRobotDataset v3 and RLDS — the open standards the robot-learning ecosystem actually trains on.
If you do imitation-learning data work, you have lived this problem: your demonstrations are scattered across formats. Some came from the Hugging Face Hub as a LeRobotDataset. Some came from Open X-Embodiment as RLDS / TFDS shards. Some you recorded yourself. The training loader you want to use speaks exactly one of these, and the labeling tooling you want to use speaks a third. Every project starts with a day of glue code, and every glue script is one more place a frame index can quietly shift by one.
We are building FoodforThought — Kindly's open platform for robotics data — on a simple bet: the loop that matters is design → data → deploy → operate, and the data leg of that loop only compounds into a flywheel if it reads and writes the formats the open ecosystem already trains on. No proprietary container, no "export to our format and pray." If you can't take your data back out the door in the same standard you brought it in, it was never really yours.
This post is the concrete version of that promise: a no-lock-in round trip. We'll import a Hugging Face LeRobotDataset v3 and an Open X-Embodiment RLDS dataset into FoodforThought, and we'll export labeled recordings back out to LeRobotDataset v3 and RLDS. Everything here is something the open-source FoodforThought CLI (the ate command, installed from the foodforthought-cli package on PyPI) does today.
I'm writing this in the founder's voice, so a word on accuracy up front: I'd rather under-claim than oversell. Where the tooling has an edge or a dependency boundary, I'll say so plainly.
The two formats, and why we speak both
The open robot-learning world has effectively standardized on two on-disk shapes, and they make different trade-offs:
- LeRobotDataset v3 is Hugging Face's current format. Its defining move versus v2 is that it is file-based and relational: frame rows and video from many episodes are concatenated into a few large, chunked shards (data/chunk-*/file-*.parquet, videos/.../file-*.mp4), and per-episode views are reconstructed from offset metadata in meta/. The canonical marker is meta/info.json carrying codebase_version: "v3.0". The older v2 layout used one Parquet and one MP4 per episode.
- RLDS is the episodes-of-steps schema that Open X-Embodiment, RT-X, and Octo train on. A dataset is a sequence of episodes; each episode is a sequence of steps; each step bundles an observation and an action together with the boundary flags is_first / is_last / is_terminal, a reward / discount, and a language_instruction carrying the natural-language task. In the wild it ships as TFDS builder directories (features.json + *.tfrecord shards).
These aren't competitors so much as two angles on the same thing — one optimized for columnar batch loading, one for trajectory iteration. FoodforThought treats both as first-class because the people who use our data want to take it to whichever trainer they already run.
The way we keep them honest is that both formats are produced from, and parsed into, a single internal representation. Import reads a foreign dataset into a canonical in-memory dataset object; export writes that same object out. So a lerobot-v3 export and an rlds export of the same labeled dataset agree by construction — the same episode means the same thing whether you go on to train with the LeRobot loader or a tensorflow_datasets pipeline.
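To make "agree by construction" concrete, here is a minimal sketch of one in-memory dataset feeding two writers. It is not the CLI's internal code: the Frame and Episode classes, the to_v3_rows and to_rlds_steps helpers, and the "from" / "to" offset keys are all hypothetical names chosen for illustration. Setting is_terminal on the last step is also an assumption; truncated episodes may need it left false.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Frame:
    state: List[float]
    action: List[float]


@dataclass
class Episode:
    frames: List[Frame]
    task: str = ""  # empty string stands for "no language label"


def to_v3_rows(episodes: List[Episode]) -> Tuple[List[Dict], List[Dict]]:
    """Flatten all episodes into one row table plus per-episode offsets."""
    rows: List[Dict] = []
    offsets: List[Dict] = []
    for ep_idx, ep in enumerate(episodes):
        start = len(rows)
        for frame_idx, frame in enumerate(ep.frames):
            rows.append({
                "index": len(rows),  # global row index across episodes
                "episode_index": ep_idx,
                "frame_index": frame_idx,
                "observation.state": frame.state,
                "action": frame.action,
            })
        # Hypothetical offset record: rows[from:to] is this episode.
        offsets.append({"episode_index": ep_idx, "from": start, "to": len(rows)})
    return rows, offsets


def to_rlds_steps(ep: Episode) -> List[Dict]:
    """Emit one episode as a list of RLDS-style steps."""
    last = len(ep.frames) - 1
    return [
        {
            "observation": {"state": f.state},
            "action": f.action,
            "is_first": i == 0,
            "is_last": i == last,
            "is_terminal": i == last,  # assumption: episodes end terminally
            "language_instruction": ep.task,
        }
        for i, f in enumerate(ep.frames)
    ]
```

Because both functions read the same Episode objects, slicing the flat table by an episode's offsets yields exactly the frames that the step list for that episode carries.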
Installing the CLI
pip install foodforthought-cli
A couple of dependency notes that matter for this guide, because the CLI is deliberate about not dragging heavy frameworks into your environment:
- Reading LeRobot Parquet needs pyarrow (pip install pyarrow). Pulling a dataset from the Hugging Face Hub by ID needs huggingface_hub (pip install huggingface_hub).
- RLDS in FoodforThought is, by default, TensorFlow-free. The RLDS reader and writer operate on a pure-Python structure tree (dataset_info.json + episodes/episode_*.json), with no TF import anywhere on that path. TensorFlow is only needed if you want to read native TFDS builder directories (the *.tfrecord form Open X-Embodiment usually ships in), and it's gated behind an optional extra: pip install 'foodforthought-cli[rlds]'. If you point the importer at a TFDS dir without that extra installed, you get a clear, actionable error instead of a hard dependency.
Importing a Hugging Face LeRobotDataset (v2 and v3)
The LeRobot import adapter handles both the legacy per-episode v2 layout and the canonical v3 layout, and it auto-detects which one you gave it. A v3 tree (a meta/info.json with codebase_version: "v3.0" plus chunked data/ shards) is read through the shared v3 codec; a v2 tree (meta_data/info.json plus data/episode_*.parquet) falls back to the per-episode parser. You don't have to declare which you have.
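If you want to check what you have before importing, the detection rule described above can be approximated in a few lines. This is a sketch under stated assumptions, not the adapter's actual logic: detect_lerobot_layout is a hypothetical helper, and it checks only the marker files named in this post rather than validating the shards themselves.

```python
import json
from pathlib import Path


def detect_lerobot_layout(root) -> str:
    """Return "v3", "v2", or "unknown" for a local LeRobot dataset directory."""
    base = Path(root)

    # v3 marker: meta/info.json carrying codebase_version "v3.0".
    info_path = base / "meta" / "info.json"
    if info_path.is_file():
        info = json.loads(info_path.read_text(encoding="utf-8"))
        if info.get("codebase_version") == "v3.0":
            return "v3"

    # Legacy per-episode layout as described in this post.
    data_dir = base / "data"
    has_legacy_meta = (base / "meta_data" / "info.json").is_file()
    has_episode_files = data_dir.is_dir() and any(data_dir.glob("episode_*.parquet"))
    if has_legacy_meta and has_episode_files:
        return "v2"

    return "unknown"
```

An "unknown" result is worth treating as a stop sign: it usually means the path points one level too high or too low in the dataset tree.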
You can point it at a Hugging Face Hub dataset ID (owner/repo) or at a local directory:
# From the Hugging Face Hub
ate data import lerobot/pusht --source hf-lerobot --format lerobot --output ./import
# From a local LeRobot dataset directory
ate data import /path/to/local/lerobot_dataset --source hf-lerobot --format lerobot --output ./import
--source hf-lerobot selects the LeRobot adapter family. --format here picks the LeRobot variant: lerobot (a dataset, the default) versus lerobot-policy (model weights, if you want to pull a trained policy alongside the data). --output is where the converted recordings land.
Under the hood, each LeRobot episode becomes one FoodforThought recording: observation.state maps to joint positions, action maps to control inputs, the episode's MP4s are tracked as video paths, and the dataset's fps and robot_type ride along in metadata. Each recording is written out as a JSON file in your output directory, ready to get lineage and labeling on top of it.
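As a rough picture of that mapping, the sketch below groups flat frame rows by episode and emits one recording dict per episode. The column names observation.state, action, episode_index, and frame_index follow LeRobot conventions; everything on the output side (jointPositions, controlInputs, videoPaths, episodeIndex, and the rows_to_recordings helper itself) is a hypothetical naming choice for illustration, not the CLI's real recording schema.

```python
import json
from collections import defaultdict
from typing import Dict, List


def rows_to_recordings(rows: List[Dict], fps: int, robot_type: str,
                       video_paths: Dict[int, List[str]]) -> List[Dict]:
    """Group frame rows by episode and build one recording per episode."""
    by_episode = defaultdict(list)
    for row in rows:
        by_episode[row["episode_index"]].append(row)

    recordings = []
    for ep_idx in sorted(by_episode):
        # Sort explicitly so a shuffled input cannot reorder frames.
        frames = sorted(by_episode[ep_idx], key=lambda r: r["frame_index"])
        recordings.append({
            "episodeIndex": ep_idx,
            "jointPositions": [r["observation.state"] for r in frames],
            "controlInputs": [r["action"] for r in frames],
            "videoPaths": list(video_paths.get(ep_idx, [])),  # tracked, not decoded
            "metadata": {"fps": fps, "robot_type": robot_type},
        })
    return recordings


def recording_to_json(recording: Dict) -> str:
    """Serialize one recording the way it might be written to disk."""
    return json.dumps(recording, indent=2, sort_keys=True)
```

Sorting by frame_index inside each episode is the cheap insurance against the off-by-one drift that glue scripts tend to introduce.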
One honest detail about source detection: owner/repo strings are treated as Hub IDs only when they don't already exist on disk, and absolute or ./-prefixed paths are always treated as local. So a local path that happens to look like owner/repo won't get mistaken for a Hub download.
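That rule is easy to state as code. The function below is a sketch of the precedence described above, with classify_source as a hypothetical name; the real CLI may handle more cases (for example Windows paths or Hub revisions) that this version ignores.

```python
import os


def classify_source(source: str) -> str:
    """Return "local" or "hub" following the precedence described above."""
    # Absolute and ./-prefixed paths are always local, even if missing.
    if os.path.isabs(source) or source.startswith("./"):
        return "local"
    # Anything that already exists on disk wins over a Hub lookup.
    if os.path.exists(source):
        return "local"
    # Only a bare two-part "owner/repo" string is treated as a Hub ID.
    parts = source.split("/")
    if len(parts) == 2 and all(parts):
        return "hub"
    # Assumption: everything else is a (possibly missing) local path.
    return "local"
```

The order of the checks is the whole point: the existence test runs before the owner/repo pattern match, so a checkout on disk always shadows a same-named Hub dataset.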
Importing Open X-Embodiment / RLDS
Open X-Embodiment is the largest open cross-embodiment corpus — the RT-X training data — and like the rest of the open ecosystem it's distributed as RLDS. The OXE adapter ingests it into the same canonical representation as the LeRobot path, so foreign OXE data gets lineage and labeling exactly the way your own recordings do.
# RLDS structure tree (dataset_info.json + episodes/) — no TensorFlow needed
ate data import /path/to/rlds_dir --source open-x-embodiment --output ./import
# TFDS builder directory (features.json + *.tfrecord) — needs the [rlds] extra
ate data import /path/to/tfds_builder_dir --source open-x-embodiment --output ./import
--source open-x-embodiment selects the OXE adapter, which auto-detects the layout. If it sees a dataset_info.json and an episodes/ directory, it uses the TF-free reader. If it sees a features.json or *.tfrecord shards, it routes to the TensorFlow-backed TFDS reader — and that's the only branch that requires the [rlds] extra. The boundary flags, per-step language_instruction, and observation/action vectors all come across; imported episodes are tagged with their provenance so lineage can group them later.
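The routing decision can be sketched as below. This is an approximation of the behavior described above, not the adapter itself, and detect_rlds_layout is a hypothetical helper. One detail is worth spelling out: TFDS builder directories also contain a dataset_info.json, so the structure-tree branch here requires the episodes/ directory as well. TFDS shard files are typically named with a suffix after .tfrecord (such as -00000-of-00010), which is why the glob pattern ends in a wildcard.

```python
from pathlib import Path


def detect_rlds_layout(root) -> str:
    """Return "structure-tree", "tfds", or "unknown" for an RLDS directory."""
    base = Path(root)

    # TF-free structure tree: dataset_info.json plus an episodes/ directory.
    if (base / "dataset_info.json").is_file() and (base / "episodes").is_dir():
        return "structure-tree"

    # Native TFDS builder directory: features.json or tfrecord shards.
    if (base / "features.json").is_file() or any(base.glob("*.tfrecord*")):
        return "tfds"  # this is the branch that would need TensorFlow

    return "unknown"
```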
This is the part of the round trip I care most about getting right, because it's where "open standards" stops being a slogan. RLDS-native data flows in without you having to stand up a TensorFlow environment just to look at it.
Labeling: the value FoodforThought adds in the middle
Importing is only half the point. Once your demonstrations are in FoodforThought as recordings, they're in the platform's lineage system (raw → processed → labeled → trained skill) and available for annotation — including frame-exact action-segmentation labeling, where you mark reach / grasp / transport / place spans on the demonstration timeline with multi-annotator consensus for quality control. (If you want the deep dive on that specific workflow, we wrote a practical guide to action-segmentation labeling.)
The reason the round trip exists is so this labeling effort isn't trapped. You label inside FoodforThought, then you take clean, labeled trajectories back out to whatever you train on.
Exporting labeled recordings back out
Export has two modes. The one most relevant to a portable workflow is local export: point it at a directory of labeled FoodforThought recordings on disk and convert them to a standard training format, no server contact required.
# Export labeled recordings to LeRobotDataset v3
ate data export --from ./labeled_recordings --format lerobot-v3 --output ./export
# Export the same recordings to RLDS
ate data export --from ./labeled_recordings --format rlds --output ./export
--from is the local mode flag — it takes a directory (or a single JSON file) of recordings and walks every *.json under it in sorted order, so episode ordering is deterministic. Local export accepts lerobot-v3 and rlds (the aliases lerobot_v3 and lerobot also resolve to v3). On success the CLI reports the episode and frame counts it wrote.
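The deterministic walk is simple enough to reproduce when you want to predict episode order before exporting. A minimal sketch, assuming the walk is recursive and that sorting is by path; collect_recordings is a hypothetical name, not a function the CLI exposes.

```python
from pathlib import Path
from typing import List


def collect_recordings(source) -> List[Path]:
    """Return recording files in a stable order for a file or directory."""
    path = Path(source)
    if path.is_file():
        return [path]  # single-file mode
    # Sorting makes the order independent of filesystem listing order.
    return sorted(path.rglob("*.json"))
```

Path sorting is lexicographic, so episode_10.json sorts before episode_2.json. Zero-padding file names (episode_000002.json) keeps path order and numeric order aligned.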
The other mode is remote export by dataset ID, which downloads an export the platform produced for a dataset already on FoodforThought:
ate data export <dataset_id> --format rlds --output ./export
Where does the natural-language task label come from on export? The pipeline is lenient about where the instruction lives on a recording — it looks at skillParams.task (and instruction / tasks), then at metadata tags, then falls back to the skill ID — and any recording with no discoverable task is exported as task-less rather than silently being aliased to task index 0. In RLDS that surfaces as an empty language_instruction, which is the Open X-Embodiment convention for "no language label." This is a small thing that matters a lot: it means an import → label → export → re-import loop doesn't invent tasks that were never there.
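Here is a sketch of that fallback chain and of why task-less recordings need their own representation. The key names under skillParams come from the description above; the skillId key, the "task:" tag prefix, and the function names are assumptions made for illustration, since this post does not spell out the exact recording schema.

```python
from typing import Dict, List, Optional, Tuple


def resolve_task(recording: Dict) -> Optional[str]:
    """Return the first non-empty task string, or None if there is none."""
    params = recording.get("skillParams") or {}
    candidates = [params.get("task"), params.get("instruction")]
    tasks = params.get("tasks")
    if isinstance(tasks, list) and tasks:
        candidates.append(tasks[0])
    # Hypothetical tag convention: "task:<instruction>".
    for tag in (recording.get("metadata") or {}).get("tags") or []:
        if isinstance(tag, str) and tag.startswith("task:"):
            candidates.append(tag[len("task:"):])
    candidates.append(recording.get("skillId"))  # hypothetical key name
    for value in candidates:
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None


def build_task_table(recordings: List[Dict]) -> Tuple[List[str], List[Optional[int]]]:
    """Assign task indices without aliasing task-less recordings to index 0."""
    tasks: List[str] = []
    indices: List[Optional[int]] = []
    for rec in recordings:
        task = resolve_task(rec)
        if task is None:
            indices.append(None)  # stays task-less
            continue
        if task not in tasks:
            tasks.append(task)
        indices.append(tasks.index(task))
    return tasks, indices


def language_instruction(recording: Dict) -> str:
    """RLDS-side value: an empty string means "no language label"."""
    return resolve_task(recording) or ""
```

Returning None instead of defaulting to the first task is what keeps a re-import from attributing an instruction to a demonstration that never had one.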
Because both exports are built from the one canonical dataset object, the v3 and RLDS outputs of the same labeled recordings are consistent with each other. The RLDS writer also validates its own output by default — it checks that dataset_info.json matches the episode count and that every episode's boundary flags are well-formed (exactly one is_first at the start, one is_last / is_terminal at the end) before it hands you the directory.
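If you want to run a similar sanity check on RLDS data from any source, the boundary-flag half of it fits in one function. This is a sketch rather than the writer's own validator: check_boundary_flags is a hypothetical name, steps are assumed to be plain dicts, and is_terminal is only required not to appear before the final step, which leaves room for truncated episodes.

```python
from typing import Dict, List


def check_boundary_flags(steps: List[Dict]) -> List[str]:
    """Return a list of problems; an empty list means the flags are well-formed."""
    if not steps:
        return ["episode has no steps"]

    problems: List[str] = []
    last = len(steps) - 1
    for i, step in enumerate(steps):
        # is_first must be true on step 0 and false everywhere else.
        if bool(step.get("is_first", False)) != (i == 0):
            problems.append(f"step {i}: is_first should be {i == 0}")
        # is_last must be true on the final step and false everywhere else.
        if bool(step.get("is_last", False)) != (i == last):
            problems.append(f"step {i}: is_last should be {i == last}")
        # is_terminal may only be set on the final step.
        if step.get("is_terminal", False) and i != last:
            problems.append(f"step {i}: is_terminal set before the final step")
    return problems
```

Collecting every problem instead of raising on the first one makes the output more useful when an upstream bug has shifted a whole episode by one frame.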
The honest limits
A guide that only lists the happy path isn't useful to engineers, so here are the edges:
- Video is not transcoded. Both codecs operate on frame-level tabular data (states, actions, timestamps, indices, tasks). Existing MP4 shards are tracked and referenced, but the export path does not decode or re-encode video streams. If your training pipeline needs re-chunked video shards, that's a step beyond what these commands do today.
- RLDS structure vs. TFDS. The default RLDS path is a structure tree, not a .tfrecord builder. It's directly readable and trivially convertible into a tf.data / tensorflow_datasets pipeline, but if you specifically need native TFDS shards on the write side, that lives behind the optional [rlds] extra and is a thinner path than the structure writer.
- This is early and open-source. FoodforThought is researcher-first and the CLI is open. We'd genuinely rather hear that an owner/repo dataset didn't import cleanly than have you assume it's all polished. File it; that's the flywheel working.
I'm deliberately not quoting throughput numbers or dataset sizes here, because the honest answer is "it depends on your data," and a fabricated benchmark would undercut the whole point of a no-lock-in pitch.
Why this is the bet
Robot data and tooling are consolidating fast into a few foundation-lab and hyperscaler orbits. Kindly's deliberate position is the opposite: a neutral, interoperable data layer built on the standards the field already uses — LeRobotDataset, RLDS / Open X-Embodiment, and the broader open stack (ROS 2, URDF/Xacro, MCAP). The CLI is open-source so that adopting FoodforThought never traps a team, and the platform has a free tier so you can start without a commitment. The flywheel we want is for open robot learning, not a single vendor's silo.
A round trip is the cleanest test of whether a platform actually believes that. If your data can come in as LeRobot v3 or RLDS, get labeled, and leave as LeRobot v3 or RLDS, then the platform earned its place in your pipeline by being useful in the middle — not by holding your data hostage at the exits.
Try it
The fastest way to feel the labeling half of this loop is the guided try-flow — label a robot demonstration in under two minutes, no account needed:
→ Try labeling a robot demo now
When you're ready to run the round trip on your own data, pip install foodforthought-cli, point ate data import at a Hugging Face LeRobot dataset or an Open X-Embodiment RLDS directory, and explore the rest of the platform at kindly.fyi/foodforthought. Bring your data in. Take it back out. That's the deal.