Why kinodb?
Robot learning has a data infrastructure problem hiding in plain sight. The models are increasingly general, but the datasets they train on are still split across incompatible storage systems: HDF5 for robomimic and LIBERO, Parquet plus media files for LeRobot, TFRecord for RLDS and Open X-Embodiment, Zarr or raw folders inside individual labs, and custom Python loaders around all of it.
The result is a familiar loop: every lab writes another loader, every benchmark script encodes another schema assumption, and every mixed-dataset training run becomes a pile of brittle conversion code.
The Pain Is Real
The original kinodb blueprint started from concrete public complaints and published systems results, not an abstract dislike of file formats.
| Source | What it exposed | Why it matters |
|---|---|---|
| LeRobot issue #1623 | SmolVLA training spending more time in the dataloader than backprop | Data loading can dominate wall-clock training even when the model code is fine |
| LeRobot issue #1346 | Whole-dataset memory pressure; GR00T-style fine-tuning requiring hundreds of GB for images | “Just load it into RAM” breaks down at robotics scale |
| LeRobot issue #2446 | Many LIBERO variants on the Hub across format versions | Format drift creates duplicated datasets and loader incompatibility |
| LeRobot issue #1434 | Recording-time video encoding bottlenecks | Data infrastructure affects collection, not only training |
| Robo-DM, ICRA 2025 | RLDS reported as dramatically larger per episode than necessary; LeRobot slower than optimized loading | Robotics data tooling has measurable systems overhead |
| RLDS / OXE practice | TensorFlow dependency and underdocumented conventions | PyTorch-heavy robotics stacks pay an integration tax |
These issues all point at the same missing layer: a database engine that understands trajectories as trajectories.
Why Existing Formats Fall Short
| Format | Strength | Robotics weakness |
|---|---|---|
| HDF5 | Mature hierarchical binary format with efficient array access | No trajectory query language, no standard episode schema, awkward concurrent workflows |
| LeRobot Parquet | HuggingFace-native ecosystem and tabular metadata | Parquet is optimized for column analytics, not random episode reads |
| RLDS / TFRecord | Standardized reinforcement-learning episode representation | Sequential access, TensorFlow dependency, high storage overhead in common deployments |
| Zarr | Chunked N-dimensional arrays | Useful storage primitive, but no built-in robot episode abstraction |
| Custom folders | Easy to start | No portability, no schema validation, no shared query or mixing semantics |
None of these formats are “bad.” They were built for different jobs. kinodb’s bet is that robot learning deserves an episode-first database layer that can ingest from all of them.
The kinodb Idea
kinodb is an embedded trajectory database written in Rust. It stores a dataset as a single `.kdb` file with:
- a fixed-size file header,
- contiguous per-episode payloads,
- length-prefixed metadata,
- packed `f32` state/action arrays,
- optional image payloads with compressed-image pass-through,
- an end-of-file episode index for O(1) lookup.
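The layout above can be illustrated with a toy container. This is a minimal sketch, not the actual `.kdb` byte format: the magic bytes, field widths, footer, JSON metadata encoding, and all function names are assumptions chosen for illustration. It shows why an index at the end of the file gives constant-time episode lookup: the reader seeks to the footer, then to one index entry, then to the payload.

```python
import io
import json
import struct

MAGIC = b"KDBX"                 # hypothetical magic bytes, not kinodb's real ones
HEADER = struct.Struct("<4sI")  # fixed-size header: magic, format version
ENTRY = struct.Struct("<QQ")    # index entry: payload offset, payload length
FOOTER = struct.Struct("<QQ")   # footer: index offset, episode count


def encode_episode(meta, values):
    """Length-prefixed JSON metadata followed by packed little-endian f32s."""
    meta_bytes = json.dumps(meta).encode("utf-8")
    packed = struct.pack(f"<{len(values)}f", *values)
    return struct.pack("<I", len(meta_bytes)) + meta_bytes + packed


def decode_episode(payload):
    """Inverse of encode_episode: returns (metadata dict, list of floats)."""
    (meta_len,) = struct.unpack_from("<I", payload, 0)
    meta = json.loads(payload[4:4 + meta_len].decode("utf-8"))
    count = (len(payload) - 4 - meta_len) // 4
    values = list(struct.unpack_from(f"<{count}f", payload, 4 + meta_len))
    return meta, values


def write_file(buf, payloads):
    """Write the header, contiguous payloads, then the index and footer."""
    buf.write(HEADER.pack(MAGIC, 1))
    entries = []
    for payload in payloads:
        entries.append((buf.tell(), len(payload)))
        buf.write(payload)
    index_offset = buf.tell()
    for offset, length in entries:
        buf.write(ENTRY.pack(offset, length))
    buf.write(FOOTER.pack(index_offset, len(entries)))


def read_payload(buf, i):
    """Fetch episode i with three seeks, regardless of how many episodes exist."""
    buf.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count = FOOTER.unpack(buf.read(FOOTER.size))
    if not 0 <= i < count:
        raise IndexError(i)
    buf.seek(index_offset + i * ENTRY.size)
    offset, length = ENTRY.unpack(buf.read(ENTRY.size))
    buf.seek(offset)
    return buf.read(length)


# Usage: two tiny episodes in an in-memory file.
buf = io.BytesIO()
write_file(buf, [
    encode_episode({"task": "pick"}, [0.5, -1.25]),
    encode_episode({"task": "push"}, [2.0]),
])
meta, values = decode_episode(read_payload(buf, 1))
```

Because every index entry has the same size, the position of entry `i` is pure arithmetic, which is what makes the lookup independent of dataset size.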
The workflow is deliberately simple:
```bash
# HDF5: robomimic, LIBERO, DROID-style data.
kino ingest data.hdf5 --output data.kdb --format hdf5 --embodiment franka

# LeRobot v2/v3 directory.
kino ingest ./lerobot_pusht --output pusht.kdb --format lerobot

# RLDS / TFRecord directory.
kino ingest ./bridge_rlds --output bridge.kdb --format rlds --embodiment widowx
```

After ingest, every dataset has the same read path:

```bash
kino info data.kdb
kino schema data.kdb
kino query data.kdb "embodiment = 'franka' AND success = true"
```

```python
import kinodb

db = kinodb.open("data.kdb")
hits = db.query("task CONTAINS 'pick' AND num_frames > 50")
ep = db.read_episode(hits[0])
```

What kinodb Changes
One API after ingest
HDF5 demos, LeRobot episodes, and RLDS TFRecord steps become the same logical object: metadata plus frames. Training code no longer has to branch on h5py, pyarrow, and TensorFlow parsing rules.
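As a rough illustration of "metadata plus frames", the sketch below normalizes a column-oriented source (one array per field, as in an HDF5 demo group) and a step-oriented source (one record per timestep, as in an RLDS episode) into one object. `Frame`, `Episode`, `from_columns`, and `from_steps` are hypothetical names invented for this example, not kinodb's API, and real observations are usually richer than a flat list.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    state: list
    action: list


@dataclass
class Episode:
    metadata: dict
    frames: list = field(default_factory=list)


def from_columns(metadata, states, actions):
    """Column-oriented source: parallel arrays indexed by timestep."""
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    return Episode(metadata, [Frame(s, a) for s, a in zip(states, actions)])


def from_steps(metadata, steps):
    """Step-oriented source: one dict per timestep."""
    return Episode(
        metadata,
        [Frame(step["observation"], step["action"]) for step in steps],
    )


# Usage: the same two-frame trajectory arriving in two shapes.
a = from_columns({"task": "pick"}, [[0.0], [1.0]], [[0.1], [0.2]])
b = from_steps(
    {"task": "pick"},
    [
        {"observation": [0.0], "action": [0.1]},
        {"observation": [1.0], "action": [0.2]},
    ],
)
same = a == b  # dataclasses compare field by field
```

Once both sources produce the same `Episode`, downstream training code only needs one code path.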
Episode-first access
Robot training typically samples episodes or windows inside episodes. A Parquet scan is excellent when the question is “read one column across millions of rows.” A trajectory database wants “give me episode 817 and its metadata now.” kinodb puts that access pattern in the file layout.
Query and mixing as first-class operations
KQL exists because every robotics project eventually wants filters, merges, and mixes like:

```bash
kino query bridge.kdb "success = true AND task CONTAINS 'drawer'"
kino merge *.kdb --output successful.kdb --filter "success = true"
kino mix --source bridge.kdb:0.4 --source aloha.kdb:0.6 --sample 1000
```

Native HDF5, Parquet, and TFRecord do not provide the shared trajectory metadata semantics needed to make this uniform.
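To make the semantics concrete, here is a minimal pure-Python sketch of filtering on metadata and then drawing a weighted mix. It is not how `kino mix` is implemented. It assumes that `CONTAINS` means substring match and that sampling is with replacement, and the `select` and `mix` helpers are hypothetical.

```python
import random


def select(episodes, predicate):
    """Keep the ids of episodes whose metadata satisfies the predicate."""
    return [ep_id for ep_id, meta in episodes.items() if predicate(meta)]


def mix(sources, weights, n, seed=0):
    """Draw n (source, episode_id) pairs, choosing the source by weight.

    Sampling is with replacement, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    names = [name for name in sources if sources[name]]  # skip empty sources
    picks = rng.choices(names, weights=[weights[name] for name in names], k=n)
    return [(name, rng.choice(sources[name])) for name in picks]


# Usage: toy metadata standing in for two ingested datasets.
bridge = {
    0: {"success": True, "task": "open drawer"},
    1: {"success": False, "task": "open drawer"},
    2: {"success": True, "task": "close drawer"},
}
aloha = {
    0: {"success": True, "task": "slide drawer shut"},
}


def ok(meta):
    # Python analogue of: success = true AND task CONTAINS 'drawer'
    return meta["success"] and "drawer" in meta["task"]


sources = {"bridge": select(bridge, ok), "aloha": select(aloha, ok)}
batch = mix(sources, {"bridge": 0.4, "aloha": 0.6}, n=1000)
```

The point of the sketch is that both steps need only metadata and episode ids, which is why they can run uniformly once every dataset shares one metadata model.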
Rust engine, Python interface
The storage and parsing paths live in Rust for predictable performance, memory-mapped I/O, and a single CLI binary. Python remains the user-facing training interface through PyO3, NumPy arrays, and PyTorch datasets.
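kinodb ships its own PyTorch dataset helpers, whose names are not shown on this page. As a hand-rolled illustration of the same idea, the sketch below wraps the `query` and `read_episode` calls from the earlier snippet in a map-style dataset, meaning any object with `__len__` and `__getitem__`, which is the protocol `torch.utils.data.DataLoader` accepts. `EpisodeDataset` and the `InMemoryDB` stand-in are hypothetical, and the sketch assumes `query` returns a sequence of handles that `read_episode` accepts.

```python
class EpisodeDataset:
    """Map-style dataset over the episodes matched by a KQL filter."""

    def __init__(self, db, kql):
        self.db = db
        # Resolve the filter once so indexing is a plain list lookup.
        self.hits = list(db.query(kql))

    def __len__(self):
        return len(self.hits)

    def __getitem__(self, i):
        return self.db.read_episode(self.hits[i])


class InMemoryDB:
    """Stand-in for an opened database so the sketch runs without kinodb.

    It does not parse KQL; it returns precomputed ids for known queries.
    """

    def __init__(self, episodes, matches):
        self.episodes = episodes  # episode id -> episode object
        self.matches = matches    # KQL string -> list of episode ids

    def query(self, kql):
        return self.matches[kql]

    def read_episode(self, ep_id):
        return self.episodes[ep_id]


# Usage with the stand-in; with kinodb installed, db would instead come
# from kinodb.open("data.kdb").
db = InMemoryDB(
    episodes={7: "episode-7", 9: "episode-9"},
    matches={"success = true": [9, 7]},
)
dataset = EpisodeDataset(db, "success = true")
first = dataset[0]
```

Resolving the filter in the constructor keeps `__getitem__` cheap, which matters when a data loader calls it from several worker processes.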
Current Results
The benchmark history behind the launch supports three claims:
| Claim | Evidence recorded in the benchmark history |
|---|---|
| Conversion preserves data | 15/15 datasets exact match; robomimic correctness issue was traced to benchmark-side lexicographic sorting and fixed |
| Metadata operations become cheap | Tabular median metadata scan speedup: 375x; image datasets: 605-2,648x |
| Episode access improves on the target workload | Tabular median random read speedup: 8.6x; image random reads: 3.3-20x once JPEG pass-through was working |
The honest framing is important: kinodb is not claiming to beat HDF5 at every raw array read. HDF5 is extremely good at direct array access. kinodb’s contribution is the unified trajectory layer: indexed episodes, metadata queries, cross-format mixing, validation, and a Python training bridge.
Current Scope vs Roadmap
Implemented now:
- `.kdb` reader/writer with memory-mapped reads.
- HDF5 ingest for robomimic/LIBERO-style `data/demo_*` files.
- LeRobot v2/v3 Parquet ingest, including action/state list columns and image struct payloads.
- RLDS TFRecord parser without a TensorFlow runtime dependency.
- KQL metadata filters.
- `info`, `schema`, `validate`, `query`, `mix`, `merge`, `export`, and `bench` CLI commands.
- PyO3 Python bindings and PyTorch dataset helpers.
- gRPC server and Python client.
Roadmap:
- Window-level frame sampling instead of whole-episode reads.
- Raw compressed image return path and lazy image decode.
- More complete video segment indexing.
- Shared-memory serving for single-node high-throughput training.
- Published wheels and packaged CLI releases.
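Window-level sampling is a roadmap item, so its design is not described here. For readers unfamiliar with the idea, one common approach is to enumerate every valid window across all episodes and sample uniformly from that flat range, using a prefix sum over per-episode window counts. The sketch below is a generic illustration with hypothetical function names, not kinodb's planned implementation.

```python
import bisect
import itertools
import random


def build_window_table(lengths, window):
    """Prefix sums of how many full windows each episode contains."""
    counts = [max(0, n - window + 1) for n in lengths]
    return list(itertools.accumulate(counts))


def locate(cumulative, k):
    """Map flat window index k to (episode index, start frame)."""
    ep = bisect.bisect_right(cumulative, k)
    before = cumulative[ep - 1] if ep else 0
    return ep, k - before


def sample_window(lengths, window, rng):
    """Pick one window uniformly over all windows in all episodes."""
    cumulative = build_window_table(lengths, window)
    total = cumulative[-1] if cumulative else 0
    if total == 0:
        raise ValueError("no episode is long enough for this window size")
    return locate(cumulative, rng.randrange(total))


# Usage: episode 1 has only 2 frames, so it can never supply a
# 3-frame window and is skipped automatically.
lengths = [5, 2, 4]
episode, start = sample_window(lengths, window=3, rng=random.Random(0))
```

Sampling over windows rather than episodes means long episodes contribute proportionally more training samples, which may or may not be the desired weighting.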