PyTorch Training

The Python bindings expose a Rust-backed Database plus PyTorch helpers in kinodb.torch.

cd crates/kinodb-py
maturin develop --release
cd ../..
python -m pip install torch numpy

Open a database and inspect it:

import kinodb
db = kinodb.open("data.kdb")
print(db.num_episodes())
print(db.num_frames())
print(db.version())
print(db.summary())
meta = db.read_meta(0)
episode = db.read_episode(0)

read_episode() returns:

Key          Type
meta         dict
actions      NumPy float32, shape (T, action_dim)
states       NumPy float32, shape (T, state_dim)
rewards      NumPy float32, shape (T,)
is_terminal  list of bool
images       dict of camera name to NumPy uint8, shape (T, H, W, C)
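
With these shapes, slicing an episode into per-step training pairs is straightforward. The following is a minimal sketch that uses a synthetic dictionary with the documented keys in place of a real db.read_episode() result. The helpers make_fake_episode and to_transitions are hypothetical and not part of kinodb, and the metadata content is made up for illustration.

```python
import numpy as np


def make_fake_episode(T=5, action_dim=2, state_dim=3, H=4, W=4):
    """Stand-in for db.read_episode(): same keys and dtypes as documented."""
    rng = np.random.default_rng(0)
    return {
        "meta": {"task": "demo"},  # hypothetical metadata content
        "actions": rng.standard_normal((T, action_dim)).astype(np.float32),
        "states": rng.standard_normal((T, state_dim)).astype(np.float32),
        "rewards": np.zeros(T, dtype=np.float32),
        "is_terminal": [False] * (T - 1) + [True],
        "images": {"front": np.zeros((T, H, W, 3), dtype=np.uint8)},
    }


def to_transitions(episode):
    """Pair each step's state and action with the state that follows it."""
    states, actions = episode["states"], episode["actions"]
    return states[:-1], actions[:-1], states[1:]


episode = make_fake_episode()
s, a, s_next = to_transitions(episode)
print(s.shape, a.shape, s_next.shape)  # (4, 3) (4, 2) (4, 3)
```

An episode of T frames yields T - 1 transitions, because the last frame has no successor.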

For low-dimensional training on image-heavy datasets:

episode = db.read_episode_actions_only(0)

This skips image decoding and transfer.
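
To see why that matters, here is a back-of-the-envelope calculation of decoded in-memory sizes. All dimensions are assumptions chosen for illustration (200 frames, one 224x224 RGB camera, 7-D actions, 14-D states); they are not measured from kinodb or from any particular dataset.

```python
# Illustrative episode dimensions (assumptions, not taken from any dataset).
T = 200                 # frames per episode
H, W, C = 224, 224, 3   # one RGB camera, uint8 pixels
action_dim, state_dim = 7, 14

image_bytes = T * H * W * C * 1                       # uint8: 1 byte per value
lowdim_bytes = T * (action_dim + state_dim + 1) * 4   # float32 actions, states, rewards

print(f"images:  {image_bytes / 1e6:.1f} MB")          # images:  30.1 MB
print(f"low-dim: {lowdim_bytes / 1e3:.1f} KB")         # low-dim: 17.6 KB
print(f"ratio:   {image_bytes / lowdim_bytes:.0f}x")   # ratio:   1711x
```

Under these assumptions the decoded images are roughly 1,700 times larger than the low-dimensional arrays, so a policy that trains on states and actions alone gains nothing from paying for them.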

KinoDataset is an alias for KinoMapDataset.

from kinodb.torch import KinoDataset
from torch.utils.data import DataLoader

dataset = KinoDataset(
    "data.kdb",
    kql_filter="success = true",
    image_key="front",
    image_size=(224, 224),
    to_tensor=True,
)
loader = DataLoader(dataset, batch_size=8, shuffle=True)

Sample keys:

Key         Shape or type
action      (T, action_dim)
state       (T, state_dim)
reward      (T,)
image       (T, C, H, W), present when images exist
task        string
embodiment  string
success     bool or None
episode_id  int
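
Because T is a per-episode length, the samples in one batch may not share a shape, and in that case default collation cannot stack them. The sketch below shows one way to handle that with a padding collate function. It assumes each sample holds NumPy arrays (or anything np.asarray accepts) under the keys listed above. The name pad_collate and the extra "length" key are hypothetical and not part of kinodb.

```python
import numpy as np


def pad_collate(samples, keys=("action", "state", "reward")):
    """Zero-pad variable-length episodes to the longest T in the batch."""
    lengths = [len(s["action"]) for s in samples]
    t_max = max(lengths)
    # Keep the true lengths so padded steps can be masked out later.
    batch = {"length": np.asarray(lengths, dtype=np.int64)}
    for key in keys:
        arrays = [np.asarray(s[key], dtype=np.float32) for s in samples]
        out = np.zeros((len(arrays), t_max) + arrays[0].shape[1:], dtype=np.float32)
        for i, arr in enumerate(arrays):
            out[i, : len(arr)] = arr
        batch[key] = out
    # Non-array fields are kept as plain lists.
    batch["task"] = [s["task"] for s in samples]
    return batch


samples = [
    {"action": np.ones((3, 2)), "state": np.ones((3, 4)), "reward": np.ones(3), "task": "lift"},
    {"action": np.ones((5, 2)), "state": np.ones((5, 4)), "reward": np.ones(5), "task": "push"},
]
batch = pad_collate(samples)
print(batch["action"].shape, batch["reward"].shape, batch["length"])  # (2, 5, 2) (2, 5) [3 5]
```

A function like this can be passed to DataLoader through its collate_fn argument. It returns NumPy arrays, so a training loop would still convert them to tensors, for example with torch.from_numpy.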

Use KinoIterDataset for streaming or multi-source mixtures:

from kinodb.torch import KinoIterDataset
from torch.utils.data import DataLoader

dataset = KinoIterDataset(
    "data.kdb",
    kql_filter="num_frames > 50",
    shuffle=True,
    seed=42,
)
loader = DataLoader(dataset, batch_size=4)

For a weighted mixture of several databases, use from_mixture:

from kinodb.torch import from_mixture
from torch.utils.data import DataLoader

dataset = from_mixture(
    {
        "robomimic_lift.kdb": 0.3,
        "pusht.kdb": 0.4,
        "aloha.kdb": 0.3,
    },
    seed=42,
    image_size=(224, 224),
)
loader = DataLoader(dataset, batch_size=4)

Weights are relative. They do not need to sum to one.
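
To illustrate what relative weights mean, the sketch below normalizes a set of integer weights and draws sources with the standard library. It is not kinodb's sampling code; it only shows that weights of 3, 4 and 3 describe the same mixture as 0.3, 0.4 and 0.3.

```python
import random
from collections import Counter

weights = {"robomimic_lift.kdb": 3, "pusht.kdb": 4, "aloha.kdb": 3}

# Relative weights become probabilities by dividing by their sum.
total = sum(weights.values())
probs = {name: w / total for name, w in weights.items()}
print(probs)  # {'robomimic_lift.kdb': 0.3, 'pusht.kdb': 0.4, 'aloha.kdb': 0.3}

# random.choices accepts unnormalized weights directly.
rng = random.Random(42)
draws = rng.choices(list(weights), weights=list(weights.values()), k=10_000)
counts = Counter(draws)
print({name: counts[name] / len(draws) for name in weights})
```

The empirical frequencies in the second line approach the normalized probabilities as the number of draws grows.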

The current helper picks one camera:

dataset = KinoDataset("data.kdb", image_key="agentview_image")

If image_key is omitted, the first camera in the episode image dictionary is used.
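
The fallback can be pictured as follows. This is a sketch of the described behavior, not the library's implementation, and pick_camera is a hypothetical name. The string values stand in for the (T, H, W, C) arrays.

```python
def pick_camera(images, image_key=None):
    """Return (name, frames) for the requested camera, or for the first one."""
    if not images:
        return None, None
    if image_key is None:
        # Dicts preserve insertion order, so this is the first camera stored.
        image_key = next(iter(images))
    return image_key, images[image_key]


images = {"agentview_image": "frames-a", "wrist_image": "frames-b"}
print(pick_camera(images))                 # ('agentview_image', 'frames-a')
print(pick_camera(images, "wrist_image"))  # ('wrist_image', 'frames-b')
```

Since "first" depends on dictionary order, passing image_key explicitly is the safer choice when camera order may differ between datasets.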

Resizing uses nearest-neighbor interpolation and adds no extra dependencies:

dataset = KinoDataset("data.kdb", image_size=(128, 128))

The returned image tensor is normalized to [0, 1] and transposed to (T, C, H, W).
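
The three steps (nearest-neighbor resize, scaling to [0, 1], channels-first transpose) can be sketched in plain NumPy. This is an illustration under assumptions rather than kinodb's code: the function names are hypothetical, the size tuple is taken to mean (height, width), and scaling is assumed to be division by 255.

```python
import numpy as np


def resize_nearest(frames, size):
    """Nearest-neighbor resize of uint8 frames shaped (T, H, W, C)."""
    out_h, out_w = size
    _, in_h, in_w, _ = frames.shape
    # Map each output pixel to a source row/column by integer scaling.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return frames[:, rows[:, None], cols[None, :], :]


def to_model_input(frames, size):
    """Resize, scale to [0, 1], and move channels first: (T, C, H, W)."""
    resized = resize_nearest(frames, size)
    scaled = resized.astype(np.float32) / 255.0
    return scaled.transpose(0, 3, 1, 2)


frames = np.random.default_rng(0).integers(0, 256, size=(6, 48, 64, 3), dtype=np.uint8)
x = to_model_input(frames, (24, 32))
print(x.shape, x.dtype)  # (6, 3, 24, 32) float32
```

Nearest-neighbor resizing only copies existing pixels, so it is fast and needs no image library, at the cost of aliasing when downscaling by large factors.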

The current KinoMapDataset opens the database in __init__. For multiprocessing-heavy workloads, prefer one of these patterns:

# Simple path: single-process or low worker count.
loader = DataLoader(KinoDataset("data.kdb"), batch_size=4, num_workers=0)

For robust multi-worker training, use a dataset wrapper that stores only the path and calls kinodb.open() lazily inside each worker. This pattern came out of the experiment work, where database handles were not always pickle-safe across worker processes.

class LazyKinoDataset:
    def __init__(self, path):
        # Store only the path; the handle is opened on first use.
        self.path = path
        self.db = None

    def _reader(self):
        # Open lazily so each worker process gets its own handle.
        if self.db is None:
            import kinodb
            self.db = kinodb.open(self.path)
        return self.db

    def __len__(self):
        return self._reader().num_episodes()

    def __getitem__(self, idx):
        return self._reader().read_episode_actions_only(idx)
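
One caveat with a lazy wrapper: if anything in the parent process touches the handle first (a len() call, for example), the opened handle travels with the dataset the next time it is pickled for a worker. A possible hardening, sketched here with a stand-in handle instead of a real database, is to drop the handle in __getstate__. FakeHandle, PickleSafeLazyDataset and the opener parameter are hypothetical names used only for this illustration.

```python
import pickle
import threading


class FakeHandle:
    """Stand-in for a database handle; the lock makes it unpicklable."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()


class PickleSafeLazyDataset:
    def __init__(self, path, opener=FakeHandle):
        self.path = path
        self.opener = opener  # with kinodb this would be kinodb.open
        self.db = None

    def _reader(self):
        if self.db is None:
            self.db = self.opener(self.path)
        return self.db

    def __getstate__(self):
        # Never ship an open handle to another process; it reopens on demand.
        state = self.__dict__.copy()
        state["db"] = None
        return state


ds = PickleSafeLazyDataset("data.kdb")
ds._reader()                            # handle is now open in this process
clone = pickle.loads(pickle.dumps(ds))  # succeeds despite the open handle
print(clone.path, clone.db)             # data.kdb None
```

The copy arrives without a handle and opens its own on first access, which is the same behavior a freshly constructed lazy dataset has.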

The training benchmark scripts originally measured the wrong thing by preloading both native and .kdb data into NumPy arrays before timing training. Once everything is in memory, the data source no longer matters. The corrected benchmark measures disk access inside the training/inference loop.

That distinction is the launch lesson: kinodb’s advantage shows up when the benchmark includes the data loading path that real large-scale training has to pay.
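
The difference between the two benchmark designs can be reproduced with a toy timing harness. Everything here is hypothetical: load_episode simulates a disk read with a short sleep, train_step is a trivial compute step, and the numbers it prints say nothing about kinodb's actual performance.

```python
import time


def load_episode(idx):
    """Hypothetical loader standing in for a disk read such as db.read_episode(idx)."""
    time.sleep(0.001)  # simulated I/O latency
    return [float(idx)] * 100


def train_step(batch):
    """Hypothetical compute step."""
    return sum(batch) / len(batch)


def bench_preloaded(n):
    data = [load_episode(i) for i in range(n)]  # loading happens before the timer
    start = time.perf_counter()
    for batch in data:
        train_step(batch)
    return time.perf_counter() - start


def bench_end_to_end(n):
    start = time.perf_counter()
    for i in range(n):
        train_step(load_episode(i))  # loading is inside the timed loop
    return time.perf_counter() - start


n = 50
preloaded = bench_preloaded(n)
end_to_end = bench_end_to_end(n)
print(f"preloaded:  {preloaded * 1e3:.2f} ms (data source invisible)")
print(f"end-to-end: {end_to_end * 1e3:.2f} ms (includes loading)")
```

In the preloaded variant, any two storage formats would produce the same timing, because the timer starts only after the data is already in memory.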