File Format

The current .kdb format is a single little-endian file with a 64-byte header, contiguous episode payloads, and an episode index of fixed-size 64-byte entries at the end.

byte 0
┌────────────────────────────┐
│ FileHeader                 │ 64 bytes
├────────────────────────────┤
│ Episode 0 metadata blob    │
│ Episode 0 state/action data│
│ Episode 0 image data       │
├────────────────────────────┤
│ Episode 1 metadata blob    │
│ Episode 1 state/action data│
│ Episode 1 image data       │
├────────────────────────────┤
│ ...                        │
├────────────────────────────┤
│ EpisodeIndex               │ N x 64 bytes
└────────────────────────────┘

The writer reserves space for the header first, writes all episodes, writes the index, then seeks back to byte 0 and overwrites the reserved header with the final counts and offsets.
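The write order described above can be sketched as follows. This is a minimal illustration, not the actual writer: `write_skeleton` is a hypothetical name, episode payloads and index bytes are treated as opaque byte strings, the magic is assumed to be the ASCII bytes `KINO`, and `created_timestamp` is assumed to be an unsigned 64-bit value.

```rust
use std::io::{self, Seek, SeekFrom, Write};

const HEADER_LEN: usize = 64;

/// Hypothetical helper: reserve the header, write payloads and index,
/// then seek back and fill in the header fields.
fn write_skeleton<W: Write + Seek>(
    out: &mut W,
    payloads: &[Vec<u8>],
    index_bytes: &[u8],
    num_frames: u64,
    created_timestamp: u64, // assumed unsigned Unix seconds
) -> io::Result<()> {
    // 1. Reserve the header with zeros.
    out.write_all(&[0u8; HEADER_LEN])?;
    // 2. Write episode payloads back to back.
    for p in payloads {
        out.write_all(p)?;
    }
    // 3. The index starts wherever the payloads ended.
    let index_offset = out.stream_position()?;
    out.write_all(index_bytes)?;
    // 4. Build the final header (reserved bytes 48..64 stay zero).
    let mut h = [0u8; HEADER_LEN];
    h[0..4].copy_from_slice(b"KINO");
    h[4..6].copy_from_slice(&0u16.to_le_bytes()); // version_major
    h[6..8].copy_from_slice(&1u16.to_le_bytes()); // version_minor
    h[8..16].copy_from_slice(&(payloads.len() as u64).to_le_bytes());
    h[16..24].copy_from_slice(&num_frames.to_le_bytes());
    h[24..32].copy_from_slice(&index_offset.to_le_bytes());
    h[32..40].copy_from_slice(&(index_bytes.len() as u64).to_le_bytes());
    h[40..48].copy_from_slice(&created_timestamp.to_le_bytes());
    out.seek(SeekFrom::Start(0))?;
    out.write_all(&h)?;
    Ok(())
}
```

Because the index offset is only known after the last payload is written, the header cannot be finalized earlier; any `Write + Seek` target (a file or an in-memory `Cursor`) works with this pattern.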

FileHeader is exactly 64 bytes.

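| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 | magic | KINO |
| 4 | 2 | version_major | currently 0 |
| 6 | 2 | version_minor | currently 1 |
| 8 | 8 | num_episodes | episode count |
| 16 | 8 | num_frames | total frames |
| 24 | 8 | index_offset | byte offset to index |
| 32 | 8 | index_length | byte length of index |
| 40 | 8 | created_timestamp | Unix seconds |
| 48 | 16 | reserved | zeroed |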

Readers reject:

  • files shorter than 64 bytes,
  • bad magic bytes,
  • newer major versions.

Newer minor versions are accepted.
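A reader following these rules might look like the sketch below. The names `parse_header`, `HeaderError`, and `SUPPORTED_MAJOR` are hypothetical, the magic is assumed to be the ASCII bytes `KINO`, and all integer fields are assumed to be unsigned.

```rust
#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedMajor(u16),
}

#[derive(Debug, PartialEq)]
struct FileHeader {
    version_major: u16,
    version_minor: u16,
    num_episodes: u64,
    num_frames: u64,
    index_offset: u64,
    index_length: u64,
    created_timestamp: u64,
}

// Highest major version this hypothetical reader understands.
const SUPPORTED_MAJOR: u16 = 0;

fn u16_at(b: &[u8], off: usize) -> u16 {
    u16::from_le_bytes([b[off], b[off + 1]])
}

fn u64_at(b: &[u8], off: usize) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[off..off + 8]);
    u64::from_le_bytes(a)
}

fn parse_header(file: &[u8]) -> Result<FileHeader, HeaderError> {
    if file.len() < 64 {
        return Err(HeaderError::TooShort);
    }
    if &file[0..4] != b"KINO" {
        return Err(HeaderError::BadMagic);
    }
    let version_major = u16_at(file, 4);
    if version_major > SUPPORTED_MAJOR {
        return Err(HeaderError::UnsupportedMajor(version_major));
    }
    // Newer minor versions are accepted, so version_minor is not checked.
    Ok(FileHeader {
        version_major,
        version_minor: u16_at(file, 6),
        num_episodes: u64_at(file, 8),
        num_frames: u64_at(file, 16),
        index_offset: u64_at(file, 24),
        index_length: u64_at(file, 32),
        created_timestamp: u64_at(file, 40),
    })
}
```

The sketch does not check that `index_offset + index_length` lies within the file; a real reader would likely want that bounds check before touching the index.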

Each IndexEntry is exactly 64 bytes.

| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 8 | episode_id | assigned sequentially by writer |
| 8 | 4 | num_frames | frames in episode |
| 12 | 2 | action_dim | action vector dimension |
| 14 | 2 | state_dim | state vector dimension |
| 16 | 8 | actions_offset | byte offset to state/action section |
| 24 | 8 | actions_length | byte length of state/action section |
| 32 | 8 | images_offset | byte offset to image section |
| 40 | 8 | images_length | byte length of image section |
| 48 | 8 | meta_offset | byte offset to metadata blob |
| 56 | 8 | meta_length | byte length of metadata blob |

The index is stored at the end so appending episodes during write does not require knowing final offsets ahead of time.
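Since every entry is 64 bytes, the index can be decoded by walking it in fixed-size chunks. The sketch below assumes unsigned fields and uses hypothetical names (`IndexEntry` fields mirror the table; `parse_index` is not taken from the real reader).

```rust
#[derive(Debug, Clone, PartialEq)]
struct IndexEntry {
    episode_id: u64,
    num_frames: u32,
    action_dim: u16,
    state_dim: u16,
    actions_offset: u64,
    actions_length: u64,
    images_offset: u64,
    images_length: u64,
    meta_offset: u64,
    meta_length: u64,
}

fn u64_at(b: &[u8], off: usize) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[off..off + 8]);
    u64::from_le_bytes(a)
}

/// Decode one 64-byte entry; `e` must be exactly 64 bytes long.
fn parse_entry(e: &[u8]) -> IndexEntry {
    IndexEntry {
        episode_id: u64_at(e, 0),
        num_frames: u32::from_le_bytes([e[8], e[9], e[10], e[11]]),
        action_dim: u16::from_le_bytes([e[12], e[13]]),
        state_dim: u16::from_le_bytes([e[14], e[15]]),
        actions_offset: u64_at(e, 16),
        actions_length: u64_at(e, 24),
        images_offset: u64_at(e, 32),
        images_length: u64_at(e, 40),
        meta_offset: u64_at(e, 48),
        meta_length: u64_at(e, 56),
    }
}

/// Returns None if the index length is not a multiple of 64.
fn parse_index(index: &[u8]) -> Option<Vec<IndexEntry>> {
    if index.len() % 64 != 0 {
        return None;
    }
    Some(index.chunks_exact(64).map(parse_entry).collect())
}
```

With fixed-size entries, entry `i` sits at `index_offset + 64 * i`, so a reader can also jump straight to one episode without decoding the whole index.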

The current metadata encoding is deliberately simple:

u16 embodiment_len
u8[embodiment_len] embodiment_utf8
u16 task_len
u8[task_len] task_utf8
f32 fps
u8 success # 0 unknown, 1 false, 2 true
u8 reward_present # 0 absent, 1 present
f32 total_reward # only if reward_present = 1

task is called language_instruction in the Rust type.
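A decoder for this blob could be sketched as below. `Reader`, `EpisodeMeta`, and `parse_meta` are hypothetical names; only `language_instruction` is taken from the text. Treating unknown `success` or `reward_present` values as a decode failure is an assumption of this sketch, not documented behavior.

```rust
/// Minimal bounds-checked cursor over a byte slice.
struct Reader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let buf: &'a [u8] = self.buf;
        let end = self.pos.checked_add(n)?;
        let s = buf.get(self.pos..end)?;
        self.pos = end;
        Some(s)
    }
    fn read_u8(&mut self) -> Option<u8> {
        Some(self.take(1)?[0])
    }
    fn read_u16(&mut self) -> Option<u16> {
        let s = self.take(2)?;
        Some(u16::from_le_bytes([s[0], s[1]]))
    }
    fn read_f32(&mut self) -> Option<f32> {
        let s = self.take(4)?;
        Some(f32::from_le_bytes([s[0], s[1], s[2], s[3]]))
    }
    /// u16 length prefix followed by UTF-8 bytes.
    fn read_string(&mut self) -> Option<String> {
        let n = self.read_u16()? as usize;
        String::from_utf8(self.take(n)?.to_vec()).ok()
    }
}

#[derive(Debug, PartialEq)]
struct EpisodeMeta {
    embodiment: String,
    language_instruction: String, // `task` in the byte layout
    fps: f32,
    success: Option<bool>,     // None = unknown
    total_reward: Option<f32>, // None = reward absent
}

fn parse_meta(blob: &[u8]) -> Option<EpisodeMeta> {
    let mut r = Reader { buf: blob, pos: 0 };
    let embodiment = r.read_string()?;
    let language_instruction = r.read_string()?;
    let fps = r.read_f32()?;
    let success = match r.read_u8()? {
        0 => None,
        1 => Some(false),
        2 => Some(true),
        _ => return None, // assumption: reject unknown codes
    };
    let total_reward = match r.read_u8()? {
        0 => None,
        1 => Some(r.read_f32()?),
        _ => return None, // assumption: reject unknown codes
    };
    Some(EpisodeMeta { embodiment, language_instruction, fps, success, total_reward })
}
```

Mapping the three-state `success` byte to `Option<bool>` keeps "unknown" distinct from "false" on the Rust side.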

For each frame, the state/action section stores:

f32[state_dim] state
f32[action_dim] action
f32 reward
u8 is_terminal

The writer checks every frame’s action length against action_dim. Empty episodes are rejected.
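Given the per-frame layout, each frame occupies `4 * (state_dim + action_dim) + 5` bytes, so the section length can be checked against `num_frames` before decoding. The sketch below assumes no padding between frames and treats any nonzero `is_terminal` byte as true; `Frame`, `frame_size`, and `parse_frames` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct Frame {
    state: Vec<f32>,
    action: Vec<f32>,
    reward: f32,
    is_terminal: bool,
}

/// Bytes per frame: state and action floats, one f32 reward, one u8 flag.
fn frame_size(state_dim: u16, action_dim: u16) -> usize {
    4 * (state_dim as usize + action_dim as usize) + 4 + 1
}

/// Decode a run of little-endian f32 values.
fn f32s(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Returns None for empty episodes or when the section length does not
/// match num_frames * frame_size.
fn parse_frames(
    section: &[u8],
    num_frames: u32,
    state_dim: u16,
    action_dim: u16,
) -> Option<Vec<Frame>> {
    let size = frame_size(state_dim, action_dim);
    if num_frames == 0 || section.len() != size.checked_mul(num_frames as usize)? {
        return None;
    }
    let s_end = 4 * state_dim as usize;
    let a_end = s_end + 4 * action_dim as usize;
    Some(
        section
            .chunks_exact(size)
            .map(|f| Frame {
                state: f32s(&f[..s_end]),
                action: f32s(&f[s_end..a_end]),
                reward: f32s(&f[a_end..a_end + 4])[0],
                is_terminal: f[a_end + 4] != 0, // assumption: nonzero = true
            })
            .collect(),
    )
}
```

For example, with `state_dim = 14` and `action_dim = 7`, each frame is `4 * 21 + 5 = 89` bytes.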

For each frame, the image section stores:

u16 num_cameras
for each camera:
    u16 camera_name_len
    u8[camera_name_len] camera_name_utf8
    u32 width
    u32 height
    u8 channels
    u8 format # 0 raw, 1 compressed
    u32 data_len
    u8[data_len] data

When format >= 1, the reader attempts to decode JPEG/PNG bytes to raw RGB. If decode fails for a frame image, that image is skipped.
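The per-frame image records can be walked with a small bounds-checked parser. This sketch only splits the section into camera records and leaves `data` undecoded; JPEG/PNG decoding would need an image library and is out of scope here. `CameraImage`, `take`, and `parse_frame_images` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct CameraImage<'a> {
    name: String,
    width: u32,
    height: u32,
    channels: u8,
    format: u8, // 0 raw, >= 1 compressed
    data: &'a [u8],
}

/// Take `n` bytes starting at `*pos`, advancing the position on success.
fn take<'a>(buf: &'a [u8], pos: &mut usize, n: usize) -> Option<&'a [u8]> {
    let end = pos.checked_add(n)?;
    let s = buf.get(*pos..end)?;
    *pos = end;
    Some(s)
}

/// Parse the image records of one frame starting at `*pos`.
/// Returns None if the buffer is truncated or a name is not valid UTF-8.
fn parse_frame_images<'a>(buf: &'a [u8], pos: &mut usize) -> Option<Vec<CameraImage<'a>>> {
    let n = take(buf, pos, 2)?;
    let num_cameras = u16::from_le_bytes([n[0], n[1]]);
    let mut out = Vec::with_capacity(num_cameras as usize);
    for _ in 0..num_cameras {
        let l = take(buf, pos, 2)?;
        let name_len = u16::from_le_bytes([l[0], l[1]]) as usize;
        let name = std::str::from_utf8(take(buf, pos, name_len)?).ok()?.to_owned();
        // width (4) + height (4) + channels (1) + format (1)
        let h = take(buf, pos, 10)?;
        let width = u32::from_le_bytes([h[0], h[1], h[2], h[3]]);
        let height = u32::from_le_bytes([h[4], h[5], h[6], h[7]]);
        let d = take(buf, pos, 4)?;
        let data_len = u32::from_le_bytes([d[0], d[1], d[2], d[3]]) as usize;
        let data = take(buf, pos, data_len)?;
        out.push(CameraImage { name, width, height, channels: h[8], format: h[9], data });
    }
    Some(out)
}
```

Calling `parse_frame_images` once per frame with the same `pos` walks the whole image section; a caller can then decode each record whose `format` is 1 or higher and drop the ones that fail, matching the skip behavior described above.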

| Operation | Reads |
|---|---|
| Open file | header and index through mmap |
| read_meta(i) | metadata blob only |
| read_episode(i) | metadata, state/action section, image section |
| read_episode_actions_only(i) | metadata and state/action section |
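The access pattern follows directly from the index: each operation slices only the byte ranges it needs. The sketch below works on any `&[u8]` view of the file (for example a memory map) and uses hypothetical names (`Sections`, `slice`, and the `*_bytes` functions); it returns raw bytes rather than decoded values.

```rust
use std::convert::TryFrom;

/// (offset, length) pairs copied from one index entry.
struct Sections {
    meta: (u64, u64),
    actions: (u64, u64),
    images: (u64, u64),
}

/// Bounds-checked slice of the file; None if the range is out of bounds.
fn slice(file: &[u8], (offset, length): (u64, u64)) -> Option<&[u8]> {
    let start = usize::try_from(offset).ok()?;
    let end = start.checked_add(usize::try_from(length).ok()?)?;
    file.get(start..end)
}

/// read_meta(i): touches the metadata blob only.
fn read_meta_bytes<'a>(file: &'a [u8], s: &Sections) -> Option<&'a [u8]> {
    slice(file, s.meta)
}

/// read_episode_actions_only(i): metadata plus state/action section.
fn read_actions_only_bytes<'a>(file: &'a [u8], s: &Sections) -> Option<(&'a [u8], &'a [u8])> {
    Some((slice(file, s.meta)?, slice(file, s.actions)?))
}

/// read_episode(i): all three sections.
fn read_episode_bytes<'a>(
    file: &'a [u8],
    s: &Sections,
) -> Option<(&'a [u8], &'a [u8], &'a [u8])> {
    Some((slice(file, s.meta)?, slice(file, s.actions)?, slice(file, s.images)?))
}
```

Skipping the image section is what makes the actions-only path cheap, since image data typically dominates the size of an episode.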

The current format is versioned as 0.1. Compatibility rules are conservative:

  • future major versions should be rejected by old readers,
  • future minor versions may be accepted,
  • index entries keep offsets and lengths, so metadata/image encodings can evolve behind those boundaries.