KQL Queries
KQL is the small metadata query language built into kinodb-core. It is intentionally narrow: it filters episodes by the fields needed for robot dataset selection, curriculum construction, validation, and mixed-source training.
Syntax
    <field> <operator> <value> [AND <field> <operator> <value> ...]

Examples:

    kino query data.kdb "success = true"
    kino query data.kdb "embodiment = 'franka' AND num_frames > 50"
    kino query data.kdb "task CONTAINS 'pick' AND fps >= 10.0"
    kino query data.kdb "total_reward != null" --limit 25

Fields
| Field | Type | Example |
|---|---|---|
| embodiment | string | embodiment = 'franka' |
| task | string | task CONTAINS 'drawer' |
| success | bool or null | success = true |
| num_frames | int | num_frames >= 100 |
| action_dim | int | action_dim = 7 |
| fps | float | fps >= 10.0 |
| total_reward | float or null | total_reward > 0.5 |

task maps to EpisodeMeta.language_instruction in Rust and to meta["task"] in Python.
Operators
| Operator | Meaning | Applies to |
|---|---|---|
| = | equals | all fields |
| != | not equals | all fields |
| > | greater than | numeric fields |
| < | less than | numeric fields |
| >= | greater than or equal | numeric fields |
| <= | less than or equal | numeric fields |
| CONTAINS | substring match | string fields |

AND is the only connective KQL currently supports between conditions. OR, parentheses, projections, and joins are intentionally out of scope for the current implementation.
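The grammar above is small enough to sketch in a few lines of Python. The following is an illustrative parser only, not the kinodb-core implementation: the function names, the use of ValueError, and the token regex are assumptions, and it assumes AND and CONTAINS are written in uppercase. Field and operator names are taken from the tables above. Values are kept as raw tokens, including their quotes.

```python
import re

# Field and operator names come from the tables above. The function names,
# the ValueError type, and the token regex are assumptions for illustration.
FIELDS = {"embodiment", "task", "success", "num_frames",
          "action_dim", "fps", "total_reward"}
OPERATORS = {"=", "!=", ">", "<", ">=", "<=", "CONTAINS"}

# A token is a quoted string, a comparison operator, or a bare word.
TOKEN = re.compile(r"""\s*('[^']*'|"[^"]*"|!=|>=|<=|=|>|<|[^\s=!<>'"]+)""")


def tokenize(query):
    """Split a query string into raw tokens (quotes are kept on strings)."""
    query = query.rstrip()
    tokens, pos = [], 0
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            bad = query[pos:].lstrip()[0]
            if bad in "'\"":
                raise ValueError("unterminated string")
            # Hypothetical message, not listed in the documented errors.
            raise ValueError(f"unexpected character {bad!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(query):
    """Parse a query into a list of (field, operator, raw_value) tuples."""
    tokens = tokenize(query)
    if not tokens:
        raise ValueError("empty query")
    conditions, i = [], 0
    while True:
        if tokens[i] not in FIELDS:
            raise ValueError(f"unknown field: {tokens[i]}")
        if i + 1 >= len(tokens) or tokens[i + 1] not in OPERATORS:
            raise ValueError("expected operator")
        if i + 2 >= len(tokens):
            # Hypothetical message, not listed in the documented errors.
            raise ValueError("expected value")
        conditions.append((tokens[i], tokens[i + 1], tokens[i + 2]))
        i += 3
        if i == len(tokens):
            return conditions
        if tokens[i] != "AND":
            raise ValueError("expected AND")
        i += 1
        if i == len(tokens):
            # Hypothetical message for a trailing AND.
            raise ValueError("expected condition after AND")
```

Because the grammar has no OR or parentheses, the parsed form is a flat list of conditions that must all hold, so no tree structure is needed in this sketch.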
Values
    'single quoted string'
    "double quoted string"
    bare_string
    true
    false
    null
    none
    123
    12.5

Bare words are accepted as strings:

    kino query data.kdb "embodiment = franka"

For launch docs and scripts, quote strings anyway. It makes examples easier to read.

    kino query data.kdb "success = true AND task CONTAINS 'pick'"
    kino query data.kdb "num_frames > 100" --limit 10

The command prints matching episode positions and metadata. Positions are zero-based and can be passed to Python read_episode(position).
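To make the value forms in this section concrete, here is one way the literals could be interpreted. This is a sketch under assumptions, not kinodb's own logic: parse_value is a hypothetical helper, keywords are assumed to be lowercase, and the numeric patterns are a guess at what counts as an int or float.

```python
import re

# Assumed numeric shapes: optional minus sign, digits, optional fraction.
INT = re.compile(r"-?\d+")
FLOAT = re.compile(r"-?\d+\.\d+")


def parse_value(token):
    """Turn one raw KQL value token into a Python value (hypothetical helper)."""
    # Quoted strings: strip the matching pair of single or double quotes.
    if len(token) >= 2 and token[0] in "'\"" and token[-1] == token[0]:
        return token[1:-1]
    if token == "true":
        return True
    if token == "false":
        return False
    # Both spellings of the null literal map to None.
    if token in ("null", "none"):
        return None
    if INT.fullmatch(token):
        return int(token)
    if FLOAT.fullmatch(token):
        return float(token)
    # Anything else is a bare word and is treated as a string.
    return token
```

Under this reading, a bare word such as true or none can never be used as a string value, which is one more reason to quote strings in scripts.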
Python
    import kinodb

    db = kinodb.open("data.kdb")
    positions = db.query("success = true AND num_frames > 100")

    for pos in positions[:5]:
        meta = db.read_meta(pos)
        print(pos, meta["task"], meta["num_frames"])

Training Filters
KQL filters can be used directly in KinoDataset:

    from kinodb.torch import KinoDataset

    dataset = KinoDataset(
        "data.kdb",
        kql_filter="success = true AND action_dim = 7",
    )

Merge Filters
Create a smaller physical dataset:

    kino merge raw.kdb --output successful.kdb --filter "success = true"

This is useful when distributing a curated dataset split.
Performance Model
KQL works by parsing the expression into a small AST and scanning episode metadata with read_meta. It never decodes frames or images. This is why metadata scans show the largest speedups in the benchmarks: native HDF5/Parquet/RLDS loaders usually have to walk their own source structures, while .kdb keeps episode metadata addressable through the index.
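The scan step can be pictured with a short Python sketch. It is not the Rust code in kinodb-core: a plain list of dicts stands in for the per-episode results of read_meta, conditions are assumed to be already parsed into (field, operator, value) tuples, and the handling of null under ordering operators is an assumption.

```python
import operator

# Operator names come from the Operators table; the mapping is illustrative.
COMPARE = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "CONTAINS": lambda field_value, needle: needle in field_value,
}


def matches(meta, conditions):
    """True when every (field, op, value) condition holds for one episode."""
    for field, op, value in conditions:
        actual = meta.get(field)
        # Assumption: a null field only participates in = and != checks.
        if actual is None and op not in ("=", "!="):
            return False
        if not COMPARE[op](actual, value):
            return False
    return True


def scan(metas, conditions):
    """Return zero-based positions of matching episodes, touching metadata only."""
    return [pos for pos, meta in enumerate(metas) if matches(meta, conditions)]


# Stand-in metadata records; a real scan would read these from the .kdb index.
metas = [
    {"task": "pick the cup", "success": True, "num_frames": 120, "total_reward": 0.9},
    {"task": "open drawer", "success": False, "num_frames": 80, "total_reward": None},
    {"task": "pick the block", "success": True, "num_frames": 40, "total_reward": 0.2},
]
print(scan(metas, [("success", "=", True), ("num_frames", ">", 100)]))  # [0]
print(scan(metas, [("task", "CONTAINS", "pick")]))                      # [0, 2]
```

The loop only ever looks at small metadata records, so its cost grows with the number of episodes rather than with the size of their frames or images.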
Recorded benchmark summary:
| Dataset class | Metadata scan result |
|---|---|
| 10 tabular datasets | Median 375x faster, range 48-612x |
| 5 image datasets | 605-2,648x faster |
Parser Errors
Common errors:

| Error | Cause |
|---|---|
| empty query | The string is empty or whitespace |
| unknown field | Field is not one of the supported KQL fields |
| expected operator | Missing =, !=, comparison, or CONTAINS |
| unterminated string | Missing closing quote |
| expected AND | KQL currently only supports AND between conditions |