Dataset

Download the prebuilt dataset assets:

python scripts/setup/download_assets.py --only data

Then train directly with the shard directory:

python train_mimic/scripts/train.py --motion_file data/datasets/seed/train

For custom dataset construction, read on.


Custom Dataset Construction

The data pipeline is: typed source YAML -> preprocess/filter -> shard-only training data. Build a dataset from a spec:

python train_mimic/scripts/data/build_dataset.py \
--spec train_mimic/configs/datasets/twist2_full.yaml

Output Structure

data/datasets/<dataset>/
├── clips/ # Optional; only for per-clip intermediates
│ └── <source>/...
├── train/
│ └── shard_*.npz
├── val/
│ └── shard_*.npz
├── manifest_resolved.csv
└── build_info.json
  • If the spec contains bvh or npz sources, the builder retains/generates clips/.
  • If the spec contains only pkl or seed_csv sources, the builder takes a batch path that writes split-level shards directly.
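The routing rule above can be sketched as a small predicate over the spec's source types (a sketch of the documented behavior, not the builder's actual code; the function name is hypothetical):

```python
def build_path(source_types):
    """Return which build path a spec takes, per the rule above:
    any bvh/npz source forces the per-clip path with a clips/ directory;
    a pkl/seed_csv-only spec goes straight to split-level shards."""
    needs_clips = any(t in ("bvh", "npz") for t in source_types)
    return "per-clip (writes clips/)" if needs_clips else "batch (shards only)"
```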

YAML Spec Format

Example (train_mimic/configs/datasets/twist2_full.yaml):

name: twist2_full
target_fps: 30
val_percent: 5
hash_salt: ""
preprocess:
  normalize_root_xy: true
  ground_align: clip_min_foot
sources:
  - name: OMOMO_g1_GMR
    type: pkl
    input: data/twist2_retarget_pkl/OMOMO_g1_GMR
  - name: lafan1_v1
    type: bvh
    input: data/lafan1_bvh
    bvh_format: lafan1

Field Reference

Field                             Description
name                              Dataset name; maps to the output directory
target_fps                        Target frame rate for resampling
val_percent                       Validation split percentage (hash-based on clip_id)
hash_salt                         Optional salt for the split hash
preprocess.normalize_root_xy      Normalize the root body's first-frame xy to the origin
preprocess.ground_align           none / clip_min_foot
preprocess.min_frames             Minimum clip length
preprocess.max_root_lin_vel       Root linear velocity filter threshold
preprocess.min_peak_body_height   Minimum peak body height
preprocess.max_all_off_ground_s   Max duration with all feet off the ground (seconds)
sources[].name                    Source name (used for the clips/ subdirectory)
sources[].type                    bvh / pkl / npz / seed_csv
sources[].input                   Input file or directory
sources[].weight                  Optional sampling weight (default 1.0)
sources[].bvh_format              Required for BVH: lafan1 / hc_mocap / nokov
sources[].robot_name              BVH only; default unitree_g1
sources[].max_frames              BVH only; 0 = full length
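The val_percent split is described as hash-based on clip_id with an optional hash_salt, which makes assignments deterministic and stable across rebuilds. A minimal sketch of such a scheme (the MD5 choice and bucketing are assumptions, not the builder's actual hash):

```python
import hashlib

def split_of(clip_id: str, val_percent: float, hash_salt: str = "") -> str:
    """Deterministically assign a clip to 'train' or 'val'.

    Hashes salt + clip_id into one of 100 buckets; clips whose bucket
    falls below val_percent go to the validation split.
    """
    digest = hashlib.md5((hash_salt + clip_id).encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "val" if bucket < val_percent else "train"
```

Because the assignment depends only on the clip_id and salt, adding or removing other sources never moves an existing clip between splits; changing hash_salt reshuffles the whole split.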

Conversion Rules

All sources are converted to standard training shards. Each clip goes through preprocess/filter before being written to shards:

  • bvh -> retarget pkl -> npz clip
  • pkl -> npz clip (or direct batch shard for pkl-only datasets)
  • npz -> validate + copy/reuse

Each shard contains: clip_starts, clip_lengths, clip_fps, clip_weights.
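Those four index arrays are enough to recover per-clip slices from a shard. A minimal reader sketch, using only the field names listed above (how the per-frame data itself is keyed in the .npz is not specified here, so this yields frame slices rather than frames):

```python
import numpy as np

def iter_clips(shard_path):
    """Yield per-clip metadata from a shard using its index arrays.

    clip_starts / clip_lengths delimit each clip's frame range;
    clip_fps and clip_weights carry its rate and sampling weight.
    """
    shard = np.load(shard_path)
    starts = shard["clip_starts"]
    lengths = shard["clip_lengths"]
    for i, (s, n) in enumerate(zip(starts, lengths)):
        yield {
            "fps": float(shard["clip_fps"][i]),
            "weight": float(shard["clip_weights"][i]),
            "frames": slice(int(s), int(s + n)),
        }
```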

Common Commands

# Force rebuild
python train_mimic/scripts/data/build_dataset.py \
--spec train_mimic/configs/datasets/twist2_full.yaml --force

# Parallel processing
python train_mimic/scripts/data/build_dataset.py \
--spec train_mimic/configs/datasets/twist2_full.yaml --jobs 8

# Custom output root
python train_mimic/scripts/data/build_dataset.py \
--spec train_mimic/configs/datasets/twist2_full.yaml \
--output_root /tmp/my_datasets

# Print build report
python train_mimic/scripts/data/build_dataset.py \
--spec train_mimic/configs/datasets/twist2_full.yaml --json

Batch Ingest to NPZ Clips

Convert raw data to standard NPZ clips without merging:

python train_mimic/scripts/data/ingest_motion.py \
--type bvh --input data/lafan1_bvh \
--output data/datasets/lafan1_v1/clips/lafan1_v1 \
--source lafan1_v1 --bvh_format lafan1 --jobs 8

Check Clip FK Consistency

python train_mimic/scripts/data/check_motion_npz_fk.py \
--npz data/datasets/<dataset>/clips/<source>/<clip>.npz

Recommended thresholds: pos_max < 1e-3 m, quat_mean < 0.05 rad, quat_p95 < 0.10 rad.
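If you post-process the checker's error arrays yourself, the recommended thresholds translate directly into three checks (a sketch over generic error arrays; the checker script's own output format is not specified here):

```python
import numpy as np

def check_fk_errors(pos_err_m, quat_err_rad):
    """Apply the recommended FK-consistency thresholds to raw error arrays.

    pos_err_m:    per-frame body position errors in meters
    quat_err_rad: per-frame orientation errors in radians
    """
    return {
        "pos_max_ok": float(np.max(pos_err_m)) < 1e-3,
        "quat_mean_ok": float(np.mean(quat_err_rad)) < 0.05,
        "quat_p95_ok": float(np.percentile(quat_err_rad, 95)) < 0.10,
    }
```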

Re-shard

Split large shards for distribution:

python train_mimic/scripts/data/split_shards.py \
--input data/datasets/seed/train \
--output data/datasets/seed/train_small_shards \
--max_size_gb 2

Each shard is self-contained with full clip metadata.
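Size-capped splitting like this can be sketched as greedy first-fit packing of whole clips, which is what keeps each output shard self-contained (an illustrative sketch under that assumption, not the script's actual algorithm):

```python
def pack_clips(clip_sizes_bytes, max_size_gb=2.0):
    """Greedily pack clip indices into shards no larger than the cap.

    Clips are never split across shards, so each shard stays
    self-contained; a clip larger than the cap gets its own shard.
    """
    cap = max_size_gb * 1024**3
    shards, current, used = [], [], 0
    for i, size in enumerate(clip_sizes_bytes):
        if current and used + size > cap:
            shards.append(current)
            current, used = [], 0
        current.append(i)
        used += size
    if current:
        shards.append(current)
    return shards
```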