AI RESEARCH

Modeling Robotics Dataset Construction as an Artifact-Based Build Process

arXiv CS.LG

arXiv:2606.00162v1 Announce Type: cross Robotic systems generate large volumes of multimodal sensor data, but converting ROS bag recordings into machine learning datasets is often handled by ad hoc sequential scripts, which creates engineering overhead and slows iteration. We model dataset construction as an artifact-based build process over a dependency graph and implement this approach in Bagzel, an open-source Bazel extension for reproducible, incremental dataset generation (including nuScenes-format export).
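
To make the idea of an artifact-based build over a dependency graph concrete, here is a minimal Python sketch. It is not Bagzel's implementation or API: the `Build` class, the target names (`camera_frames`, `lidar_sweeps`, `calibration`, `dataset`), and the string-valued "artifacts" are all hypothetical stand-ins chosen for illustration. The sketch assumes each step is a pure function of its inputs, so a step can be skipped whenever the hash of its inputs is unchanged.

```python
import hashlib
from graphlib import TopologicalSorter


class Build:
    """Toy artifact graph (hypothetical, not Bagzel's API)."""

    def __init__(self):
        self.targets = {}   # name -> (dependency names, action)
        self.cache = {}     # name -> (input hash, output)
        self.executed = []  # actions that actually ran in the last run()

    def add(self, name, deps, action):
        self.targets[name] = (tuple(deps), action)

    def run(self, sources):
        """Build every target; rerun an action only if its inputs changed."""
        graph = {name: deps for name, (deps, _) in self.targets.items()}
        outputs = dict(sources)
        self.executed = []
        # static_order() yields dependencies before their dependents.
        for name in TopologicalSorter(graph).static_order():
            if name in sources:
                continue  # source artifacts are given, not built
            deps, action = self.targets[name]
            inputs = [outputs[d] for d in deps]
            # Assumption: the key covers input values only, not the action code.
            key = hashlib.sha256(repr(inputs).encode()).hexdigest()
            cached = self.cache.get(name)
            if cached and cached[0] == key:
                outputs[name] = cached[1]  # cache hit: reuse the artifact
                continue
            outputs[name] = action(*inputs)
            self.cache[name] = (key, outputs[name])
            self.executed.append(name)
        return outputs


build = Build()
build.add("camera_frames", ["bag"], lambda bag: f"frames({bag})")
build.add("lidar_sweeps", ["bag"], lambda bag: f"sweeps({bag})")
build.add("calibration", ["calib_yaml"], lambda c: f"calib({c})")
build.add(
    "dataset",
    ["camera_frames", "lidar_sweeps", "calibration"],
    lambda f, s, c: f"dataset[{f}, {s}, {c}]",
)

first = build.run({"bag": "drive_001", "calib_yaml": "v1"})
first_executed = list(build.executed)

# Same inputs: nothing needs to run again.
second = build.run({"bag": "drive_001", "calib_yaml": "v1"})
second_executed = list(build.executed)

# Only the calibration changes: the bag-derived artifacts are reused.
third = build.run({"bag": "drive_001", "calib_yaml": "v2"})
third_executed = list(build.executed)
```

In this sketch, changing only the calibration input reruns `calibration` and the downstream `dataset` step while the bag-derived artifacts come from the cache. A sequential script would instead redo every step on each run.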