Hey all,
We’re thrilled to introduce mcap-etl, an open-source package that loads the contents of MCAP files into a database.
Installation
Installation is straightforward with pip:
pip install mcap-etl
The Challenge
Working with MCAP (or ROS Bag) files presents two significant challenges:
- Storage Consumption: These files can quickly balloon in size, often necessitating on-demand downloads from cloud storage to avoid using excessive local disk space.
- Serialized Structure: The serialized format of these files, while useful for controlling file size, requires developers to create custom scripts for each use case or build complex cloud infrastructure with custom data parsing and extraction pipelines.
Our Solution
We designed mcap-etl to alleviate these problems, starting with a transformation pipeline from MCAP to TimescaleDB. This enables you to run any time-series query on your ROS data. For every topic, our package creates a table, and for every message, it writes a record.
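To make the topic-to-table and message-to-record idea concrete, here is a minimal Python sketch of what such a mapping could look like. It is not mcap-etl's actual code: the function names (topic_to_table, flatten, to_record), the underscore-joined column naming, and the conversion of a nanosecond log time into a ts column are all assumptions made for illustration.

```python
import re


def topic_to_table(topic: str) -> str:
    """Turn a ROS topic such as '/robot/battery_state' into a SQL-safe table name.

    Hypothetical naming rule: non-alphanumeric runs become single underscores.
    """
    name = re.sub(r"[^0-9a-zA-Z]+", "_", topic).strip("_").lower()
    return name or "unnamed_topic"


def flatten(message: dict, prefix: str = "") -> dict:
    """Flatten nested message fields into column names joined by underscores."""
    row = {}
    for key, value in message.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested sub-message: recurse and prefix its fields.
            row.update(flatten(value, prefix=f"{column}_"))
        else:
            row[column] = value
    return row


def to_record(topic: str, log_time_ns: int, message: dict):
    """Return (table_name, row) for one decoded message.

    Assumes the message has already been deserialized into a plain dict.
    """
    row = {"ts": log_time_ns / 1e9}  # seconds since epoch, assumed column name
    row.update(flatten(message))
    return topic_to_table(topic), row


if __name__ == "__main__":
    table, row = to_record(
        "/battery_state",
        1_500_000_000,
        {"voltage": 12.1, "header": {"frame_id": "base"}},
    )
    print(table, row)
```

With a mapping along these lines, each topic ends up as its own table and each message as one row keyed by its timestamp, which is what makes the time-series queries possible.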
Previously, if you wanted to plot battery voltage over time for your robot in Grafana, you would need to write a specialized ETL job to populate a database. Now, you can connect to Timescale in Grafana, and run a query like so:
SELECT time_bucket('30 seconds', ts) AS bucket_time, AVG(voltage)
FROM battery_state
WHERE $__timeFilter(ts)
GROUP BY bucket_time
ORDER BY bucket_time;
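If it helps to see what that query computes, the following Python sketch reproduces the same idea on an in-memory list of samples: group readings into 30-second buckets, average each bucket, and return the buckets in time order. It is a simplified illustration, not how TimescaleDB implements time_bucket; the function name bucketed_average and the choice to align buckets to multiples of the width counted from zero are assumptions.

```python
from collections import defaultdict


def bucketed_average(samples, bucket_seconds=30):
    """Average (ts_seconds, value) samples per fixed-width time bucket.

    Returns a list of (bucket_start, average) sorted by bucket_start.
    Buckets are aligned to multiples of bucket_seconds (an assumption here).
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket_start -> [sum, count]
    for ts, value in samples:
        bucket = (ts // bucket_seconds) * bucket_seconds
        totals[bucket][0] += value
        totals[bucket][1] += 1
    return [(bucket, total / count) for bucket, (total, count) in sorted(totals.items())]


if __name__ == "__main__":
    # Hypothetical battery voltage readings: (seconds, volts)
    readings = [(0, 12.0), (10, 12.4), (31, 11.8), (59, 12.2), (60, 12.0)]
    for bucket_start, avg_voltage in bucketed_average(readings):
        print(bucket_start, round(avg_voltage, 2))
```

In the Grafana query, the $__timeFilter(ts) macro additionally restricts the rows to the dashboard's selected time range before the bucketing happens.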
We welcome additional transformations suggested by the ROS community.
Additionally, we’re developing a hosted solution offering managed services for data ingestion, database management, and infrastructure for integrations including S3 and Grafana. This service will also provide tools for converting data back from Timescale to .mcap and .bag formats, and a web interface to monitor and share data with your team.
For more detailed information on installation and usage, please visit our GitHub README.
We eagerly await your feedback, suggestions, and contributions!