Working with large ROS bag files on Hadoop and Spark

raysayandeep · April 10, 2019, 2:20pm

Hi Jan,

I am using an Azure HDInsight Spark cluster to extract data from rosbag files with RosbagInputFormat. I followed the README, but when I run the code in pyspark I get the error shown below.

It seems unable to read the idx file from the local file system.

Using Python version 2.7.12 (default, Jul  2 2016 17:42:40)
SparkSession available as 'spark'.
>>> sc.newAPIHadoopFile(
...     path =             "/user/spark/HMB_4.bag",
...     inputFormatClass = "de.valtech.foss.RosbagMapInputFormat",
...     keyClass =         "org.apache.hadoop.io.LongWritable",
...     valueClass =       "org.apache.hadoop.io.MapWritable",
...     conf = {"RosbagInputFormat.chunkIdx":"/opt/ros_hadoop/master/dist/HMB_4.bag.idx.bin"})
[Stage 0:>                                                          (0 + 1) / 1]
19/04/10 14:16:35 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, wn1-avdp-h.cfzrwlyaxyyuvies4sglc0tsud.cx.internal.cloudapp.net, executor 5): java.io.FileNotFoundException: /opt/ros_hadoop/master/dist/HMB_3.bag.idx.bin (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)

Could you please help me with that?

Thanks,
Sayandeep
