We have been discussing the architecture of the perception pipeline in issue #1501. However, the discussion stalled before it converged.
The current perception architecture does not contain the information necessary for planning to
avoid obstacles or stop in front of an obstacle. These features were not considered when the architecture
was developed. Based on these needs and other requirements from the planning layer, we would like
to propose the following message types for use in the perception pipeline.
We tried to be as broad as we could with the Object and ObjectWithCovariance types in derived_object_msgs, covering as many sensor-specific inputs as possible with generic structures (like geometry_msgs/Pose and shape_msgs/SolidPrimitive), but we are always open to feedback if these don't cover a specific need.
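To make the discussion concrete, here is a rough Python sketch of the data shape being described, with simplified stand-ins for geometry_msgs/Pose and shape_msgs/SolidPrimitive. The field names mirror the ROS types but the classes and example values are illustrative assumptions, not the real message definitions.

```python
from dataclasses import dataclass, field
from typing import ClassVar, List

@dataclass
class Pose:
    # Simplified stand-in for geometry_msgs/Pose.
    position: tuple = (0.0, 0.0, 0.0)          # x, y, z
    orientation: tuple = (0.0, 0.0, 0.0, 1.0)  # quaternion x, y, z, w

@dataclass
class SolidPrimitive:
    # Simplified stand-in for shape_msgs/SolidPrimitive.
    BOX: ClassVar[int] = 1
    SPHERE: ClassVar[int] = 2
    CYLINDER: ClassVar[int] = 3
    CONE: ClassVar[int] = 4
    type: int = 1
    dimensions: List[float] = field(default_factory=list)

@dataclass
class Object:
    # Hypothetical subset of derived_object_msgs/Object for illustration.
    id: int = 0
    pose: Pose = field(default_factory=Pose)
    shape: SolidPrimitive = field(default_factory=SolidPrimitive)

# A detected car 10 m ahead, described purely with the generic structures.
car = Object(id=7,
             pose=Pose(position=(10.0, -2.5, 0.0)),
             shape=SolidPrimitive(type=SolidPrimitive.BOX,
                                  dimensions=[4.5, 1.8, 1.5]))
```

The point of the generic containers is that a lidar cluster, a radar track, or a camera detection can all be expressed in the same fields.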
Our next step would be what you have done here: the data modeling step.
From a quick glance, there is nothing in your proposal that is not already included in http://wiki.ros.org/derived_object_msgs, so I suggest including the messages from AS.
I third using the AutonomousStuff messages. They have what we need, and while I can see a couple of things I'd consider making more precise, I don't see any problems, even small ones, in using them.
The current message type is designed to cover a wide variety of information, and a lot of information can be defined in it. This includes not only the requirements for planning but also data specific to particular algorithms and sensors.
The current definition has some issues.
Issue 1: The large amount of algorithm-specific data makes it difficult to define the interface and to modularize perception. By "interface" I mean which information should be filled in by detection and which by tracking.
Issue 2: Unstructured and redundant message type
Issue 3: Missing information required for planning
Descriptions
This time, we defined the message type based on the information required for dynamic objects (anything that can move, such as a pedestrian, car, or truck).
The details are written here.
As for paths, it is still undefined and will be added as we discuss in the future.
About derived_object_msgs
derived_object_msgs is built on the same idea as the current message type, as @JWhitleyWork said.
Thank you, @Kosuke_MURAKAMI.
As a side note, I think the shape representation in the AS msg is a problem: it cannot switch between a polygon and a SolidPrimitive, only among the SolidPrimitive types (cone, sphere, bounding box, cylinder). This is also important from the point of view of multi-object tracking and path planning.
@JWhitleyWork
Who uses the detection label and classification age, and for what?
If we add items broadly, then when we modularize detection, tracking, and prediction, those modules will have to fill in all of these items, and some algorithms may not be able to. I think it is better to define only what is really necessary in the msg.
A Detected object is one which has been seen in at least one scan/frame of a sensor.
A Tracked object is one which has been correlated over multiple scans/frames of a sensor.
An object which is detected can only be assumed to have valid pose and shape properties.
An object which is tracked should also be assumed to have valid twist and accel properties.
The validity of the individual components of each object property is defined by the property's covariance matrix.
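As an illustration of this convention, here is a small Python sketch in which a property component counts as valid only when its covariance diagonal entry is finite and non-negative. The field layout, the NaN sentinel, and the helper names are assumptions made for the sketch, not part of any message spec.

```python
import math

# Assumed convention for the sketch: an "unknown" variance is marked with NaN.
UNKNOWN = float('nan')

def component_valid(variance: float) -> bool:
    # A component is valid when its variance is a finite, non-negative number.
    return math.isfinite(variance) and variance >= 0.0

def twist_valid(twist_covariance_diag) -> bool:
    # A tracked object should carry a usable twist: every diagonal term valid.
    return all(component_valid(v) for v in twist_covariance_diag)

detected = [UNKNOWN] * 6                          # detection only: twist unknown
tracked  = [0.2, 0.2, 0.1, 0.05, 0.05, 0.05]      # tracked: twist estimated

print(twist_valid(detected))  # False
print(twist_valid(tracked))   # True
```

So a consumer such as planning never needs a separate "is tracked" flag; it can read the covariance to decide which properties to trust.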
classification_age indicates the number of "scans" or "detections" of the object in which the classification type has stayed the same. When a sensor classifies an object, it usually tells you how many "scans" of that object have been sent since the object was classified as that type. This helps determine the certainty of the classification.
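The counting rule described here can be sketched in a few lines of Python; the labels and helper name below are invented for illustration.

```python
# Toy illustration of classification_age: count consecutive scans in which an
# object keeps the same classification label.

def update_classification_age(prev_label, prev_age, new_label):
    """Return (label, age) after one more scan of the object."""
    if new_label == prev_label:
        return new_label, prev_age + 1   # same class: confidence accumulates
    return new_label, 1                  # class changed: restart the count

label, age = None, 0
for scan_label in ["UNKNOWN", "CAR", "CAR", "CAR", "PEDESTRIAN"]:
    label, age = update_classification_age(label, age, scan_label)

print(label, age)  # PEDESTRIAN 1
```

A downstream consumer could then treat a high age as a stable classification and a freshly reset age as uncertain.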
Based on your experience in robotics and autonomous vehicles, I would like to hear your opinions, @JWhitleyWork, @Dejan_Pangercic, and @sgermanserrano, about this message definition not including sensor data (i.e. ImageROI, PointCloud).
Do you think it is necessary? Or do you think it's better to keep it like this to add an abstraction layer?
Thanks
@amc-nu I think this depends on your intent for the message. If you intend to follow a "domain-specific-controller" approach, then you are trusting that the individual sensor processing nodes know how to correctly filter the raw data and, for the most part, produce abstracted objects. The uncertainty is then encoded into the covariance matrix and the classification quality data.
However, if you intend to do either data fusion before object segmentation or fusion and segmentation in the same node, you would need the raw data in the message as well. My understanding is that Autoware is shooting for the first approach so I would say we don't need the raw data.
Agreed with @JWhitleyWork comment, I would expect the raw data to be processed on a separate step so that the filtered sensor data is in a usable stage. I think it also makes sense in a setup where you might have an edge device on the sensor that is pre-processing the raw data for consumption by higher level nodes.
I think this work is being blocked by a lack of a shared understanding of what it is we want to achieve. What objects do we want to recognise, and where? What sorts of data do we want to use, and at what rates? Should data be synchronised, or can information be added to a detection after the fact? Do we or do we not use consecutive detections to strengthen an object's presence? How interchangeable or optional do we want different algorithms and detection types to be? There are a huge number of unanswered questions that need to be defined and then answered before we can even begin to think about the messages used.
In other words, we need to define our requirements before we try to solve them. Otherwise we are solving an unknown or undefined problem.
We also need to keep in mind that we are designing Autoware for all Autoware users, not just for Tier IV's favourite sensor set, or AutonomousStuff's specific demonstration. I'm not saying that that is what is happening, but it is easy to forget.
Additionally, I think it would be useful to draw up a list of:
The different types of sensors we expect to be used. Not just ones we use now, but also ones a potential Autoware user might use.
The different types of data we might process. Obviously this closely relates to the sensors used, but don't forget using post-processed data as an input to an algorithm, e.g. merged dense point clouds versus individual sparse point clouds, or point clouds with or without RGB data added from a camera.
The object locating, object identifying, object tracking, object predicting, etc. algorithm types that we might use.
@amc-nu In my experience, you need to decide between performance and synchronization. That is, if you do not have many nodes and you have fast middleware, then you can use message types that also include raw data. In the opposite case, you should go with small messages.
If you split your message types too much, you will have to deal with time synchronization once the messages are received by the end node. That is both hard to do and computationally expensive.
In any case, I believe that we should finish the computational graph architecture first (how many nodes, and their composition) and then define the messages, not the other way around. I assume that AS has a solid computational graph architecture that lets them define such messages.
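The synchronization cost mentioned above can be illustrated with a toy Python version of timestamp pairing, loosely in the spirit of the message_filters ApproximateTime policy. The streams, payload strings, and tolerance are made up for the sketch.

```python
# Pair two message streams by timestamp, accepting matches within a tolerance.
# Each stream element is a (stamp_seconds, payload) tuple.

def pair_by_stamp(stream_a, stream_b, tolerance=0.05):
    """Greedily match messages whose stamps differ by at most `tolerance`."""
    pairs, j = [], 0
    for stamp_a, payload_a in stream_a:
        # Drop B messages that are already too old to match this A message.
        while j < len(stream_b) and stream_b[j][0] < stamp_a - tolerance:
            j += 1
        if j < len(stream_b) and abs(stream_b[j][0] - stamp_a) <= tolerance:
            pairs.append((payload_a, stream_b[j][1]))
            j += 1
    return pairs

# Hypothetical 10 Hz object list vs. a camera ROI stream with jitter and a gap.
objects = [(0.00, "objs@0.00"), (0.10, "objs@0.10"), (0.20, "objs@0.20")]
rois    = [(0.01, "roi@0.01"), (0.12, "roi@0.12"), (0.31, "roi@0.31")]
print(pair_by_stamp(objects, rois))
```

Even this trivial pairing needs buffering, a tolerance policy, and a rule for unmatched messages, which is exactly the end-node complexity that over-split message types create.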
PoseWithCovariance[] past_paths is removed from DynamicObject since planning is not interested in past information. Though prediction will require past information, it can be resolved inside prediction.
I think the rationale behind DynamicObject is reasonable, since I created the table based on the hearings. My opinion is already reflected in the table, e.g. that object_classified and past_paths are not necessary.
@JWhitleyWork Sorry to bother you again. Would you comment on the differences in information? I'm not familiar with the background of ObjectWithCovariance.
@kfunaoka I'm sorry it has taken so long to get back to you. Here are responses addressing your issues:
The object does have a header field. However, it does not need to be populated and ObjectWithCovarianceArray also has a header.
The types listed for "classification" are not exhaustive nor definitive. As far as I know, the message type has not been extensively used, so it is open to modification. We have only done tests with it internally and have not released any packages that use it. I agree with your assessments for the "UNKNOWN_" types. They were types provided by a sensor vendor, so we included them.
Not a problem to change the classification_certainty to a float (0-1).
Many algorithms use a convex hull bounding area to define an object; this is why "shape" was included. There is also a geometry_msgs/Polygon polygon field in the message for defining non-standard polygons.
Regarding the rest of the comments: I think the overall concept is that our message structure for these is flexible. We can add or modify just about anything in the message, though I would prefer not to remove much (if any) of the fields.
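For the convex-hull footprint mentioned in the reply above, a minimal stand-alone sketch (Andrew's monotone chain) shows the kind of 2D polygon the message's polygon field could carry. This is illustrative code, not anything taken from derived_object_msgs; the input cluster is invented.

```python
# Convex hull of a 2D point cluster via Andrew's monotone chain algorithm.

def cross(o, a, b):
    # Z-component of the cross product (a - o) x (b - o):
    # positive for a counter-clockwise turn, negative for clockwise.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                         # build the lower hull
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):               # build the upper hull
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]        # endpoints shared, drop duplicates

# A hypothetical lidar cluster; interior points vanish from the footprint.
cluster = [(0, 0), (2, 0), (1, 1), (2, 2), (0, 2), (1, 0.5)]
print(convex_hull(cluster))
```

The resulting vertex list is exactly what a polygon-style shape field would store, which is why a hull footprint and a SolidPrimitive are complementary rather than interchangeable representations.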