What exactly do you mean by “motion vector” here? There are a lot of ways to describe motion, but I think what you’re getting here is per-block displacement in screen pixel space (i.e. how far a patch of the image moved between frames), not the 3-D motion of actual objects. MPEG-style encoders run a cheap block-matching search for each macroblock to find where it came from in the previous frame and use that offset for inter-frame compression; those offsets are what they call “motion vectors.” They also feed some basic heuristics that let the encoder swap compression strategies based on how much is going on in the video. From what I can tell from the docs, this is a really rudimentary calculation.
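To make that concrete, here’s a rough sketch of what a block-matching motion search looks like. This is purely illustrative (real encoders use much faster search patterns than this exhaustive scan, and the function name and parameters are my own invention), but it shows why the resulting vectors are a compression by-product rather than a careful motion measurement:

```python
import numpy as np

def best_motion_vector(prev, cur, bx, by, block=16, search=4):
    """Crude block-matching motion estimation, roughly what an encoder
    does per 16x16 macroblock. prev/cur are 2-D grayscale arrays;
    (bx, by) is the top-left corner of the block in `cur`. Returns the
    (dx, dy) offset into `prev` with the lowest SAD (sum of absolute
    differences) -- i.e. "this block probably came from over there"."""
    target = cur[by:by + block, bx:bx + block].astype(np.int16)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            # skip candidate blocks that fall outside the previous frame
            if y < 0 or x < 0 or y + block > prev.shape[0] or x + block > prev.shape[1]:
                continue
            cand = prev[y:y + block, x:x + block].astype(np.int16)
            sad = np.abs(target - cand).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best
```

Note that it just minimizes pixel difference, so flat walls, repeating textures, and lighting changes all produce vectors that have nothing to do with real motion.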

I don’t think this is going to be particularly useful for odometry, as you’re going to get a lot of false positives and a really noisy signal. You might be able to use this data for some really primitive signals (like whether the robot is moving or rotating), but I don’t think it would be suitable for visual SLAM, even with an IMU. You might also be able to pull some really basic brightness and color information out of the encoder, which you could use for something like line following or ball tracking, but that’s a bit of a conjecture.
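For the “primitive signals” idea, something like the sketch below might work, assuming you can get the vectors as a 2-D grid of per-macroblock (dx, dy) values (the function name and threshold are hypothetical). The hand-wavy idea: driving or panning shows up as a consistent average shift across the whole frame, while yawing in place makes the vectors circulate around the image centre, which you can pick up as a net tangential component:

```python
import numpy as np

def motion_summary(dx, dy, thresh=1.0):
    """Very crude "is the robot moving / turning?" signal from a grid of
    per-macroblock motion vectors (dx, dy as 2-D arrays, one entry per
    block). A sketch of a primitive signal, nothing like real odometry."""
    h, w = dx.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # block offsets from the image centre
    rx = xs - (w - 1) / 2.0
    ry = ys - (h - 1) / 2.0
    # mean vector over the frame: translation / pan estimate
    translation = np.array([dx.mean(), dy.mean()])
    # mean z-component of r x v: net circulation about the centre,
    # sign indicates rotation direction
    rotation = (rx * dy - ry * dx).mean()
    moving = np.hypot(translation[0], translation[1]) > thresh
    return translation, rotation, moving
```

Even this would need heavy smoothing over time, since a single person walking through the frame will bias both numbers.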

Cool find though! That’s a really useful set of tools.