Announcing the Hardware Acceleration WG, meeting #1

Hi @vmayoral,

Great news! We are very interested in knowing more about the Hardware Acceleration WG initiatives.
We are hosting the Embedded WG focused on micro-ROS features and use cases, and for sure there will be similar interests within the ecosystem.

Happy to read that the meeting will be recorded!

Thanks @mamerlan, meeting is happening today. I know it’s a bit late in Spain but provided I don’t mess up out of being tired, recordings should be out soon after the meeting. We’ll keep minutes nevertheless :slight_smile: .

As for alignment with the Embedded WG, I’m sure you know I’m quite tuned in to micro-ROS. I look forward to cooperating. Here are some early thoughts on alignment between the HAWG and the EWG:

  • it’d be interesting to run the micro-ROS Client on either the R5s or a soft core (e.g. a MicroBlaze)
  • it’d be interesting to offload the agent into Programmable Logic (the FPGA)
    • which obviously leads to considering pushing part of FastDDS to the PL as well

We’ve got a few more ideas involving other architectures for mixed-criticality and link layers that allow for real-time distributed communications (e.g. TSN).

Thanks everyone for a fantastic first meeting!

We had a successful kick-off of the ROS 2 Hardware Acceleration Working Group where we discussed how we can make ROS 2 faster through FPGAs and GPUs with open technologies, initially focusing on C++ and OpenCL. We registered 27 participants initially and reached more than 30 (I’m being told) during the course of the meeting. This was a first for me in a WG kick-off session. We are very excited and thankful for the interest and the support messages received. Looking forward to continuing to contribute!

Here’s a summary of the resources discussed:

  • Minutes
  • ros-acceleration GitHub organization
  • KV260 kit as reference hardware platform
  • source code contributed:
    • ament_vitis, CMake macros and utilities to include Vitis platform into the ROS 2 build system (ament) and its development flows.
    • Xilinx Runtime (XRT), an open-source standardized software interface that facilitates communication between the application code and the accelerated-kernels
  • and finally, the video recording of the session:

For those of you who don’t have the time to watch the whole piece, see below for a chaptered split:

0:00 Introduction
12:25 Objectives, rationale and hardware reference platforms
16:40 Initial hardware acceleration architecture
22:10 Short demonstration
26:44 Community hardware platforms
29:48 Q&A
43:00 Final remarks

We’ll be following up shortly on a number of the topics discussed over the call. Stay tuned for the next meeting and enjoy summer time :beach_umbrella: :sun_with_face: !

As a quick update, following the community ask, I ordered today an Ultra96-v2 board from Farnell only to find that the lead time is 16 weeks :slightly_frowning_face:!

@Pedro, @jopequ, @Joe_Dinius, do any of you have any advice for me on how to get my hands on one faster?

Looks like it is in stock at Avnet - Avnet: Quality Electronic Components & Services

Thanks @Joe_Dinius, I was hoping to avoid that since I’m in Europe … but I found no better option so that’ll do I guess. Will cancel my order with Farnell and go with Avnet US.

In case someone else bumps into this as well, there also seem to be some delays reported. Hopefully it’s just that:

We got the last batch to Finland from Farnell in December last year so no experience with other sources, sorry :confused:

Ultra96-v2 is a flagship 96Boards product - let me know if you need help so I can forward you to Xilinx and Avnet folks directly.

For full disclosure, I am responsible for the 96Boards specifications, with which the Ultra96-v2 is built to comply.

Yang

@YangZ having someone from Avnet look at this would be awesome. Especially, I think they’d benefit a lot if they could commit some engineering resources :wink: . We at Xilinx are already :slight_smile: doing so.

PM me if you need further details.

Didn’t realise there was a Xilinx fellow here - I know you guys are in :slight_smile:

I have forwarded this thread to Avnet folks just now.

@Pedro, @Joe_Dinius and @jopequ, meeting your previous ask, have a look at acceleration_firmware_kv260. Creating a similar acceleration_firmware_ultra96v2 would be the way to go to support it.

This repo, together with the REP PR, should give you the insights needed to start with it. I’ll make sure to go through the firmware quickly in the next meeting to address possible questions, but feel free to reach out as well.

@vmayoral, any safety ratings for that SOM? I realize this is probably the wrong place to ask.

Hi @Shawn_Schaerer, I’m assuming we’re speaking about IEC 61508. If so, none that I know of for the K26 today that come out of the box, sorry :frowning: .

If you’re pursuing this and would like to consider the K26 SOM, I’m interested and happy to help. K26 may or may not be the best way forward, depending on your criticality level (i.e. SIL-level).

If you’re looking for an immediate solution, you may want to check out the modular safety approach followed with Xilinx’s Zynq 7000 SoCs, which showed a high degree of flexibility across use cases.

Hi @vmayoral, yes I mean IEC61508. I know about the Zynq already. Is that processor being considered for the work you are doing here? We should continue this conversation on a separate thread.

@vmayoral thanks for keeping up with this regarding the ultra96. I believe that because of the many supported mezzanine boards available for this Zynq-based SOM and its relatively low cost, it is a great starting point for many who would participate in this group. I do have a few questions that, hopefully, can be addressed in the next WG meeting in a few weeks’ time. Here’s my first question:

  • With regards to enabling devices connected via mezzanine or carrier board: How does the acceleration firmware concept apply to such cases?

What I am really asking is, based upon the whitepaper, I appreciate that the position you are taking for this working group is raising the level of abstraction for robotics developers to allow for greater flexibility and reducing the level of hardware expertise needed to make meaningful contributions, but it seems to me that some baseline proficiency with tools like Xilinx’s Vivado (or similar) is required to generate the hardware definition files for the target hardware. This leads to my second question:

  • Where is the line drawn with regards to where this WG’s domain begins and the hardware concept design/implementation ends?

From my first reading of the white paper (which I gained much insight from, so thank you), it seems like a lot of emphasis was placed on acceleration kernels and optimization of interprocess, intraprocess, and intra-network communications. What, if any, emphasis will be placed on communication/connection to sensor devices and other peripherals?

I think I will pause my stream of thoughts here and I will revisit them based upon your reply to this thread. Thanks much for continuing support of this initiative, and I look forward to participating in this WG!

Hello @Joe_Dinius,

The next WG session is already packed :slight_smile: but I’ll definitely try and make room for Q&A (if there’s lots of interest we may stay a bit longer, like last time). For now, let me give you some thoughts; we can iterate here or discuss further during the meeting:

I’m sorry, but I’m not really familiar with the 96Boards mezzanine products. Could you clarify where you see the problem? Are these boards enabled via PL-connected pins? If so, do they require a specific Vitis platform and that’s where you see the conflict with the existing firmware repositories (e.g. given an acceleration_firmware_ultra96v2 with a default empty platform, you wonder how to support the Shiratech Bosch Sensor Mezzanine)?

I need a bit more context, but if my assumptions above are right, there’re several ways to implement this capability (i.e. extensions with hardware/carriers that require specific configurations of the PL). The proposed architecture was designed with composability in mind, considering the future use of one or multiple carrier boards. In line with the efforts to simplify hardware aspects for ROSsers, additional hardware can be enabled through its ROS 2 adapter (i.e. a ROS 2 package that enables the hardware).

With today’s architecture, there’re three avenues possible:

Option 1: add various platforms to a single (base board specific) firmware repository

The simplest one that comes to mind is to have more than one platform in the ROS 2 acceleration_firmware_ultra96v2 package, and to switch between them as desired. This could be implemented easily with additional verbs/subverbs in the colcon-acceleration extensions, leading to a flow like:

colcon acceleration select ultra96v2  # pick ultra96v2 firmware
colcon acceleration platform list     # show existing platforms
colcon acceleration platform shiratech-bosch  # pick one example

This tooling does not exist today, but it shouldn’t take long to implement if needed, adding to the existing colcon-acceleration logic.
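
To make the intent of Option 1 a bit more concrete, here is a minimal Python sketch of the bookkeeping such verbs would need: one firmware package holding several platforms, with one of them selected. This is only a toy model and not colcon-acceleration code; the `Firmware` and `Workspace` classes, their method names, and the "default" platform name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Firmware:
    """Hypothetical stand-in for one acceleration_firmware_* package."""
    name: str
    platforms: List[str]
    selected_platform: Optional[str] = None


@dataclass
class Workspace:
    """Tracks the active firmware and which of its platforms is selected."""
    firmwares: Dict[str, Firmware] = field(default_factory=dict)
    active: Optional[str] = None

    def add(self, firmware: Firmware) -> None:
        self.firmwares[firmware.name] = firmware

    def select(self, name: str) -> None:
        # Models picking a firmware, as in the first step of the flow above.
        if name not in self.firmwares:
            raise KeyError(f"unknown firmware: {name}")
        self.active = name

    def platform_list(self) -> List[str]:
        # Models listing the platforms shipped by the active firmware.
        if self.active is None:
            raise RuntimeError("select a firmware first")
        return list(self.firmwares[self.active].platforms)

    def platform_select(self, platform: str) -> None:
        # Models picking one platform; rejects names the firmware does not ship.
        if platform not in self.platform_list():
            raise ValueError(f"{platform} is not shipped by {self.active}")
        self.firmwares[self.active].selected_platform = platform


if __name__ == "__main__":
    ws = Workspace()
    ws.add(Firmware("ultra96v2", ["default", "shiratech-bosch"]))
    ws.select("ultra96v2")
    print(ws.platform_list())
    ws.platform_select("shiratech-bosch")
    print(ws.firmwares["ultra96v2"].selected_platform)
```

The point of keeping the selection per firmware is that switching base boards would not lose the platform choice made for another board.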

Option 2: fork the firmware repository and customize the platform for your hardware

Another way is to simply fork the acceleration_firmware_* package and customize it yourself. This can be done today.

Option 3 (preferred): create a ROS 2 package that enables that hardware in an existing firmware (already deployed)

This is probably the best option since it requires no modifications to existing packages and/or forks; however, it comes at the cost of a bit of additional CMake complexity. In a nutshell, instead of creating (or enhancing) a complete (base board specific) firmware ROS 2 package, a new (carrier-board specific) specialized package would deploy any new required firmware (e.g. .dtbo files) and Vitis platform files (besides the ROS logic applicable to it) for enabling the mezzanine/carrier board hardware configuration you’re looking for.

One great argument for going down this path is that it’s the manufacturer’s responsibility to create/maintain this package. In other words, if they want to sell boards to the ROS community, they should contribute and maintain this. Note that this same concept can also be used without additional hardware, just for new capabilities that require new firmware (e.g. a new FPGA IP addition to the Vitis platform).

Let me illustrate it with an example:
Imagine that you have a CAN carrier board which you’d like to plug into the KV260 in one of its expansion ports (e.g. the PMOD ones). Let’s call this carrier board canboard. By default, the KV260 firmware ROS 2 package will probably not configure those pins appropriately for CAN use with canboard. A ros_canopen_canboard package (I’m just coming up with a name) could extend ros_canopen with additional CMake logic and the KV260 custom platform files. Accordingly, when building the workspace, the CMake logic will ship these new platform files, and any future raw disk image (SD card image) created out of this ROS 2 workspace would use them, enabling the use of canboard. This feature can also be combined with colcon-acceleration extensions to manually select between Vitis platforms and/or hardware configurations.
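
As a rough illustration of the layering idea in this example, the sketch below models the firmware files that end up in an image as a mapping from install path to the package that provides them, with the carrier-board package applied on top of the base firmware. It is a toy model in Python, not the actual CMake logic; the `compose_firmware` function and all file paths are hypothetical placeholders.

```python
from typing import Dict


def compose_firmware(base: Dict[str, str], *overlays: Dict[str, str]) -> Dict[str, str]:
    """Merge file maps (install path -> providing package).

    Overlays are applied in order, so a later package replaces a file that an
    earlier one shipped under the same path and adds any new files.
    """
    image = dict(base)  # copy, so the base firmware description is untouched
    for overlay in overlays:
        image.update(overlay)
    return image


if __name__ == "__main__":
    # Placeholder file names; the real packages decide what they ship.
    kv260 = {
        "firmware/base.dtbo": "acceleration_firmware_kv260",
        "platform/kv260.xpfm": "acceleration_firmware_kv260",
    }
    canboard = {
        "firmware/canboard.dtbo": "ros_canopen_canboard",
        "platform/kv260.xpfm": "ros_canopen_canboard",  # custom platform
    }
    for path, provider in sorted(compose_firmware(kv260, canboard).items()):
        print(f"{path} <- {provider}")
```

The base firmware package stays unmodified in this model; only the composed result changes, which is what makes the approach attractive compared with forking.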

The idea of ros_canopen_canboard could be extended further to support a) multiple boards (not just KV260, but also Ultra96v2, etc → more sales for the manufacturer) and/or b) additional use of the FPGA for improved determinism. E.g. a ros_canopen_canboard_offloaded package could leverage additional resources in the FPGA and offload CAN logic to a soft core running there. This could help guarantee determinism in data link layer (OSI L2) interactions, while interfacing with the usual ros_canopen. There’re lots of possible specializations, but for further variations (in this example, at least), I don’t see them being very useful for the general ROS user.

Yes. To enable new hardware for the first time, you need to have hardware know-how. I don’t see a way around it for now. Most people should nevertheless be just fine using existing default platforms for reference boards, and these should already be ROS-enabled.

The creation of acceleration_firmware_* does indeed require some hardware and embedded experience, but for the most part I hope this will be a vendor-specific effort. Either the silicon OEM or the board manufacturer/vendor should produce this repository and provide support for it (if they want to facilitate a path for the ROS community).

Of course, skilled users are welcome to maintain their own firmware packages and forks. We’ve provided a reference implementation for that purpose.

For the time being, the focus of the WG is on the layers depicted in the architecture diagram. This is what I’m maintaining for now, but it can easily change if people commit to other packages and efforts. It’s a working group after all!

The initial goal was indeed to start with simple kernels and demonstrate how ROS 2 interactions (inter-process, intra-process and intra-network) can be optimized. This should help people jump-start with hardware acceleration, FPGAs and adaptive computing. That said, by no means are we leaving communications aside. Besides the flexibility for I/O that an FPGA provides, the only way to achieve true hard real-time in ROS 2 is to have every single (OSI) layer and software abstraction time-bounded, and FPGAs excel at that too.
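
The "every layer time-bounded" argument can be spelled out with a tiny calculation: the end-to-end worst case is the sum of the per-layer worst cases, so a single layer without a bound leaves the whole path without one. The Python sketch below assumes a hypothetical `end_to_end_bound_us` helper, and the layer names and microsecond values are made up purely for illustration.

```python
from typing import Dict, Optional


def end_to_end_bound_us(layer_bounds_us: Dict[str, Optional[int]]) -> Optional[int]:
    """Sum per-layer worst-case latencies in microseconds.

    A value of None marks a layer with no known worst-case bound.
    """
    if any(bound is None for bound in layer_bounds_us.values()):
        return None  # one unbounded layer leaves the whole path unbounded
    return sum(layer_bounds_us.values())


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    all_bounded = {"link": 50, "network": 120, "middleware": 300, "application": 500}
    one_unbounded = dict(all_bounded, middleware=None)
    print(end_to_end_bound_us(all_bounded))
    print(end_to_end_bound_us(one_unbounded))
```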

I’m sorry, but I’m not really familiar with the 96Boards mezzanine products. Could you clarify where you see the problem? Are these boards enabled via PL-connected pins? If so, do they require a specific Vitis platform and that’s where you see the conflict with the existing firmware repositories (e.g. given an acceleration_firmware_ultra96v2 with a default empty platform, you wonder how to support the Shiratech Bosch Sensor Mezzanine)?

Exactly right: I’m wondering how I would start from an empty acceleration_firmware_ultra96v2 and a board specification to achieve an implementation that uses the PL in my ultra96v2 for sensor communication.

The proposed architecture was designed with composability in mind, considering the future use of one or multiple carrier boards.

This is great! The hardware configuration I am targeting for my use-case uses two mezzanine boards connected to the ultra96v2: the Shiratech sensor mezzanine and the On-Semi dual AR0144 mezzanine.

I really like the following idea you express regarding Option 3 above:

One great argument for going down this path is that it’s the manufacturer’s responsibility to create/maintain this package. In other words, if they want to sell boards to the ROS community, they should contribute and maintain this. Note that this same concept can also be used without additional hardware, just for new capabilities that require new firmware (e.g. a new FPGA IP addition to the Vitis platform).

However, I think it might be a tough sell in the short term for hardware manufacturers. Long-term this makes total sense. For now, though, I am willing to do the work to integrate these specific boards with my ROS2 application as a proof-of-concept of the viability of this model. However, I am in need of guidance as to how to do it. Based on your description above, I believe I would need to set up new ROS2 packages with device tree (e.g. *.dtbo) files as well as Vitis/Vivado hardware definition files (e.g. *.tcl, *.bsp, etc…?). I’m neither an embedded systems nor a Xilinx tooling expert, so I am not entirely sure which files are required here.

I’ll end my comment here, because it is starting to become a monologue. Thanks much for your continued support of this WG and associated proposals. I hope that I have provided adequate context to support some engaging Q+A at the next meeting.

I like the idea of adding the Ultra96-V2 but there is limited support for the On-Semi dual AR0144 mezzanine. From my experience, there is only PetaLinux support for this with Vivado 2020.1 or Vitis AI v1.3. Mario Bergeron on Hackster.io has some nice examples of how to use the Ultra96-V2 and the On-Semi dual AR0144 mezzanine to create a Stereo Face Detection config.
https://www.hackster.io/AlbertaBeef/stereo-face-detection-with-the-dual-camera-mezzanine-8c7baf
I am not sure if ROS2 is supported with the version of PetaLinux used in the example though.

In either case, both the Ultra96-V2 and the Kria K26 or Kria KV260 Vision AI Starter Kit have long lead times if purchased in the US.

Though I’ve only built the (On-Semi Dual AR0144 mezzanine + u96v2) bsp and boot image using 2020.2, it seems that there is support for 2020.1, 2020.2, and 2021.1 on Avnet’s GitHub (check the scripts directory in the tags for make_u96v2_sbc_dualcam.sh).

@JonM, you bring up a very good point regarding compatibility between PetaLinux and ROS2. Is anyone aware of a Bitbake/Yocto recipe for ROS2 and what, if any, versioning restrictions (e.g. Zeus or Thud etc…) there are? Has any consideration been made to using Docker containers with privileged access instead of native installations of ROS2?

Sorry about that @JonM.

meta-ros provides exactly this @Joe_Dinius. Check out the branches to meet whatever Yocto version you’re using.

I had to answer this recently and made a gist for it, here it is: Is it possible to install ROS2 in petalinux? - Zynq UltraScale+ MPSoC · GitHub

For those who would rather avoid building their own rootfs, acceleration_firmware_kv260 addresses exactly this problem and provides a ready-to-use rootfs with various kernels (including a fully preemptible one) and ROS 2 Foxy. Check out the KRS alpha release for more details (note that this only supports Kria SOMs; for other SOMs, one would need to build the corresponding acceleration_firmware_* package).

What do you mean? A container having what exactly? And for what purpose? If it’s just the rootfs, that’s pretty easy to do from acceleration_firmware_kv260.