2020 User Survey
Back in January, Marya and I put together a survey of the ROS community. Our goal was to gather data to help us model what the ROS community looks like and what it needs in terms of documentation. More specifically, we wanted data to guide our documentation development. We put together a brief survey and left it up for about six weeks. All in all it was reasonably successful: we got 116 responses in total. Last week I pulled down the data from the survey (it is still up) and dropped it into an IPython notebook to play with the results. Below you’ll find the raw data as well as a cleaned-up dataset. As stated in the original post, I have removed the free-form text sections and the e-mail addresses for privacy reasons.
If you haven’t already, please take the survey before you read the results. It takes three minutes. If we get a few hundred more responses I’ll do an update.
Open Data and Jupyter Notebook
We have released the data (sans personally identifiable information) in this repository. There are two versions: the raw data and what I call the scrubbed results. The scrubbed results are what was left over after I processed and cleaned up the data, so it will be easier for someone else to do more in-depth analysis. If you want to take a crack at analyzing the data, or see how I put this together, I put my Python Jupyter Notebook up on Google Colaboratory. Be aware that I removed the word cloud code because of the redacted data. Feel free to fork it, tweak it, and post your results. The rest of this post basically follows the notebook, so feel free to follow along in a second tab.
The first step with any data analysis is to clean up the data. Google Sheets returns both a written value and a numerical value for categorical questions. To make things easier to understand and work with, I removed the numerical values and then shortened the strings. For example, “3 (advanced)” became just “advanced.” Similarly, multiple choice questions come as strings, and I wrote a little function to move that data into a Python list of strings.
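As a rough illustration of that cleanup step, here is a minimal sketch using pandas. It is not the notebook's actual code: the column names, the "1 (beginner)" label, the sample rows, and the comma separator for multiple choice answers are all assumptions made for the example.

```python
import re
import pandas as pd


def strip_numeric_prefix(value):
    """Turn '3 (advanced)' into 'advanced'; leave other strings as they are."""
    match = re.fullmatch(r"\s*\d+\s*\((.+)\)\s*", value)
    return match.group(1).strip() if match else value.strip()


def split_choices(value, sep=","):
    """Turn 'Mobile robots, UAVs' into ['Mobile robots', 'UAVs']."""
    if pd.isna(value):
        return []  # an unanswered question becomes an empty list
    return [item.strip() for item in str(value).split(sep) if item.strip()]


# Hypothetical rows; the real survey columns are named differently.
raw = pd.DataFrame({
    "cpp_skill": ["3 (advanced)", "1 (beginner)"],
    "platforms": ["Mobile robots, UAVs", None],
})

clean = pd.DataFrame({
    "cpp_skill": raw["cpp_skill"].map(strip_numeric_prefix),
    "platforms": raw["platforms"].map(split_choices),
})
print(clean)
```

Keeping the multiple choice answers as lists makes it straightforward to count each option later, for example with `clean["platforms"].explode().value_counts()`.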
Skill Level Data
The first thing we asked survey participants to do was to self-report their skill levels on a variety of ROS-related topics such as C++, Python, shell scripting, robotics, etc. You can see this plot below:
There are three things in this plot that I think are interesting:
- The plot either has some sample bias towards advanced users or the ROS community on Discourse are all brilliant and experts on everything except ROS 2.
- Most of the community is still uncomfortable with ROS 2.
- Asking the community about a specific topic, like C++, returns more normally distributed results than asking about a more general topic like “robotics.”
I reached out to Steve Macenski to review this work before I released it. He said one burning question he and others have had is how to appropriately split their time and resources between C++ and Python. He was curious if there was a bias in the community towards one or the other. Since we had all the data ready to go it was trivial to plot the results. I created a self-reported skill matrix for two skills and then normalized it to the total number of respondents. I didn’t look at all the permutations as I don’t think that is valuable, but I did take a look at a few of the most relevant ones.
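One way to build such a two-skill matrix is a normalized cross-tabulation. The sketch below assumes skills are stored as integers from 0 to 3 and uses made-up column names and responses; it is only meant to show the shape of the computation, not the notebook's exact code.

```python
import pandas as pd

levels = [0, 1, 2, 3]  # assumed numeric skill scale, 3 being the most skilled

# Hypothetical self-reported skills for six respondents.
df = pd.DataFrame({
    "cpp":    [3, 3, 2, 1, 3, 0],
    "python": [3, 2, 2, 1, 3, 1],
})

# Count respondents per (cpp, python) pair and divide by the total number
# of respondents, then make sure every skill level appears on both axes.
matrix = (
    pd.crosstab(df["cpp"], df["python"], normalize="all")
    .reindex(index=levels, columns=levels, fill_value=0.0)
)
print(matrix)

# For a heat map with "low skill at both" in the lower left, the rows can be
# flipped before plotting, e.g. matrix.iloc[::-1].
```

Because the matrix is normalized over all respondents, its cells sum to one, so matrices for different skill pairs can be compared on the same color scale.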
The way you read these plots is that the lower left is less skilled at both aspects, while the top right is highly skilled at both aspects. The color indicates the number of respondents. The first thing that jumps out to me for most of these plots is the sample bias; that is to say, the top right corner is almost always the brightest square. As to the C++ versus Python question, the answer seems to be that most individuals who responded are roughly equally skilled at both. This pattern holds for most of the other skill comparisons I looked at. Another way to interpret this data is to say that perhaps being well skilled with ROS 1 requires mastery of, or at least proficiency with, C++, Python, shell scripting, and robotics and software engineering fundamentals.
The lone exception in all of the plots below is the ROS 1 versus ROS 2 skill level plot. The plot indicates that even the most skilled ROS 1 users are still having trouble mastering ROS 2.
Utility of Different Types of Documentation
The next thing I wanted to look at was the utility of various documentation approaches for the community. My hypothesis was that more experienced developers would prefer straight documentation or perhaps concept descriptions, while less experienced developers would prefer more videos.
What this raw data indicates on the balance is that the community prefers tutorials and guides the most, cheat sheets and videos the least, with concept articles and quick start guides somewhere in between.
Robot Platforms and User Roles
Next we wanted to understand what the community looks like in terms of their role in the world. How many people are using ROS professionally versus in research or academia? I often hear people claim that ROS and Gazebo are academic or hobbyist tools, but that claim is not supported by the data. Nearly half of the ROS community surveyed are professionals.
Another question we wanted to answer was, “what kind of robots are used by the community?” What was unclear to us is if we should continue to devote the majority of our tutorials to mobile robotics, or if things like manipulation and UAVs needed more representation in the documentation.
The data we collected indicates that mobile robots and autonomous vehicles make up nearly 50% of the community interest, with industrial applications in the mix too. While UAVs and aquatic vehicles are important, they make up only a small percentage of the community. Similarly, application specific domains like medical robots, and agriculture robots are still small niche domains.
ROS User Educational Background
The next thing we wanted to understand was the educational background of the individuals using ROS. Specifically, we wanted to understand the balance between professional software engineers / computer science graduates, and other disciplines like mechanical engineering and electrical engineering.
The data indicates that about half the community consists of software engineers with the balance coming from other engineering disciplines. This data is interesting with respect to the self reported skill levels being so high.
How Do Users Perceive ROS?
The next thing we wanted to examine is how participants wanted to use ROS 2. One theme we’ve heard over and over again is that professionals would like to move in the direction of robots consisting of a variety of turn-key packages.
This seems to be supported by the data with most participants wanting to do some degree of customization. Notably in this plot only 22% of respondents want to contribute code back into ROS 2.
Simulation, Useful or Nah?
Finally, we wanted to understand how the community uses ROS and Gazebo, whether they use them separately or in conjunction. I’ve built robots without simulation components other than SDF and URDF, and we have a number of ongoing events like ARIAC and DARPA SubT that are completely simulation based.
On balance, the ROS community is building simulations for their physical robots, with those working with just pure ROS or pure Gazebo simulations making up the minority of respondents.
The survey also contained two questions where we asked the community what they’re doing with ROS and what they plan to do with ROS. Unfortunately, we decided to hold this data back so users would feel comfortable talking about their top-secret plans. To look at this data I first concatenated the two datasets and stripped out the 100 most common English words, and a few other terms that came up a lot (like ROS, ROS 2, Gazebo, etc.). From this data I generated a word cloud.
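Since the free-form answers are not released, the word cloud code is not in the public notebook. The sketch below shows one way the preprocessing could look, using only the Python standard library. The sample answers and the short stop-word set are invented stand-ins for the withheld responses and for the list of common English words.

```python
import re
from collections import Counter


def word_frequencies(answers, stop_words):
    """Count the words in a list of free-form answers, skipping stop words."""
    counts = Counter()
    for answer in answers:
        for word in re.findall(r"[a-z0-9']+", answer.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts


# Hypothetical answers standing in for the two withheld survey questions.
current_use = ["Navigation for a mobile robot with ROS"]
planned_use = ["Port the navigation stack to ROS 2"]

# Stand-in stop list: common English words plus terms like ROS and Gazebo.
# "ROS 2" is tokenized as "ros" and "2", so both tokens are listed.
stop_words = {"for", "a", "with", "the", "to", "ros", "2", "gazebo"}

freqs = word_frequencies(current_use + planned_use, stop_words)
print(freqs.most_common(3))
```

The resulting counts can then be handed to whatever word cloud renderer you prefer.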
Deeper Dive on Skills
One of the topics that comes up a lot when we write documentation is who the audience is and how best we can address their needs. I was curious if different documentation approaches work better or worse for different types of users. To try and figure this out I split the data based on the reported skill level by calculating the average of all the self-reported skills, excluding the ROS 2 question. I did a quick plot and chose arbitrary cutoffs to separate the raw data into three cohorts: low skill, mid skill, and high skill. This data is in a column called “skill_score” in the scrubbed dataset. Skill scores varied from 0 to 3, and our arbitrary cutoffs were 1.75 and 2.75, giving us 23 respondents in the low skill set, 72 in the mid skill set, and 21 in the high skill set. I then replotted the data for these three cohorts to see how different skill levels might require different documentation types, and how the cohort’s skill level might change the platform they prefer working on. I would take all this data with a grain of salt as the number of respondents is still on the low side.
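The cohort split described above can be sketched with pandas as follows. The skill column names and the three sample respondents are hypothetical, and how a score that lands exactly on a cutoff is binned is an assumption (here it falls into the lower cohort); only the “skill_score” column name and the 1.75 and 2.75 cutoffs come from the analysis itself.

```python
import pandas as pd

# Hypothetical skill columns; the ROS 2 column is deliberately left out.
skill_columns = ["cpp", "python", "shell", "robotics", "ros1"]

df = pd.DataFrame({
    "cpp":      [1, 2, 3],
    "python":   [1, 3, 3],
    "shell":    [0, 2, 3],
    "robotics": [2, 2, 3],
    "ros1":     [1, 3, 3],
    "ros2":     [0, 1, 1],  # excluded from the average
})

# Average of the self-reported skills, excluding ROS 2.
df["skill_score"] = df[skill_columns].mean(axis=1)

# Cutoffs of 1.75 and 2.75 on a 0 to 3 scale. With these settings the bins
# are [0, 1.75], (1.75, 2.75], and (2.75, 3].
df["cohort"] = pd.cut(
    df["skill_score"],
    bins=[0, 1.75, 2.75, 3],
    labels=["low", "mid", "high"],
    include_lowest=True,
)
print(df[["skill_score", "cohort"]])
```

Once the cohort column exists, any of the earlier plots can be repeated per cohort with `df.groupby("cohort")`.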
Skill Level and Documentation Utility
The first question we wanted to answer is whether certain kinds of documentation work better for different kinds of users. I have often heard it said that students prefer videos. I could believe this, as my best educational moments were standing behind a senior engineer while they worked. In my mind, videos of this nature are the next best thing. I was curious if less skilled users felt the same way.
From these three plots a few things jump out:
- Tutorials are consistently and overwhelmingly preferred.
- Guides, quick start documents, concept articles are helpful but less desirable.
- No one likes cheat sheets.
- Less skilled users like videos more than middle and high skill users but they aren’t nearly as useful as tutorials.
Skill Level and Platform
One thing we wanted to understand is what robotics platforms people were interested in when using ROS. We know that by and large the bulk of community interest is in autonomous mobile robots, but we were unsure if that was uniform among users. We were curious if different skill levels showed any preference for one platform or another. Perhaps less skilled users were more interested in drones because they are generally easier to come by.
Surprisingly, less skilled ROS developers show a slight preference for industrial arms. More experienced developers seem to be working on autonomous vehicles or mobile robots. This is potentially useful information, as it means that topics like TF2 and MoveIt might need to be introduced earlier.
Skill Level and Education
Just as a sanity check we wanted to plot general skill level versus education type. We wanted to understand if we have individuals reporting that they are less skilled because they have less experience in general or because they have less software engineering skill.
It appears that our thesis that less skilled users come from non-software backgrounds is generally supported by the data. This is helpful to know, and would make a great start for a user persona.
Just because it was trivial to do, we generated word clouds for our three skill groups. This uses the same data as before, namely what users use ROS for or what they intend to use ROS for, with the top 1000 English words removed. The thing that jumps out at me from these clouds is that the less skilled cloud is fairly emphatic, “beginner robot content!”, while the more skilled individuals discuss more nuanced needs like “security”, “manipulation”, “DDS”, and “Quality.”
Educators/Students/Hobbyists vs. Professional Users
The last thing we wanted to examine is the difference in utility of different documentation types between “professional users” and everyone else (educators, researchers, hobbyists, etc). Not to imply that researchers and hobbyists aren’t professional, but we wanted to see if there was a difference between the two. We split the data set in two according to self reporting. There were 75 individuals who reported being professionals versus 41 non-professionals. Looking at the utility graphs they are strikingly similar. We repeated a number of other experiments, but they are all reasonably similar between the two groups so we’re going to omit them here. This leads us to believe that “skill level” and educational background have more impact on the documentation needs of individuals than their role in the world.
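A comparison like this can be sketched as a simple two-way split followed by a per-group average. The role labels, the documentation columns, and the rating values below are all invented for illustration; the real survey uses its own question wording and scale.

```python
import pandas as pd

# Hypothetical roles and utility ratings for two documentation types.
df = pd.DataFrame({
    "role":      ["professional", "student", "professional", "hobbyist"],
    "tutorials": [3, 3, 2, 3],
    "videos":    [1, 2, 1, 2],
})

# Split according to self-reported role: professionals versus everyone else.
df["is_professional"] = df["role"].eq("professional")

# Mean utility rating per documentation type for each group.
comparison = df.groupby("is_professional")[["tutorials", "videos"]].mean()
print(comparison)
```

If the two rows of `comparison` look alike, as they did in our utility graphs, the role split is probably not the most informative way to segment the audience.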
This data is interesting and useful, but I would take it with a grain of salt. The overall survey is biased towards the individuals on ROS Discourse. Thankfully, the analysis here is fairly automated, so we can continue to collect data and learn from it. We would love to have more students and educators take the survey so we can broaden the sample.
As I said earlier we’re not releasing the written part of the survey but I went through and re-read it after doing all of this. Some of the topics that came up multiple times or caught my eye were as follows:
- Beginner tutorials, beginner tutorials, beginner tutorials.
- Navigation. Everyone wants to know how to do navigation.
- “Best Practices.”
- ROS 1 to ROS 2 conversion
- Beginner “mechanical engineering graduate student”-level introductions to the topics required to learn ROS 2
- Tutorials related to DDS/QoS configurations, with particular focus on high bandwidth sensors.
- Build systems and build tool descriptions and tutorials. Along with this there were a number of requests for IDE based tutorials.
I would love to hear what the community thinks. Does this data jibe with your personal experiences? What would you like to see in terms of ROS 2 documentation?