Voice recognition module for ROS2 Foxy

Hi,
Are there any packages for voice recognition in ROS2 Foxy?
Preferably one that works offline, like PocketSphinx.

Thanks,
Ajay

I’m not sure about anything ROS2-specific, but there’s Mozilla DeepSpeech, which should run offline:

Audio support in ROS is pretty thin on the ground, but there are a few packages that will bind GStreamer pipelines.

This ROS1 package looks like it would be pretty easy to port to ROS2:
http://wiki.ros.org/pocketsphinx
Otherwise, a riff on gscam might be doable; the pocketsphinx pipeline is pretty simple:
gst-launch-1.0 autoaudiosrc ! audioconvert ! audioresample ! pocketsphinx ! fdsink fd=1
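
If you just want the recognized text inside a Python process before any ROS2 port exists, a rough sketch is to run that same pipeline as a subprocess and read what fdsink writes to stdout. This assumes the pocketsphinx element emits one hypothesis per line, which I have not verified; the helper names (build_command, iter_transcripts) are made up for illustration, and -q is only there to keep gst-launch's own progress messages out of the stream.

```python
import shutil
import subprocess
import sys
from typing import Iterable, Iterator, List

# Element chain copied from the gst-launch command above.
PIPELINE = "autoaudiosrc ! audioconvert ! audioresample ! pocketsphinx ! fdsink fd=1"


def build_command(pipeline: str = PIPELINE) -> List[str]:
    """Split the pipeline description into an argv list for gst-launch-1.0."""
    # -q (quiet) suppresses gst-launch's progress output on stdout.
    return ["gst-launch-1.0", "-q", *pipeline.split()]


def iter_transcripts(lines: Iterable[str]) -> Iterator[str]:
    """Yield non-empty, whitespace-stripped lines (assumed: one hypothesis per line)."""
    for line in lines:
        text = line.strip()
        if text:
            yield text


def main() -> None:
    if shutil.which("gst-launch-1.0") is None:
        print("gst-launch-1.0 not found on PATH", file=sys.stderr)
        return
    # text=True decodes stdout so we can iterate over it line by line.
    with subprocess.Popen(build_command(), stdout=subprocess.PIPE, text=True) as proc:
        try:
            for text in iter_transcripts(proc.stdout):
                print(f"heard: {text}")
        except KeyboardInterrupt:
            proc.terminate()


if __name__ == "__main__":
    main()
```

From there, each string could be handed to whatever publisher you end up writing.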

I’d love to plug my own ros-gstreamer package, but it doesn’t support string payloads yet.


Hi @Ajaykumaar_S,

Not offline. However, at AWS we have built a number of ROS 2 packages to help integrate audio recognition and speech generation with AWS cloud services like Amazon Lex and Polly. You can find them here:

Regards,
Cam


I just got pocketsphinx running with ROS2 and discovered it doesn’t like my accent.
Fortunately, pocketsphinx is not the only speech-to-text package that has GStreamer bindings.

It’s easiest to launch from bash, but that repo also has tools to launch it from roslaunch.

gst-launch-1.0 --gst-plugin-path=install/gst_bridge/lib/gst_bridge/ autoaudiosrc ! audioconvert ! audioresample ! pocketsphinx ! queue ! rostextsink
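
On the receiving side, something like the rclpy listener below should be enough to see the text arrive. Treat it as a sketch under assumptions: I am assuming rostextsink publishes std_msgs/String, and the topic name "recognized_text" and node name "speech_listener" are placeholders, so check the actual topic with ros2 topic list first. clean_utterance is just an illustrative helper.

```python
TOPIC = "recognized_text"  # hypothetical; use the topic rostextsink actually publishes on


def clean_utterance(raw: str) -> str:
    """Collapse runs of whitespace and lowercase a recognized phrase."""
    return " ".join(raw.split()).lower()


def main() -> None:
    try:
        import rclpy
        from std_msgs.msg import String  # assumed message type
    except ImportError:
        print("rclpy not importable; source your ROS 2 setup file first")
        return

    rclpy.init()
    node = rclpy.create_node("speech_listener")  # hypothetical node name

    def on_text(msg) -> None:
        # Log every phrase received from the speech pipeline.
        node.get_logger().info(f"heard: {clean_utterance(msg.data)}")

    node.create_subscription(String, TOPIC, on_text, 10)
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass
    finally:
        node.destroy_node()
        rclpy.shutdown()


if __name__ == "__main__":
    main()
```

Run it in a second terminal with the workspace sourced while the pipeline above is running.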

Another option would be the voice assistant Jaco:

It can run completely offline on most Linux computers, even on a Raspberry Pi, and supports multiple languages. For a robot project of mine I created an interface skill that lets you control the assistant over ROS2 topics. You can find it as a demo skill here.

If you’re only interested in speech-to-text without NLU or TTS, the STT module Scribosermo can also be used as a standalone model.
