Improving Search for ROS Docs - Technical

Topic

This continues the discussion from: this ticket

My findings so far

Pagefind:

Pagefind is a fully static search library that aims to perform well on large sites, while using as little of your users’ bandwidth as possible, and without hosting any infrastructure.

Pagefind runs after Hugo, Eleventy, Jekyll, Next, Astro, SvelteKit, or any other website framework. The installation process is always the same: Pagefind only requires a folder containing the built static files of your website, so in most cases no configuration is needed to get started.

Nextra

I don’t know how they implement search, but this is what I found:

They support Mermaid and LaTeX.

A project that faced the same problem (an unsatisfactory search experience in Sphinx) not long ago is the Zephyr RTOS. They recently settled on Google’s “Programmable Search Engine” offering, with a fallback/toggle to the built-in search.
They discarded Algolia in the past, although as far as I can tell only because they did not want a third-party tool. That stance seems to have changed, since no satisfactory solution without a third party was found.

The PR implementing this with a bunch of discussion attached can be found here: doc: Add Google Programmable Search Engine by kartben · Pull Request #60018 · zephyrproject-rtos/zephyr · GitHub

The key differences seem to be:

  • Zephyr has a lot of autogenerated docs (config options, API, etc.) that clutter the search results and are usually not the best match, so the need for improved search was probably greater there than for ROS.
  • Zephyr seems to be well funded enough to pay for the Google service; I don’t know how ROS infrastructure costs are handled right now.

This still costs them something to run, right? I think Nextra, Starlight, and Pagefind are the options we have. @kscottz has been pretty clear about why paid solutions won’t work.


I use Meilisearch on my site.
It’s open source and free to self-host from its Dockerfile, and a shared VPS costs around $4 on Hetzner. I push the documents to Meilisearch, and searching the website content is as fast as with Algolia. It’s not THE best (Algolia is, but it costs quite a lot), yet it’s in my top 3.
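
For anyone wondering what “pushing the documents” could look like, here is a rough sketch of the two HTTP requests involved, based on Meilisearch’s REST routes for adding documents and searching an index. The `DocPage` shape, the index name, the host, and the helper names are all made up for illustration and not taken from any real ROS setup; the helpers only build the request, so nothing is sent until you hand it to an HTTP client.

```typescript
// Hypothetical shape of one docs page; the field names are illustrative only.
interface DocPage {
  id: string;
  title: string;
  content: string;
}

// Plain description of an HTTP request, independent of any HTTP client.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a JSON POST request authenticated with a Meilisearch API key.
function jsonPost(url: string, apiKey: string, payload: unknown): HttpRequest {
  return {
    url,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(payload),
  };
}

// POST /indexes/{index_uid}/documents adds or replaces documents in an index.
function addDocumentsRequest(
  host: string,
  indexUid: string,
  apiKey: string,
  pages: DocPage[],
): HttpRequest {
  const url = `${host}/indexes/${encodeURIComponent(indexUid)}/documents`;
  return jsonPost(url, apiKey, pages);
}

// POST /indexes/{index_uid}/search runs a query; `q` is the search string.
function searchRequest(
  host: string,
  indexUid: string,
  apiKey: string,
  query: string,
  limit: number = 5,
): HttpRequest {
  const url = `${host}/indexes/${encodeURIComponent(indexUid)}/search`;
  return jsonPost(url, apiKey, { q: query, limit });
}
```

With a local instance (assumed here to listen on `http://localhost:7700`), each request object can be passed to `fetch(req.url, req)` or any other HTTP client.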

I did some research of my own:

My top picks (open source search engines):

  • Solr (Apache)
  • Lucene (Apache)
  • Meilisearch
  • Typesense
  • Elasticsearch

The catch is that this would be a full-fledged search engine and not a “search integration”.

Should we go the search engine way?

Are people using the search in the official docs?
I normally first try to find the section in the docs most likely to contain the answer I’m looking for. Next I try a Google search, which usually leads to answers on ROS Discourse, ROS Answers/Exchange, ROS Index, or some external blog post.
It would be nice to have good native search in the official docs, but is this a higher priority than, say, better tutorials?


The search engine way looks like Yahoo from the 2000s, IMHO.
Meili is faster than Elastic, Solr, and Lucene. It just works, and it’s typo tolerant.
Just throw in a command palette like https://cmdk.paco.me/ with input debounce and let it query Meili.
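
In case “input debounce” is unfamiliar: the idea is to wait until the user pauses typing before sending a query, so the search backend is not hit on every keystroke. Below is a minimal sketch of a generic debounce helper. It is not taken from cmdk or Meilisearch, and the `debounce` name and `waitMs` parameter are just illustrative choices.

```typescript
// Returns a wrapper that delays calling `fn` until `waitMs` milliseconds
// have passed without another call. Only the most recent arguments are used.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A): void => {
    // A new call cancels the pending one, so earlier keystrokes are dropped.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

A palette’s input handler would then be wrapped once, for example `const onInput = debounce(runSearch, 200)`, where `runSearch` is a hypothetical function that sends the query to the search backend and the 200 ms delay is an arbitrary starting point.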
