
Solr vs Elasticsearch vs Vespa – what did we learn at The Great Search Engine Debate?

One of the most common questions we’re asked at OSC is ‘which search engine should I choose, Elasticsearch or Solr?’ Our web logs show consistently high traffic from Google queries like ‘solr vs elasticsearch’ – although, as we wrote some years ago, the answer doesn’t matter as much as people think it might. With recent changes to Elasticsearch’s license, this question has become even more popular.

To help answer it, last week we hosted what turned out to be our most popular Haystack LIVE! Meetup, with over 240 people signing up and over 150 attending on the day. We brought together three experts from the community to pitch their favourite search engine and answer questions:

  • Anshum Gupta, Software Engineer – Search @ Apple, Apache Lucene/Solr committer and VP of Apache Lucene, speaking for Solr
  • Josh Devins, Senior Principal Engineer, Machine Learning at Elastic, talking about Elasticsearch
  • Jo Kristian Bergum, Senior Principal Software Engineer at Verizon Media working on Vespa

You may not have heard of Vespa, but it’s a rising star and has been around for longer than you might think. The team working on it have long experience of search technology, going back to the days of FAST Search and Transfer, which has an interesting history (some of their technology eventually made it into SharePoint when Microsoft bought the company). The Vespa team work for Verizon Media and their search engine powers a lot of Verizon properties such as Yahoo, as well as the dating service OkCupid. As Jo Kristian explained, Vespa has the advantage over the Lucene-based Elasticsearch of native support for ‘modern’ search features like vectors. However, the Vespa community is still small and adoption outside Verizon is still in the early stages.

Anshum talked about how Solr’s Apache Project status means that no one company has control of the project, so its feature roadmap and future development are entirely guided by its users. Solr is the oldest search engine project in our list and in the recent past has had to play catch-up to some degree in the area of scalability – but nowadays it is used for many massive-scale applications, and its pluggable, extensible architecture means that there is a vast range of integrations available with other systems. Its community is also extensive and in general it is a well-understood and reliable option.

Josh talked about how Elasticsearch development focused initially on ease of use, particularly at scale and on the cloud, and how, although it is often focused on analytics applications, it can also be a powerful text search engine. He described how Elasticsearch is leveraging new developments in the Lucene core (which also powers Solr) to add new features such as vector search, and how Elastic have concentrated on building a strong community.

All our speakers were keen to stress that they appreciated the efforts of the other teams and that they were often inspired by the innovations others had made. Following a brief pitch, they were kind enough to answer a wide range of questions from our audience, and you can view the whole debate below. Thanks again to Jo Kristian, Anshum and Josh and to all who came. I don’t think we ended up with a definitive answer to the eternal question of ‘which search engine should I use?’, but we did find out a lot about the differences between these three choices. We’ll be back with another Haystack LIVE! Meetup soon!

At OSC we specialise in Solr and Elasticsearch – but we’re also looking hard at Vespa and thinking about how we might support those considering this technology, perhaps starting with training. If you need help and advice around your search engine choice, or migrating to/from one of the three options above, get in touch!

Picture from Keynote Vectors by Vecteezy (https://www.vecteezy.com/free-vector/keynote)