Vespa vs Lucene: First Impressions

As we learn more about Vespa, we wanted to give our initial impressions when comparing to Lucene-based search (Solr/Elasticsearch). This is based on initial passes with Vespa and our long history with Lucene-based search. Please get in touch to let us know what we’re getting wrong. In fact we’re pretty much writing this so that we can be corrected – Vespa is a beast and will take time to learn 🙂

1. Tensors Are Awesome! But they aren’t magic…

Vespa gives you a framework for storing tensors and referencing them at query time. Coming from learning to rank (machine learning for relevance ranking) in Solr and Elasticsearch, what threw me was my assumption that tensors were a model representation from some standard ranking library. Elasticsearch learning to rank, for example, accepts a number of model representations from xgboost or ranklib. Tensors, however, are just multidimensional data that mean whatever you want them to mean. You can use them for anything. They can represent a ranking model, perhaps: you can then use Vespa’s built-in tensor/math operators to interpret a tensor as a model (Vespa’s docs have a neural network ranking model example).

But tensors can mean many things, and to me this is Vespa’s power as a ranking engine. They can be a user-item collaborative filtering matrix. They can be some kind of topic model. Tensors seem to be a really powerful feature that the Lucene ecosystem lacks: a way to store arbitrary multidimensional data and use it in ranking. For me, this is Vespa’s main selling point when compared to the Lucene search engines.
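To make the "tensors mean whatever you want" idea concrete, here is a minimal plain-Python sketch of the collaborative filtering case. It is not Vespa code: the latent-factor vectors, the document ids, and the helper names `dot` and `rank_by_affinity` are all made up for illustration, and a real deployment would express the same dot product with the engine's own tensor operators.

```python
# Hypothetical latent-factor vectors, as might come out of a
# collaborative filtering job. None of these numbers are real data.
user_profile = [0.9, 0.1, 0.4]

item_factors = {
    "doc1": [1.0, 0.0, 0.5],
    "doc2": [0.0, 1.0, 0.0],
    "doc3": [0.5, 0.5, 0.5],
}


def dot(a, b):
    """Dot product of two equal-length vectors (rank-1 tensors)."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_affinity(user, items):
    """Score every item against the user vector, highest score first."""
    scores = {doc_id: dot(user, vec) for doc_id, vec in items.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


ranking = rank_by_affinity(user_profile, item_factors)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

The same stored numbers could just as well be read as the weights of a one-layer model or as topic loadings. The interpretation lives entirely in the math you apply to them.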

2. Nice concise ranking syntax

I appreciate the syntax behind ranking. Solr and Elasticsearch focus on expressing text-or-math queries, then combining them additively with boolean logic or multiplicatively with function queries. The “text” queries give a lot of fine-grained control over what constitutes a “match” before the score gets injected. That being said, you have to understand and take apart how queries are combined with boolean/other operators to piece together the underlying ranking function. For example, in Solr, what’s the math behind

q={!edismax mm=2 qf='text title'}foo bat

Or giant, verbose, awkward Elasticsearch queries:

    {
        "query": {
            "multi_match": {
                "type": "cross_fields",
                "query": "foo bat",
                "fields": ["text^20", "title^2"]
            }
        }
    }

The ranking syntax in Vespa is just math:

    first-phase {
        expression: nativeRank + query(deservesFreshness) * freshness(timestamp)
    }
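To show how directly that expression reads as arithmetic, here is a hedged Python sketch of the same formula. The `freshness` function below is an assumption for illustration only: it decays linearly from 1 to 0 over a hypothetical 90-day window, and Vespa's actual feature may be defined differently. `native_rank` stands in for whatever text-match score the engine produces.

```python
SECONDS_PER_DAY = 86400
# Assumption for this sketch: documents older than 90 days get no freshness.
MAX_AGE_SECONDS = 90 * SECONDS_PER_DAY


def freshness(doc_timestamp, now, max_age=MAX_AGE_SECONDS):
    """Linear decay: 1.0 for a brand-new document, 0.0 at max_age or older."""
    age = max(now - doc_timestamp, 0)
    return max(1.0 - age / max_age, 0.0)


def first_phase(native_rank, deserves_freshness, doc_timestamp, now):
    """Text score plus a query-supplied weight times document freshness."""
    return native_rank + deserves_freshness * freshness(doc_timestamp, now)


now = 1_000_000_000  # arbitrary "current time" in epoch seconds
forty_five_days_ago = now - 45 * SECONDS_PER_DAY

# A news-like query passes a high weight, an evergreen query passes zero.
news_score = first_phase(0.3, 2.0, forty_five_days_ago, now)
evergreen_score = first_phase(0.3, 0.0, forty_five_days_ago, now)
print(news_score, evergreen_score)
```

The query-side value (`deserves_freshness` here) is what lets the same rank profile behave differently per query without rewriting the query itself.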

I can see a lot of appeal to this syntax compared to the more Perl-like Solr syntax and the overly verbose Elasticsearch syntax. You can also define your own macros and Java searchers to expand this syntax further.

To me this isn’t quite an issue of power, but of usability. I already know that Lucene-based queries come with an immense level of control and power. Solr and Elasticsearch let you inject scripts or arbitrary math into your ranking. But Vespa “out of the box” tells you ‘this is just math’ and you don’t think about boolean logic or anything.

The big unknown will be how “down and dirty” you can get with Vespa. Solr/Elasticsearch give you a great deal of control, including writing Lucene query primitives in a plugin and exposing them as a custom query component.

3. Lucene has more customizable text analytics

One big question I had about Vespa was the degree to which you could configure text tokenization and analysis. So much of the relevance work we do is feature modeling by manipulating terms, expanding terms, etc. Lucene comes with a bevy of libraries for combining tokenizers and token filters to do things like entity extraction or turning tokens into a minhash. You can use this framework to take apart either query strings or document text. You can choose to treat name matching differently from text matching, for example. Or, as my colleague discusses in his Lucene Revolution talk, you can perform query-time semantic expansion.

Primarily, what Vespa seems to give you is the ability to add customizable linguistics. You can turn features like stemming on or off. Synonyms seem to live in an altogether different piece of functionality in YQL known as EQUIV, separate from “linguistics.” But I don’t yet see the same depth of configurability of analysis chains that you find in Lucene-based search.
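For readers who have not worked with Lucene analysis, the "chain" idea can be sketched in a few lines of Python. This is not Lucene's API: `build_analyzer`, the tokenizer, and the filters below are hypothetical stand-ins that only illustrate how a tokenizer and a sequence of token filters compose, and how different fields can get different chains.

```python
import re


def word_tokenizer(text):
    """Split text into word tokens on non-word characters."""
    return re.findall(r"\w+", text)


def keyword_tokenizer(text):
    """Treat the whole (trimmed) input as a single token, e.g. for names."""
    return [text.strip()]


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def synonym_filter(synonyms):
    """Build a filter that emits each token followed by its synonyms."""
    def apply(tokens):
        out = []
        for t in tokens:
            out.append(t)
            out.extend(synonyms.get(t, []))
        return out
    return apply


def build_analyzer(tokenizer, *filters):
    """Compose one tokenizer and any number of token filters, in order."""
    def analyze(text):
        tokens = tokenizer(text)
        for token_filter in filters:
            tokens = token_filter(tokens)
        return tokens
    return analyze


# Two fields, two different chains (the synonym map is made up).
text_analyzer = build_analyzer(
    word_tokenizer, lowercase_filter, synonym_filter({"tv": ["television"]})
)
name_analyzer = build_analyzer(keyword_tokenizer, lowercase_filter)

print(text_analyzer("Cheap TV stands"))
print(name_analyzer(" Doug Turnbull "))
```

Much day-to-day relevance tuning in Solr and Elasticsearch amounts to swapping pieces in and out of chains like these, per field and separately for index time and query time.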

4. Streaming search

Streaming search is a handy piece of functionality that comes with Vespa. Instead of building an inverted index, streaming search lets you identify a small subset of the data and run “grep-like” searches over it. A common limitation of inverted indexes is that in-word (substring) searches can be more complex and slower than full-term searches. Because an inverted index is compact and the full text data is not, streaming search can only work on a small subset of the whole corpus. So it’s appropriate for a small side-corpus (perhaps a set of phrases for autocomplete?).
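The contrast between the two approaches can be sketched in plain Python. This is a toy illustration, not how Vespa or Lucene are implemented: the three documents and the function names are made up, and the "streaming" side is simply a linear scan over raw text.

```python
from collections import defaultdict

# Tiny made-up corpus.
docs = {
    1: "vespa streaming search",
    2: "lucene inverted index",
    3: "searching with vespa",
}


def build_inverted_index(documents):
    """Map each whole term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def term_lookup(index, term):
    """Fast exact-term lookup; knows nothing about fragments of terms."""
    return sorted(index.get(term, set()))


def streaming_substring_search(documents, fragment):
    """Grep-like scan: reads every document, but matches inside words."""
    return sorted(
        doc_id for doc_id, text in documents.items() if fragment in text
    )


index = build_inverted_index(docs)

print(term_lookup(index, "search"))              # whole-term match only
print(term_lookup(index, "earch"))               # fragment: no index entry
print(streaming_substring_search(docs, "earch"))  # scan finds both words
```

The scan touches the full text of every candidate document on every query, which is why it only makes sense once the candidate set has been narrowed to something small.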

5. Community

It goes without saying that it will take time for Vespa to grow the same level of community support and infrastructure as Solr and Elasticsearch. You can get hosted Solr/Elasticsearch from half a dozen cloud providers. There are client libraries, plugins, developers you can hire, meetups, and mindshare around Lucene that have developed over two decades.

Vespa isn’t there yet, but that doesn’t mean it can’t get there! In some ways now is the time to get in on the ground floor. I’m sure community support will build, and that’s probably a big part of the reason Vespa was open sourced!

Tentative Initial Conclusions

I’m not sure it’s fair to say that Vespa is like Elasticsearch but 100 times better. They are comparable in power: Vespa is a tremendous piece of technology, but so is Lucene. It’s fantastic to see a major contender to Lucene come out in the search space. Seeing a piece of functionality like tensors makes the corresponding limitations of Lucene visible.

My initial thoughts are that Vespa could be great for very large-scale search where you can’t do much index-time configuration of text analysis. That means cases where there’s enough training data that technologies like learning to rank or recommenders make sense, where most of your ranking work is machine learning around user behavior, and where you do little or no baseline relevance tuning or text analysis. You’re going to customize around tensors and their interpretation, not around text analysis or boosting. I’m not convinced Vespa is the best solution for the “broad middle class” of search problems that benefit from Solr/Elasticsearch’s customizability around text matching. And I’m not convinced Elasticsearch/Solr are necessarily bad for large-scale search use cases.

Of course Solr and Elasticsearch are catching up! Solr and Elasticsearch have a learning to rank story by directly accepting models generated by common learning to rank libraries. They also come with infrastructure for feature logging and management needed for learning to rank. But I would love to see something as flexible as tensor manipulation in Lucene search!

Again, this is a working document based on our first impressions. We’d love to hear from you if you have a different perspective!