What's up with multi-term synonyms in Solr?

Joseph Lawson, June 23, 2016

Several questions about multi-term synonyms have come up on the Solr mailing lists, and a few notable answers follow. The short version: it’s complicated, and every use case has different considerations. Doh!

As an aside, I’ve been giving hon-lucene-synonyms some love since December. I got it working on Solr 5.3.1 and Solr 6.0.0 but neglected the documentation. The latest release of hon-lucene-synonyms included a number of namespace changes that weren’t completely reflected in the README.md, so there has been some confusion about how to get the plugin running. The hon-lucene-synonyms README.md is now up to date and explains how to get the plugin working in Solr 6.0.0.

Doug Turnbull said Re: Solutions for Multi-word Synonyms,

Honestly half the time I run into this problem, I end up creating a
QParserPlugin because I need to do something specific. With a QParserPlugin
I can run whatever analysis, slicing and dicing of the query string to
manually construct whatever I need to


One thing I often do is repeat the functionality of Elasticsearch's match
query. Elasticsearch's match query does the following:

- Analyze the query string using the field's query-time analyzer
- Create an OR query with the tokens that come out of the analysis

You can look at the field query parser as something of a starting point for

I usually do this in the context of a boost query, not as the main edismax
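
As a rough illustration of the match-style query Doug describes above (analyze the query string, then OR the resulting tokens), here is a minimal Python sketch. It is not Solr or Elasticsearch code: the `analyze` function is a toy stand-in for a field's real query-time analyzer, and the names `analyze` and `match_query` are made up for this example.

```python
import re


def analyze(text):
    """Toy stand-in (an assumption) for a field's query-time analyzer:
    lowercase the text and split it on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def match_query(field, query_string):
    """Build an OR query over the analyzed tokens, written in Lucene query syntax."""
    tokens = analyze(query_string)
    if not tokens:
        return ""
    return "{}:({})".format(field, " OR ".join(tokens))


print(match_query("title", "Personal Computer"))
# prints: title:(personal OR computer)
```

In a real QParserPlugin the analysis step would run the field's configured analyzer, so any synonym or stemming filters in that chain would shape the tokens that end up in the OR query.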

Bernd Fehling added Re: Solutions for Multi-word Synonyms,

you should really try to build your own solution for Multi-term Synonyms
because every need is different and you can customize it for your special
use case, like adding a Thesaurus.


From myself Re: Solutions for Multi-word Synonyms (where APT refers to Lucidworks’ auto-phrasing token filter),

The auto-phrasing-token (APT) filter is a two-pronged solution that
requires index and query time processes versus hon-lucene-synonyms (HLS)
which is strictly a query time implementation. The primary takeaway from
that is, APT requires reindexing your data when you update the autophrases
and synonyms while HLS does not.

APT is more precise while HLS is more flexible.
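
To make the reindexing point concrete, here is a minimal Python sketch of the general auto-phrasing idea: known multi-word phrases are collapsed into a single token, and the same step has to run on both the documents and the query for the tokens to line up. This is not the APT filter's code; the function name `autophrase`, the `_` joiner, and the phrase list are assumptions for illustration.

```python
def autophrase(tokens, phrases, joiner="_"):
    """Greedy longest-match: replace known multi-word phrases with one token.
    The joiner character is an assumption chosen for this sketch."""
    phrase_set = {tuple(p.lower().split()) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest possible phrase first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in phrase_set:
                out.append(joiner.join(candidate))
                i += n
                break
        else:
            # No phrase starts here, so keep the single token.
            out.append(tokens[i].lower())
            i += 1
    return out


phrases = ["personal computer"]  # hypothetical autophrase list
print(autophrase("Cheap Personal Computer case".split(), phrases))
# prints: ['cheap', 'personal_computer', 'case']
print(autophrase("personal computer".split(), phrases))
# prints: ['personal_computer']
```

Because the indexed tokens depend on the phrase list, changing that list changes what should have been indexed, which is why an approach like this needs a reindex while a purely query-time approach does not.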

Note that hon-lucene-synonyms is also very useful when you have a single term in documents but want multiple multi-term synonyms to find it. For example, you could have FDA in your documents and define a rule like Food and Drug Administration,Food Drug Administration=>FDA, which allows multi-term synonyms to be searched for and inserted without reindexing the entire system.
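
A minimal Python sketch of that query-time idea, using the FDA rule above: the rule is parsed, and a query containing one of the multi-term phrases is rewritten to also match the single term. This is only an illustration of query-time expansion, not how hon-lucene-synonyms is implemented; `parse_rule` and `expand_query` are hypothetical names.

```python
import re

RULE = "Food and Drug Administration,Food Drug Administration=>FDA"


def parse_rule(line):
    """Parse 'phrase one,phrase two=>TERM' into (list of phrases, term)."""
    lhs, rhs = line.split("=>", 1)
    return [p.strip().lower() for p in lhs.split(",")], rhs.strip()


def expand_query(query, rule_line=RULE):
    """If the query contains a left-hand phrase, OR that phrase with the mapped term."""
    phrases, target = parse_rule(rule_line)
    # Try longer phrases first so the most specific one wins.
    for phrase in sorted(phrases, key=len, reverse=True):
        match = re.search(r"\b" + re.escape(phrase) + r"\b", query, re.IGNORECASE)
        if match:
            replacement = '("{}" OR {})'.format(match.group(0), target)
            return query[:match.start()] + replacement + query[match.end():]
    return query


print(expand_query("Food Drug Administration recalls"))
# prints: ("Food Drug Administration" OR FDA) recalls
```

Since the rewrite happens only on the query string, the documents that contain FDA never need to be touched when the rule changes.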

Update 2016-06-24: Scott Stults pointed out that Querqy, maintained by René Kriegler, is another alternative. Querqy describes itself well in its README.md,

Querqy is a framework for query preprocessing in Java-based search engines. 
It comes with a powerful, rule-based preprocessor named 'Common Rules 
Preprocessor', which provides query-time synonyms, query-dependent boosting 
and down-ranking, and query-dependent filters. While the Common Rules 
Preprocessor is not specific to any search engine, Querqy provides a plugin 
to run it within the Solr search engine.

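Because Querqy is a general toolset for manipulating queries, it runs on top of Solr via a query handler. Almost everything is implemented through a rules.txt file, which is fed through rewrite chains. For example, the following rules make pc a synonym for both personal computer and personal computers: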

personal computer =>
    SYNONYM: pc

personal computers =>
    SYNONYM: pc
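
To show how rules of this shape can be read, here is a small Python sketch that parses the two rules above into a lookup table. It handles only the `input =>` line followed by `SYNONYM:` lines seen in this snippet and is an assumption-laden toy, not Querqy's parser; `parse_rules` is a hypothetical name.

```python
RULES = """\
personal computer =>
    SYNONYM: pc

personal computers =>
    SYNONYM: pc
"""


def parse_rules(text):
    """Map each input phrase to its list of SYNONYM values.
    Only the subset of the rules format shown in this post is handled."""
    rules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("=>"):
            # A line ending in '=>' starts a new rule for that input phrase.
            current = line[:-2].strip().lower()
            rules[current] = []
        elif line.upper().startswith("SYNONYM:") and current is not None:
            rules[current].append(line.split(":", 1)[1].strip())
    return rules


print(parse_rules(RULES))
# prints: {'personal computer': ['pc'], 'personal computers': ['pc']}
```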

Great stuff! The world of search is ever expanding. Whether you are using an existing plugin or trying to write a new one, please reach out and contact us!
