Defining relevance engineering, part 1: the background

December 8, 2020 Charlie Hull
Category: Relevancy

Relevance Engineering is a relatively new concept but OSC and others have been carrying out the software engineering practice of tuning search engines for many years. So what is a ‘relevance engineer’ and what do they do? In this series of blog posts I’ll try to explain what I see as a new, emerging and important profession.

The commoditization of search

Let’s start by turning the clock back a few years. Ten or fifteen years ago search engines were usually closed source, mysterious black boxes, costing five- or six-figure sums for even relatively modest installations (let’s say a couple of million documents – small by today’s standards). Huge amounts of custom code were necessary to integrate them with other systems, and projects would take many months to demonstrate even basic search functionality. The trick was to get search working at all, even if the eventual results weren’t very relevant. Sadly even this was sometimes difficult to achieve.

Nowadays, search technology has become highly commoditized and many developers can build a functioning index of several million documents in a couple of days with off-the-shelf, open source, freely available software. Even the commercial search firms are using open source cores – after all, what’s the point of developing them from scratch? Relevance is often ‘good enough’ out of the box for non-business-critical applications.

When you need a relevance engineer

A relevance engineer is required when things get a little more complicated and/or when good search is absolutely critical to your business. If you’re trading online, search can be a major driver of revenue and getting it wrong could cost you millions. If you’re worried about complying with GDPR, MiFID or other regulations then ‘good enough’ simply isn’t if you want to prevent legal issues. If you’re serious about saving the time and money your employees waste looking for information or improving your business’ ability to thrive in a changing world then you need to do search right.

Choose your tools

So which search engine should you choose before you find a relevance engineer to help with it? I’m going to go out on a limb here and say it doesn’t actually matter that much. At OSC we’re proponents of open source engines such as Apache Lucene/Solr and Elasticsearch (which have much to recommend them) but the plain fact is that most search engines are the same under the hood. They all use the same basic principles of information retrieval; they all build indexes of some kind; they all have to analyze the source data and user queries in much the same way (ignore ‘cognitive search’ and other ‘AI’ buzzwords for now, as most of this is marketing rather than actual substance). So, should you move to an open source engine right away? The trouble is, search engine migrations can be risky, as Max Irwin explains in this talk, and migrations don’t necessarily improve relevance on their own – the new, un-tuned engine may initially be worse than the old one.
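To make the ‘same under the hood’ point concrete, here is a minimal sketch in Python of the shared idea: run documents and queries through the same analysis step, and look terms up in an inverted index. The function names (`analyze`, `build_index`, `search`) and the sample documents are invented for illustration; this is not how Lucene, Solr or Elasticsearch are actually implemented, just the basic principle in miniature.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (an assumption for this sketch): lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(docs):
    """Build an inverted index: each term maps to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = analyze(query)  # queries go through the same analysis as documents
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents
docs = {
    1: "Tuning Solr for relevance",
    2: "Elasticsearch migration guide",
    3: "Relevance tuning in Elasticsearch",
}
index = build_index(docs)
print(search(index, "RELEVANCE tuning"))
```

Because the query is analyzed the same way as the documents, the upper-case ‘RELEVANCE’ still matches documents 1 and 3. Real engines add far more (stemming, synonyms, scoring), but the index-plus-analysis shape is common to them all.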

Any modern search engine should allow you the flexibility to adjust how data is ingested, how it is indexed, how queries are processed and how ranking is done. These are the technical tools that the relevance engineer can use to improve search quality. However, relevance engineering is never simply a technical task – in fact, without a business justification, adjusting these levers may make things worse rather than better.
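As a rough illustration of one such lever, the Python sketch below ranks two made-up products with a toy scoring function and then changes a single field weight. Everything here (the `score` and `rank` functions, the weights, the documents) is hypothetical and far simpler than the scoring a real engine uses; the point is only that a small adjustment reorders results, and whether the new order is ‘better’ depends on what the business needs.

```python
def score(doc, query_terms, weights):
    """Toy score (not BM25): sum over fields of weight * number of query-term matches."""
    total = 0.0
    for field, weight in weights.items():
        tokens = doc.get(field, "").lower().split()
        total += weight * sum(tokens.count(t) for t in query_terms)
    return total


def rank(docs, query, weights):
    """Return documents ordered by descending toy score."""
    terms = query.lower().split()
    return sorted(docs, key=lambda d: score(d, terms, weights), reverse=True)


# Hypothetical catalogue entries
docs = [
    {"id": "a", "title": "Red running shoes", "body": "Lightweight trainers"},
    {"id": "b", "title": "Shoe care kit", "body": "Cleans shoes and keeps shoes fresh"},
]

equal = rank(docs, "shoes", {"title": 1.0, "body": 1.0})
boosted = rank(docs, "shoes", {"title": 3.0, "body": 1.0})
print([d["id"] for d in equal])
print([d["id"] for d in boosted])
```

With equal weights the care kit ranks first, because ‘shoes’ appears twice in its body; boosting the title field puts the actual shoes on top. Neither ordering is right in the abstract, which is why the lever needs a business justification behind it.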

In the next post I’ll cover how a relevance engineer can engage with a business to discover the why of relevance tuning. In the meantime you can read Doug Turnbull’s chapter in the free Search Insights 2018 report by the Search Network (the rest of the report is also very useful) and you might also be interested in our ‘Think like a relevance engineer’ training. Of course, feel free to contact us if you need help with relevance engineering.

This post was originally written for the Flax blog; I’ve updated it for today’s relevance engineers!