At OpenSource Connections, we discuss the role ‘relevance engineer’ quite extensively. In this post, I want to define this term. We feel it’s a key, often missing component of search teams.
A relevance engineer implements information retrieval algorithms that solve user information needs in real time, at scale. A relevance engineer owns components of the discovery process (search, autocomplete, recommendations, etc.). They are primarily engineers, though they keep algorithms and machine learning techniques in their toolbelt.
A relevance engineer straddles the sweet spot between system accuracy/relevance and performance/stability
When solving an Information Retrieval problem, relevance engineers don’t chase the state of the art unnecessarily. Instead, they prefer proven techniques for 80% of the problem, because well-established solutions can be easily maintained at scale. These include keyword search, BM25, classic Learning to Rank models, and semantic search techniques. Classic techniques have a long track record of scaling, solving users’ problems, and supporting stakeholder management.
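To make one of those proven techniques concrete, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is an illustration under stated assumptions, not how any particular search engine implements it: the function name `bm25_scores`, the toy documents, the whitespace tokenization, and the parameter defaults `k1=1.2` and `b=0.75` are all chosen for the example, and the IDF uses the always non-negative variant that Lucene-based engines are known for.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumption: Lucene-style smoothing).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "red running shoes".split(),
    "blue running jacket".split(),
    "red wine glasses".split(),
]
scores = bm25_scores("red shoes".split(), docs)
# Document indices ordered from best to worst match.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The document matching both query terms ranks first, the one matching only “red” comes second, and the one matching nothing scores zero. A few lines like these are easy to reason about and explain, which is part of why such techniques hold up at scale.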
They know every information retrieval problem has that hard-to-reach 20% that requires innovation. They track developing industry trends and look for places where their problem requires unique innovation. To help innovate, they stay aware of advances in Natural Language Processing (NLP) and Information Retrieval. They are well versed in search system internals: to get the needed performance, they don’t keep the search engine at arm’s length, but rather wrangle it to achieve the needed functionality.
Accountability and Data are at the center of the Relevance Engineer’s life
Relevance engineers are fundamentally data-driven. They make decisions based on performance and scaling metrics as well as relevance metrics.
With relevance metrics, they likely aren’t the ones defining the ‘ground truth’. They need to work with data scientists and domain experts to understand how and why users prefer one set of results over another. Yet once they have this information, they can use tools for Test Driven Relevancy to optimize the relevance of their algorithms. They may work hand-in-hand with data scientists to develop and implement algorithms in a real-life information retrieval system.
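As a rough sketch of what a test-driven relevance check might look like, the snippet below computes nDCG for two rankings against a small set of graded judgments and asserts that a candidate change does not regress. Everything here is hypothetical: the judgments, the document ids, and the function names `dcg` and `ndcg_at_k` are made up for illustration, and the gain is taken as the raw grade (some formulations use 2^grade − 1 instead).

```python
import math


def dcg(gains):
    """Discounted cumulative gain: later ranks contribute less."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_ids, judgments, k=10):
    """nDCG@k of a ranking, given graded judgments keyed by document id."""
    # Unjudged documents are treated as grade 0 (an assumption of this sketch).
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments from domain experts for one query (3 = best).
judgments = {"doc_a": 3, "doc_b": 1, "doc_c": 0}

# Hypothetical result orderings before and after a relevance change.
baseline = ["doc_c", "doc_b", "doc_a"]
candidate = ["doc_a", "doc_b", "doc_c"]

baseline_ndcg = ndcg_at_k(baseline, judgments)
candidate_ndcg = ndcg_at_k(candidate, judgments)

# The "test" in test-driven relevancy: fail if the change makes results worse.
assert candidate_ndcg >= baseline_ndcg
```

In practice the judgments would come from the data scientists and domain experts mentioned above, and the check would run over many queries rather than one.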
Relevance engineers don’t solve search for Kaggle points or academia, but for real companies and users. This means that they’re always measuring the speed of their systems. They work closely with backend developers and operations to appropriately scale the information retrieval solution: they care as much about the uptime and performance of their solution as they do about its accuracy.
“Machine Learning Engineer” vs “Relevance Engineer”
I think the recently coined role of “Machine Learning Engineer” has a lot in common with that of a relevance engineer. Both need to implement algorithms, at scale, reliably, and in consultation with data scientists. Both focus on the engineering and practical side of algorithms, not the academic side. Both must balance performance with model accuracy.
Yet there are a few distinctions:
- Relevance engineers are user-centered: they care about the information retrieval experience. Machine Learning Engineers focus mostly on ‘how can I implement this algorithm fast and at scale?’
- Being user-centered, relevance engineers focus on being ‘inline’ with the user’s request, so their algorithms must be fast and reliable so as not to harm the user experience.
- Relevance engineers focus on Information Retrieval, not machine learning more broadly. Information retrieval solutions may or may not include some aspect of machine learning.
Image by Background Vectors by Vecteezy