Migrating to a new search engine is an activity fraught with risk. Whatever the reason for a migration, it is important to consider each risk in turn and to identify possible mitigations. At OSC we have worked on many search migrations, often from very old technologies long considered obsolete by the search community; some of these projects have taken several years as they affected multiple systems at global companies.
Why migrate at all?
There are several reasons a migration to a new search engine technology may be considered:
Lack of support
The old search platform may be out of support, or the original vendor may have ceased trading. This is a risky place to be, even if the search team understand the current technology well – they won’t be getting regular security updates, improvements to performance or extra features. Some teams will try to take on this maintenance burden themselves, with in-house patches or enhancements – but these will also need to be maintained as custom code. The necessary in-house knowledge is also hard to maintain – often this depends on one key person who could leave or retire – and recruiting someone to work on outdated technology, rather than the cutting edge, is difficult.
Poor performance
Even before we consider relevance, the speed of a search engine is a key indicator to users who generally expect sub-second performance even over many millions of documents. Older search engines may only run on certain hardware and can fail to take advantage of all the available resources. Some may not be cloud compatible, or have other dependencies on even older and more limited components.
A newer engine will always be better?
A common assumption is that a different search engine will automatically produce higher quality results than an old one – it’s newer, right? It can’t be as bad as our users tell us the current search engine is! Sadly, this is not always the case. Remember that in most cases you will be moving from an old system that has had at least some tuning to a new, un-tuned system. However, by moving to a new system you will remove many limitations and hopefully set yourself on a path to a more modern and relevant search experience.
Some search migration reality checks
It will take longer than you think
Search migrations, in our experience, always take longer than expected. There will be a host of new things to learn and many things about your current search platform that will be unknown until you lift the covers. Perhaps there are a lot of hard-coded, undocumented boost values, or a complex and misunderstood content processing pipeline. There is no simple ‘lift and shift’ when it comes to a complex system like this. Plan pessimistically and expect to be surprised (and sometimes horrified) by what you find.
Search results will be different
A new search engine will give you different results. This may seem obvious, but many users (and perhaps non-technical colleagues) will assume that the mark of a successful migration is that exactly the same results are produced, in the same order. This is hugely difficult to achieve in practice – different index structures, different algorithms and different query processors may all be in play – even if you are migrating to a newer version of the same underlying engine.
We got used to bad results
One common problem is that users have built up workarounds or coping strategies for your old “bad” engine. They may expect certain behaviour that is actually wrong, and can be surprised when a new engine removes it. We’ve even seen cases where some of the old “bad” behaviour has to be replicated in the new system!
Risk mitigations for a search migration
Measure before and after
If you aren’t already measuring search quality in some way, now is the time to start. You should be collecting some kind of judgements of relevance for search results (using explicit human judgements, perhaps with a tool like Quepid, and/or implicit click tracking) and deriving a measure of relevance for a set of queries that represent a range of user information needs. This will give you a baseline to aim for with the new system – but remember, your old engine is tuned at least to some degree, whereas your new engine will be fresh out of the box.
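As a rough illustration of turning judgements into a baseline number, here is a minimal Python sketch. It assumes graded judgements from 0 (irrelevant) to 3 (perfect) and uses precision at k as the measure; the query names, document ids, grading scale and threshold are all made up for the example, and your own tooling may well use a different metric such as nDCG.

```python
from statistics import mean

# Hypothetical judgements: query -> {document id: grade from 0 to 3}.
judgements = {
    "red shoes": {"d1": 3, "d2": 0, "d3": 2},
    "laptop bag": {"d7": 1, "d8": 3},
}

# Hypothetical top results returned by the current engine for each query.
results = {
    "red shoes": ["d1", "d2", "d3", "d9"],
    "laptop bag": ["d8", "d5", "d7", "d6"],
}


def precision_at_k(ranked_ids, grades, k=4, relevant_from=2):
    """Fraction of the top k results with a grade of at least relevant_from.

    Unjudged documents are counted as not relevant (an assumption).
    """
    top = ranked_ids[:k]
    if not top:
        return 0.0
    hits = sum(1 for doc_id in top if grades.get(doc_id, 0) >= relevant_from)
    return hits / k


def baseline_score(results, judgements, k=4):
    """Mean precision at k over every judged query."""
    return mean(
        precision_at_k(results.get(query, []), grades, k)
        for query, grades in judgements.items()
    )


print(baseline_score(results, judgements))
```

Running the same calculation against the new engine's results for the same queries gives a like-for-like comparison with the old baseline.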
You should also measure other aspects of search quality such as responsiveness and speed. The Jaccard index is a great way to measure how results differ between two search platforms, so you can prepare users for the change and address any special cases.
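The Jaccard index of two result sets is the size of their intersection divided by the size of their union, so 1.0 means the same documents came back and 0.0 means no overlap at all. A minimal Python sketch, with made-up document ids and an assumed convention that two empty result lists count as identical:

```python
def jaccard(old_ids, new_ids):
    """Jaccard index of two result lists, ignoring order and duplicates."""
    old_set, new_set = set(old_ids), set(new_ids)
    union = old_set | new_set
    if not union:
        # Assumption: both engines returning nothing counts as identical.
        return 1.0
    return len(old_set & new_set) / len(union)


# Hypothetical top four results for one query on each platform.
old_top = ["d1", "d2", "d3", "d4"]
new_top = ["d2", "d1", "d5", "d4"]

print(jaccard(old_top, new_top))  # 3 shared documents out of 5 distinct ones
```

Note that this measure ignores ordering: two engines that return the same documents in a different order still score 1.0, so queries where rank matters may need a closer look.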
Don’t leap too far ahead
Think about getting the basics right first – machine learning, Learning to Rank, AI and vector search are all very exciting, but leaping straight into these advanced techniques before you’ve achieved parity with your old solution introduces considerable risk. Search is a journey: start at the beginning!
The results from the new search engine may not appear subjectively better than the old one at first. Your first challenge is to make it work at all – given all the challenges above – and to build a Minimum Viable Product on the new search platform. Getting the results to be better than the old ones is the next challenge! During this process you need to work closely with stakeholders so they can understand what will be the same and what will be different on the new search platform.
Assemble a great team
You’ll need a skilled and effective team to make a migration work. Think about the roles you will need and how to train the team for the new challenges they will face. Think about how the metrics you develop can be used to communicate the progress your team is making to management and the wider business, while managing expectations and signalling what resources you’ll need.
Get some help
Experienced search consultants will have lived through the evolution of search technology and will have inside knowledge of when particular changes happened – for example when Lucene switched its default ranking from plain old TF/IDF to BM25, a change that arrived in version 6 of Apache Solr, or when Elastic opened the code of its X-Pack extensions around 2018. They might even remember old options like the Google Search Appliance. They can thus help you work out the functionality gaps between old and new, the current technology options, which is best for you, and what new features you can take advantage of.
At OSC we even have a Search Migration Playbook for you to download, to guide the process of migration. We won’t be shocked by what you tell us – there are still systems out there running on Solr 4 (released in 2012, the first ‘SolrCloud’ version) – and this won’t be the oldest out there!
With the right advice and help, realistic expectations and a solid data-driven strategy, you can mitigate some of the risks of a search migration. Like all significant technology upgrades, this will have unexpected side effects, take longer than you think and reveal many other issues – but a successful migration can open the door to new techniques and features that can significantly improve your users’ search experience.
If you need help migrating to a new search engine, talk to us.
Image from Migration Vectors by Vecteezy