Welcome, dear reader, to my first OSC blog post. Let’s dive in!
While search relevance is often equated with ensuring customers find what they need, that is only part of the picture. Even if relevance for search results is well tuned and considered great, it may not matter if the experience is poor or search is running slowly. When considering improvements to search in a product or application it is necessary to have a vision of overall quality, which is a combination of three key areas: relevance, experience, and performance. This article explains all three at a high level and why they matter as a whole, and offers some tips to diagnose and attack a spectrum of common search quality issues.
Search Relevance is typically called out as an issue when customers or users of a product or application complain that they can’t find the information they need. Relevance is subjective, so defining it is key to understanding what it means to have relevant search. Formally, we define relevance as a series of metrics that measure whether appropriate documents are appearing in search results for a given query, typically at the top or above the fold of the website or app presenting the results. These metrics usually include at least precision (the fraction of returned documents that are relevant), recall (the fraction of relevant documents that are returned), and various blends of the two. There are many other metrics, but these are the easiest to understand and use when approaching measurement of search for the first time. Only by measuring metrics can we improve on them, and improving them is correlated with more relevant search results being returned for the queries being measured. The more queries we measure, the more of the potential interactions with search we cover. It is useful to draw an analogy between this type of coverage and test coverage in software: the more tests you have, the less likely your customers are to experience a bug. Measuring how customers use the live search is also critically important for knowing whether relevance performs as well in production as it did in testing.
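To make the metrics above concrete, here is a minimal sketch in Python that scores the results of a single query. The document IDs, the cutoff k, and the function name are made up for illustration, and F1 is just one common way of blending precision and recall.

```python
def precision_recall_f1(results, relevant, k=10):
    """Score the top-k results for one query against a set of relevant document IDs."""
    top_k = results[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical data: what the engine returned, and what a judge marked relevant.
results = ["d1", "d7", "d3", "d9", "d4"]
relevant = {"d1", "d3", "d5", "d8"}

p, r, f1 = precision_recall_f1(results, relevant, k=5)
print(f"precision@5={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Two of the five returned documents are relevant (precision 0.4), and two of the four relevant documents were found (recall 0.5).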
Search Experience comprises the functional aspects of how customers engage with a product in order to find content through search. Common examples include autocompleting queries, layout of search results and the result data shown, suggesting alternate terms and spelling corrections, offering filters and facets, and highlighting query terms in results and documents. Search Experience also includes the format of content: if text in search results has poor grammar or spelling, or documents are displayed incorrectly or inconsistently, it will hurt the impression your product makes on customers.
Search Experience varies significantly between markets and domains, including eCommerce, job search, research, enterprise search, application search, and others. It is important to understand how customers expect your search to behave within the context of the domain. If this is difficult because your product is niche, techniques such as Contextual Inquiry can help you find the workflow and interface that best serve customers using your product or app. Just like relevance, an experience cannot be improved unless the improvement is measurable with data. Capturing metrics for user experience is a well-established field and includes click tracking of workflows, heat maps of UI interaction, A/B testing, session length, and many others. Analytics for search experience very often overlap with measurement of live search relevance. Capturing both is key to understanding how customers are experiencing a product and whether that experience is positive. The more analytics you capture and the better your strategy, the easier it is to see which areas give customers a good experience and which require improvement.
Good performance of your product is critical to ensuring customers do not become frustrated with the experience. It is no mystery that people are short on patience when waiting for an app screen to load or for results to be returned. Even if relevance is high and functional user experience workflows are excellent, customers will be unhappy if they need to wait several seconds or more every time they search. Setting KPIs for page load and search response times, striving to meet them, and keeping the search response snappy will in many cases make customers pleased with your product. Also, being able to instantly return search results can sometimes make customers happy even if relevance is not ideal.
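As a rough sketch of what checking a response-time KPI could look like, the snippet below computes nearest-rank percentiles over a sample of search latencies and compares them with targets. The 200 ms and 800 ms targets and the sample timings are placeholders, not recommendations; pick targets that suit your own product.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_latency_kpi(latencies_ms, p50_target_ms=200, p95_target_ms=800):
    """Compare median and tail latency with (hypothetical) KPI targets."""
    p50 = percentile(latencies_ms, 50)
    p95 = percentile(latencies_ms, 95)
    return {
        "p50": p50,
        "p95": p95,
        "meets_kpi": p50 <= p50_target_ms and p95 <= p95_target_ms,
    }


# Hypothetical search response times in milliseconds, one slow outlier included.
sample = [120, 95, 180, 150, 2400, 130, 110, 160, 140, 170]
report = check_latency_kpi(sample)
print(report)
```

Looking at a high percentile as well as the median matters here: the median of this sample is comfortable, but the single 2.4 second search is exactly the kind of wait that frustrates customers.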
Handy Tips for Common Problems
This cheat sheet can help with a high-level diagnosis of problems and show where to focus your attention on a product that isn’t meeting customer expectations. The list isn’t exhaustive, but it offers insight into the more common search quality issues you might run into.
Customer use of search seems sporadic, and it is difficult to tell if they are finding what they need
Your analytics strategy might be flawed if you are unable to pin down what customers are doing. Make sure you are tracking ‘search conversations’ that log from start to finish the entire path customers are taking in your product - all the way from clicking on the search bar, running a search, refining the search, clicking or scrolling results, and interacting with individual results. Implement this strategy in your product and create reports and dashboards for the data that is most important to your product, and add to this over time.
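One possible shape for a search conversation log is sketched below. The event fields and action names (such as "focus_search_bar" and "click_result") are assumptions made for this example; use whatever your analytics pipeline actually emits.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SearchEvent:
    session_id: str
    timestamp: float
    action: str       # hypothetical names: "focus_search_bar", "search", "refine", "scroll", "click_result"
    detail: str = ""  # query text, result ID, and so on


def build_conversations(events):
    """Group events by session, in time order, so each path can be read start to finish."""
    conversations = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        conversations[event.session_id].append(event)
    return dict(conversations)


def summarize(conversation):
    """Reduce one conversation to a few fields that are easy to put on a dashboard."""
    actions = [e.action for e in conversation]
    return {
        "searches": actions.count("search") + actions.count("refine"),
        "clicked": "click_result" in actions,
        "path": " > ".join(actions),
    }


events = [
    SearchEvent("s1", 3.0, "refine", "red running shoes"),
    SearchEvent("s1", 1.0, "focus_search_bar"),
    SearchEvent("s1", 2.0, "search", "shoes"),
    SearchEvent("s1", 5.0, "click_result", "p42"),
    SearchEvent("s2", 4.0, "search", "boots"),
]

conversations = build_conversations(events)
for session_id, conversation in conversations.items():
    print(session_id, summarize(conversation))
```

Session s2 above is a search with no follow-up at all, which is the kind of conversation worth counting and reviewing.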
Customers are searching but are not clicking any documents
Make sure you have a relevance testing plan in place. Gather a broad range of queries, and a set of judgments of what documents should be returned for each query. Use that to measure precision, recall, and other metrics with a relevance testing tool such as Quepid.
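A tool such as Quepid manages queries and judgments for you, but the idea behind a relevance test plan can be sketched in a few lines. Everything below is illustrative: the judgments, the stand-in search function, and the product IDs are invented, and this is not how any particular tool implements it.

```python
def evaluate(judgments, search, k=5):
    """Run every judged query through `search` and score the top-k results."""
    report = {}
    for query, relevant in judgments.items():
        top_k = search(query)[:k]
        hits = len(set(top_k) & relevant)
        report[query] = {
            "precision": hits / len(top_k) if top_k else 0.0,
            "recall": hits / len(relevant) if relevant else 0.0,
        }
    return report


def mean(report, metric):
    """Average one metric across all judged queries."""
    return sum(scores[metric] for scores in report.values()) / len(report)


# Hypothetical judgments: for each query, the documents a judge marked relevant.
judgments = {
    "red shoes": {"p1", "p2", "p3"},
    "laptop bag": {"p7", "p8"},
}

# Stand-in for a call to your real search engine.
fake_index = {
    "red shoes": ["p1", "p9", "p2", "p4"],
    "laptop bag": ["p5", "p6"],
}


def search(query):
    return fake_index.get(query, [])


report = evaluate(judgments, search)
print("mean precision:", mean(report, "precision"))
print("mean recall:", mean(report, "recall"))
```

The per-query scores are as useful as the averages: "laptop bag" scores zero on both metrics here, which points straight at a query to investigate.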
Relevance tests well, but customers aren’t clicking on documents in live search results
The areas to look at here are whether your relevance testing coverage is high enough and whether you are testing with the right queries. Start by looking through the queries your customers are searching for that don’t result in clicks and categorize them. Then extend your test coverage to the worst-performing categories, improve relevance for them first, and work your way up. Also, investigate whether your test data aligns with customer expectations. Remember that relevance is subjective, so if you consider a result relevant but your customers do not, find out why and update your judgments accordingly.
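Here is one way the categorize-and-rank step might look, assuming you can export a log of queries with a flag for whether each search led to a click. The categories and the rules that assign them are deliberately simplistic and hypothetical; real categories should come from reading your own queries.

```python
from collections import defaultdict


def categorize(query):
    """Hypothetical rules for bucketing queries; replace with categories that fit your product."""
    q = query.lower()
    if any(ch.isdigit() for ch in q):
        return "contains number"
    if len(q.split()) <= 2:
        return "short"
    return "long"


def click_rate_by_category(search_log):
    """Return (category, click rate) pairs, worst-performing category first."""
    totals = defaultdict(lambda: [0, 0])  # category -> [searches, clicks]
    for query, clicked in search_log:
        bucket = totals[categorize(query)]
        bucket[0] += 1
        bucket[1] += int(clicked)
    rates = {cat: clicks / searches for cat, (searches, clicks) in totals.items()}
    return sorted(rates.items(), key=lambda item: item[1])


# Hypothetical log entries: (query, did the search lead to a click?)
search_log = [
    ("shoes", False),
    ("boots", True),
    ("model 4471", False),
    ("sku 9981 cable", False),
    ("waterproof hiking boots women", True),
    ("lightweight trail running shoes", True),
]

ranking = click_rate_by_category(search_log)
for category, rate in ranking:
    print(f"{category}: {rate:.0%} of searches led to a click")
```

The category at the top of the list is where new test queries and judgments are likely to pay off first.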
Some longer queries are performing well, but short queries don’t seem to be working for customers
You may have what is known as a customer intent problem, which most search engines face when queries are only one or two terms long. For shorter queries, offer various types of results that cover a wide range of content in which the customer might be interested. Also, give customers options in the search experience for faceting and filtering, and use autocompletion to steer common, overly broad terms toward more specific queries.
Many queries are returning too few results, and in some cases no results at all, even for short queries
This is specifically a recall problem, and usually happens because the search is not tuned correctly with synonyms or stemming/lemmatization. Turn on basic language stemming in your search engine, and start curating a dictionary of common search terms and their synonyms, then add it to your search engine query analysis. This synonym dictionary should be kept current by investigating customer searches that return less than a full page of search results (and especially when no results are returned), and amending where needed.
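In practice, synonyms and stemming are configured in your search engine's own analysis settings, so treat the following as an engine-independent illustration of the two ideas above: expanding query terms from a curated dictionary, and mining the search log for queries that return less than a full page. The page size, the dictionary entries, and the log format are all assumptions.

```python
from collections import Counter

PAGE_SIZE = 10  # assumed number of results on a full page

# Hypothetical curated dictionary: term -> synonyms
SYNONYMS = {
    "sofa": {"couch", "settee"},
    "tv": {"television"},
}


def expand_query(query, synonyms=SYNONYMS):
    """For each query term, list the term together with its known synonyms."""
    expanded = []
    for term in query.lower().split():
        expanded.append(sorted({term} | synonyms.get(term, set())))
    return expanded


def low_recall_queries(search_log, page_size=PAGE_SIZE):
    """Count queries that returned less than a full page, most frequent first."""
    counts = Counter(
        query.lower() for query, result_count in search_log if result_count < page_size
    )
    return counts.most_common()


# Hypothetical log entries: (query, number of results returned)
search_log = [("couch", 0), ("Couch", 2), ("sofa", 48), ("telly", 0)]

print(expand_query("Sofa cover"))
candidates = low_recall_queries(search_log)
print(candidates)
```

In this made-up log, "couch" finds almost nothing while "sofa" finds plenty, which suggests the dictionary needs an entry for "couch" itself, and "telly" looks like a candidate synonym for "tv".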
Customers are clicking on the first or second product or document in the results, but they don’t convert or interact beyond that click
The information you are showing in the results might be misleading or uncharacteristic of the full document. Investigate search results display and highlighting first. Then scrutinize the document itself and make sure it looks good and conveys the necessary information clearly. Get help and feedback from your product team and trusted customers, because the problem might lie with specific documents and the queries that found them, and look for a pattern. Also, if your product is enterprise search or research, customers may actually be finding what they are looking for when they click on the document. In that case, start measuring document dwell time, and the selection and copying of text, in your analytics, and see if the data starts telling a better story.
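If your analytics record when a document is opened and closed, dwell time can be derived along these lines. The event tuple layout and the 30 second threshold for a "long read" are assumptions for the sake of the example.

```python
def dwell_times(events):
    """Pair each "open" with the next "close" for the same session and document.

    Each event is a (session_id, doc_id, action, timestamp_seconds) tuple.
    Opens that never get a matching close are ignored.
    """
    opened = {}
    durations = []
    for session_id, doc_id, action, timestamp in sorted(events, key=lambda e: e[3]):
        key = (session_id, doc_id)
        if action == "open":
            opened[key] = timestamp
        elif action == "close" and key in opened:
            durations.append((doc_id, timestamp - opened.pop(key)))
    return durations


LONG_READ_SECONDS = 30  # hypothetical threshold for "the customer actually read this"

# Hypothetical analytics events
events = [
    ("s1", "docA", "open", 0.0),
    ("s1", "docA", "close", 95.0),
    ("s2", "docA", "open", 10.0),
    ("s2", "docA", "close", 14.0),
    ("s3", "docB", "open", 20.0),
]

durations = dwell_times(events)
long_reads = [d for d in durations if d[1] >= LONG_READ_SECONDS]
print(durations)
print(f"{len(long_reads)} of {len(durations)} visits were long reads")
```

A click followed by a long dwell tells a very different story from a click followed by a four second bounce, even though both look the same in a plain click count.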
Customers are avoiding search, and are instead locating products or documents through browsing
Investigate performance of search and whether it is returning results quickly. Do this with a broad range of actual customer queries and gather as much data as you can to investigate overall performance. Also, take a look at your UX metrics and potentially A/B test with a more prominent search bar.
Customers start searching, but they seem to abandon the search or product before any more measurements are captured
This is almost certainly a performance problem and should be investigated and improved accordingly.
Striking a Balance
Getting all three areas of search quality correct is not easy, and there are often trade-offs. These trade-offs are almost always different between products and can only be found with investigation, experimentation, measurement, and refinement over time. All the while it is important to monitor search quality as a whole and jump on issues as they pop up to prevent them from getting worse. For example, using a new plugin to analyze queries to improve relevance might impact performance. Also, spending all your effort on improving experience can result in relevance languishing as content is updated over time and query trends change.
There are many more problems you can face when offering search to customers, and it is important to look at the whole picture. By widening your investigations, you may be surprised by where they take you. And as always, try to enjoy the journey, and take the time to celebrate when things improve. Happy Searching! Max