Blog

What is Test Driven Search Relevancy?

During a challenging search project, we realized that we were simply trying to accomplish too much. The demands to create relevant search results were intense. The deadline was looming. Presented with dozens of unique search bugs, we provided one quick solution after another.

We quickly forgot which bug each piece of the relevancy formula had fixed, or why. Why did we add this to the stopwords list? What made us think this boost was appropriate? Why did we add this field to the query fields?

As we band-aided one solution on top of another, we realized we were creating a disaster: the search equivalent of a big ball of mud. With the deadline looming, tension between the search developers and content experts was growing. Communication was breaking down. Even so, we knew we had to try another path.

We needed to solve two problems. First we needed a way to remember and enforce all the ways the search needed to behave. Second, we needed to fix the collaboration problem. It was becoming too hard to communicate productively around broken search.

As an engineer, the solution to the first problem was clear: automated testing. As a human being, the solution to the second problem was also clear: better collaboration processes and tools.

I knew from extensive experience that combining these two solutions, collaborating closely with others on creating automated tests, turns a potentially confrontational situation into a collaborative one. Instead of two people asserting that this way or that way is the One True Way(™), having them sit together and write tests lets them have a conversation about how a system should behave. Writing tests for the various corner cases allows the developers to turn to a tester or product owner and ask “Well, what do you think it should do!?”. Similarly, for marketing and content experts, helping the developer write automated tests for the nitty-gritty details creates a sense of having a very in-depth say over what the system should do.

So I applied those lessons to search and we built our testing & collaboration product, Quepid. Initially, Quepid had a few simple but effective features:

  • Execute multiple search queries simultaneously and present all results on a single page
  • Rate the quality of results for those search queries
  • Provide an edit box for tweaking the relevancy params passed to Solr
[Screenshot: a marketing expert rates the quality of a juice result for the “apple juice” query, giving developers instant feedback on search quality.]
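To make that first feature concrete, here is a minimal sketch in Python of what “run several queries with one set of relevancy parameters” looks like against Solr. The core name, query fields, boosts, and sample queries are assumptions for illustration only; they are not Quepid’s internals.

```python
# Minimal sketch: fire several queries at Solr with one set of relevancy
# parameters and collect the top results side by side.
# The Solr URL, core name, fields, and boosts are illustrative assumptions.
import requests

SOLR_URL = "http://localhost:8983/solr/products/select"

# The relevancy parameters under test (the contents of the edit box).
RELEVANCY_PARAMS = {
    "defType": "edismax",
    "qf": "title^2 description",  # query fields and boosts being tuned
    "rows": 10,
    "fl": "id,title",
    "wt": "json",
}

QUERIES = ["apple juice", "usb cable", "winter jacket"]


def top_results(query):
    """Return the documents Solr ranks highest for a single query."""
    params = dict(RELEVANCY_PARAMS, q=query)
    resp = requests.get(SOLR_URL, params=params, timeout=10)
    resp.raise_for_status()
    return resp.json()["response"]["docs"]


if __name__ == "__main__":
    # Show every query's results together, the way Quepid's single page does.
    for q in QUERIES:
        print(f"\n=== {q} ===")
        for rank, doc in enumerate(top_results(q), start=1):
            print(f"{rank:2d}. {doc.get('title', doc['id'])}")
```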

The result was a simple tool that completely altered how we dealt with search relevancy problems at OpenSource Connections. Being able to see the relative quality of several queries simultaneously allowed the whole team to work more efficiently:

  1. Developers had a workbench to experiment with relevancy parameters
  2. Sales, testers, and content experts could add broken queries, rate poor results, and provide meaningful feedback on their search

With these ingredients, we had the ability to achieve a level of automated test coverage over dozens of representative queries. Testers and content experts could know before pushing changes to production whether or not the team was making measurable improvements. Developers could see right away the impact that fixing new problems had on existing representative queries.
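Here is a hedged sketch of how rated queries can become automated relevancy tests: human judgments feed a precision-at-10 check that fails whenever a tuning change degrades a representative query. The ratings, the 0.6 threshold, and the top_results() helper (from the sketch above) are illustrative assumptions, not Quepid’s API.

```python
# Sketch: turn human ratings into an automated relevancy regression test.
# The ratings, threshold, and helper names are illustrative assumptions;
# real projects would load judgments from wherever they are stored.

# 1 = rated relevant by a content expert, 0 = rated irrelevant.
RATINGS = {
    ("apple juice", "SKU-101"): 1,
    ("apple juice", "SKU-987"): 0,
    # ... dozens more representative queries and their judgments
}


def precision_at_k(query, ranked_doc_ids, k=10):
    """Fraction of the top-k results rated relevant.

    Documents nobody has rated yet count as irrelevant, a deliberately
    conservative choice that nudges the team to keep rating.
    """
    top = ranked_doc_ids[:k]
    if not top:
        return 0.0
    relevant = sum(RATINGS.get((query, doc_id), 0) for doc_id in top)
    return relevant / len(top)


def test_apple_juice_stays_relevant():
    """Fails the build if a relevancy tweak degrades this query."""
    # top_results() is the Solr helper from the earlier sketch; any function
    # returning a ranked list of documents for a query would do.
    ranked = [doc["id"] for doc in top_results("apple juice")]
    assert precision_at_k("apple juice", ranked) >= 0.6
```

Run a suite like this on every change, and “did we make search better or worse?” becomes a question the build can answer before anything reaches production.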

More importantly, all search stakeholders could sit together and troubleshoot together rather than shoot emails at each other and argue. Developers could point out rather directly the challenges related to search. Marketing, content experts and testers could more readily communicate why the search ranking was broken.

It was a pretty powerful experience that reinforced to me that automated testing is more than test coverage. It’s an organizational dynamic that uses a meaningful artifact (tests) for communication. In other words, talk is cheap, tests matter.

Can our products or services help your organization work more efficiently to improve relevancy? Do you struggle organizationally with how to communicate and fix broken searches? Would you be interested in the product that made this all possible? Contact us and let us know!