Why I think search engines are the future of recommendation systems

September 13, 2016 Doug Turnbull
Category: Elasticsearch

Recommendation systems have often felt like costly endeavours that require machine learning expertise and large amounts of infrastructure. Search, on the other hand, increasingly feels like a commodity given the ubiquity of Solr and Elasticsearch. In reality, as we argue in Relevant Search, search and recommendations are two sides of the same coin.

As search engines improve their ability to implement collaborative filtering, I see a future where recommendation systems aren’t just the domain of sophisticated organizations. Increasingly, I see a shift towards search engines like Solr or Elasticsearch as the only tech needed to implement a recommendation system.

Here are my opinions on why the future of recsys will be rooted in open source search engine tech:

Devs grok search engine tech: Search engine tech, i.e. Elasticsearch and Solr, has become extremely familiar to developers. As recommendation features make their way into search engine tech, organizations will get 90% of the way there on collab filtering/personalized search without having to invest in additional infrastructure and skills.
Search engines let you mix and match multiple relevance models: Manually tuned relevance works well early on, when there isn’t yet enough user behavior to be statistically significant. For small organizations, this can help with the “cold start” problem and allow recommendation systems without much user behavior. For larger organizations that can gather more user behavior, search engine technologies allow practitioners to layer in collaborative filtering, personalization, and learning to rank alongside manually tuned relevance.
Simpler, more straightforward recommendation relevance tuning: Solr and Elasticsearch have sophisticated query DSLs that let you directly layer in many ranking factors, including location, content recency, and many other content features alongside collaborative filtering. Tools like Quepid and the practice of Test Driven Relevance make it simpler to test relevance solutions.
Breadth of applicability of Solr and Elasticsearch: Open source search engines are becoming less about being the best implementation of a search bar, and more focused on being a general information retrieval framework. The next generation of user interfaces (voice search, conversational UX, chat bots, etc.) often has search at the core. Image search and other forms of “similarity” search often turn to Lucene-based search. Recommendation systems will be absorbed into the fold, so they can be mixed and matched with many of these other cutting-edge applications.
Real-time recommendations: Instead of running offline jobs to find user-item affinities, you can compute them at query time while simultaneously mixing in content properties and the other ranking factors mentioned above. The inverted index data structure is really stinking fast and always getting faster. Lucene has been tuned extensively over the years for looking up things given a feature that describes that thing. Or as my coauthor John Berryman puts it, a “sophisticated token matching system.” And as I talk about all the time, “tokens” can be any descriptive feature of your content, not just a word.
The next gen of recsys won’t look like the current gen: Recsys today is where search was in the early 2000s. In the early Web, search engines were the domain of extremely sophisticated firms with deep pockets. Solr and Elasticsearch laid waste to those days, letting anyone deliver a basic level of search very quickly. Solr and Elasticsearch will do the same to the next generation of recsys, allowing less sophisticated organizations with much shallower pockets to implement 90% of what’s needed relatively quickly without extensive amounts of additional infrastructure or retraining staff.
Search is still important: Some say search engines will in fact be replaced by recommendation engines. I disagree, because search is still an extremely important form of interaction. With search, users tell you what they want and you need to respond with relevant matches in real time. The core component of that is still what Solr and Elasticsearch are good at: matching and ranking content and showing it to the user. Also, most of the supporting functionality of search (autocomplete, faceting, highlighting, etc.) is still crucial to this interaction and isn’t going away anytime soon. And the personalization that collaborative filtering offers only makes search better.
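
To make the "tokens can be any descriptive feature" idea a bit more concrete, here is a minimal sketch in plain Python. It is not Lucene, Solr, or Elasticsearch, and the data, the function names (`build_index`, `recommend`), and the `recency_weight` parameter are all made up for illustration. It treats the IDs of users who liked an item as that item's tokens, looks them up in a toy inverted index to answer "people who liked this also liked", and then mixes in a content feature (recency) at query time, which is roughly the kind of layering described in the list above.

```python
from collections import defaultdict

# Hypothetical toy data: item -> users who liked it.
LIKES = {
    "rambo": {"u1", "u2", "u3"},
    "rocky": {"u1", "u2"},
    "terminator": {"u2", "u3"},
    "amelie": {"u4"},
}
# Hypothetical content feature in [0, 1]; higher means more recent.
RECENCY = {"rambo": 0.1, "rocky": 0.2, "terminator": 0.9, "amelie": 1.0}


def build_index(likes):
    """Build a toy inverted index: token (a user ID) -> items carrying it."""
    index = defaultdict(set)
    for item, users in likes.items():
        for user in users:
            index[user].add(item)
    return index


def recommend(seed_item, likes, index, recency, recency_weight=0.5, k=2):
    """Rank items that share 'liked by' tokens with seed_item."""
    scores = defaultdict(float)
    # The seed item's users act as the query tokens.
    for user in likes[seed_item]:
        # Postings lookup: every other item this user also liked.
        for item in index[user]:
            if item != seed_item:
                scores[item] += 1.0  # one point per shared user
    # Layer a content feature on top of the collaborative signal.
    for item in scores:
        scores[item] += recency_weight * recency[item]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


index = build_index(LIKES)
print(recommend("rambo", LIKES, index, RECENCY))
```

Nothing here runs as an offline job: the affinities come out of index lookups at query time, and changing `recency_weight` changes the blend without rebuilding anything. A real search engine would do the same kind of lookup with far better data structures and scoring, so treat this only as a picture of the idea.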

That’s my listicle (opinioncle?). What do you think? Let’s discuss over at Hacker News and you can tell me where I’m wrong. If you’re curious about this topic, check out my post on implementing really good recsys with Elasticsearch. Please invite me to your company’s lunch and learns and I’ll be happy to talk with your team about implementing recommendations with open source search engine technology!