When to choose open source search (and when you shouldn’t)

In my role as a search consultant, I spend a lot of time helping organizations pick the right search technology. Will it be Solr or Elasticsearch? Algolia or Swiftype? Endeca or MarkLogic? The choices can seem endless.

In this article, I want to single out Solr and Elasticsearch specifically for consideration. Why? As open source solutions, they share most of the same pros and cons relative to proprietary solutions when you’re making a business decision. Your developers may be pushing eagerly to build something on top of them. You may want to stop paying so much money to a search vendor. With this article, I want to prepare you to make an educated purchasing decision when it comes to open source.

Need Out of the Box Simplicity? Choose Proprietary

Proprietary solutions tend to target specific domains or use cases. For common search problems, they’ll have 80% of what you need. Big e-commerce firms love Endeca, for example. Endeca comes with powerful merchandising tools that allow merchandisers and marketers to fine-tune how products are showcased. Algolia has found its niche in autosuggest, mobile, and typo tolerance. MarkLogic works superbly with documents whose structure is very amorphous. Google Custom Search has worked well for web-like document structures.

If you’re coming from Algolia, Endeca, or MarkLogic, the switch to Solr or Elasticsearch can be jarring. You’re suddenly down to a bare-bones framework that lacks the domain-specific bells and whistles that might have been serving you well. You can shop for supporting tools: FindTuner as a merchandising platform, for example, or Quepid for relevance tuning. These can be powerful, but they don’t always integrate well with how your team has deployed Solr or Elasticsearch.

Specific User Experience? Choose Solr or Elasticsearch

Yes, Solr and Elasticsearch are bare bones, but that’s a conscious choice. You should think of them as programming frameworks for discovery, not solutions. They require you to make lots of your own choices, because, well, your user experience is unique! With great responsibility comes great power. You can do a lot with these tools. You can build recommendation systems or personalized search or integrate machine learning and semantic search to boost relevance. You can mix and remix all these to deliver precisely what your user experience requires.

It’s often the case that your search user experience differs more than you realize. There’s just one way to implement book search, right? Not really. It’s easy to imagine several obvious variants: library search differs from Amazon’s e-commerce book search, which differs from searching historic books, which differs from the in-store kiosk at Barnes and Noble, which differs from a book search for small children (and on and on…). The discovery process differs tremendously in each of these cases, with radically different user goals, relevance expectations, and business incentives. In my experience, most firms don’t appreciate how unique their use case truly is. Because Solr and Elasticsearch are frameworks that give you control, you get to make all the careful decisions needed to build the right experience.

When the use case is well understood, and you simply want a capability such as “e-commerce search”, then you’re probably better off with a proprietary solution that targets your use case. Some companies are a slam dunk for Endeca. So why increase the cost by building out a development team to reinvent the Endeca wheel? Remember, open source is only as free as your developers’ time. Have you seen the average Solr/Elasticsearch developer’s salary these days? The value in open source is not reduced cost; it’s flexibility and innovation.

Reduce Total Cost of Ownership with Hosted Solr/Elasticsearch and Consulting

As an aside, there are two ways to reduce the cost of ownership with Solr and Elasticsearch that bring it closer to that of proprietary solutions. They give you the advantages of open source search (innovation) while managing one of the big downsides (development cost).

  1. Hosted Solr and Elasticsearch: The majority of firms shouldn’t be spending their developers’ time building out search infrastructure. Hosted Solr/Elasticsearch platforms like MeasuredSearch, Elastic Cloud, Bonsai, Websolr, OpenSolr, and the like take a great deal of the infrastructure development off your plate. They also offer operational support, freeing your team from 3AM phone calls.
  2. Consulting: OK, yes, I’m a consultant, so, errm, maybe I’m biased. But I really do think you can avoid costly mistakes by leveraging some consulting time during planning and implementation. Get a firm with a long history of focusing primarily on Solr and Elasticsearch (like us, Flax, or Sematext). They can cross-train your team, augment your staff, help develop an architecture, guide product decisions, and provide expertise in a pinch when something goes wrong.

Search is your unique value proposition? Choose Solr or Elasticsearch

If small increases in search relevance have big impacts on your business, then go where you have the most control: Solr and Elasticsearch. For example, in our case study with CareerBuilder, we were able to increase the job application rate by 3% through careful relevance tweaks. This is a big deal to their business! But it required careful, low-level work to get there. That’s the sort of control almost no proprietary solution will give you.

If something that small isn’t a big deal, and a proprietary solution covers your use case, why burden yourself with Solr or Elasticsearch? Why pull your hair out and build a big development team? You shouldn’t do it to reduce costs. If there’s no upside, don’t invest the incredible amount of effort to bootstrap your own search team and technology.

Search needs to happen at big scale? Choose Solr or Elasticsearch

This one is a close cousin of the last point. Perhaps it’s not so much that search is the core/unique value proposition, or that the user experience is unique, but that search happens at large scale. Is Apple’s Spotlight search a core value proposition to the company? Well, maybe not. But how many iPhones exist? Supposedly on the order of 1 billion. Servicing that much search requires incredible investments in custom infrastructure. You need open source so you can intimately understand the fine-grained, code-level details and carefully scale and test search. You may be working on your own fork of the project for scalability reasons. Or you may need project committers who can make contributions to the project to improve your scale use case.

In these cases, you can’t easily rely on a proprietary vendor to help you. You really need as much control and insight as possible, which means open source.

Bottom line

So given all these criteria, where do we stand?

To sum up: choose a proprietary solution when it is doing a good job of scratching a particular itch and you’re not innovating. Choose open source when (1) you’re doing something unique, (2) search is close to your core value, or (3) you’re operating at high scale. If you’re not in these situations, don’t give in to developer pressure! Don’t choose open source because you want to reduce total cost of ownership; do it because you want to innovate. Do it because the investment can pay tremendous, outsized dividends!
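
If it helps to see that rule of thumb written down as a checklist, here is a minimal sketch in Python. The class name, field names, and function are hypothetical, made up for illustration; the three yes/no questions are just the criteria from this article, and a real decision will involve far more nuance than three booleans.

```python
from dataclasses import dataclass


@dataclass
class SearchSituation:
    """Hypothetical yes/no answers to the three criteria in this article."""
    unique_user_experience: bool  # (1) you're doing something unique
    search_is_core_value: bool    # (2) search is close to your core value
    high_scale: bool              # (3) you're operating at high scale


def recommend(situation: SearchSituation) -> str:
    """Apply the article's rule of thumb; note that cost is deliberately not an input."""
    if (
        situation.unique_user_experience
        or situation.search_is_core_value
        or situation.high_scale
    ):
        return "open source"
    # None of the three criteria apply: don't give in to developer pressure.
    return "proprietary"


# Example: a job board where small relevance gains move the business.
print(recommend(SearchSituation(False, True, False)))
# Example: a well-understood e-commerce use case with nothing unusual about it.
print(recommend(SearchSituation(False, False, False)))
```

Notice what is missing from the inputs: there is no field for "we want to save money", because by the argument above that is not a good reason to pick open source.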

Let us help you choose!

Get in touch if we can help you with a product assessment or trusted-advisor consulting. With hundreds of Solr and Elasticsearch projects under our belt, we can find ways to add business value and avoid costly mistakes. And don’t hesitate to ping me directly if I’m missing anything in this article.