BM25F in Lucene with BlendedTermQuery

Doug Turnbull, October 19, 2016

As part of the London hack days, Diego Ceccarelli started a BM25F implementation. I picked it up at Lucene Revolution's Lucene hackathon. I realized, though, that when you break down the problem, BM25F can be implemented using existing Lucene bits, including the existing BM25Similarity and the BlendedTermQuery.

event

Lucene Revolution

October 11, 2016

*The* conference focused on Solr; we've made the pilgrimage since 2010.

event

beCamp!

Eric Pugh, September 16, 2016

If you're a geek in or around the Charlottesville metroplex or even if you're merely tech-curious, this is the event you don't want to miss. beCamp is Charlottesville's version of the BarCamp unconference phenomenon: organized on the fly by attendees, for attendees.

High-Quality Recommendation Systems with Elasticsearch

Doug Turnbull, September 9, 2016

Let's explore how to deliver great recommendations with Elasticsearch. In this article, we dive into an aggregations-based method for Elasticsearch recommendations. We attempt to understand the mechanics and assumptions of the underlying JLH scoring method.

event

Cassandra Summit

Eric Pugh, September 7, 2016

Eric will be giving a short talk on how to break the Cassandra data modeling straitjacket with DSE Search.

podcast

How to Practice Search Relevancy

Matt Overstreet, August 14, 2016

How to build a search practice with Doug Turnbull, Matt Overstreet and Scott Stultz. Doug released a companion to Relevant Search about how to actually practice good search. We discuss it.

Content expert's guide to diagnosing site search relevance problems

Doug Turnbull, August 8, 2016

In this series of articles, I want to give you, the content person, a very practical and straightforward guide to managing site search. We'll start by discussing diagnosis -- how to find problems. We'll use a simple, free analytics tool (Google Analytics). We'll make a few naive assumptions about these analytics that act as a good starting point.

Deploying AACT Oracle Dump File into the Cloud with Docker

Eric Pugh, June 30, 2016

ClinicalTrials.gov is a wealth of information. But the only database format they support is an Oracle dmp file. Follow along as I help our data science intern answer hard questions about ethnic diversity in clinical trials by deploying Oracle using Docker Cloud.

Top 7 Mistakes Organizations Make With Search

Doug Turnbull, June 29, 2016

After much sweat and tears, our book Relevant Search is out! Relevant Search reflects the wisdom we've acquired over the years helping many clients improve search. I thought it would be an appropriate time to recap where many organizations get stuck with search.

podcast

Algolia Search

Matt Overstreet, May 31, 2016

Julien Lemoine, CTO of Algolia, joins us to talk about how they have implemented text search from scratch.

event

DevIgnition

Doug Turnbull, April 29, 2016

Doug will speak at the annual Washington, DC area developers conference about search relevance!