Haystack LIVE! Understanding Scoring Through Examples

Our speaker this week is Rudi Seitz, from KMW Technology.

The talk aims to teach you, in a short time and without any math, everything you’ll ever need to know about scoring in search. Having a solid understanding of scoring will prepare you to better diagnose relevance problems and improve relevance in real-world applications.

The built-in scoring mechanism in Elasticsearch and Solr can seem mysterious to beginners and experienced practitioners alike. Instead of delving into the mathematical definitions of TFxIDF and BM25, this talk will help you develop an intuitive understanding of these metrics by walking you through a series of simple examples.

Each example consists of a query and a list of several indexed documents. You will be invited to guess which document comes out on top for each query. In each case, we will examine why that particular document gets the highest score and extract the general principle behind this behavior.

A set of six examples will be followed by an “extra credit” section focusing on more advanced topics. Along with illustrating all of the key behaviors of BM25, our examples will touch on some of the “gotchas” around scoring in a cluster scenario, where shards and replicas come into play.

See previous talks from the Haystack LIVE! series on our YouTube channel.

