Stepwise Date Boosting in Solr

When you want to boost on recency of content (i.e., more recently published documents before older ones), the Solr function query documentation gives you a basic date boost:

boost=recip(ms(NOW,mydatefield),3.16e-11,1,1)

This will give you a nicely curving multiplicative downboost where the latest documents come first (multiplying the relevancy score by roughly 1 for documents dated “NOW”), slowly sloping off into the past (multiplying the relevancy score by some decreasing value).
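To get a feel for the shape of that curve, here is a small Python sketch of recip(x,m,a,b), which the Solr function query documentation defines as a / (m*x + b). The helper name recency_boost and the sample ages are mine, chosen for illustration; Solr evaluates the real thing per document at query time.

```python
# Sketch of Solr's recip(x, m, a, b) = a / (m*x + b), for illustration only.
MS_PER_YEAR = 3.16e10  # roughly one year in milliseconds, the inverse of 3.16e-11


def recip(x, m, a, b):
    """Mirror of Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms):
    """Hypothetical helper: boost multiplier for a document age_ms milliseconds old."""
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 1, 10):
    print(years, round(recency_boost(years * MS_PER_YEAR), 3))
```

A brand-new document keeps its score, a one-year-old document has its score roughly halved, and a ten-year-old document is multiplied by a bit under 0.1.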

Let’s say instead of a nice curving boost, you’d just like to downboost anything that happens before some date in the past. Say you set a date 10 years ago: anything older than that should get a significant downboost, and anything newer should not be impacted.

Well, you can do this with Solr function queries!

Let’s start with how you might naturally express this problem in code:

if mydatefield > dateInPast
    return 1.0
else
    return 0.8 // some downboost multiplier

Unfortunately, we’re hamstrung by the fact that Solr function queries don’t have a greater than or less than operator. Function queries do have an if function that tests if the value is non-zero. Both negative and positive values are considered true. Can we use some of the other things available in function queries to get the effect we desire?

First, let’s calculate a date in the past to compare against. We’ll call this “marker” because, well, it marks something. Here we’ll use 10 years ago (315569259747 is 10 years in milliseconds).

// 10 years ago
marker = sub(ms(NOW),315569259747)

This translates to English as NOW (as milliseconds since epoch) minus 10 years in milliseconds.
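As a sanity check on that constant, here is a quick Python sketch. It assumes a year of about 31,556,925.9747 seconds (roughly 365.2422 days); that is my guess at where the number comes from, since the post does not say. The variable names are illustrative.

```python
import time

# Assumed year length in seconds (about 365.2422 days); not stated in the post.
SECONDS_PER_YEAR = 31_556_925.9747

ten_years_ms = round(10 * SECONDS_PER_YEAR * 1000)

# Python equivalent of sub(ms(NOW),315569259747):
# now, in milliseconds since the epoch, minus ten years in milliseconds.
now_ms = int(time.time() * 1000)
marker_ms = now_ms - ten_years_ms

print(ten_years_ms)
```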

Now we just need to figure out a way to do a comparison. First, let’s observe that if we take our date field and subtract our marker from it, we’ll get either a positive value (the date falls after our marker) or a negative value (the date falls before our marker). In other words, with a marker of 2004/01/01:

2010/01/01  -  2004/01/01    == (some positive number)
2001/01/01  -  2004/01/01    == (some negative number)

If we take the min of this subtraction and 0, we’re left with one of two things: 0, which indicates that the subtraction was not negative (and thus mydatefield is at or after the marker), or a negative number (mydatefield is before the marker).

This dovetails nicely into Solr’s if. Once we take this min, we can use the 0 (the falsy value) to know that mydatefield is at or after the marker. So we put a multiplicative boost that has no impact (1) in the false parameter, and the downboost in the true parameter (negative values are still truthy).

Putting that together, you have something like:

boost=if(min(0, mydatefield-marker),0.8,1.0)

Expressing (mydatefield - marker) as a function query:

sub(ms(mydatefield),sub(ms(NOW),315569259747))

Putting it all together:

if(min(0,sub(ms(mydatefield),sub(ms(NOW),315569259747))),0.8,1)
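If you want to convince yourself the nesting does what we expect, here is a minimal Python sketch that mirrors the expression piece by piece. The functions ms, sub and solr_if are my stand-ins for the Solr functions of the same name, and NOW is passed in explicitly so the result is repeatable; none of this is Solr code.

```python
from datetime import datetime, timezone

TEN_YEARS_MS = 315569259747


def ms(dt):
    """Milliseconds since the epoch, standing in for Solr's ms(date)."""
    return int(dt.timestamp() * 1000)


def sub(x, y):
    """Standing in for Solr's sub(x,y)."""
    return x - y


def solr_if(test, true_value, false_value):
    """Like Solr's if(): any non-zero test value, negative included, counts as true."""
    return true_value if test != 0 else false_value


def stepwise_boost(mydatefield, now):
    # if(min(0,sub(ms(mydatefield),sub(ms(NOW),315569259747))),0.8,1)
    marker = sub(ms(now), TEN_YEARS_MS)
    return solr_if(min(0, sub(ms(mydatefield), marker)), 0.8, 1)


now = datetime(2024, 1, 1, tzinfo=timezone.utc)  # fixed "NOW" for the example
recent = datetime(2020, 1, 1, tzinfo=timezone.utc)  # newer than the marker
old = datetime(2001, 1, 1, tzinfo=timezone.utc)  # older than the marker

print(stepwise_boost(recent, now))
print(stepwise_boost(old, now))
```

The recent document comes back with a multiplier of 1 and the old one with 0.8. A document dated exactly at the marker also gets 1, since the subtraction is 0 and 0 is falsy.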

And voilà, you have a multiplicative downboost of 0.8 for anything older than 10 years!

Oddly enough, when I was writing this post I realized that 10 years ago from NOW (i.e., sub(ms(NOW),315569259747)) was when I met and started dating my wife. Nowadays, she frequently edits my blog posts against my will. So I’ll dedicate this post to her and all the great stuff she does for our family!

And as always, please contact us to discuss how you can utilize our expertise in improving the relevancy of your Solr search results!