Vespa vs Lucene: First Impressions
As we learn more about Vespa, we wanted to give our initial impressions of how it compares to Lucene-based search (Solr/Elasticsearch). This is based on initial passes with Vespa and our…
Responding to queries takes CPU time, memory, and in unfortunate cases, wall time as well. Increasing the power of a cluster helps, but over-provisioning can be very expensive. Caching is…
Searching for non-ASCII characters can be a challenge. There are a number of reasons for doing so, even in a primarily English corpus: Accented characters in names and words…
As part of the London hack days Diego Ceccarelli started a BM25F implementation. I began to continue it at Lucene Revolution’s Lucene hackathon. I realized though that when you…
I want to share what I know at your company’s lunch and learn! For free! The problem with lunch and learns is that everyone’s busy. It’s challenging to find…
After getting cranky on one Algolia blog post, and having a Search Disco episode with Julien Lemoine, CTO of Algolia, I’m left fascinated by the solution. Algolia, so…
Quite a while ago Flax released Luwak as a document monitoring and alerting library. It was designed to solve the problem of running a lot of predetermined queries…
Every one of us at OSC looks forward to Lucene Revolution. It’s one of the few conferences we attend where everyone understands search at a deep level. That means…
There’s something new cooking in how Lucene scores text. Instead of the traditional “TF*IDF,” Lucene just switched to something called BM25 in trunk. That means a new scoring formula…
VLDS Insights Conference Recap
Yesterday I had the privilege of attending the VLDS Insights conference. VLDS is the Virginia Longitudinal Data System, which provides researchers and policy makers anonymized…
We’re pleased to announce that Chapters 4 and 5 are available for early access for Relevant Search! Please read and give us feedback. This is early access for a…
I often want to intercept the complete Solr updates sent to Solr in a format I can use offline. Clients have complex ingestion systems. I shouldn’t need to have…