You may remember I was ever so slightly enthusiastic about last year’s inaugural Haystack conference in Charlottesville, Virginia – it would perhaps be unseemly to be so effusive again, since I was part of the team running this year’s event after joining OSC earlier this year. However, from what I’ve heard we managed to avoid ‘difficult second album syndrome’ and run another great conference.
This year the venue was a cinema in downtown Charlottesville, which gave us much needed extra space and easier access to the Downtown Mall and its array of restaurants, snack shops and bars. Plus points included reclining seats, an onsite cafe and some very big screens, although we did discover some issues with WiFi coverage (perhaps an aid to concentration however?) and the movie projector didn’t always play nice with presenters’ laptops. We’ll sort this out for next time I’m sure – of course the affected presenters were professionals and coped admirably with the glitches. Also, I’m hoping none of the conference attendees felt they missed out on seeing Avengers Endgame, on show in one of the other theatres…but just in case they did I’ll introduce some of the marvel-lous characters we saw onstage at Haystack.
The first day was introduced by Max “Ironman” Irwin of OSC, who gave us a keynote on What is Search Relevance? Max showed us the three aspects of search quality: performance, experience and of course relevance, and went on to discuss how we can score judgements, cope with disagreements between human raters and fold in user engagement data. He also showed us a list of the speakers to come and welcomed over 140 attendees from the USA and Europe to Haystack.
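To make the point about disagreeing raters a little more concrete, here is a minimal sketch of one common way to measure agreement between two raters, Cohen's kappa. This is my own illustration rather than anything taken from Max's slides; the function name and the sample ratings are made up.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Agreement between two raters, corrected for agreement expected by chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same, non-empty set of results")
    n = len(rater_a)

    # Fraction of results on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected if each rater labelled at random with their own label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n) for label in freq_a)

    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgements: 1 = relevant, 0 = not relevant, for six query/result pairs.
ratings_a = [1, 0, 1, 1, 0, 1]
ratings_b = [1, 0, 0, 1, 0, 1]
kappa = cohens_kappa(ratings_a, ratings_b)
```

A kappa close to 1 suggests the raters largely agree, while a value near 0 means they agree about as often as chance would predict, which is usually a hint that the judging guidelines need work.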
The next talk I saw was by Alessandro “Dr Strange” Benedetti of Sease Ltd. (OK, I’ll stop the Avengers references now before I imply one of our speakers was green and angry) on the Rated Ranking Evaluator relevance testing tool. He showed us the hierarchical model for test queries they have developed and how the open source RRE can be used to run a large number of tests on a Solr or Elasticsearch instance as part of the Maven build process, producing a set of relevance metrics. These metrics in turn can be emitted to a spreadsheet, RRE’s own server dashboard or as JSON (RRE also uses JSON for the relevance judgements that must be provided to it).
Tara Diedrichsen & Tito Sierra of LexisNexis followed with a fascinating talk on best practices for gathering human judgements for relevance testing. It’s clear that LexisNexis have put huge amounts of work into this area to help them identify problem areas to focus on and to evaluate new algorithms. I’m pleased they stressed that it’s important to record why a search result is good or bad – this is essential information for relevance engineers who may be unfamiliar with the subject area.
Lunch followed, and conference attendees scattered to the various restaurants on the Downtown Mall – luckily as far as I can tell they all came back afterwards. The next talk I saw came from René Kriegler on Query Relaxation which was fascinating – René showed us various ways to remove terms from a query to increase the number of results and eventually suggested using a neural network to work out the best term to lose.
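For anyone who wasn't there, the simplest form of the idea can be sketched in a few lines of Python: when a query returns nothing, try dropping each term in turn and keep the variant that brings back the most results. This is only an illustration under my own assumptions, not René's implementation; the toy documents, `count_hits` and `relax_once` are all invented, and a real system would ask the search engine for hit counts (or, as René suggested, use a model to predict which term to drop).

```python
from typing import List, Optional, Sequence, Tuple

# Toy corpus standing in for a real search index (hypothetical data).
DOCS = [
    "red leather laptop bag",
    "black laptop bag",
    "red canvas backpack",
    "leather wallet",
]


def count_hits(terms: Sequence[str]) -> int:
    """Count documents that contain every query term (strict AND matching)."""
    return sum(all(term in doc.split() for term in terms) for doc in DOCS)


def relax_once(terms: List[str]) -> Optional[Tuple[str, List[str], int]]:
    """Try dropping each term in turn; return (dropped term, remaining terms, hits)
    for the single-term drop that yields the most hits, or None if the query
    has fewer than two terms and cannot be relaxed."""
    if len(terms) < 2:
        return None
    candidates = []
    for i, dropped in enumerate(terms):
        remaining = terms[:i] + terms[i + 1:]
        candidates.append((dropped, remaining, count_hits(remaining)))
    # On ties, max() keeps the earliest candidate, i.e. the leftmost dropped term.
    return max(candidates, key=lambda candidate: candidate[2])


query = ["red", "suede", "laptop", "bag"]
original_hits = count_hits(query)  # nothing in the toy corpus is suede
best = relax_once(query)
```

Here the full query matches nothing, and dropping "suede" is the only single-term relaxation that finds a document. Trying every term is fine for short queries, but it costs one extra search per term, which is part of why predicting the term to drop is attractive.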
Unfortunately I missed the next session as I was preparing to run the Lightning Talks, our last session of the day. The Lightning Talks started with a moving tribute to Ted Sullivan by his friend and colleague Erik Hatcher. Sadly we lost Ted this year; I was very privileged to have met him at last year’s Haystack.
The talks featured speakers on subjects including Zookeeper on AWS, the new Quaerite relevance test tool, Solr on Kubernetes and the challenges of full text search at the Hathi Trust over 17 million documents. Thanks to everyone who volunteered to speak at such short notice!
You’ll be glad to know we will be releasing the slides for all the main talks and the Lightning Talks very soon, and unlike last year we managed to video all the sessions – so anything you (or I) missed (or simply didn’t understand well enough at the time) will be available to peruse at your leisure. UPDATE – The slides & video are now available here – click the ‘More’ link on each talk to see them.
The first day of Haystack finished with drinks and dinner at Kardinal Hall nearby, during which a few attendees played Bocce (although stupidly I thought it was boules). I’ll write about the second day very soon! If you’d like a richer description of the first day including some of the talks I missed please do read Jettro Coenradie’s blog.
If you missed out on Haystack there will be a European event on October 28th in Berlin (details to be confirmed) – and if you have questions about this or indeed anything search or relevance related, please do contact us.