I somehow managed to line up a speaking gig for every week in September! I hope you’ll join me on this insane marathon. I’ll be talking about topics key to what we care about at OSC: search as a data structure, search relevancy, and search and big data at performance and scale. Don’t hesitate to email me if you’d like to chat about any of these topics before or at any of these events!
Here’s what I’ve got lined up:
Modern Python Concurrency
PyCHO (Python Charlottesville Users Group) September 2nd, 6:30 PM, Center for Open Science, Charlottesville VA
Are you choosing the right model for your analytic and data-processing workload? Is it going to scale? Do you have the right concurrency model? Andrew Montalenti and I will be talking about Python and concurrency! I’ll be setting the stage by talking about the primitives the OS exposes, as well as the current state of Python concurrency, including libraries such as gevent and Twisted. Andrew will introduce everyone to the new asyncio library that he’s used to great effect on his Apache Storm library, streamparse.
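To give a flavor of the asyncio style the talk covers, here is a minimal sketch of two coroutines running concurrently on one event loop. The `fetch` coroutine and its delays are made up for illustration and stand in for real I/O. The sketch also assumes a recent Python (the `async`/`await` syntax and `asyncio.run` arrived after this post was written), so it is not necessarily what the talk itself will show.

```python
import asyncio


async def fetch(name, delay):
    # Hypothetical stand-in for an I/O-bound call (HTTP request, DB query...).
    # While this coroutine sleeps, the event loop is free to run others.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() schedules both coroutines concurrently and returns their
    # results in the order the coroutines were passed in.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both waits overlap, so the total time is roughly the longest delay rather than the sum of the two.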
Test Driven Relevancy
NYC Solr/Lucene Meetup September 10th, 6:30 PM, XO Group Inc, New York, New York
Getting good search results is hard; maintaining good relevancy is even harder. Fixing one problem can easily create many others. Without good tools to measure the impact of relevancy changes, there’s no way to know if the “fix” that you’ve developed will cause relevancy problems with other queries. Ideally, much like we have unit tests for code to detect when bugs are introduced, we would like to create ways to measure changes in relevancy. This is exactly what we’ve done at OpenSource Connections. We’ve developed a series of tools and practices that allow us to work with content experts to define metrics for search quality. Once defined, we can instantly measure the impact of modifying our relevancy strategy, allowing us to iterate quickly on very difficult relevancy problems. Get an in-depth look at the tools we use when we need not only to solve a relevancy problem, but also to make sure it stays solved over the product’s life.
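As a rough illustration of the "unit tests for relevancy" idea, here is a minimal sketch that scores each query against expert judgments and reports the queries that fall below a threshold. Everything in it is an assumption made for the example: the precision-at-k metric, the `judgments` format, the document ids, and the `fake_search` function standing in for a real search engine. It is not a picture of the tools presented in the talk.

```python
def precision_at_k(results, relevant, k=10):
    """Fraction of the top-k results that experts judged relevant."""
    top = results[:k]
    if not top:
        return 0.0
    hits = sum(1 for doc_id in top if doc_id in relevant)
    return hits / len(top)


def check_relevancy(search_fn, judgments, k=10, threshold=0.5):
    """Return {query: score} for every query scoring below the threshold."""
    failures = {}
    for query, relevant in judgments.items():
        score = precision_at_k(search_fn(query), relevant, k)
        if score < threshold:
            failures[query] = score
    return failures


# Hypothetical expert judgments: query -> set of relevant document ids.
JUDGMENTS = {
    "laptop bag": {"d1", "d2"},
    "usb cable": {"d7"},
}

# Hypothetical canned results standing in for a real search engine.
FAKE_INDEX = {
    "laptop bag": ["d1", "d9", "d2"],
    "usb cable": ["d3", "d4", "d5"],
}


def fake_search(query):
    return FAKE_INDEX.get(query, [])


if __name__ == "__main__":
    print(check_relevancy(fake_search, JUDGMENTS, k=3))
```

Rerunning a check like this after every relevancy change shows right away whether a fix for one query has quietly broken another.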
Hacking Lucene for Custom Search Results
DC Solr/Lucene Meetup September 18th, 6:00 PM, Comcast Labs
Stuck on a problem that might need very specific search ranking? This is my talk on working with custom search scoring and ranking. If you’ve ever been stuck on a tough search relevancy problem and think you need a custom solution, this talk’s for you!
Here’s the blurb:
Search is everywhere, and therefore so is Apache Lucene. While Lucene provides amazing out-of-the-box defaults, there are enough projects weird enough to require custom search scoring and ranking. In this talk, I’ll walk through how to use Lucene to implement your own custom scoring and search ranking. We’ll see how you can gain amazing power over (and responsibility for) your search results. We’ll see the flexibility of Lucene’s data structures and explore the pros and cons of custom Lucene scoring versus other methods of improving search relevancy.
Open Source Staunton, September 25th, 6:00 PM, LightCastle Technology Complex, downtown Staunton VA
New to search? Want to understand why everyone’s so excited about Elasticsearch? My colleague, Daniel Beach, and I will be giving two short presentations introducing everyone to Elasticsearch! This is a chance to come out and learn the fundamentals of search engines as a technology while getting started with Elasticsearch. If you want to begin using Elasticsearch for a new project, or have any questions about why search engines are so great, come out to this talk!
I hope you’ll come out to find me at these talks. Please feel free to email me if there are any search problems you’d like to chat about!