
September Chock Full of Talks (Dougtember?)

I somehow managed to line up a speaking gig for every week in September! I hope you'll join me on this insane marathon. I'll be talking about topics key to what we care about at OSC: search as a data structure, search relevancy, and search/big data at performance and scale. Don't hesitate to email me if you'd like to chat about any of these topics before or at the conference!

Here's what I've got lined up:

Modern Python Concurrency

PyCHO (Python Charlottesville Users Group), September 2nd, 6:30 PM, Center for Open Science, Charlottesville VA

Are you choosing the right model for your analytic and data-processing workload? Is it going to scale? Do you have the right concurrency model? Andrew Montalenti and I will be talking about Python and concurrency! I'm going to set the stage by talking about the primitives the OS exposes, as well as the current state of Python concurrency, including libraries such as gevent and Twisted. Andrew will introduce everyone to the new asyncio library that he's used to great effect on his Apache Storm library, streamparse.
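To give a taste of the asyncio style before the talk, here's a minimal sketch. It isn't taken from the talk or from streamparse; the `fetch` coroutine and its delays are hypothetical stand-ins for real I/O-bound work, and it uses the `asyncio.run` entry point from current Python releases.

```python
import asyncio


async def fetch(name, delay):
    # Hypothetical stand-in for an I/O-bound call (network, disk).
    # Awaiting the sleep hands control back to the event loop.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Run three simulated I/O tasks concurrently on a single thread.
    # gather returns results in the order the awaitables were passed in,
    # not the order in which they finished.
    return await asyncio.gather(
        fetch("a", 0.03),
        fetch("b", 0.01),
        fetch("c", 0.02),
    )


results = asyncio.run(main())
print(results)
```

The three waits overlap instead of running back to back, which is the kind of win you can expect when the workload is dominated by waiting on I/O rather than by CPU work.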

Test Driven Relevancy

NYC Solr/Lucene Meetup, September 10th, 6:30 PM, XO Group Inc, New York, New York

Do you struggle to maintain search relevancy over time? This is the talk I've given before on the ideas behind our product Quepid. This time I'll have Splainer to demo too! Here's the blurb:

Getting good search results is hard; maintaining good relevancy is even harder. Fixing one problem can easily create many others. Without good tools to measure the impact of relevancy changes, there's no way to know if the “fix” that you've developed will cause relevancy problems with other queries. Ideally, much like we have unit tests for code to detect when bugs are introduced, we would like to create ways to measure changes in relevancy. This is exactly what we've done at OpenSource Connections. We've developed a series of tools and practices that allow us to work with content experts to define metrics for search quality. Once defined, we can instantly measure the impact of modifying our relevancy strategy, allowing us to iterate quickly on very difficult relevancy problems. Get an in-depth look at the tools we utilize when we not only need to solve a relevancy problem, but also need to make sure it stays solved over the product's life.
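To make the "unit tests for relevancy" idea concrete, here's a rough sketch. It is not how Quepid works internally; the judgments, document IDs, `toy_search` function, and the 0.3 threshold are all made up for illustration, and precision@k is just one metric you might pick.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that a content expert judged relevant."""
    top = ranked_ids[:k]
    return sum(1 for doc_id in top if doc_id in relevant_ids) / k


# Hypothetical expert judgments: query -> set of relevant document IDs.
JUDGMENTS = {
    "laptop bag": {"d1", "d3"},
    "usb cable": {"d7"},
}

# Hypothetical canned rankings standing in for a real search engine.
CANNED_RESULTS = {
    "laptop bag": ["d1", "d2", "d3"],
    "usb cable": ["d7", "d8", "d9"],
}


def toy_search(query):
    # Replace with a call to your own search backend.
    return CANNED_RESULTS.get(query, [])


def relevancy_report(search, judgments, k):
    # Score every judged query so one relevancy tweak can be checked
    # against all of them at once.
    return {
        query: precision_at_k(search(query), relevant, k)
        for query, relevant in judgments.items()
    }


report = relevancy_report(toy_search, JUDGMENTS, k=3)

# Treat any query that drops below the threshold as a failing "test".
failing = [query for query, score in report.items() if score < 0.3]
print(report, failing)
```

Rerun the report after each relevancy change: if fixing "laptop bag" pushes "usb cable" below the threshold, you find out right away instead of hearing it from your users.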

Hacking Lucene for Custom Search Results

DC Solr/Lucene Meetup, September 18th, 6:00 PM, Comcast Labs

Stuck on a problem that might need very specific search ranking? This is my talk on working with custom search scoring and ranking. If you've ever been stuck on a tough search relevancy problem and think you need a custom solution, this talk's for you!

Here's the blurb:

Search is everywhere, and therefore so is Apache Lucene. While Lucene provides amazing out-of-the-box defaults, there are enough projects weird enough to require custom search scoring and ranking. In this talk, I'll walk through how to use Lucene to implement your custom scoring and search ranking. We'll see how you can achieve amazing power (and responsibility) over your search results. We'll see the flexibility of Lucene's data structures and explore the pros/cons of custom Lucene scoring vs other methods of improving search relevancy.

Elasticsearch Night!

Open Source Staunton, September 25th, 6:00 PM, LightCastle Technology Complex, downtown Staunton VA

New to search? Want to understand why everyone's so excited about Elasticsearch? My colleague, Daniel Beach, and I will be giving two short presentations introducing everyone to Elasticsearch! This is a chance to come out and learn the fundamentals of search engines as a technology while getting started with Elasticsearch. If you want to begin using Elasticsearch for a new project, or have any questions about why search engines are so great, come out to this talk!

I hope you'll come out to find me for these talks. Please feel free to email me if there are any search problems you'd like to chat about!

Cheers!