CodeFloat 2012 “by the numbers”

For the past year I've been talking with Christopher Ball and Erik Hatcher about how frustrating it is to have to carve time out of our regular day jobs to work on Solr and Lucene. Thanks to their enthusiasm for the idea of spending a couple of days hacking on Solr and Lucene, last weekend OSC hosted folks who spent two days writing code and learning from each other. And since hacking all day Friday and Saturday burns you out, some of us went tubing on Sunday!

Here is what happened, “by the numbers”:

Attendees: 12 – Eric, Scott, Matt, John, Kasey, David, Joseph, Erik, Anthony, Jessica, Christopher, Jake

Visitors who stopped in for some search chat: 2

JIRAs Closed: 4 – SOLR-358, SOLR-3051, SOLR-1486, SOLR-3292

Commits Made: 3 – SOLR-1280, LUCENE-4265, LUCENE-4266, SOLR-3648

Wiki Pages Updated: 2

Motto for the weekend: “It's literally going to take at most an hour. Make that 18 minutes.” – David Dodge

We had Show and Tell each day, and here is what we saw:

  1. A “single page” search app using EmberJS from Matt Overstreet that demonstrated the awesome data binding aspects of EmberJS. Instant Search results were as simple as binding the results pane to the query box and issuing queries to Solr.
  2. Erik Hatcher put together a demo of the ScriptUpdateProcessor that lets you use a scripting language (JavaScript, Ruby, Python) to manipulate incoming documents. The ability to rapidly prototype ideas through this tool is exciting. More information at http://wiki.apache.org/solr/ScriptUpdateProcessor.
  3. Anthony Burton walked us through some challenges he was having using XPath with DataImportHandler to index XML documents. We all debugged the issues as a group, with David Dodge having the key insight into the XPath pattern that was required to make it all work. Lots of discussion on the pros and cons of DIH!
  4. I attempted to demo using Apache UIMA with Solr. Jessica Bonnie and I hacked on the example app documented in the wiki at http://wiki.apache.org/solr/SolrUIMA, and at the end of it I had the WhitespaceTokenizer working through the UIMA framework. However, I had no luck getting the more interesting AlchemyAPI and OpenCalais integrations to work.
  5. John used node.js to index the public data from StackOverflow directly into Solr via Solr's JSON updater.
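
If you are curious what indexing through the JSON updater (item 5) looks like, here is a rough sketch. John's code was node.js and is not reproduced here; this is a Python stand-in. The Solr URL, the core name `collection1`, and the field names `id`, `title`, and `tags` are all assumptions for illustration, and your schema will differ. It posts a JSON array of documents to the `/update` endpoint with a JSON content type, which is how recent Solr versions accept JSON; older releases may expect a different path.

```python
import json
import urllib.request


def build_update_request(solr_url, docs, commit=True):
    """Build (but do not send) a POST request that adds docs to Solr as JSON."""
    url = solr_url.rstrip("/") + "/update"
    if commit:
        # Ask Solr to commit right away so the documents become searchable.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents; real field names depend on your Solr schema.
posts = [
    {"id": "1", "title": "How do I boost a field?", "tags": ["solr", "search"]},
    {"id": "2", "title": "Tokenizer for product codes", "tags": ["lucene"]},
]

# Assumed local Solr URL and core name.
request = build_update_request("http://localhost:8983/solr/collection1", posts)

# To actually send it to a running Solr instance:
# urllib.request.urlopen(request)
```

Committing on every request is convenient for a demo like this; for a large data set you would probably batch documents and commit far less often.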

And then we floated!