Last week was the crucial week for hitting our goals on my current Lucene -> Solr conversion project. A lot of work from the previous couple of weeks came together. I wanted to take a couple of minutes and record some of the little things that I've been learning about:
Sunspot is the up-and-coming solution for integrating Solr into Ruby on Rails, and fortunately enough, the 1.0 release (followed quickly by 1.0.1!) came out just last week. Between acts_as_solr and Sunspot, Sunspot wins hands down for its support of master/slave Solr configurations, embedded Solr for testing, richer indexing semantics, and not being tied to ActiveRecord. The companion sunspot_rails gem does give wonderful ActiveRecord integration, however.
Solr cores are the bee's knees! We've built a simple RoR webapp using HTTParty and the Solr API that lets you perform all the admin functions for cores and quickly clone a core for your own nefarious purposes! It simplifies hacking around with a new schema or configuration without having a local copy of Solr running, and it allows multiple QA environments to potentially share a single Solr infrastructure.
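To give a feel for what such a tool talks to, here is a minimal sketch of building Solr CoreAdmin request URLs. It is not the code from our webapp: the base URL, the helper names, and the core names are assumptions for illustration, and it only builds the URLs with the standard library so it runs without Solr. With HTTParty you would pass each URL to `HTTParty.get`. How the clone step copies a core's configuration is not covered here.

```ruby
require "uri"

# Hypothetical base URL; adjust host, port, and path for your Solr install.
SOLR_BASE = "http://localhost:8983/solr"

# Build a CoreAdmin URL for the given action plus any extra parameters.
def core_admin_url(action, params = {})
  query = URI.encode_www_form({ "action" => action }.merge(params))
  "#{SOLR_BASE}/admin/cores?#{query}"
end

# STATUS for one core, or for all cores when no name is given.
def status_url(core = nil)
  core ? core_admin_url("STATUS", "core" => core) : core_admin_url("STATUS")
end

# CREATE registers a new core backed by an existing instance directory.
def create_url(name, instance_dir)
  core_admin_url("CREATE", "name" => name, "instanceDir" => instance_dir)
end

# RELOAD picks up schema or config changes for a running core.
def reload_url(core)
  core_admin_url("RELOAD", "core" => core)
end

# UNLOAD removes a core from the running Solr instance.
def unload_url(core)
  core_admin_url("UNLOAD", "core" => core)
end

# Example with made-up core names:
puts status_url
puts create_url("qa1", "qa1")
puts reload_url("qa1")
```

Keeping URL construction separate from the HTTP call makes the admin actions easy to check without a live Solr behind them.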
Solr master and slave setup in a single VM. While pointless from a scaling perspective, it's a really great way to work out the kinks! It's funny to see a slave core polling the same Solr VM it's in for updated segments!
JBoss doesn't suck after all. Actually, maybe I should say that JBoss, when combined with JRuby, doesn't suck so much. I had the aforementioned Solr core admin tool bundled up as a WAR file with JRuby, and was able to deploy it to an existing environment that had JBoss installed! I didn't have to install Ruby on the box (or JRuby for that matter!). I just deployed the WAR file and bam, off to the races. Ops folks get the JBoss they love, I get the Ruby on Rails that I love.
And on a related note, Warbler was the key to thinking JRuby is cool. I'd never actually had to package up a RoR app, so Warbler came to the rescue. And you know what? It was nice to build a single file that I knew had everything I needed in it and that could be scp'ed around! And thanks to some cool code in environment.rb, my app was able to load the right configuration file for the environment based on an environment variable set in JBoss.
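The environment.rb trick boils down to something like the sketch below. This is an illustration, not the code from my app: the variable name `APP_ENV`, the fallback environment, and the `solr_admin.<env>.yml` file naming are all assumptions. Under JRuby inside JBoss the value might also arrive as a Java system property rather than a process environment variable, which this sketch does not handle.

```ruby
require "yaml"

# Hypothetical names; the real variable and file layout will differ.
ENV_VAR = "APP_ENV"
DEFAULT_ENV = "development"

# Work out which config file matches the current environment.
# `env` defaults to the process environment but accepts any hash-like object.
def config_path(config_dir, env = ENV)
  name = env.fetch(ENV_VAR, DEFAULT_ENV)
  File.join(config_dir, "solr_admin.#{name}.yml")
end

# Load the matching YAML file, failing loudly if it is missing.
def load_config(config_dir, env = ENV)
  path = config_path(config_dir, env)
  raise ArgumentError, "No config file at #{path}" unless File.exist?(path)
  YAML.load_file(path)
end

# Example: show which file would be picked for a made-up "qa" environment.
puts config_path("config", { "APP_ENV" => "qa" })
```

Passing the environment in as an argument keeps the lookup easy to exercise without touching the real server settings.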
I recently migrated a Linux VPS-based RoR + Solr app (see a trend in tech choices ;) ) to a Windows environment. And to deliver the new Windows environment, I used VirtualBox to host the Windows Vista environment on my Mac laptop.
A couple of notes:
- VirtualBox may not have all the snazzy integration points with the host computer that Parallels has, like seamless application sharing, but it seems to be much lighter weight. It starts up quicker, and I don't get the spinning beach ball of death as much.
- If you are shipping an 11 GB file, you can't use a 16 GB USB memory stick… Turns out the biggest file it will hold is 4 GB, presumably because the stick was formatted as FAT32. (I never tried formatting the stick as NTFS; maybe that would have allowed a single 11 GB file???)
- Uploading 11 GB to a remote server out on the internet will take a long, long, long time. Even on a really fast network connection.
- If you need to format an external USB hard drive as NTFS on a Mac, it is possible! Just fire up your trusty Windows Vista image in Parallels, plug the USB drive in, download and install the correct USB drivers so the drive doesn't show up as a network share mapped to the Mac, and then use the built-in reformatting tools! Warning: This will take a loooong time!
- Lastly, if you are using VirtualBox, and you attempt to create a Windows XP machine and attach a Windows Vista hard disk image to it, VirtualBox will let you! And then Windows won't start. Sigh.
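On the 4 GB limit from the USB stick note above: one workaround I did not try is splitting the image into pieces that each fit under the limit, then joining them on the other side. A rough sketch, assuming a FAT32 stick (whose per-file limit is 4 GiB minus one byte). The 3 GiB chunk size, the `.partNNN` naming, and the helper names are arbitrary choices for illustration.

```ruby
# Stay comfortably below the FAT32 per-file limit of 4 GiB minus one byte.
DEFAULT_CHUNK_BYTES = 3 * 1024**3
# Copy in 1 MiB reads so a whole chunk is never held in memory at once.
BUFFER_BYTES = 1024 * 1024

# How many pieces a file of total_bytes needs (ceiling division).
def chunks_needed(total_bytes, chunk_bytes = DEFAULT_CHUNK_BYTES)
  (total_bytes + chunk_bytes - 1) / chunk_bytes
end

# Split the file at `path` into numbered parts next to it.
# Returns the list of part file names in order.
def split_file(path, chunk_bytes = DEFAULT_CHUNK_BYTES)
  parts = []
  File.open(path, "rb") do |input|
    until input.eof?
      part = format("%s.part%03d", path, parts.size)
      File.open(part, "wb") do |out|
        remaining = chunk_bytes
        while remaining > 0 && (data = input.read([BUFFER_BYTES, remaining].min))
          out.write(data)
          remaining -= data.bytesize
        end
      end
      parts << part
    end
  end
  parts
end

# An 11 GiB image in 3 GiB pieces:
puts chunks_needed(11 * 1024**3)
```

Joining the parts back together in order on the receiving machine restores the original file, so the parts need to be copied over with their names intact.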