Mar 09-11

Daniel will be at Elastic{ON}15. He builds client-side search applications for Elasticsearch.

Strata Conf 2015

Feb 17-20

Doug will be talking about "Database History from Codd to Brewer and Beyond".

A First Look at VisualOps

Scott Stults - January 22, 2015

VisualOps looks to be a great time-saver for managing AWS architecture, and it scratches an itch I've been having for quite a while.

Ad-hoc Solr Monitoring

Jody White — January 16, 2015

Hacking together Solr monitoring using Easy Auto Refresh (Chrome Plugin) and the command line

Using SolrJ CloudSolrServer and retrieving JSON

Eric Pugh - January 8, 2015

SolrCloud gives you HA capabilities for your Solr setup, but currently only the SolrJ client supports SolrCloud natively, and it returns Java objects. Here is how to return JSON-formatted results instead.

Quepid: Write Tests Against Your Search Results

Doug Turnbull - December 9, 2014

Quepid is our “Test Driven Search Relevancy” workbench product actively used by several clients. What do we mean by test-driven relevancy? We want to give you the ability to iterate quickly when creating a search solution. Sometimes the correctness of search results is fuzzy — based on how users or domain experts grade search results. Quepid has supported this since day one.

Apache Sentry. So close, and yet nothing.

Eric Pugh - December 2, 2014

Security: it’s always been the bugaboo of Solr. There is a wide sense that security isn’t a concern of the Solr community, and that isn’t quite accurate. How to secure Solr is pretty simple. It’s just that there isn’t any one “blessed” approach that is wrapped into the codebase, as each organization’s needs are different.

Stepwise Date Boosting in Solr

Doug Turnbull - November 26, 2014

When you want to boost on recency of content (i.e., more recently published documents before older ones), the Solr function query documentation gives you a basic date boost:
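As a rough illustration of what a recency boost of this kind computes, the sketch below assumes the reciprocal form `recip(ms(NOW,mydatefield),3.16e-11,1,1)` shown in Solr's function query documentation, where `recip(x,m,a,b)` evaluates to `a / (m*x + b)`. The Python helper names (`recip`, `date_boost`) and the sample ages are illustrative assumptions, not code from the post.

```python
# Approximate number of milliseconds in a year (about 3.16e10),
# which is why the Solr example uses m = 3.16e-11 (roughly 1 / ms-per-year).
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000


def recip(x, m, a, b):
    """Mirror of Solr's recip(x, m, a, b) function: a / (m * x + b)."""
    return a / (m * x + b)


def date_boost(age_ms, m=3.16e-11, a=1.0, b=1.0):
    """Boost for a document published age_ms milliseconds ago.

    Stands in for recip(ms(NOW,mydatefield),3.16e-11,1,1); the field
    name mydatefield is a placeholder, not a real schema field.
    """
    return recip(age_ms, m, a, b)


if __name__ == "__main__":
    # Hypothetical document ages, in years, to show how the boost decays.
    for years in (0, 1, 2, 5):
        print(years, round(date_boost(years * MS_PER_YEAR), 3))
```

With these constants a brand-new document gets a boost of 1.0, a one-year-old document roughly half that, and the curve keeps flattening smoothly from there.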

Two Search Conferences in Two Weeks Was Too Informative

Eric Pugh - November 25, 2014

This year I experienced the conference equivalent of a lunar eclipse: two search conferences in two weeks located two hours away from my home town of Charlottesville, Virginia! Enterprise Search Summit (ESS) and LuceneRevolution (LR) share many similarities. Both have changed their names in the last year, Enterprise Search Summit expanding its focus to be *Enterprise Search & Discovery Summit*, and LuceneRevolution billing itself as the *Solr/Lucene Revolution*! Ironically, both still use their original domain names. Both are overlapping more in their focus on open source search, with Solr and Elasticsearch being frequent topics of conversation at ESS.

Playing with Thoth

Eric Pugh - November 25, 2014

At LuceneRevolution last week, one of the sessions that got me really excited was about Thoth, presented by Damiano Braga and Praneet Mhatre. It was very nicely done, especially considering the 30-minute time slot! Thoth is a new Solr monitoring solution open sourced by Trulia.

All Things Open

October 22-23

Doug will be talking about "How I learned to stop worrying and love the SQL: converting Quepid from Redis to MySQL".

Let’s Stop Saying “NoSQL”

Doug Turnbull - September 27, 2014

I say the word NoSQL a lot. When I say NoSQL, I tend to talk about denormalized and hierarchical document/row-based data stores like Cassandra, Mongo, Couch, or HBase. But it’s a terrible way to use that term, because there are also graph databases that feel even more normalized than traditional relational databases.

Solving data “variety” with Postgres’s NoSQL Extensions

Doug Turnbull - September 26, 2014

Raise your hand if you’ve heard the three "Vs" of Big Data? Velocity: your queries/updates are exceptionally fast or large. You’re processing the entire Twitter feed. Volume: you store a massive amount of data at rest. You’ve crawled the web and are storing the entire web in a database. Variety: the structure of records varies dramatically.

The Semantic Web up and coming – impressions of SEMANTiCS 2014

René Kriegler — September 19, 2014

When you hear someone say about a technology that ‘it only works in theory’, ‘it is too labour-intensive’ and ‘it is not industry-ready’, chances are that they are talking about semantic web technologies.

DC Solr/Lucene Meetup

September 18

Doug will be talking about 'Hacking Lucene for Custom Search Results'.

Recap of Cassandra Summit 2014

Christopher Bradford - September 17, 2014

OpenSource Connections was well represented in San Francisco at this year’s Cassandra Summit. We had Chris Bradford, Eric Pugh, and Matt Overstreet in attendance for the training, sessions, and networking events.

New York Solr/Lucene Meetup

September 10

Doug will be talking about 'Test Driven Relevancy-How to Work w/Content Experts to Optimize Search Relevancy'.

Cassandra Summit

September 9-12

Matt, Eric, and Chris will all be at Cassandra Summit, sharing war stories from our C* projects for the Federal Government and commercial clients.

September Chock Full of Talks (Dougtember?)

Doug Turnbull - August 28, 2014

I somehow managed to line up a speaking gig for every week in September! I hope you’ll join me on this insane marathon. I’ll be talking about topics key to what we care about at OSC: search as a data structure, search relevancy, and search/big data at performance and scale.

Introducing Splainer — The Open Source Search Sandbox That Tells You Why

Doug Turnbull - August 18, 2014

One piece of feedback that has consistently come up with our Quepid search testing tool is the need to understand “why” search results come back in the order they do. In plain English, what factors influence search the most? Why does my search engine think a document about “water bottles” is more relevant than “baby bottles” for a search about “milk bottles”?

Improving The Camel Solr Component

Doug Turnbull - July 15, 2014

We’ve decided to make dramatic improvements to the Apache Camel Solr component! You can find our improvements here ready for production use (specifically this pull request)! What’s been done out of our wish list above?

Reindexing Collections with Solr’s Cursor Support

Doug Turnbull - July 13, 2014

When a Solr schema changes, we Solr devs know what’s next: a large reindex of all of our data to capture any changes to index-time analysis. When we deliver solutions to our customers, we frequently need to build this in as a feature. In many cases, we can’t easily access the source system to reindex. Perhaps the original data is not easily available, having taken a circuitous route through the Sahara to get to Solr. Perhaps the sys admins don’t want us to run a nasty SQL query with 15 joins to pull in all the data.

Quepid: Athena Release

Jonathan Thompson - July 11, 2014

As the newest full-time developer working on OpenSource Connections' search relevancy tool, Quepid, I'm happy to announce that our newest release, codenamed 'Athena', is now live. This release is the first in a series named after figures in Greek mythology that aims to add powerful new features to our tool.

RDS is expensive — a cautionary AWS tale

Eric Pugh - June 30, 2014

I wanted to share with the world a cautionary story related to me by @softwaredoug that reminded me that while Amazon AWS is amazing, it's also best used in situations where your needs are extremely variable. It's the natural gas power plant versus coal power plant of hosting providers.

What is Search Relevance?

Doug Turnbull - June 10, 2014

Have you ever tried a site's search and been underwhelmed with the accuracy of the results? Do you find yourself feeling frustrated and leaving when the search doesn't return what you’re looking for? Even worse – do you find yourself just assuming what you’re looking for must not exist on that site – only to find the item on that exact same site through other channels?


June 2-6

Bringing search relevancy to DrupalCon. More information at https://austin2014.drupal.org/. We'll see you there!

Sponsoring DrupalCon NA 2014!

Doug Turnbull - May 29, 2014

This year we’ll be discussing one of the most undervalued elements of any Drupal installation: site search. It’s easy to get going, but can be challenging to master. Undervalue site search, and users will silently leave in droves. Hone and tune it, and you'll delight users with relevant results. (This, of course, is where we come in!)

New York Solr/Lucene Meetup

May 28

Eric will be talking about "Building a Lightweight Discovery Interface for Chinese Patents", including why you need to embrace rich JavaScript interfaces and our approach to scaling search interfaces: Cloud meets Ocean. More information at http://www.meetup.com/NYC-Apache-Lucene-Solr-Meetup/. We'll see you there!

Drupal Devs — Don’t Undervalue Relevant Site Search

Doug Turnbull - May 27, 2014

Drupal developers: raise your hand if you’ve ever been in this situation. You’re ready to deploy your app. You’ve developed a beautiful site, leveraging Drupal to its max. You’ve plugged in site search through the Drupal Apache Solr Search plugin. But there’s trouble ahead. Just before you deploy you suddenly realize your search isn’t all it could be.

Crawling with Nutch

Elizabeth Haubert — May 24, 2014

Recently, I had a client using the LucidWorks search engine who needed to integrate with the Nutch crawler. This sounds simple, as both products have been around for a while and are officially integrated. Even better, there are some great 'getting started in x minutes' tutorials already out there for Nutch, Solr, and LucidWorks. But there were a few gotchas that kept those tutorials from working for me out of the box. This blog post documents my process of getting Nutch up and running on an Ubuntu server.


May 17

Database History from Codd to Brewer and Beyond

There are innumerable technical lessons to learn from database history. It’s easy to go with what’s new and trendy. It’s harder to appreciate the technical reasons why one approach suddenly became more favored than another. History highlights the limitations and power behind database solutions. If we don’t learn from history we are doomed to repeat it:

– What were the first databases like (CODASYL, etc.)? Why did they start out this way?
– Why was the RDBMS the right technical response to the non-RDBMS databases back in the day?
– Why was the move away from the RDBMS to NoSQL the right technical solution for many problems today?

This talk is a great introduction to the basic technical scaffolding and historic context for NoSQL. From it, you’ll gain a deeper appreciation of the transition from vertically scaling Big Metal to horizontally scaling Big Data.