Announcing Quepid 6.1.0
It’s been six months since we open sourced Quepid, and it looks like momentum is growing. Since we flipped the switch, we’ve had two minor point releases. We’ve also…
In our first couple of posts, we ended up doing a lot of the processing work outside of Solr. It gave me a chance to polish my PowerShell skills,…
Tesseract 4 is a major upgrade to this venerable OCR library, incorporating neural networks and lots of other great improvements, but not everyone has upgraded to it (including one…
Don’t want to deploy a separate Tika server? But need Tika server-like capabilities and you already have Solr? Then this is the solution for you! What I am going…
Extracting content from file formats using Tika as a standalone service is the traditional architectural approach, and what my most recent project is built around. You can try out…
What is Tika Tuesdays? Over the past few months I’ve finally accomplished the long-time personal goal of being able to easily search PDF documents with in-context hit…
Recently I saw this post on the solr-user mailing list asking about running Tika for text extraction in Solr, which, if you follow the thread, led to a chorus of people…
Today we have flipped the switch to release Quepid as an open source project, licensed under the Apache License v2.0. Come check out the source at http://github.com/o19s/quepid. What is…
I wrote this back in 2012 for version 1.5.2-incubating, and never published it. So I’m updating it for the October 2018 version of OpenNLP, 1.9.0. Visit http://opennlp.apache.org/ and you…
In April I went on a pilgrimage to Enterprise Data World to encourage my colleagues in the Data world who typically are focused on issues of Data Governance, Data…
For years I’ve been interested in describing what makes someone special by looking at the ambient data that surrounds them. My first big effort was way back in 2008…
My colleague Scott had been bugging me about NiFi for almost a year, and last week I had the privilege of attending an all-day training session on Apache…