When you hear someone say about a technology that ‘it only works in theory’, ‘it is too labour-intensive’ and ‘it is not industry-ready’, chances are that they are talking about semantic web technologies. As my experience has been different in a semantic search project for OpenSource Connections, I went to the SEMANTiCS 2014 conference in Leipzig, Germany, earlier this month, to learn more about the state of the art of semantic technologies and Linked Data. SEMANTiCS was preceded by three days of workshops and community events, including the 2nd DBpedia community meeting.
Right from the start, the conference left no doubt that semantic web technologies are now mature enough to be deployed to real-world applications in many industries. There was a general impression that we are now at a point where semantic solutions are really taking off.
In the opening keynote talk, Phil Archer of the W3C gave many examples to demonstrate how widespread the application of semantic web technologies and Linked Data already is today. Among others, they are used by governmental organisations, including the E.U., libraries, publishers and the media industry in general, health care, the finance industry, e-commerce and the automotive industry. This success has been made possible by the development of solid and widely used standards (RDF, SPARQL, SKOS, …). Furthermore, we now have a mature technology stack that allows us to build semantic applications at enterprise level. Finally, Phil noted a growth in the number of system integrator companies that train their consultants in semantic web technologies. And indeed, many consultants and developers who were new to semantic web technologies had come to the conference eager to learn.
The SEMANTiCS conference featured case studies of applied semantic web and Linked Data technologies from a broad variety of industries. According to the organisers, more than half of the 45 talks were from industry, which is another indication that the Semantic Web has spread beyond academia.
Sofia Angeletou described the use of Linked Data at the BBC. She explained that annotating and linking data has been a very efficient way to share content across editorial departments and a lightweight approach to data integration. The use of Linked Data as a successful integration strategy within and beyond an organisation, while leaving existing systems mostly unchanged, was a recurring theme at the conference. This alone could be an argument for the use of semantic web and Linked Data technologies. Another interesting aspect was that the BBC takes an agile approach to expanding its ontologies. It has an established process to grow the ontologies on demand, and it focuses on specific editorial topics, such as the 2012 Olympics and the Scottish independence referendum, to introduce the use of ontologies. This process seems to facilitate the introduction of semantic web technologies, especially in large organisations.
Other talks included case studies of Healthdirect Australia, an impressive platform to provide health-related content and (self-) services; “ZEIT ONLINE”, semantic support for content management of the online edition of one of Germany’s most important weekly newspapers; and JURION, a semantically supported legal knowledge management and search solution by Wolters Kluwer, Germany.
I particularly liked the case study of Oerlikon Metco’s knowledge portal because it demonstrated that semantic technologies are relevant not only for companies that primarily create, publish or manage content or knowledge. Oerlikon Metco is a leading manufacturer of surface solutions (coatings, sprays, etc.) that has built a centralised portal to all internal technical information using MS SharePoint and semantic technologies. The portal has had a considerable positive impact on the accessibility and diffusion of information across sub-divisions of the company and on response times to customer requests.
Semantic web technology and beyond
A number of conference talks gave insight into the available semantic web technology stack. Semantic technology has reached maturity and is used in many applications today. There were also many poster sessions and research papers, which showed that we can expect many more exciting semantic solutions in the future.
One of the most popular databases for Semantic Web applications is the Virtuoso server by OpenLink Software, which also has an Open Source edition. Orri Erling gave an impressive keynote talk in which he explained the challenges of implementing a very fast SPARQL-enabled database. In his experience, the main source of slow queries is suboptimal query plans. Query plans are hard to optimise because triple stores are schemaless. On the other hand, Virtuoso, when used as an SQL database, is orders of magnitude faster than a MySQL database (Star Schema Benchmark). In order to make Virtuoso even faster when used as a SPARQL database, the developers are trying to discover and explore structures in triple store data and then apply optimisations that are known to work for SQL tables.
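To make the point about query plans concrete, here is a minimal sketch in plain Python. It is not how Virtuoso works; the tiny in-memory triple list, the `ex:` identifiers and the `evaluate` helper are all invented for illustration. It evaluates the same two triple patterns in two different orders and counts how many intermediate bindings each order produces.

```python
# Toy triple-pattern evaluation (illustrative only, not Virtuoso's engine).
# All data and names below are hypothetical.

def is_var(term):
    """Variables are written SPARQL-style, e.g. "?p"."""
    return isinstance(term, str) and term.startswith("?")

def match(pattern, binding, triples):
    """Yield bindings that extend `binding` so that `pattern` matches a triple."""
    for triple in triples:
        new = dict(binding)
        ok = True
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # Bind the variable, or check it against its existing value.
                if new.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield new

def evaluate(patterns, triples):
    """Evaluate patterns in the given order.

    Returns the final bindings and the total number of bindings
    produced along the way (a crude measure of work done).
    """
    bindings = [{}]
    produced = 0
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b, triples)]
        produced += len(bindings)
    return bindings, produced

# 200 resources typed as persons, only one of which has the name "Alice".
triples = [(f"ex:person{i}", "rdf:type", "ex:Person") for i in range(200)]
triples.append(("ex:person42", "ex:name", "Alice"))

type_pattern = ("?p", "rdf:type", "ex:Person")
name_pattern = ("?p", "ex:name", "Alice")

results_a, work_a = evaluate([type_pattern, name_pattern], triples)
results_b, work_b = evaluate([name_pattern, type_pattern], triples)

print(work_a, work_b)  # 201 2
```

Both orders return the same answer, but starting with the selective name pattern produces far fewer intermediate bindings. A relational schema gives the optimiser statistics per table and column to make that choice; in a triple store everything sits in one big set of triples, which is presumably why discovering table-like structure in the data helps.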
Virtuoso is also part of the Linked Data Stack, which was presented at the conference. This stack has been compiled by the E.U.-funded LOD2 project. Its components cover probably all aspects of managing Linked Data, including authoring, storage, linking, search and quality assurance. They are downloadable as Debian packages and virtual machines. The project is currently approaching its final release. After completion, the components will continue to be available under the name Linked Data Stack.
I especially liked Andreas Blumauer’s talk. Although he had a great tool to show (Semantic Web Company’s PoolParty Semantic Suite), he reminded us that technology is not everything, and he shared his insights into the planning of semantic web projects and semantic data management. He suggested starting with a simple SKOS-based taxonomy and only later turning it into an elaborate ontology and publishing linkable data. This improves the acceptance of semantic web technologies within an organisation and allows the organisation to better recognise its requirements and build up the required skills along the way.
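As a rough illustration of what "starting simple" can look like, the following sketch models a three-concept taxonomy as a plain Python dictionary and turns it into SKOS triples. It is not taken from the talk or from PoolParty; the `http://example.org/concept/` namespace, the concepts and the helper functions are assumptions made up for this example. Only the SKOS terms themselves (`skos:Concept`, `skos:prefLabel`, `skos:broader`) come from the standard.

```python
# A minimal SKOS-style taxonomy sketch (hypothetical concepts and namespace).

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/concept/"  # assumed namespace, for illustration only

# concept id -> (preferred label, id of the broader concept or None)
taxonomy = {
    "health": ("Health", None),
    "nutrition": ("Nutrition", "health"),
    "vitamins": ("Vitamins", "nutrition"),
}

def to_triples(taxonomy):
    """Express the taxonomy as (subject, predicate, object) SKOS triples."""
    triples = []
    for cid, (label, broader) in taxonomy.items():
        subject = EX + cid
        triples.append((subject, RDF_TYPE, SKOS + "Concept"))
        triples.append((subject, SKOS + "prefLabel", label))
        if broader is not None:
            triples.append((subject, SKOS + "broader", EX + broader))
    return triples

def ancestors(cid, taxonomy):
    """Follow broader links upwards, nearest ancestor first."""
    chain = []
    broader = taxonomy[cid][1]
    while broader is not None:
        chain.append(broader)
        broader = taxonomy[broader][1]
    return chain

skos_triples = to_triples(taxonomy)
vitamin_ancestors = ancestors("vitamins", taxonomy)
print(vitamin_ancestors)  # ['nutrition', 'health']
```

A structure this small is something domain experts can maintain from day one, and because it is already expressed in standard SKOS terms, it can later be enriched with further relations or linked to external datasets without starting over.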
My impression at the conference was that almost everyone working with semantic web technologies takes it for granted that data quality and long-term data management by domain experts are crucial to building successful applications. As a search technology consultant, I was particularly pleased by this. It is very different from my usual experience, as search tends to be seen merely as a technology, an approach which can make search applications fail.
Next year’s SEMANTiCS conference will take place in Vienna from 15th to 17th September, but semantic web technologies and Linked Data will probably come to you before!