I had the pleasure of attending the Semantic Web Technologies RDF and OWL workshop with Bob DuCharme at the UVA New Horizons Conference. Bob is a well-respected contributor in the semantic web community and has written several books on the topic. After attending this workshop I feel like I finally grasp the concepts behind the semantic web technologies. Bob did a great job of explaining the different technologies in a way that a technical person could easily understand.
The workshop began with a discussion of the Resource Description Framework (RDF), which is a means of storing metadata about resources. A resource can be almost anything: an audio file (mp3, wma, wav), a video (wmv, mov, mpg), an e-book, and so on. The metadata can be stored within the actual file or in a separate linked location. It is made up of a simple data structure containing a subject, a predicate and an object. Personally I am not fond of these terms, as they are a bit confusing, and I find the best way to describe them is with an example from Wikipedia.
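As a rough sketch, RDF of the kind meant here can be held in a Python string and broken into its subject, predicate and object with the standard library. The markup and the resource URL below are assumptions modeled on the Tony Benn example in Wikipedia's RDF article, not a verified copy of it, and `extract_triples` is a hypothetical helper that only handles this simple shape (one description, plain text properties).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed example: the URL and markup are illustrative, not taken from the workshop.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for simple literal properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        # The rdf:about attribute names the resource being described (the subject).
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; joining the two
            # parts gives the full predicate URI.
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples


triples = extract_triples(RDF_XML)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

Each child element of the description becomes one statement about the same subject, which is why adding more tags simply adds more triples.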
The RDF in that example can be parsed into the following triple:
Subject: the resource being described (the Wikipedia page)
Predicate: dc:title
Object: “Tony Benn”
In plain English this translates to, “The title of this resource, which is published by Wikipedia, is ‘Tony Benn’”. We could add several more tags to this example, similar to the “dc:title” tag, to describe other aspects of the resource such as its publisher, contributor, etc.
RDF can be assigned to a resource in several different ways and written in different syntaxes. For example, Notation 3 is a syntax for writing RDF in a readable format without the use of XML. RDF can also be embedded in HTML to describe resources within a web page. Several popular web sites currently embed RDF in their web pages, including www.wikipedia.org and www.digg.com.
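To give a feel for how much lighter the non-XML notation is, here is a minimal sketch that prints two statements about one resource in Notation 3 style. The `to_n3` function is a hypothetical helper written for this post, the resource URL is an assumed example, and the code does not escape quotes inside literals, so it is only suitable for simple values.

```python
# Prefix table: short names standing in for long namespace URIs.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def to_n3(subject, statements, prefixes=PREFIXES):
    """Format (predicate, literal) pairs about one subject as Notation 3 text."""
    lines = [f"@prefix {name}: <{uri}> ." for name, uri in prefixes.items()]
    lines.append("")
    # A semicolon lets several predicates share the same subject.
    body = " ;\n    ".join(f'{predicate} "{value}"' for predicate, value in statements)
    lines.append(f"<{subject}>\n    {body} .")
    return "\n".join(lines)


n3_text = to_n3(
    "http://en.wikipedia.org/wiki/Tony_Benn",  # assumed example resource
    [("dc:title", "Tony Benn"), ("dc:publisher", "Wikipedia")],
)
print(n3_text)
```

The output reads almost like a sentence: the resource, then each property and its value, ending with a full stop.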
Bob also discussed another semantic web technology, the Web Ontology Language (OWL). OWL builds on RDF Schema, which is used to define a vocabulary: a common set of terms for describing a domain (a domain being something such as music, psychology or biology). This basically means an agreed-upon list of terms to be used to describe something. The Dublin Core, for example, is commonly used to describe video, sound, image and text resources with a small set of metadata elements, or “terms”.
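The practical value of an agreed-upon list of terms is that anyone can check a description against it. The sketch below lists the fifteen elements of the original Dublin Core element set and flags any key in a record that is not one of them. The `unknown_terms` helper and the sample record are made up for illustration.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_terms(metadata):
    """Return the keys of a metadata record that are not Dublin Core elements."""
    return sorted(key for key in metadata if key not in DUBLIN_CORE_ELEMENTS)


# Hypothetical record: "author" is a natural guess, but the agreed term is "creator".
record = {"title": "Tony Benn", "publisher": "Wikipedia", "author": "Various"}
print(unknown_terms(record))
```

Two people describing the same kind of resource will only produce comparable metadata if they both stick to the shared terms, which is the problem a vocabulary solves.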
Another technology discussed during the workshop was the SPARQL Protocol and RDF Query Language (SPARQL), which is the query language used with RDF. If you think of RDF as a huge database available on the web, SPARQL would be the SQL used to query that database. It was with this thought during the workshop that I realized the remarkable potential of these semantic web technologies. Now imagine all these RDF “databases” are linked together and you can query all of them at once. DBpedia is a project designed to do just that by extracting information from Wikipedia, making the information available on the web and linking it to other data sets such as MusicBrainz.
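To make the SQL comparison concrete, here is a minimal sketch of what a SPARQL query looks like and of the pattern matching behind it. The query string is ordinary SPARQL, but it is not sent anywhere. The `match` function is a toy stand-in written for this post that handles a single triple pattern over an in-memory list. It is not how a real SPARQL engine or DBpedia works, and the sample data is assumed.

```python
DC = "http://purl.org/dc/elements/1.1/"

# What the query would look like in SPARQL (shown for comparison only).
SPARQL_QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?resource ?title
WHERE { ?resource dc:title ?title . }
"""

# Assumed sample data: a tiny in-memory "database" of triples.
TRIPLES = [
    ("http://en.wikipedia.org/wiki/Tony_Benn", DC + "title", "Tony Benn"),
    ("http://en.wikipedia.org/wiki/Tony_Benn", DC + "publisher", "Wikipedia"),
]


def match(triples, pattern):
    """Match one (subject, predicate, object) pattern; '?name' terms are variables."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break  # a fixed term that does not match rules this triple out
        else:
            results.append(binding)
    return results


rows = match(TRIPLES, ("?resource", DC + "title", "?title"))
for row in rows:
    print(row["?resource"], row["?title"])
```

Where SQL selects columns from rows, SPARQL fills in the blanks of a triple pattern, so the same query shape works against any data set that uses the same vocabulary.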