Universities in RDF
What if you could find all of the academics working in your field across all universities in the world with a simple, SQL-like query? What if you could filter your search based on the region the academics are working in, the projects they have worked on, the languages they speak and then get a list of the courses that they offer? All with one query.
I don't know about you, but I totally geek out over that prospect.
It's something that is now becoming a very real prospect, too, as Semantic Web technologies such as RDF are developing. I totally fail at explaining RDF, so I'll let Manu Sporny tell you about RDF instead.
My very first RDF hack
As I've been wrapping my head around RDF and how vocabularies interrelate, I have roughed out a preliminary map of how the different vocabularies could fit together to model faculty, their courses, and their publications.
I used a couple of vocabularies that are under active development, such as Talis's Academic Institution Internal Structure Ontology and Patrick Murray-John's University Ontology. I have a much longer (and significantly more boring) discussion of my process in hacking this together, including a discussion of the vocabularies I chose.
Making RDF easy
So this map begins to show how we could describe faculty conceptually in a way that would make all faculty Web sites into one giant, queryable database.
But even if we could work this hack into something elegant and settled on a common vocabulary for all different faculty to use, this is still pretty improbable, right? I mean, you'd have to get all the faculty to go in and tag up their pages the same way -- when's that ever going to happen?
Here's where it gets awesome.
A number of folks in the Drupal community have been working to make RDF in Drupal really usable and simple for content managers to implement on sites. And you wouldn't believe me if I tried to tell you, so you'll have to try it for yourself:
http://www.lin-clark.com/rdf-faculty-db/sparql
(Note: this isn't really a SPARQL endpoint... you'll hear more about SPARQL down below.)
Click on a name and inspect it with Firebug. See the attribute in the tag that says property="foaf:firstName"? That's the RDF. (This is RDFa because it's in the XHTML rather than in a separate .rdf file. I assume the 'a' stands for attribute, but others contend it stands for awesome.)
Now log in with the un/pw 'demo'. Go to the drop-down menu at the top of the page and select Content Management -> Create Content -> Faculty Member. After filling out the fields, be sure to hit Save at the bottom of the page... and everything else is done for you, from URI to RDFa.
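If you'd rather see those attributes without opening Firebug, here is a rough Python sketch that pulls the property/value pairs out of a snippet of markup. The snippet itself is made up (the name, the about URI and the interest are my assumptions, not copied from the demo site), and this is a plain attribute scraper built on the standard library, not a real RDFa parser.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modeled on what a faculty page might contain.
SAMPLE = """
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="/faculty/jane-doe" typeof="foaf:Person">
  <span property="foaf:firstName">Jane</span>
  <span property="foaf:surname">Doe</span>
  <span property="foaf:topic_interest">XML databases</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs from elements that carry a property attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop

    def handle_data(self, data):
        # Pair the first non-blank text after a property attribute with that property.
        if self._current and data.strip():
            self.pairs.append((self._current, data.strip()))
            self._current = None


def extract_properties(markup):
    collector = PropertyCollector()
    collector.feed(markup)
    return collector.pairs


if __name__ == "__main__":
    for prop, value in extract_properties(SAMPLE):
        print(prop, "=", value)
```

A proper RDFa processor would also resolve the foaf: prefix to its full namespace and work out the subject from the about attribute; this sketch skips both to keep the idea visible.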
Check out your new entry... it has all of the RDF tagging that the other ones have. You just made a contribution to the web of data! Sir Tim Berners-Lee would be so proud.
And now, for the really mind-blowing part...
Blowing your mind with SPARQL, the RDF query language
So the RDF is in the page, but what do you do with it now?
You access it from anywhere in any way you want to.
For instance, go to: http://dbpedia.org/sparql
For the graph URI, use http://www.lin-clark.com/rdf-faculty-db/sparql and select the second option in the dropdown, "get remote RDF data".
Enter your query. The one below will find all faculty listed who have an interest in XML.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?uri ?interest ?name WHERE {
  ?uri foaf:surname ?name .
  ?uri foaf:topic_interest ?interest .
  FILTER regex(?interest, "^XML")
}
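If you want to skip the web form, the same request can be put together as a URL. The sketch below assumes the endpoint accepts the standard SPARQL Protocol parameters, query and default-graph-uri. It only builds the URL and doesn't send anything, and it doesn't reproduce the "get remote RDF data" dropdown choice, so treat it as a starting point rather than a drop-in replacement for the form.

```python
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"
GRAPH_URI = "http://www.lin-clark.com/rdf-faculty-db/sparql"

QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?uri ?interest ?name WHERE {
  ?uri foaf:surname ?name .
  ?uri foaf:topic_interest ?interest .
  FILTER regex(?interest, "^XML")
}"""


def build_request_url(endpoint, graph_uri, query):
    """Return a GET URL carrying the query and the graph to run it against."""
    # Parameter names come from the SPARQL Protocol; whether this particular
    # endpoint will fetch the remote page by itself is an assumption.
    params = {"default-graph-uri": graph_uri, "query": query}
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url(ENDPOINT, GRAPH_URI, QUERY))
```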
OK, so maybe querying four records on a toy site isn't mind-blowing, but hopefully you can see where I'm going with this.
Let's pretend you have 100 top universities with all of their faculty and courses structured this way. And let's say you want to know where you should go to learn about the Semantic Web... what do you do? Make a list of the sites, find the faculty who are pursuing Semantic Web research, retrieve projects and publications for each researcher, and retrieve the courses they teach... and you just made your own custom-tailored research program comparison sheet and course catalog, in 10 minutes, all with ONE query.
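To make the "list of sites, one query" idea a bit more concrete, here is a small sketch that stitches a list of site URIs into a single query using FROM clauses. The university URLs are placeholders I made up, and the query only covers names and interests. Projects, publications and courses would need extra patterns from whichever vocabularies the sites agreed on, which I'm not going to guess at here.

```python
def build_multi_site_query(graph_uris, topic):
    """Build one SPARQL query that reads from every listed graph."""
    # One FROM clause per site; together they form the data the query runs over.
    from_clauses = "\n".join(f"FROM <{uri}>" for uri in graph_uris)
    # Escape backslashes and quotes so the topic stays inside the string literal.
    # Regex metacharacters in the topic are left alone in this sketch.
    safe_topic = topic.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?uri ?name ?interest\n"
        f"{from_clauses}\n"
        "WHERE {\n"
        "  ?uri foaf:surname ?name .\n"
        "  ?uri foaf:topic_interest ?interest .\n"
        f'  FILTER regex(?interest, "{safe_topic}", "i")\n'
        "}"
    )


if __name__ == "__main__":
    # Hypothetical faculty listings, not real sites.
    sites = [
        "http://university-a.example/faculty",
        "http://university-b.example/faculty",
    ]
    print(build_multi_site_query(sites, "Semantic Web"))
```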
Pretty cool, isn't it? At least I think so.