Tuesday, May 1, 2007

Data interchange

For as long as computers have exchanged information, the representation of data for interchange has gone through countless formats and standards. Almost every company, group, and consortium came up with its own interchange format, naturally touting it as the best of the lot. Now that the semantic web is the talk of the town (at least in the Web 2.0 town), we are thankfully well positioned with ubiquitous XML, resurrected RDF, and esoteric OWL.

A gentle primer on the semantic web

Much of today's web evolved around the Mosaic and Netscape browsers. It was an obvious choice for web-based information providers to build everything to cater to these browsers. The result was astounding: explosive growth that is plain for all of us to see.

Yet there was one problem. Computers in this ocean screamed, "Water, water everywhere, not a drop to drink". The web became littered with information that only people could read. Computers became faster, bandwidth grew exponentially, and storage became cheaper. Yet the tall promises of the early web, such as the refrigerator that would order groceries when the bread ran out, remained promises.

Sooner or later, the web had to be structured in a way that lets a computer discover information, explore capabilities, and relate the pieces of information on the web to one another. The semantic web is the direction the web has to evolve in.
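To make "structured" a little more concrete: RDF expresses information as subject-predicate-object statements, often called triples. The sketch below imitates that idea with plain Python tuples. It is only an illustration; the identifiers (`fridge:kitchen`, `item:bread`, `stockLevel`) and the `match` helper are made up for this post and do not come from any real vocabulary or library.

```python
# Hypothetical facts written as (subject, predicate, object) triples.
TRIPLES = [
    ("fridge:kitchen", "contains", "item:bread"),
    ("fridge:kitchen", "contains", "item:milk"),
    ("item:bread", "stockLevel", 0),
    ("item:milk", "stockLevel", 2),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def items_to_reorder(triples):
    """Items whose stock level has dropped to zero."""
    return [s for (s, _, _) in match(triples, predicate="stockLevel", obj=0)]


if __name__ == "__main__":
    print(items_to_reorder(TRIPLES))
```

Because every fact has the same shape, a program that has never seen this particular refrigerator can still ask it a question, which is the kind of discovery the paragraph above is after.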

How does this direction affect the applications of tomorrow? Let's talk about an example. Suppose I wanted to create a web application that lets me schedule a tee time with my friends at a top-rated golf course within driving distance. The application would need to check calendar free/busy data from all the different calendaring applications my friends use, query a sports directory for golf courses within 50 miles of my home, and consult multiple sites to filter this list down to courses with an overall user rating of 4 or more. Finally, it would have to check each golf course's site for availability over the next 7 days.

It is not merely a nightmare to build an application like this today; it is practically impossible. Why? Because the information it assumes, such as calendars and user ratings, is not available in a form my application could interrogate. To accomplish such a task today, one of the friends has to spend hours (if not days) arranging it from the information available on the web, or start a flurry of emails to gather it.
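If calendars, directories, and ratings were published as structured data, the core of the tee-time application would shrink to a filter and a set intersection. Here is a rough sketch under that assumption. The `Course` record, its field names, and the idea of representing the next 7 days as day offsets 0 to 6 are all invented for illustration; fetching the data from real sites is exactly the part that is missing today and is not shown.

```python
from dataclasses import dataclass


@dataclass
class Course:
    name: str
    distance_miles: float  # distance from my home
    rating: float          # overall user rating, assumed to be on a 5-point scale
    open_days: set         # day offsets (0..6) that still have tee times


def pick_tee_times(courses, friends_free_days, max_miles=50, min_rating=4.0):
    """Map each qualifying course to the days on which everyone is free."""
    # Start from the next 7 days and keep only days every friend has free.
    common = set(range(7))
    for free_days in friends_free_days.values():
        common &= free_days

    result = {}
    for course in courses:
        if course.distance_miles <= max_miles and course.rating >= min_rating:
            days = sorted(course.open_days & common)
            if days:
                result[course.name] = days
    return result


if __name__ == "__main__":
    # Made-up sample data.
    courses = [
        Course("Pine Hills", 12, 4.5, {1, 2, 5}),
        Course("Far Away", 80, 4.9, {1, 2}),
        Course("Muddy Greens", 5, 3.2, {2}),
    ]
    friends = {"me": {1, 2, 3}, "ann": {2, 5}, "raj": {0, 2, 6}}
    print(pick_tee_times(courses, friends))
```

The logic is trivial. The hard part is that nothing on the web today hands an application the `courses` and `friends` inputs in this form.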

What if you wanted to buy a used camera? Even today, you would spend countless hours surfing to find a deal. What if you had a software agent that you could simply give the make and model of the camera you were interested in, and it could find a couple on eBay, one on craigslist, and three e-commerce dealers ready to ship it to you at the lowest price available anywhere? The agent would know your zip code, estimate shipping and tax, and perform the comparison all by itself. Is there an incentive for eBay to publish its information in a structured fashion? Absolutely. Everybody wants a bigger audience.
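The comparison step itself is simple once the offers arrive in a structured form. The sketch below assumes each seller exposes a price, a shipping estimate for my zip code, and whether sales tax applies. The `Offer` record, the sample figures, and the flat 8% tax rate are all made up; a real agent would have to look these up.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    seller: str
    price: float
    shipping: float     # shipping estimate to my zip code
    charges_tax: bool   # whether this seller adds sales tax


def total_cost(offer, tax_rate):
    """Price plus shipping plus tax (where charged), rounded to cents."""
    tax = offer.price * tax_rate if offer.charges_tax else 0.0
    return round(offer.price + offer.shipping + tax, 2)


def cheapest(offers, tax_rate):
    """Return the offer with the lowest total cost."""
    return min(offers, key=lambda offer: total_cost(offer, tax_rate))


if __name__ == "__main__":
    # Hypothetical offers for the same used camera.
    offers = [
        Offer("auction listing", 200.0, 15.0, False),
        Offer("classified ad", 210.0, 0.0, False),
        Offer("online dealer", 195.0, 5.0, True),
    ]
    best = cheapest(offers, tax_rate=0.08)
    print(best.seller, total_cost(best, 0.08))
```

Note that the lowest sticker price does not win here: once shipping and tax are added, the local classified ad comes out ahead, which is the kind of comparison a person rarely bothers to do by hand.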

The semantic web promises to make creating such applications much easier, perhaps even letting users design them themselves.
