Archive for the ‘Technology’ Category

RSS and Normalization theory

Sunday, September 19th, 2004

The recent spate of criticism of blogs for being bandwidth hogs brings new realism about where we are headed in the blogosphere. Microsoft stopped delivering the full text of postings on the MSDN blogs, citing a bandwidth crunch (MSDN has 964 blogs as of today). Bloggers were already critical of Microsoft for trimming the posts in the feed to a couple of hundred characters. Why click on a link in your RSS aggregator to read the full post?
There are two ways to solve this problem:
Firstly, the MSDN RSS feed is one gigantic feed covering all the recently updated posts. Dave Winer suggested that Microsoft cancel the aggregated feed and simply offer a feed for every blog. (Blogs.msdn.com already offers individual feeds.)
Secondly, extend the RSS specification and propose a normalization scheme for the data carried in RSS. The concept is very similar to the normalization done in databases.
Here’s a hypothetical case of normalization theory applied to RSS:

  1. The index.rdf contains all the elements except <description>
  2. A separate resource exists for the text content of <description>
  3. A new sub-element is introduced in <item> which is a reference for the separate resource for the contents of <description>

How would this work for a feed aggregator (on steroids)? The feed aggregator downloads the index.rdf as usual and renders its content by breaking down each of the elements. Since the actual content (the entries, in the case of blogs) exists in a separate resource, the aggregator downloads that resource only as required. On the next refresh, content already "seen" is served from a local cache, and the resource is downloaded again only if the index.rdf marks it as modified.
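The aggregator behaviour described above can be sketched in Python. This is only an illustration under stated assumptions: the `<contentRef>` element with its `href` and `modified` attributes is a made-up name for the new sub-element (nothing like it exists in the RSS specification), the index is a simplified feed without RDF namespaces, and the `REMOTE` dictionary stands in for HTTP downloads of the separate resources.

```python
import xml.etree.ElementTree as ET

# Simplified index: every element except <description>, plus a
# hypothetical <contentRef> pointing at the separate content resource.
INDEX = """<rss>
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
      <contentRef href="http://example.org/content/1" modified="2004-09-18T10:00:00Z"/>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
      <contentRef href="http://example.org/content/2" modified="2004-09-19T08:30:00Z"/>
    </item>
  </channel>
</rss>"""

# Stand-in for the web server holding the separate content resources.
REMOTE = {
    "http://example.org/content/1": "Text of the first post.",
    "http://example.org/content/2": "Text of the second post.",
}


class Aggregator:
    def __init__(self, fetch):
        self.fetch = fetch    # callable: href -> text of the resource
        self.cache = {}       # href -> (modified stamp, text)
        self.downloads = 0    # number of content resources fetched

    def refresh(self, index_xml):
        """Parse the index and fetch only unseen or modified content."""
        entries = []
        for item in ET.fromstring(index_xml).iter("item"):
            ref = item.find("contentRef")
            href, modified = ref.get("href"), ref.get("modified")
            cached = self.cache.get(href)
            if cached is None or cached[0] != modified:
                # Not seen before, or the index says it has changed.
                self.cache[href] = (modified, self.fetch(href))
                self.downloads += 1
            entries.append((item.findtext("title"), self.cache[href][1]))
        return entries


aggregator = Aggregator(REMOTE.__getitem__)
aggregator.refresh(INDEX)   # first refresh downloads both entries
aggregator.refresh(INDEX)   # second refresh is served from the cache
```

For simplicity the sketch fetches unseen content during the refresh itself; an aggregator could just as well defer the download until the reader opens the entry.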

WebServices in the roadmap for On-Demand business

Sunday, August 15th, 2004

On-demand computing can be described as a computing infrastructure made up of a collection of systems, processes, and software that are flexible, open, and integrated. It is an infrastructure that enables rapid deployment and integration of business applications and processes through virtualization and automation.
Integration of heterogeneous platforms requires applications to communicate using a common lingua franca such as WebServices.
The key characteristic of WebServices is application-to-application exchange of data independent of a platform. This data could be exceptions from a

Collaborative ranking? Orkut + Google = Orkut TrustRank

Friday, February 6th, 2004

Google acquired Outride in the summer of 2001. An article published in the March 6, 2001 issue of Red Herring magazine [Google cache] reads, “…it has built a revolutionary, individualized search technology, unlike competitors that personalize searches based on groups of users or on user-specified preferences…”.
Now, Orkut is a community: people related to each other. How would the PageRank algorithm be rewritten using Orkut? Alice is part of Orkut. Bob is a few hops from Alice. Bob makes a search. OK. The search results are page ranked. The ranks then get a multiplicative factor (say, a Trust Factor). The final output has Alice’s link on top.
Makes sense? Maybe it doesn’t. Google only has 3.3 billion pages indexed today. A search on George Bush returns 6,470,000 results. But then, who thought about e-mail spam when RFC 821 was drafted in 1982?
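The Alice and Bob scenario can be sketched as a small re-ranking step in Python. Everything here is an assumption for illustration: the friend graph, the page scores, the idea of a page having an "owner" in the network, and the formula 1 + 1/(1 + hops) for the Trust Factor. None of it reflects how Google or Orkut actually work.

```python
from collections import deque


def hop_distance(graph, start, goal):
    """Breadth-first search: number of friend links between two people."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for friend in graph.get(person, ()):
            if friend == goal:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None  # no path between the two people


def trust_factor(graph, searcher, owner):
    """Hypothetical factor: closer owners get a bigger boost, strangers none."""
    hops = None if owner is None else hop_distance(graph, searcher, owner)
    if hops is None:
        return 1.0
    return 1.0 + 1.0 / (1 + hops)


def rerank(results, graph, searcher):
    """Multiply each page's rank by the trust factor and sort, best first."""
    scored = [(url, rank * trust_factor(graph, searcher, owner))
              for url, rank, owner in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Bob knows Carol, and Carol knows Alice: Bob is two hops from Alice.
FRIENDS = {"bob": ["carol"], "carol": ["bob", "alice"], "alice": ["carol"]}

# (url, page rank score, owner in the network or None); made-up numbers.
RESULTS = [
    ("http://example.com/news", 0.50, None),
    ("http://example.net/mallory", 0.45, "mallory"),
    ("http://example.org/alice", 0.40, "alice"),
]

for url, score in rerank(RESULTS, FRIENDS, "bob"):
    print(url, round(score, 3))
```

With these made-up numbers Alice's page starts with the lowest rank, but the boost from being two hops away from Bob is enough to move it to the top of his results.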