RSS and Normalization theory

September 19th, 2004

The recent spate of criticism against blogs for being bandwidth hogs brings new realism about where we are headed in the blogosphere. Microsoft stopped delivering the full text of postings on the MSDN blogs, citing a bandwidth crunch (MSDN has 964 blogs as of today). Bloggers were already critical of Microsoft trimming the posts in the feed to a couple of hundred characters. Why click on a link in your RSS aggregator to read the full post?
There are two ways to solve this problem:
Firstly, the MSDN RSS Feed is one gigantic feed covering all the recently updated posts. Dave Winer suggested that Microsoft cancel the aggregated feed and simply offer a feed for every blog. (Blogs.msdn.com already offers individual feeds.)
Secondly, extend the RSS specification and propose a normalization scheme for the data carried in RSS. The concept is very similar to the normalization done in databases.
Here’s a hypothetical case of normalization theory applied to RSS:

  1. The index.rdf contains all the elements except <description>
  2. A separate resource exists for the text content of <description>
  3. A new sub-element is introduced in <item>, which references the separate resource holding the contents of <description>

How would this work for a feed aggregator (on steroids)? The feed aggregator downloads the index.rdf as usual. The aggregator renders the content of the index.rdf by breaking down each of the elements. Since the actual content (the entries, in the case of blogs) exists in a separate resource, the aggregator downloads that resource only as required. On the next refresh, content already held in the local cache of “seen” entries does not need to be downloaded again, unless the index.rdf shows it as modified.
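The refresh cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <contentRef> element and its modified attribute are made-up names (the proposal does not name the new sub-element), and fetch is whatever function the aggregator uses to download a resource.

```python
import xml.etree.ElementTree as ET

# Hypothetical normalized index: <item> carries no <description>. Instead, a
# made-up <contentRef> element points at the separate content resource.
INDEX = """<rss><channel>
  <item><title>Post A</title>
    <contentRef modified="2004-09-18">/content/a.xml</contentRef></item>
  <item><title>Post B</title>
    <contentRef modified="2004-09-19">/content/b.xml</contentRef></item>
</channel></rss>"""


class Aggregator:
    def __init__(self, fetch):
        self.fetch = fetch   # callable: url -> text of the content resource
        self.cache = {}      # url -> (modified stamp, text)

    def refresh(self, index_xml):
        """Return (title, content) pairs, downloading only new or changed content."""
        entries = []
        for item in ET.fromstring(index_xml).iter("item"):
            ref = item.find("contentRef")
            url, modified = ref.text.strip(), ref.get("modified")
            cached = self.cache.get(url)
            # Download only if never seen, or if the index reports a change.
            if cached is None or cached[0] != modified:
                self.cache[url] = (modified, self.fetch(url))
            entries.append((item.findtext("title"), self.cache[url][1]))
        return entries
```

With this shape, a second refresh against an unchanged index costs one small index download and no content downloads at all, which is where the bandwidth saving would come from.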

Publicizing your Blog

September 13th, 2004

Ganesh has some interesting ideas about promoting your blog.
But, the killer idea came from Peeyush. He changed the “My Display Name” property in his MSN Messenger to his blog URL viz. www.ranjan.us. The property is available under Tools > Options > Personal of your local MSN Messenger.


25.jpg

I was quick to monkey it. Hey, I like the idea.

25_2.jpg

…and here’s the ROI within 12 hours!

25_3.jpg

Need a suggestion for Yahoo Messenger? Easy. Create a “New Status Message” pointing to the blog URL!

Business Plan Competition in India on live TV!

September 9th, 2004

Arun Natarajan reports on an article in Business Standard saying that, beginning January 2005, Zee Telefilms (a premier network channel in India, and a chief competitor of Sony Asia) will air a 36-episode show in which entrepreneurs will get an opportunity to pitch their B-plans on TV.
While the US has moved on to the likes of Survivor, Fear Factor and The Apprentice, it is very apparent that grass-roots entrepreneurial spirit has caught on in India. This is a fundamental and very positive change. The number of companies getting funded in India has increased, and every month one group of VCs or another is scouting in India.
I remember that while trying to raise money in early 1999, I was bluntly asked what my collateral was for raising 500,000 Rupees ($11,000 approx.). The majority of VC firms (most of which were run by banks and other government institutions) were only funding core infrastructure and not software startups. I did get appointments with two VCs, the parents of which were actually funding software companies here in the Valley. Both sessions were spent orienting them on the developments on the WWW. No wonder my ‘crappy’ software idea never got any attention.
Guess which way the wind is blowing for people like me? But then, there is Murphy’s Law. Ha, I’ll always miss the train!

Google Messenger Take Two: Is Mumbai-based company prototyping it?

September 7th, 2004

My previous entry on Google Messenger was speculative. I have since found several links from other blogs that are talking about Google Messenger. Also, the domain ‘gMessenger’ has been registered by an anonymous person.
Noteworthy is the report that Google Messenger is being prototyped by Geodesic Systems in India. Read the first comment here. I am not sure whether this Geodesic is the company being referred to.
Google and Geodesic may have forged a partnership but there are no reports confirming this in the media. Consider this–Ram Shriram is on the board of Google and was an early investor. Geodesic has Rakesh Mathur on their board. Ram Shriram was the President of Junglee (acquired by Amazon in 1998) which had Rakesh Mathur as the CEO. More truth?

G!Messenger or gMessenger: Is Google working on its version of IM?

September 2nd, 2004

I overheard this conversation during a recent visit to a local Java Users Group.

Person A: (muffled)
Person B: Do you think ‘G Messenger’ would have all the capabilities?
Person A: (muffled)
Person B: Do you know anything about the launch?
Person A: (muffled)

The two questions above stuck in my head. Does G stand for Google? Is Google planning to launch its own version of Instant Messenger? It makes sense. Google is already rumoured to be entering the desktop search market. It already has a few tools, viz. Deskbar, Toolbar and GMail Notifier.
A natural progression, a la Yahoo. Yahoo entered the IM market and now has almost every offering within IM. Chat is just one function. The IM products from almost all the major players (AOL, MSN, Yahoo) have search, news, stocks, weather, etc. What these products lack is desktop search, and Google is taking this head-on with Microsoft.
I am not sure whether my judgement is true. I cannot guarantee the merits of the conversation. I just overheard something which might be something else and took it for Google, since it is always on my mind.
OK. The above conversation was sheer dramatization, but I did attend a local JUG, and a boring topic got my thoughts going. But if I were Google, that would be the next move anyway. Google Messenger with Search, News, Froogle, RSS Aggregator, E-mail alerts etc. Anybody listening at Google?

Is HTML a Legacy? The rise of Rich Internet Applications

August 22nd, 2004

c. 1995. During the SunWorld conference, there was a lot of activity related to Java. Microsoft and notably Netscape announced their intention to license Java. It was the magic of executing applets in the browser that made the browser makers circle like bees around the Java platform. With applets, the seeds of Rich Internet Applications were sown. Macromedia was around too. But applets provided not only animation but also the complete ability to build rich GUI applications using Object Oriented Programming.
During the dot-com boom, Internet applications were predominantly of the “Browser<-->Application Server<-->Database” type. The application server was the place where all the logic, interaction and caching was done. The browser was just the rendering engine for the HTML output from the application server. Almost every click on the HTML page resulted in a server roundtrip. The Model-View-Controller (MVC) framework was the chief design pattern governing complex websites.
Around 2002-2003, the logic started moving to the client: an initial download of screens and rendering logic, followed by server trips to fetch only the data. One case in point is GMail: a big download of Javascript, followed by a DHTML-driven browser UI. If you use GMail, you might have noticed the speed with which you can move between messages that have already been viewed. Oddpost is another RIA example, total DHTML magic. It was the RIA-ness of its e-mail that made Oddpost a good proposition for Yahoo.
A growing RIA framework is Macromedia’s Flex. Quoting from ColdFusion Developer’s Journal, “Flex offers a standards based, declarative programming methodology and server runtime services for delivering rich, intelligent user interfaces with the ubiquitous cross platform, cross-device Macromedia Flash client.”
Other top contenders for RIA framework:

Check out a real world RIA example here. (No, I didn’t make a $259/day reservation. I booked mine at a different hotel, offering $54.95/day using the plain old HTML!)

WebServices in the roadmap for On-Demand business

August 15th, 2004

On-demand computing can be described as a computing infrastructure having a collection of systems, processes, and software, which are flexible, open and integrated. It is an infrastructure, which enables rapid deployment and integration of business applications and processes through virtualization and automation.
Integration of heterogeneous platforms requires applications to communicate using a common lingua franca such as WebServices.
The key characteristic of WebServices is application-to-application exchange of data independent of a platform. This data could be exceptions from a

WS-SPAGHETTI: Uncontrolled proliferation of WebServices Specifications

August 8th, 2004

WebServices were supposed to be simple:

  1. Do an HTTP GET request, pass some query parameters,
  2. Retrieve XML instead of the regular HTML.
  3. Process the XML and extract data.

One of the very early WebService implementations I did was just that. Send a GET request for news stories for a ticker symbol. Retrieve the RDF document with stories and other data. Massage the retrieved data the way you want: cache it, format it, archive it. Straightforward.
In the real world, if the request happens to be more than just query parameters, you do an HTTP POST of an XML document. Then SOAP comes into the picture: a way to formalize the passing of parameters, generalization of target functions, abstraction of network end-points, transport and marshalling of values. SOAP was good; it gave us a means of exposing remote method calls. Then we had WSDL, a language for describing WebServices.
Then came the boom, much fuelled by rivalries between companies and by the consortia jump-started by these companies. SOAP itself has its own competitor in the XML-RPC protocol.
Here is the reality: for every WS-XXX specification claiming to enhance the security or reliability of WebServices, there is an equivalent XXX4WS already being proposed by a competing consortium formed by the rivals of the former.
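The three steps above fit in a short Python sketch. The endpoint URL, the ticker parameter name and the <story>/<headline> layout are all invented for illustration; the service described here returned RDF, whose exact shape is not given.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET


def build_url(base, **params):
    """Step 1: a plain GET URL with query parameters."""
    return base + "?" + urlencode(params)


def fetch(url):
    """Step 2: retrieve the XML document instead of regular HTML."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def extract_headlines(xml_text):
    """Step 3: process the XML and extract the data we care about."""
    root = ET.fromstring(xml_text)
    return [story.findtext("headline") for story in root.iter("story")]


# Hypothetical response body, standing in for what fetch() would return.
SAMPLE = """<stories ticker="MSFT">
  <story><headline>Earnings up</headline></story>
  <story><headline>New product</headline></story>
</stories>"""

url = build_url("http://example.com/news", ticker="MSFT")
headlines = extract_headlines(SAMPLE)
```

Caching, formatting or archiving then operates on the plain list of headlines, with no envelope or toolkit in between.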

  1. Messaging and Transaction Coordination: BPEL4WS, BTP, WSCI, WS-CAF, WS-CDL, WSCL, WS-AtomicTransaction, WS-Coordination, WS-BusinessActivity, BPML, WSFL, XLANG, ebBPSS.
  2. Reliable Messaging: ebMS, WS-ReliableMessaging, HTTPR, WS-Reliability.
  3. WebServices Addressing: WS-Eventing, WS-Addressing, WS-Routing, WS-Discovery.

That’s not a definitive list; there are many other specifications which I can’t bucket until I pore through the specifications, their use cases and their relevance to the base SOAP & WSDL specifications.
What we want is simplicity: WSDL to expose the service interfaces and their end-points, SOAP to exchange the payload, and a protocol for doing 2-phase commit operations on the services. Bingo: 95% of the applications using WebServices for SOA implementations would be covered. The remaining 5% will customize anyway.
Along similar lines: Adam Bosworth’s latest entry.

The Vision of Semantic Web: Part I (Search Engines and Web content)

August 1st, 2004

Semantic 1 : of or relating to meaning in language
That’s the dictionary definition of semantic. Applied to the Web, it means content that is related to other content by meaning. Let us take the example of a keyword search on Google. I type in Blog, take a snapshot of the results, and then key in Weblog. Only one result in the top 10 appears in both samples.
Blog and Weblog; don’t we use these interchangeably? Don’t they mean the same? Semantically, to a human: YES; to the search engine indexing the web content: NO. That’s exactly the vision of the Semantic Web: search engines, and information retrieval in general, extracting data the way humans do.
Well, in the above example of “Blog” vs. “Weblog”, it’s not the search engine’s fault for failing to index the content in a desirable manner. To some extent the problem also lies in the HTML page, which expresses the terms “Blog” and “Weblog”. What if the HTML page header said that all the terms in the page conform to a certain taxonomy? This is not uncommon; it is exactly what we do in a DTD or an XML Schema document. Take, for example, the <P> tag. The tag is defined in the HTML DTD and is well understood by the browser’s parsing and rendering engine. A browser semantically understands this tag as “the text which comes after this tag is a paragraph and should be rendered as such”. In the case of HTML the vocabulary is limited; a P tag is always a P tag. However, in the case of the English language a “Blog” is a “Weblog” which is an “Online Journal” which is… the list continues.
Establishing relationships is not trivial. A well-defined set of terms related through peers, parent-child nodes and attributes: essentially this is an Ontology, a way of representing and conceptualizing knowledge.
One very good example where this association works: a robot programmed to identify/recognize fruits. The robot’s master writes the word “Mango” on the whiteboard. The robot quickly scans his ontology (assuming that the robot in our example uses an Ontology for Knowledge Representation) for a match. He finds an exact match for the word M-A-N-G-O. Then he traverses: Mango -> Mangifera Indica (attribute type Scientific Name) -> Fruit (Parent node). The robot then thinks, “Mango is a Fruit”. But how does he find out whether the fruit is sweet/sour, grown in a tropical climate, has a large seed, grows on trees, or is rich in Vitamin C, Folate, Selenium and Pantothenic Acid? The answer lies within the Ontology, which could represent the extended knowledge as well.
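The robot’s lookup can be sketched with a plain dictionary standing in for the ontology. This is a toy, not a real ontology format: the node layout, the attribute names and the few values filled in are assumptions made only to show the traversal.

```python
# Toy ontology: each term has a parent node and a bag of attributes.
# The layout and attribute names are hypothetical, chosen for illustration.
ONTOLOGY = {
    "Mango": {
        "parent": "Fruit",
        "attrs": {
            "scientific_name": "Mangifera Indica",
            "climate": "tropical",
            "grows_on": "trees",
        },
    },
    "Fruit": {"parent": None, "attrs": {}},
}


def ancestors(term):
    """Walk parent links upward, e.g. Mango -> Fruit."""
    chain = []
    node = ONTOLOGY.get(term)
    while node is not None and node["parent"] is not None:
        chain.append(node["parent"])
        node = ONTOLOGY.get(node["parent"])
    return chain


def is_a(term, category):
    """The robot's conclusion: 'Mango is a Fruit'."""
    return category in ancestors(term)


def attribute(term, name):
    """Look for an attribute on the term itself, then on its ancestors."""
    node = ONTOLOGY.get(term)
    while node is not None:
        if name in node["attrs"]:
            return node["attrs"][name]
        node = ONTOLOGY.get(node["parent"])
    return None
```

Here is_a("Mango", "Fruit") is the parent-node traversal, and attribute("Mango", "climate") is the extended knowledge, read from the same structure.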
Going back to the search example, there are a few ways to solve this problem:

  1. While indexing the page, instead of indexing the terms, index the generic id as retrieved from a “super” ontology. The hard part is locating the ontology.
  2. Let the web page authors expose the terms with some metadata around it. For (a hypothetical) example:
    <p>This is my <so:onto id="757893" contextid="222">Weblog</so:onto>

  3. Convert the search term itself. For example, if I search for Weblog, two queries are made, one for “Blog” and one for “Weblog”, and the search results are de-duped and presented.
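The third option is the easiest to try out. A minimal sketch, assuming a hand-written synonym table and a search function that returns a ranked list of URLs for one query; both are placeholders, not any real search engine’s API.

```python
# Hypothetical synonym table; a real system would draw this from an ontology.
SYNONYMS = {
    "blog": ["blog", "weblog"],
    "weblog": ["blog", "weblog"],
}


def expand(term):
    """Turn one search term into the list of equivalent queries."""
    key = term.lower()
    return SYNONYMS.get(key, [key])


def search_all(term, search):
    """Run every expanded query and merge the results, dropping duplicates.

    `search` is any callable mapping a query string to a list of URLs.
    """
    seen, merged = set(), []
    for query in expand(term):
        for url in search(query):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

Results keep the order in which they were first seen, so a page matched by both queries appears once, at its earliest position.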

Some work is already being done in the TAP Project. TAP is a successor to Alpiri, which was founded by RV Guha and Rob McCool, the same people behind TAP.

BPEL: Composing WebServices

July 23rd, 2004

Business Process Execution Language (BPEL) is an XML-based standard for composing WebServices to create processes. In the stack of WebServices standards it sits on top of WSDL, SOAP and XML Schema. The WSDL document of a WebService defines its execution and behavior: parameters, types, returns, error conditions, invocation, etc. BPEL interconnects two or more WSDLs. As a standard, BPEL defines the notation and semantics for composing two or more individual services into a process.
Why a standard for composing WebServices? Why not write a program in Java or C# to integrate two WebServices?
The answer is loose coupling, the same reason why we have WSDL for fine-grained services. It’s abstraction. Loose coupling allows for run-time typing and invocation from the WSDL to the fine-grained service. The same holds true for modelling the integration of services using BPEL. BPEL proposes the notation for how the individual services could be executed.
On a different note–While BPEL is being developed under OASIS, parallel efforts are underway:

  • WSCI. Developed at W3C (Authored by HP, SUN, BEA, SAP, etc.)
  • WSCL. Submitted by HP to W3C
  • WSFL. Proposed by IBM (pdf)
  • BPML. Hosted at BPMI.org (Members: BEA, IBM, Fujitsu, SAP, etc.)
  • BPSS. Hosted at ebXML.org