Reverse mashup

Web September 3rd, 2008

I can barely remember the last time I updated Gelman. The project started as an exercise to practice my Django + dojo skills and to manage my eBook collection, but I got stuck on the UI design, which I suck at, and on authentication/registration, which I take little interest in.

What if we take a different approach and avoid reinventing the wheel? Let’s use an online eBook management Web 2.0 application, for example LibraryThing, to manage the Read, Reading, and Wish to Read lists. If an electronic copy is available, a link to it is inserted into the LibraryThing search/detail page, and we can add the book to the Reading list for quick access later. We just mash our web service up into LibraryThing!

This application can be decomposed into a few parts:

  • A web service for CRUD operations on the eBook collection, hosted on a home-brew server
  • A Firefox add-on or Greasemonkey user script for the mashup
  • A web service client to automate importing books
  • Find your favorite app, then mash it up!

A more portable way to glue the pieces together is to do the mashup on the server side, using a proxy server to insert all the mashup data.
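The server-side variant can be sketched in a few lines of Python. This is only an illustration: the `EBOOKS` catalog, the example URL, and the `inject_ebook_link` helper are hypothetical names of mine. A real proxy would look the book up through the web service and would want a proper HTML parser instead of string splitting.

```python
import html

# Hypothetical catalog: ISBN -> URL of the electronic copy on the home server.
EBOOKS = {
    "9780000000001": "http://home.example/ebooks/9780000000001.pdf",
}


def inject_ebook_link(page, isbn, catalog=EBOOKS):
    """Return the page with a link to the local eBook inserted before </body>.

    The page comes back unchanged when no copy is known for the ISBN
    or when the page has no closing body tag.
    """
    url = catalog.get(isbn)
    if url is None:
        return page
    snippet = '<div class="ebook-link"><a href="%s">Read my copy</a></div>' % (
        html.escape(url, quote=True)
    )
    # Split at the last </body>; sep is empty when the tag is missing.
    head, sep, tail = page.rpartition("</body>")
    if not sep:
        return page
    return head + snippet + sep + tail
```

The proxy would call this on every book detail page it relays, so the browser needs no add-on at all.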

Since all comments, tags, and reading history are hosted in the cloud, it is essential to prepare for a rainy day:

  • A big name may vouch for availability. I would seriously consider Shelfari only after its acquisition by Amazon.
  • An open API is the key. This is the trend: a web site cannot lock users in by closing itself off, because users may get cold feet in the first place. Douban, the leader in the Chinese market, did an excellent job of opening its platform.

I am working on a prototype of the RESTful web service using Django, so stay tuned.

AideRSS relieves the pain, fails to cure

Web April 29th, 2008

AideRSS aims to solve a problem called “information overload”. One of the typical symptoms of the 2.0 era is that we are overwhelmed by thousands of piled-up unread posts in Google Reader, and it is hard to keep pace with the rest of the world. AideRSS ranks posts into Good, Great and Best categories based on its PostRank. This approach goes one step toward relieving the pain, but still does not resolve the problem.

PostRank measures the attention, not the value
Though the PostRank algorithm is undisclosed, just like the mysterious PageRank, judging from the promotional material and personal observation, PostRank is determined by comments, references and social bookmarking. In other words, a provocative flame-bait post may attract more attention and achieve a higher PostRank than a plain HOWTO, though the latter is more valuable imho.
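To make the point concrete, here is a toy attention score in Python. The real PostRank formula is not public, so the weights, the `Post` fields and the `attention_score` function are all made up by me purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    comments: int
    references: int  # links from other blogs
    bookmarks: int   # social bookmarking saves


# Made-up weights; the real PostRank weighting is not disclosed.
WEIGHTS = {"comments": 1.0, "references": 2.0, "bookmarks": 1.5}


def attention_score(post):
    """Weighted sum of engagement signals; it says nothing about usefulness."""
    return (WEIGHTS["comments"] * post.comments
            + WEIGHTS["references"] * post.references
            + WEIGHTS["bookmarks"] * post.bookmarks)


flame = Post("Why language X is dead", comments=120, references=15, bookmarks=10)
howto = Post("HOWTO: set up nightly backups", comments=4, references=2, bookmarks=30)

# Sorting by attention puts the flame post first, whatever its real value.
ranked = sorted([flame, howto], key=attention_score, reverse=True)
```

With these invented numbers the flame post scores 165 against 53 for the HOWTO, even though the HOWTO collected three times as many bookmarks.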

PostRank reflects the group wisdom, not the personal choice
I once read digg’s programming channel, then moved on to programming@reddit because the latter is more programmer-oriented and suits my taste. Once a community grows big, the voice of the majority dwarfs the “long tail” that the minority audience cares about most. The same dilemma applies to PostRank.

PostRank is too humble
There is no evidence, so this is only my wild guess: PostRank is feed-based, not internet-wide like PageRank. The assumption is quite reasonable. A global PostRank would be too expensive for a startup company, and it would be too provocative to bloggers: how come a post in Gizmodo ends up lower than one in its rival Engadget? This humbleness renders sorting across feeds less useful.

The next step towards perfection
To address the above issues, PostRank needs to be personalized: let the user define what is Good, Great and Best based on his/her past behavior; check the public's influence using the popularity contest score; discover like-minded minorities for sharing and referring.

In short: a hybrid of a Bayesian classifier and a Web 2.0 community.
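The Bayesian half could look roughly like the sketch below: a tiny naive Bayes classifier trained on the titles of posts the user read or skipped. This is my own toy, not anything AideRSS ships; the `PersonalFilter` class, the "liked"/"skipped" labels and the use of title words as features are all assumptions.

```python
import math
from collections import Counter, defaultdict


class PersonalFilter:
    """Tiny multinomial naive Bayes over post-title words (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of posts
        self.vocabulary = set()

    def train(self, title, label):
        words = title.lower().split()
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, title):
        """Return the most probable label, or None if nothing was trained."""
        words = title.lower().split()
        total_posts = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_posts in self.label_counts.items():
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(n_posts / total_posts)
            n_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


reader = PersonalFilter()
reader.train("django rest howto", "liked")
reader.train("python generators tutorial", "liked")
reader.train("celebrity gadget rumor", "skipped")
reader.train("gadget flame war", "skipped")
guess = reader.classify("django tutorial")
```

After this handful of examples, `guess` comes out as "liked". The community half would then pool such personal models to find readers with similar taste.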

Who would be the next water vendor?

Web April 6th, 2008

As thousands of start-ups join the Web 2.0 gold rush, some vendors have set up their vending machines to sell the water. Amazon was the first and has the most comprehensive inventory (S3, SimpleDB and EC2), followed by Microsoft’s SSDS. Now Google may sell its BigTable, according to a source cited by TechCrunch. The game is becoming interesting. Who will be the next water vendor?

My guess is Sun. Sun rode the dot-com trend up and down. As an experienced player in grid computing, Sun has built the infrastructure for distributed computing; plus, it recently acquired MySQL for $1B. It won’t be a surprise if one morning Sun announces a new data service based on MySQL clustering, or even a fusion of grid computing and database services.