Reverse mashup

Web September 3rd, 2008

I can barely remember the last time I updated Gelman. The project started as an exercise to practice my Django + dojo skills and to manage my eBook collection. I got stuck on the UI design, which I suck at, and on authentication/registration, which I have little interest in.

What if we take a different approach and avoid reinventing the wheel? Take an online book management Web 2.0 application, LibraryThing for example, and use it to manage the Read, Reading, and Wish to Read lists. If an electronic copy is available, a link to it is inserted into the LibraryThing search/detail page, and we can add the book to the Reading list for quick access later. We just mash our web service up into LibraryThing!

This application can be decomposed into the following parts:

  • A web service, hosted on a home-brew server, to CRUD the eBook collection
  • A Firefox add-on or Greasemonkey user script for the mashup
  • A web service client to automate importing books
  • Find your favorite app, then mash it up!

A more portable way to glue the pieces together is to move the mashup to the server side, using a proxy server to insert all the mashup data.
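A minimal sketch of the server-side idea, under stated assumptions: the `EbookStore` class, the `inject_ebook_link` function, the ISBN key, and the example URL are all hypothetical names made up for illustration, and the in-memory dictionary only stands in for the real web service. It shows the CRUD operations and how a proxy might insert a link into a fetched page.

```python
from html import escape


class EbookStore:
    """Hypothetical in-memory stand-in for the eBook web service, keyed by ISBN."""

    def __init__(self):
        self._books = {}

    def create(self, isbn, url):
        self._books[isbn] = url

    def read(self, isbn):
        # Returns None when no electronic copy is known.
        return self._books.get(isbn)

    def update(self, isbn, url):
        if isbn not in self._books:
            raise KeyError(isbn)
        self._books[isbn] = url

    def delete(self, isbn):
        self._books.pop(isbn, None)


def inject_ebook_link(page_html, isbn, store):
    """Return page_html with a link to the eBook inserted before </body>.

    The page comes back unchanged when the store has no copy for this ISBN
    or when the page has no closing body tag.
    """
    url = store.read(isbn)
    if url is None or "</body>" not in page_html:
        return page_html
    link = '<a class="ebook" href="%s">Read my copy</a>' % escape(url, quote=True)
    head, sep, tail = page_html.rpartition("</body>")
    return head + link + sep + tail
```

A user script would do the same insertion in the browser; the proxy variant just moves it to where the page is fetched.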

Since all comments, tags, and reading history are hosted in the cloud, it is essential to prepare for a rainy day:

  • A big name may vouch for availability. I would seriously consider Shelfari only after its acquisition by Amazon.
  • An open API is the key. This is the trend: a web site can no longer lock users in by closing itself off, or users may get cold feet in the first place. Douban, the leader in the Chinese market, has done an excellent job of opening its platform.

I am working on a prototype of the RESTful web service using Django. Stay tuned.

Ubiquity – Lowering the threshold for Web mashup

Web August 28th, 2008

Mozilla Labs recently released Ubiquity, which aims to lower the threshold for Web mashups. The following clip demonstrates Ubiquity in action:


Ubiquity for Firefox from Aza Raskin on Vimeo.

Version 0.1 is pretty much a prototype, but it is still quite inspiring. As more and more applications migrate from the desktop to the cloud, some consider replicating the OS as a WebOS, while others try to glue different pieces together to stop reinventing the wheel. Ubiquity takes the second approach, and it seems quite promising.

As a die-hard command line user, I really love the simplicity of Ubiquity. Hopefully the following features will emerge later:

  • Wrap another command with a prologue and an epilogue, or glue several commands together into new functionality.
  • Pipes, so that the gluing above can be done in place.
  • Drag and drop from Ubiquity to the web page, and vice versa.
  • nohup, fg, bg, and screen: full job control, so that BIG jobs still get a quick response. Does that go too far?

Any ideas for new commands that would make your life easier?

Search Music by humming, not perfect, but feasible

Web August 13th, 2008

I once asked a colleague for the name of a song. I barely remembered the lyrics, though I managed to hum the melody, and I was pretty sure it came from his favorite singer, Fish Leong. But neither of us could recall the name of the song. I searched the web and found Midomi:

Midomi stands out from other music search engines by supporting search by humming. The user hums into a microphone to upload a recording, and the search engine returns the matching section of the song. A purchase link for the full version (via iTunes) sits beside it for the user’s convenience.

I hummed the song, and the very first result was exactly what I was looking for. No big surprise, as the song had just been released on Leong’s latest album and the artist is quite popular. Then I tried some older songs, more precisely, 1995 by Bob Chen. Nothing relevant came back; it seems the song was not even in their database. So I registered, logged in, and recorded my own rendition in Midomi Studio. The next time I searched by humming, the search engine returned the expected result.

As we all know, searching and indexing are well-understood problems; the hardest part is extracting the pattern precisely and concisely. Midomi addresses this problem with an extremely brilliant idea: let the users do it. Is there any sophisticated artificial intelligence algorithm smarter than a human being? And the user activity combines easily with other Web 2.0 ingredients such as friends and groups, which is good news for the venture capitalists.

The only missing piece is an API for developers, which makes it hard to integrate this web service into your personal music management software, such as iTunes or Amarok. And since the pattern extraction is powered by humans, if you happen to be at the tip of the long tail, just pray that the singer is talented.

PyAWS 0.3.0 released

Development, Python, Web May 6th, 2008

After 6 months, PyAWS 0.3.0 is finally released. You can check out the tarball here.

I almost abandoned this project because I found the XSLT approach more appealing: it is ideal for AJAX applications and easy to integrate via simplejson on the server side. Furthermore, I joined Microsoft, moved to Canada, and had less spare time for hobby work I was less interested in. The last straw was the unexpected complexity of the BIG FAT refactoring.

Then, recently, I got an email from a PyAWS user reporting a bug: an unexpected result from the ListLookup operation. It is so good to hear that this library still benefits somebody in the world. So I picked it up, completed the refactoring, and released it today. The library is still in active development: the code style stinks, the documentation sucks, and most of all, testing is lacking. I will explain that last point a little here.

I am personally a big fan of TDD, and at MSFT we have respected test teams to help build our products as well. However, testing PyAWS thoroughly is far beyond my capacity: there are tens of operations and twenty-odd response groups, and response groups may be combined, which makes it extremely difficult to cover all the paths. To make it worse, AWS is dynamic: there is no guarantee that consecutive queries return the same result. I may consider automation to facilitate the unit tests. If you have better ideas, please leave a comment here.
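One possible way to tame the dynamic responses, sketched here as an assumption rather than as anything PyAWS ships: record each response to disk the first time a query runs, then replay the saved copy in later test runs. `ReplayingFetcher`, `cache_key`, and the `fetch` callable are hypothetical names; `fetch` stands in for whatever function actually issues the request and returns the response body as text.

```python
import hashlib
import json
import os


def cache_key(operation, params):
    """Build a stable file name from the operation and its parameters."""
    canonical = json.dumps([operation, sorted(params.items())])
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest() + ".xml"


class ReplayingFetcher:
    """Wrap a fetch function: record on the first call, replay afterwards."""

    def __init__(self, fetch, directory):
        self._fetch = fetch          # hypothetical: fetch(operation, **params) -> str
        self._directory = directory  # where recorded responses are stored

    def __call__(self, operation, **params):
        path = os.path.join(self._directory, cache_key(operation, params))
        if os.path.exists(path):
            # Replay: the test sees the same body on every run.
            with open(path, encoding="utf-8") as f:
                return f.read()
        # Record: hit the real service once and save what came back.
        body = self._fetch(operation, **params)
        with open(path, "w", encoding="utf-8") as f:
            f.write(body)
        return body
```

This does not reduce the number of operation and response group combinations, but it makes each recorded combination repeatable, so assertions on parsed results stay stable between runs.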

AideRSS relieves the pain, fails to cure

Web April 29th, 2008

AideRSS aims to resolve the problem known as “information overload”. One of its typical symptoms in the 2.0 era is that we are overwhelmed by thousands of piled-up unread posts in Google Reader, and it is so hard to keep pace with the rest of the world. AideRSS ranks posts into Good, Great, and Best categories based on its PostRank. This approach moves one step forward in relieving the pain, but it still does not resolve the problem.

PostRank measures the attention, not the value
Though the PostRank algorithm is undisclosed, just like the mysterious PageRank, the promotional material and my personal observation suggest that PostRank is determined by comments, references, and social bookmarking. In other words, a provocative flaming post may attract more attention and achieve a higher PostRank than a plain HOWTO, though the latter is more valuable imho.

PostRank reflects the group wisdom, not the personal choice
I once read digg’s programming channel, then moved on to programming@reddit because the latter is more programmer-oriented and better suits my taste. Once a community grows big, the voice of the majority dwarfs the “long tail” that the minority audience cares about most. The same dilemma applies to PostRank.

PostRank is too humble
There is no evidence for this, only my wild guess: PostRank is feed-based, not internet-wide like PageRank. The assumption is quite reasonable. A global PostRank would be too expensive for a startup company, and it would be too provocative to bloggers: how come a post on Gizmodo ends up lower than its alternative on Engadget? This humbleness renders sorting across feeds less useful.

The next step towards perfection
To address the above issues, PostRank needs to be personalized. Let the user define what is Good, Great, and Best based on his/her historic behavior; check the influence of the public using the popularity contest score; discover like-minded minorities for sharing and referrals.

In short: a hybrid of a Bayesian classifier and a Web 2.0 community.
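To make the Bayesian half concrete, here is a minimal sketch of a naive Bayes classifier trained on one user's reading history. It is an illustration under assumptions, not how AideRSS works: the `NaiveBayes` class, the whitespace tokenizer, and the label names are all made up for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over words, with Laplace (add-one) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of posts seen
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def score(self, text, label):
        """Log of P(label) times the product of P(word | label)."""
        total_docs = sum(self.doc_counts.values())
        log_prob = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denominator = sum(counts.values()) + len(self.vocabulary)
        for word in text.lower().split():
            log_prob += math.log((counts[word] + 1) / denominator)
        return log_prob

    def classify(self, text):
        # Only labels seen in training are considered; train before calling.
        return max(self.doc_counts, key=lambda label: self.score(text, label))
```

Each post the user stars or skips becomes one `train` call, so the ranking follows personal history. The community half would still have to supply the popularity score and the matching of similar readers.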