Friday, April 25, 2008

You really don't know about China.

Update: This post deserves a fuller explanation.

Reading a list of Chinese regions, I realized why so many Chinese might be pissed off at the West for focusing on Tibet.

So I made a graph. Look for Tibet.

That's right: Tibet has only 3 million people. (The U.S. is on the right, for comparison.) There are probably hundreds of problems in China that affect as many people as what's happened in Tibet, but get next to no attention in the West.

Two other things strike me:

1. Social dynamics within China might be unique -- there's nowhere else with as many people who share a language and culture (more or less). Ideas could propagate differently. We'll have to readjust our thinking to get a grasp on what's happening.

2. There's a whole huge-ass part of the world -- comparable to Europe and the U.S. put together (and that's without counting the people in dire poverty) -- churning out culture and architecture and whatever else, which I know next to nothing about, relatively speaking.

I like that; it's exciting.

I've been following the current wave of Chinese nationalism through Wikipedian Andrew Lih, and I hesitated to post this because I thought it might be seen as promoting that nationalism (which is just about as stupid as the Freedom Fries stuff preceding the American invasion of Iraq).

But then I remembered that this blog is inaccessible in the Chinese mainland (last I heard, a couple years ago). :P

Thursday, April 24, 2008

Wikipedia Takes Manhattan: a photographic scavenger hunt for places around our city needing photos for their Wikipedia articles. Looks like it went pretty well.


Is anyone else seeing predictive autocompletion in the Wikipedia search field?

Update: head techie Brion Vibber has the scoop. This is pretty big news -- most people have no idea how vast Wikipedia is, and now the dropdown is a little window into that.

Tuesday, April 22, 2008

Warning: not Wikipedia-related.

My band, the Afternoon Round, at one of our first shows.

The recording levels are way off, but I think it rocks pretty hard. We're playing at the Heidelberg in Ann Arbor on Thursday. There's some more melodic stuff on the MySpace.

Sunday, April 20, 2008

WikiXMLDB provides a way of querying Wikipedia with XQuery.

With all the benefits that Wikipedia promises, it is not easy to use it off-the-shelf in applications. While Wikipedia is available for download in an XML format, individual articles are formatted in a proprietary wiki format. So the most interesting uses of Wikipedia in applications are still locked behind the access troubles.

Here is where WikiXMLDB comes to the rescue. We have parsed the entire English Wikipedia content into XML representation (its total size is about 21GB), loaded it into Sedna and provided a query interface to it. Now you can dissect individual articles, rip out abstracts, sections, links, infoboxes and other components. Or you can combine pieces of existing documents into new XML documents and convert them to web pages with XSLT for example. And you can do it all using the standard W3C XQuery Language. So finally you can start enriching your content with data from Wikipedia and unlock its power for your applications.
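To get a feel for what "dissecting" an article might look like once it's in XML, here's a rough sketch in Python rather than XQuery. The element names (`article`, `abstract`, `section`, `link`) and the sample text are made up for illustration; I haven't checked them against WikiXMLDB's actual schema, so treat this as the general idea only.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup: these element and attribute names are
# assumptions for illustration, NOT WikiXMLDB's real schema.
SAMPLE = """
<article title="Example City">
  <abstract>Example City is a made-up place used for this sketch.</abstract>
  <section title="History">
    <p>See <link target="Example County">the county</link> for background.</p>
  </section>
  <section title="Culture">
    <p>Known for its <link target="Example Festival">festival</link>.</p>
  </section>
</article>
"""

def summarize(xml_text):
    """Pull the title, abstract, section titles and link targets out of one article."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.get("title"),
        "abstract": root.findtext("abstract"),
        # iter() walks the whole tree, so nested sections and links are found too
        "sections": [s.get("title") for s in root.iter("section")],
        "links": [link.get("target") for link in root.iter("link")],
    }

summary = summarize(SAMPLE)
print(summary)
```

The point is just that once the wiki markup is parsed into a tree, abstracts, sections and links become things you can select by name instead of scraping out of raw wikitext.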