Photo: Flickr user lifeontheedge

Saturday, August 04, 2007

Dunbar's number, commonly cited as 150, is a theoretical upper limit on the number of individuals with whom any one person can maintain stable social relationships.

The Great Stink was a time in the summer of 1858 during which the smell of untreated sewage almost overwhelmed people in central London.

Wikipedia needs a better API. If there's one reason you should give the foundation your money, this is it.

Friday, August 03, 2007

Who adds real content to Wikipedia, rather than just correcting typos and wikifying?

Answer:

Only 12% of edits create fresh content. Of that 12%...
  • 0% were made by admins
  • 69% were made by registered users
  • 31% were made by anonymous (non-logged-in) users

...and only 52% were by people who had a user page.

For great Wikimania coverage, get thee to Wikipedia Weekly.

NYTimes dispatch from Wikimania

Thursday, August 02, 2007

The Taipei Times is pumping out wiki coverage. Here's something interesting:

The sharp divide between producers and consumers of knowledge began only about 300 years ago, when book printers secured royal protection for their trade in the face of piracy in a rapidly expanding literary market. The legacy of their success, copyright law, continues to impede attempts to render cyberspace a free marketplace of ideas. Before, there were fewer readers and writers, but they were the same people, and had relatively direct access to each other's work.

Indeed, a much smaller, slower and more fragmented version of the Wikipedia community came into existence with the rise of universities in 12th and 13th century Europe.

The large ornamental codices of the early Middle Ages gave way to portable "handbooks" designed for the lighter touch of a quill pen. However, the pages of these books continued to be made of animal hide, which could easily be written over. This often made it difficult to attribute authorship, because a text might consist of a copied lecture in which the copyist's comments were inserted and then perhaps altered as the book passed to other hands.

Wikipedia has remedied many of those technical problems. Any change to an entry automatically generates a historical trace, so entries can be read as what medieval scholars call a "palimpsest," a text that has been successively overwritten. Moreover, "talk pages" provide ample opportunity to discuss actual and possible changes. While Wikipedians do not need to pass around copies of their text -- everyone owns a virtual copy -- Wikipedia's content policy remains deeply medieval in spirit.

Wikimania (the yearly wiki conference) is starting today in Taiwan. Google it for the main page -- but for the flavor of what it's really like, try the photos. (Wikipedia Weekly has good audio coverage, too.)

So there's England. And Britain, and Great Britain, and the United Kingdom. And the British Isles. And what's the difference between all these, again?

Wednesday, August 01, 2007

When the Going turns Surreal, only Criminals will own Librarians

When I first encountered this story long ago, the allegation was far less exotic: that "Slim Virgin" was the screen name of one Linda Mack, an eccentric college student who lost someone close to her (either a family member or a friend) in the Lockerbie airplane crash, and volunteered a lot of her time and energy to finding the people responsible.

If that is in fact true, then it explains a lot about why she wants to remain anonymous, and why other people are willing to stifle discussion on the topic: she's been put through more than enough shit already.
Besides, so what if she is a female version of James Bond? As long as she doesn't resort to some black ops tactics to resolve disputes (even if that is the only way to settle them), is it a problem? Maybe she can draw on that experience to improve some articles.

However, as Kelly Martin and others have pointed out, the way this has been handled has only made things worse: removing material from article histories only creates more controversy, not less. A simple denial is all that is needed to handle this surreal rumor. ... I don't agree with Tlogmer that requiring Administrators to furnish (or use) their real names would solve problems like this. For example, I happen to share the same name as a car dealer in Australia.

Bonus: When the going gets weird, the weird turn pro.

Abir is an ancient Israelite martial art. Scroll down for pictures of bearded guys with swords.

Being Jewish (but, I hasten to add, not in favor of Israel's policies or any sort of secret agent), I think this is pretty cool. There's really nothing like finding out that your ethnicity has its own kung-fu.

Admins should not be anonymous



SlimVirgin update. I feel like I should be typing this kind of post in a darkened New York apartment with furious clicks and dings, chain-smoking, and sending out a telegram -- yes! a telegram! -- to my editor before the morning presses start.

So, Dear Reader, on the Q.T....

Kelly Martin smells a rat.
The exact details of the rumors that are spreading like wildfire now may not be accurate, but I'd be totally unsurprised to find out that SlimVirgin is somehow connected to the Flight 103 bombings. Why else would she have all of her edits to such topics disappeared? ... SlimVirgin, a little advice: the only way out of this situation is to abandon your account. This drama will surround you indefinitely; the only way for it to end is for you to start completely fresh with a new one. You won't be able to suppress all discussion of this indefinitely, and the more you try the more people will be convinced of the truth of the allegations. You've made this bed; now you must sleep in it.


And J.W. (Wales, that is) has commented on the lists:
In this particular case, due to some really spectacular nonsense, this is being treated as evidence that a private person who has been badly harassed by stalkers and lunatics is... a former spy? Please.

Many editors at Wikipedia have been involved in dealing with extraordinarily crazy people. Some of these people are dangerous in real life. Some of them have made direct physical threats. Others have made phone calls to people's employers. Others have done some homemade self-styled "investigative journalism" that any rational and kind person would see as being what it really is: abusive stalking.

I fully support the right of the Wikipedia community to protect itself from those kinds of lunatics by giving support to those who need to maintain their privacy.


I'm going to come out and say that Wikipedia administrators should not be anonymous. Editors, sure. Admins? Absolutely not. Their real names should be listed. Not admins on the Chinese language Wikipedia, of course, or anywhere there's political repression. But elsewhere, the police can protect you from crazy people. My name, phone number and address have been online for 5 years, and (to my disappointment) I've yet to attract a stalker.

Admins have tremendous influence within Wikipedia. They were originally intended to be enlightened, ideologically neutral "janitors" whose powers were used only to conduct tasks too tedious for ordinary editors. But where software endows power, no man can take it away. Or, rather more specifically, when the deletion process is a "consensus", not a vote, and admins are the only people who get to decide when a consensus has been reached and push the big red button, guess who has disproportionate sway?

Tuesday, July 31, 2007

There's been a lot of fuss about SlimVirgin's identity. (Israeli spy! Secret covert operative!) Milos Rancic injects a much-needed dose of reality.

Sunday, July 29, 2007

But how do you know if it's accurate?



Alright, this is the most awesome thing I have seen in a very long time.

People have been talking for years (at least, I have) about color-coding words in Wikipedia articles based on how long they've survived unchanged. You could see at a glance what had lasted hundreds of revisions, and what had just been added.
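For the curious, here's a rough sketch of how the "how long has this word survived" idea could be computed. It is only an illustration: the function name `word_ages` and the input format (a plain list of revision texts, oldest first) are my own assumptions, and it diffs whitespace-separated words with Python's standard `difflib` rather than anything Wikipedia actually runs.

```python
from difflib import SequenceMatcher


def word_ages(revisions):
    """Return (word, age) pairs for the newest revision.

    `revisions` is a list of revision texts, oldest first (an assumed
    input format). A word's age is the number of consecutive revisions
    it has survived unchanged; a freshly added word has age 0.
    """
    if not revisions:
        return []
    prev_words = revisions[0].split()
    prev_ages = [0] * len(prev_words)
    for text in revisions[1:]:
        words = text.split()
        ages = [0] * len(words)  # anything unmatched counts as new
        matcher = SequenceMatcher(a=prev_words, b=words, autojunk=False)
        for block in matcher.get_matching_blocks():
            # Words inside a matching block were carried over unchanged,
            # so they inherit their previous age plus one.
            for k in range(block.size):
                ages[block.b + k] = prev_ages[block.a + k] + 1
        prev_words, prev_ages = words, ages
    return list(zip(prev_words, prev_ages))


history = ["the cat sat", "the black cat sat", "the black cat slept"]
for word, age in word_ages(history):
    print(word, age)
```

A display layer would then map each age to a color. Word-level diffing is crude (moved paragraphs look brand new), so treat this as a toy.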

Other people have been talking for years about some sort of reputation system -- you could vote on whether an article was reliable, or whether a user was. There are all sorts of problems with this kind of thinking, but I won't get into them because they've just been made moot: *

We present a content-driven reputation system for Wikipedia authors. In our system, authors gain reputation when the edits they perform to Wikipedia articles are preserved by subsequent authors, and they lose reputation when their edits are rolled back or undone in short order. Thus, author reputation is computed solely on the basis of content evolution; user-to-user comments or ratings are not used. The author reputation we compute could be used to flag new contributions from low-reputation authors, or it could be used to allow only authors with high reputation to contribute to controversial or critical pages. A reputation system for the Wikipedia could also provide an incentive for high-quality contributions.

We have implemented the proposed system, and we have used it to analyze the entire Italian and French Wikipedias, consisting of a total of 691,551 pages and 5,587,523 revisions. Our results show that our notion of reputation has good predictive value: changes performed by low-reputation authors have a significantly larger than average probability of having poor quality, as judged by human observers, and of being later undone, as measured by our algorithms.
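To make the core idea concrete, here is a deliberately simplified sketch of a content-driven reputation update: authors gain when an edit is preserved and lose when it is reverted. This is not the authors' algorithm. The function name `update_reputation`, the `"preserved"`/`"reverted"` labels, and the `gain` and `loss` constants are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def update_reputation(reputation, events, gain=1.0, loss=2.0):
    """Apply (author, outcome) events to a reputation table.

    outcome is "preserved" (a later author kept the edit) or
    "reverted" (the edit was undone in short order). The gain and
    loss values are illustrative assumptions, not published numbers.
    """
    for author, outcome in events:
        if outcome == "preserved":
            reputation[author] += gain
        elif outcome == "reverted":
            # Reputation is floored at zero in this sketch.
            reputation[author] = max(0.0, reputation[author] - loss)
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
    return reputation


scores = update_reputation(
    defaultdict(float),
    [
        ("alice", "preserved"),
        ("alice", "preserved"),
        ("alice", "preserved"),
        ("alice", "reverted"),
        ("bob", "reverted"),
    ],
)
print(dict(scores))
```

Note that nobody votes on anybody here: the only input is what happened to the text afterwards, which is the point of the abstract above.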


And I haven't even gotten to the good part.

The same people developed a color-coding system based on their new trust metric. Text contributed by authors with a high content-driven reputation looks normal (black on white); text contributed by authors with a low reputation has an orange background (of varying shades).
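A guess at what the display side could look like: blend from orange (no reputation) to white (full reputation) and use the result as the text's background color. The function name `trust_background`, the `max_reputation` scale, and the exact orange are my assumptions, not details from the demo.

```python
def trust_background(reputation, max_reputation=10.0):
    """Return a CSS hex background color for a given author reputation.

    Zero reputation maps to orange, max_reputation (an assumed scale)
    maps to plain white, and values in between are blended linearly.
    """
    # Clamp the normalized trust value into [0, 1].
    trust = min(max(reputation / max_reputation, 0.0), 1.0)
    orange = (255, 165, 0)
    white = (255, 255, 255)
    r, g, b = (round(o + (w - o) * trust) for o, w in zip(orange, white))
    return f"#{r:02x}{g:02x}{b:02x}"


print(trust_background(0.0))   # lowest trust: full orange
print(trust_background(10.0))  # highest trust: white
```

Wrapping each run of text in a span with that background would give the "varying shades of orange" effect described above.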

Here's the demo.



Be sure to click "random page" a few times and page through the article histories.

(Damn. Now I want to go to Wikimania. Anyone want to buy me a ticket to Taipei?)

A note to the UCSC Wiki Team, who created this: even if your proposal doesn't get implemented on the Wikipedia servers (because of community opposition, or lack of resources), you can still implement it yourself (on Wikipedia content, yes) via Greasemonkey, or a Firefox extension, or a web-based mashup, or whatever.




* Update: Well, not completely moot. Any automated reputation system can be gamed, and here are a few cantankerous side effects this one might have. That's one reason it may be a better idea to roll it out as an add-on than on the Wikipedia servers themselves.



This is a fan-made video, of course, not a real advertisement, as evidenced by the fact that it got the URL wrong. And by pretty much everything else about it.