Aaron Swartz has done some yeoman's work on how Wikipedia actually gets written. First, he presents Jimbo Wales' view, which is that a small core of heavy users has done most of the work (Wales expected an 80-20 split and found something even tighter):
So did the Gang of 500 actually write Wikipedia? Wales decided to run a simple study to find out: he counted who made the most edits to the site. "I expected to find something like an 80-20 rule: 80% of the work being done by 20% of the users, just because that seems to come up a lot. But it's actually much, much tighter than that: it turns out over 50% of all the edits are done by just .7% of the users ... 524 people. ... And in fact the most active 2%, which is 1400 people, have done 73.4% of all the edits." The remaining 25% of edits, he said, were from "people who [are] contributing ... a minor change of a fact or a minor spelling fix ... or something like that."
Aaron was skeptical, so he decided to take a few samples and investigate. He chose the Alan Alda page first:
To investigate more formally, I purchased some time on a computer cluster and downloaded a copy of the Wikipedia archives. I wrote a little program to go through each edit and count how much of it remained in the latest version. Instead of counting edits, as Wales did, I counted the number of letters a user actually contributed to the present article. If you just count edits, it appears the biggest contributors to the Alan Alda article (7 of the top 10) are registered users who (all but 2) have made thousands of edits to the site. Indeed, #4 has made over 7,000 edits while #7 has over 25,000. In other words, if you use Wales's methods, you get Wales's results: most of the content seems to be written by heavy editors.
But when you count letters, the picture dramatically changes: few of the contributors (2 out of the top 10) are even registered and most (6 out of the top 10) have made less than 25 edits to the entire site. In fact, #9 has made exactly one edit -- this one! With the more reasonable metric -- indeed, the one Wales himself said he planned to use in the next revision of his study -- the result completely reverses.
I don't have the resources to run this calculation across all of Wikipedia (there are over 60 million edits!), but I ran it on several more randomly-selected articles and the results were much the same. For example, the largest portion of the Anaconda article was written by a user who only made 2 edits to it (and only 100 on the entire site). By contrast, the largest number of edits were made by a user who appears to have contributed no text to the final article (the edits were all deleting things and moving things around).
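For the curious, here is a rough sketch of how the two metrics differ. To be clear, this is not Aaron's program, which isn't shown in the passage above. It is a minimal illustration that assumes each revision is available as an (author, full text) pair, and it uses Python's standard difflib to decide which characters carried over from one revision to the next. The function names surviving_chars and edit_counts are made up for this example.

```python
from collections import Counter
from difflib import SequenceMatcher


def edit_counts(revisions):
    """Wales-style metric: how many edits each author made."""
    return Counter(author for author, _ in revisions)


def surviving_chars(revisions):
    """Swartz-style metric (sketch): how many characters of the latest
    version are credited to each author.

    `revisions` is a chronological list of (author, full_text) pairs.
    This input shape is an assumption made for the example.
    """
    text = ""
    owners = []  # owners[i] is the author credited with text[i]
    for author, new_text in revisions:
        matcher = SequenceMatcher(None, text, new_text, autojunk=False)
        new_owners = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged text keeps its original author.
                new_owners.extend(owners[i1:i2])
            else:
                # "insert" and "replace" add j2 - j1 new characters;
                # "delete" adds none (j1 == j2).
                new_owners.extend([author] * (j2 - j1))
        text, owners = new_text, new_owners
    return Counter(owners)


if __name__ == "__main__":
    history = [
        ("anon", "abcdef"),   # one edit that adds all the text
        ("heavy", "abcde"),   # two edits that only delete
        ("heavy", "abcd"),
    ]
    print(edit_counts(history))
    print(surviving_chars(history))
```

In the toy history, "heavy" tops the edit count while every surviving character belongs to "anon", which is the same reversal Aaron describes. One caveat: a plain diff like this treats moved text as a deletion plus a fresh insertion, so it would credit the mover with the text. A real analysis would need to handle moves more carefully.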
The upshot: an "Army of Davids" (to steal a metaphor :) ) creates the content, while a small cadre of editors cleans it up and does the follow-on work. That's pretty much how I always thought of it, and it's cool to see that someone has gone out and investigated. I hope Wales is listening, because the current ideas about locking the site down more run counter to this reality.