So
Tim Bray asks about the usefulness of tagging.
Steve Green tries to answer.
What amazes me about the whole thing is that these two super-smart fellas largely missed the point. They are comparing tagging to keyword-based document retrieval:
Steve Green:
"Here's an interesting fact upon which I'll base the rest of my argument: people are horribly inconsistent when assigning keywords to documents. If you give two people the same document and ask them to assign a set of keywords to describe it, then the sets of keywords that they assign will agree only about 20% of the time. This was one of the problems that lead to the development of full text indexing systems. If we couldn't choose a few keywords from a document, we would use every word in the document as a keyword! "
This evaluation is wrong in a major way and in a minor way.
First, the recent tagging explosion is not keyword assignment for document retrieval, but a social phenomenon. Clay Shirky explains a lot of this here. Link popularity, people in your tag hood, and interesting people's link streams are a very large part of why del.icio.us is so cool. This is the major point.
The minor point is that even as a retrieval technology, tagging has some important differences from keyword and hyperlink assignment.
First - the power of numbers. Yeah, when two people assign keywords to a document there is only a 20% overlap. What about when 100 or 1000 people do it? It makes it pretty likely that your search term or any other Joe Shmoe's term is a tag for this url, if it's relevant.
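To put a rough number on that, here is a back-of-the-envelope sketch in Python. It rests on two assumptions of mine that are not in Steve's post: that the 20% figure can be read as the chance that any single tagger picks your search term, and that taggers choose independently of each other. Neither is strictly true, and the function name is made up for illustration, so treat the output as a feel for the trend and not as a measurement.

```python
def chance_term_is_tagged(p_single: float, n_taggers: int) -> float:
    """Chance that at least one of n_taggers used the term.

    Assumes each tagger independently picks the term with
    probability p_single (a simplification, not a measured fact).
    """
    # Complement of "every single tagger missed the term".
    return 1.0 - (1.0 - p_single) ** n_taggers


# Assumed per-tagger hit rate of 20%, for crowds of growing size.
for n in (1, 2, 10, 100, 1000):
    print(n, chance_term_is_tagged(0.2, n))
```

Under those assumptions two taggers already get you to about 36%, and by the time a hundred people have tagged the url a miss is vanishingly unlikely.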
Second, as tagging in del.icio.us is used "mainly" for personal bookmark storage, the retrieval performance question changes scope. True, you have a 20% chance of overlapping with another person, but overlapping with yourself (at some later date) is much more likely. After all, the tags came from your brain in the first place!
Which makes me wonder if these guys have ever used del.icio.us?