The Artima Developer Community
Python Buzz Forum
right question, wrong context - tim bray on

maxim khesin

Posts: 251
Nickname: xamdam
Registered: Mar, 2005

Maxim Khesin is a developer for Liquidnet. I like C++ and Python, and attend the design patterns study group in NYC.
right question, wrong context - tim bray on Posted: May 19, 2005 10:17 PM

This post originated from an RSS feed registered with Python Buzz by maxim khesin.
Original Post: right question, wrong context - tim bray on
Feed Title: python and the web
Feed Description: blog dedicated to python and the networks we live in

So Tim Bray asks about the usefulness of tagging. Steve Green tries to answer.

What amazes me about the whole thing is that these two super-smart fellas largely missed the point. They are comparing tagging to keyword-based document retrieval:

Steve Green:
"Here's an interesting fact upon which I'll base the rest of my argument: people are horribly inconsistent when assigning keywords to documents. If you give two people the same document and ask them to assign a set of keywords to describe it, then the sets of keywords that they assign will agree only about 20% of the time. This was one of the problems that lead to the development of full text indexing systems. If we couldn't choose a few keywords from a document, we would use every word in the document as a keyword! "

This evaluation is wrong in a major way and in a minor way.

First, the recent tagging explosion is not keyword assignment for document retrieval, but a social phenomenon. Clay Shirky explains a lot of this here. Link popularity, people in your tag hood, and interesting people's link streams are a very large part of why it is so cool. This is the major point.

The minor point is that even as a retrieval technology, tagging has some important differences from keyword and hyperlink assignment.

First, the power of numbers. Yeah, when two people assign keywords to a document there is only a 20% overlap. What about when 100 or 1000 people do it? It makes it pretty likely that your search term, or any other Joe Shmoe's term, is a tag for this URL, if it's relevant.
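To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It rests on a loud simplification that is mine, not Steve Green's: treat the 20% figure as the chance that any single tagger picks your particular search term, and assume taggers choose independently. Real overlap statistics are messier, so read this as an illustration of the trend, not a measurement. The function name is made up for the example.

```python
def chance_term_is_tagged(p_single, n_taggers):
    """Chance that at least one of n independent taggers used a given term.

    p_single is the assumed chance that one tagger picks the term
    (0.2 here, borrowed loosely from the 20% overlap figure).
    """
    # Complement of "every single tagger missed the term".
    return 1.0 - (1.0 - p_single) ** n_taggers


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        print(n, chance_term_is_tagged(0.2, n))
```

Under these assumptions two taggers give you a 36% chance of a hit, ten give you roughly 89%, and by a hundred a miss is practically impossible.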

Second, as tagging is used "mainly" for personal bookmark storage, the retrieval performance question changes scope. True, you have a 20% chance of overlapping with another person, but overlapping with yourself (at some later date) is much more likely. After all, the tags came from your brain in the first place!

Which makes me wonder if these guys have ever used
