The Artima Developer Community
Sponsored Link

Agile Buzz Forum
Tagging has simpler problems

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
James Robertson

Posts: 29924
Nickname: jarober61
Registered: Jun, 2003

David Buck, Smalltalker at large
Tagging has simpler problems Posted: Oct 14, 2005 9:51 AM
Reply to this message Reply

This post originated from an RSS feed registered with Agile Buzz by James Robertson.
Original Post: Tagging has simpler problems
Feed Title: Cincom Smalltalk Blog - Smalltalk with Rants
Feed URL: http://www.cincomsmalltalk.com/rssBlog/rssBlogView.xml
Feed Description: James Robertson comments on Cincom Smalltalk, the Smalltalk development community, and IT trends and issues in general.
Latest Agile Buzz Posts
Latest Agile Buzz Posts by James Robertson
Latest Posts From Cincom Smalltalk Blog - Smalltalk with Rants

Advertisement

Here's an article on tagging problems from September - I meant to comment on it then, but I find myself looking at my flagged posts now that I'm on a train. Oddly enough, that relates to the problem at hand. Here's the scenario that the post lays out as problematic:

Let's say Joe reads a new article about a battery technology breakthrough in the Scientific American. Joe has been thinking about buying a fuel-efficient car lately.  When Joe goes to tag the article's web page, he uses the following tags: "battery," "fuel-savings," "car," "future-vehicle."  Let's say the article comes with a .gif of a high-level schematic for how the battery works.  Joe saves the .gif in his Flikkr account, tagging it with "battery," "schematic," and "fuel-savings." 
Eighteen months and many tags later, due to Joe's profession as an engineer at Intel, he has an electric moment and realizes the battery tech breakthrough has more relevance to something he's directly working on, in nano-tech.  Given the keywords he chose, will he be able to 1) recall how he tagged the original article, to find it later on or, 2) if he can find it at all, will he be able to easily re-tag the article and the schematic .gif to match the new context in which Joe finds these ideas relevant?  I wouldn't bet on either outcome.

That is a problem, and it's one most of us run into a lot. I use del.icio.us to tag posts that I want to be able to find later - I use the tag "cst" for posts that I want to share with people about Cincom Smalltalk. Now, the problem I'm going to run into here isn't the same as the one above - I'm not going to forget the tag. However, over time, I'll tag a whole ton of things that way. Once I have tens (never mind hundreds or thousands) of posts tagged that way, how do I find the needle I actually want in that haystack?

The article suggests that refactoring tools (like Smalltalk's refactoring browser) for tag libraries are the answer. I don't think so. There's a wall of inertia that's going to prevent most people from doing that. Heck, the simpler problem that my title references is that most people won't tag their posts at all. Of the ones that do, a smaller subset will be motivated to refactor.

Don't believe me? Well, let's look at two A-Listers as an example - Scoble and Winer. The former never categorizes a post, and the latter rarely bothers with a title or a category. These two are widely read, and deeply involved in "web 2.0" discussions - and even they can't be bothered to take the minimal amount of action necessary to enable it. How likely do you think it is that the average web user will bother? For your answer, walk into anyone's old video cassette library and see how many of the ones recorded at home actually have a label. The answer will be enlightening.

Here's more evidence: I subscribe to 315 feeds as I write this, and I keep a fairly large cache of old items for each one. Let's trawl through those and see which ones have a category set:

RSSFeedManager default getAllItems size.

That tells me how many items I have sitting in memory. The response? 16,466. Now, let's see how many have no category set:

(RSSFeedManager default getAllItems select: [:each | each category isNil or: [each category = 'None']]) size.

The result there? 10,810. Nearly two thirds of the items I'm tracking have no category associated with them. Now, let's walk back to the web 2.0 discussions where the semantic web heads are trying to decide whether RDF, or OPML, or something else is the best way to make sense of all this. I'll make it simple for them - it just doesn't matter. The problem isn't the one posited in the article - i.e., "how did I categorize that item"? It's "holy smokes, I'm awash in a sea of completely uncategorized plain text!". Before someone chimes in that text search will auto-categorize, I'll point out that engines like Google already do a lot of that - and, as Scoble has been noticing, there are limits to that.

Read: Tagging has simpler problems

Topic: Get ready for ApacheCon US 2005! Previous Topic   Next Topic Topic: Comparative Performance of XSLT Processors in Cocoon

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use