The Artima Developer Community
Agile Buzz Forum
Finding related content

James Robertson

David Buck, Smalltalker at large
Finding related content Posted: May 10, 2005 11:53 AM

This post originated from an RSS feed registered with Agile Buzz by James Robertson.
Original Post: Finding related content
Feed Title: Cincom Smalltalk Blog - Smalltalk with Rants
Feed URL: http://www.cincomsmalltalk.com/rssBlog/rssBlogView.xml
Feed Description: James Robertson comments on Cincom Smalltalk, the Smalltalk development community, and IT trends and issues in general.

One of the things buzzing around the blogosphere is the whole related content thing. There's attention.xml, an effort from Technorati to give you feedback on how often stuff you care about is updated; there's the venerable trackback, which attempts (when not being spammed out of existence) to cross-link posts. And of course there are comments.

All this got me thinking - I already have filtering in BottomFeeder: you can create keywords which, if they show up in a post, will suppress that item. What about the other end of that, though - content which, if it shows up, should be elevated? And keyword-based matching isn't enough - you want to be able to flag content that, according to you, is related. So, I opened up a workspace in BottomFeeder (one of the cooler things about having a full environment available in the app) and wrote this:


| relations items results |
"Map each topic to the terms that mark an item as related to it."
relations := Dictionary new.
relations at: 'aggregator' put: #('rssbandit' 'rss bandit' 'bottomfeeder' 'newsgator' 'feeddemon' 'feed demon').

items := RSSFeedManager default getAllItems.
"One result collection per topic, so adding a topic above needs no change here."
results := Dictionary new.
relations keysDo: [:key | results at: key put: OrderedCollection new].
items do: [:each | | toMatch |
	toMatch := each description.
	toMatch notNil ifTrue: [
		relations keysAndValuesDo: [:key :values | | matchOrNil |
			"Wildcard match: does any term appear anywhere in the description?"
			matchOrNil := values detect: [:eachValue | ('*', eachValue, '*') match: toMatch] ifNone: [nil].
			matchOrNil notNil ifTrue: [(results at: key) add: each]]]].
^results

 

So what does that do? First, I defined a dictionary that maps a topic to its related terms (the names of a few aggregators). Then I ran through all the items, looking for descriptions that mention any of those terms, and slapped the matches into a results dictionary. This isn't that fast, since it rescans every item on each run; doing it incrementally as items arrive would be a lot more efficient. Still - the nice thing is that I could experiment on live data in a running application. Here's a screenshot of the resulting inspector:

[Screenshot: Inspecting the Results]
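
As a rough illustration of the incremental idea mentioned above, the matching step can be pulled into a block that files one item at a time. This is only a sketch: `fileItem` is a hypothetical name, the item is assumed to respond to `description` as in the workspace code, and how BottomFeeder would actually call such a hook when an item arrives is not covered here.

```smalltalk
| relations results fileItem |
relations := Dictionary new.
relations at: 'aggregator' put: #('rssbandit' 'bottomfeeder' 'newsgator').
results := Dictionary new.

"fileItem is a hypothetical hook: evaluate it once for each item as it arrives."
fileItem := [:item | | text |
	text := item description.
	text notNil ifTrue: [
		relations keysAndValuesDo: [:key :values | | matchOrNil |
			matchOrNil := values detect: [:term | ('*', term, '*') match: text] ifNone: [nil].
			"Create the topic's collection lazily, the first time something matches."
			matchOrNil notNil ifTrue: [
				(results at: key ifAbsentPut: [OrderedCollection new]) add: item]]]].
```

To try it against items that are already loaded, the same block can be handed to the collection used earlier: `RSSFeedManager default getAllItems do: fileItem`. Each new item then costs one pass over the term lists instead of a rescan of everything.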

That's an interesting "first cut" at thinking about this problem. Clearly, there are things to consider here - for instance, say I wanted to find things related to RSS, Atom, etc. I want items that mention those terms, but I don't want to include items that merely contain links to feeds ending in .rss (and the like). So - it's a simple-sounding problem with a lot of complexity behind it.
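
One possible way to separate a real mention from a link is to compare whole words and skip anything that looks like a URL. The sketch below is an assumption-laden illustration, not what BottomFeeder does: `mentions` is a hypothetical name, tokens are split on single spaces only, and a token counts as a link if it contains `://`.

```smalltalk
| mentions |
"Answer true when text contains term as a standalone word,
 ignoring tokens that look like links (anything containing '://')."
mentions := [:term :text | | stream found |
	found := false.
	stream := ReadStream on: text.
	[stream atEnd or: [found]] whileFalse: [ | token word |
		token := stream upTo: Character space.
		('*://*' match: token) ifFalse: [
			"Drop punctuation and digits so 'RSS,' compares equal to 'rss'."
			word := (token select: [:c | c isLetter]) asLowercase.
			found := word = term asLowercase]].
	found].

mentions value: 'rss' value: 'I read RSS, mostly'.                "true"
mentions value: 'rss' value: 'see http://example.com/index.rss'.  "false"
```

A bare file name such as `feed.rss` is also rejected, because stripping the punctuation leaves `feedrss` rather than `rss`. Multi-word terms like 'rss bandit' would need a different comparison, so this only covers the single-word case.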

I know, I know - any second now, the RDF crew is going to explain how RDF triples solve this entire problem. Which might be true, IF all feeds out there were in RDF. Since they aren't, we have to look elsewhere for answers...
