The Artima Developer Community

Python Buzz Forum
Weblog URL stemming

Phillip Pearson


Phillip Pearson is a Python hacker from New Zealand
Weblog URL stemming Posted: Feb 24, 2005 5:29 PM

This post originated from an RSS feed registered with Python Buzz by Phillip Pearson.
Original Post: Weblog URL stemming
Feed Title: Second p0st
Feed URL: http://www.myelin.co.nz/post/rss.xml
Feed Description: Tech notes and web hackery from the guy that brought you bzero, Python Community Server, the Blogging Ecosystem and the Internet Topic Exchange


Reading Leonard Richardson's paper about his Ultra Gleeper recommendation engine, I notice that he's run into the problem of stemming weblog URLs.

I managed to write a reasonable stemmer back when I was running the Blogging Ecosystem. If I remember, I'll dig it out when I've got some free time and improve it so that it does a better job of matching more modern[1] URLs. It's a function that would be handy to have in an open source library.

----

1. Back in 2002 and 2003, when the ecosystem was operating, people tended to use either simple MT-style archive links ("/archives/12345.html") or dated ones ("/2003/2/2.html"), whereas now it's quite popular to put your post title in the URL.
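To make the idea concrete, here is a rough Python sketch of what stemming a weblog URL could look like: strip the post-specific tail off a permalink so that every post from the same weblog reduces to the same root. This is not the stemmer from the ecosystem. The function name `stem_weblog_url` and the three patterns are assumptions made for illustration, based only on the URL styles mentioned above, and the title-slug pattern assumes the slug follows a dated path.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns, covering only the URL styles described in the footnote.
_POST_PATTERNS = [
    # MT-style archive link: /archives/12345.html
    re.compile(r"/archives/\d+\.html?$"),
    # Dated link: /2003/2/2.html
    re.compile(r"/\d{4}/\d{1,2}/\d{1,2}\.html?$"),
    # Dated link ending in a post-title slug: /2005/02/24/some-post-title/
    re.compile(
        r"/\d{4}/\d{1,2}(?:/\d{1,2})?/[a-z0-9][a-z0-9_-]*(?:\.html?|/)?$",
        re.IGNORECASE,
    ),
]


def stem_weblog_url(url):
    """Return a guess at the weblog root for a post URL.

    URLs that match none of the patterns are returned with only the
    host lowercased and an empty path replaced by "/".
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path
    for pattern in _POST_PATTERNS:
        match = pattern.search(path)
        if match:
            # Keep everything before the post-specific tail.
            path = path[:match.start()] + "/"
            break
    return f"{parts.scheme}://{host}{path or '/'}"
```

A real stemmer would need many more patterns (and probably some per-host special cases), but under these assumptions `stem_weblog_url("http://example.com/blog/archives/12345.html")` and any other post under `/blog/` reduce to the same root.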
