This post originated from an RSS feed registered with Python Buzz
by Phillip Pearson.
Original Post: Weblog URL stemming
Feed Title: Second p0st
Feed URL: http://www.myelin.co.nz/post/rss.xml
Feed Description: Tech notes and web hackery from the guy that brought you bzero, Python Community Server, the Blogging Ecosystem and the Internet Topic Exchange
I wrote a reasonable stemmer back when I was running the Blogging Ecosystem; if I remember, I'll dig it out when I've got some free time and improve it so it does a better job of matching more modern[1] URLs. It's a function that would be handy to have in an open source library.
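To give a rough idea of what such a function might look like, here is a minimal sketch in Python. It is not the original stemmer, which isn't shown here. It assumes that "stemming" means reducing a post permalink to the URL of the weblog it belongs to, and the function name `stem_weblog_url`, the regular expressions, and the example.com URLs are all made up for illustration, based only on the URL shapes mentioned in the footnote.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical patterns (assumptions, not taken from the original stemmer).
# Each one matches the post-specific tail of a permalink path.
_POST_SUFFIXES = [
    # MT-style archive link: /archives/12345.html
    re.compile(r"/archives/\d+\.html$"),
    # Dated link: /2003/2/2.html
    re.compile(r"/\d{4}/\d{1,2}/\d{1,2}\.html$"),
    # Dated link ending in a post title: /2005/06/some-title/ or .../some-title.html
    re.compile(
        r"/\d{4}/\d{1,2}(?:/\d{1,2})?/[a-z0-9][a-z0-9_-]*(?:\.html|/)?$",
        re.IGNORECASE,
    ),
]


def stem_weblog_url(url):
    """Return a guess at the weblog's base URL for a post permalink.

    URLs that match none of the known patterns come back with their path
    unchanged; the query string and fragment are always dropped.
    """
    parts = urlsplit(url)
    path = parts.path
    for pattern in _POST_SUFFIXES:
        # Replace the post-specific tail with a single slash.
        stripped, count = pattern.subn("/", path)
        if count:
            path = stripped
            break
    return urlunsplit((parts.scheme, parts.netloc, path or "/", "", ""))


if __name__ == "__main__":
    for example in (
        "http://example.com/archives/12345.html",
        "http://example.com/blog/2003/2/2.html",
        "http://example.com/2005/06/weblog-url-stemming/",
    ):
        print(example, "->", stem_weblog_url(example))
```

A real version would need far more patterns than this, and probably some per-host special cases, but a table of suffix patterns keeps it easy to add new URL styles as they turn up.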
----
1. Back in 2002 and 2003, when the ecosystem was operating, people tended to use either simple MT-style archive links ("/archives/12345.html") or dated ones ("/2003/2/2.html"), whereas now it's quite popular to put your post title in the URL.