This post originated from an RSS feed registered with Ruby Buzz
by Adam Green.
Original Post: Excerpting HTML
Feed Title: ruby.darwinianweb.com
Feed URL: http://www.nemesis-one.com/rss.xml
Feed Description: Adam Green's Ruby development site
Another interesting decision is whether to display all of a post's text or just an excerpt. I originally said I was going to use just an excerpt, and I still want to do that, but there are real problems with excerpting HTML. I can't just crop the text after a certain number of characters or words, because that can leave HTML tags unclosed, and an unclosed tag can make the rest of the page look like garbage. Stripping out all the HTML tags is easy, but it can result in some hard-to-read posts, and it also removes all the hyperlinks. Online aggregators that publish excerpts appear to follow this approach of stripping tags, so that is what I will do as well. There will be a link to the original post, of course, so the reader can follow it to reach any hyperlinks in the body of the post.
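
To make the tag-stripping idea concrete, here is a minimal sketch in Ruby. It is only an illustration and not the code this site runs: the method name `excerpt_html`, the 50-word default, and the trailing `...` are all assumptions. The regex is deliberately naive. It does not decode entities such as `&amp;`, and it can be confused by a `>` inside an attribute value.

```ruby
# Hypothetical helper (not from the original post): strip anything that
# looks like an HTML tag, then keep at most max_words words.
def excerpt_html(html, max_words = 50)
  # Replace each tag with a space so that words separated only by tags
  # (for example "</p><p>") do not run together.
  text = html.gsub(/<[^>]*>/, ' ')

  # String#split with no argument splits on whitespace and collapses runs.
  words = text.split

  # Short posts are returned whole, with no ellipsis.
  return words.join(' ') if words.length <= max_words

  words.first(max_words).join(' ') + '...'
end

puts excerpt_html('<p>Hello <b>brave</b> new world of feeds</p>', 3)
```

Because the tags are removed before the text is cropped, the excerpt cannot end inside an unclosed tag. The cost is the one described above: links and formatting are lost, so the excerpt needs a link back to the original post.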