The Artima Developer Community
Sponsored Link

Java Buzz Forum
TagSoup

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Elliotte Rusty Harold

Posts: 1573
Nickname: elharo
Registered: Apr, 2003

Elliotte Rusty Harold is an author, developer, and general kibitzer.
TagSoup Posted: Jun 27, 2008 7:13 AM
Reply to this message Reply

This post originated from an RSS feed registered with Java Buzz by Elliotte Rusty Harold.
Original Post: TagSoup
Feed Title: The Cafes
Feed URL: http://cafe.elharo.com/feed/atom/?
Feed Description: Longer than a blog; shorter than a book
Latest Java Buzz Posts
Latest Java Buzz Posts by Elliotte Rusty Harold
Latest Posts From The Cafes

Advertisement

Here’s part 13 of the ongoing serialization of Refactoring HTML, also available from Amazon and Safari.

John Cowan’s TagSoup (http://home.ccil.org/~cowan/XML/tagsoup/) is an open source HTML parser written in Java that implements the Simple API for XML, or SAX. Cowan describes TagSoup as “a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.”

Read: TagSoup

Topic: Flock2 Ease Previous Topic   Next Topic Topic: JBoss in The Cloud, EC 2 - The Video! from PeopleOverProcess.com

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use