The Artima Developer Community
Ruby Buzz Forum
Using Mechanize for HTML Scraping

James Britt


James Britt is a principal in 30 Second Rule and runs ruby-doc.org and rubyxml.com.
Using Mechanize for HTML Scraping Posted: Dec 4, 2005 1:41 PM

This post originated from an RSS feed registered with Ruby Buzz by James Britt.
Original Post: Using Mechanize for HTML Scraping
Feed Title: James Britt: Ruby Development
Feed URL: http://feeds.feedburner.com/JamesBritt-Home
Feed Description: James Britt: Playing with better toys

There was some discussion on ruby-talk recently about HTML screen scraping, and some questions about using Michael Neumann's WWW::Mechanize library.

I mentioned that I was using that library to grab multiple CafePress pages and extract product data to assemble the rubystuff.com Web site. Someone asked if I could post my code as an example, which seemed a reasonable idea.

The code was never meant to be more than a way to save me from doing more work than absolutely necessary, but it turned out to be a pretty good, small-but-instructive example of what one can do with Mechanize. I cleaned up/refactored a few things, added in a narrative, and have put the results up here.

I'm not sure this will remain the final home for the article/example, though. I'm thinking of using the Neurogami site as a repository for all my writing and code libraries. I have things on jamesbritt.com, rubyxml.com, the Linux Journal site, and maybe elsewhere, plus Ruby code hosted in almost as many different places, so some one-stop shopping might make it easier to keep track of things.

 

 

Read: Using Mechanize for HTML Scraping


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use