The Artima Developer Community

Ruby Buzz Forum
Purging referrer URLs concurrently

Eigen Class

Posts: 358
Nickname: eigenclass
Registered: Oct, 2005

Eigenclass is a hardcore Ruby blog.
Purging referrer URLs concurrently Posted: Apr 17, 2006 4:58 AM

This post originated from an RSS feed registered with Ruby Buzz by Eigen Class.
Original Post: Purging referrer URLs concurrently
Feed Title: Eigenclass
Feed URL: http://feeds.feedburner.com/eigenclass
Feed Description: Ruby stuff --- trying to stay away from triviality.

After writing a "pooling executor" that assigns tasks to a number of handlers in parallel, filtering referrer URLs reasonably efficiently becomes much easier. Checking multiple referrers concurrently keeps the available bandwidth busy, which matters when you have to fetch ~8000 pages. Done serially, the process could easily take a couple of hours or more.
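To make the idea concrete, here is a minimal sketch of checking referrers with a fixed pool of worker threads. It is not the pooling executor from the original post: `filter_concurrently` and `links_back?` are hypothetical names, and the "page mentions my site" test is only an assumed example of what a check might look like.

```ruby
require "thread"
require "net/http"
require "uri"

# Hypothetical helper: run the given check on every distinct URL using
# `workers` threads and return the URLs that passed, in input order.
def filter_concurrently(urls, workers: 10, &check)
  unique = urls.uniq                      # fetch each distinct URL only once
  queue = Queue.new
  unique.each { |url| queue << url }
  verdicts = {}
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true)                 # non-blocking pop
        rescue ThreadError
          break                           # queue drained: this worker is done
        end
        ok = begin
          check.call(url)
        rescue StandardError
          false                           # timeouts, DNS errors etc. count as "drop"
        end
        lock.synchronize { verdicts[url] = ok }
      end
    end
  end
  threads.each(&:join)

  unique.select { |url| verdicts[url] }
end

# Assumed example check: keep a referrer only if the page it points to
# mentions the given site. A real check would likely be stricter.
def links_back?(url, site)
  response = Net::HTTP.get_response(URI(url))
  response.is_a?(Net::HTTPSuccess) && response.body.to_s.include?(site)
end

# Usage (performs real HTTP requests, so it is left commented out):
#   good = filter_concurrently(referrers, workers: 20) do |url|
#     links_back?(url, "eigenclass.org")
#   end
```

The workers spend most of their time waiting on the network, so plain Ruby threads are enough to overlap many fetches; the pool size is simply a knob to tune against the available bandwidth.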

The script described below helped me remove roughly 93% of the referrer URLs, going from 13595 (7909 unique) to 933 (653 unique).

Task description

My HTTP referrers (and the corresponding hits) are stored as serialized hashes in a number of files, marshalled with TMarshal, AMarshal's elder (yet simpler) sibling:


Read more...
