After writing a "pooling executor" that assigns
tasks to a number of handlers in parallel, filtering referrer URLs somewhat
efficiently becomes easier. Checking multiple referrers concurrently helps
maximize bandwidth usage, which is quite important when you have to fetch
~8000 pages. Were this done serially, the process could easily take a couple
hours or more.
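For concreteness, here is a minimal sketch of the kind of pooling executor
this refers to: a fixed set of worker threads pulling tasks off a shared
queue. It is written from the description above, not taken from the actual
script; the names (PoolingExecutor, submit, reachable?) and the trivial
per-URL check are my own stand-ins.

  require 'thread'
  require 'net/http'
  require 'uri'

  # Hypothetical pooling executor: N worker threads drain a shared task queue,
  # so one slow HTTP response doesn't serialize the whole run.
  class PoolingExecutor
    def initialize(num_workers)
      @queue   = Queue.new
      @workers = Array.new(num_workers) do
        Thread.new do
          # Queue#pop blocks until a task arrives; a nil sentinel stops the worker.
          while (task = @queue.pop)
            task.call
          end
        end
      end
    end

    # Enqueue a block to be run by one of the workers.
    def submit(&task)
      @queue << task
    end

    # Push one sentinel per worker, then wait for all of them to finish.
    def shutdown
      @workers.size.times { @queue << nil }
      @workers.each { |w| w.join }
    end
  end

  # Stand-in check, not the real filtering logic: a referrer "passes"
  # if its page can be fetched at all.
  def reachable?(url)
    Net::HTTP.get(URI.parse(url))
    true
  rescue StandardError
    false
  end

  pool    = PoolingExecutor.new(10)
  results = Queue.new   # thread-safe sink for the verdicts
  %w[http://example.com/a http://example.com/b].each do |url|
    pool.submit { results << [url, reachable?(url)] }
  end
  pool.shutdown
  puts results.pop.inspect until results.empty?

The queue does double duty here: it feeds tasks to the workers, and the nil
sentinels pushed at shutdown tell each worker to exit once the backlog is
drained.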
The script described below helped me remove over 93% of the referrer URLs,
going from 13595 hits (7909 unique URLs) to 933 (653 unique).
Task description
My HTTP referrers (and the corresponding hits) are stored as serialized hashes
in a number of files, marshalled with TMarshal,
AMarshal's elder (yet simpler) sibling: