This post originated from an RSS feed registered with Python Buzz
by Aaron Brady.
Original Post: 10m Hack: Replay an Apache log file and graph the response times
Feed Title: insom.me.uk
Feed URL: http://feeds2.feedburner.com/insommeuk
Right, not the script I'd intended to post, but useful nonetheless. This script takes an Apache access_log file (in the standard 'combined' format) and replays the requests in it. You'd probably want to feed it a sample of 100-1000 rows rather than a whole log.
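For anyone wanting to try something similar, here is a rough sketch of the parsing step. It is not the original script: the regular expression, the parse_lines helper and the sample log lines are all made up for illustration, and the pattern assumes well-formed 'combined' lines.

```python
import re
from itertools import islice

# Combined format: host ident user [time] "request" status bytes "referer" "user-agent"
# This pattern is an illustrative assumption, not taken from the original script.
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def parse_lines(lines, limit=1000):
    """Yield (timestamp, method, path) for at most `limit` parseable lines."""
    matches = (COMBINED.match(line) for line in lines)
    hits = (m for m in matches if m)  # silently skip lines that do not parse
    for m in islice(hits, limit):
        yield m.group("time"), m.group("method"), m.group("path")


# Made-up sample data; a real run would iterate over open("access_log").
sample = [
    '127.0.0.1 - - [10/Oct/2009:13:55:36 +0100] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '127.0.0.1 - - [10/Oct/2009:13:55:36 +0100] "GET /logo.png HTTP/1.1" 200 512 "/index.html" "Mozilla/5.0"',
    'this line is not in combined format',
]
parsed = list(parse_lines(sample, limit=100))
```

The limit argument is how you would keep the replay down to a few hundred rows without trimming the log by hand first.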
It can do this in multiple threads, and can give those threads a staggered start, so you can see on the graph how the number of concurrent clients affects the round-trip time (RTT).
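The staggered start could look roughly like the sketch below. The function names, the stagger parameter and the injected fetch callable are assumptions made for this example (the fake fetch just sleeps, so nothing touches the network); the original script may be organised quite differently.

```python
import threading
import time


def replay_worker(paths, fetch, results, lock, started_at):
    """Fetch every path in order, recording (offset from start, elapsed) per request."""
    for path in paths:
        t0 = time.monotonic()
        fetch(path)
        elapsed = time.monotonic() - t0
        with lock:  # the results list is shared between threads
            results.append((t0 - started_at, elapsed))


def run_staggered(paths, fetch, workers=4, stagger=0.5):
    """Start `workers` threads `stagger` seconds apart, each replaying `paths`."""
    results, lock = [], threading.Lock()
    started_at = time.monotonic()
    threads = []
    for i in range(workers):
        t = threading.Thread(
            target=replay_worker,
            args=(paths, fetch, results, lock, started_at),
        )
        t.start()
        threads.append(t)
        if i < workers - 1:
            time.sleep(stagger)  # delay the next worker's start
    for t in threads:
        t.join()
    return sorted(results)


def fake_fetch(path):
    """Stand-in for a real HTTP request (hypothetical, for demonstration only)."""
    time.sleep(0.01)


timings = run_staggered(["/index.html", "/logo.png"], fake_fetch, workers=3, stagger=0.05)
```

Because each worker joins later than the last, the early points on the graph show one client on its own and the later points show the load with every worker running.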
The script outputs a tab-delimited file, ready to import into OpenOffice.org and turn into an X/Y scatter graph.
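Writing the tab-delimited output is the easy part. A minimal sketch using the standard csv module follows; the column names and the (offset, elapsed) row layout are assumptions for illustration, not the original script's format.

```python
import csv
import io


def write_tsv(results, out):
    """Write (offset, elapsed) pairs as tab-delimited rows with a header line."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["offset_s", "response_s"])  # hypothetical column names
    for offset, elapsed in results:
        writer.writerow(["%.3f" % offset, "%.3f" % elapsed])


# Made-up timings; a real run would pass an open file instead of a StringIO.
buf = io.StringIO()
write_tsv([(0.0, 0.120), (1.5, 0.340)], buf)
tsv_text = buf.getvalue()
```

With the offset in the first column and the response time in the second, the two columns map straight onto the X and Y axes of a scatter graph.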
The major cheat here is that itertools.groupby is used to group each second's requests together, and then we sleep to space the groups apart again. This means you get web-browser-like activity: someone hitting a page and its images in one go, then a pause before the next page. However, the visitor it emulates is a very impatient one.
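The groupby cheat can be sketched like this. It assumes the entries are (timestamp, path) pairs in log order and that the pause between bursts is a fixed interval rather than the real gap from the log; both are assumptions for the example, as are the function and parameter names.

```python
import time
from itertools import groupby


def replay(entries, fetch, sleep=time.sleep, pause=1.0):
    """Fire each second's requests back to back, pausing between bursts.

    `entries` is an iterable of (timestamp_string, path) in log order.
    groupby only merges adjacent items, which is what we want for a
    chronologically ordered access log.
    """
    for i, (stamp, burst) in enumerate(groupby(entries, key=lambda e: e[0])):
        if i:
            sleep(pause)  # fixed gap, regardless of how long the real gap was
        for _, path in burst:
            fetch(path)


# Made-up entries: a page plus its image in one second, then two later hits.
entries = [
    ("10/Oct/2009:13:55:36 +0100", "/index.html"),
    ("10/Oct/2009:13:55:36 +0100", "/logo.png"),
    ("10/Oct/2009:13:55:41 +0100", "/about.html"),
    ("10/Oct/2009:13:56:02 +0100", "/contact.html"),
]
fetched, naps = [], []
# Record the calls instead of hitting a server or really sleeping.
replay(entries, fetched.append, sleep=naps.append, pause=1.0)
```

In this sketch the five-second and twenty-one-second gaps in the sample both collapse to the same one-second pause, which is the kind of impatience described above.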