This post originated from an RSS feed registered with .NET Buzz
by Tim Sneath.
Original Post: Blog Log Analysis
Feed Title: Tim Sneath's Blog
Feed URL: /msdnerror.htm?aspxerrorpath=/tims/Rss.aspx
Feed Description: Random mumblings on Microsoft, .NET, and other topics.
I've been keeping a blog for about two months now, and I thought it would be an interesting
exercise to do some analysis of the logs. The blogging application that this site
uses (BlogX) records the daily hits each blog gets in a tab-delimited file, so I used
Data Transformation
Services to clean the data up a bit and import it into SQL Server, and then finally
used Analysis Services to create a multidimensional cube that I could manipulate with
Excel. This process worked very smoothly, and avoided the need to purchase a specialised
web reporting tool. I'll document this process more fully at a later stage, but the
information gleaned from the analysis was quite revealing about the current status
of the blogging world:
At the moment my blog averages around 40,000 hits per month. I've no idea how
that compares to other blogs out there, but knowing that your blog is read is definitely
a motivating factor when writing new entries! I suspect that most people stumble across
this blog because it's posted on the main GotDotNet blogs page; I'm certainly under
no illusions that it's to do with any personal
fame. Like any other website, one of the biggest challenges of a blog is capturing
and maintaining traffic to the site. For bloggers without the inherent advantage of
working for Microsoft, aggregation sites such as PDC Bloggers are probably one of
the best ways to spread the word.
I'm amused and amazed at how many people have wound up at the blog by means of a Google
search. Unsurprisingly, searching for "Tim Sneath" brings the blog more or less to
the top of the results, but I've had hits that have come from such bizarre search
terms as "lossless wma", "Sitar music that you can listen to on the net", and "Frank
Zappa AND Albanian Music"! Approximately 5% of browser hits to the site come via Google;
other search engines might as well not exist for the traffic they bring.
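As a rough illustration of how search phrases like these can be recovered, here is a minimal Python sketch that pulls the query out of a referrer URL. It assumes the log keeps a referrer for each hit and that Google passes the search phrase in a `q` parameter; the function name and the sample URL are hypothetical, not taken from the BlogX logs.

```python
from urllib.parse import urlparse, parse_qs


def google_search_terms(referrer):
    """Return the search phrase from a Google referrer URL, or None."""
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    # Accept www.google.com, google.co.uk and similar hosts (an assumption
    # about which hosts count as "Google").
    if "google" not in host.split("."):
        return None
    # parse_qs decodes '+' and %xx escapes and returns a list per key.
    terms = parse_qs(parsed.query).get("q")
    return terms[0] if terms else None


if __name__ == "__main__":
    # Hypothetical referrer, for illustration only.
    print(google_search_terms("http://www.google.com/search?q=lossless+wma&hl=en"))
```

Counting the hits for which this returns a phrase, and dividing by the total number of browser hits, would give a figure comparable to the 5% quoted above.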
There's an astonishing variety of blog aggregators and browsing tools in use: I counted
over 500 distinct user agent strings. Of the aggregators, the different versions of
SharpReader are the most popular, with a 46% share; Newsgator comes next with 23%;
NewzCrawler has a 5% share, and many others have a smaller share. (Incidentally, 8% of
visitors have an empty user agent string, a surprisingly high number.) I'm a
SharpReader user myself; although
I've never done an exhaustive survey of aggregation tools, I've certainly heard good
things about Newsgator. What's NewzCrawler like (I've not come across it before)?
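For anyone wanting to produce a similar breakdown, the following Python sketch groups raw user agent strings into families and reports each family's share. It is not the Analysis Services cube described above; the family list, the substring matching rule and the sample strings are all assumptions made for illustration. Note that it measures shares against all hits, whereas the aggregator percentages above are shares among aggregators only.

```python
from collections import Counter

# Assumed family names; a real log would need a longer, curated list.
AGGREGATOR_FAMILIES = ("SharpReader", "NewsGator", "NewzCrawler")


def classify_agent(user_agent):
    """Return the aggregator family, '(empty)', or 'other' for one string."""
    agent = user_agent.strip()
    if not agent:
        return "(empty)"
    lowered = agent.lower()
    for family in AGGREGATOR_FAMILIES:
        # Case-insensitive substring match folds version variants together.
        if family.lower() in lowered:
            return family
    return "other"


def agent_shares(user_agents):
    """Map each family to its percentage share of the given hits."""
    counts = Counter(classify_agent(ua) for ua in user_agents)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: 100.0 * n / total for family, n in counts.items()}


if __name__ == "__main__":
    # Made-up user agent strings, one per hit.
    sample = ["SharpReader/0.9.2.0", "SharpReader/0.9.1.0", "NewsGator/1.2", ""]
    for family, share in sorted(agent_shares(sample).items()):
        print(f"{family}: {share:.1f}%")
```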
Traffic drops by about 20% at the weekend. I was expecting that to be higher, but
I guess many people leave their computers on permanently, so the aggregators continue
to poll for new content.
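The weekend comparison can be reproduced without a full OLAP setup. The sketch below uses Python's built-in SQLite module as a stand-in for the DTS and SQL Server steps described earlier: it loads a tab-delimited file of daily hits into a table and compares the weekday and weekend averages. The two-column `date<TAB>hits` layout, the table name and the function names are assumptions; the actual BlogX log format may differ.

```python
import csv
import io
import sqlite3


def load_hits(tsv_text):
    """Load 'YYYY-MM-DD<TAB>hits' rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_hits (day TEXT PRIMARY KEY, hits INTEGER)")
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    conn.executemany(
        "INSERT INTO daily_hits VALUES (?, ?)",
        ((row[0], int(row[1])) for row in rows if row),  # skip blank lines
    )
    return conn


def weekend_drop_percent(conn):
    """Percentage by which the weekend average falls below the weekday average."""
    # strftime('%w', ...) yields '0' for Sunday through '6' for Saturday.
    query = """
        SELECT strftime('%w', day) IN ('0', '6') AS is_weekend, AVG(hits)
        FROM daily_hits
        GROUP BY is_weekend
    """
    averages = dict(conn.execute(query).fetchall())
    weekday, weekend = averages[0], averages[1]
    return 100.0 * (weekday - weekend) / weekday


if __name__ == "__main__":
    # Invented figures for one week, Saturday to Friday.
    sample = (
        "2003-11-01\t800\n2003-11-02\t800\n2003-11-03\t1000\n2003-11-04\t1000\n"
        "2003-11-05\t1000\n2003-11-06\t1000\n2003-11-07\t1000\n"
    )
    print(f"Weekend drop: {weekend_drop_percent(load_hits(sample)):.1f}%")
```

The data must contain at least one weekday and one weekend day, otherwise the lookup of one of the two averages fails.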
Overall it's been an intriguing experiment. I look forward to repeating it in a couple
of months to see whether there have been any noticeable changes in these trends as weblogging
continues to mature.