The Artima Developer Community
Agile Buzz Forum
Looking at the search engines

James Robertson


David Buck, Smalltalker at large
Looking at the search engines Posted: May 30, 2005 12:28 PM

This post originated from an RSS feed registered with Agile Buzz by James Robertson.
Original Post: Looking at the search engines
Feed Title: Cincom Smalltalk Blog - Smalltalk with Rants
Feed URL: http://www.cincomsmalltalk.com/rssBlog/rssBlogView.xml
Feed Description: James Robertson comments on Cincom Smalltalk, the Smalltalk development community, and IT trends and issues in general.

Inspired by Tim Bray's post on search engine referrals to his blog, I decided to take a look at my own logs. The results aren't a terribly big surprise. Here's a small image of the chart I came up with, going back through April:

[Chart: spring 2005 search engine rankings at Smalltalk Rants]

You should be able to click that image for a larger version. The yellow line is Google. It outpaces all the other search engines by quite a bit, but it's been dipping of late, while Yahoo has risen slightly. Ask Jeeves and MSN Search are nearly invisible. Getting that chart was an interesting process all by itself: I used a simple Smalltalk script over the logs I downloaded, building up a dictionary of weekly accesses:

"Weekly tallies, keyed by the first day of each week."
logDict := Dictionary new.
refs := ApacheLogScanner scan: 'f:\logs\blog_log.0' recordSeparator: Character lf.
"The week being tallied; change these two dates to process another week."
begin := Date readFrom: '5/8/05' readStream.
end := Date readFrom: '5/14/05' readStream.
scanned := refs entries select: [:each | each timestamp asDate >= begin and: [each timestamp asDate <= end]].
"One counter per search engine."
dict := Dictionary new.
dict at: 'google' put: 0.
dict at: 'yahoo' put: 0.
dict at: 'msn' put: 0.
dict at: 'askjeeves' put: 0.
dict at: 'googleImage' put: 0.
"State carried from the previous entry, used to de-duplicate image hits."
last := nil.
lastWasImage := false.
scanned do: [:each |
	| referer |
	referer := each referer.
	('*www.google.*search*' match: referer)
		ifTrue: [dict at: 'google' put: ((dict at: 'google') + 1)].
	('*search.msn.com*' match: referer)
		ifTrue: [dict at: 'msn' put: ((dict at: 'msn') + 1)].
	('*web.ask.com*' match: referer)
		ifTrue: [dict at: 'askjeeves' put: ((dict at: 'askjeeves') + 1)].
	('*search.yahoo.com*' match: referer)
		ifTrue: [dict at: 'yahoo' put: ((dict at: 'yahoo') + 1)].
	"Count a Google Images referral only when its origin differs from the
	 previous entry and no image hit has been counted in the current run."
	('*http://images.google*' match: referer)
		ifTrue: [(each origin ~= last and: [lastWasImage not])
			ifTrue: [dict at: 'googleImage' put: ((dict at: 'googleImage') + 1).
				lastWasImage := true]]
		ifFalse: [lastWasImage := false].
	last := each origin].
"File this week's counts under its starting date."
logDict at: begin put: dict.
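
The four plain search engine checks in that script all have the same shape, so they could be driven from a table of patterns instead of being written out one by one. What follows is only a sketch of that idea, not the code I actually ran: the patterns are copied from the script, the sample referers are invented for illustration, and the Google Images case is left out because it needs the extra state.

```smalltalk
"Sketch only: patterns are from the script above; sampleReferers is invented."
| patterns counts sampleReferers |
patterns := Dictionary new.
patterns
	at: 'google' put: '*www.google.*search*';
	at: 'msn' put: '*search.msn.com*';
	at: 'askjeeves' put: '*web.ask.com*';
	at: 'yahoo' put: '*search.yahoo.com*'.

"Start every engine at zero."
counts := Dictionary new.
patterns keysDo: [:engine | counts at: engine put: 0].

"Hypothetical referers standing in for the scanned log entries."
sampleReferers := #(
	'http://www.google.com/search?q=smalltalk'
	'http://search.yahoo.com/search?p=smalltalk'
	'http://www.example.com/some/page').

"Bump the counter of every engine whose pattern matches the referer."
sampleReferers do: [:referer |
	patterns keysAndValuesDo: [:engine :pattern |
		(pattern match: referer)
			ifTrue: [counts at: engine put: (counts at: engine) + 1]]].
counts
```

Adding another engine would then be one more line in the table rather than another copy of the test.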

It's pretty simple-minded code, and it chews up a lot of memory on large log files - but it works, and it's pretty quick. I dumped the results to a file and imported them into Excel, where I made the mistake of using pivot charts. Ack! That wasn't what I wanted. I then found the Chart Wizard on the toolbar, and created the simple chart above.
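
For the curious, the dump step can be sketched roughly like this. It is an illustration rather than the exact code I used: the tab-delimited layout and the column order are assumptions, and the stand-in logDict holds one week of placeholder zeros, not real figures.

```smalltalk
"Sketch only. The counts below are placeholders, not figures from the real logs."
| engines week logDict stream |
engines := #('google' 'yahoo' 'msn' 'askjeeves' 'googleImage').

"Stand-in for the logDict built by the script: one week, all zeros."
week := Dictionary new.
engines do: [:engine | week at: engine put: 0].
logDict := Dictionary new.
logDict at: (Date readFrom: '5/8/05' readStream) put: week.

"Header row, then one tab-delimited row per week in date order."
stream := WriteStream on: String new.
stream nextPutAll: 'week'.
engines do: [:engine | stream tab; nextPutAll: engine].
stream cr.
logDict keys asSortedCollection do: [:begin |
	stream print: begin.
	engines do: [:engine | stream tab; print: ((logDict at: begin) at: engine)].
	stream cr].
stream contents
```

This builds the text in memory. To land it in a file, the in-memory stream would be swapped for a file stream (in VisualWorks, presumably something like a writeStream on a Filename). Tabs are used rather than commas because a printed date may itself contain a comma.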

Then, having no idea how to export a graphic from Excel (if you can, it wasn't obvious to me), I took a screenshot, trimmed the result in Paint, and then shrank it down with IrfanView for this post. Whew. All that just to post a simple chart :)
