When I started the blog server, I had a pretty simple directory structure for posts:
- One file per day, with that day's posts stored as a collection of post objects in the file
- All files stored in a per-blog directory
Well, I've been blogging since 2002, so I had managed to accumulate well over 1000 files in that directory. I decided that was kind of ugly, so I made two small code changes and switched the storage scheme: subdirectories named by year, with each file stored in the appropriate subdirectory (there's a sketch of the before and after following the list below). That part was simple - the storage code was nicely factored. The harder part was this:
- I had to get all the blog posts moved to the proper subdirectories
- I had to fix up the keyword cache
- I had to fix up the category cache
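To make the first item concrete, here's roughly what the change looks like on disk. The file names below are schematic - the real ones just match vw*.blg, with the year's last digit right before the extension:

Old layout:
	<blogFilesDir>/vw...-2.blg
	<blogFilesDir>/vw...-3.blg
	<blogFilesDir>/vw...-4.blg
	<blogFilesDir>/vw...-5.blg

New layout:
	<blogFilesDir>/2002/vw...-2.blg
	<blogFilesDir>/2003/vw...-3.blg
	<blogFilesDir>/2004/vw...-4.blg
	<blogFilesDir>/2005/vw...-5.blg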
The second and third items are an important optimization - I cache the file names for those searches, so they had to be updated in place. Most people would have shut down the server, scripted the changes, and then restarted. But hey - this is a dynamic system, not some static one that requires such things - so I just loaded the code change and then ran a few scripts. The first was a really stupid brute-force script to migrate the files to the right directories:
"Conversion to new dir structure"
| blogs |
Transcript cr.
Transcript show: 'Moving Files...'; cr.
blogs := (Blog.BlogSaver default keys reject: [:each | each = 'master']) asOrderedCollection. "every blog except the 'master' entry"
blogs do: [:eachBlogName | | blog dir files |
	Transcript show: 'Moving: {', eachBlogName, '}'; cr.
	blog := Blog.BlogSaver named: eachBlogName.
	blog getDefaultSettings.
	dir := blog settings blogFilesDir asFilename.
	"all the post files sitting at the top level of the blog's directory"
	files := dir filesMatching: 'vw*.blg'.
	files do: [:each |
		"the digit before the extension is the year's last digit: -2 is 2002, and so on"
		('*-2.blg' match: each)
			ifTrue: [| to end |
				to := dir construct: '2002'.
				to exists ifFalse: [to makeDirectory].
				end := each asFilename tail.
				to := to construct: end.
				each asFilename moveTo: to].
		('*-3.blg' match: each)
			ifTrue: [| to end |
				to := dir construct: '2003'.
				to exists ifFalse: [to makeDirectory].
				end := each asFilename tail.
				to := to construct: end.
				each asFilename moveTo: to].
		('*-4.blg' match: each)
			ifTrue: [| to end |
				to := dir construct: '2004'.
				to exists ifFalse: [to makeDirectory].
				end := each asFilename tail.
				to := to construct: end.
				each asFilename moveTo: to].
		('*-5.blg' match: each)
			ifTrue: [| to end |
				to := dir construct: '2005'.
				to exists ifFalse: [to makeDirectory].
				end := each asFilename tail.
				to := to construct: end.
				each asFilename moveTo: to]]].
Yes, that's really simplistic code. But it's a one-off piece of work, so I was ok with it. If I'd cared, the four copy-and-paste branches could have collapsed into a single loop over the years - here's an untested sketch that could stand in for the whole files do: [...] section, given the same dir and files temporaries:
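#(2002 2003 2004 2005) do: [:year | | pattern to |
	"build the wildcard from the year's last digit, e.g. '*-2.blg' for 2002"
	pattern := '*-', (year \\ 10) printString, '.blg'.
	to := dir construct: year printString.
	(files select: [:fileName | pattern match: fileName]) do: [:fileName |
		to exists ifFalse: [to makeDirectory].
		fileName asFilename moveTo: (to construct: fileName asFilename tail)]].
Anyway - then I had to migrate the category caches: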
"caches all point to old directory structure, so we must reset"
Transcript show: 'Resetting category caches...'; cr.
blogs := (Blog.BlogSaver default keys reject: [:each | each = 'master']) asOrderedCollection.
blogs do: [:each | | blog |
	Transcript show: 'Resetting: {', each, '}'; cr.
	blog := Blog.BlogSaver named: each.
	"reset this blog's category search cache"
	blog cache setupSearchCategoryCache].
That little snippet runs through the blogs and resets all the category caches - which fixed up all the category search links. Finally, I had to do a similar thing for the keyword searches. Those are expensive to rebuild though - I create new cache objects only on demand, and I didn't want to regenerate everything from scratch. So I just ran through and patched the file paths:
"get the keyword caches reset"
Transcript show: 'Resetting keyword caches...'; cr.
blogs := (Blog.BlogSaver default keys reject: [:each | each = 'master']) asOrderedCollection.
blogs do: [:each | | saver all |
	Transcript show: 'Resetting: {', each, '}'; cr.
	saver := Blog.BlogSaver named: each.
	all := saver cache keywordFileCache.
	"keyword -> entries; patch each entry's cached file name in place"
	all keysAndValuesDo: [:key :values |
		values do: [:eachEntry | | newName fname tail date |
			fname := eachEntry matchFilename.
			tail := fname asFilename tail.
			"recompute the path under the new year-subdirectory scheme"
			date := saver rawDateFromFileName: tail.
			newName := saver fileNameFor: date.
			eachEntry matchFilename: newName]]].
That just runs through and patches the file paths in the existing cache objects. If you want to spot-check the result, a workspace snippet along these lines - same accessors as the script above, with any real blog name substituted in - prints each keyword next to its patched path:
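"spot-check the patched keyword cache"
| saver |
saver := Blog.BlogSaver named: 'someBlog'. "substitute a real blog name"
saver cache keywordFileCache keysAndValuesDo: [:key :values |
	values do: [:entry |
		Transcript show: key asString, ' -> ', entry matchFilename asString; cr]].
And that's it - the whole set of scripts took very little time to run, and there was only a smidgen of downtime: the interval while they were running. I didn't need to go through the bother of taking the server down and restarting it - I just updated it on the fly. Pretty cool :)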