This post originated from an RSS feed registered with Agile Buzz
by James Robertson.
Original Post: ClassyLocales
Feed Title: Travis Griggs - Blog
Feed URL: http://www.cincomsmalltalk.com/rssBlog/travis-rss.xml
Feed Description: This TAG Line is Extra
In the past, I've followed a practice of going off and making a tool, getting it close to ready, and then announcing it and talking about it. I'm going to try it the other way this time. ClassyLocales is, at this point, very much a work in progress.
I don't like the current VisualWorks Locale mechanism. I've seen too many errors made with it. It's not easy to figure out what you really have to do and what you don't. And I get so tired of reinventing the data in Smalltalk. There's a project hosted by the Unicode Consortium called the Common Locale Data Repository (CLDR). It aims to be a cross-platform XML definition of Locale-specific data and formats.
I've also been impressed over the years with how much I liked the switch to class-based exceptions. The same has been true of Announcements, especially with the presence of namespaces to guard against too many class names in one big bucket.
So, I've got this crazy idea to express each Locale defined by the CLDR as a unique class with appropriate class side behavior. Indeed, the CLDR has an inheritance mechanism of its own which is easily mapped into the Smalltalk class model.
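To make the idea concrete, here is a minimal sketch of locales as classes, written in Python rather than Smalltalk so it can be run standalone. The class names, attribute names, and values are illustrative assumptions, not the actual ClassyLocales code or real CLDR data; class attributes and class methods stand in for Smalltalk class-side behavior.

```python
# Hypothetical sketch: each locale is a class, and lookup of locale data
# falls back through the class hierarchy the way CLDR inheritance does.
# All names and values below are illustrative, not taken from ClassyLocales.

class Root:
    """Stand-in for the CLDR root locale."""
    decimal_separator = "."
    grouping_separator = ","
    paper_size = "A4"

    @classmethod
    def describe(cls):
        # Class-side behavior: no instance is ever created.
        return (f"{cls.__name__}: decimal={cls.decimal_separator!r} "
                f"paper={cls.paper_size}")


class En(Root):
    """Inherits everything from Root."""


class EnUS(En):
    """Overrides only what differs from its parent."""
    paper_size = "US-Letter"


class De(Root):
    decimal_separator = ","
    grouping_separator = "."


if __name__ == "__main__":
    for locale in (Root, En, EnUS, De):
        print(locale.describe())
```

A locale class only has to define what differs from its parent, and everything else is found by ordinary inheritance lookup.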
So far, I've created a CLDRImport package which takes input from the CLDR XML files and creates classes and methods for the Locales. All of this computed behavior goes in ClassyLocales. Right now, the classes can return paperSize and all kinds of number formats and symbols. Plus, each can answer its own name in its own language.
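As a rough illustration of what such an import step might look like, the sketch below generates classes from XML, again in Python. The XML fragments are heavily simplified stand-ins for real CLDR files, and the helper names (`parent_name`, `build_locale_classes`) are hypothetical. The parent is found by simple truncation of the locale name (`de_AT` to `de` to `root`), which is an assumption that ignores any special cases in the real data.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragments shaped loosely like CLDR locale files.
SAMPLE_SOURCES = {
    "root": "<ldml><numbers><symbols>"
            "<decimal>.</decimal><group>,</group>"
            "</symbols></numbers></ldml>",
    "de": "<ldml><numbers><symbols>"
          "<decimal>,</decimal><group>.</group>"
          "</symbols></numbers></ldml>",
    "de_AT": "<ldml><numbers><symbols/></numbers></ldml>",
}


def parent_name(name):
    """Return the parent locale name by truncation, or None for root."""
    if name == "root":
        return None
    return name.rsplit("_", 1)[0] if "_" in name else "root"


def build_locale_classes(sources):
    """Create one class per locale, subclassing its parent locale's class."""
    classes = {}

    def build(name):
        if name in classes:
            return classes[name]
        parent = parent_name(name)
        base = build(parent) if parent else object
        tree = ET.fromstring(sources[name])
        # Every symbol element becomes a class attribute; anything the
        # file leaves out is inherited from the parent class.
        attrs = {el.tag: el.text
                 for el in tree.findall("./numbers/symbols/*")}
        classes[name] = type(name, (base,), attrs)
        return classes[name]

    for name in sources:
        build(name)
    return classes


if __name__ == "__main__":
    locales = build_locale_classes(SAMPLE_SOURCES)
    print(locales["de_AT"].decimal, locales["de_AT"].group)
```

Here `de_AT` defines no symbols of its own, so its generated class picks up the German separators from `de` through inheritance.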
I want to turn the number patterns into actual methods which describe the pattern. This should remove a level of indirection and make number formatting go faster. The next task is to get collation working. Anyway, it'll be a fun experiment and we'll see how it goes as it goes.
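The pattern-to-method idea can be sketched as parsing the pattern once and returning a function that does the formatting, so the pattern string is never reinterpreted per call. This Python sketch handles only a small assumed subset of CLDR-style patterns (`#`, `0`, one grouping size, fixed fraction digits); `compile_pattern` is a hypothetical name, and rounding simply follows Python's float formatting.

```python
def compile_pattern(pattern, decimal=".", group=","):
    """Turn a pattern such as '#,##0.00' into a formatting function.

    Only a simplified subset is handled: the number of '0's after the
    '.' fixes the fraction digits, and the distance from the last ','
    to the end of the integer part fixes the grouping size.
    """
    integer_part, _, fraction_part = pattern.partition(".")
    fraction_digits = fraction_part.count("0")
    if "," in integer_part:
        group_size = len(integer_part) - integer_part.rfind(",") - 1
    else:
        group_size = 0

    def format_number(value):
        sign = "-" if value < 0 else ""
        text = f"{abs(value):.{fraction_digits}f}"
        whole, _, frac = text.partition(".")
        if group_size:
            # Split the integer digits into groups from the right.
            chunks = []
            while len(whole) > group_size:
                chunks.insert(0, whole[-group_size:])
                whole = whole[:-group_size]
            chunks.insert(0, whole)
            whole = group.join(chunks)
        return sign + whole + (decimal + frac if frac else "")

    return format_number


if __name__ == "__main__":
    us = compile_pattern("#,##0.00")
    de = compile_pattern("#,##0.00", decimal=",", group=".")
    print(us(1234567.891))
    print(de(1234.5))
```

All the pattern analysis happens in `compile_pattern`; the returned function only does the digit work, which is the indirection the generated methods would remove.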
My biggest hope is to somehow discover that charset data is embedded in, or can be inferred from, these files. That would be very handy indeed for reworking font lookup.