Had the ChiPy meeting on Thursday. Aaron Lav
started out with a talk on Unicode and Chinese. One thing
that I hadn't realized about Unicode is that it needs to be
normalized. You can represent characters either as composed or
decomposed. E.g., é could be a single character or two characters
(the e and a combining accent). Of course this has a dramatic effect on
searching, string length, etc. From what Aaron said, the display
support for the decomposed form isn't very good, but he made use of it
for constructing pronunciation guides and then converted the result to the
composed form. The unicodedata module
has a normalize function to handle this.
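
To make the composed/decomposed difference concrete, here is a small sketch using unicodedata.normalize. The example strings are my own, not anything from Aaron's talk; "NFC" and "NFD" are the standard names for the composed and decomposed normalization forms.

```python
import unicodedata

# The same visible character, stored two different ways.
composed = "\u00e9"      # e-acute as one code point
decomposed = "e\u0301"   # plain e followed by a combining acute accent

# Without normalization the two strings compare unequal and differ in length.
same_before = (composed == decomposed)
lengths = (len(composed), len(decomposed))

# NFC produces the composed form, NFD the decomposed form.
to_composed = unicodedata.normalize("NFC", decomposed)
to_decomposed = unicodedata.normalize("NFD", composed)

print(same_before, lengths)
print(to_composed == composed, to_decomposed == decomposed)
```

Normalizing both sides to the same form before comparing or searching is what keeps the two representations from being treated as different strings.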
I'd also had the vague impression that Chinese was one character per
word. But of course there are too many words for that. And there are
no spaces, so it's not immediately clear where the word boundaries are
(at least to a computer, and certainly it is unclear to my eye). So the
middleproxy
application actually scans every three-character combination for
possible "words". That reminded me a great deal of this presentation.
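
I don't have the application's code in front of me, so what follows is only my guess at what that kind of scanning looks like: check every substring of up to three characters against a word list. The function name, the max_len parameter, and the toy lexicon are all hypothetical.

```python
def find_candidate_words(text, lexicon, max_len=3):
    """Return (start, word) pairs for every substring of 1..max_len
    characters that appears in the lexicon. Overlaps are kept."""
    found = []
    for start in range(len(text)):
        for length in range(1, max_len + 1):
            piece = text[start:start + length]
            # Skip slices that ran off the end of the text.
            if len(piece) == length and piece in lexicon:
                found.append((start, piece))
    return found


# Toy lexicon, purely for illustration (an assumption, not real data).
lexicon = {"中", "国", "人", "中国", "中国人"}
candidates = find_candidate_words("中国人", lexicon)
print(candidates)
```

Note that this only lists candidates; it makes no attempt to decide which of the overlapping matches are the "real" words.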
I gave a presentation on setuptools, mostly hoping
to introduce all the things you can do, and how to distribute
packages.
At the end Chris McAvoy
brought up the issue of the standard library. I had
kind of forgotten about it, but that's what got me to thinking about
versioned imports some time ago,
and I think Setuptools has an important place there.
So, the story goes like this: the standard library isn't advancing
very fast. Little of the neat new stuff in Python is in the standard
library, with a few small exceptions, and when neat new stuff is in
the standard library it's often not really helpful for a few years
anyway, since it's not in the standard library in old versions of
Python. And the standard library is stuck in a release cycle that is
really slow -- slow releases are okay for a core language (good even!)
but not for the software built on that language.
Setuptools doesn't improve the standard library. The standard
library has some advantages
over other libraries, but I think we need to figure out how to develop
outside of the standard library with those same advantages, and that's
what Setuptools (really the whole family of setuptools,
easy_install.py, Python Eggs, and pkg_resources) give us. Or at least
move us in the right direction. The Cheese Shop is also important, and the PEP
process can still be
applicable to libraries not in the standard library. For instance, if
Web-SIG creates libraries on
top of WSGI I think some
PEP-ish process is appropriate (giving some consensus and authority to
the library), even though the library can't reasonably be distributed
with the Python standard library.