Python Buzz Forum - August ChiPy (and the stdlib)

Had the ChiPy meeting on Thursday. Aaron Lav started out with a talk on Unicode and Chinese. One thing that I hadn't realized about Unicode is that it needs to be normalized. You can represent characters in either composed or decomposed form. E.g., é could be a single character or two characters (the e and the combining accent). Of course this has a dramatic effect on searching, string length, etc. From what Aaron said, the display support for the decomposed form isn't very good, but he made use of it for constructing pronunciation guides and then converted the result to the composed form. The unicodedata module has a normalize function to handle this.
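To make the composed/decomposed distinction concrete, here is a minimal sketch using unicodedata.normalize. It is written for a current Python 3 interpreter (an assumption; the talk predates it), and the variable names are only illustrative.

```python
import unicodedata

# The same visible character, built two different ways.
composed = "\u00e9"      # a single code point: e with acute accent
decomposed = "e\u0301"   # two code points: "e" plus a combining acute accent

# Without normalization, the two strings differ in equality and length.
same_before = composed == decomposed
lengths = (len(composed), len(decomposed))

# NFC composes where it can; NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

print(same_before, lengths)
print(nfc == composed, nfd == decomposed)
```

Normalizing both sides to the same form before comparing or searching avoids the mismatch.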

I'd also had the vague impression that Chinese was one character per word. But of course there are too many words for that. And there are no spaces, so it's not immediately clear where the word boundaries are (at least to a computer, and certainly it is unclear to my eye). So the middleproxy application actually scans every three-character combination for possible "words". That reminded me a great deal of this presentation.
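I don't know how middleproxy actually does the scan, so the following is only a rough sketch of one way to read it: slide over the text and look up every run of up to three characters in a word list. The function name candidate_words, the max_len parameter, and the tiny lexicon are all hypothetical and chosen just for illustration.

```python
def candidate_words(text, lexicon, max_len=3):
    """Return (start, chunk) pairs for every chunk of up to max_len
    characters that appears in the lexicon."""
    found = []
    for start in range(len(text)):
        for length in range(1, max_len + 1):
            chunk = text[start:start + length]
            # Skip windows that run off the end of the text.
            if len(chunk) == length and chunk in lexicon:
                found.append((start, chunk))
    return found


# Hypothetical word list; a real application would load a full dictionary.
lexicon = {"中国", "中国人", "人", "人民"}
matches = candidate_words("中国人民", lexicon)
print(matches)
```

Note that the matches overlap, so the scan only proposes possible words; choosing among them is a separate problem.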

I gave a presentation on setuptools, mostly hoping to introduce all the things you can do, and how to distribute packages.

At the end Chris McAvoy brought up the issue of the standard library. I had kind of forgotten about it, but that's what got me thinking about versioned imports some time ago, and I think Setuptools has an important place there.

So, the story goes like this: the standard library isn't advancing very fast. Little of the neat new stuff in Python is in the standard library, with a few small exceptions, and when neat new stuff is in the standard library it's often not really helpful for a few years anyway, since it's not in the standard library in old versions of Python. And the standard library is stuck in a release cycle that is really slow -- slow releases are okay for a core language (good even!) but not for the software built on that language.

Setuptools doesn't improve the standard library. The standard library has some advantages over other libraries, but I think we need to figure out how to develop outside of the standard library with those same advantages, and that's what Setuptools (really the whole family of setuptools, easy_install.py, Python Eggs, and pkg_resources) gives us. Or at least it moves us in the right direction. The Cheese Shop is also important, and the PEP process can still be applicable to libraries not in the standard library. For instance, if Web-SIG creates libraries on top of WSGI, I think some PEP-ish process is appropriate (giving some consensus and authority to the library), even though the library can't reasonably be distributed with the Python standard library.

