Weblogs Forum - Notes from PyCon DC


Pre-conference Sprints

Before the conference proper, we had a two-day coding sprint. What's a coding sprint? It's an event loosely inspired by extreme programming, where a number of developers get together for a few days of intense pair programming on a common project. The first time I heard of sprints was from Zope Corporation's Jim Fulton, who's been using sprints successfully for the Zope 3 project; maybe he invented the word. Coding sprint certainly sounds better than coding marathon!

On this occasion, there were at least three separate groups sprinting in the same space (a classroom at George Washington University which we were renting for the conference): Jim Fulton was leading a Zope 3 sprint, there were a number of Twisted programmers sprinting on Twisted projects, and in the back of the room we were having a core Python sprint.

The Python sprint quickly separated into three groups of two to three coders each: one group, led by Jeremy Hylton, worked on the new bytecode compiler which is being developed in a CVS branch; I was leading the two other groups, which focused on Python speedups. (There were many other ideas for tasks to sprint on, but not enough time.)

The first speedup plan, proposed by Ka-Ping Yee and implemented by him and Aahz (that's really his whole name!), was a scheme to cache the lookup of object attributes. Python has extremely dynamic rules for looking up attributes: an instance attribute is first searched in the instance dictionary, for instance variables, then in the class, for methods and class variables, and finally in the successive base classes, for inherited methods and class variables. In other languages, this lookup is usually done at compile time, but Python does it at run-time, using a very efficient dictionary (hash table) implementation. Nevertheless, finding a method defined in the third base class costs three failing lookups and one successful one. We've got to be able to do better, and this is a very common operation in Python, so a speedup here might cause a measurable speedup for all Python programs.
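To make the lookup order concrete, here is a minimal sketch written for current Python 3 rather than the 2.x releases discussed here. The classes and the helper find_attribute are hypothetical names chosen for illustration, and the helper deliberately ignores descriptors, metaclasses, and other refinements of the real lookup rules.

```python
class Base3:
    def greet(self):
        return "hello from Base3"


class Base2(Base3):
    pass


class Base1(Base2):
    pass


class Widget(Base1):
    pass


def find_attribute(obj, name):
    """Return (owner, value) for the first dictionary that defines name.

    Hypothetical helper: it models only the order described above.
    """
    # 1. The instance dictionary (instance variables).
    if name in vars(obj):
        return obj, vars(obj)[name]
    # 2. The class, then each base class in turn; each probe that
    #    misses is one failing dictionary lookup.
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return klass, vars(klass)[name]
    raise AttributeError(name)


w = Widget()
owner, value = find_attribute(w, "greet")   # owner is Base3
```

Assigning w.greet = "shadow" afterwards would make the same call return the instance itself as the owner, since the instance dictionary is searched first.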

Ping's plan was to cache the dictionary where the lookup was successful, thereby reducing the number of lookups to exactly two: one in the cache, and another one in the dictionary indicated by the cache. This seems an obvious optimization, but wasn't done earlier because there are situations where the cache must be invalidated because one of the base classes is modified. Part of the project was to do the invalidation right, and this could only be done with new infrastructure added in Python 2.2.
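The idea can be modelled in a few lines of pure Python. This is only a toy sketch under stated assumptions, not the code from the sprint: LookupCache, its methods, and the classes A and B are hypothetical names, and invalidation is triggered by hand here, whereas the real scheme had to hook into class modification.

```python
class LookupCache:
    """Toy model: remember which class dictionary defines an attribute."""

    def __init__(self, klass):
        self.klass = klass
        self.cache = {}  # attribute name -> class whose dict defines it

    def lookup(self, name):
        owner = self.cache.get(name)      # first lookup: the cache
        if owner is None:
            # Slow path: walk the class and its bases once and
            # remember where the name was found.
            for candidate in self.klass.__mro__:
                if name in vars(candidate):
                    owner = candidate
                    break
            else:
                raise AttributeError(name)
            self.cache[name] = owner
        return vars(owner)[name]          # second lookup: the cached dict

    def invalidate(self):
        """Must run whenever the class or one of its bases is modified."""
        self.cache.clear()


class A:
    def f(self):
        return "A"


class B(A):
    pass


cache = LookupCache(B)
first = cache.lookup("f")        # found in A and cached
B.f = lambda self: "B"           # modifying the class makes the cache stale
stale = cache.lookup("f")        # still A's function: the wrong answer
cache.invalidate()
fresh = cache.lookup("f")        # now B's function
```

The stale result in the middle is the reason the invalidation had to be done right before the cache could be trusted.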

We implemented the whole scheme successfully, but in the end ran into a snag: there were some common cases where the old scheme did only one lookup, and there the new scheme was slower than the old scheme! We tried various refinements, but in the end we didn't shave off enough to call it an overall win. The code is checked in on a CVS branch, though, and I'm sure we'll be getting back to it later.

The other speedup team, consisting of Thomas Wouters and Brett Cannon, was tackling the issue of speeding up method calls. When Python encounters an expression of the form x.meth(args), the bytecode compiler first spits out code to construct a temporary object representing the "bound method" x.meth, after which it produces code to load the argument list and call the bound method. These are very powerful semantics: a bound method can also be stored in a variable or data structure, and can be used as a callback. Other languages call this "closures". Python unifies closures, plain functions, and a few other things, including class constructors, as "callables". But method calls are very common, and the overhead of creating the bound method object which is thrown away immediately after the call is quite measurable.
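A short sketch of these semantics, written for current Python 3. The class Greeter and the variable names are hypothetical, chosen only for illustration.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, punctuation):
        return "Hello, " + self.name + punctuation


g = Greeter("PyCon")

# Evaluating g.greet builds a bound method object that remembers g.
bound = g.greet

# Bound methods, plain functions, and classes are all callables,
# so they can sit side by side in a data structure.
callbacks = [bound, len, Greeter]

# Calling the stored bound method is the same as calling g.greet("!").
result = bound("!")

# The bound method wraps the instance and the underlying function.
remembers_instance = bound.__self__ is g
wraps_function = bound.__func__ is Greeter.greet
```

In a plain call such as g.greet("!"), the same kind of temporary object is built and then discarded immediately, which is the overhead described above.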

So Thomas and Brett set out to introduce a new opcode which implements the method call operation without creating the intermediate bound method object. There were numerous challenges on the way to success, such as how to recognize this exact situation in the parser, and how to implement an opcode taking three arguments when the bytecode interpreter only supports opcodes with zero or one argument.

But the real challenge was how to quickly decide at run-time whether this was in fact a method call or not: syntactically, instance.method(args) looks the same as module.function(args), and the bytecode compiler doesn't know the type of x in x.attr(args), so it will generate the new opcode for all expressions of this form, regardless of whether x is a class instance. Therefore, the opcode has to deal correctly with method calls as well as with all other kinds of calls. Fortunately, the slight overhead of the required generality is offset by the need to decode only one opcode instead of two, and in the end we measured a decent speedup (in the order of 5% for a certain benchmark, if I recall correctly).
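The run-time decision can be modelled at the Python level. This is a hypothetical sketch under stated assumptions, not the opcode from the sprint: call_attr and Counter are invented names, the code is written for current Python 3, and it ignores descriptors other than plain functions as well as custom __getattribute__ hooks.

```python
import math
import types


def call_attr(obj, name, *args):
    """Hypothetical model of a combined 'look up and call' step."""
    instance_dict = getattr(obj, "__dict__", {})
    if name not in instance_dict:
        for base in type(obj).__mro__:
            if name in vars(base):
                candidate = vars(base)[name]
                if isinstance(candidate, types.FunctionType):
                    # Fast path: an ordinary method. Pass obj as the
                    # first argument; no bound method object is built.
                    return candidate(obj, *args)
                break
    # General path: module.function(args), callables stored on the
    # instance, and everything else.
    return getattr(obj, name)(*args)


class Counter:
    def __init__(self):
        self.n = 0

    def add(self, k):
        self.n += k
        return self.n


c = Counter()
method_result = call_attr(c, "add", 5)          # fast path
function_result = call_attr(math, "sqrt", 9.0)  # general path
```

Both kinds of call go through the same entry point, which mirrors the requirement that a single opcode handle method calls and all other calls correctly.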

Despite this clear success, we didn't check the code in yet. There are really two cases that need to be sped up: classic classes and new-style classes (the new class implementation introduced in Python 2.2, which will coexist with the original class implementation until Python 3.0 is released). Thomas and Brett only had time to implement their code for classic classes. The code was parked on the SourceForge patch manager until someone has time to complete it.

The Conference Proper

Since my sprint diary ended up much longer than planned, I'll have to write down more extensive conference notes later. For now, some highlights:

Paul Graham's keynote on the "100-year language" was entertaining, although I wished he'd taken the time to say a bit more about Python, like last year's keynote speaker (Andrew Koenig of C++ fame, who's become quite the Python evangelist). It's been reviewed already in Ziggy's blog. Ziggy is a Perl developer, but (a) he's got an open mind, and (b) he's an experienced conference organizer who helped us find this excellent venue.

An excellent idea was "open space", suggested and organized by Bob Payne. This is not quite the same as BOFs, although it is somewhat similar. The venue really helped, by accidentally setting up the grand ballroom with circular dinner tables for the lunch arrangement. Many smaller groups could have discussions or small presentations in parallel that way.

The Python Software Foundation (PSF, also financially responsible for the conference) had its annual member meeting. This was a great success; more than half of all members were present in person, and half of the others had sent in their proxy form for the various votes. There was lively discussion, and after the meeting we all went out for dinner.

Looking at the schedule, I realize that I hardly went to any of the scheduled presentations! I spent almost all my time talking to various people about their Python issues, having my picture taken with attendees, and in an audio interview with Bruce Eckel. Well, so it goes.

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 8, 2003 9:46 PM

I have kept my distance from python for some time. The reason is basically that I dislike its use of spacing for indentation, instead of the more conventional bracketing with keywords or tokens.

My reasoning is pretty petty. My opinion though is that I'd have a hard time switching back and forth and that I'd always insert bracketing stuff when I didn't need it in python. But also, I already have several interpretive languages and JIT'd languages at my disposal.

Two questions:

1. What makes you feel bracketing is unnecessary (or why
does python just use indentation)?
2. What would be your idea of the driving reason why
Python would provide a better solution to small
problems than some of the conventional UNIX tools such
as awk, sed and cut in a shell script. And, what about
large applications. How does python scale to really
huge applications that might need to run in highly
available environments?

Okay and perhaps a 3rd...

3. What's going to make python keep going at its amazing
pace of adoption?

Bill Venners

Posts: 2284
Nickname: bv
Registered: Jan, 2002

Re: Notes from PyCon DC

Posted: Apr 8, 2003 11:49 PM

> 2. What would be your idea of the driving reason why
> Python would provide a better solution to small
> problems than some of the conventional UNIX tools such
> as awk, sed and cut in a shell script. And, what about
> large applications. How does python scale to really
> huge applications that might need to run in highly
> available environments?
>
Typically, I use Python for small tasks, Java for large ones. I haven't tried using Python for a large task, but I did ask Guido a question similar to yours in this interview:

http://www.artima.com/intv/strongweak.html

Guido van Rossum

Posts: 359
Nickname: guido
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 9, 2003 6:42 AM

> 1. What makes you feel bracketing is unnecessary (or why
> does python just use indentation)?

Python is about readability for humans. When skimming code I usually rely on the indentation, not on the braces; making the indentation define the grouping prevents overlooking grouping bugs like


if (condition)
    a = 12;
    b = 42;
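For comparison, a minimal Python sketch of the same two assignments. The function name assign and the starting values are hypothetical. Here the indentation is the grouping, so what the reader sees is what runs.

```python
def assign(condition):
    a = b = 0
    if condition:
        a = 12
        b = 42  # indented like the line above, so it belongs to the if
    return a, b


when_false = assign(False)  # neither assignment runs
when_true = assign(True)    # both assignments run
```

In the C-style fragment above, by contrast, b = 42 executes regardless of the condition, even though the layout suggests otherwise.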

It also prevents holy wars on the one right bracing style. Finally, there's this quote from Don Knuth:
"We will perhaps eventually be writing only small modules which are identified by name as they are used to build larger ones, so that devices like indentation, rather than delimiters, might become feasible for expressing local structure in the source language."

--Donald E. Knuth, "Structured Programming with goto Statements", Computing Surveys, Vol 6 No 4, Dec. 1974

> 2. What would be your idea of the driving reason why
> Python would provide a better solution to small
> problems than some of the conventional UNIX tools
> such as awk, sed and cut in a shell script.

Uniformity rather than a mishmash of tools each with their different syntax, limitations, and conventions. A Python program is more maintainable than a solution built out of many little pieces.
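As a rough illustration of the kind of small task in question, here is a sketch that pulls one field out of delimited lines and sums it, work that a shell script might split between cut and awk. The function sum_field and the sample data are hypothetical, and the code is written for current Python 3.

```python
def sum_field(lines, field_index, delimiter=":"):
    """Sum the integer found in one delimited field of each line."""
    total = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split(delimiter)
        total += int(fields[field_index])
    return total


sample = ["# name:size", "alpha:10", "beta:32", ""]
total = sum_field(sample, 1)
```

Splitting, filtering, and arithmetic all live in one language here, with one set of quoting and error-handling rules.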

> And, what about
> large applications. How does python scale to really
> huge applications that might need to run in highly
> available environments?

Have you heard of Zope? It's a content management solution for large websites, and all written in Python. Works very well!

> Okay and perhaps a 3rd...
>
> 3. What's going to make python keep going at its amazing
> pace of adoption?

It gets better all the time.

Bill Venners

Posts: 2284
Nickname: bv
Registered: Jan, 2002

Re: Notes from PyCon DC

Posted: Apr 9, 2003 11:05 AM

> It also prevents holy wars on the one right bracing style.
>
Two years ago Matt Gerrans and I were sitting at our computers starting a Python project we were going to work on together. Matt and I had previously argued about where to put the open curly brace in Java code. We both agreed that it was nice to not have to argue about where to put that open curly brace in Python. We then spent the next half hour arguing about whether to indent 3 or 4 spaces in shared Python code.

Guido van Rossum

Posts: 359
Nickname: guido
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 9, 2003 11:17 AM

If you find yourself arguing about indenting with 3 or 4 spaces, have a look here: http://compsoc.dur.ac.uk/whitespace/

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 9, 2003 2:03 PM

From day one of writing C code that someone else had to edit, I have just used tabs. I used the vi(1) editor then, and could just use ":set ts=4 sw=4" and not have to worry about tabs/spacing. I just used tabs. Now, that is the driving factor over whether an editor is acceptable to me. If I can control tab expansion, and if it doesn't have a line shifting function, it's just not usable to me.

I'd highly recommend pushing the tab key once instead of the space bar 4 times to anyone who wants to work faster :-)

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 9, 2003 2:04 PM

Oops, I should have said, If I can't control tab expansion, or I can't shift lines by the indentation level indicated by tab expansion, the editor is not acceptable to me.

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 9, 2003 2:30 PM

> > 2. What would be your idea of the driving reason why
> > Python would provide a better solution to small
> > problems than some of the conventional UNIX tools
> > such as awk, sed and cut in a shell script.
>
> Uniformity rather than a mishmash of tools each with their
> different syntax, limitations, and conventions. A Python program
> is more maintainable than a solution built out of many
> little pieces.

I guess I consider the shell pipeline syntax to be similar in nature to expressions in any language. The arguments to the commands are pretty much similar to the signatures on functions/methods. Once you learn them, it is easy to regurgitate them at will :-)

More than one person, including myself, has been guilty of doing something different, just because the method presented seemed unfamiliar or ineffective.

The Shell has been around for going on 30 years now, and nothing has really changed about the basics. KSH, CSH, TCSH, BASH etc have all tried to do it differently, or better. KSH added functions and $(( )) and $() and other notations so that `expr args` does not have to be used so often to fork a process to evaluate an expression. CSH's initial difference was history references. There are lots of other little added things for interactive use.

But, the power of the shell is really amazing when everything is a string. Since the shell was initially designed to be used in the text formatting environment that the initial PDP UNIX was structured around, that made a lot of sense. As times change, and a wide range of different problems are being solved by scripting, new, less fragile tools are necessary (see my blog about my language I did http://www.artima.com/weblogs/viewpost.jsp?thread=4350).

I think that it is interesting that so many languages that do many similar things keep falling out of the woodwork, so to speak. Objects have become very popular, and dynamic attributes (late binding) of languages provide some very powerful tools.

Matt Gerrans

Posts: 1153
Nickname: matt
Registered: Feb, 2002

Re: Notes from PyCon DC

Posted: Apr 11, 2003 2:40 PM

One big problem with "the shell" is that it is not "the" shell. There are many shells. Python works as well on Windows as it does on Unix or Linux. The Windows shell, however, is much inferior (or at least significantly different) in this respect and using something like cygwin is impractical and clunky.

Matt Gerrans

Posts: 1153
Nickname: matt
Registered: Feb, 2002

Re: Notes from PyCon DC

Posted: Apr 11, 2003 3:27 PM

It's funny. I just can't understand why some people are aghast that Python doesn't use opening and closing braces for scopes. When I first saw this I thought it was brilliant.

I have always thought that the computer should do as much of the busywork as possible. I have always hated it when the compiler says something stupid like "missing ; on line 44" or some cryptic message on line 300 that is a result of a missing } on line 66. Usually when I get the first one, I mumble to myself, "well then put one there, you idiotic compiler!" Beginners are constantly befuddled about where to put semicolons and braces and where not to. This was particularly bad with Pascal and C only improved it a little.

I have seen enough horribly formatted C and C++ code to know that if it is meaningless to the compiler then it will be meaningless to some people. If indentation should be used to show intent, why not have the compiler use it for its intended purpose? And why force the humans to type in superfluous symbols, when the compiler can be smarter? The only arguments I can see for curlies instead of indentation are a) It is easier for the compiler writer to parse, and b) to provide a way for people to write poorly indented code.

By the way, I despise tab characters for a number of reasons, but modern editors let you hit the tab key to insert a tab or a (configurable) number of spaces, so that is kind of a moot point, on the entry side at least.

Kirby Urner

Posts: 2
Nickname: kirby
Registered: Apr, 2003

Re: Notes from PyCon DC

Posted: Apr 16, 2003 6:01 PM

I think the argument for using a unified "Swiss Army Chainsaw" [TM -- Perl] over a combination of tiny utilities and pipe joiners, is well advanced by the Perl community. Then the logical next question is why might one prefer Python over Perl for similar tasks.

In many ways it's a generational thing: if you grew up on the UNIX command line and became productive with sed and awk etc. in the context of shell scripts, or if you learned Perl really well, then you're set. You know how to get the job done and you feel productive.

But if you're new to the game, as many are (thanks to their being born in the 1980s, for example), then it's more a question of finding a curriculum that'll get you where you want to be with a minimum of fuss. Sitting down with a sed and awk book just may not be where you want to start.

And yes, the *NIX platform isn't the only one worth targeting. Python neatly accommodates differing file path delimiters in the os module, for example. os.sep is '/' if you're on a posix system but '\' if you're on Win32. So if you're writing some simple configurator that you'd like to use across platforms, you have this layer to build on -- makes code more readable, easier to maintain. os.name tells you which platform you're running on.
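A minimal sketch of building on that layer. The helper config_path and the path components are hypothetical names chosen for illustration.

```python
import os


def config_path(*parts):
    """Join path components using the current platform's separator."""
    return os.path.join(*parts)


path = config_path("etc", "myapp", "settings.ini")

# os.sep is the separator that os.path.join inserted between the parts.
uses_native_separator = os.sep in path

# os.name identifies the platform family, e.g. 'posix' or 'nt'.
platform_family = os.name
```

Because the separator never appears as a literal in the script, the same code can run unchanged on either kind of system.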

As for using indentation for code blocks, that forces a readable and consistent style (think of it as compile-time style checking) and reduces the number of tokens in the syntax i.e. it's clean. An alternative to braces used in many languages is a lot more keywords, like if/endif, do/enddo -- but that just clutters up the vocabulary (likewise the shell practice of case/esac if/fi and so on).

Vincent O'Sullivan

Posts: 724
Nickname: vincent
Registered: Nov, 2002

Re: Notes from PyCon DC

Posted: Apr 23, 2003 12:14 AM

> If you find yourself arguing about indenting with 3 or 4
> spaces, have a look here:
> http://compsoc.dur.ac.uk/whitespace/

It looks very interesting. I looked for a whitespace interpreter in Python but - unfortunately - I drew a complete blank.

Vince.

Flat View: This topic has 12 replies on 1 page
