In a conversation with Luciano Ramalho (president of the Brazilian Python group) while I was in Brazil, he made me realize that it isn't self everywhere that has been bugging me; it's self in the argument list, which I think could actually be called un-pythonic.
Here's some simple Python code showing the use of classes:
def f(): pass
a = 1
class C1(object):
    a = 2
    def m1(self):
        print a # Prints '1'
        print self.a # Prints '2'
        f() # The global version
        self.m2() # Must scope other members
    def m2(self): pass
obj = C1()
obj.m1()
First, you see f() and the global a, so we have something to call at the global scope. The class C1 is defined by inheriting from object, which is the standard procedure for defining a new class (I think this might become implicit in Python 3).
Note that both m1() and m2() have a first argument of self. In Python, self is not a keyword, but the name "self" is conventionally used to represent the address of the current object. The address of the object is always the first argument.
The a that is defined at class scope represents one way to create object fields, but you can also just assign to self.a within a method, and the first time this happens the storage will be created for that field. However, the two versions of a must now be differentiated. If you just say a within a method, you'll get the global version, but self.a produces the object field (you can also assign to global variables from within classes, but I'll skip that for the current discussion).
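A minimal sketch of that "storage is created on first assignment" behavior may help. The class name Counter and the method bump() are hypothetical, invented for this illustration, and the print calls are written with parentheses so the snippet also runs on Python 3:

```python
class Counter(object):  # hypothetical class, not part of the example above
    count = 0  # defined at class scope

    def bump(self):
        # Reading self.count finds the class-scope value the first time;
        # assigning to self.count then creates a field on this object only.
        self.count = self.count + 1

first = Counter()
second = Counter()
first.bump()

print(first.count)    # the field created inside bump()
print(second.count)   # still reads the class-scope value
print(Counter.count)  # the class-scope value itself is untouched
```

After bump() runs, only first carries its own count; second still falls back to the value defined at class scope.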
Similarly, an unqualified call to f() produces the global function, and self.m2() calls the member function by qualifying it (and simultaneously passing the address of the current object to be used as the self argument for m2()).
Now let's look at a class with a method that has arguments:
class C2(object):
    def m2(self, a, b): pass
To call the method, we create an instance of the object and use the dot notation to call m2() on the object obj:
obj = C2()
obj.m2(1,2)
In the call, the address of obj is implicitly passed as self for the call to m2(), and here we see a big inconsistency: why is explicit better than implicit when you define the method, but it's OK to be implicit when you call the method?
I certainly think that the method call syntax is desirable, but it means that you define a method differently than you call it, which I don't see as either "explicit" or pythonic. This is seen when you call the method with the wrong number of arguments:
obj.m2(1)
Here's the resulting error:
Traceback (most recent call last):
  File "classes.py", line 9, in <module>
    obj.m2(1)
TypeError: m2() takes exactly 3 arguments (2 given)
Because self is passed implicitly during a method call, the above error message is actually saying that it wants you to call the method this way:
C2.m2(obj,1,2)
Even though the above line does run, this of course isn't the way you actually do it; you use the normal method calling syntax and give it two arguments:
obj.m2(1,2)
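To see that the two spellings really are the same call, here is a small sketch. It assumes a variant of C2 whose m2() returns its arguments instead of doing nothing (that change is mine, made only so the results can be compared):

```python
class C2(object):
    def m2(self, a, b):
        # Assumption for this sketch: return the arguments so that
        # both call styles can be compared afterwards.
        return (self, a, b)

obj = C2()

via_instance = obj.m2(1, 2)    # obj is passed implicitly as self
via_class = C2.m2(obj, 1, 2)   # obj is passed explicitly as self

print(via_instance == via_class)
```

Both calls receive the same three values; the only difference is who writes obj down.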
The message m2() takes exactly 3 arguments (2 given) doesn't just confuse beginners; it confuses me every time I see it, which I think suggests non-Pythonicness and points out the inconsistency between method definition and method invocation.
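If you want to look at the message on your own interpreter, a sketch like the following captures it rather than letting the program die. The exact wording depends on the Python version, so no expected output is shown here:

```python
class C2(object):
    def m2(self, a, b):
        pass

obj = C2()

try:
    obj.m2(1)  # one argument short of what m2() needs
except TypeError as err:
    # Keep the text of the error so it can be inspected or printed.
    message = str(err)
    print(message)
```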
So what am I suggesting, despite the long history of hopelessness for this idea?
Make self a keyword in Python 3.1 (what's a bit more backwards incompatibility, as long as we're at it?). Or even use this as the keyword, to make it easier for C++ and Java programmers to transition. All the existing rules for self remain the same.
The only difference: you don't have to put self in a method argument list. That's the only place it becomes implicit; everywhere else it's explicit -- except, just as it is now, the method call.
This produces consistency between the method definition and the method call, so you define a method with the same number of arguments that you call it with. When you call a method with the wrong number of arguments, the error message tells you the actual number of arguments the method is expecting, instead of one more.
Before I hear "explicit is better than implicit" one more time, there's a difference between making something clear and making it redundant. We already have a language that forces you to jump through lots of hoops for reasons that must have seemed good at the time but have since worn thin: it's called Java.
If we just want to be explicit about absolutely everything, we can use C or assembler or some language that spells out exactly what's happening inside the machine all the time and doesn't abstract away from those details.
Forcing programmers to put self in the method argument list doesn't honor explicitness; it's just redundant forced behavior. It doesn't add to the expression of programming (we already know it's a method; we don't need self in the argument list to remind us), it's just mechanical, and thus, I argue, non-pythonic.