Summary:
Python creator Guido van Rossum talks with Bill Venners about the robustness of systems built with strongly and weakly typed languages, the value of testing, and whether he'd fly on an all-Python plane.
Most recent reply: July 7, 2007 8:09 PM by
|
Artima.com has published Part V of an interview with Python creator Guido van Rossum, in which he talks about the robustness of systems built with strongly and weakly typed languages, the value of testing, and whether he'd fly on an all-Python plane.
http://www.artima.com/intv/strongweak.html
Here's an excerpt:
That attitude sounds like the classic thing I've always heard from strong-typing proponents. The one thing that troubles me is that all the focus is on the strong typing, as if once your program is type correct, it has no bugs left. Strong typing catches many bugs, but it also makes you focus too much on getting the types right and not enough on getting the rest of the program correct.
Strong typing is one reason that languages like C++ and Java require more finger typing. You have to declare all your variables and you have to do a lot of work just to make the compiler happy. An old saying from Unix developers goes something like, "If only your programs would be correct if you simply typed them three times." You'd gladly do that if typing your programs three times was enough to make them work correctly, but unfortunately it doesn't work that way.
All that attention to getting the types right doesn't necessarily mean you don't have other bugs in your program. A type is a narrow piece of information about your data. When you look at large programs that deal with a lot of strong typing, you see that many words are spent working around strong typing.
What do you think of Guido's comments?
|
|
|
I think both James and Guido have good points. For example, I really like interfaces in Java and I really like the flexible and incredibly easy-to-use collections in Python.
Maybe what we need is a language that has both static and dynamic typing.
Why is it that downcasting in Java requires an explicit cast, by the way? Wouldn't it be simpler to allow implicit downcasts than to implement generics? This would solve the problem with collections, wouldn't it?
|
|
|
Before I give my experience, let me list my language skills:
Weak Type:
Perl PHP Userland UserTalk Javascript Python(light exp)
Strong Type:
Java C
While weak typing does let you prototype fast, strong typing seems to let you build more robust programs.
Plus, you are dealing with very different kinds of debugging and workflows in weakly and strongly typed languages.
|
|
|
Actually, this "strong vs. weak" typing discussion has been confusing several issues. Guido's comment that Python isn't weakly typed, but "runtime-typed", brought a bit of clarity.
Someone -- maybe Robert Martin or "Pragmatic" Dave Thomas -- came up with a taxonomy of type systems that's even clearer:
Strong/Weak: how easy it is for a programmer to circumvent the type system because he knows better. C is "weak" in this sense, while Java, Eiffel, etc. are "strong". However, so are Python, Smalltalk, Ruby, etc. C++ is weak mainly because it inherits C-style casts and void*. Note that Java's type casting doesn't weaken the type system, in that an invalid cast fails immediately with a ClassCastException. In C, casting an int to a void* always succeeds, whether it really makes sense or not ...
Static/Dynamic: whether variables and formal parameters have types fixed at compile-time, or whether the type of a variable depends on the run-time type it currently contains. This is what most people mean by "strong" or "weak" typing. C, C++, Java, Eiffel, etc. are static, Python, Smalltalk, Ruby, etc. are dynamic. Note that polymorphic method dispatch implies some dynamism, so many of the "static" languages aren't completely static.
Manifest/Implicit: whether each variable must have a type declaration or not. Most static languages have manifest types, and most dynamic languages have implicit types. However, there are exceptions: Haskell and the dialects of ML can infer the type of any variable based on the operations performed on it, with only occasional help from an explicit type. Also, the GNU/NeXT/Apple extensions to Objective-C permit optional manifest typing.
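To make the three axes concrete, here is a minimal Python sketch of where that language falls on each: dynamic (a name can be rebound to a value of another type), implicit (no declarations), and strong (the runtime refuses to treat a string as a number). The function name `add_one` and the variable names are invented for illustration only.

```python
# Dynamic and implicit: `value` has no declared type and can be rebound freely.
value = 42
kind_before = type(value).__name__
value = "forty-two"
kind_after = type(value).__name__


def add_one(x):
    # Works for any type that supports `+ 1`; nothing is declared.
    return x + 1


# Strong: the runtime will not silently reinterpret a str as a number.
try:
    add_one("1")
    outcome = "no error"
except TypeError as exc:
    outcome = f"TypeError: {exc}"

print(kind_before, kind_after)
print(outcome)
```

The equivalent mistake in C, such as reading a pointer as an integer through a cast, would compile and run without complaint, which is the sense in which C is "weak" in the taxonomy above.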
|
|
|
> Actually, this "strong vs. weak" typing discussion has
> been confusing several issues. Guido's comment that
> Python isn't weakly typed, but "runtime-typed", brought a
> bit of clarity.
Thanks for clarifying the terminology. Prior to talking to Guido, whenever I had this discussion it was always in "strong" and "weak" terms. Everybody knew that what we were discussing was the two approaches you refer to as static and dynamic. I noticed much of the discussion of this article on Slashdot used the static and dynamic terms. Guido preferred to use "strong" and "runtime".
I left in the terminology that Guido and I used in the actual conversation we had, because that's what we said. I put strong and weak in the title, because primarily those were the terms Guido and I used when talking. I think Guido's mentioning of "runtime typing" was an effort to clarify terminology a bit, and especially to point out that Python won't let you use an object in any way incompatible with its type, unlike "weak typing" languages such as C.
|
|
|
Here's my take on the strong vs weak/static vs dynamic typing issue. Python and Java are the two main languages I use. Both are used behind the scenes at Artima.com, for example. I tend to use Python for scripts, single-file programs that do one thing. I tend to use Java for apps or systems. I have a strong-typer personality, so I feel more comfortable building systems in a strongly typed environment. But I do really appreciate Python's economy of finger typing.
Although some of the extra finger typing you have to do in Java comes from the need to declare variable types, I think Java requires a lot of gratuitous finger typing. Common things I do in Java should be easy, and require a minimum of finger typing. Things like iteration and opening a file seem very verbose in Java, and not because I have to declare types. I had the gratuitous finger typing in mind when I asked Gosling to what extent he had programmer productivity in mind when he designed Java. Gosling replied that the main way he was trying to improve productivity was to make memory corruption bugs impossible. He didn't seem to be concerned with finger typing:
http://www.artima.com/intv/gosling319.html
I think one of the reasons people like IntelliJ IDEA is that it gives you finger typing shortcuts for verbose Java language structures. That helps reduce finger typing when programming in Java, but you still have to look at all that verbosity when reading code. As Guido points out, the advantage of conciseness is not just reducing finger typing but reducing the amount of code you read. I suspect I would enjoy using a language whose compiler (and/or interpreter) inferred types where possible, and required the programmer to explicitly provide types where they can't be inferred.
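As a rough illustration of the "iteration and opening a file" point, here is how those two tasks might look in current Python. This is only a sketch: the function name `count_nonblank_lines` and the sample file contents are made up for the example, and the temporary file exists just so the snippet runs anywhere.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count the lines that contain something other than whitespace."""
    # Opening the file and iterating over its lines takes two short lines.
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


# Build a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\n\nsecond\nthird\n")
    sample_path = tmp.name

nonblank = count_nonblank_lines(sample_path)
os.remove(sample_path)
print(nonblank)
```

The brevity here comes mostly from the file object being directly iterable, not from the absence of type declarations, which matches the observation above that much of Java's verbosity is unrelated to typing.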
|
|
|
One post in Slashdot that made me stop and think pointed out that languages affect the way we think, and that perhaps the most important robustness benefit of a compiler that enforces strong typing is not catching typos (sorry) but forcing the programmer to think more about types in the first place.
My feeling when programming in Python and Java is that Python tries hard to get out of your way and let you get things done. Java actually tries to get in your way and slow you down when you are doing something the designers think needs to be done carefully to be robust. So for example, checked exceptions force me to do more typing, and encourage but don't force me to do more thinking. I can't just open a file for reading without dealing with the possibility that it will throw IOException.
I think some of the out-of-your-way/in-your-way difference between Python and Java comes from the different goals the designers had for these languages. One was intended primarily for scripts, the other for systems. But as I mentioned in a different post, I think in many places Java is more verbose than it needs to be even given that from time to time Java is trying to get in my way for my own good.
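For contrast with the checked IOException mentioned above, here is a small Python sketch in which handling a failed open is optional. The helper names `read_text` and `read_text_or_default` and the missing path are hypothetical.

```python
def read_text(path):
    # No exception handling is required; a failure simply propagates upward.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def read_text_or_default(path, default=""):
    # The caller opts in to handling the error only where it matters.
    try:
        return read_text(path)
    except OSError:
        return default


# A path that is assumed not to exist on the machine running the example.
missing = "no-such-directory-for-this-example/missing.txt"
fallback = read_text_or_default(missing, default="(empty)")
print(fallback)
```

Nothing in the language forces `read_text` to acknowledge that the open can fail, which is the "out of your way" behavior described above; whether that is a convenience or a hazard is exactly what the thread is debating.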
|
|
|
> I think some of the out-of-your-way/in-your-way
> difference between Python and Java comes from the
> different goals the designers had for these languages.
> One was intended primarily for scripts, the other for
> systems. But as I mentioned in a different post, I
> think in many places Java is more verbose than it
> needs to be even given that from time to time Java is
> trying to get in my way for my own good.
Hopefully the new syntax for genericity and a recently announced iterator shorthand will take care of that. I'm sick to death of casting Object to some more specific class, or typing out for loops and Iterator boilerplate.
However I'm not sure that there's a solid distinction between "systems" and "scripts". Zope is a complex system in a scripting language; where I work we also have a complex benchmarking tool written in nearly pure Python. "Scripting" languages are more a matter of intent than specific features; Lisp and Smalltalk have dynamic, implicit typing, but generally lack another critical feature of "scripting" languages, an easy interface to other languages. (And Eiffel has an almost trivial interface to C and C++ -- or Java in SmartEiffel -- but it isn't a scripting language by any means.)
Still, I think there's some mileage in Ousterhout's "system language"/"scripting language" distinction. Objects are easier to use than to design, so an ideal system would have experts building reliable, efficient, and robust components which generalists wire together in application-specific configurations. Static, compiled languages make components safer to write, and dynamic "scripting" languages make composing existing components easier.
|
|
|
I agree with Gosling when he said that to make a Java programmer much more productive, he made memory corruption impossible. Typing speed has never been considered a criterion of a productive programmer; every experienced programmer has a decent typing speed.
So I think arguing that weakly typed languages are more "productive" than strongly typed ones just because they require less typing is nonsense, IMO.
<GUIDO> Guido van Rossum: Those variables don't have types. Runtime typing works differently, because you can easily make a mistake where you pass the wrong argument to a method, and you will only find out when that method is actually called. On the other hand, when you find out, you find out in a very good way. The interpreted language tells you exactly this is the type here, that's the type there, and this is where it happened. If you make a mistake against the type system in C or C++, and it only happens at runtime, you're in much worse shape. So you can't simply say strongly typed languages are better than runtime typed languages or vice versa, because you have a whole tradeoff of different parts. </GUIDO>
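A small Python sketch of the behavior Guido describes: the wrong argument is accepted when the code is defined and is reported only when the call actually runs, with a message naming the offending type. The functions `total_length` and `report` are invented for illustration.

```python
def total_length(words):
    # Expects an iterable of strings; nothing checks that until the call runs.
    return sum(len(word) for word in words)


def report():
    # Deliberate mistake for the example: numbers are passed instead of strings.
    return total_length([3, 4, 5])


# Defining report() raised nothing. The mistake surfaces only on the call.
try:
    report()
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

If the call to `report()` sat on a rarely executed code path, the error would stay hidden until that path ran, which is the cost side of the tradeoff Guido mentions.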
You **cannot** pass a wrong argument to a function in a strongly typed language (I speak in terms of Java). The compiler prevents you from doing so at compile time. And here I second all those strong-typing evangelists when they say: the more errors you find earlier, that is during compile time, the better it is! It may require the compiler to do much extra work of type checking ... but I simply don't care, if it produces fast-running byte-code. What is important is run-time speed, not compile-time speed (although a slow compiler is rather frustrating ..., I admit).
I think the whole point is: it's the programmer's role to define the type of object he or she is manipulating. It's more straightforward and less bug-prone - IMHO - for the programmer to know - and to have it written somewhere in the code - the exact type of data he or she's dealing with.
Just my 2 cents.
/hermann
|
|
|
I don't see how (finger) typing can be of such importance in a discussion on strong/weak typing in computer languages. Lines of code have nothing to do with productivity. Compactness of code does not necessarily make it easier to read. Haskell is very compact, partly because of implicit types, but also very hard to read.
I also have trouble understanding the practice of using a compiler as a static type-checking tool. For me, a compiler is a tool which produces binaries given source code. For static source code analysis I use dedicated tools such as lint.
|
|
|
> So I think arguing that weakly typed languages are more
> "productive" than strongly typed ones just
> because they require less typing is
> nonsense, IMO.
Python, Perl, Ruby, and similar languages save on finger typing in other ways, by having built-in collections, regular expressions, function objects, etc. Dynamic typing is a cornerstone of the approach, though.
> [...] here I second all those strong-typing evangelists
> when they say: the more errors you find earlier, that
> is during compile time, the better it is! It may
> require the compiler to do much extra work of type
> checking ... but I simply don't care, if it produces
> fast-running byte-code.
Here's how dynamic-typing evangelists handle those concerns, and offer advantages besides:
- Unit Testing: Aggressive testing, especially unit testing, to verify inputs and outputs, catches as many bugs as static types, or more.
- Higher-level types: often you don't *need* to write a new class, because existing classes and/or language mechanisms do the job simply and concisely. Perhaps it's a matter of the type of problems "system" and "scripting" languages address.
- Efficient Enough: it isn't necessary in many cases for a program to be lightning fast, as long as it's efficient enough for its purpose. GUI rendering has to be fast, but code that performs actions based on GUI events can take a second or so without a user noticing.
- "Duck Typing": See <http://c2.com/cgi/wiki?DuckTyping>. One of the frustrations of static typing is to have two classes from two different sources that have similar interfaces but inherit from no common types. For example, Java 1.4 added a CharSequence interface; before that, the only way to write a library that operated on Strings, StringBuffers, and char arrays was either to pass them as Object (as the GNU regex library does), overload and reimplement methods for all known types (as I think ORO does), or to create an adaptor (as Apache Regex does). On the other hand, a Python or Ruby object doesn't have to inherit from X; it merely has to implement enough of X's methods (ideally all) in order to be X-like enough to be substitutable, in the Liskov sense.
- Ease of Refactoring: Because of Duck Typing, changing the type of an argument or a return value from a simple type to a more complex one is a lot easier in a dynamically-typed language. In a statically-typed language, you have to change all code that uses that type, if it and the old type don't share an interface, or if the usage declared the concrete type. You'd also have to change the type of any variable or collection that held the argument or return value without using it directly.
Dynamic typing isn't right for every task, by any means. However, it can build complex systems in some domains, as Zope proves. I think dynamically typed languages are especially useful in web development because the Web is all about loose coupling, self-describing data, and converting structured text into internal data structures and back.
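The duck typing and unit testing points can be sketched together in Python. The class `Rope` and the function `count_vowels` below are hypothetical, invented only for this illustration: `Rope` shares no base class with `str`, yet any object that supports `len()` and integer indexing is accepted, and test-style assertions stand in for the compiler's type check.

```python
class Rope:
    """Hypothetical text type that does not inherit from str."""

    def __init__(self, *pieces):
        self._text = "".join(pieces)

    def __len__(self):
        return len(self._text)

    def __getitem__(self, index):
        return self._text[index]


def count_vowels(chars):
    # Relies only on len() and integer indexing, not on a declared type.
    return sum(1 for i in range(len(chars)) if chars[i] in "aeiouAEIOU")


# Unit-test-style checks covering three unrelated argument types.
assert count_vowels("duck") == 1
assert count_vowels(list("typing")) == 1
assert count_vowels(Rope("quack", "ing")) == 3
```

This is roughly the situation the CharSequence example describes: no common interface has to be retrofitted before strings, lists of characters, and a third-party type can share one function.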
|
|
|
One interesting area where strong and weak typing have interesting effects is Genetic Programming (GP). In GP, programs are randomly generated and then tested. The programs that solve the problem best are allowed to reproduce, and the process is iterated until a satisfactory solution has been found.
The first GP engines used Lisp programs, which allowed random creation of robust programs (meaning that they could execute at all, compared to a random C program). However, the resulting solutions were often impossible to understand. Later, GP has explored strongly typed languages. The benefits are that the enormous search space is reduced, and that it becomes possible to understand what the generated solutions actually do.
I guess GP generation of programs reflects the productivity discussion: Creating untyped programs is very fast, but expanding upon the solution is impossible since they are hard to understand. They are also slow. Creating typed programs is slower, but they execute faster and are easier to understand.
I am not sure if I agree with this myself though ...
|
|
|
I think the real issue of how to get a program correct is much bigger than getting the data types correct, but for compilers with no runtime checking like C, getting the types to match is very important for the program to function. Designing a more strongly type-checked language like C++ is good because it reduces the chances of a type mismatch at runtime. This was an important design consideration as C++ evolved from C, but it follows from the decision to have "no runtime" intelligence. If a program accesses a long int as if it were a char, even though both are numbers, the value returned will be way off the mark. This kind of checking is mainly a mechanical process. Using the wrong type of number is an easy mistake to make, but there are lots of ways the variable types can match while the meanings are scrambled, as when passing parameters to a function in the wrong order. Type checking (strong or weak) does nothing to detect this kind of problem. The full process of getting the program correct is a far greater endeavor than checking types "strongly" or "weakly."
Compare this "mechanical" type-checking of C/C++ with the runtime typing of Python. The Python runtime environment knows the type of each object and can do a good job of interworking between the related types. If a "wrong" type of a number is passed to a function, the Python runtime will do its best to make it work - to make the "cast" automatically. To guard against the problem of misordered function arguments, Python offers the powerful syntax of specifying the arguments by name rather than by position. This cannot be done in C/C++ but it is a great way to reduce errors. Verilog is the only other language I know of that has something like this (named signal ports to modules) and the ASIC design groups I have been part of have all mandated this syntax over positional ports for large modules.
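Here is a minimal sketch of passing arguments by name, assuming a made-up function `transfer` whose three parameters are all integers, so no type check could catch a swapped pair. The keyword-only variant `transfer_strict` relies on the bare `*` syntax of current Python, which postdates this thread.

```python
def transfer(amount, source_account, target_account):
    # All three parameters are ints here, so a type check cannot tell
    # a swapped pair of account numbers from a correct call.
    return {"amount": amount, "from": source_account, "to": target_account}


# Positional call: correctness depends on remembering the parameter order.
positional = transfer(250, 1001, 2002)

# Named call: the order at the call site no longer matters.
by_name = transfer(target_account=2002, amount=250, source_account=1001)


def transfer_strict(*, amount, source_account, target_account):
    # The bare `*` makes every parameter keyword-only.
    return transfer(amount, source_account, target_account)


try:
    transfer_strict(250, 1001, 2002)
    rejected = False
except TypeError:
    rejected = True

print(positional == by_name, rejected)
```

The keyword-only form is the closest Python analogue to a team mandating named ports in Verilog: positional calls are rejected outright instead of being discouraged by convention.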
Yes, you might get lucky and have a C++ compiler-detected type mismatch in a function argument misordering, but you might just as well be unlucky and have the misordering not cause a type mismatch (e.g., reversed ints). Python will certainly detect a type mismatch it can't handle, (it will happen at runtime instead of compile-time), but neither environment will detect all mistakes. There is no substitute for a proper verification process! Because Python is easy to run, I typically run the developing program often, as I add functionality. Then, the distinction between compile-time and run-time checking essentially goes away.
BTW, for all of g++'s strong type checking, I have managed to confuse it (by accident) with legitimate (I believe) default values for function arguments to the point it delivered a very screwy value when reading one numeric argument type as another. That took some serious debugging!
Another aspect of Python that I find greatly reduces my mistakes is the use of pointers/references to access all objects. When I'm doing complex things with classes, references are what I want anyway. A C/C++ variable NAME with a VALUE and maybe a POINTER and an ADDRESS makes keeping track of when to use a "*" and a "&" tricky. Thankfully there is strong type checking in C++! I really need it, ...but with Python I don't... With Python, I'm using a single object referencing scheme that I can just get right the first time, every time. Again, strong (compile-time) type checking in C++ is very important for C++, but "compile-time" checking in Python would be much less valuable.
In summary, I think the discussion of strong vs. weak type checking is dealing with just the tip of the iceberg. The majority of the discussion of what makes a language good for a particular application is much vaster than this, but it requires looking under the surface, ...and sometimes that is hard. I find it all but impossible to discuss the actually important language issues with my C programming friends - they have no experience with a good, object-oriented VHLL (Very High Level Language) like Python, so they have no mental framework within which to evaluate the points I make. (To be fair, the most knowledgeable C++ programmer I know is a Python enthusiast.) The essence of it for me is that I can write reliable programs with many fewer iterations (whether compile-time or run-time triggered) with Python than I can with C++, doing more functionality in much less time. I use C++ where speed of execution is critical (SystemC), but with a 2 GHz processor, Python is quick enough for a surprising number of jobs!
|
|
|
"That all sounds fine to me in theory. It makes sense that the sooner I find programming errors the better. Strong typing helps me find errors at compile time. Weak typing makes me wait until a runtime exception is thrown. The trouble is that I use Mailman and qmail. Both are completely written in Python and both work fine. They aren't prototypes, they're applications, and they both seem quite robust to me. So where does theory miss practice?"
I know that Mailman is written in Python, but I always thought that qmail (from DJB) was written entirely in C.
|
|
|
> "That all sounds fine to me in theory. It makes sense that
> the sooner I find programming errors the better. Strong
> typing helps me find errors at compile time. Weak typing
> makes me wait until a runtime exception is thrown. The
> trouble is that I use Mailman and qmail. Both are
> completely written in Python and both work fine. They
> aren't prototypes, they're applications, and they both
> seem quite robust to me. So where does theory miss
> practice?"
>
> I know that Mailman is written in Python, but I always
> thought that qmail (from DJB) was written entirely in C.
I suspect you are right. Someone pointed this out in Slashdot as well. I'm not sure where I got the idea that qmail was written in Python. I tried to find a statement on the qmail website about what language it was written in, but I couldn't find anything. I'm planning to remove the mention of qmail and just leave the mention of mailman.
I guess I only have experience with one full-fledged application written in Python, and that's mailman.
|
|