I feel like I've been neglecting Python while working on the fourth edition of the Java book (and didn't make it to Pycon this year). But under the covers, I write all my tools in Python.
After a lot of initial resistance, I finally decided that Ruby was interesting, and I will continue to learn things about it from time to time, but at this point it's just because I find that learning new languages helps me with the ones I use on a day-to-day basis. But the hurdle to actually writing a tool in Ruby is still too big; I just don't have the incentive yet compared to my positively-productive experiences in Python.
Someone in the discussion said that it should be about frameworks, and I would argue that it should be more generally about libraries. I've never forgotten one of the comments that Bjarne Stroustrup made in a very early C++ Standards Committee meeting (I think), to the effect that the goal of OO programming is to make library use easier. I think C++ did that, in comparison to C, and yet the creation of libraries was still hard, which meant that the number of C++ libraries grew relatively slowly. I think the further development of OO languages has been in the direction of making both the use and the creation of libraries easier (which is why Java generics is not such a big step forward). In general, Python does this (although not completely trivially; I've never had the patience to figure out the whole __init__.py thingy), and as a result we've seen lots of libraries -- and to my mind most importantly, lots of standard libraries in the distribution (perhaps CheeseShop will change this distinction).
As far as the RoR/Django/TurboGears issue, I think that it would be even more interesting to move up a level of abstraction and try to figure out what problem we're really trying to solve here. In broad terms, I think that EJBs, and to some degree Zope, are examples of big systems that do everything, and in the case of EJB1 & 2, were designed for the wrong consumers of the technology (which in the case of EJBs, didn't exist). So RoR comes along at the other extreme and makes the case that most of the time, web sites are "babysitting databases," so why not make that really easy.
In design patterns, we try to "encapsulate change." So the interesting question here is "what changes?" I think that it's more than just "babysitting databases," but over time the issue of databases has grown more and more prevalent in programming, to the point where, instead of writing specialized languages that just solve database problems, we're starting to see mainstream languages (like C# 3.0) incorporate support for databases directly in the language.
I think the big-picture Pythonic solution will not be to simply create a better RoR, or to only solve the web-programming problem (although that should be one of the goals). I think we need to take a look at databases in general and to find a better way of solving the database problem within Python, in such a way that the database support for a web system falls out naturally, but also in a way that data storage and retrieval becomes a natural thing to do in all Python programs.
This might be something inspired by the way annotations are used in EJB3, which (at least at first glance) does seem simple and elegant, and holds out the hope that EJBs are not a lost cause (although you could certainly argue that EJB3 bears almost no resemblance to EJB1 & 2, and so the reuse of the EJB name is just a marketing ploy). I think the important direction here is not to try to create an OO database, because it doesn't seem that OO databases have ever been particularly successful, but instead to acknowledge that relational databases are a good (and possibly necessary) solution to the database problem, and what we need is something that makes straightforward usage of relational databases easy, and more complex uses possible. I think that with this problem solved, then solutions that require databases (including web-based solutions) will fall out naturally.
Just my current thought-stream; I'd be interested in hearing ideas from others.
I think you are onto something. I think higher level languages should have independence of databases in a library, ready to go. If on top of that, you have cross-database administration, with addition of columns, copying of data between columns, etc, then everything will be ready to conquer great market share.
For example, every table needs an autoincrement field (that's the norm). I shouldn't care if the database has an autoincrement type or if the database needs triggers to create the autoincremented value. That's better if handled automagically by the database layer, which does not need to be ORM initially, though a "classic" ORM solution could be built on top of it.
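A minimal sketch of what such a layer might do for the autoincrement case. The function name create_table_sql and the dialect keys are made up for illustration and do not come from any real library; the DDL fragments are the usual spellings for SQLite, PostgreSQL, and MySQL, and only the SQLite one is actually run here.

```python
import sqlite3

# Hypothetical mapping from a dialect name to the DDL for an
# autoincrementing primary key. The keys are illustrative only.
AUTOINCREMENT_DDL = {
    "sqlite": "INTEGER PRIMARY KEY AUTOINCREMENT",
    "postgresql": "SERIAL PRIMARY KEY",
    "mysql": "INTEGER AUTO_INCREMENT PRIMARY KEY",
}

def create_table_sql(dialect, table, columns):
    """Build a CREATE TABLE statement with an 'id' column the caller never spells out."""
    cols = ["id " + AUTOINCREMENT_DDL[dialect]]
    cols += ["%s %s" % (name, sqltype) for name, sqltype in columns]
    return "CREATE TABLE %s (%s)" % (table, ", ".join(cols))

# Exercise the SQLite variant against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("sqlite", "person", [("name", "TEXT")]))
conn.execute("INSERT INTO person (name) VALUES (?)", ("Ann",))
conn.execute("INSERT INTO person (name) VALUES (?)", ("Bob",))
ids = [row[0] for row in conn.execute("SELECT id FROM person ORDER BY id")]
```

The application code only says "this table has an id"; which mechanism produces the value is the layer's problem. A database that needs a sequence plus a trigger would need more than a one-line fragment, which this sketch does not attempt.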
Also, it's better to support paging automagically. Nobody should need to retrieve more rows from the database than he is interested in. And that's one more great difference between databases. I shouldn't care how the database in question handles it, because I want my code to work cross-databases whenever possible, remember?
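Paging could be hidden in the same way. The sketch below assumes a hypothetical helper, page_sql, with made-up dialect names; the clauses themselves are the standard LIMIT/OFFSET form and the OFFSET ... FETCH form that SQL Server accepts from version 2012 on. Only the SQLite path is executed.

```python
import sqlite3

def page_sql(dialect, base_query, page, page_size):
    """Append the dialect's paging clause to a query that already has an ORDER BY.

    Hypothetical helper; pages are numbered from 1.
    """
    offset = (page - 1) * page_size
    if dialect in ("sqlite", "postgresql", "mysql"):
        return "%s LIMIT %d OFFSET %d" % (base_query, page_size, offset)
    if dialect == "sqlserver":
        # SQL Server 2012 and later
        return "%s OFFSET %d ROWS FETCH NEXT %d ROWS ONLY" % (
            base_query, offset, page_size)
    raise ValueError("no paging rule for %r" % dialect)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO item (label) VALUES (?)",
                 [("item %d" % i,) for i in range(1, 11)])

# Fetch only the second page of three rows; the other rows never leave the database.
query = page_sql("sqlite", "SELECT id FROM item ORDER BY id", page=2, page_size=3)
second_page = [row[0] for row in conn.execute(query)]
```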
Well, just a week ago I tried Mono/C# on Linux. It was then that I remembered that I wouldn't have an easy way to handle cross-database programming, like the one I have in a custom Ruby library that I created. It's really a pain to move to a language where you won't have that taken care of for you. For that matter, I have never used Hibernate/NHibernate, though that's the best-known example of a cross-database layer to date. I'm not sure if it handles what I have mentioned above automagically, but I hear that it is greatly configurable, so maybe some manual setting is possible.
I know some Python people have mentioned the SQLAlchemy ORM library, but I have no idea how it works yet... :-)
So, I concur with you in everything that you have posted.
I just briefly looked at some of the examples in SQLAlchemy, and for what I am talking about it's too closely bound to the structure of the database. My first thought is that I'd like to be able to just add decorators to indicate that a field should be persisted in a database. Of course, there may be some even more elegant way to think about the problem.
I wouldn't want to have to pay attention to the SQL until I was doing something too complicated for the decorated approach to work. I suppose one way to look at it is in terms of tables -- if you can stay within a table, then decorators and high-level python constructs should solve the problem, but perhaps if you have to deal with multiple tables then you can drop into SQL. But even then my preference would be to just have some kind of SQL() method that allows me to pass an SQL string, and which returns an object representing the result.
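To make the idea concrete, here is a rough sketch of what the decorated approach plus a raw-SQL escape hatch might look like. Everything in it is hypothetical: persisted, save, and sql are names invented for this illustration, and sqlite3 is used only because it ships with Python. It is not how SQLAlchemy or any other existing library works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def persisted(*fields):
    """Hypothetical class decorator: create a table for the named fields and add save()."""
    def decorate(cls):
        table = cls.__name__.lower()
        conn.execute("CREATE TABLE %s (id INTEGER PRIMARY KEY, %s)"
                     % (table, ", ".join(fields)))

        def save(self):
            # Insert the decorated fields and remember the generated id.
            marks = ", ".join("?" for _ in fields)
            values = [getattr(self, f) for f in fields]
            cur = conn.execute(
                "INSERT INTO %s (%s) VALUES (%s)"
                % (table, ", ".join(fields), marks), values)
            self.id = cur.lastrowid
            return self

        cls.save = save
        return cls
    return decorate

def sql(query, params=()):
    """Escape hatch: run a raw SQL string and return the rows as a list of dicts."""
    cur = conn.execute(query, params)
    names = [d[0] for d in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]

@persisted("name", "email")
class Author:
    def __init__(self, name, email):
        self.name = name
        self.email = email

first = Author("Ann", "ann@example.com").save()
Author("Bob", "bob@example.com").save()
rows = sql("SELECT name FROM author WHERE email LIKE ? ORDER BY name",
           ("%example.com",))
```

Within one table the class never mentions SQL; anything more complicated goes through sql() and comes back as plain Python objects.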
Disclaimer: not an SQL/database expert (I suppose part of the reason I want something like this is so I don't have to be).
> My first thought is that I'd like to be able to just add decorators to indicate that a field should be persisted in a database.
That's the primary point at which the relational model doesn't map to many people's mental model: the relational model pretty much _requires_ that "a field" be part of a container, and that container must be known to the storage layer in some fashion. Sets are at the heart of the relational model, and if you don't know how a piece of data fits into a set, you're going to be fighting the library quite a bit.
That said, I've tried to design Dejavu to be an ORM in which you never have to write any SQL, and you, the programmer, don't have to be an SQL/database expert. Your _deployers_ may want to have some knowledge in that field (in order to make decisions about speed and memory requirements), but the developer shouldn't have to. The developer should have some skill in designing architectures with interacting classes, but not in SQL--it's all done in pure Python. Dejavu allows deployers to store everything using shelve if they want to, without changing any of your precious, fragile, application code. ;)
I think the main problem with ORMs is that the object systems of popular languages are imperative and do not know much about transactions (for example, supporting multiple snapshots of an object stored in the database).
So SQL is in fact at a somewhat higher level of abstraction than Python.
Maybe we need something functional to do well with large amounts of objects and transactions. Something that can infer that def age(self): return today() - self.birthdate
is a pure function and can translate it to SQL instead of just executing it, or 'gracefully degrade' to mere execution if it cannot translate it...
Max, I agree completely! For object filtering (SQL WHERE) Dejavu does this with lambdas (which restrict the "abstracted code" to expressions only). Those lambdas have their bytecode inspected, and SQL is produced from them. However, if for any reason a given storage backend cannot perfectly reproduce the given logic in SQL, the whole unit is constructed and passed to the lambda to see if it "passes".
For example, to retrieve all units whose age is less than 10, the programmer might write:
If the hypothetical generic age() function is worth translating once and for all, it can be registered and then special-cased in the decompilation to SQL. Again, if a given backend does not yet have a perfect translation for the logic, the pure Python lambda can be used as a fallback.
That must still be done by hand in Dejavu, but you've inspired me now to reinvestigate translating more complete functions from Python to SQL automatically :) In your example, I already have a builtin today() function, which all of my backends can decompile to SQL; assembling all the pieces shouldn't be hard.
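The "translate if possible, otherwise fall back to Python" idea can be sketched without any bytecode inspection. The example below uses a different and much simpler technique, operator overloading on a symbolic stand-in object, so it is not Dejavu's implementation, and every name in it (Column, SqlExpr, Symbols, Row, select) is invented for the illustration.

```python
import sqlite3

class SqlExpr:
    """A WHERE fragment plus its bound parameters."""
    def __init__(self, text, params):
        self.text, self.params = text, params
    def __bool__(self):
        # 'and'/'or'/'not' cannot be captured by overloading, so refuse
        # and let the caller fall back to plain Python evaluation.
        raise TypeError("cannot translate boolean logic")

class Column:
    """Symbolic stand-in for a row attribute; comparisons build SQL."""
    def __init__(self, name):
        self.name = name
    def __lt__(self, other):
        return SqlExpr("%s < ?" % self.name, [other])
    def __gt__(self, other):
        return SqlExpr("%s > ?" % self.name, [other])

class Symbols:
    """Hands out a Column for any attribute the predicate touches."""
    def __getattr__(self, name):
        return Column(name)

class Row:
    """Plain object built from a fetched row, used by the fallback path."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

def select(conn, table, columns, predicate):
    """Return (rows, used_sql): push the predicate into WHERE if it translates,
    otherwise fetch every row and filter with the predicate in Python."""
    col_list = ", ".join(columns)
    try:
        expr = predicate(Symbols())
    except (AttributeError, TypeError):
        expr = None
    if isinstance(expr, SqlExpr):
        cur = conn.execute("SELECT %s FROM %s WHERE %s"
                           % (col_list, table, expr.text), expr.params)
        return [Row(**dict(zip(columns, r))) for r in cur], True
    cur = conn.execute("SELECT %s FROM %s" % (col_list, table))
    rows = [Row(**dict(zip(columns, r))) for r in cur]
    return [r for r in rows if predicate(r)], False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unit (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO unit VALUES (?, ?)",
                 [("Ann", 7), ("Bob", 12), ("Al", 30)])

# Translates to "age < ?" and runs in the database.
young, young_used_sql = select(conn, "unit", ["name", "age"], lambda u: u.age < 10)
# Column has no startswith(), so this one degrades to Python filtering.
a_names, a_used_sql = select(conn, "unit", ["name", "age"],
                             lambda u: u.name.startswith("A"))
```

The caller writes the same lambda either way; whether it ran as SQL or as Python is an optimization detail, which is the point of the graceful fallback.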
But why not OO databases? OO has been successful in just about every area it has been applied to, but not in databases, where the relational model has proved far more popular. Any particular reasons for this?
I've spent the last two or three weeks working with Apple's relatively new technology, Core Data, which is part of the Cocoa frameworks. (Actually, the ideas in Core Data are not that new, stemming back to the Enterprise Objects Framework developed by NeXT.)
I think the ideas in Core Data are very powerful. Basically, you use a graphical tool to build an object model (entities and relationships), and then you basically get persistence for free. Objects can persist in XML or binary files, or an SQLite database. (Apple hasn't added other databases yet, but you can feel it coming.) As a developer, you work with the objects and the model, but never directly with SQL or XML.
I'm not experienced enough with RoR or TurboGears to know if this is the approach they take, but I think it works very well. Objective-C, which is the language of Cocoa, is in the C family, but is actually much closer to Python than to C++ or Java, being dynamically typed. So making a Core Data for Python should not be that difficult.
My first programming job involved something called RPG, which natively integrated database access. In fact, database access is about the only thing that language did... This was such a long time ago (16 years or so) that I don't remember much about it, but I couldn't help noticing that we may have come full-circle here and what seems new is well-forgotten old.
> But why not OO databases? OO has been successful in just about every area it has been applied to, but not in databases, where the relational model has proved far more popular. Any particular reasons for this?
I think an OO database would be too high a level for most applications. One of the nice things about tables and relations is that you can do almost anything with them. One set of data that holds last year's actual financial data can be used to seed next year's budget without too much hassle, for example. Tables and relations are simple enough that you can put just about any abstraction on top of that you want. Once you go the OO route, you're basically saying "this data models this object/behavior" and I think you lose a lot of that flexibility.
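As a small illustration of that flexibility, the budget example is a single INSERT ... SELECT when the data lives in plain tables. The table names, columns, and figures below are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and figures, purely for illustration.
conn.execute("CREATE TABLE actuals (year INTEGER, account TEXT, amount REAL)")
conn.execute("CREATE TABLE budget (year INTEGER, account TEXT, amount REAL)")
conn.executemany("INSERT INTO actuals VALUES (?, ?, ?)",
                 [(2005, "travel", 1000.0), (2005, "hardware", 4000.0)])

# Seed next year's budget from last year's actuals in one statement.
conn.execute(
    "INSERT INTO budget (year, account, amount) "
    "SELECT year + 1, account, amount FROM actuals WHERE year = ?", (2005,))

budget = conn.execute(
    "SELECT year, account, amount FROM budget ORDER BY account").fetchall()
```

Nothing about the actuals table had to anticipate that it would one day be reused as a budget.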
Plus tables are just comfy. People have been dealing with relational data in tables for many decades now. There are many well-known tools and techniques for doing so, and they tend to work reasonably well across products and across platforms. If your object representation doesn't match the activity I want to perform on the data, I've got to change it anyway. I'll have to figure out what your object model is and figure out how I can shoehorn that into my application.
Generally speaking, in my experience, OO databases hurt more than they help.
>>"Generally speaking, in my experience, OO databases hurt more than they help."
Depends on the problem you're trying to solve... Why not have a look at http://www.odbms.org/ and look at Rick Grehan's articles ("whitepapers") about the why and when of ODBMSs?
Personally, I sometimes find it hard to think in terms of "tables" and "columns" when I'm writing OO code (ColdFusion and Java). Having an ORM solution can be a great help, but other tools (any DB is just a tool, right?) may well be better, depending on the situation. I'm not sufficiently versed in Python to answer this question: is it possible to link Python to an ODBMS like "db4o" (http://www.db4o.com)? There's a Java version of db4o as well as a .NET version...