This post originated from an RSS feed registered with Python Buzz
by Ben Last.
Original Post: Like Calls To Like Across The Vasty Deep
Feed Title: The Law Of Unintended Consequences
Feed URL: http://benlast.livejournal.com/data/rss
Feed Description: The Law Of Unintended Consequences
One of the sub-infinite number of monkeys on my back is a deep and abiding interest in networks of objects that are linked by similarity. This goes back a long way, probably to when I worked for Amaze in the heady and hedonistic days before the dot in dot-com became a full stop. There we did some work on networks of retail items, each of which was linked to any others that were "like" it, where "like" was defined as "shares some attribute(s) in common to some degree". The same problem domain crops up every so often and snares me, rather as the gold and treasure of the Pharaohs snare Beni in the last scene of The Mummy, as the massive stone doors of a deadline thunder down to trap me in the black darkness of an unfinished project. What those little flesh-eating scarabs represent, I have no idea.
The PlayStationProject, on which we're currently embarked/about to embark/will have embarked by the time this gets posted, involves the generation of a VeryLargeNumber of data records. The exact content of these is derived from another somewhat smaller set of input records, which represent Things that are Related To Each Other; ah, I say, my old nemesis, we meet again. Anyroad up, the brute force solution to this already exists: a nice Python script that derives the relationships between the several thousand Things and stores the degree of relatedness in a MySQL database. This takes around eight hours to run, but once it's done it only needs to be redone when the input data changes sufficiently, and that will stop happening as the project proceeds.
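For the curious, the shape of that brute force approach looks roughly like the sketch below. It is not the real script: the `things` records, the Jaccard-style `relatedness` measure and the `relation` table are all assumptions made up for illustration, and the standard library's `sqlite3` stands in for MySQL so that the example runs on its own.

```python
import sqlite3
from itertools import combinations

# Hypothetical input records: each Thing is a name plus a set of attributes.
things = {
    "alpha": {"red", "round", "small"},
    "beta": {"red", "square", "small"},
    "gamma": {"blue", "square", "large"},
}


def relatedness(a, b):
    """Assumed measure: shared attributes divided by all attributes (Jaccard)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_relations(things, conn):
    """Score every unordered pair of Things and store the result."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relation ("
        "thing_a TEXT, thing_b TEXT, degree REAL, "
        "PRIMARY KEY (thing_a, thing_b))"
    )
    rows = [
        (name_a, name_b, relatedness(things[name_a], things[name_b]))
        for name_a, name_b in combinations(sorted(things), 2)
    ]
    conn.executemany("INSERT OR REPLACE INTO relation VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
pair_count = build_relations(things, conn)
```

Every pair gets compared, so n Things cost n(n-1)/2 comparisons; a few thousand Things means millions of pairs, which is why this sort of job is slow however nicely it is written.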
However, I find this solution grates against my sense of elegance. This is Python; what am I doing shoving results into a database when I could just persist the whole lot? After all, there's a neat little caching system that creates objects from input records as they're needed and then holds them in memory; why not create the lot, build the relationships and drop it all in a ZODB?
Part of the reason lies in the nature of the database. There are two people who maintain it, carefully researching information from a wide variety of paper and web sources and dropping it into MySQL via the useful MySQLCC application; a fine and flexible data storage application that cost nothing to build (except time, and that was pretty much the same as any other shared database solution). If I change the way the data's stored, I need to build an interface to it for them. And it has to be a simple, non-technical, easy interface that follows a spreadsheet metaphor. The ZODB is indeed an excellent and admirable thing, but the only way into and out of it is via Python, and I don't have the time to construct the interface tools necessary to use it. Again, the requirements of the project environment dictate technology choices, which is as it should be.
Which brings us, as you've probably already guessed, to the intermediate world of Object-Relational Mappers. Of all these, Ian Bicking's SQLObject stands out as a beautifully crafted way of instantiating objects from records. It matches a few key requirements for me:
It doesn't bugger about with your schema, apart from requiring an id key (which is the pattern I tend to use anyway).
It supports singletons by design - each object that matches a record is a singleton.
It deals elegantly with extensions to the schema, or creating objects that easily wrap existing schemas.
It takes away a lot of the tedious messing about with string formatting that builds SQL.
It's not perfect; there tend to be more SQL queries fired off than one might get in a more closely mapped system, but I'll take flexibility and loose coupling over performance most days of the week. So there will be refactoring today, oh yes, there will be. And another persisted network of objects connected by similarities will spring into being.
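The singleton-per-record idea is worth a small illustration. What follows is a hand-rolled sketch of the general pattern, emphatically not SQLObject's API or implementation: the `Thing` class, its `get` method and the `thing` table are hypothetical names, and `sqlite3` is used only so the example is self-contained.

```python
import sqlite3


class Thing:
    """Hypothetical record wrapper that hands out one instance per row id."""

    _cache = {}  # row id -> instance already built for that row

    def __init__(self, row_id, name):
        self.id = row_id
        self.name = name

    @classmethod
    def get(cls, conn, row_id):
        # Return the existing object if this record has been loaded before.
        if row_id in cls._cache:
            return cls._cache[row_id]
        row = conn.execute(
            "SELECT id, name FROM thing WHERE id = ?", (row_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no thing with id {row_id}")
        obj = cls(*row)
        cls._cache[row_id] = obj
        return obj


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO thing (id, name) VALUES (1, 'first')")

first = Thing.get(conn, 1)
again = Thing.get(conn, 1)  # served from the cache, no second query
```

Because both lookups return the very same object, a relationship hung off one reference is visible through every other, which is exactly what a network of linked objects needs.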