This post originated from an RSS feed registered with Python Buzz
by Ben Last.
Original Post: Zope, Threads and Things That Are Not Modules
Feed Title: The Law Of Unintended Consequences
Feed URL: http://benlast.livejournal.com/data/rss
Feed Description: The Law Of Unintended Consequences
Thread-local storage & thread-safe locking in Zope External Methods
You may, like me, have been misled by part of the Zope management interface into believing something that's not true: that External Methods live in a module. This would be a perfectly natural assumption; when you add an External Method, one of the fields in the form you fill out is titled "Module Name". If you read the Zope Book, you'll also see phrases like "You've created a Python function in a Python module", "You can define any number of functions in one module" or "put this code in a module called my_extensions.py".
But they're not modules; at least, not in key senses of the word. What actually happens is that the source file is loaded, compiled and the resulting code object is then used to get access to the methods as needed. This is clever, but has some unfortunate side-effects, one of which is that it isn't possible to rely on certain module-level semantics.
One of the things that doesn't appear to work is a trick like this:
import thread

my_module_level_lock = thread.allocate_lock()

def my_mutex_method():
    """A method that uses the lock to enforce thread-safety"""
    my_module_level_lock.acquire()
    try:
        f = open("myfile.log","a")
        f.write("Yowza!\n")
        f.close()
    finally:
        my_module_level_lock.release()
Obviously, what we're doing here is sharing the one module-level lock amongst all threads. But because of the way Zope uses external methods, there may in fact be more than one module-level lock. And thus, of course, you won't get proper thread exclusion. Even worse - it fails silently, so you may not even be aware that you're not locking around access to shared resources.
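The effect is easy to reproduce outside Zope. What follows is a minimal sketch, not Zope's actual loader: it assumes only that the same source gets compiled and executed more than once, each time into a fresh namespace. It uses the `threading` module of current Python rather than the older `thread` module, and the names `SOURCE` and `load_once` are made up for the illustration.

```python
# Stand-in for the text of an External Method file (hypothetical).
SOURCE = """
import threading
my_module_level_lock = threading.Lock()
"""

def load_once(source):
    """Compile the source and execute it in a brand-new namespace."""
    namespace = {}
    exec(compile(source, "<external method file>", "exec"), namespace)
    return namespace

first = load_once(SOURCE)
second = load_once(SOURCE)

lock_a = first["my_module_level_lock"]
lock_b = second["my_module_level_lock"]

# Two loads give two independent locks.
distinct = lock_a is not lock_b

# Holding one lock does nothing to stop a caller that uses the other.
lock_a.acquire()
got_b_while_a_held = lock_b.acquire(False)  # non-blocking attempt
lock_b.release()
lock_a.release()

print(distinct, got_b_while_a_held)
```

Both values come out true: the second lock can be taken while the first is held, which is exactly the silent failure described here.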
Need proof? Try this trick. Create a "module-level" class in your External Method file (i.e. the file that contains the source to your External Methods; call it an External Method file). Create a "module-level" instance of it that, in the __init__ method, dumps its id and the thread ident to stdout:
import thread

class Noddy:
    def __init__(self):
        print "Noddy %s" % self
        print "Created by thread %d" % thread.get_ident()
    def __del__(self):
        print "Noddy %s is being deleted" % self

noddy = Noddy() #Allocate a module-level Noddy()
Use bin/runzope to run the Zope instance and catch the output. Here's some I collected earlier:
Noddy <__builtin__.Noddy instance at 0x40fa8f6c>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x4130a3cc>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x40fa8f6c> is being deleted
Noddy <__builtin__.Noddy instance at 0x40fa8c6c>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x4131afec>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x4130202c>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x40fd076c>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x40fe084c>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x41153fac>
Created by thread 1026
Noddy <__builtin__.Noddy instance at 0x4114ceac>
Created by thread 1026
Yow. The module level object is being created multiple times, all by the one thread. Imagine if we'd put some useful resource-eating allocation at the module level. However, if we put the Noddy stuff into a module that the External Method file imports, we get:
Noddy <ThreadShared.Noddy instance at 0x40fa976c>
Created by thread 1026
Just the one. As expected. And even if we run parallel requests to get multiple Zope threads working, we still get only the one instance of Noddy.
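The difference comes from the import machinery, which caches each real module in `sys.modules` and executes its body only on the first import. A small sketch of that behaviour, written for current Python: the module name `noddy_demo_shared` and the temporary directory are invented for the illustration and stand in for a module such as ThreadShared.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny real module to a temporary directory (hypothetical name).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "noddy_demo_shared.py"), "w") as f:
    f.write("class Noddy(object):\n    pass\n\nnoddy = Noddy()\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

first = importlib.import_module("noddy_demo_shared")
second = importlib.import_module("noddy_demo_shared")

# The second import is served from sys.modules, so the module body
# (and therefore Noddy()) ran only once.
same_module = first is second
same_noddy = first.noddy is second.noddy
print(same_module, same_noddy)
```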
I ran into this problem whilst trying to allocate a single MySQLdb database connection per thread; for this, I wanted to keep a mapping of thread idents to MySQLdb Connection objects. But whatever I did, the mapping refused to behave as a module-level object. Thus, from necessity, was born the ThreadShared module, which returns a TLS (thread-local-storage) object per thread.
And it looks like this:
#!/usr/bin/env python
#Shared resource module for Zope-level ExternalMethod thread-local storage
import thread

#A TLS is just an object on which arbitrary attributes may
#be set. In a Zope system, you could make this a
#Products.PythonScripts.standard.Object
#You can add methods on it to do anything you need.
#If you have a thing about preferring dicts rather than arbitrary objects,
#then use a dict.
class TLS:
    pass

tlsMap = {} #dict that maps thread idents to TLS objects
tlsLock = thread.allocate_lock() #global lock object over map

def getTLS():
    """Return the thread-local-storage object for the current thread.
    If there isn't one, create one."""
    ident = thread.get_ident()
    #Lock the tlsLock.
    tlsLock.acquire()
    try:
        #Obtain or create the object
        try:
            tls = tlsMap[ident]
        except KeyError:
            tls = TLS()
            tlsMap[ident] = tls
    finally:
        #Release the lock, even if creating the TLS object raised
        tlsLock.release()
    return tls
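For comparison, Python 2.4 and later ship `threading.local`, which provides the same kind of per-thread attribute bag without a hand-rolled map and lock. This is an alternative technique, not the ThreadShared module itself, and it would still need to live in a real imported module for the reasons already described. The names `get_tls`, `worker` and `seen` are illustrative only.

```python
import threading

_tls = threading.local()  # one shared object, separate attributes per thread

def get_tls():
    """Return the storage object; attributes set on it are per-thread."""
    return _tls

seen = {}  # thread label -> what that thread observed

def worker(name):
    tls = get_tls()
    # A fresh thread sees none of the attributes set by other threads.
    had_value_before = hasattr(tls, "value")
    tls.value = name
    seen[name] = (had_value_before, tls.value)

get_tls().value = "main"
threads = [threading.Thread(target=worker, args=("t%d" % i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread's value is untouched by the workers.
print(get_tls().value, sorted(seen.items()))
```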
Okay, so you have a TLS object, how might you use it? Well, for anything that needs to be allocated once per thread. For example, assuming you've imported ThreadShared:
tls = ThreadShared.getTLS()
conn = getattr(tls,'DatabaseConnection',None)
if conn is None:
    conn = MySQLdb.connect( etc etc )
    setattr(tls,'DatabaseConnection',conn)
#And here we have a connection for this thread.
In this case, it might well be worth adding a method to the TLS class to generate the connection, but you get the general idea.
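One way such a method might look is sketched below, in current Python. `get_or_create` is a hypothetical name, and a `fake_connect` function returning a plain object stands in for the real `MySQLdb.connect` call so that the sketch runs without a database.

```python
class TLS(object):
    """Attribute bag with a helper for once-per-thread resources."""

    def get_or_create(self, name, factory):
        """Return the attribute called name, creating it with factory() if absent."""
        try:
            return getattr(self, name)
        except AttributeError:
            value = factory()
            setattr(self, name, value)
            return value

calls = []

def fake_connect():
    # Stand-in for MySQLdb.connect(...); records each call.
    calls.append(1)
    return object()

tls = TLS()  # in real use this would come from ThreadShared.getTLS()
conn1 = tls.get_or_create("DatabaseConnection", fake_connect)
conn2 = tls.get_or_create("DatabaseConnection", fake_connect)
print(conn1 is conn2, len(calls))
```

The factory runs only on the first call, so each thread's TLS object ends up holding a single connection.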
Another problem that comes from the same source is this: how do you get a module-level lock in an External Method file? The answer is - put it in another module (a real module) that your External Method file imports. Be careful with the import path, though - by default the Extensions instance directory isn't on the import path, so if you like all your External Method code to live in the one place, you'll need to mess with sys.path to add it.
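A sketch of the path adjustment, in current Python and assuming nothing about where your instance lives: `ensure_on_path` is a hypothetical helper, and a temporary directory holding a made-up `shared_locks_demo` module stands in for the Extensions directory and your real lock module.

```python
import importlib
import os
import sys
import tempfile

def ensure_on_path(directory):
    """Add directory to sys.path once, so modules in it can be imported."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory

# Stand-in for the Extensions directory, containing a real module with the lock.
extensions_dir = tempfile.mkdtemp()
with open(os.path.join(extensions_dir, "shared_locks_demo.py"), "w") as f:
    f.write("import threading\nlog_lock = threading.Lock()\n")

ensure_on_path(extensions_dir)
ensure_on_path(extensions_dir)  # second call is a no-op
importlib.invalidate_caches()

shared = importlib.import_module("shared_locks_demo")

def my_mutex_method(path):
    """Append a line to path while holding the one shared lock."""
    shared.log_lock.acquire()
    try:
        with open(path, "a") as f:
            f.write("Yowza!\n")
    finally:
        shared.log_lock.release()
```

Because the lock lives in an imported module, every reload of the External Method file picks up the same `log_lock` object.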
Of course, all this will come as no surprise to seasoned Zopistas, and I expect the usual level of flames informing me of my rank ignorance and stupidity in not having worked this out ages ago (presumably by reading the source code). Yet given that this is a pretty key point, you'd expect that, at the very least, External Method files weren't called "modules" in so many places. Because they're not modules, are they?