This post originated from an RSS feed registered with Web Buzz
by Stuart Langridge.
Original Post: Storing Zeitgeist data in desktopcouch
Feed Title: as days pass by
Feed URL: http://feeds.feedburner.com/kryogenix
Feed Description: scratched tallies on the prison wall
A fun game with Seif Lotfy of the
Zeitgeist project today, to answer
the question: how hard would it be to have my Zeitgeist event log be in
desktopcouch? That makes the logs from all my computers available
on all my computers. This is increasingly important as more stuff
starts going into Zeitgeist -- for example, we're going to start storing lots
of Ubuntu One events in there to make what's happening with Ubuntu One on your
machine more transparent and obvious, if you want to see it. So, after some
conversation where Seif told me it was easy and I scoffed and bet him a beer
that it wasn't...it turned out it wasn't that hard after all.
Zeitgeist has extensions. These aren't brilliantly documented yet, but you
can drop a Python file into .local/share/zeitgeist/extensions and
if it's got the right sort of class in it then that class will get run as a
part of Zeitgeist. Extensions are great for doing things like running some
code every time there's an event which goes into Zeitgeist.
/usr/share/zeitgeist/_zeitgeist/engine/extensions/blacklist.py is
an example extension.
However, we didn't do it quite that way. Because we're taking an action
on every event in Zeitgeist, we don't want to slow the core down. So
instead of actually being an extension that's built into the main Zeitgeist
process, we're an extension which launches a separate subprocess. The
subprocess uses ZeitgeistClient
to get notified whenever any event happens, and then serialises that event
into desktopcouch. Basically, it's an event loop driven by
ZeitgeistClient.install_monitor: every time a new event happens
our function gets called, and that function serialises the event into a
desktopcouch Record and saves it.
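To make the "extension which launches a separate subprocess" idea concrete, here is a minimal sketch. It is not the real gateway code: the Extension base class below is a stand-in for whatever Zeitgeist's engine actually provides, and GatewayLauncher, HELPER_COMMAND and wait are names made up for illustration. The real helper would be a long-running script that calls ZeitgeistClient.install_monitor; here it is a one-line command so the sketch runs anywhere.

```python
import subprocess
import sys


class Extension(object):
    """Stand-in for Zeitgeist's extension base class (an assumption)."""

    def __init__(self, engine):
        self.engine = engine


class GatewayLauncher(Extension):
    """Does no per-event work itself; it only starts a helper process."""

    # Hypothetical helper command. A real gateway would start a script that
    # monitors Zeitgeist and writes each event into desktopcouch.
    HELPER_COMMAND = [sys.executable, "-c", "print('gateway helper started')"]

    def __init__(self, engine):
        Extension.__init__(self, engine)
        # Popen returns immediately, so the engine is not held up while the
        # helper does its work. stdout is captured only so this sketch can
        # show that the helper ran.
        self.helper = subprocess.Popen(self.HELPER_COMMAND,
                                       stdout=subprocess.PIPE)

    def wait(self):
        """Block until the helper exits; return (exit status, output)."""
        out, _ = self.helper.communicate()
        return self.helper.returncode, out


if __name__ == "__main__":
    launcher = GatewayLauncher(engine=None)
    print(launcher.wait())
```

The point of the design is that the engine only pays for one Popen call at startup; everything that happens per event happens in the other process.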
The other half of the equation is getting events from desktopcouch.
Obviously, if you've got more than one machine, then sometimes events that
happened on the other machine will arrive here, and you need to pull
those new records out of desktopcouch, turn them back into event objects, and
push them into the Zeitgeist engine. The way to do this efficiently is by
monitoring desktopcouch's changes feed.
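The two conversions involved (event to record on the way out, record back to event on the way in) can be sketched roughly like this. The Event and Subject types here are simplified stand-ins carrying only a subset of the fields a Zeitgeist event has, RECORD_TYPE is a made-up URL, and the function names are hypothetical; the real code would use Zeitgeist's own event classes and a desktopcouch Record rather than a bare dict.

```python
import json
from collections import namedtuple

# Simplified stand-ins for Zeitgeist's event and subject objects.
Subject = namedtuple("Subject",
                     "uri interpretation manifestation mimetype text")
Event = namedtuple("Event",
                   "timestamp interpretation manifestation actor subjects")

# desktopcouch records carry a record_type URL; this one is hypothetical.
RECORD_TYPE = "http://example.com/zeitgeist-event"


def event_to_doc(event):
    """Flatten an event into a JSON-friendly dict suitable for storing."""
    return {
        "record_type": RECORD_TYPE,
        "timestamp": event.timestamp,
        "interpretation": event.interpretation,
        "manifestation": event.manifestation,
        "actor": event.actor,
        "subjects": [dict(s._asdict()) for s in event.subjects],
    }


def doc_to_event(doc):
    """Rebuild an event from a dict produced by event_to_doc."""
    subjects = [Subject(**s) for s in doc["subjects"]]
    return Event(doc["timestamp"], doc["interpretation"],
                 doc["manifestation"], doc["actor"], subjects)


if __name__ == "__main__":
    original = Event(1286000000000, "interp", "manif", "app://example",
                     [Subject("file:///tmp/a", "i", "m", "text/plain", "a")])
    # Push the dict through JSON to mimic a trip through the database.
    stored = json.loads(json.dumps(event_to_doc(original)))
    print(doc_to_event(stored) == original)
```

Keeping both directions next to each other makes it easy to check that an event survives the round trip unchanged, which is what you want before letting another machine's records into your local log.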
The changes feed
is a core part of CouchDB itself; the way it works is that you open an HTTP
connection to it and that connection lives forever; whenever a record in the
database you're monitoring is changed, added or deleted, a line (actually, a
JSON description) about the change is written to that HTTP connection. So you
just watch that feed forever, and whenever you get told "this record has
changed", you go fetch that record from desktopcouch in the normal way and then
do whatever you want with it. Nicely event-driven; no polling at all, no wakeups
if you don't need them.
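As a rough illustration of consuming that feed, here is a small sketch that turns the lines of a continuous changes feed into (document id, deleted) pairs. It only does the parsing: changed_ids is a name invented here, and the lines would really come from a long-lived HTTP response on the database's _changes URL (requested with feed=continuous), which for desktopcouch also needs the OAuth signing that is left out. Blank lines are skipped because CouchDB can send empty heartbeat lines to keep the connection open.

```python
import json


def changed_ids(lines):
    """Yield (doc_id, deleted) for each change notification in lines."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line:
            continue  # heartbeat newline, nothing changed
        change = json.loads(line)
        if "id" not in change:
            continue  # e.g. a closing {"last_seq": ...} line
        yield change["id"], bool(change.get("deleted", False))


if __name__ == "__main__":
    sample = [
        '{"seq":1,"id":"abc","changes":[{"rev":"1-x"}]}\n',
        '\n',
        '{"seq":2,"id":"def","changes":[{"rev":"2-y"}],"deleted":true}\n',
    ]
    for doc_id, deleted in changed_ids(sample):
        # A real gateway would fetch doc_id from desktopcouch here and
        # push the resulting event into Zeitgeist.
        print(doc_id, deleted)
```

Because the function is a generator over any iterable of lines, the same code works on a test list and on a live socket; it simply blocks in the for loop until the next change arrives.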
Getting at the changes feed from a desktopcouch database is a little more
complex than getting at it from a server CouchDB, but it's doable, and one of
the things we plan to do in the Ubuntu 11.04 development cycle is make this
trivial to do: you'll just call
databaseobject.glib_callback_for_changes(my_callback_function)
and your callback will be called every time there's a change in the database.
(The code below contains a load of complex OAuth stuff to derive a validated
URL for the _changes feed; that's what we're going to wrap up in that one
line.)
I was pretty pleased to see how simple it is to interact with Zeitgeist, and
I plan for us to work more with the Zeitgeist team. Thanks especially to Seif
who talked me through a lot of this, and to whom I owe a pint or something.
The code is desktopcouch_gateway.py;
drop it in .local/share/zeitgeist/extensions, and then restart the
Zeitgeist server with zeitgeist-daemon --replace.