Lately I've been trying to improve how our web application employs transactions (we don't use them enough). While doing this I became aware of this rather general problem. It seems as though database transactions cannot be mixed with exceptions unless great vigilance is practiced. Consider this pseudo code:
function doOperationX() { connection = get_connection() connection.start_transaction() try { updateA(connection)
function updateA(connection) { connection.execute("UPDATE a SET ...") }
function updateB(connection) { connection.execute("UPDATE b SET ...") updateC(connection) }
function updateC(connection) { result = connection.execute("SELECT c.id FROM c INNER JOIN b ON ...") if (result.next()) connection.execute("UPDATE c SET ...") else throw C_DoesNotExist() }
Don't study the pseudo SQL too closely, the problem isn't there. Read the code like this:
1) operationX requires updating of A and, optionally, B. 2) Internally, updating B always requires that C also be updated. All or nothing. 3) Sometimes, C cannot be updated because the database is not in the proper state at this time. (Not an invalid state, mind you) 4) operationX knows that C might not be updatable, so it catches the exception and ignores it.
The problem is:
updateB() leaves the database in an inconsistent state if an exception is thrown out of updateC(). The inconsistency is that table B has been updated but D has not been updated accordingly.
Yet, it's only a problem because operationX caught the exception. Had it let the exception continue up, the finally block would have rolled back the entire transaction quite nicely.
How can I avoid this problem? The only proper solution that I can see is a prolific use of transaction checkpoints (aka savepoints) so that a partial rollback can occur if needed:
function updateB(connection) { success = false savepoint = connection.setSavepoint() try { connection.execute("UPDATE b SET ...") updateD(connection) success = true } finally { if (not success) connection.rollback(savepoint) } }
But this hurts performance considerably and requires programmer vigilance and lots of extra code! The programmer must remember that:
If your function does more than one update it MUST use a savepoint!
That's tough pill to swallow (for me). I try to avoid programming models that require programmer vigilance. All it takes is one groggy Monday...
What do you think? Am I missing a better solution?
From the pseudo code, I gather that if updateB() throws an exception, there is no need to rollback operation done in updateA(). If that is the case, you may consider committing after updateA() and rollbacking operation of updateB() in the catch block; that is easier than doing savepoint.
function doOperationX() { connection = get_connection() connection.start_transaction()