So, I occasionally ponder the D in ACID transactions and wonder what it REALLY means.
Observation #1: Committed Subject To...
When I was working at Tandem in the 1980s, we had a complex (and fascinating) fault-tolerant multiprocessor system with dual-ported disk drives, dual-ported IO controllers, and multiple (2 to 16) processors connected via a message-passing bus. I'm very proud of that part of my career, when I worked on TMF (Transaction Monitoring Facility), which implemented the transaction logging and recovery spread across Tandem Computers' NonStop system. While working there, I first realized the spectrum of commitment that happens over time. When a transaction would commit on TMF, it would get progressively more committed over time:
So, when is the transaction durable? What about a transaction whose commitment is recorded on multiple machines but only in volatile memory (but with enough of them to have as many nines availability as you want)? Hmmm... this durable stuff is annoying (just like all the letters in ACID).
Observation #2: Commit Dependencies and the Event Horizon
I am reminded of IMS (IBM's Information Management System) and its special version called IMS Fast Path. What I'm about to say may be apocryphal but this is how I think it works based on occasional conversations with transactional old-farts...
As I understand it, there was this cool optimization called "internal commit".
IMS worked as a database management system integrated with 3270 block mode terminals. Work happened in three steps:
This has the characteristic that it is IMPOSSIBLE to see the contents of the database other than by looking at the effects of a committed transaction. Unlike most DBMSs today, you could not begin a transaction, examine record-X, and display the results without committing the transaction that LOOKED at record-X. Just the act of reading required a commitment to see the effects outside the system.
So, now what the heck is an internal commit? The idea is to count on the single log buffer in this mainframe-based system. If transaction-T1 is committed and its updates are placed into the transaction log in memory (but not flushed to disk), and transaction-T2 comes along while the system is still running, T2 cannot commit unless T1 commits. T1 will commit if the system remains alive long enough to flush the transaction log to disk. T2 is running on the same system and can only commit using the SAME log, assuming the system stays up long enough for T2 to get its changes written to the log on disk. Because the log is written in order, T1's commit record reaches disk before T2's. If T2 succeeds in getting to disk, T1 most definitely will have committed! This is called a commit dependency --> T2 has a commit dependency on T1.
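Here is a minimal sketch of that commit dependency, assuming a single append-only log that is always flushed in order. The class and method names (`LogBuffer`, `append_commit`, `flush`, `crash`) are mine, made up for illustration; this is not how IMS is actually coded.

```python
class LogBuffer:
    """Hypothetical single log: an in-memory tail plus a prefix on 'disk'."""

    def __init__(self):
        self.records = []   # commit records in append order
        self.flushed = 0    # how many records have reached disk

    def append_commit(self, txn_id):
        """Put a commit record in the in-memory log; return its position."""
        self.records.append(txn_id)
        return len(self.records)  # 1-based position, a stand-in for an LSN

    def flush(self, upto=None):
        """Write the log to disk IN ORDER, up to position `upto`."""
        if upto is None:
            upto = len(self.records)
        upto = min(upto, len(self.records))
        self.flushed = max(self.flushed, upto)

    def is_durable(self, position):
        return position <= self.flushed

    def crash(self):
        """A crash loses every record that never reached disk."""
        del self.records[self.flushed:]


log = LogBuffer()
t1 = log.append_commit("T1")
t2 = log.append_commit("T2")

# Flushing far enough to make T2 durable necessarily covers T1 as well.
log.flush(upto=t2)
assert log.is_durable(t2) and log.is_durable(t1)
```

The only property the sketch relies on is the ordering: there is no way to get T2's commit record onto disk without T1's going first.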
Leveraging these two concepts (the commit dependency and the fact you can't see any effects outside the system other than through a transaction commit), IMS-FastPath would play this cute trick. Say transaction-T1 modifies record-X and is on its way to committing. Once the commit record is in the log buffer in memory, then Record-X could be unlocked. This would be heretical in most systems because a crash might cause the loss of the new value for Record-X (remember, transaction-T1 is not yet durable when it is unlocked). This worked without anomalies because of what I call the event horizon.
An event horizon (my terminology) refers to the ever-increasing scope of knowledge and our ability to leverage knowledge with some assumptions about its propagation. It is OK for transaction-T1's changes to record-X to be unlocked because no one can tell the difference! If you see the new value for record-X, your fate is lashed to the success of transaction-T1. You have a commit dependency on transaction-T1 (either because of the design constraints defined above OR because transaction-T1 is actually committed and durable on disk).
So, for transaction-T2 which is looking at transaction-T1, it appears durable when the changes are in the buffer in main memory due to the event horizon effect. It's like that spy movie: "I could tell you but then I'd have to kill you..." The knowledge doesn't matter if all the impacts of its use are eliminated from the system.
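To make the event horizon concrete, here is a toy sketch under my own assumptions: new values become readable inside the system as soon as the commit record is in memory, but the reply to the outside world is held back until the log is flushed. All names (`EarlyReleaseSketch`, `held_replies`, and so on) are hypothetical, and "disk" is simplified to a dictionary rather than a replayed log.

```python
class EarlyReleaseSketch:
    """Toy model: release locks at in-memory commit, hold replies until flush."""

    def __init__(self):
        self.disk = {}           # record values that are durable
        self.memory = {}         # record values visible to running transactions
        self.log_tail = []       # (txn, writes) with commit records not yet on disk
        self.held_replies = []   # replies waiting behind the flush
        self.sent = []           # replies that have left the system

    def commit(self, txn, writes, reply):
        self.memory.update(writes)                 # unlocked: others may read now
        self.log_tail.append((txn, dict(writes)))
        self.held_replies.append((txn, reply))     # nothing escapes yet

    def read(self, key):
        return self.memory.get(key)

    def flush(self):
        for _txn, writes in self.log_tail:         # applied in log order
            self.disk.update(writes)
        self.log_tail.clear()
        self.sent.extend(self.held_replies)        # now the outside may know
        self.held_replies.clear()

    def crash(self):
        self.log_tail.clear()
        self.held_replies.clear()
        self.memory = dict(self.disk)              # restart from durable state


s = EarlyReleaseSketch()
s.commit("T1", {"X": 1}, "T1 ok")
s.commit("T2", {"Y": s.read("X") + 1}, "T2 ok")    # T2 saw T1's unflushed value
s.crash()

# Both transactions vanish together, and no reply ever left the system.
assert s.sent == [] and s.read("X") is None and s.read("Y") is None
```

T2 saw a value that was never durable, yet nobody outside the system can tell, because everything that depended on that knowledge was wiped out along with it.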
Observation #3: Dialog Semantics and Visibility of Failures
I spent a couple of years working on a feature of SQL Server (shipped in MS SQL Server 2005) called SQL Service Broker. Service Broker defined a notion of a dialog which implements transactionally-consistent, exactly-once, in-order messaging between two "services" which are reified by their state in the database. The notion of dialogs was that the messages would be delivered transactionally, exactly-once, in-the-order-sent, within a timeout window OR a dialog-failure was delivered to the service. SQL Service Broker ONLY provides services whose state is represented in a SQL database and which, hence, count on the durability guarantees of SQL Server.
As you try to understand the semantics of message delivery guarantees, it is essential to think about WHO is being provided with the guarantee and what THEIR durability is. The more we thought about this, the more it was clear that the dialog failed precisely when it LOOKED like it failed. Consider Service-A in a dialog with Service-B. The dialog has a timeout. If Service-A cannot receive a message before the timeout, the Service Broker must give it a Dialog-Failure message. The guarantee of delivery of this message is null and void if Service-A is not around to receive the dialog failure...
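A rough sketch of that guarantee from Service-A's point of view, assuming a logical clock passed in by the caller. The names (`DialogEndpoint`, `DIALOG_FAILURE`, `poll`) are invented for illustration and are not the Service Broker API.

```python
from dataclasses import dataclass, field

DIALOG_FAILURE = "DIALOG-FAILURE"


@dataclass
class DialogEndpoint:
    """Hypothetical view of one side of a dialog with a timeout."""
    timeout: float
    started_at: float = 0.0
    alive: bool = True
    inbox: list = field(default_factory=list)

    def deliver(self, message):
        """A message from the partner arrives (dropped if we are gone)."""
        if self.alive:
            self.inbox.append(message)

    def poll(self, now):
        """What the service observes at logical time `now`."""
        if not self.alive:
            return None               # nobody is here to observe anything
        if self.inbox:
            return self.inbox.pop(0)  # the partner responded in time
        if now - self.started_at >= self.timeout:
            return DIALOG_FAILURE     # it looks failed, so it IS failed
        return None                   # still waiting


a = DialogEndpoint(timeout=5.0)
assert a.poll(now=1.0) is None
assert a.poll(now=5.0) == DIALOG_FAILURE

gone = DialogEndpoint(timeout=5.0, alive=False)
assert gone.poll(now=10.0) is None    # the failure message has no beholder
```

The last two lines are the whole point: the promise to deliver a dialog failure only means something while the endpoint is still around to receive it.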
While Service Broker within MS SQL Server 2005 supports only fully durable services, it is meaningful to support more permutations. Consider Service-A and Service-B, each of which may be either durable or in-memory. The durable flavor means that receiving and/or sending a message occurs only when the change to the service happens on disk in the database (just like changing a record in SQL Server). The in-memory flavor means that the state is just kept in memory; a system failure wipes out the state.
Consider four permutations:
Wow... so this leads us to an interesting observation. In case-1 (durable-to-durable), what if one of the two services is in a triple-data-center redundant high-availability site and the other is on a laptop? It is possible (actually easy) to lose the laptop and have the fact that it is durable on disk be irrelevant. So, what are the semantics of the dialog? What you know is that the partner service (on the other side of the dialog) either responds in time or doesn't. If the partner DOES respond and complete its part of the work (and finish the dialog), the partner may still be blown to smithereens afterwards while you think you have done some cooperative work. Yet, the partner (and its part of the work) are gone! Case-4 (both in-memory) also has the dilemma that you don't know ANYTHING about the partner except whether it has sent you the correct messages. Basically, the only thing you know in ALL the cases is whether the partner responded in a timely fashion. Once the work is completed, you have no REAL guarantee that it has stuck and won't be blown up.
During the dialog, you have a relationship that can detect the failure and amnesia via the time-out and dialog-failure. After the dialog successfully completes, you are disconnected and cannot tell if the partner's state (and work) are lost in a failure. This is intractable and is really a spectrum of durability (from in-memory to "committed subject to thermo-nuclear exchange"). You have a programmatically visible relationship and then you don't!
Durability Is in the Eye of the Beholder
So, let's consider this proposition that "durability is in the eye of the beholder". Who cares if a transaction is durable?? Remember, I am not (in this blog entry) questioning Atomicity, Consistency, or Isolation. If you see ANY effect of the transaction, you see ALL the effects of the transaction. Why did we have this D thing in ACID, anyway??
Well, it turns out the old-farts (including me) doing the old-time transaction systems just kinda' assumed a special case for interacting with the human. We knew that we needed to tell the person using the system "OK... We did it!" but we didn't talk a lot about this being an example of a messaging relationship in which the human is one participant in a two-party messaging pattern. Also, as I am about to discuss joint failure and the destruction of state at both ends of the messaging relationship, it is slightly uncomfortable to talk about the human being obliterated in the same way as I am about to discuss the annihilation of a communicating service. I'm really nicer than that...
If we assume atomicity and/or a long-running relationship like a dialog, there is a window of time during which you can programmatically tell if the remote work is lost in a failure. Once you exit that window, you make assumptions about the remote work persisting (being durable) even when you are NOT keeping tabs on it. There are cases in which the loss of the remote work only occurs when YOU are wiped out... those are convenient because you aren't around to notice the problem. There are other cases in which we just ASSUME that the probability of loss of the remote work is low enough that we ignore the dilemma.
Basically, big, complex, and distributed systems are big, complex, and distributed. We can't get perfect behavior out of them. Something needs to be durable only if I, the partner, am still around to notice the need for it to be durable!
- Pat