Authors: John MacCormick, Chris Bishop
I already mentioned that database practitioners are obsessed with consistency, but at this point it may not seem like such a big deal. After all, does it really matter if Jingyi is recorded as being a friend in one place and friendless in another place? We could even imagine an automated tool that scans through the database every so often, looking for discrepancies like this and fixing them. In fact, tools like this do exist and can be used in databases where consistency is of secondary importance. You may have even encountered an example of this yourself, because some operating systems, when rebooted after a crash, check the entire file system for inconsistencies.
But there do exist situations in which an inconsistency is genuinely harmful and cannot be corrected by an automated tool. A classic example is the case of transferring money between bank accounts. Here's another simple database:
Suppose Zadie has requested to transfer $200 from her checking account to her savings account. Just as in the previous example, this is going to require two rows to be updated, using a sequence of two separate disk operations. First, Zadie's checking balance will be reduced to $600, then her savings balance will be increased to $500. And if we are unlucky enough to experience a crash between these two operations, the database will look like this:
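To make the failure mode concrete, here is a minimal sketch in Python of the naive approach, using SQLite in autocommit mode so that each update is saved on its own. The file name, table layout, and starting balances are invented for illustration; the essential point is only that the transfer is performed as two independent writes.

```python
import sqlite3

# Hypothetical file name, table layout, and balances, purely for illustration.
conn = sqlite3.connect("bank.db", isolation_level=None)  # autocommit: each statement is saved on its own
conn.execute("CREATE TABLE IF NOT EXISTS accounts (owner TEXT, kind TEXT, balance INTEGER)")
conn.execute("DELETE FROM accounts")
conn.execute("INSERT INTO accounts VALUES ('Zadie', 'checking', 800), ('Zadie', 'savings', 300)")

# The naive transfer: two independent writes, saved separately.
conn.execute("UPDATE accounts SET balance = balance - 200 "
             "WHERE owner = 'Zadie' AND kind = 'checking'")
# <-- a crash here leaves $600 + $300 = $900: the $200 has vanished
conn.execute("UPDATE accounts SET balance = balance + 200 "
             "WHERE owner = 'Zadie' AND kind = 'savings'")
```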
In other words, this is a complete disaster for Zadie: before the crash, Zadie had a total of $1100 in her two accounts, but now she has only $900. She never withdrew any money—but somehow, $200 has completely vanished! And there is no way to detect this, because the database is perfectly self-consistent after the crash. We have encountered a much more subtle type of inconsistency here: the new database is inconsistent with its state before the crash.
It's worth investigating this important point in more detail. In our first example of inconsistency, we ended up with a database that was self-evidently inconsistent:
A friends with B, but B not friends with A.
This type of inconsistency can be detected merely by examining the database (although the detection process could be very time-consuming, if the database contains millions—or even billions—of records). In our second example of inconsistency, the database was left in a state that is perfectly plausible, when considered as a snapshot taken at a particular time. There is no rule that states what the balances of the accounts must be, nor any required relationship between those balances. Nevertheless, we can observe inconsistent behavior if we examine the state of the database over time. Three facts are pertinent here: (i) before initiating her transfer, Zadie had $1100; (ii) after the crash, she had $900; (iii) in the intervening period, she did not withdraw any money. Taken together, these three facts are inconsistent, but the inconsistency cannot be detected by examining the database at a particular point in time.
To avoid both types of inconsistency, database researchers came up with the concept of a “transaction”—a set of changes to a database that must all take place if the database is to be left consistent. If some, but not all, of the changes in a transaction are performed, then the database might be left inconsistent. This is a simple but extremely powerful idea. A database programmer can issue a command like “begin transaction,” then make a bunch of interdependent changes to the database, and finish with “end transaction.” The database will guarantee that the programmer's changes will all be accomplished, even if the computer running the database crashes and restarts in the middle of the transaction.
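Modern database libraries expose exactly this pattern. As a minimal sketch, here is the same hypothetical transfer wrapped in a transaction using Python's built-in sqlite3 module (the table and column names are the invented ones from the earlier sketch):

```python
import sqlite3

conn = sqlite3.connect("bank.db")   # same hypothetical database as above

# Group the two interdependent updates into one transaction: either both
# changes reach the database, or (after a crash and restart) neither does.
with conn:   # "begin transaction"
    conn.execute("UPDATE accounts SET balance = balance - 200 "
                 "WHERE owner = 'Zadie' AND kind = 'checking'")
    conn.execute("UPDATE accounts SET balance = balance + 200 "
                 "WHERE owner = 'Zadie' AND kind = 'savings'")
# leaving the "with" block commits the changes -- "end transaction"
```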
To be absolutely correct, we should be aware that there is another possibility too: it's possible that after a crash and restart, the database will return to the exact state it was in before the transaction began. But if this happens, the programmer will receive a notification that the transaction failed and must be resubmitted—so no harm is done. We'll be discussing this possibility in greater detail later, in the section about “rolling back” transactions. But for now, the crucial point is that the database remains consistent regardless of whether a transaction is completed or rolled back.
From the description so far, it may seem that we are obsessing unnecessarily over the possibility of crashes, which are, after all, very rare on modern operating systems running modern application programs. There are two responses to this. First, the notion of “crash” as it applies here is rather general: it encompasses any incident that might cause the computer to stop functioning and thus lose data. The possibilities include power failure, disk failure, other hardware malfunctions, and bugs in the operating system or application programs. Second, even if these generalized crashes are rather rare, some databases cannot afford to take the risk: banks, insurance companies, and any other organization whose data represents actual money cannot afford inconsistency in their records, under any circumstances.
The simplicity of the solution described above (begin a transaction, perform as many operations as necessary, then end the transaction) might sound too good to be true. In fact, it can be achieved with the relatively simple “to-do list” trick described next.
The To-Do List Trick
Not all of us are lucky enough to be well organized. But whether or not we are well organized ourselves, we've all seen one of the great weapons wielded by highly organized people: the “to-do” list. Perhaps you are not a fan of making lists yourself, but it's hard to argue with their usefulness. If you have 10 errands to get done in one day, then writing them down—preferably in an efficient ordering—makes for a very good start. A to-do list is especially useful if you get distracted (or, shall we say, “crash”?) in the middle of the day. If you forget your remaining errands for any reason, a quick glance at the list will remind you of them.
Database transactions are achieved using a special kind of to-do list. That's why we'll call it the “to-do list” trick, although computer scientists use the term “write-ahead logging” for the same idea. The basic idea is to maintain a log of actions the database is planning to take. The log is stored on a hard drive or some other permanent storage, so information in the log will survive crashes and restarts. Before any of the actions in a given transaction are performed, they are all recorded in the log and thus saved to the disk. If the transaction completes successfully, we can save some space by deleting the transaction's to-do list from the log. So Zadie's money-transfer transaction described above would take place in two main steps. First, the database table is left untouched and we write the transaction's to-do list in the log:
After ensuring the log entries have been saved to some permanent storage such as a disk, we make the planned changes to the table itself:
Assuming the changes have been saved to disk, the log entries can now be deleted.
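To make the two main steps concrete, here is a toy write-ahead log sketched in Python. The file names and entry format are invented for illustration, and a real database's log is far more elaborate, but the order of operations is the essential point: write the to-do list durably first, then update the table, then clear the log.

```python
import json, os

LOG = "todo_log.json"       # invented file names for this sketch
TABLE = "accounts.json"

def save(path, data):
    # Write to a temporary file, force it to disk, then rename it into
    # place, so a crash never leaves the file half-written.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

# Step 1: write the transaction's to-do list to the log, leaving the
# table untouched. Each "set" entry records the old and new balances.
todo = [
    "begin transaction",
    ["set", "Zadie checking", 800, 600],   # change checking from $800 to $600
    ["set", "Zadie savings", 300, 500],    # change savings from $300 to $500
    "end transaction",
]
save(LOG, todo)

# Step 2: make the planned changes to the table itself.
table = {"Zadie checking": 800, "Zadie savings": 300}
for entry in todo:
    if isinstance(entry, list):
        _, account, old, new = entry
        table[account] = new
save(TABLE, table)

# Step 3: the transaction is complete, so the log can be cleared.
save(LOG, [])
```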
But that was the easy case. What if the computer crashes unexpectedly in the middle of the transaction? As before, let's assume the crash occurs after Zadie's checking account has been debited, but before her savings account is credited. The computer reboots and the database restarts, finding the following information on the hard drive:
Now, the computer can tell that it may have been in the middle of a transaction when it crashed, because the log contains some information. But there are four planned actions listed in the log. How can we tell which ones have already been performed on the database and which ones remain to be done? The answer to this question is delightfully simple: it doesn't matter! The reason is that every entry in a database log is constructed so that it has the same effect whether it is performed once, twice, or any other number of times.
The technical word for this is idempotent, so a computer scientist would say that every action in the log must be idempotent. As an example, take a look at entry number 2, "Change Zadie checking from $800 to $600." No matter how many times Zadie's balance is set to $600, the final effect will be the same. So if the database is recovering after a crash and it sees this entry in the log, it can safely perform the action without worrying about whether it was already performed before the crash too.
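The distinction is easy to see in a few lines of code: a log entry that records the resulting balance can be replayed any number of times, whereas an entry that records only the change cannot. (The dollar amounts are the ones assumed in the running example.)

```python
balance = 800

# Idempotent: the log entry records the new value.
for _ in range(3):
    balance = 600          # replayed three times, the answer is still $600

# Not idempotent: the log entry records only the change.
balance = 800
for _ in range(3):
    balance -= 200         # replayed three times, we get $200 instead of $600
```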
Thus, when recovering after a crash, a database can just replay the logged actions of any complete transactions. And it's easy to deal with incomplete transactions too. Any set of logged actions that doesn't finish with an “end transaction” entry simply gets undone in reverse order, leaving the database as if the transaction had never begun. We'll return to this notion of “rolling back” a transaction in the discussion of replicated databases on page 134.
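Continuing the toy log sketched above (with the same invented file names and entry format), a recovery routine along these lines might run at restart:

```python
import json

def recover(log_path="todo_log.json", table_path="accounts.json"):
    with open(log_path) as f:
        log = json.load(f)
    with open(table_path) as f:
        table = json.load(f)

    if not log:
        return table          # the log is empty: no transaction was in flight

    sets = [e for e in log if isinstance(e, list)]      # the "set" entries
    if "end transaction" in log:
        # Complete transaction: replay every action. Each entry is
        # idempotent, so it doesn't matter which ones already ran.
        for _, account, old, new in sets:
            table[account] = new
    else:
        # Incomplete transaction: undo the actions in reverse order,
        # restoring each account to its old balance.
        for _, account, old, new in reversed(sets):
            table[account] = old

    # (A real system would now save the repaired table and clear the log.)
    return table
```

Because every "set" entry in this sketch records both the old and the new balance, the same entries serve for replaying a complete transaction and for rolling back an incomplete one.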
Atomicity, in the Large and in the Small
There is another way of understanding transactions: from the point of view of the database user, every transaction is atomic. Although physicists have known how to split atoms for many decades, the original meaning of "atomic" came from Greek, where it means "indivisible." When computer scientists say "atomic," they are referring to this original meaning. Thus, an atomic transaction cannot be divided into smaller operations: either the whole transaction completes successfully, or the database is left in its original condition, as if the transaction had never been started.