Authors: John MacCormick,Chris Bishop
Let's now examine how the prepare-then-commit trick works for a replicated database. The figure on the next page demonstrates the idea. Typically, one of the replicas is the “master” that coordinates the transaction. To be specific, suppose there are three replicas,
A
, B, and
C
, with
A
being the master. Suppose the database needs to execute a transaction that inserts a new row of data into a table. The prepare phase begins with
A
locking that table, then writing the new data into its write-ahead log. At the same time,
A
sends the new data to
B
and
C
, which also lock their own copies of the table and write the new data in their logs.
B
and
C
then report back to
A
on whether they succeeded or failed in doing this. Now the second phase begins. If any of A, B, or
C
encountered a problem (such as running out of disk space or failing to lock the table), the master
A
knows that the transaction must be rolled back and informs all replicas of this—see the figure on page 140. But if all the replicas reported success from their prepare stages,
A
sends a message to each replica confirming the transaction, and the replicas then complete it (as in the figure on the next page).
So far we have two database tricks at our disposal: the to-do list trick and the prepare-then-commit trick. What do they buy us? By combining the two tricks, your bank—and any other online entity—can implement a replicated database with atomic transactions. And this permits simultaneous, efficient service to thousands of customers, with essentially zero chance of any inconsistency or data loss. However, we have not yet looked into the heart of the database: how is the data structured, and how are queries answered? Our final database trick will provide some answers to these questions.
RELATIONAL DATABASES AND THE VIRTUAL TABLE TRICK
In all of our examples so far, the database has consisted of exactly one table. But the true power of modern database technology is unleashed in databases that have multiple tables. The basic idea is that each table stores a different set of information, but that entities in the various tables are often connected in some way. So a company's database might consist of separate tables for customer information, supplier information, and product information. But the customer table might mention items in the product table, because customers order products. And perhaps the product table will mention items in the supplier table, because products are manufactured from the suppliers' goods.
The prepare-then-commit trick: The master replica,
A
, coordinates two other replicas (B,
C)
to add some new data to the table. In the prepare phase, the master checks whether all replicas will be able to complete the transaction. Once it gets the all clear, the master tells all replicas to commit the data.
The prepare-then-commit trick with rollback: The top panel of this figure is exactly the same as in the previous figure. But during the prepare phase, one of the replicas encounters an error. As a result, the bottom panel is an “abort” phase in which each replica must roll back the transaction.
Let's take a look at a small but real example: the information stored by a college, detailing which students are taking which courses. To keep things manageable, the example will have only a handful of students and courses, but hopefully it will be clear that the same principles apply when the amount of data is much larger.
First, let's look at how the data might be stored in the simple, one-table approach we've been using so far in this chapter. This is shown in the top panel of the figure on the following page. As you can see, there are 10 rows in this database and 5 columns; a simple way of measuring the amount of information in the database is to say there are 10 × 5 = 50 data items in the database. Spend a few seconds now studying the top panel of the figure on the next page more closely. Is there anything that irks you about the way this data is stored? For instance, can you see any unnecessary repetition of data? Can you think of a more efficient way of storing the same information?
You probably realized that a lot of information about each course is duplicated for each student that takes the course. For example, three students take ARCH101, and the detailed information about this course (including its title, instructor, and room number) is repeated for each of the three students. A much more effective way of storing this information is to use two tables: one to store which courses are taken by which students, and another to store the details about each course. This two-table approach is shown in the bottom panel of the figure on the following page.
We can immediately see one of the advantages of this multitable approach: the total amount of storage required is reduced. This new approach uses one table with 10 rows and 2 columns (i.e., 10? 2 = 20 items), and a second table with 3 rows and 4 columns (i.e., 3 × 4 = 12 items), resulting in a total of 32 items. In contrast, the one-table approach needed 50 items to store exactly the same information.
How did this saving come about? It comes from the elimination of repeated information: instead of repeating the course title, instructor, and room number for each course taken by each student, this information is listed exactly once for each course. We have sacrificed something to achieve this, though: now the course numbers appear in two different places, since there is a “course number” column in both tables. So we have traded a
large
amount of repetition (of the course details) for a
small
amount of repetition (of the course numbers). Overall, this works out to be a good deal. The gains in this small example are not huge, but you can probably see that if there are hundreds of students taking each course, the storage savings from this approach would be enormous.
Top: Single-table database for students' courses.
Bottom: The same data stored more efficiently, in two tables.
There is another big advantage of the multitable approach. If the tables are designed correctly, then changes to the database can be made more easily. For example, suppose the room number for MATH314 has changed from 560 to 440. In the one-table approach (top of the figure on the previous page), four separate rows would need to be updated—and, as we discussed earlier, these four updates would need to be wrapped in a single transaction to ensure that the database remains consistent. But in the multitable approach (bottom of the figure on the facing page), only one change is required, updating a single entry in the table of course details.
Keys
It's worth pointing out here that, while this simple student-courses example is most efficiently represented using only two tables, real databases often incorporate many tables. It is easy to imagine extending our student-courses example with new tables. For example, there could be a table containing details for each student, such as a student ID number, phone number, and home address. There could be a table for each instructor, listing e-mail address, office location, and office hours. Each table is designed so that most of its columns store data that is not repeated anywhere else—the idea is that whenever details about a certain object are required, we can “look up” those details in the relevant table.
In database terminology, any column that is used to “look up” details in a table is called a
key.
For example, let's think about how we would find out the room number for Luigi's history class. Using the single-table approach of the upper panel of the figure on the previous page, we just scan the rows until we find Luigi's history class, look across to the room number column, and observe the answer, which in this case is 851. But in the multitable approach of the same figure's lower panel, we initially scan the first table to find the course number of Luigi's history class—this turns out to be “HIST256.” Then we use “HIST256” as a
key
in the other table: we look up the details for this course by finding the row containing “HIST256” as its course number, then move across that row to find the room number (again, 851). This process is shown in the figure on the following page.
The beauty of using keys like this is that databases can look up keys with superb efficiency. This is done in a similar fashion to the way a human looks up a word in a dictionary. Think about how you would go about finding the word “epistemology” in a printed dictionary. Naturally, you would
not
start at the first page and scan through every entry looking for “epistemology.” Instead, you quickly narrow in on the word by looking at the page headings, initially turning the pages in large clumps and gradually reverting to smaller clumps as you get close to your goal. Databases look up keys using the same technique, but they are even more efficient than humans. This is because the database can precalculate the “clumps” of pages that will be turned and keep a record of the headings at the start and end of each clump. A set of precalculated clumps for fast key lookup is known in computer science as a
B-tree.
The B-tree is yet another crucial and ingenious idea underpinning modern databases, but a detailed discussion of B-trees would, unfortunately, lead us too far afield.
Looking up data using a key: To find out the room number for Luigi's history course, we first find the relevant course number in the left-hand table. This value, “HIST256,” is then used as a key in the other table. Because the column of course numbers is sorted in alphabetical order, we can find the correct row very quickly, then obtain the corresponding room number (851).
The Virtual Table Trick
We are nearly ready to appreciate the main ingenious trick behind modern multitable databases. The basic idea is simple: although all of a database's information is stored in a fixed set of tables, a database can generate completely new, temporary tables whenever it needs to. We'll call these “virtual tables” to emphasize the fact that they are never really stored anywhere—the database creates them whenever they are needed to answer a query to the database and then immediately deletes them.
A simple example will demonstrate the virtual table trick. Suppose we start with the database of the lower panel of the figure on page 142, and a user enters a query asking for the names of all students taking classes from Professor Kirby. There are actually several different ways a database can proceed with this query; we'll just examine one of the possible approaches. The first step is to create a new virtual table listing students and instructors for all courses. This is done using a special database operation called a
join
of two tables. The basic idea is to combine each row of one table with each corresponding row of the other table, where the correspondence is established by a key column that appears in both tables. For example, when we join the two tables of the bottom panel of the figure on page 142 using the “course number” column as the key, the result is a virtual table exactly like the one in the figure's top panel—each student name is combined with all of the details for the relevant course from the second table, and these details are looked up using the “course number” as a key. Of course, the original query was about student names and instructors, so we don't need any of the other columns. Luckily, databases include a
projection
operation that lets us throw away columns we are not interested in. So after the join operation to combine the two tables, followed by a projection operation to eliminate some unnecessary columns, the database produces the following virtual table: