Transactional Databases - What You Need to Know
Aug 8th, 2020
What is a transactional database? The short answer is that it’s a storage system which supports all-or-nothing data operations, more commonly called transactions. This means that if any operation in a potentially long list of data operations (e.g., creating, updating, or deleting information) fails for any reason, then all the operations are abandoned, and the database is restored to its state before any of the operations began.
This capability provides reliable storage and processing of information because it prevents bad data operations and thus ensures that the database is always in a correct state. Transactional databases provide guarantees for data integrity, even in the case of system failures, which can be mission critical for many use cases.
Consider the quick and common example of a bank customer using an app on a mobile device to transfer money from one account to another. A complete and correct transaction requires both debiting the origin account and crediting the destination account.
But what if the power to the database server fails after the origin account is debited yet before the destination account is credited? A true transactional database ensures that in such a case the first part of the overall transaction is canceled (or “rolled back”) because the second part didn’t go through, keeping the database in a correct and consistent state despite the failure.
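To make the rollback concrete, here is a minimal sketch of the transfer using Python's built-in sqlite3 module. The accounts table, the account names, and the fail_midway flag are assumptions made up for illustration; the flag only simulates a failure between the debit and the credit, which is a far gentler event than a real power loss.

```python
import sqlite3


def transfer(conn, origin, destination, amount, fail_midway=False):
    """Move `amount` between two accounts as one all-or-nothing unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, origin),
            )
            if fail_midway:
                # Hypothetical stand-in for a failure after the debit.
                raise RuntimeError("simulated failure after debit")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, destination),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("checking", 100), ("savings", 0)]
)
conn.commit()

failed = transfer(conn, "checking", "savings", 40, fail_midway=True)
after_failure = balances(conn)  # the debit was rolled back
succeeded = transfer(conn, "checking", "savings", 40)
after_success = balances(conn)  # both the debit and the credit were applied
print(after_failure, after_success)
```

After the simulated failure the balances are unchanged, because the debit was rolled back along with everything else in the transaction. Only the second call, in which both updates succeed, actually moves the money.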
In this article, we’ll examine database transactions and their importance in more detail, explain the four elements of “ACID compliance” and why they matter, discuss the common difficulties involved with maintaining those elements when data and transactions are distributed in modern systems, and offer some specific suggestions for choosing a transactional database system to meet your needs.
What is a Database Transaction?
Put simply, a database transaction is a unit of work that is performed by a database management system (DBMS) against a given database, performing data operations and updating the underlying files on various storage media. A transactional database will ensure that such units of work are all-or-nothing in the sense that any failure in any operation will cause all the operations to be halted and rolled back, leaving the database in its state prior to when the transaction began.
There are two primary reasons for using transactional databases. The first is that they make units of work reliable under all circumstances. To refer back to our banking example, customers can be assured that when their money is debited from the origin account, it’s going to be credited to the destination account. If anything were to go wrong along the way (e.g., network failures or server failures), customers expect their accounts to remain unchanged rather than simply lose their money to happenstance.
The second reason is that a transactional database as we’ve described is always in a correct and consistent state, regardless of the sort of failures that can occur with all the technologies involved. Nobody wants to store information in a system whose data can become incoherent because invalid operations are allowed, or that can lose information when the server goes down. Not every use case is mission critical, of course, but many are, and in those cases a transactional database can provide great business value.
Of course, not every database system is transactional in nature. Some systems focus solely on recording information that isn’t critical if lost or that doesn’t need to support transactional features. When considering your use case, it’s important to pay close attention to the degree to which you need such features and match your expectations against the degree to which a potential database system adheres to transactional database principles.
The Importance of ACID Compliance
ACID is an acronym for “Atomicity, Consistency, Isolation, and Durability”, a set of principles that ensure database transactions are processed reliably and correctly. The following table explains what these four different elements involve.
Element | Meaning |
Atomicity | All data operations in a transaction complete or roll back in an all-or-nothing fashion. |
Consistency | The information in the database must be semantically meaningful in all cases (e.g., no inserting child data without a valid parent, no empty values for required fields, etc.). |
Isolation | Each transaction must run independently of any other transactions in process at the same time; i.e., information can’t be allowed to “leak” from one transaction to another. |
Durability | Any transaction that completes successfully is recorded in an indelible way, meaning its data can’t be lost in the event of software or even hardware failures. |
Let’s consider a somewhat more complicated example to illustrate why ACID guarantees matter. Imagine another common case in which two persons, A and B, are booking seats in the same row for the same showing at a movie theater. Person A is booking only a single seat, while person B is trying to reserve the entire row for a family outing.
In the event that person A reserves a seat first, person B’s transaction will fail because one of the seats in the online shopping cart has already been reserved and can’t be double booked. This illustrates both atomicity, because when one of person B’s data operations fails they all do, and consistency, because the system won’t allow meaningless data such as two people having the same seat reserved.
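The seat-booking scenario can be sketched in a few lines, again with Python's sqlite3 module. The reservations table and the book_seats helper are hypothetical names invented for this illustration. The primary key on the seat column stands in for the consistency rule that a seat can't be double booked, and the transaction supplies the atomicity.

```python
import sqlite3


def book_seats(conn, customer, seats):
    """Reserve every seat in `seats` for `customer`, or none of them."""
    try:
        with conn:  # one transaction for the whole booking
            for seat in seats:
                conn.execute(
                    "INSERT INTO reservations (seat, customer) VALUES (?, ?)",
                    (seat, customer),
                )
        return True
    except sqlite3.IntegrityError:
        # A seat was already taken; the inserts made so far were rolled back.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations (seat INTEGER PRIMARY KEY, customer TEXT NOT NULL)"
)

a_ok = book_seats(conn, "A", [3])              # person A books a single seat
b_ok = book_seats(conn, "B", [1, 2, 3, 4, 5])  # person B wants the whole row
reserved = conn.execute(
    "SELECT seat, customer FROM reservations ORDER BY seat"
).fetchall()
print(a_ok, b_ok, reserved)
```

Person B's inserts for seats 1 and 2 go through at first, but they are undone as soon as seat 3 violates the constraint. The table ends up holding only person A's reservation.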
Person B’s transaction will be cancelled only if person A’s reservation completes successfully and is written to the database. Until that happens, however, the property of isolation is what allows both persons to be attempting transactions against the same seats at the same time, guaranteeing that both see the contested seat as available until it’s actually reserved.
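Isolation can be observed with two separate connections to the same database file. The sketch below is specific to SQLite in its default journal mode, and the file name and table are assumptions for illustration; other database systems offer different isolation levels and may behave differently in the details.

```python
import os
import sqlite3
import tempfile

SEATS_QUERY = "SELECT seat FROM reservations"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "theater.db")  # hypothetical database file
    person_a = sqlite3.connect(path)
    person_b = sqlite3.connect(path)

    person_a.execute(
        "CREATE TABLE reservations (seat INTEGER PRIMARY KEY, customer TEXT NOT NULL)"
    )
    person_a.commit()

    # Person A starts reserving seat 3 but has not committed yet.
    person_a.execute("INSERT INTO reservations VALUES (3, 'A')")

    # Person B's connection cannot see the uncommitted reservation.
    seen_before_commit = person_b.execute(SEATS_QUERY).fetchall()

    person_a.commit()

    # Once person A commits, the reservation becomes visible to person B.
    seen_after_commit = person_b.execute(SEATS_QUERY).fetchall()

    person_a.close()
    person_b.close()

print(seen_before_commit, seen_after_commit)
```

Until person A commits, person B's connection still sees the seat as available. Nothing from the in-progress transaction leaks across to the other connection.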
And finally, even if the whole reservation system comes crashing down after person A successfully reserves a seat, durability ensures that the correct data will still be there when it’s restarted. This allows person A to print the ticket as needed and enjoy the show no matter what system failures occurred after the transaction completed properly.
How Do Distributed Transactions Work?
Modern applications are increasingly distributed in nature, often globally available, and this makes matters more difficult for transactional databases. The reason is that the ACID guarantees are just as important with distributed systems as with a single piece of database software running on a single server, yet having multiple servers or nodes involved complicates matters significantly.
Think about it. It’s straightforward when a single piece of database software can “decide” to cancel all the other data operations in a transaction as soon as one of them fails. It’s another matter entirely when the database software is running on dozens (or even hundreds) of nodes scattered around the world. Any failure of any data operation on any of those servers requires that the entire transaction be cancelled and rolled back safely everywhere. Similarly, even if the transaction succeeds, the overall system must ensure all the operations are both correct and durable regardless of which server(s) executed them.
As a matter of practice, distributed transactional databases are very difficult to implement properly, but fortunately some vendors have done so. For example, Fauna is able to provide strictly serializable, externally consistent transactions because of its architecture and data storage algorithms.
And unlike other systems, Fauna does not require strict, physical clock synchronization across all servers to provide consistency, which avoids the usual limitations on distance between replica servers and is thus practical for deployment around the world at typical, global Internet latencies. Approaches that do require synchronization can result in failures when systems’ clocks or network traffic differs by a matter of milliseconds, whereas Fauna’s more relaxed requirements don’t suffer from such problems.
This is possible because Fauna offers a transaction engine inspired by Calvin, an approach to achieving fast distributed transactions across partitioned database systems. The Fauna transaction engine makes it possible to achieve “consistency without clocks” courtesy of its distributed transaction protocol. In effect, Fauna decides the order in which transactions should be executed before any database writes take place. The Fauna execution engine then processes them in such a way that the final result is the same as if they’d been processed one at a time in that order.
In effect, you get all the speed and power of distributed transactions executing in parallel on multiple servers yet enjoy all the data-goodness of transactional databases as if they’d been executed serially on a single server.
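The idea of agreeing on an order first and executing afterwards can be illustrated with a toy simulation. To be clear, this is not Fauna's implementation or the actual Calvin protocol. The Transfer, Replica, sequence, and run names are all hypothetical, and the "sequencer" here is just a function that merges per-region batches in a fixed order.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transfer:
    origin: str
    destination: str
    amount: int


@dataclass
class Replica:
    balances: dict = field(default_factory=dict)

    def apply(self, txn):
        """Apply one transfer deterministically; skip it if it would overdraw."""
        if self.balances.get(txn.origin, 0) < txn.amount:
            return False
        self.balances[txn.origin] -= txn.amount
        self.balances[txn.destination] = (
            self.balances.get(txn.destination, 0) + txn.amount
        )
        return True


def sequence(batches):
    """Toy sequencer: merge per-region batches into one agreed-upon log."""
    log = []
    for region in sorted(batches):  # any fixed rule works if every node uses it
        log.extend(batches[region])
    return log


def run(log, initial):
    """Replay the agreed log on a fresh replica and return its final state."""
    replica = Replica(dict(initial))
    for txn in log:
        replica.apply(txn)
    return replica.balances


initial = {"alice": 50, "bob": 0, "carol": 0}
batches = {
    "us": [Transfer("alice", "bob", 40)],
    "eu": [Transfer("alice", "carol", 40)],
}
log = sequence(batches)  # the order is fixed before anything is written
replica_states = [run(log, initial) for _ in range(3)]
print(replica_states)
```

The two transfers conflict, since alice can't afford both. Because every replica replays the same log with the same deterministic rule, all three reach the same final state without comparing clocks.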
Try Fauna for Free
Fauna combines all the flexibility and performance of non-transactional databases with the relational querying and capabilities of transactional databases along with ACID guarantees in a fully distributed fashion. In fact, because of the way Fauna handles its distributed transactions, users can avoid the sort of data anomalies that can occur with other systems. Immortal writes, stale reads, causal reverse, and other such problems can be prevented via strictly serializable multi-region transactions that don't limit the number of keys, documents, or partitions.
It’s also worth noting that Fauna is not the usual hosted “database as a service” (DBaaS) or even some clustered cloud offering, both of which require management. Rather, Fauna is a true “Data API”, meaning developers can simply make calls as needed without any time spent worrying about provisioning or scale, with all the benefits of transactional databases and ACID compliance.
Because it’s a true Data API, Fauna is available instantly as a serverless utility, with no instances to provision and no extensive configuration to manage. Developers need only an account to get started, which costs nothing and offers a generous starting allowance.
In this article, we’ve examined transactional databases, explained the ACID guarantees they provide, illustrated their value for mission critical use cases, discussed how the modern need for distributed transactions complicates the picture, and offered some specific advice for an easy path to ACID compliant data goodness.
In closing, Fauna offers all the value of transactional databases with easily distributed database transactions, automated provisioning, and no-hassle scaling. It’s free to sign up, easy to get started, and offers clear and simple pricing in which you pay only for what you actually use. If you think Fauna could manage your data needs, why not give it a try today?
If you enjoyed our blog and want to work on challenges related to globally distributed systems and serverless databases, Fauna is hiring.