Distributed systems involve multiple components located in different operating environments, coordinating their actions to be able to behave – from the user's or from another system's point of view – as a single coherent system.
Each component of a distributed system has its own data repository – commonly, its own database.
This becomes a challenge when conducting transactions, since, for the distributed system to behave as a coherent unit, the transactions in a distributed system must abide by the same principles as the transactions in a non-distributed system.
In the following, we'll see what these principles are and how we can put them to work.
1. The ACID principles
The acronym ACID summarizes the qualities that every transaction must have: Atomic, Consistent, Isolated, and Durable.
- Atomic: the transaction must be completed in its entirety or not at all. A transaction cannot be partially completed.
- Consistent: the state of the data must always be valid. In the event that a transaction fails, its operations are reverted to the last valid state before the start of the transaction.
- Isolated: operations on uncommitted data are kept isolated from other transactions.
- Durable: once committed, the results of the transaction persist, even in the event of an unforeseen system failure.
Here, you can find a more detailed explanation on ACID.
2. How to maintain ACID in a distributed system
To comply with atomicity, all operations of the distributed transaction must have the same ending: committed or aborted. It is not permitted that some operations end with commit
and others with abort
.
However, in distributed systems there usually is no direct communication between different databases. So how can you ensure atomicity in such a scenario?
The typical way is to use a voting protocol: there is an entity which acts as coordinator of the distributed transaction, collecting the 'votes' from each component on whether to commit or abort the entire transaction.
How does this system work?
If all the 'votes' are in favor of committing, the coordinator gives the order to commit to each of the components. If there is at least one vote against, the coordinator gives the order to abort all the components.
3. Two-phase commit protocol
The voting protocol requires the commit to be done in two phases:
- voting phase or preparation phase;
- commit phase.
They work as shown below:
During the preparation or voting phase, the transaction coordinator sends a canCommit
request to each of the involved databases.
Upon receiving the request from the coordinator, those databases reply with their positive or negative vote. But before voting, they must prepare to commit their part of the transaction, saving the affected objects in some persistent storage, in order to guarantee they will be able to commit their part.
Then, in the commit phase, the coordinator collects the votes.
If all the votes are positive, it sends a doCommit
request to each of the databases. If there are any negative votes, the coordinator sends a doAbort
request to all that voted positive. Upon receiving the coordinator’s request, the database engines that voted positive in the first phase act as requested: commit or abort.
The protocol is simple and withstands certain fault conditions. For example:
- If any of the database engines fails before they can cast their positive vote, the coordinator will not receive the vote, which becomes a negative vote when the timeout period expires.
- If any of the databases fail after voting, they will have saved the objects in persistent storage, so the faulty server can restart and recover the changes previously saved.
4. Pros and Cons of 2PC
Just like anything else, the Two-phase Commit Protocol (2PC) has advantages and disadvantages. The main advantage is it ensures that the information is available and consistent and all databases in a distributed system are kept in sync.
The disadvantage of 2PC is that it is a blocking protocol, which means that if an operation fails on one of the databases involved in a transaction, the progress of the transaction will be slowed down until the failed operation recovers.
Also, if the transaction coordinator fails, then the databases are left in an inconsistent state until the coordinator recovers. These problems are mitigated by defining timeout limits to prevent a blocked operation from blocking an entire transaction indefinitely.
Another disadvantage is that the speed of the 2PC protocol is always that of the slowest component. Since the protocol must wait for the responses from all the databases involved, if one of them operates slowly, it will delay all the others.
5. The XA standard
The XA (eXtended Architecture) standard is an implementation of the 2PC protocol written by the Open Group, which is used by most database engines.
According to the XA definitions, a distributed transaction is called a global transaction, and the individual transactions that each database performs within a distributed transaction are called local transactions or branch transactions.
There are several elements in XA transactions:
- the application program (AP);
- the transaction coordinator or transaction manager (TM);
- one or more resource managers (RM).
How does this flow work?
- The AP is in charge of signaling the start of the transaction, accessing the resources, and signaling the end of the transaction.
- The RM is responsible for managing the databases shared by a server for use by clients.
- The TM is responsible for controlling global transactions and interacting with RMs, telling them when to commit, roll back or do a recovery. It is also in charge of monitoring the execution of distributed transactions.
The XA specification focuses primarily on the interaction between RM and TM, defining the commands for the TM to send orders to the RMs and receive responses from them.
6. Transaction Identifier
XA transactions use a global transaction id and a local transaction id (called xid) for each XA resource.
Each XA resource enlists itself in the XA Manager by invoking the start (xid) method. This method informs the manager that the resource is ready for operations.
To carry out the first phase of the 2PC protocol, the manager invokes the prepare (xid) method, which requests the OK or Abort vote to each XA resource.
- If all XA resources vote OK, the manager invokes the commit (xid) method on each resource.
- If any resource vote Abort, the manager invokes the rollback (xid) method on each resource.
To end the transaction, the TM invokes the end (xid) method on each resource.
7. XA with MySQL
In MySQL, support for XA transactions is available starting with version 5.0.
The XA implementation in MySQL allows a MySQL server to act as an RM that carries out XA transactions within a global transaction. The TM acts as a client program that connects to the MySQL server.
As a default, commits are performed automatically by MySQL on operations that are not executed within a transaction. To change this behavior, you can disable the autocommit feature by issuing the command:
SET autocommit = 0;
Broadly speaking, the operation of XA in MySQL is similar to the general operation of 2PC. Here we can see the specific commands that are used in the interaction between the TM and a MySQL server:
Now let’s look at the commands to start a branch transaction on an individual MySQL database:
XA START
starts a transaction and associates it with a global identifier.XA END
specifies the end of the active transaction.XA PREPARE
prepares the transaction for committing.XA COMMIT
executes the commit and ends the transaction that was previously prepared.XA ROLLBACK
rolls back the transaction and ends it.XA RECOVER
displays information about all prepared transactions.
When the XA START
command is executed, the transaction changes to the ACTIVE state. After that, all commands issued to the MySQL server will be put within the active transaction. In the following example, we are initiating an XA transaction named test and inserting a row into a table:
XA START 'test';
INSERT INTO MyTable VALUES (1, 'Sample Data');
Take into account that, when a global transaction is active – as in this example – you won’t be allowed to execute statements that trigger an implicit commit, like a table creation or a table drop. If you try to do so, you will get an error telling you that the command cannot be executed.
Once all the commands are executed, the XA END
command puts the transaction in IDLE state.
XA END 'test';
For transactions in the IDLE state, you can execute an XA PREPARE
command, which puts them in the PREPARED state.
XA PREPARE 'test';
Finally, transactions in the PREPARED state can be completed with XA COMMIT
or XA ROLLBACK
, as such:
XA COMMIT 'test';
or
XA ROLLBACK 'test';
8. Transaction isolation
One thing to keep in mind before using XA transactions on MySQL is to define the level of transaction isolation.
Transaction isolation is the I of ACID. When there are multiple transactions making changes and queries simultaneously in a database, you can fine-tune the balance between performance, consistency and reliability by choosing the best isolation level option.
The InnoDB storage engine lets you choose between the four isolation levels defined by the 1992 SQL standard:
- READ UNCOMMITTED;
- READ COMMITTED;
- REPEATABLE READ;
- SERIALIZABLE.
The default value in InnoDB is REPEATABLE READ, which ensures a high level of consistency and ACID-compliance. In situations where strict consistency and repeatable results are less important than minimizing crashes, READ COMMITTED or READ UNCOMMITTED can be used to achieve performance improvements.
In XA transactions, as well as in situations where you need to troubleshoot concurrency problems or deadlocks, it is recommended to use the SERIALIZABLE option, which imposes even stricter rules than REPEATABLE READ.
9. 2PC Alternatives
The commit phase of 2PC – and therefore of its derivative XA – is the weak point of the protocol, since there may be situations in which the principle of atomicity is violated or that the intervening databases are locked indefinitely.
For this reason, other protocols have emerged that seek to overcome this flaw.
The Three-Phase Commit (3PC) approach adds an additional state called Prepared to Commit. If the coordinator fails before sending the command to prepare to commit, the databases assume that the operation was aborted.
The coordinator only sends the doCommit
command when all the databases have reported that they are Prepared to Commit. This eliminates the risk that some database commits its transaction before all the others are informed of the decision to commit.
Linear Commit is another approach in which the messages from the coordinator to the databases are passed in a chained manner, instead of being the coordinator that sends orders to each database and receives their responses.
The coordinator is first-in-line and should only exchange messages with the first database, which in turn communicates with the second, the latter with the third, and so on until the last one. This scheme reduces the number of messages and prevents the coordinator from centralizing all communication.
One last approach, called decentralized 2PC, removes the responsibility of centralizing the exchange of messages from the coordinator. In this protocol, the databases and the coordinator all communicate with each other. Therefore, the databases do not depend on the coordinator giving them the order to commit or abort, since they all know what all the others reported.
Despite its shortcomings, XA is the standard adopted by all RDBMS, making it the safest option when choosing a protocol for distributed database transactions.