Distributed Transactions #
Background #
Database transactions must satisfy the four ACID properties: Atomicity, Consistency, Isolation, and Durability.
- Atomicity means that the transaction is executed as a whole, either all or nothing;
- Consistency means that a transaction should ensure that data changes from one consistent state to another consistent state;
- Isolation means that when multiple transactions are executed concurrently, the execution of one transaction should not affect the execution of other transactions;
- Durability means that the data modified by a committed transaction is persisted.

Within a single data node, a transaction can only control access to a single database resource; this is called a local transaction. Almost all mature relational databases provide native support for local transactions. In distributed application environments based on microservices, however, more and more scenarios require that access to multiple services and their corresponding database resources be incorporated into the same transaction. Distributed transactions arose to meet this need.

Relational databases provide native ACID support for local transactions, but in distributed scenarios that support becomes a constraint on system performance. The focus of distributed transactions is how to make databases satisfy ACID in distributed scenarios, or how to find a suitable alternative.
Transaction Categories #
Local Transactions #
Each data node manages its own transactions without enabling any distributed transaction manager. The nodes do not coordinate or communicate with each other, and no node knows whether the transactions on other data nodes succeeded. Local transactions incur no performance loss, but they guarantee neither strong consistency nor eventual consistency across data nodes.
Two-phase commit #
The earliest distributed transaction model based on the XA protocol is the X/Open Distributed Transaction Processing (DTP) model proposed by the X/Open consortium, often referred to simply as the XA protocol. Distributed transactions based on the XA protocol are minimally invasive to the business. Their biggest advantage is transparency: users can work with XA-based distributed transactions as if they were local transactions. The strict guarantee of ACID properties is a double-edged sword, however. All required resources must be locked while the transaction executes, which makes XA better suited to short transactions with deterministic execution times. For long transactions, holding data exclusively for the entire duration of the transaction significantly degrades concurrency in business systems that rely on hot data. Distributed transactions based on the XA protocol are therefore not the best choice in highly concurrent, performance-first scenarios.
Flexible Transactions #
A transaction that satisfies the ACID properties is called a rigid transaction, while a transaction based on the BASE properties (Basically Available, Soft state, and Eventually consistent) is called a flexible transaction.
- Basically available means that the participants of a distributed transaction need not all be online at the same time;
- Soft state allows a certain delay in updating the system state, which is not necessarily perceptible to the client;
- Eventual consistency is usually guaranteed by means of message passing.

ACID transactions have high isolation requirements: all resources must be locked during transaction execution. The idea of flexible transactions, on the other hand, is to move the mutually exclusive locking operation up from the resource level to the business level through business logic, relaxing the requirement for strong consistency in exchange for higher system throughput. Neither ACID-based strongly consistent transactions nor BASE-based eventually consistent transactions are silver bullets; each works best only in the scenarios it suits. The table below compares them in detail to help developers make technology selections.
| | Local transactions | Two-phase (three-phase) transactions | Flexible transactions |
|---|---|---|---|
| Business transformation | None | None | Implementation of related interfaces |
| Consistency | Not supported | Supported | Eventual consistency |
| Isolation | Not supported | Supported | Guaranteed by the business side |
| Concurrent performance | No effect | Severe degradation | Slight degradation |
| Suitable scenario | Inconsistency handled by the business side | Short transactions & low concurrency | Long transactions & high concurrency |
Core Concepts #
XA Transactions #
Two-phase commit uses the AP (Application), TM (Transaction Manager), and RM (Resource Manager) roles abstracted from the DTP model defined by X/Open to ensure strong consistency of distributed transactions. The TM and RM communicate with each other in both directions over the XA protocol. Compared with a traditional local transaction, an XA transaction adds a prepare phase: besides passively accepting the commit command, the database can also tell the caller whether the transaction can be committed. The TM collects the prepare results of all branch transactions and performs an atomic commit at the end to ensure strong consistency of the transaction. The following diagram shows the two-phase commit model.
Flexible Transactions #
Flexible transactions were first mentioned in a paper published in 2008, which advocated relaxing the requirement for strong consistency in favor of eventual consistency in order to increase the concurrency of transaction processing. TCC and Saga are two common implementations. DBPlusEngine integrates SEATA as a solution for flexible transactions.
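To make the idea of business-level locking more concrete, the following is a minimal in-memory sketch of the TCC (Try-Confirm-Cancel) pattern. It is an illustration only and not Seata or DBPlusEngine code: the `TccSketch` class, the `Account` type, and all method names are hypothetical. Try reserves funds in a `frozen` field instead of holding a database lock, Confirm makes the reservation permanent, and Cancel releases it.

```java
/** Hypothetical illustration of Try-Confirm-Cancel; not a Seata or DBPlusEngine API. */
final class TccSketch {

    static final class Account {
        int available; // funds that new transactions may reserve
        int frozen;    // funds reserved by Try, awaiting Confirm or Cancel

        Account(int available) {
            this.available = available;
        }

        /** Try: reserve the amount at the business level; no database lock is held. */
        boolean tryDebit(int amount) {
            if (available < amount) {
                return false;
            }
            available -= amount;
            frozen += amount;
            return true;
        }

        /** Confirm: the reserved amount leaves the account for good. */
        void confirmDebit(int amount) {
            frozen -= amount;
        }

        /** Cancel: the reserved amount becomes available again. */
        void cancelDebit(int amount) {
            frozen -= amount;
            available += amount;
        }
    }

    /** Debits both accounts or neither; returns true when both debits were confirmed. */
    static boolean debitBoth(Account a, int amountA, Account b, int amountB) {
        boolean triedA = a.tryDebit(amountA);
        boolean triedB = triedA && b.tryDebit(amountB);
        if (triedB) {
            a.confirmDebit(amountA);
            b.confirmDebit(amountB);
            return true;
        }
        if (triedA) {
            a.cancelDebit(amountA); // undo the only reservation that succeeded
        }
        return false;
    }

    public static void main(String[] args) {
        Account a = new Account(100);
        Account b = new Account(10);
        System.out.println(debitBoth(a, 50, b, 20) + " " + a.available + " " + b.available);
    }
}
```

Between Try and Confirm, other transactions can still read and update the account, which is where the throughput gain over resource-level locking comes from. The cost is that the business code has to supply all three operations itself.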
XA Transaction #
XAShardingSphereTransactionManager is the XA implementation class for DBPlusEngine’s distributed transactions. It is responsible for managing and adapting multiple data sources, and delegating the opening, committing and rolling back of the corresponding transactions to a specific XA transaction manager.
Enabling global transactions #
When set autoCommit=0 is received from the access side, XAShardingSphereTransactionManager calls the specific XA transaction manager to open an XA global transaction, which is identified by an XID.
Execute real sharding SQL #
After the XAShardingSphereTransactionManager registers the XAResource corresponding to the database connection to the current XA transaction, the transaction manager sends the XAResource.start command to the database at this stage. All SQL operations performed by the database before receiving the XAResource.end command will be marked as XA transactions.
For example:
```
XAResource1.start                     ## Enlist phase execute
statement.execute("sql1");            ## Simulate the execution of shard SQL 1
statement.execute("sql2");            ## Simulate the execution of shard SQL 2
XAResource1.end                       ## COMMIT phase execute
```
In this example, sql1 and sql2 are marked as part of the XA transaction.
Committing or rolling back a transaction #
When the XAShardingSphereTransactionManager receives the commit command from the access side, it delegates the commit action to the actual XA transaction manager, which collects all XAResources registered in the current thread and sends the XAResource.end command to mark the XA transaction boundary. The prepare instruction is then sent to each XAResource in turn to collect its vote. If every XAResource votes yes, the commit instruction is invoked for the final commit; if any XAResource votes no, the rollback instruction is invoked instead. After the transaction manager issues the commit command, any exception raised by an XAResource is retried through the recovery log to ensure atomicity of the commit phase and strong data consistency.
```
## All participants vote yes: commit
XAResource1.prepare           ## ack: yes
XAResource2.prepare           ## ack: yes
XAResource1.commit
XAResource2.commit

## Any participant votes no: roll back
XAResource1.prepare           ## ack: yes
XAResource2.prepare           ## ack: no
XAResource1.rollback
XAResource2.rollback
```
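The vote-then-decide flow above can be sketched in a few lines of Java. This is a simplified illustration and not the XAShardingSphereTransactionManager implementation: `Participant` is a hypothetical stand-in for an XA resource (it is not the `javax.transaction.xa.XAResource` interface), `RecordingParticipant` exists only to record calls, and the recovery log and retry handling are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical two-phase commit coordinator; recovery and retries are omitted. */
final class TwoPhaseCommitSketch {

    /** Simplified stand-in for an XA resource (assumption, not a real API). */
    interface Participant {
        boolean prepare(); // the vote: true = yes, false = no
        void commit();
        void rollback();
    }

    /** Appends every call it receives to a shared log so the order can be inspected. */
    static final class RecordingParticipant implements Participant {
        private final String name;
        private final boolean vote;
        private final List<String> log;

        RecordingParticipant(String name, boolean vote, List<String> log) {
            this.name = name;
            this.vote = vote;
            this.log = log;
        }

        public boolean prepare() {
            log.add(name + ".prepare");
            return vote;
        }

        public void commit() {
            log.add(name + ".commit");
        }

        public void rollback() {
            log.add(name + ".rollback");
        }
    }

    /** Phase 1 asks every participant to vote; phase 2 commits only if all voted yes. */
    static boolean commit(List<Participant> participants) {
        boolean allYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allYes = false; // keep collecting votes, as in the example above
            }
        }
        for (Participant p : participants) {
            if (allYes) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allYes;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Participant> participants = new ArrayList<>();
        participants.add(new RecordingParticipant("XAResource1", true, log));
        participants.add(new RecordingParticipant("XAResource2", false, log));
        System.out.println(commit(participants) + " " + log);
    }
}
```

A production coordinator also has to persist its decision before phase 2 starts, so that a crash between the two phases can be resolved from the recovery log.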
Seata Flexible Transactions #
First mentioned in a paper published in 2008, flexible transactions advocate relaxing the requirement for strong consistency in favor of eventual consistency in order to increase the concurrency of transaction processing.
TCC and Saga are two common implementations. Both expect developers to implement the reverse operations on the database themselves, so that the data reaches eventual consistency when a transaction is rolled back. SEATA generates the reverse SQL operations automatically, so flexible transactions can be used without developer intervention.
DBPlusEngine integrates SEATA as a solution for flexible transactions.
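The idea of generating the reverse operation automatically can be illustrated with a small in-memory undo log. The sketch below is not Seata code: `UndoLogSketch`, `UndoEntry`, and the map standing in for a database table are all hypothetical. Each update records the value it overwrites, and rollback replays those records in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical undo-log illustration; a map stands in for a database table. */
final class UndoLogSketch {

    /** One undo record: the key that changed and its value before the change. */
    static final class UndoEntry {
        final String key;
        final Integer before; // null means the key did not exist before the update

        UndoEntry(String key, Integer before) {
            this.key = key;
            this.before = before;
        }
    }

    final Map<String, Integer> table = new HashMap<>();
    private final Deque<UndoEntry> undoLog = new ArrayDeque<>();

    /** Records the before image, then applies the update. */
    void update(String key, int value) {
        undoLog.push(new UndoEntry(key, table.get(key)));
        table.put(key, value);
    }

    /** Commit: the changes stay, so the undo records are no longer needed. */
    void commit() {
        undoLog.clear();
    }

    /** Rollback: restore before images, newest first. */
    void rollback() {
        while (!undoLog.isEmpty()) {
            UndoEntry entry = undoLog.pop();
            if (entry.before == null) {
                table.remove(entry.key);
            } else {
                table.put(entry.key, entry.before);
            }
        }
    }

    public static void main(String[] args) {
        UndoLogSketch db = new UndoLogSketch();
        db.update("stock", 10);
        db.commit();
        db.update("stock", 7);
        db.rollback();
        System.out.println(db.table);
    }
}
```

Because the before image is captured by the data access layer rather than written by hand, the application code only issues ordinary updates.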
Integrating Seata AT transactions requires integrating the TM, RM and TC models into DBPlusEngine’s distributed transaction ecosystem. On the database resource side, Seata proxies the DataSource interface so that JDBC operations can communicate remotely with the TC. DBPlusEngine likewise works against the DataSource interface to aggregate user-configured data sources. Thus, by wrapping each DataSource in a Seata-based DataSource, Seata AT transactions can be integrated into the DBPlusEngine sharding ecosystem.
Engine initialization #
When an application that uses Seata flexible transactions starts, each user-configured data source is adapted to the DataSourceProxy required by Seata transactions and registered with the RM according to the seata.conf configuration.
Enabling Global Transactions #
The TM controls the boundary of global transactions. It obtains the global transaction ID by sending the Begin command to the TC, and all branch transactions join the global transaction through this ID. The global transaction ID context is stored in a thread-local variable of the current thread.
Execute the real sharding SQL #
Sharded SQL executed within a Seata global transaction generates an undo snapshot through the RM and sends a participate command to the TC to join the global transaction. Because DBPlusEngine executes the sharded physical SQL on multiple threads, integrating Seata AT transactions requires passing the global transaction ID context between the main thread and the child threads.
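One common way to carry a thread-bound value into worker threads is to capture it when the task is created and re-bind it when the task runs. The sketch below shows that pattern with plain `java.util.concurrent` classes. It is an assumption about how such propagation can be done, not the DBPlusEngine or Seata implementation: `XidPropagationSketch`, the `XID` holder, and `withXid` are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of passing a global transaction ID to worker threads. */
final class XidPropagationSketch {

    /** Holds the global transaction ID bound to the current thread. */
    static final ThreadLocal<String> XID = new ThreadLocal<>();

    /** Captures the caller's ID now and binds it around the task when it runs. */
    static <T> Callable<T> withXid(Callable<T> task) {
        final String captured = XID.get();
        return () -> {
            String previous = XID.get();
            XID.set(captured);
            try {
                return task.call();
            } finally {
                // Restore the worker's previous state so pooled threads do not leak IDs.
                if (previous == null) {
                    XID.remove();
                } else {
                    XID.set(previous);
                }
            }
        };
    }

    /** Binds xid on the calling thread and returns the ID a worker thread sees. */
    static String readOnWorker(String xid, boolean propagate) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        XID.set(xid);
        try {
            Callable<String> readXid = () -> XID.get();
            Callable<String> task = propagate ? withXid(readXid) : readXid;
            Future<String> result = pool.submit(task);
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            XID.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(readOnWorker("tx-1", true));
        System.out.println(readOnWorker("tx-1", false));
    }
}
```

Without the wrapper the worker thread sees no ID at all, since a plain `ThreadLocal` value is visible only to the thread that set it. A branch executed on such a thread would therefore not be associated with the global transaction.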
Committing or rolling back a transaction #
When a Seata transaction is committed or rolled back, the TM sends the corresponding global commit or rollback command to the TC, which coordinates the commit or rollback of all branch transactions based on the global transaction ID.