Object Design #
Sharding #
Core Concepts #
Sharding is a technique that divides the data of a single database into multiple tables and databases according to certain criteria, in order to improve performance and availability. Both approaches can effectively keep data volumes below the query threshold, and database sharding can also distribute TPS (transactions per second) more evenly. While table sharding does not reduce database pressure, it provides the opportunity to convert distributed transactions into local transactions, an important consideration since cross-database updates can involve complicated distributed transactions. Using multiple primary-replica shards also prevents data from concentrating on a single node and improves architecture availability.
Splitting data through database and table sharding is an effective way to deal with high TPS and large data volumes, since it keeps the data on each node below a threshold and spreads traffic. Sharding methods are divided into vertical sharding and horizontal sharding.
Vertical Sharding #
Sharding by business domain is known as vertical sharding, or longitudinal sharding. Its core concept is to dedicate each database to a specific purpose. Before sharding, one database contains many tables for different businesses; after sharding, the tables are distributed into different databases by business, which also separates the load across databases.
The diagram below illustrates an example of assigning user tables and order tables to different databases by vertical sharding based on business needs.
Vertical sharding requires adjusting the architecture and design periodically, so it is usually not agile enough to keep up with fast-changing internet business requirements. Moreover, it cannot truly solve the single-node limitation; it can only alleviate the problems caused by high data volume and concurrency.
In cases where the data amount in a table still exceeds the threshold of a single node after vertical sharding, horizontal sharding should be implemented to further process the data.
Horizontal Sharding #
Horizontal sharding is also called transverse sharding. In contrast to vertical sharding's categorization by business logic, horizontal sharding distributes data across multiple databases or tables according to rules applied to certain fields, with each shard containing only part of the data. For example, when sharding by primary key, rows with even primary keys go to database (or table) 0 and rows with odd primary keys go to database (or table) 1, as illustrated in the following diagram.
Theoretically, horizontal sharding overcomes the data-processing limits of a single machine and can be extended relatively freely, so it can be regarded as the standard solution for database sharding and table sharding.
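To make the even/odd rule above concrete, here is a minimal, self-contained routing sketch (the names `ds_0`/`ds_1` and `t_order_0`/`t_order_1` are illustrative, not a fixed convention):

```java
// Minimal sketch of the even/odd primary-key routing described above.
public final class ModuloRouter {

    /** Routes a row to a database by primary-key parity: even -> ds_0, odd -> ds_1. */
    public static String routeDatabase(long primaryKey) {
        return "ds_" + (primaryKey % 2);
    }

    /** Routes a row to a table inside the chosen database by the same rule. */
    public static String routeTable(long primaryKey) {
        return "t_order_" + (primaryKey % 2);
    }

    public static void main(String[] args) {
        System.out.println(routeDatabase(10) + "." + routeTable(10)); // ds_0.t_order_0
        System.out.println(routeDatabase(11) + "." + routeTable(11)); // ds_1.t_order_1
    }
}
```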
Terms #
- Table
Tables are the core concept of transparent data sharding. Apache ShardingSphere provides a variety of table types for different data sharding requirements.
- Logic Table
The logical name of a group of horizontally sharded databases (tables) with the same schema; it is how the table is identified in SQL. For instance, order data may be divided into 10 tables according to the last digit of the primary key, `t_order_0` to `t_order_9`, whose logic name is `t_order`.
- Actual Table
The physical tables that actually exist in the horizontally sharded databases, such as `t_order_0` to `t_order_9` in the previous example.
- Table Group
A table group refers to a set of main tables and sub-tables with consistent sharding rules. When performing multi-table join queries using table groups, it is essential to perform the join using the sharding key to avoid Cartesian product joins or cross-database joins, which can negatively impact query efficiency. For example, if you have the tables t_order and t_order_item, both sharded by order_id and you perform a join using order_id, then these two tables have a binding table relationship. Multi-table join queries between binding tables won’t result in Cartesian product joins, significantly improving query efficiency.
For example, given this SQL:
```sql
SELECT i.* FROM t_order o JOIN t_order_item i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
```
When binding table relations are not configured, suppose the sharding key `order_id` routes the value 10 to shard 0 and the value 11 to shard 1; after routing there will be 4 SQLs in Cartesian product:
```sql
SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_0 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
```
With binding table relations configured and `order_id` used to correlate, there should be 2 SQLs after routing:
```sql
SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);
```
Here, ShardingSphere takes `t_order`, the leftmost table in the FROM clause, as the primary table of the query; likewise, it takes `t_order` as the primary table of the whole binding table group. All route computations use only the sharding strategy of the primary table, so the sharding computation for `t_order_item` uses the conditions of `t_order`. For this to work, the sharding keys of binding tables must be completely identical.
- Broadcast Table
Tables that exist in all sharded data sources, with identical schema and data in every database. They are suitable for small tables that need to be joined with large sharded tables in queries, such as dictionary tables.
- Single Table
A table of which only a single instance exists across all sharded data sources. It is suitable for small tables that do not need sharding. When a single table grows to a certain size, SphereEx-DBPlusEngine also supports converting it into a sharded table.
- Data Node
As the atomic unit of sharding, a data node consists of a data source name and an actual table name, e.g. `ds_0.t_order_0`.
The mapping between logic tables and actual tables can take two forms: uniform topology and user-defined topology.
- Uniform Topology
It means that tables are evenly distributed in each data source, for example:
```
db0
├── t_order0
└── t_order1
db1
├── t_order0
└── t_order1
```
The data node configurations will be as follows:
```
db0.t_order0, db0.t_order1, db1.t_order0, db1.t_order1
```
- User-defined Topology
It means that tables are distributed with certain rules, for example:
```
db0
├── t_order0
└── t_order1
db1
├── t_order2
├── t_order3
└── t_order4
```
The data node configurations will be as follows:
```
db0.t_order0, db0.t_order1, db1.t_order2, db1.t_order3, db1.t_order4
```
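Both topologies are ultimately just enumerations of data nodes. As a small illustration, the sketch below expands the uniform example above into its flat data node list (names taken from the example):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: expanding a uniform topology (every database holds the same tables)
// into the flat data node list shown above.
public final class DataNodes {

    public static List<String> uniform(int databases, int tablesPerDb) {
        List<String> nodes = new ArrayList<>();
        for (int db = 0; db < databases; db++) {
            for (int table = 0; table < tablesPerDb; table++) {
                nodes.add("db" + db + ".t_order" + table);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Prints: [db0.t_order0, db0.t_order1, db1.t_order0, db1.t_order1]
        System.out.println(uniform(2, 2));
    }
}
```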
Sharding Design #
- Sharding Key
A database field used to horizontally shard a database table. For example, if the last digit of the order table's primary key is used for modulo sharding, the primary key is the sharding key. If a SQL statement contains no sharding key, a full route is executed, resulting in poor performance.
In addition to supporting single sharding keys, SphereEx-DBPlusEngine also supports sharding based on multiple fields. The selection of sharding fields involves the following factors.
Data Structure: Primary Key or Unique Key
The primary key and the unique key are common database constraints designed to ensure uniqueness and non-null values. In a distributed environment, it is usually recommended to use the primary key or a unique key as the sharding key or part of it; otherwise, constraint validation cannot be completed.
This raises a broader issue of primary key design. In a distributed database architecture, an auto-increment key is not recommended as the table's primary key: it performs poorly, is less secure, and is ill-suited to a distributed architecture.
Instead, an ordered globally unique key is generally used, which is the recommended practice for distributed primary keys, for example a UUID or a global sequence generator (e.g. the snowflake algorithm).
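For illustration, a simplified snowflake-style generator is sketched below. The 41/10/12 bit layout is the commonly cited one and an assumption here, not the engine's built-in key generator; a production implementation would also need clock-rollback handling.

```java
// Simplified snowflake-style ID generator: 41 bits of timestamp, 10 bits of
// worker id, 12 bits of per-millisecond sequence. IDs are ordered and unique
// across up to 1024 workers. Illustrative only.
public final class SnowflakeSketch {

    private static final long EPOCH = 1704067200000L; // arbitrary custom epoch (2024-01-01)
    private final long workerId;                      // 0..1023
    private long lastTimestamp = -1L;
    private long sequence = 0L;                       // 0..4095 within one millisecond

    public SnowflakeSketch(long workerId) {
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & 4095;
            if (sequence == 0) {
                // sequence exhausted for this millisecond: spin until the next one
                while ((now = System.currentTimeMillis()) <= lastTimestamp) { }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << 22) | (workerId << 12) | sequence;
    }
}
```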
Data Structure: Index
Through the sharding key, SQL queries can be routed to the specified shard. However, in practical production environments, businesses also need to access tables through other indexes. There needs to be a separate strategy for the existing system’s indexes.
The usual strategy is an index table: the index is materialized as another sharded table, and queries are resolved through a secondary lookup. This approach works but is not simple enough.
The better design is therefore not to create an index table but to encode the index data into the sharding key information. The queried column then directly reveals which shard its data is in, so a single query is enough to obtain the desired result.
In summary, the choice of sharding fields does not directly affect indexes. For high-frequency index queries, it is recommended to fold them into the sharding key; ordinary indexes can also be implemented through global secondary indexes (supported by some distributed databases).
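As a hedged illustration of folding index data into the key: the sketch below embeds the shard of a hypothetical `user_id` into a generated order number, so lookups by either field resolve the shard in one step. The field names, shard count, and two-digit suffix encoding are all assumptions for the example.

```java
// Sketch of a feature-encoded key: the order number carries the user's shard
// as its last two digits, so queries by order_no or by user_id both locate
// the shard without an index table or secondary lookup.
public final class EncodedKey {

    private static final int SHARDS = 16;

    /** Shard derived from the user id; one user's orders stay together. */
    public static int shardOf(long userId) {
        return (int) (userId % SHARDS);
    }

    /** Appends the shard as the last two digits of the order number. */
    public static long newOrderNo(long sequence, long userId) {
        return sequence * 100 + shardOf(userId);
    }

    /** Recovers the shard directly from an order number: one query, no lookup. */
    public static int shardOf(String orderNo) {
        return Integer.parseInt(orderNo.substring(orderNo.length() - 2));
    }

    public static void main(String[] args) {
        long orderNo = newOrderNo(987654L, 42L);              // user 42 -> shard 10
        System.out.println(orderNo);                          // 98765410
        System.out.println(shardOf(Long.toString(orderNo)));  // 10
    }
}
```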
Data Structure: Field Type
Fields that serve as sharding keys should usually have simple data types, such as common numbers, dates, and text, to improve efficiency; complex types such as LOB and JSON are not recommended. In addition, the type of the sharding field should be stable, and DDL changes to it should be avoided as much as possible.
Data Feature: Table Scale
Table scale is one of the key factors in deciding whether to shard. Once a table is sharded, a certain degree of "functional degradation" is inevitable, so if other methods can reduce the table's size, they should be tried first.
Table lifecycle planning can help, such as regular archiving, compression, dumping, and cleaning strategies to reduce the data size, as can built-in database features such as table partitioning and vertical sharding, which effectively reduce table scale.
Data Feature: Dispersion
Dispersion here refers to whether the data is sufficiently dispersed after applying a sharding algorithm based on a certain field or field combination. The purpose of data sharding is to reduce the size of the table. The fundamental principle is to achieve data dispersion as much as possible.
It is necessary to measure the degree of data dispersion after sharding and to select, as far as possible, fields that fully disperse the data as sharding keys.
It should be noted that if the selected field has business characteristics, attention should also be paid to its potential impact on future business changes.
Access Feature: Mutability
Select fixed fields that will not change as sharding keys. Although some distributed databases support modifying sharding key values, the data migration that follows such a modification is very expensive, so unchangeable fields remain the better choice.
Access Feature: Transaction Isolation
Try to choose a sharding field so that related data changes are concentrated within a single shard. A large number of business changes can then be completed through local transactions, which are much cheaper and more efficient than global transactions.
Access Feature: Data Filtering and Association
If a field is frequently used for data filtering and has good selectivity, it can be given priority as a sharding key. Another consideration is how the table is joined with other tables: it is best to choose fields that are involved in association operations.
Try to complete joins locally after data association, reducing data shuffling and upward aggregation. Analyze the SQL executed in the system and, for each table that needs sharding, pick the most frequently used or most important fields as sharding keys; OLAP-style queries can be excluded from this analysis.
Sharding Field Order
If multiple fields are used as sharding keys, their order generally has no effect; which fields participate in sharding is determined mainly by the sharding algorithm. In the case of composite sharding, however, the primary-secondary relationship between the sharding fields must be considered.
Sharding Algorithm
Common sharding algorithms include LIST, RANGE, HASH, or custom algorithms. According to the characteristics of each sharding algorithm, the appropriate one can be selected.
If access across the range is uniform, HASH can be used; if the data has clear hot and cold portions, RANGE can be used. Specialized designs are also possible, such as a two-level mapping method to ease scaling, or feature-encoded fields to satisfy multi-feature splitting requirements. The two most common algorithms are described as follows:
RANGE
Using data ranges for database sharding is the most straightforward sharding approach, and it can also be flexibly combined with other sharding strategies. When it’s necessary to use sharding keys for range queries, the RANGE sharding strategy can quickly locate data for efficient querying, avoiding the problem of cross-sharding queries in most cases. It is also relatively easy to scale up later by adding nodes without migrating data from other shards. However, this distribution method is prone to data hotspot problems.
HASH
Although there are many sharding solutions, HASH sharding is the most popular and widely used approach. Random sharding is not actually random; it follows certain rules, usually HASH modulo, which is why it is sometimes called discrete sharding.
The data in randomly sharded nodes is relatively evenly distributed, so hotspot and concurrent-access bottlenecks are rare. However, data migration is inconvenient; a consistent HASH algorithm can largely avoid this problem. In addition, discrete sharding is prone to complex cross-shard query issues.
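As noted above, a consistent HASH algorithm can largely avoid migration pain: adding or removing a node only remaps the keys on the neighboring arc of the ring rather than nearly all keys, as plain modulo does. A minimal ring sketch follows (the virtual-node count and MD5-based hash are arbitrary illustrative choices):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent hash ring with virtual nodes to smooth the distribution.
public final class ConsistentHashRing {

    private static final int VIRTUAL_NODES = 160; // arbitrary smoothing factor
    private final SortedMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    /** Routes a sharding value to the first node clockwise on the ring. */
    public String route(String shardingValue) {
        SortedMap<Long, String> tail = ring.tailMap(hash(shardingValue));
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
            // fold the first four digest bytes into an unsigned 32-bit value
            return ((long) (d[3] & 0xFF) << 24) | ((d[2] & 0xFF) << 16) | ((d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addNode("ds_0");
        ring.addNode("ds_1");
        System.out.println(ring.route("order-10086"));
    }
}
```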
Customized Sharding Algorithm
Provides an entry point for application developers to implement sharding algorithms closely tied to their business, while letting users manage the physical distribution of actual tables themselves. Customized sharding algorithms are further divided into:
- Standard Sharding Algorithm
Used for scenarios where a single key is the sharding key, with the `=`, `IN`, `BETWEEN AND`, `>`, `<`, `>=`, and `<=` operators. A sketch of a custom standard algorithm appears after this list of algorithm types.
- Complex Sharding Algorithm
Used to cope with scenarios where multiple keys are used as sharding keys. The logic containing multiple sharding keys is very complicated and requires the application developers to handle it on their own.
- Hint Sharding Algorithm
For scenarios involving Hint sharding.
- Auto Sharding Algorithm
Used for handling range-based automatic sharding scenarios, it can automatically create and manage sharding.
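As referenced under the standard sharding algorithm above, here is a sketch written against the Apache ShardingSphere 5.x `StandardShardingAlgorithm` SPI. The method signatures follow that SPI but may differ slightly between versions, so treat it as a template rather than a drop-in implementation.

```java
import java.util.Collection;

import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.RangeShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;

// Sketch of a custom standard sharding algorithm: routes by order_id % 2.
public final class OrderIdModuloAlgorithm implements StandardShardingAlgorithm<Long> {

    // Handles '=' and 'IN': one sharding value maps to exactly one target.
    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<Long> shardingValue) {
        long suffix = shardingValue.getValue() % 2;
        return availableTargetNames.stream()
                .filter(name -> name.endsWith("_" + suffix))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("no target for suffix " + suffix));
    }

    // Handles BETWEEN AND, >, <, >=, <=: a range can span every shard,
    // so this simple sketch returns all targets.
    @Override
    public Collection<String> doSharding(Collection<String> availableTargetNames, RangeShardingValue<Long> shardingValue) {
        return availableTargetNames;
    }

    @Override
    public String getType() {
        return "ORDER_ID_MODULO"; // the name used to reference this algorithm in sharding rules
    }
}
```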
- Force Sharding
In scenarios where the sharding value is determined not by the SQL itself but by other external conditions, SQL Hints can be used to inject it; for example, sharding by employee login primary key when no such field exists in the database table. SQL Hints can be used through SQL comment annotations. For details, please refer to the Forced Sharding Routing documentation.
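A hedged sketch of forced sharding with ShardingSphere-JDBC's `HintManager` (this API exists in Apache ShardingSphere 5.x; the table name and the employee-id scenario come from the example above, and exact package names vary by version):

```java
import org.apache.shardingsphere.infra.hint.HintManager;

// Sketch: forcing the route for t_order from application context rather than
// from the SQL itself. HintManager is thread-bound, so always close it.
public final class ForcedRoutingExample {

    public void queryByLoginEmployee(long employeeId) {
        try (HintManager hintManager = HintManager.getInstance()) {
            // Route database and table sharding of t_order by the externally
            // supplied employee id, which appears nowhere in the SQL.
            hintManager.addDatabaseShardingValue("t_order", employeeId);
            hintManager.addTableShardingValue("t_order", employeeId);
            // ... execute "SELECT * FROM t_order" through the ShardingSphere DataSource here
        }
    }
}
```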
Terms of Use #
Although SphereEx-DBPlusEngine aims to be compatible with all SQL and with standalone databases, distribution brings more complex scenarios to the database.
- SQL support level
All commonly used SQL that routes to a single data node is compatible. SQL routing to multiple data nodes is, because of its complexity, divided into three support levels: stable support, experimental support, and no support.
- Stable support
Full support for DML, DDL, DCL, TCL, and common DALs. Supports complex queries such as pagination, deduplication, sorting, grouping, aggregation, and table association. Supports the SCHEMA DDL and DML statements of the PostgreSQL and openGauss databases.
- Experimental Support
Experimental support refers specifically to SQL supported through the Federation execution engine, an experimental product still under development. Although largely usable, it still requires significant optimization.
- Not Supported
Data Encryption #
Background #
As the key to successful data protection, enterprises need to weigh the security of critical data, the availability of application system functions, and system maintainability when choosing the encryption technology that meets their needs. The table below lists commonly used encryption technologies and the security risks they address.
| Security Risk | Disk Encryption | File Encryption | Transparent Data Encryption (TDE) | Database Encryption (Third-Party Hardening) | Application-Level Encryption |
|---|---|---|---|---|---|
| Prevent data leakage caused by disk loss | Y | Y | Y | Y | Y |
| Prevent system root and administrator accounts from accessing data | N | Y | Y | Y | Y |
| Control database administrator access to data | N | N | N | Y | Y |
| Resist targeted threats and prevent data leakage caused by APTs | N | Y | Y | Y | Y |
| Provide fine-grained access logs and comply with regulations | N | Y | N | Y | Y |
| Secure backup data and data snapshot storage | N | Y | Y | Y | Y |
| Protect unstructured data and files | Y | Y | N | N | Y |
| Prevent hardware and database vendors from "eavesdropping" | N | Y | N | Y | Y |
Disk Encryption
Disk encryption uses block-level encryption technology; for example, AWS EBS and Alibaba Cloud ECS disks both support it. Its biggest advantage is that it is transparent to the operating system. Encryption does reduce performance, and the degree of degradation varies by application.
File Encryption
File encryption is achieved by stacking on top of other file systems such as Ext2, Ext3, ReiserFS, and JFS to provide applications with transparent, dynamic, efficient, and secure encryption functions. Typically, it is used to encrypt specified directories. It is important to note that this encryption method may result in significant performance loss.
Database Encryption - TDE
Transparent Data Encryption (TDE) is an encryption technology provided by databases, which encrypts and decrypts data files in real-time I/O. The data is encrypted before being written to the disk and decrypted when read from the disk into memory. TDE does not increase the size of data files, and developers do not need to change any application programs. The corresponding key management is also provided by the database API or component, making it transparent to applications. This method is suitable when disks or systems cannot be open to users under certain conditions, such as cloud environments.
Database Encryption - Third-Party Hardening
Another way of encrypting databases is third-party hardening, which embeds the products of professional third-party database encryption vendors into the database to provide transparent data encryption. "Transparent" means the application system can be used without modification, and authorized users see plaintext data without noticing the encryption at all. It can also enhance the security capabilities of the original database, such as providing separation of duties and desensitized display.
Application-Level Encryption
Application-level encryption can be regarded as the ultimate solution: data is encrypted before it ever reaches the database, protecting sensitive user data in real time. The key is application transparency, ensuring that the application requires no modification or only minimal modification. This method is completely controlled by the user, with no need to trust any third-party vendor for data security guarantees, giving users full freedom and flexibility. For example, it can provide a unified security encryption policy across multiple databases.
Terms #
- Logic Column
It is used for encryption and decryption calculations and is the logical identifier of the column in SQL. A logical column comprises a cipher column (mandatory), an assisted-query column (optional), a like-query column (optional), and a plain column (optional).
- Logic Column Type (dataType)
Used to define the type of the logical column, such as INT NOT NULL or VARCHAR(200) DEFAULT NULL. The specific definitions can be found in each dialect's documentation, such as column_definition in the MySQL CREATE TABLE statement (https://dev.mysql.com/doc/refman/8.0/en/create-table.html).
- cipherColumn
The column that stores the encrypted (ciphertext) data.
- cipherDataType
Used to define the data type of the cipher column, same as the logical column type.
- assistedQueryColumn
An assisted column for queries. For some non-idempotent (non-deterministic) encryption algorithms with a higher security level, an irreversible but idempotent column is provided for equality queries; a sketch of this idea follows this terms list.
- assistedQueryDataType
Used to define the data type of the assisted query column, same as logical column type.
- plainColumn
Used to store plaintext and provide service during the migration of encrypted data. It can be deleted after the data cleansing is complete.
- plainDataType
Used to define the data type of the plain column, same as logical column type.
- encrypting
Batch encryption of unencrypted data in the database.
- decrypting
Batch decryption of encrypted data in the database.
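To make the assistedQueryColumn term concrete, the sketch below shows why it exists: a randomized cipher (AES-GCM with a random IV here) produces different ciphertext for equal plaintexts, so equality queries need a deterministic companion value (an HMAC here). The algorithms and all-zero demo keys are illustrative assumptions, not the engine's built-in encryptors.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch: a non-deterministic cipher column plus a deterministic
// assisted-query column. Demo keys only; never use all-zero keys.
public final class AssistedQuerySketch {

    private static final byte[] AES_KEY = new byte[16];
    private static final byte[] HMAC_KEY = new byte[32];

    /** cipherColumn value: random IV, so equal inputs encrypt differently. */
    public static String encrypt(String plain) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(AES_KEY, "AES"), new GCMParameterSpec(128, iv));
        byte[] out = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(iv) + ":" + Base64.getEncoder().encodeToString(out);
    }

    /** assistedQueryColumn value: deterministic and irreversible, so WHERE pwd = ? can match. */
    public static String assistedValue(String plain) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(HMAC_KEY, "HmacSHA256"));
        return Base64.getEncoder().encodeToString(mac.doFinal(plain.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(encrypt("123").equals(encrypt("123")));             // false
        System.out.println(assistedValue("123").equals(assistedValue("123"))); // true
    }
}
```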
Implementation #
Encryption Process:
For example, suppose there is a table named `t_user` in the database with two fields: `pwd_plain` for storing plaintext data and `pwd_cipher` for storing ciphertext data, and the logicColumn is defined as `pwd`. Users should write their SQL against the `logicColumn`, that is, `INSERT INTO t_user SET pwd = '123'`. SphereEx-DBPlusEngine receives the SQL and, based on the encryption configuration provided by the user, finds that `pwd` is the `logicColumn`, so it encrypts the logical column's plaintext data accordingly.
SphereEx-DBPlusEngine transforms the column names and data encryption mapping between the logical columns facing users and the plain and cipher columns facing the underlying database. As shown in the figure below:
The user’s SQL is separated from the underlying data table structure according to the encryption rules provided by the user so that the user’s SQL writing does not depend on the real database table structure.
The connection, mapping, and transformation between the user and the underlying database are handled by Apache ShardingSphere.
The figure below shows the processing flow and conversion logic when the encryption module handles INSERT, DELETE, UPDATE, and SELECT operations.
- Cloud key management
The key is managed in the cloud; for example, AWS's secret key facility can be used to store it, improving the security and convenience of the whole encryption setup.
When the encryption algorithm is initialized, the program connects to AWS to obtain the relevant key stored there and caches it inside the algorithm. The data encryption process itself involves no network interaction with the cloud.
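A hedged sketch of that fetch-once pattern, using the AWS SDK for Java v2 Secrets Manager client (the secret name is hypothetical, and the engine's actual cloud integration may differ):

```java
import software.amazon.awssdk.services.secretsmanager.SecretsManagerClient;
import software.amazon.awssdk.services.secretsmanager.model.GetSecretValueRequest;

// Sketch of the fetch-once pattern: the key is read from AWS when the
// algorithm initializes and then cached, so encrypting data never has to
// call out to the cloud.
public final class CloudKeyLoader {

    private final String cachedKey;

    public CloudKeyLoader() {
        try (SecretsManagerClient client = SecretsManagerClient.create()) {
            this.cachedKey = client.getSecretValue(
                    GetSecretValueRequest.builder().secretId("encrypt/aes-key").build())
                .secretString();
        }
    }

    /** Used by the encrypt algorithm; no network interaction per call. */
    public String key() {
        return cachedKey;
    }
}
```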
- Encrypting
The encryption task is triggered by DistSQL. After receiving an encryption task request, the program creates the task according to the current encryption rules. The encrypting task consists of two parts: a query task and an update task. The query task queries the user's table data, obtains the plaintext fields to be encrypted, and pushes them to a channel; the update task takes data from the channel and performs the encrypted update. The whole task creation and execution process interacts with the governance center, so users can query task progress and clean up tasks through the relevant DistSQL.
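A minimal sketch of that query-task/update-task pipeline, using a blocking queue as the channel (the row shape, table name, and placeholder cipher are illustrative assumptions):

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the two-part encrypting task: a query task pushes plaintext rows
// into a channel; an update task drains it and writes ciphertext back.
public final class EncryptingTaskSketch {

    record Row(long id, String plain) {}
    private static final Row POISON = new Row(-1, null); // end-of-stream marker

    public static void run(List<Row> plaintextRows) throws InterruptedException {
        BlockingQueue<Row> channel = new ArrayBlockingQueue<>(1024);

        Thread queryTask = new Thread(() -> {
            try {
                for (Row row : plaintextRows) {
                    channel.put(row); // push plaintext to the channel
                }
                channel.put(POISON);
            } catch (InterruptedException ignored) { }
        });

        Thread updateTask = new Thread(() -> {
            try {
                for (Row row = channel.take(); row != POISON; row = channel.take()) {
                    String cipher = encrypt(row.plain()); // apply the configured algorithm here
                    System.out.printf("UPDATE t_user SET pwd_cipher = '%s' WHERE id = %d%n", cipher, row.id());
                }
            } catch (InterruptedException ignored) { }
        });

        queryTask.start();
        updateTask.start();
        queryTask.join();
        updateTask.join();
    }

    private static String encrypt(String plain) {
        return new StringBuilder(plain).reverse().toString(); // placeholder, not a real cipher
    }
}
```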
Terms of Use #
- Supported:
- Encryption and decryption of one or multiple columns in a database table;
- Compatibility with all commonly used SQL.
- Unsupported:
- Existing data in the database table needs to be handled separately;
- Encrypted fields cannot be used in comparison or matching operations such as greater than, less than, ORDER BY, BETWEEN, and LIKE;
- Encrypted fields cannot be used in calculation operations such as AVG, SUM, and arithmetic expressions.
- Other:
- The encrypted columns, assisted query columns, LIKE query columns, etc. configured in the encryption rules must match the case of the corresponding columns in the database.