How to perform database migration #
Data Table #
Tables have single-field primary keys, and the supported field types are integers and strings. Integer primary keys are recommended for higher concurrency and better performance. DDL operations on the source table are not supported while the migration is in progress.
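As an illustration, a source table such as the following hypothetical t_order table (assuming MySQL syntax; the table and column names are examples, not part of the migration feature) satisfies these constraints with a single integer primary key:

CREATE TABLE t_order (
    order_id BIGINT NOT NULL,   -- single-field integer primary key, recommended for migration
    user_id  INT NOT NULL,
    status   VARCHAR(50),
    PRIMARY KEY (order_id)
);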
Optimize migration configuration #
The migration configuration can be customized for each situation; some configuration items can be tuned more precisely, based on data volume and machine resources, to improve performance.
Full configuration example:
ALTER MIGRATION RULE (
READ(                        -- configuration for reading data from the source
WORKER_THREAD=40,            -- size of the thread pool used for reading
BATCH_SIZE=1000,             -- maximum number of records returned by one query
SHARDING_SIZE=10000000,      -- number of rows per split in full migration
RATE_LIMITER (TYPE(NAME='QPS',PROPERTIES('qps'='500')))   -- read traffic limit
),
WRITE(                       -- configuration for writing data to the target
WORKER_THREAD=40,            -- size of the thread pool used for writing
BATCH_SIZE=1000,             -- maximum number of records written per batch
RATE_LIMITER (TYPE(NAME='TPS',PROPERTIES('tps'='2000')))  -- write traffic limit
),
STREAM_CHANNEL (TYPE(NAME='MEMORY',PROPERTIES('block-queue-size'='10000')))  -- data channel between readers and writers
);
The default values take effect without any configuration and can be viewed via DistSQL:
SHOW MIGRATION RULE;
Concurrency #
Full migration splits the work by table and also by data volume within each table, to increase concurrency and thus improve performance.
By default, data is split into segments of 10 million rows (SHARDING_SIZE=10000000). Suppose there are 2 tables, one with 200 million rows and one with 400 million rows: under the default settings they are split into 60 segments in total.
If the machine is well provisioned and there is no need for strict traffic limits, consider increasing the number of concurrent threads WORKER_THREAD to 60, as sketched below.
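For instance, a minimal DistSQL sketch for raising read and write concurrency (assuming the other rule settings keep their current values, in line with the partial ALTER shown later in this section):

ALTER MIGRATION RULE (
READ(WORKER_THREAD=60),
WRITE(WORKER_THREAD=60)
);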
Flow Restriction #
If the source database is already serving online traffic and fast response times are required, it is better to enable the rate limiter (RATE_LIMITER) so that the migration does not affect the online business.
You can set a conservative rate limit based on the capacity of the database server, adjust it dynamically according to the load, and then restart the job for the change to take effect.
Example of DistSQL:
-- Modify the rate limit value
ALTER MIGRATION RULE (
READ(
RATE_LIMITER (TYPE(NAME='QPS',PROPERTIES('qps'='1000')))
)
);
-- Query the list of migration jobs
SHOW MIGRATION LIST;
-- Stop the job
STOP MIGRATION {jobId};
-- Restart the job
START MIGRATION {jobId};
Heterogeneous migration #
The target side is the proxy, which currently supports a limited set of database types: MySQL, PostgreSQL, and openGauss. Support for additional database types is in progress.
The support on the source side is shown in the following table:
|                       | Oracle | MySQL | PostgreSQL | openGauss |
|-----------------------|--------|-------|------------|-----------|
| Full migration        | Y      | Y     | Y          | Y         |
| Incremental migration | N      | Y     | Y          | Y         |
| Data Comparison       | Y      | Y     | Y          | Y         |
The source side supports a wider range of database types.
The first category supports incremental synchronization in addition to full migration: MySQL, PostgreSQL, and openGauss.
The second category supports full migration only: Oracle and other commercial databases. Support for incremental synchronization is in progress.
Target-side support:
|                       | MySQL | PostgreSQL | openGauss |
|-----------------------|-------|------------|-----------|
| Full migration        | Y     | Y          | Y         |
| Incremental migration | Y     | Y          | Y         |
| Data Comparison       | Y     | Y          | Y         |
Data consistency checks support heterogeneous databases, but with restrictions on which algorithm types can be used. The DATA_MATCH algorithm is currently recommended.
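As a rough sketch (the exact DistSQL syntax may vary by version; {jobId} is a placeholder as in the earlier examples), a consistency check using DATA_MATCH could be run and inspected like this:

-- Start a consistency check with the DATA_MATCH algorithm
CHECK MIGRATION {jobId} BY TYPE (NAME='DATA_MATCH');
-- View the progress and result of the check
SHOW MIGRATION CHECK STATUS {jobId};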