Logo
Common strategies for performance optimization

Common strategies for performance optimization #

Introduction #

DBPlusEngine has been heavily optimized in terms of code and parameters for generic scenarios, and DBPlusEngine can be started with default parameters to achieve better performance. For users with higher performance requirements, further optimizations can be made for scenarios and environments.

Java Versions #

DBPlusEngine currently supports Java 8 as a minimum, and as of this writing, the latest LTS (Long term support) version is Java 17.

The new version of Java will have some performance improvements over the old version of Java in terms of the JDK class library, JIT, GC, etc.

JVM Parameters #

Set the appropriate -Xmx #

JVM heap memory parameters need to be set in business and environment, not too small and not too big. Too little heap memory may lead to more GCs, and too much heap memory may increase the GC stall time, which is reflected in the increased execution delay of some SQL.

For OLTP scenarios, the heap memory should not exceed 16 GB if all SQL can be pushed down to the database for execution.

For scenarios using federated queries, some calculations need to be performed in the memory of the DBPlusEngine process, so the heap memory can be increased appropriately according to the actual data volume to avoid OOM.

-Xss Settings #

-Xss does not need to be specifically set in general scenarios. Adjustments may need to be considered in the following cases:

  • Using jOOQ

During jOOQ class loading, the stack of calls related to class loading will be relatively deep, and a small -Xss (e.g. less than 512k) may cause java.lang.

  • SQL is very long or has many parameters

When SQL is long or has many parameters, the SQL parsing may have a deep call stack. For example, the following SQL:

insert into some_table values (1,'foo'), (2,'bar'), ..., (100000,'baz')

In the case of a very large number of values, the call stack required for SQL parsing will be deep. If -Xss is too small, java.lang.StackOverflowError occurs.

-XX:+AggressiveHeap #

Applies to: Java 10 and higher

By enabling this option, the JVM will optimize the heap for long-running memory-intensive applications, which can help DBPlusEngine performance in high-concurrency scenarios.

-XX:+UseNUMA #

In NUMA environments, use this parameter to optimize the memory allocation of the JVM. Since the JVM detects whether it is currently a NUMA environment when it starts, this parameter will have no effect even if it started in a non-NUMA environment.

-XX:+SegmentedCodeCache #

Applies to: Java 9 and higher

This parameter will enable segmented code caching, which is useful for DBPlusEngine performance improvement.

-XX:+UseJVMCICompiler #

Applies to: OpenJDK 11

Use JVMCI as the default compiler. Enabling this option has a performance-boosting effect on DBPlusEngine performance peaks.

Using this parameter requires the following parameters to be enabled first:

-XX:+UnlockExperimentalVMOptions

DBPlusEngine Optimization #

DBPlusEngine Parameter Optimization #

max-connections-size-per-query #

Default value: 1

This parameter controls the maximum number of connections allowed to be fetched on each data source for each query execution. When there are query requests routed to multiple libraries/multiple tables, increasing this parameter can improve the concurrent execution of the query.

proxy-backend-query-fetch-size #

Applies to: Proxy

Default value: -1.

This parameter controls the number of rows of data transferred per network interaction when the Proxy queries data to the storage node.

With the default value of -1, the Proxy will try to use streaming queries when querying data to MySQL storage nodes, which reduces the Proxy’s memory usage and avoids OOM in case of very large result sets.

If the number of rows in the result set is manageable, you can increase this parameter, which may reduce the number of interactions between the Proxy and the database to reduce SQL execution latency, but will increase the memory usage of the Proxy.

proxy-frontend-executor-size #

Applies to: Proxy

This parameter controls the number of Netty threads in the Proxy and defaults to the number of logical CPUs available to the Proxy’s JVM process * 2.

If the Proxy is under some load, top -H allows you to see the resource usage of the threads in the JVM. The threads named with epollEventLoopGroup are Netty threads.

In the case of a bottleneck in Proxy throughput: if the average CPU usage of Netty threads is low, you can reduce the number of threads to reduce the performance overhead such as thread switching; otherwise, you need to increase the number of threads to increase the network I/O processing capacity.

DBPlusEngine Third-Party Dependency Parameter Optimization #

Turn off Netty leak detection #

Applies to: Proxy

Netty has leak detection capabilities, with 4 levels of detection:

  • DISABLED: off.
  • SIMPLE: (default) simple sampling, low-performance overhead.
  • ADVANCED: on top of simple sampling, the recent access code path of the leaked resource is also recorded, with relatively high overhead.
  • PARANOID: (for testing only) detects the presence of leaks in all resources, with very high-performance overhead.

Netty leak detection is SIMPLE level by default, which has low-performance overhead.

However, for stable versions of Proxy, leak detection can be turned off completely to eliminate the performance overhead of the leak detection logic itself.

-Dio.netty.leakDetection.level=DISABLED

DBPlusEngine Proxy Client side Optimization #

MySQL Connector/J Parameter Recommendations #

Enable Server-side Prepared Statement #

Parameter NameMySQL Connector/J DefaultConnection DBPlusEngine Proxy Recommended ValuesDescription
cachePrepStmtstruetrueInternal cache in the driver Prepared Statement.
useServerPrepStmtsfalsetrueUsing the server Prepared Statement.
prepStmtCacheSize258192Each database connection within the driver Prepared Statement Cache size.
prepStmtCacheSqlLimit2562048The length limit of SQL statements allowed to be cached.

DBPlusEngine Proxy internally caches the SQL parsing and SQL context of Prepared Statement on Server side, which can reduce the overhead and latency of SQL execution.

TCP Buffer Configuration #

By default, the TCP buffer for MySQL Connector/J uses the operating system’s default value. If it needs to be adjusted to a specific value, it can be configured with the following two parameters.

  • tcpRcvBuf
  • tcpSndBuf

Merging Batch Operations #

When using PreparedStatement’s addBatch/executeBatch methods, batch operations can not be implemented directly through the MySQL protocol due to the limitations of the MySQL protocol itself. You can enable MySQL Connector/J rewriting for bulk operations with the following parameters:

  • rewriteBatchedStatements=true
  • allowMultiQueries=false(MySQL Connector/J Default Value)

When rewriting is enabled for bulk operations:

  • insert SQL is automatically rewritten as multiple sets of values;
  • update/delete statements are automatically rewritten as multiple statements.

DBPlusEngine Proxy automatically recognizes MySQL COM_SET_OPTIONS command internally, so you need to keep allowMultiQueries=false parameter set while using bulk operations.

Avoid Long Network Wait Times #

You can reduce the wait time for abnormal connections by setting socketTimeout. It is recommended to set it at least to the SQL maximum execution time plus one RTT (Round-Trip Time). In practical business scenarios, considering network fluctuations and database peak load, this value should be set higher.

PostgreSQL/openGauss JDBC Driver Parameter suggestion #

TCP Buffer Configuration #

By default, the TCP buffer of PostgreSQL/openGauss JDBC Driver uses the default value of the operating system. If you need to adjust it to a specific value, you can configure it with the following two parameters.

  • receiveBufferSize
  • sendBufferSize

Environment Optimization #

CPU Affinity #

In some user environments, certain CPU cores on a machine may have specialized uses.

For example, in a 128-core environment, CPUs 0-15 are dedicated to handling network interrupts, so if DBPlusEngine is running on all CPUs, the network processing power and maximum DBPlusEngine throughput may be affected.

In this case, you can implement the available CPUs for the JVM where DBPlusEngine is located. e.g., using numactland specifying that CPUs 16-127 cores are available for DBPlusEngine, modify the startup command:

numactl -C 16-127 java ...# 省略其他参数

Removing NIC Performance Bottlenecks #

DBPlusEngine is a network I/O intensive application, especially Proxy, which interacts with database clients and storage nodes at the same time, with frequent network sending and receiving. In this case, you need to avoid network interrupt processing from becoming a performance bottleneck for DBPlusEngine, especially if the NIC has a small queue, which can limit the maximum throughput of DBPlusEngine.

If the NIC has only a single queue and network interrupt processing reaches the upper limit (usually manifested by a CPU core with hi/si usage close to 100% and other CPU cores with relatively low usage), consider spreading network interrupt processing across multiple CPU cores by enabling irqbalance, etc.

Scenario suggestions #

Configure indexes for sharding keys #

In data paging scenarios, it is recommended to add indexes if the paging key is not the primary key to reduce the SQL execution latency conditional on the paging key.

Paging correction and paging query optimization #

For paged query SQL that can be routed to a single library/single table, DBPlusEngine will directly push the SQL down to the storage node for execution.

When the paged query SQL is routed to multiple libraries or tables, DBPlusEngine will rewrite the paging parameters of the statement to ensure the correctness of the query. Especially in the case of large offset queries, the efficiency of page-corrected SQL queries can be significantly reduced. For cases that may involve large offset queries, it is recommended not to use LIMIT method for paging. You can consider building a secondary index with the number of row records and row offset, or using the ID at the end of the last page data as the condition of the next query to avoid the high delay of SQL execution.

Avoid including unwanted data in SQL #

When writing SQL, users are recommended to precisely control the required columns and query conditions, which can effectively reduce the SQL execution overhead and network I/O overhead. The network topology of DBPlusEngine Proxy is more complex than the traditional direct connection to the database, and the network I/O overhead caused by transferring unnecessary data is more obvious.