Monitor and Alarm #

Description:

Monitoring can only be completed through SphereEx-Console.
Currently, there is no alarm function.

Cluster ecosystem tool monitoring #

Currently, SphereEx-Console and SphereEx-Boot have no monitoring function.

Monitoring of cluster components #

Host monitoring #

Monitoring indicators

Category	Subcategory	Indicator
Resource Overview	Resource Overview	System uptime
		CPU Cores
		Total Memory
		Total CPU Usage
		Total Memory Usage
		Space Usage for each
Performance Data	Performance Data	CPU usage
		Memory Statistics
		Network bandwidth (per s)
		System load (1, 5, 15 min)
		Disk read/write throughput (per s)
		Disk read/write rate IOPS
		IO operation ratio (per s)
		IO read/write time (per time)
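
To make a few of the indicators above concrete, the following is a minimal sketch that reads CPU cores, disk space usage, and the 1, 5, and 15 minute system load on the local machine. It uses only the Python standard library and is not how the monitoring plugin collects data; the function name host_snapshot is hypothetical.

```python
import os
import shutil


def host_snapshot(path="/"):
    """Read a few host indicators with the standard library only."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_percent = round(usage.used / usage.total * 100, 2) if usage.total else 0.0
    snapshot = {
        "cpu_cores": os.cpu_count(),
        "disk_used_percent": used_percent,
    }
    try:
        # Load averages over the last 1, 5 and 15 minutes (Unix-like systems only).
        load1, load5, load15 = os.getloadavg()
        snapshot["system_load"] = {"1min": load1, "5min": load5, "15min": load15}
    except (AttributeError, OSError):
        # Not available on this platform.
        snapshot["system_load"] = None
    return snapshot


if __name__ == "__main__":
    print(host_snapshot())
```

The values shown in SphereEx-Console come from the monitoring center, so they may differ slightly from a local reading taken at a different moment.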

View Monitoring
Applicable Scenarios
View host monitoring.
Notes
- The monitoring center is functioning normally.
- The host has been configured for monitoring.
Steps
1. Log in to SphereEx-Console.
2. Click Monitoring -> Hosts to enter the monitoring list.
3. Click the Monitoring button in the action column to view the monitoring indicators.
Configure Monitoring
Applicable Scenarios
Configure host monitoring.
Notes
- The monitoring center is functioning normally.
- The Governance Center host has been registered and has read-write permissions for the monitoring plugin installation directory.
Steps
1. Log in to SphereEx-Console.
2. Click Resources -> Hosts to enter the host list.
3. Click Configure Monitoring in the operation column to enter the monitoring configuration page.
4. Configure the monitoring information as follows.

Field	Data Source	Required/Optional	Description
Host IP	Previous page
Monitoring Center	User selection, monitoring center list	Required	Installed monitoring center
Monitoring plugin port	Default filled, user editable	Required	Default value is the default port corresponding to the monitoring plugin, which can be edited
Monitoring plugin installation directory	Default filled, user editable		Default value is the default installation directory corresponding to the monitoring plugin, which can be edited

Click Install to complete the installation of the monitoring plugin and the configuration of the monitoring center.

Database Monitoring #

Monitoring Metrics

Category	Subcategory	Metrics
Overview	Overview	Instance availability
		File open count
		Read-only secondary
		Master-slave delay
		Secondary SQL thread
		Secondary IO thread
		Slow query enabled
		Slow query threshold
Performance metrics	Performance metrics	QPS (Queries value / time within a specified time)
		TPS ((com_insert + com_delete + com_update + com_select) count / time)
		Inbound traffic
		Outbound traffic
		Number of slow queries
		Current number of connections
		Buffer pool utilization rate
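
The QPS and TPS definitions above are both rates computed from counters sampled at two points in time. A minimal sketch, assuming MySQL-style status counters (Queries, Com_insert, Com_delete, Com_update, Com_select) have already been read into dictionaries; the helper names are hypothetical and the sample values are made up for illustration.

```python
TPS_COUNTERS = ("Com_insert", "Com_delete", "Com_update", "Com_select")


def rate_per_second(previous, current, interval_seconds):
    """Per-second rate of an increasing counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (current - previous) / interval_seconds


def qps(prev, curr, interval_seconds):
    """QPS = change in Queries / elapsed time."""
    return rate_per_second(prev["Queries"], curr["Queries"], interval_seconds)


def tps(prev, curr, interval_seconds):
    """TPS = change in (insert + delete + update + select) counts / elapsed time."""
    total_prev = sum(prev[name] for name in TPS_COUNTERS)
    total_curr = sum(curr[name] for name in TPS_COUNTERS)
    return rate_per_second(total_prev, total_curr, interval_seconds)


if __name__ == "__main__":
    # Two illustrative samples taken 60 seconds apart.
    before = {"Queries": 1000, "Com_insert": 100, "Com_delete": 10,
              "Com_update": 40, "Com_select": 500}
    after = {"Queries": 1600, "Com_insert": 160, "Com_delete": 20,
             "Com_update": 70, "Com_select": 700}
    print(qps(before, after, 60), tps(before, after, 60))
```

Using the difference between two samples, rather than the raw counter, keeps the result independent of how long the instance has been running.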

Viewing Monitoring
Applicable Scenarios
View database instance monitoring
Notes
- The monitoring center is functioning normally.
- The database has been configured for monitoring.
Operation Steps
1. Log in to SphereEx-Console.
2. Click Monitoring->Database to enter the monitoring list.
3. Click the Monitoring button in the operation column to view the monitoring indicators.
Configuring Monitoring
Applicable Scenarios
Configure database monitoring
Notes
- The monitoring center is functioning normally.
- The Governance Center host has been registered and has read-write permissions for the monitoring plugin installation directory.
Operation Steps
1. Log in to SphereEx-Console.
2. Click Resource->Database to enter the database list.
3. Click Configure Monitoring in the operation column to enter the monitoring configuration page.
4. Configure the monitoring information as follows.

Field Name	Data Source	Optional/Required	Description
Monitoring Center	User-selected, List of Monitoring Centers	Required	Pre-installed monitoring centers
Monitoring Plugin Port	Default filled, User-editable	Required	The default port corresponding to the monitoring plug-in is filled by default, editable
Monitoring Plugin Installation Directory	Default filled, User-editable		Default installation directory for monitoring plugins, editable
Database Monitoring User	User input	Required
Database Monitoring Password	User input	Required	Password protected

Click Install to complete the installation of the monitoring plugin and configuration of the monitoring center.

Governance Center Monitoring #

Monitoring Metrics

Category	Subcategory	Metrics
Overview	Overview	ZK cluster node status
		Node Roles
		Follower number
Performance	Performance	Average response latency
		Maximum response latency
		Minimum response latency
		Packets received
		Packets sent
		Active connections
		Pending requests
		Primary-Secondary Status
		Znode number
		Watch number
		Temporary node count
		Approximate total data size
		Open file descriptor count
		Maximum file descriptor count
		Sync operations blocked
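
Most of the indicators above have a counterpart in the output of ZooKeeper's mntr four-letter command, which prints one tab-separated key and value per line. The following is a minimal sketch that parses such output; whether the monitoring plugin actually uses mntr is an assumption, and the mapping from indicator names to mntr keys is a hypothetical illustration.

```python
# Hypothetical mapping from indicator names in the table above to mntr keys.
INDICATOR_KEYS = {
    "Node role": "zk_server_state",
    "Follower number": "zk_followers",  # reported by the leader only
    "Average response latency": "zk_avg_latency",
    "Maximum response latency": "zk_max_latency",
    "Minimum response latency": "zk_min_latency",
    "Active connections": "zk_num_alive_connections",
    "Pending requests": "zk_outstanding_requests",
    "Znode number": "zk_znode_count",
    "Watch number": "zk_watch_count",
    "Temporary node count": "zk_ephemerals_count",
}


def parse_mntr(text):
    """Parse mntr output (key<TAB>value per line) into a dictionary."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        for cast in (int, float):
            try:
                stats[key.strip()] = cast(value)
                break
            except ValueError:
                continue
        else:
            stats[key.strip()] = value  # non-numeric, e.g. "leader"
    return stats


def indicators(stats):
    """Select the indicators from INDICATOR_KEYS that are present in stats."""
    return {label: stats[key] for label, key in INDICATOR_KEYS.items() if key in stats}


if __name__ == "__main__":
    sample = "zk_avg_latency\t0\nzk_max_latency\t12\nzk_server_state\tleader\nzk_followers\t2\n"
    print(indicators(parse_mntr(sample)))
```

Keys that only a leader reports are simply omitted from the result when the node is a follower.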

Viewing Monitoring Metrics
Applicable Scenarios
View monitoring metrics for Governance Center instances.
Note
- The monitoring center is functioning normally.
- Governance Center monitoring is configured.
Procedure
1. Log in to SphereEx-Console.
2. Click Monitoring -> Governance Center to enter the Governance Center list.
3. Click Monitoring Node in the Operations column to enter the monitoring list.
4. Click the Monitor button in the Operations column to view the monitoring metrics.
Configuring Monitoring
Applicable Scenarios
Configure monitoring for Governance Center.
Prerequisites
- The monitoring center is functioning normally.
- The Governance Center host has been registered and has read-write permissions for the monitoring plugin installation directory.
Procedure
1. Log in to SphereEx-Console.
2. Click Resources > Governance Center to enter the Governance Center list.
3. Click Configure Monitoring in the Operations column to enter the monitoring configuration page.
4. Configure the monitoring information as follows.

Fields	Data Source	Required/Optional	Description
Monitoring Center	User selection	Required	Only one monitoring center allowed per cluster
Governance Center Node IP	Auto-filled from the previous page	Required	Not editable
Governance Center Node Port	Auto-filled from the previous page	Required	Not editable
Monitoring Plugin Port	User-editable default value	Required	Default port for the monitoring plugin, editable
Monitoring Plugin Installation Directory	User-editable default value		Default installation directory for the monitoring plugin, editable

Click Install to complete the installation of the monitoring plugin and the configuration of the monitoring center.

Monitoring Center Monitoring #

Monitoring Metrics

Category	Subcategory	Metrics
Overview	Overview	Version
		Number of monitored instances
		Number of threads
		Last successful configuration reload time
		Whether the last configuration reload was successful
		Total number of chunks
		Number of created chunks
		Number of removed chunks
		Total number of samples

View Monitoring
Applicable Scenarios
View monitoring data of the monitoring center itself.
Note
- The monitoring center is functioning normally.
Steps
1. Log in to SphereEx-Console.
2. Click Monitoring > Monitoring Center to enter the monitoring list.
3. Click the Monitoring button in the operation column to view the monitoring indicators.
Configure Monitoring
The monitoring of the monitoring center is enabled by default when it is installed, and no additional configuration is required.

Log Center Monitoring #

Todo

This feature will be implemented in Console version 1.2.

Cluster Monitoring #

Monitoring indicators

Category	Subcategory	Metrics
Overview	Cluster Overview	Number of component nodes
		Number of storage nodes
		Number of compute nodes
		Number of governance center nodes
Metadata	Metadata	Number of logical databases
		Number of users
		Number of tables (logical tables + single tables)
		Number of sharded tables
		Number of broadcast tables
		Number of table groups
		Number of single tables
		Number of encrypted tables
		Number of plugins (excluding single table plugins)
Performance Data	Connection details	Number of routes (instant value, change value, change rate)
		Number of executions (instant value, change value, change rate)
		Number of parses (instant value, change value, change rate)
		Number of requests (instant value, change value, change rate)
		Number of connections (instant value)
	Performance Details	QPS
		TPS
		Number of request bytes
		Number of response bytes
		Response time
		Total number of transactions
		Number of committed transactions
		Number of rolled-back transactions
		Transaction rollback rate
		Connection duration (analysis)
		Request duration (analysis)
Parsing Engine	DML sql	Total number of inserts
		Total number of deletes
		Total number of updates
		Total number of selects
		SQL statistics (Insert\Delete\Update\Select)
	DDL sql	Total number of DDLs
		Total number of DCLs
		Total number of DALs
		Total number of TCLs
		SQL statistics (DDL\DCL\DAL\TCL)
	DistSQL	Total number of RQLs
		Total number of RDLs
		Total number of RALs
		DistSQL statistics (RQL\RDL\RAL)
		Parsing duration
Routing Engine	Routing Engine	Data source routing
Routing Engine	Routing Engine	Top 10 table routing analysis
Thread Status	Thread Status	Current number of threads
		Number of daemon threads
		Peak number of threads
		Total number of thread starts
		Number of thread deadlocks
		JVM thread state data
Errors	Errors	Number of errors
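
Several indicators above are reported as an instant value, a change value, and a change rate, and the transaction indicators include a rollback rate. A minimal sketch of how these could be derived from raw counters, assuming the rollback rate is defined as rolled-back transactions divided by committed plus rolled-back transactions; the function names are hypothetical.

```python
def change_stats(previous, current, interval_seconds):
    """Instant value, change value and change rate between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    change = current - previous
    return {
        "instant": current,
        "change": change,
        "rate": change / interval_seconds,
    }


def rollback_rate(committed, rolled_back):
    """Share of finished transactions that were rolled back (assumed definition)."""
    total = committed + rolled_back
    return rolled_back / total if total else 0.0


if __name__ == "__main__":
    # Illustrative numbers: a request counter sampled twice, 30 seconds apart.
    print(change_stats(100, 160, 30))
    print(rollback_rate(committed=95, rolled_back=5))
```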

View monitoring
Applicable scenarios
View cluster monitoring.
Notes
- Monitoring center is functioning normally.
- Governance Center has already been configured for monitoring.
Operation steps:
1. Log in to SphereEx-Console.
2. Click Monitoring->Cluster to enter the monitoring list.
3. Click the Monitoring button in the operation column to view the monitoring indicators.
Configure monitoring
Applicable scenarios
Configure cluster monitoring. In effect, this configures monitoring of the compute nodes.
Notes
- Monitoring center is functioning normally.
- The host for the Governance Center has been registered and has read and write permissions for the monitoring plugin installation directory.
Operation steps
1. Log in to SphereEx-Console.
2. Click Cluster Management->Cluster to enter the cluster list.
3. Click Configure Monitoring in the operation column to enter the monitoring configuration page.
4. Configure the monitoring information as follows.

Field Name	Data Source	Optional/Required	Description
Compute Node IP	Input on the previous page	Required	Not editable
Monitoring Plugin Port	User input	Required	For self-built clusters, the monitoring plugin port is automatically filled and not allowed to be modified. For registered clusters, the user can fill in the port.

Click on Add Monitoring Configuration to complete the monitoring center configuration.

Compute node monitoring #

Monitoring indicators

Category	Subcategory	Metrics
Overview	Overview	JDK version info
		Start time
		Compute node version
		Running status
		Running time
		Used memory size
		Used cache pool size
Performance	Performance	Total requests
		Total routes
		Total executions
		Total parsing
		Number of requests
		Number of parses
		Number of routes
		Number of executions
		QPS
		TPS
		Transaction rollback rate
Metadata	Metadata	Number of logical databases
		Number of users
		Number of tables (logical + single)
		Number of sharded tables
Parsing engine	DML sql	Total Insert
		Total Delete
		Total Update
		Total Select
		SQL statistics (Insert\Delete\Update\Select)
	DDL sql	Total DDL
		Total DCL
		Total DAL
		Total TCL
		SQL statistics (DDL\DCL\DAL\TCL)
	DistSQL	Total RQL
		Total RDL
		Total RAL
		DistSQL statistics (RQL\RDL\RAL)
		Parsing time
Routing engine	Routing engine	Data source routing
Routing engine	Routing engine	Top 10 table routing analysis
Thread status	Thread status	Current thread count
		Number of daemon threads
		Peak thread count
		Total thread startup
		Thread deadlock count
		JVM thread status data
Error statistics	Error statistics	Number of errors

View monitoring
Application Scenario
View compute node monitoring.
Notes
- Monitoring center is functioning normally.
- Governance Center has already been configured for monitoring.
Steps
1. Log in to SphereEx-Console.
2. Click Monitoring -> Cluster to enter the cluster list.
3. Click Node Monitoring in the operation column to enter the compute node list.
4. Click Monitoring in the operation column to view monitoring indicators.
Configure monitoring
Compute node monitoring is configured by configuring cluster monitoring.

Storage node monitoring #

Currently, monitoring of storage nodes is not supported.

Agent management #

Introduction to Agent #

Background

Observing the running state of a cluster is a new challenge when trying to understand the status of a distributed system. The point-to-point approach of logging in to a specific server does not scale to a large number of distributed servers. Telemetry based on observable data is the recommended operation and maintenance approach for such systems. Tracing, metrics, and logging are the main ways to obtain observable data about system status.

APM (application performance monitoring) monitors and diagnoses system performance by collecting, storing, and analyzing the system's observable data. Its main functions include performance indicator monitoring, call stack analysis, and service topology. DBPlusEngine is not responsible for gathering, storing, or displaying APM data; it only provides the necessary information to the APM system. In other words, DBPlusEngine is only responsible for generating valuable data and submitting it to the relevant systems through standard protocols or plugins.

Tracing obtains tracking information about SQL parsing and SQL execution. DBPlusEngine supports SkyWalking, Zipkin, Jaeger, and OpenTelemetry by default. It also allows users to develop customized components through plugins.

Use OpenTelemetry

OpenTelemetry was formed by merging OpenTracing and OpenCensus in 2019. To use it, fill in the appropriate configuration in the agent configuration file according to the OpenTelemetry SDK Autoconfigure Guide. Data can then be exported to Jaeger or Zipkin.

Use SkyWalking

Working with the Apache SkyWalking team, the DBPlusEngine team has implemented a ShardingSphere automatic monitoring probe that automatically sends performance data to SkyWalking. Note that this automatic probe cannot be used together with the DBPlusEngine plugin probe.

Metrics are used to collect and display statistical indicators of the cluster. DBPlusEngine supports Prometheus by default.

Challenges

Tracing and metrics need to collect system information through event tracking. A large amount of event tracking code makes the kernel code messy, difficult to maintain, and difficult to customize and extend.

Goal

The goal of the Apache ShardingSphere observability module is to provide as many performance and statistical indicators as possible while keeping the kernel code isolated from the embedded tracking code.

Core concepts

Agent

The agent provides tracing, metrics, and logging features based on bytecode enhancement and a plugin design.

APM is an acronym for Application Performance Monitoring. Focusing on the performance diagnosis of distributed systems, its main functions include call chain display, application topology analysis, etc.

Tracing

Tracing data between distributed services or internal processes is collected by the agent and then sent to third-party APM systems.

Metrics

System statistical indicators are collected through probes, written to a time-series database, and displayed by third-party applications.

Agent Configuration #

Agent Configuration

Directory Structure

Create an agent directory and extract the agent-bin package into the agent directory.

mkdir agent
tar -zxvf apache-shardingsphere-${latest.release.version}-shardingsphere-agent-bin.tar.gz -C agent
cd agent
tree 
.
├── LICENSE
├── NOTICE
├── README.txt
├── conf
│   └── agent.yaml
├── plugins
│   ├── lib
│   │   ├── shardingsphere-agent-metrics-core-${latest.release.version}-SNAPSHOT.jar
│   │   └── shardingsphere-agent-plugin-core-${latest.release.version}-SNAPSHOT.jar
│   ├── logging
│   │   └── shardingsphere-agent-logging-file-${latest.release.version}-SNAPSHOT.jar
│   ├── metrics
│   │   └── shardingsphere-agent-metrics-prometheus-${latest.release.version}-SNAPSHOT.jar
│   └── tracing
│       ├── shardingsphere-agent-tracing-opentelemetry-${latest.release.version}-SNAPSHOT.jar
│       └── shardingsphere-agent-tracing-opentracing-${latest.release.version}-SNAPSHOT.jar
├── shardingsphere-agent-${latest.release.version}-SNAPSHOT.jar
└── template
    ├── dbplusengine-driver-grafana-template.json
    └── dbplusengine-proxy-grafana-template.json

Configuration

The file conf/agent.yaml is the configuration file. The available plugins include Jaeger, OpenTracing, Zipkin, OpenTelemetry, Logging, and Prometheus. In the sample below, the logging plugin is enabled, while the metrics and tracing plugins are commented out; to enable a plugin, uncomment its block.

plugins:
  logging:
    File:
      props:
        slow-query-log: true
        long-query-time: 5000
        general-query-log: true
  # metrics:
  #   Prometheus:
  #     host: "localhost"
  #     port: 9090
  #     props:
  #       jvm-information-collector-enabled: "true"
  # tracing:
  #   OpenTelemetry:
  #     props:
  #       otel.service.name: "shardingsphere"
  #       otel.traces.exporter: "jaeger"
  #       otel.exporter.otlp.traces.endpoint: "http://localhost:14250"
  #       otel.traces.sampler: "always_on"

Parameter Description

Name	Description	Range	Default Value
slow-query-log	Whether to enable the slow query log	true, false	true
long-query-time	Slow query threshold (ms)	Positive integer	5000
general-query-log	Whether to enable the full query log	true, false	true
jvm-information-collector-enabled	Whether to enable the JVM collector	true, false	true
otel.service.name	Service name for tracing	String	shardingsphere
otel.traces.exporter	Exporter	jaeger, zipkin	jaeger
otel.exporter.otlp.traces.endpoint	Address that data is sent to	Actual address of the data receiver	http://localhost:14250
otel.traces.sampler	Sampler	always_on	always_on

For more information about OpenTelemetry, refer to the OpenTelemetry SDK Autoconfigure Guide.
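
The ranges in the parameter table can be checked before the agent is started. A minimal sketch, assuming the props values have already been loaded into a Python dictionary (for example by a YAML parser); the RULES table and the function invalid_props are hypothetical helpers and not part of the agent.

```python
def _is_bool(value):
    """Accept real booleans as well as the strings "true" and "false"."""
    return str(value).lower() in ("true", "false")


def _is_positive_int(value):
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


# Checks derived from the Range column of the parameter table.
RULES = {
    "slow-query-log": _is_bool,
    "long-query-time": _is_positive_int,
    "general-query-log": _is_bool,
    "jvm-information-collector-enabled": _is_bool,
    "otel.traces.exporter": lambda value: value in ("jaeger", "zipkin"),
}


def invalid_props(props):
    """Return the sorted names of props whose values fall outside the documented range."""
    return sorted(
        name for name, value in props.items()
        if name in RULES and not RULES[name](value)
    )


if __name__ == "__main__":
    print(invalid_props({"slow-query-log": True, "long-query-time": 5000}))
    print(invalid_props({"long-query-time": -1, "otel.traces.exporter": "console"}))
```

Props that are not listed in the table are ignored rather than rejected, since the agent may accept options this sketch does not know about.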

Using Agent in DBPlusEngine-Proxy #

Start the Agent together with the Proxy

bin/start.sh -g

Normal startup logs can be viewed in the corresponding DBPlusEngine-Proxy logs. After accessing the Proxy, metrics and tracing data can be viewed through the configured address.
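
To check the metrics output from a script, the configured address can be read and parsed as Prometheus text format. A minimal sketch, assuming the Prometheus plugin is enabled with the host and port from the sample agent.yaml and serves its data under the conventional /metrics path (the path is an assumption); the function names and the type label in the demo are hypothetical.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text format into {"name{labels}": value}.

    Assumes sample lines carry no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def fetch_metrics(url="http://localhost:9090/metrics", timeout=5):
    """Download and parse the metrics page (URL is an assumption, see above)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


if __name__ == "__main__":
    demo = (
        "# TYPE proxy_requests_total counter\n"
        "proxy_requests_total 42.0\n"
        'parsed_sql_total{type="SELECT"} 7.0\n'
    )
    print(parse_metrics(demo))
```

Separating parsing from fetching makes it possible to test the parser on saved output without a running Proxy.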

Agent Metrics

Metrics	Description	Metrics Types
build_info	Build version information	GAUGE
proxy_info	Running information; the type tag distinguishes the kind of value: boot-time is the timestamp of the startup time, boot-duration is the startup duration (in milliseconds), and uptime is the elapsed running time (in milliseconds)	GAUGE
proxy_meta_data_info	Metadata information	GAUGE
proxy_state	Status information: 0 normal, 1 circuit break (fuse), 2 lock	GAUGE
proxy_current_connections	Current number of client connections	GAUGE
parsed_sql_total	Number of SQL statements parsed, distinguished by type: INSERT, DELETE, UPDATE, SELECT, DDL, DCL, DAL, TCL, RQL, RDL, RAL, RUL	COUNTER
routed_sql_total	Number of SQL statements routed, distinguished by type: INSERT, DELETE, UPDATE, SELECT	COUNTER
routed_result_total	Number of routing results, counting the storage nodes and tables routed to	COUNTER
proxy_transactions_total	Total number of transactions, categorized by commit, rollback, autocommit	COUNTER
proxy_execute_errors_total	Number of execution exceptions, by exception type	COUNTER
proxy_requests_total	Number of requests received	COUNTER
proxy_execute_total	Total number of executions (counts the SQL statements executed after being routed to storage nodes)	COUNTER
proxy_execute_error_total	Number of execution exceptions	COUNTER
proxy_request_bytes_total	Number of request bytes	COUNTER
proxy_response_bytes_total	Number of response bytes	COUNTER
parse_sql_latency_millis	SQL parsing time (ms)	HISTOGRAM
route_sql_latency_millis	SQL routing time (ms)	HISTOGRAM
proxy_execute_latency_millis	Execution time (ms)	HISTOGRAM
commit_sql_count_histogram	Distribution of the number of SQL statements in each committed transaction	HISTOGRAM
rollback_sql_count_histogram	Distribution of the number of SQL statements in each rolled-back transaction	HISTOGRAM
proxy_connection_usage_seconds	Distribution of connection duration to the Proxy	HISTOGRAM
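
Prometheus histograms expose a running sum and an observation count in addition to their buckets, so a mean can be derived from the HISTOGRAM metrics above. A minimal sketch, assuming the samples have already been parsed into a dictionary keyed by series name and that the series carry no labels; the function name is hypothetical and the numbers are illustrative.

```python
def mean_from_histogram(samples, metric):
    """Mean observed value of a Prometheus histogram: <metric>_sum / <metric>_count."""
    count = samples.get(f"{metric}_count", 0.0)
    if not count:
        return None  # no observations yet
    return samples.get(f"{metric}_sum", 0.0) / count


if __name__ == "__main__":
    samples = {
        "proxy_execute_latency_millis_sum": 1500.0,
        "proxy_execute_latency_millis_count": 300.0,
    }
    print(mean_from_histogram(samples, "proxy_execute_latency_millis"))
```

The mean covers everything since the counters were last reset; to get a mean for a recent window, apply the same division to the differences between two samples.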

Using Agent in DBPlusEngine-Driver #

Start Agent

1. Prepare a project that integrates DBPlusEngine-Driver, such as a Spring Boot project.
2. Add the javaagent configuration at startup.

java -javaagent:/xxx/shardingsphere-agent-${latest.release.version}.jar -jar spring-boot-dbplusengine-driver-test.jar
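
When the application is launched from a script, the same command line can be assembled programmatically. A minimal sketch; the function name javaagent_command and the paths in the demo are hypothetical placeholders, and the real agent jar path depends on where the agent package was extracted.

```python
import shlex


def javaagent_command(agent_jar, app_jar):
    """Build the argument list for starting a jar with a Java agent attached.

    The -javaagent option must come before -jar: anything after the
    application jar is passed to the application, not to the JVM.
    """
    return ["java", f"-javaagent:{agent_jar}", "-jar", app_jar]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    command = javaagent_command("/path/to/agent.jar", "app.jar")
    print(shlex.join(command))
    # To actually start the process: subprocess.run(command, check=True)
```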

Agent Metrics

Metrics	Description	Metrics Types
build_info	Build version information	GAUGE
jdbc_state	Status information: 0 normal, 1 circuit break (fuse), 2 lock	GAUGE
jdbc_meta_data_info	Metadata information	GAUGE
parsed_sql_total	Number of SQL statements parsed, distinguished by type: INSERT, DELETE, UPDATE, SELECT, DDL, DCL, DAL, TCL, RQL, RDL, RAL, RUL	COUNTER
routed_sql_total	Number of SQL statements routed, distinguished by type: INSERT, DELETE, UPDATE, SELECT	COUNTER
routed_result_total	Number of routing results, counting the storage nodes and tables routed to	COUNTER
jdbc_statement_execute_total	Number of SQL statements executed	COUNTER
jdbc_statement_execute_errors_total	Number of execution exceptions	COUNTER
jdbc_transactions_total	Number of transactions, categorized by commit, rollback, autocommit	COUNTER
jdbc_statement_execute_latency_millis	SQL execution time (ms)	HISTOGRAM
parse_sql_latency_millis	SQL parsing time (ms)	HISTOGRAM
route_sql_latency_millis	SQL routing time (ms)	HISTOGRAM