Monitor and Alarm #

Description:

  1. Monitoring can only be completed through SphereEx-Console.
  2. Currently, there is no alarm function.

Cluster ecosystem tool monitoring #

Currently, SphereEx-Console and SphereEx-Boot have no monitoring function.

Monitoring of cluster components #

Host monitoring #

  • Monitoring indicators
| Category | Subcategory | Indicator |
|---|---|---|
| Resource Overview | Resource Overview | System uptime |
| | | CPU Cores |
| | | Total Memory |
| | | Total CPU Usage |
| | | Total Memory Usage |
| | | Space Usage for each |
| Performance Data | Performance Data | CPU usage |
| | | Memory Statistics |
| | | Network bandwidth (per s) |
| | | System load (1, 5, 15 min) |
| | | Disk read/write throughput (per s) |
| | | Disk read/write rate IOPS |
| | | IO operation ratio (per s) |
| | | IO read/write time (per time) |
  • View Monitoring

    Applicable Scenarios

    View host monitoring.

    Notes

    • The monitoring center is functioning normally.
    • The host has been configured for monitoring.

    Steps

    1. Log in to SphereEx-Console.
    2. Click Monitoring -> Hosts to enter the monitoring list.
    3. Click the Monitoring button in the action column to view the monitoring indicators.
  • Configure Monitoring

    Applicable Scenarios

    Configure host monitoring.

    Notes

    • The monitoring center is functioning normally.
    • The Governance Center host has been registered and has read-write permissions for the monitoring plugin installation directory.

    Steps

    1. Log in to SphereEx-Console.
    2. Click Resources -> Hosts to enter the host list.
    3. Click Configure Monitoring in the operation column to enter the monitoring configuration page.
    4. Configure the monitoring information as follows.
| Field | Data Source | Required/Optional | Description |
|---|---|---|---|
| Host IP | Previous page | | |
| Monitoring Center | User selection, monitoring center list | Required | Installed monitoring center |
| Monitoring plugin port | Default filled, user editable | Required | Default value is the default port of the monitoring plugin, which can be edited |
| Monitoring plugin installation directory | Default filled, user editable | | Default value is the default installation directory of the monitoring plugin, which can be edited |

    5. Click Install to complete the installation of the monitoring plugin and the configuration of the monitoring center.

Database Monitoring #

  • Monitoring Metrics
| Category | Subcategory | Metrics |
|---|---|---|
| Overview | Overview | Instance availability |
| | | File open count |
| | | Read-only secondary |
| | | Master-slave delay |
| | | Secondary SQL thread |
| | | Secondary IO thread |
| | | Slow query enabled |
| | | Slow query threshold |
| Performance metrics | Performance metrics | QPS (Queries value / time within a specified time) |
| | | TPS ((com_insert + com_delete + com_update + com_select) count / time) |
| | | Inbound traffic |
| | | Outbound traffic |
| | | Number of slow queries |
| | | Current number of connections |
| | | Buffer pool utilization rate |
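
The QPS and TPS definitions in the table can be sketched as follows. This is only an illustration and not the monitoring plugin's actual code: it assumes the underlying values are cumulative counters sampled at two points in time, and `StatusSample` is a hypothetical structure introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class StatusSample:
    """One snapshot of cumulative counters (hypothetical structure)."""
    timestamp: float  # seconds
    queries: int
    com_insert: int
    com_delete: int
    com_update: int
    com_select: int


def _elapsed(prev: StatusSample, curr: StatusSample) -> float:
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return elapsed


def qps(prev: StatusSample, curr: StatusSample) -> float:
    """Queries counted during the interval, divided by the interval length."""
    return (curr.queries - prev.queries) / _elapsed(prev, curr)


def tps(prev: StatusSample, curr: StatusSample) -> float:
    """(com_insert + com_delete + com_update + com_select) delta / time."""
    def statements(s: StatusSample) -> int:
        return s.com_insert + s.com_delete + s.com_update + s.com_select

    return (statements(curr) - statements(prev)) / _elapsed(prev, curr)


prev = StatusSample(0.0, 1000, 10, 5, 20, 100)
curr = StatusSample(10.0, 1500, 30, 10, 40, 300)
print(qps(prev, curr), tps(prev, curr))  # prints 50.0 24.5
```

Working on counter differences rather than absolute values keeps the result independent of how long the instance has been running.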
  • Viewing Monitoring

    Applicable Scenarios

    View database instance monitoring.

    Notes

    • The monitoring center is functioning normally.
    • The database has been configured for monitoring.

    Operation Steps

    1. Log in to SphereEx-Console.
    2. Click Monitoring -> Database to enter the monitoring list.
    3. Click the Monitoring button in the operation column to view the monitoring indicators.
  • Configuring Monitoring

    Applicable Scenarios

    Configure database monitoring

    Notes

    • The monitoring center is functioning normally.
    • The Governance Center host has been registered and has read-write permissions for the monitoring plugin installation directory.

    Operation Steps

    1. Log in to SphereEx-Console.
    2. Click Resources -> Database to enter the database list.
    3. Click Configure Monitoring in the operation column to enter the monitoring configuration page.
    4. Configure the monitoring information as follows.
| Field Name | Data Source | Optional/Required | Description |
|---|---|---|---|
| Monitoring Center | User-selected, list of monitoring centers | Required | Pre-installed monitoring centers |
| Monitoring Plugin Port | Default filled, user-editable | Required | Filled with the default port of the monitoring plugin, editable |
| Monitoring Plugin Installation Directory | Default filled, user-editable | | Default installation directory for monitoring plugins, editable |
| Database Monitoring User | User input | Required | |
| Database Monitoring Password | User input | Required | Password protected |

    5. Click Install to complete the installation of the monitoring plugin and configuration of the monitoring center.

Governance Center Monitoring #

  • Monitoring Metrics
| Category | Subcategory | Metrics |
|---|---|---|
| Overview | Overview | ZK cluster node status |
| | | Node Roles |
| | | Follower number |
| Performance | Performance | Average response latency |
| | | Maximum response latency |
| | | Minimum response latency |
| | | Packets received |
| | | Packets sent |
| | | Active connections |
| | | Pending requests |
| | | Primary-Secondary Status |
| | | Znode number |
| | | Watch number |
| | | Temporary node count |
| | | Approximate total data size |
| | | Open file descriptor count |
| | | Maximum file descriptor count |
| | | Sync operations blocked |
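
Many of these metrics resemble the key/value pairs reported by ZooKeeper's `mntr` four-letter command (for example `zk_avg_latency` or `zk_znode_count`). The document does not say how the monitoring plugin collects them, so the mapping below is an assumption. The sketch only shows how such whitespace-separated output could be parsed; `SAMPLE_MNTR` is made-up data.

```python
from typing import Dict

# Made-up sample in the "key<TAB>value" layout used by `mntr`.
SAMPLE_MNTR = (
    "zk_avg_latency\t0\n"
    "zk_max_latency\t12\n"
    "zk_min_latency\t0\n"
    "zk_server_state\tleader\n"
    "zk_znode_count\t42\n"
    "zk_followers\t2\n"
)


def parse_mntr(text: str) -> Dict[str, str]:
    """Split each non-empty line into a key and the rest of the line."""
    metrics: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            metrics[parts[0]] = parts[1]
    return metrics


metrics = parse_mntr(SAMPLE_MNTR)
print(metrics["zk_server_state"], int(metrics["zk_znode_count"]))
```

Values are kept as strings because some of them, such as the server state, are not numeric.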
  • Viewing Monitoring Metrics

    Applicable Scenarios

    View monitoring metrics for Governance Center instances.

    Note

    • The monitoring center is functioning normally.
    • Governance Center monitoring is configured.

    Procedure

    1. Log in to SphereEx-Console.
    2. Click Monitoring -> Governance Center to enter the Governance Center list.
    3. Click Monitoring Node in the Operations column to enter the monitoring list.
    4. Click the Monitor button in the Operations column to view the monitoring metrics.
  • Configuring Monitoring

    Applicable Scenarios

    Configure monitoring for Governance Center.

    Prerequisites

    • The monitoring center is functioning normally.
    • The Governance Center host has been registered and has read-write permissions for the monitoring plugin installation directory.

    Procedure

    1. Log in to SphereEx-Console.
    2. Click Resources > Governance Center to enter the host list.
    3. Click Configure Monitoring in the Operations column to enter the monitoring configuration page.
    4. Configure the monitoring information as follows.
| Field | Data Source | Required/Optional | Description |
|---|---|---|---|
| Monitoring Center | User-selected | Required | Only one monitoring center allowed per cluster |
| Governance Center Node IP | Auto-filled from the previous page | Required | Not editable |
| Governance Center Node Port | Auto-filled from the previous page | Required | Not editable |
| Monitoring Plugin Port | User-editable default value | Required | Default port for the monitoring plugin, editable |
| Monitoring Plugin Installation Directory | User-editable default value | | Default installation directory for the monitoring plugin, editable |

    5. Click Install to complete the installation of the monitoring plugin and the configuration of the monitoring center.

Monitoring Center Monitoring #

  • Monitoring Metrics
| Category | Subcategory | Metrics |
|---|---|---|
| Overview | Overview | Version |
| | | Number of monitored instances |
| | | Number of threads |
| | | Last successful configuration reload time |
| | | Whether the last configuration reload was successful |
| | | Total number of chunks |
| | | Number of created chunks |
| | | Number of removed chunks |
| | | Total number of samples |
  • View Monitoring

    Applicable Scenarios

    View monitoring data of the monitoring center itself.

    Note

    • The monitoring center is functioning normally.

    Steps

    1. Log in to SphereEx-Console.
    2. Click Monitoring > Monitoring Center to enter the monitoring list.
    3. Click the Monitoring button in the operation column to view the monitoring indicators.
  • Configure Monitoring

    The monitoring of the monitoring center is enabled by default when it is installed, and no additional configuration is required.

Log Center Monitoring #

Todo

This feature will be implemented in Console version 1.2.

Cluster Monitoring #

  • Monitoring indicators
| Category | Subcategory | Metrics |
|---|---|---|
| Overview | Cluster Overview | Number of component nodes |
| | | Number of storage nodes |
| | | Number of compute nodes |
| | | Number of governance center nodes |
| Metadata | Metadata | Number of logical databases |
| | | Number of users |
| | | Number of tables (logical tables + single tables) |
| | | Number of sharded tables |
| | | Number of broadcast tables |
| | | Number of table groups |
| | | Number of single tables |
| | | Number of encrypted tables |
| | | Number of plugins (excluding single table plugins) |
| Performance Data | Connection details | Number of routes (instant value, change value, change rate) |
| | | Number of executions (instant value, change value, change rate) |
| | | Number of parses (instant value, change value, change rate) |
| | | Number of requests (instant value, change value, change rate) |
| | | Number of connections (instant value) |
| | Performance Details | QPS |
| | | TPS |
| | | Number of request bytes |
| | | Number of response bytes |
| | | Response time |
| | | Total number of transactions |
| | | Number of committed transactions |
| | | Number of rolled-back transactions |
| | | Transaction rollback rate |
| | | Connection duration (analysis) |
| | | Request duration (analysis) |
| Parsing Engine | DML SQL | Total number of inserts |
| | | Total number of deletes |
| | | Total number of updates |
| | | Total number of selects |
| | | SQL statistics (Insert/Delete/Update/Select) |
| | DDL SQL | Total number of DDLs |
| | | Total number of DCLs |
| | | Total number of DALs |
| | | Total number of TCLs |
| | | SQL statistics (DDL/DCL/DAL/TCL) |
| | DistSQL | Total number of RQLs |
| | | Total number of RDLs |
| | | Total number of RALs |
| | | DistSQL statistics (RQL/RDL/RAL) |
| | | Parsing duration |
| Routing Engine | Routing Engine | Data source routing |
| | | Top 10 table routing analysis |
| Thread Status | Thread Status | Current number of threads |
| | | Number of daemon threads |
| | | Peak number of threads |
| | | Total number of thread starts |
| | | Number of thread deadlocks |
| | | JVM thread state data |
| Errors | Errors | Number of errors |
| | | Numbers of errors |
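
Several indicators above are reported as an instant value, a change value and a change rate, and the transaction rollback rate is derived from two counters. The document does not define these formulas, so the sketch below is one plausible reading under stated assumptions: the change value is the difference between two samples, the change rate is that difference per second, and the rollback rate is rolled-back transactions over committed plus rolled-back transactions. The function names are hypothetical.

```python
from typing import Dict


def change_stats(prev_value: float, curr_value: float,
                 interval_seconds: float) -> Dict[str, float]:
    """Derive instant value, change value and change rate from two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    change = curr_value - prev_value
    return {
        "instant": curr_value,
        "change": change,
        "rate": change / interval_seconds,
    }


def rollback_rate(committed: int, rolled_back: int) -> float:
    """Share of finished transactions that were rolled back (assumed formula)."""
    finished = committed + rolled_back
    return rolled_back / finished if finished else 0.0


print(change_stats(100, 160, 30))
print(rollback_rate(95, 5))
```

Returning 0.0 when no transaction has finished avoids a division by zero on an idle cluster.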
  • View monitoring

    Applicable scenarios

    View cluster monitoring.

    Notes

    • Monitoring center is functioning normally.
    • Governance Center has already been configured for monitoring.

    Operation steps

    1. Log in to SphereEx-Console.
    2. Click Monitoring -> Cluster to enter the monitoring list.
    3. Click the Monitoring button in the operation column to view the monitoring indicators.
  • Configure monitoring

    Applicable scenarios

    Configure cluster monitoring, which actually configures the monitoring of the compute nodes.

    Notes

    • Monitoring center is functioning normally.
    • The host for the Governance Center has been registered and has read and write permissions for the monitoring plugin installation directory.

    Operation steps

    1. Log in to SphereEx-Console.
    2. Click Cluster Management -> Cluster to enter the cluster list.
    3. Click Configure Monitoring in the operation column to enter the monitoring configuration page.
    4. Configure the monitoring information as follows.
| Field Name | Data Source | Optional/Required | Description |
|---|---|---|---|
| Compute Node IP | Input on the previous page | Required | Not editable |
| Monitoring Plugin Port | User input | Required | For self-built clusters, the monitoring plugin port is automatically filled and cannot be modified. For registered clusters, the user can fill in the port. |

    5. Click Add Monitoring Configuration to complete the monitoring center configuration.

Compute node monitoring #

  • Monitoring indicators
CategorySubcategoryMetrics
OverviewOverviewJDK version info
Start time
Compute node version
Running status
Running time
Used memory size
Used cache pool size
PerformancePerformanceTotal requests
Total routes
Total executions
Total parsing
Number of requests
Number of parses
Number of routes
Number of executions
QPS
TPS
Transaction rollback rate
MetadataMetadataNumber of logical databases
Number of users
Number of tables (logical + single)
Number of sharded tables
Parsing engineDML sqlTotal Insert
Total Delete
Total Update
Total Select
SQL statistics (Insert\Delete\Update\Select)
DDL sqlTotal DDL
Total DCL
Total DAL
Total TCL
SQL statistics (DDL\DCL\DAL\TCL)
DistSQLTotal RQL
Total RDL
Total RAL
DistSQL statistics (RQL\RDL\RAL)
Parsing time
Routing engineRouting engineData source routing
Top 10 table routing analysis
Thread statusThread statusCurrent thread count
Number of daemon threads
Peak thread count
Total thread startup
Thread deadlock count
JVM thread status data
Error statisticsError statisticsNumbers of errors
Numbers of errors
  • View monitoring

    Applicable scenarios

    View compute node monitoring.

    Notes

    • Monitoring center is functioning normally.
    • Governance Center has already been configured for monitoring.

    Steps

    1. Log in to SphereEx-Console.
    2. Click Monitoring -> Cluster to enter the cluster list.
    3. Click Node Monitoring in the operation column to enter the compute node list.
    4. Click Monitoring in the operation column to view monitoring indicators.
  • Configure monitoring

    Compute node monitoring is configured by configuring cluster monitoring.

Storage node monitoring #

Currently, monitoring of storage nodes is not supported.

Agent management #

Introduction to Agent #

Background

Observing the running state of a cluster in order to grasp the status of a distributed system is a new challenge. The point-to-point operation mode of logging in to a specific server does not suit a large number of distributed servers. Telemetry through observable data is the recommended operation and maintenance mode for them. Tracing, metrics and logging are important ways to obtain observable data about system status.

APM (application performance monitoring) monitors and diagnoses the performance of a system by collecting, storing and analyzing its observable data. Its main functions include performance indicator monitoring, call stack analysis, service topology, etc.

DBPlusEngine is not responsible for gathering, storing and displaying APM data, but provides the necessary information for the APM. In other words, DBPlusEngine is only responsible for generating valuable data and submitting it to relevant systems through standard protocols or plug-ins.

Tracing obtains the tracking information of SQL parsing and SQL execution. DBPlusEngine provides support for SkyWalking, Zipkin, Jaeger and OpenTelemetry by default. It also allows users to develop customized components through plug-ins.

  • Use OpenTelemetry

OpenTelemetry was formed in 2019 by merging OpenTracing and OpenCensus. To use it, fill in the appropriate configuration in the agent configuration file according to the OpenTelemetry SDK Autoconfigure Guide. Data can then be exported to Jaeger or Zipkin.

  • Use SkyWalking

In cooperation with the Apache SkyWalking team, the DBPlusEngine team has implemented a ShardingSphere automatic monitoring probe that automatically sends performance data to SkyWalking. Note that this automatic probe cannot be used together with the DBPlusEngine plug-in probe.

Metrics are used to collect and display statistical indicators of the cluster. DBPlusEngine supports Prometheus by default.

Challenges

Tracing and metrics need to collect system information through event tracking. A large amount of event tracking makes the kernel code messy, difficult to maintain, and difficult to customize and extend.

Goal

The goal of the Apache ShardingSphere observability module is to provide as many performance and statistical indicators as possible while isolating kernel code from embedded code.

Core concepts

  • Agent

The agent is based on bytecode enhancement and a plugin design, and provides tracing, metrics and logging features.

  • APM

APM is an acronym for Application Performance Monitoring. Focusing on the performance diagnosis of distributed systems, its main functions include call chain display, application topology analysis, etc.

  • Tracing

Tracing data between distributed services or internal processes will be collected by agent. It will then be sent to third-party APM systems.

  • Metrics

System statistical indicators are collected through probes, written to a time-series database, and displayed by third-party applications.

Agent Configuration #

  • Directory Structure

Create an agent directory and extract the agent-bin package into the agent directory.

mkdir agent
tar -zxvf apache-shardingsphere-${latest.release.version}-shardingsphere-agent-bin.tar.gz -C agent
cd agent
tree 
.
├── LICENSE
├── NOTICE
├── README.txt
├── conf
│   └── agent.yaml
├── plugins
│   ├── lib
│   │   ├── shardingsphere-agent-metrics-core-${latest.release.version}-SNAPSHOT.jar
│   │   └── shardingsphere-agent-plugin-core-${latest.release.version}-SNAPSHOT.jar
│   ├── logging
│   │   └── shardingsphere-agent-logging-file-${latest.release.version}-SNAPSHOT.jar
│   ├── metrics
│   │   └── shardingsphere-agent-metrics-prometheus-${latest.release.version}-SNAPSHOT.jar
│   └── tracing
│       ├── shardingsphere-agent-tracing-opentelemetry-${latest.release.version}-SNAPSHOT.jar
│       └── shardingsphere-agent-tracing-opentracing-${latest.release.version}-SNAPSHOT.jar
├── shardingsphere-agent-${latest.release.version}-SNAPSHOT.jar
└── template
    ├── dbplusengine-driver-grafana-template.json
    └── dbplusengine-proxy-grafana-template.json
  • Configuration

The file conf/agent.yaml is the configuration file. The available plugins include Jaeger, OpenTracing, Zipkin, OpenTelemetry, Logging, and Prometheus. In the sample below, the logging plugin is enabled and the metrics and tracing plugins are commented out. To enable a plugin, uncomment its block under plugins.

plugins:
  logging:
    File:
      props:
        slow-query-log: true
        long-query-time: 5000
        general-query-log: true
  # metrics:
  #   Prometheus:
  #     host: "localhost"
  #     port: 9090
  #     props:
  #       jvm-information-collector-enabled: "true"
  # tracing:
  #   OpenTelemetry:
  #     props:
  #       otel.service.name: "shardingsphere"
  #       otel.traces.exporter: "jaeger"
  #       otel.exporter.otlp.traces.endpoint: "http://localhost:14250"
  #       otel.traces.sampler: "always_on"
  • Parameter Description
| Name | Description | Range | Default Value |
|---|---|---|---|
| slow-query-log | Whether to enable the slow query log | true, false | true |
| long-query-time | Slow query threshold (ms) | positive integer | 5000 |
| general-query-log | Whether to enable the full query log | true, false | true |
| jvm-information-collector-enabled | Whether to enable the JVM collector | true, false | true |
| otel.service.name | Service name for link tracking | String | shardingsphere |
| otel.traces.exporter | Exporter | jaeger, zipkin | jaeger |
| otel.exporter.otlp.traces.endpoint | Data sending address | Actual receiving data address | http://localhost:14250 |
| otel.traces.sampler | Sampler | always_on | always_on |

For more information about OpenTelemetry, refer to the OpenTelemetry SDK Autoconfigure Guide.
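
The ranges in the parameter table can be checked before the agent is started. The following is a minimal sketch, not part of the agent: `validate_props` is a hypothetical helper, and it assumes the `props` values have already been loaded from agent.yaml into a plain dictionary.

```python
from typing import Any, Dict, List

# Parameters whose documented range is "true, false".
BOOLEAN_PROPS = (
    "slow-query-log",
    "general-query-log",
    "jvm-information-collector-enabled",
)
# Exporters listed in the parameter table.
EXPORTERS = ("jaeger", "zipkin")


def validate_props(props: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty if none were found."""
    errors: List[str] = []
    for name in BOOLEAN_PROPS:
        if name in props and str(props[name]).lower() not in ("true", "false"):
            errors.append(f"{name} must be true or false")
    if "long-query-time" in props:
        value = props["long-query-time"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            errors.append("long-query-time must be a positive integer (ms)")
    if "otel.traces.exporter" in props and props["otel.traces.exporter"] not in EXPORTERS:
        errors.append("otel.traces.exporter must be jaeger or zipkin")
    return errors


print(validate_props({"slow-query-log": True, "long-query-time": 5000}))
print(validate_props({"long-query-time": -1, "otel.traces.exporter": "kafka"}))
```

Parameters that are absent are not reported, on the assumption that the agent falls back to the defaults listed in the table.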

Using Agent in DBPlusEngine-Proxy #

Start the Agent together with the Proxy

bin/start.sh -g

Normal startup logs can be viewed in the corresponding DBPlusEngine-Proxy logs, and Metric and Tracing data can be viewed through the configured address after accessing the Proxy.
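
When the Prometheus plugin is enabled, the exported data is plain text in the Prometheus exposition format and can be inspected with a few lines of code. The sketch below is illustrative only. The URL is an assumption built from the host and port in the sample agent.yaml plus the conventional `/metrics` path, the `type` label name is also assumed, and the parser handles only simple `name{labels} value` lines.

```python
import re
import urllib.request
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Made-up sample output; the "type" label name is an assumption.
SAMPLE_TEXT = """\
# HELP proxy_requests_total Number of requests received
# TYPE proxy_requests_total counter
proxy_requests_total 128.0
parsed_sql_total{type="SELECT"} 90.0
parsed_sql_total{type="INSERT"} 12.0
proxy_state 0.0
"""


def parse_metrics(text: str) -> List[Sample]:
    """Parse metric lines into (name, labels, value), skipping comments."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            labels = dict(_LABEL.findall(raw_labels or ""))
            samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url: str = "http://localhost:9090/metrics") -> List[Sample]:
    """Download and parse the metrics page (URL is an assumed default)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))


for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```

In a real deployment a Prometheus server would normally scrape this address, so a script like this is mainly useful for a quick check that the agent is exporting data.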

Agent Metrics

| Metrics | Description | Metrics Type |
|---|---|---|
| build_info | Build version information | GAUGE |
| proxy_info | Running information, distinguished by the type tag: boot-time is the timestamp of the startup time, boot-duration is the startup duration (in milliseconds), uptime is the elapsed running time (in milliseconds) | GAUGE |
| proxy_meta_data_info | Metadata information | GAUGE |
| proxy_state | Status information: 0 normal status, 1 fuse (circuit-break) status, 2 lock status | GAUGE |
| proxy_current_connections | Current client connections | GAUGE |
| parsed_sql_total | Number of parsed SQL statements, distinguished by INSERT, DELETE, UPDATE, SELECT, DDL, DCL, DAL, TCL, RQL, RDL, RAL, RUL types | COUNTER |
| routed_sql_total | Number of SQL routes, distinguished by INSERT, DELETE, UPDATE, SELECT type | COUNTER |
| routed_result_total | Number of routing results, counting the number of storage nodes and tables routed to | COUNTER |
| proxy_transactions_total | Total number of transactions, categorized by commit, rollback, autocommit | COUNTER |
| proxy_execute_errors_total | Number of execution exceptions, by exception type | COUNTER |
| proxy_requests_total | Number of requests received | COUNTER |
| proxy_execute_total | Total number of executions (counts the number of SQL statements routed to and executed on storage nodes) | COUNTER |
| proxy_execute_error_total | Number of execution exceptions | COUNTER |
| proxy_request_bytes_total | Request bytes | COUNTER |
| proxy_response_bytes_total | Response bytes | COUNTER |
| parse_sql_latency_millis | Time spent parsing SQL | HISTOGRAM |
| route_sql_latency_millis | Time spent routing SQL | HISTOGRAM |
| proxy_execute_latency_millis | Time spent on execution | HISTOGRAM |
| commit_sql_count_histogram | Distribution of the number of SQL statements in each committed transaction | HISTOGRAM |
| rollback_sql_count_histogram | Distribution of the number of SQL statements in each rolled-back transaction | HISTOGRAM |
| proxy_connection_usage_seconds | Distribution of connection duration to the Proxy | HISTOGRAM |
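
Derived values can be computed from these metrics. The sketch below assumes the histograms are exposed in the usual Prometheus way, with `<name>_sum` and `<name>_count` series next to the buckets, and that the values have already been collected into a flat dictionary. The helper names are hypothetical.

```python
from typing import Dict, Optional


def histogram_mean(series: Dict[str, float], name: str) -> Optional[float]:
    """Average observation of a histogram: <name>_sum / <name>_count."""
    count = series.get(f"{name}_count", 0.0)
    if count == 0:
        return None  # nothing observed yet
    return series.get(f"{name}_sum", 0.0) / count


def error_ratio(series: Dict[str, float]) -> Optional[float]:
    """Execution exceptions as a fraction of all executions."""
    executed = series.get("proxy_execute_total", 0.0)
    if executed == 0:
        return None
    return series.get("proxy_execute_errors_total", 0.0) / executed


# Made-up values for illustration.
series = {
    "proxy_execute_latency_millis_sum": 1500.0,
    "proxy_execute_latency_millis_count": 300.0,
    "proxy_execute_total": 300.0,
    "proxy_execute_errors_total": 6.0,
}
print(histogram_mean(series, "proxy_execute_latency_millis"))  # prints 5.0
print(error_ratio(series))  # prints 0.02
```

Both helpers return `None` instead of raising when there is no data yet, which is the normal state right after the Proxy starts.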

Using Agent in DBPlusEngine-Driver #

Start Agent

  • Prepare a project that integrates DBPlusEngine-Driver, such as a Spring Boot project

  • Add javaagent configuration at startup

java -javaagent:/xxx/shardingsphere-agent-${latest.release.version}.jar -jar spring-boot-dbplusengine-driver-test.jar

Agent Metrics

| Metrics | Description | Metrics Type |
|---|---|---|
| build_info | Build version information | GAUGE |
| jdbc_state | Status information: 0 normal status, 1 fuse (circuit-break) status, 2 lock status | GAUGE |
| jdbc_meta_data_info | Metadata information | GAUGE |
| parsed_sql_total | Number of parsed SQL statements, distinguished by INSERT, DELETE, UPDATE, SELECT, DDL, DCL, DAL, TCL, RQL, RDL, RAL, RUL types | COUNTER |
| routed_sql_total | Number of SQL routes, distinguished by INSERT, DELETE, UPDATE, SELECT type | COUNTER |
| routed_result_total | Number of routing results, counting the number of storage nodes and tables routed to | COUNTER |
| jdbc_statement_execute_total | Number of SQL statements executed | COUNTER |
| jdbc_statement_execute_errors_total | Number of execution exceptions | COUNTER |
| jdbc_transactions_total | Number of transactions, categorized by commit, rollback, autocommit | COUNTER |
| jdbc_statement_execute_latency_millis | Time spent executing SQL | HISTOGRAM |
| parse_sql_latency_millis | Time spent parsing SQL | HISTOGRAM |
| route_sql_latency_millis | Time spent routing SQL | HISTOGRAM |
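
As a small illustration of reading these metrics, the sketch below translates the `jdbc_state` code into the status names given in the table and turns per-type counts of `parsed_sql_total` into shares. The helper names are hypothetical, and the per-type counts are assumed to have been collected into a dictionary beforehand.

```python
from typing import Dict

# Status codes as listed in the metrics table.
STATE_NAMES = {0: "normal", 1: "fuse", 2: "lock"}


def state_name(value: float) -> str:
    """Map the numeric jdbc_state gauge to a readable status."""
    return STATE_NAMES.get(int(value), "unknown")


def type_shares(counts: Dict[str, float]) -> Dict[str, float]:
    """Fraction of the total contributed by each SQL type."""
    total = sum(counts.values())
    if total == 0:
        return {sql_type: 0.0 for sql_type in counts}
    return {sql_type: count / total for sql_type, count in counts.items()}


print(state_name(0.0))
print(type_shares({"SELECT": 75, "INSERT": 25}))
```

The same helpers apply to `proxy_state` on the Proxy side, since the table lists the same three status codes there.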