What Metrics Are Supported by the Agent?
OS Metric: CPU
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| cpu_usage | (Agent) CPU Usage | CPU usage of the monitored object | 0-100 | % | 2.4.1 |
| cpu_usage_idle | (Agent) Idle CPU Usage | Percentage of time that the CPU is idle | 0-100 | % | 2.4.5 |
| cpu_usage_other | (Agent) Other Process CPU Usage | CPU usage of the monitored object that is attributed to other processes | 0-100 | % | 2.4.5 |
| cpu_usage_system | (Agent) Kernel Space CPU Usage | Percentage of time that the CPU is used by kernel space | 0-100 | % | 2.4.5 |
| cpu_usage_user | (Agent) User Space CPU Usage | Percentage of time that the CPU is used by user space | 0-100 | % | 2.4.5 |
| cpu_usage_nice | (Agent) Nice Process CPU Usage | Percentage of time that the CPU spends in user mode running low-priority processes, which can easily be interrupted by higher-priority processes | 0-100 | % | 2.4.5 |
| cpu_usage_iowait | (Agent) iowait Process CPU Usage | Percentage of time that the CPU is waiting for I/O operations to complete | 0-100 | % | 2.4.5 |
| cpu_usage_irq | (Agent) CPU Interrupt Time | Percentage of time that the CPU is servicing interrupts | 0-100 | % | 2.4.5 |
| cpu_usage_softirq | (Agent) CPU Software Interrupt Time | Percentage of time that the CPU is servicing software interrupts | 0-100 | % | 2.4.5 |
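The percentages in this table are shares of CPU time over a sampling interval. The following is a minimal sketch of how such shares can be derived on Linux from two readings of the aggregate `cpu` line in `/proc/stat`. It is not the Agent's implementation: the function names and the assumption that the Agent's metrics correspond to these kernel counters are illustrative only.

```python
from typing import Dict

# Order of the counters on the aggregate "cpu" line of /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, float]:
    """Return each counter's share (0-100) of the CPU time elapsed between two samples."""
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed (or counters reset): report zeros instead of dividing by zero.
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * d / total for k, d in deltas.items()}


# Hypothetical samples taken a few seconds apart.
first = parse_cpu_line("cpu 100 0 50 800 50 0 0 0 0 0")
second = parse_cpu_line("cpu 150 0 70 1500 70 5 5 0 0 0")
shares = cpu_percentages(first, second)
```

Under this assumed mapping, `shares["idle"]` would play the role of `cpu_usage_idle`, `shares["system"]` of `cpu_usage_system`, and so on. Because every value is a share of the same interval, the shares always sum to 100.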
OS Metric: CPU Load
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| load_average1 | (Agent) 1-Minute Load Average | CPU load averaged over the last 1 minute | ≥ 0 | None | 2.4.1 |
| load_average5 | (Agent) 5-Minute Load Average | CPU load averaged over the last 5 minutes | ≥ 0 | None | 2.4.1 |
| load_average15 | (Agent) 15-Minute Load Average | CPU load averaged over the last 15 minutes | ≥ 0 | None | 2.4.1 |
OS Metric: Memory
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| mem_available | (Agent) Available Memory | Amount of memory that is available and can be given instantly to processes | ≥ 0 | GB | 2.4.5 |
| mem_usedPercent | (Agent) Memory Usage | Memory usage of the instance | 0-100 | % | 2.4.1 |
| mem_free | (Agent) Idle Memory | Amount of memory that is not being used | ≥ 0 | GB | 2.4.5 |
| mem_buffers | (Agent) Buffer | Amount of memory that is being used for buffers | ≥ 0 | GB | 2.4.5 |
| mem_cached | (Agent) Cache | Amount of memory that is being used for file caches | ≥ 0 | GB | 2.4.5 |
| total_open_files | (Agent) Total File Handles | Total handles used by all processes | ≥ 0 | Count | 2.4.5 |
OS Metric: Disk
Currently, CES Agent can collect only physical disk metrics and does not support disks mounted using the network file system protocol.
By default, CES Agent does not monitor Docker-related mount points. Mount points with the following prefixes are excluded:
/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos
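The exclusion rule above can be sketched as a simple prefix filter. This is an illustration only: the function names are hypothetical, and treating the list as plain string prefixes separated by semicolons is an assumption about how the Agent applies it.

```python
from typing import Iterable, List

# Prefix list as given in this document, separated by semicolons.
EXCLUDED_PREFIXES = "/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos".split(";")


def is_excluded(mount_point: str, prefixes: Iterable[str] = EXCLUDED_PREFIXES) -> bool:
    """Return True if the mount point starts with any excluded prefix (assumed rule)."""
    return any(mount_point.startswith(prefix) for prefix in prefixes)


def monitored_mount_points(mount_points: Iterable[str]) -> List[str]:
    """Keep only the mount points that would still be monitored."""
    return [m for m in mount_points if not is_excluded(m)]


# Hypothetical mount points on a host that runs containers.
candidates = ["/", "/data", "/var/lib/docker/overlay2/abc/merged", "/var/lib/mesos/slave"]
kept = monitored_mount_points(candidates)
```

With these sample inputs, only `/` and `/data` remain in `kept`.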
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| disk_free | (Agent) Available Disk Space | Free space on the disks | ≥ 0 | GB | 2.4.1 |
| disk_total | (Agent) Disk Storage Capacity | Total disk capacity | ≥ 0 | GB | 2.4.5 |
| disk_used | (Agent) Used Disk Space | Used space on the disks | ≥ 0 | GB | 2.4.5 |
| disk_usedPercent | (Agent) Disk Usage | Percentage of used disk space. It is calculated as follows: Disk Usage = Used Disk Space/Disk Storage Capacity. | 0-100 | % | 2.4.1 |
| disk_rwstate | (Agent) Disk Read/Write Status | Read and write status of the disk attached to the monitored object. The status can be 0 (read and write) or 1 (read-only). | 0 or 1 | None | 2.5.6 |
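The Disk Usage formula in the table (Used Disk Space/Disk Storage Capacity) can be reproduced with the Python standard library. This is a minimal sketch, not the Agent's code: the helper names are hypothetical, and reporting GB as 1024³ bytes is an assumption, since this document does not state which definition of GB the Agent uses.

```python
import shutil
from typing import Dict

BYTES_PER_GB = 1024 ** 3  # Assumption: GB is treated as 2^30 bytes.


def disk_usage_percent(used_bytes: int, total_bytes: int) -> float:
    """Disk Usage = Used Disk Space / Disk Storage Capacity, as a percentage."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * used_bytes / total_bytes


def disk_summary(path: str = "/") -> Dict[str, float]:
    """Collect free, total, used, and used-percent figures for one mount point."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "disk_free": usage.free / BYTES_PER_GB,
        "disk_total": usage.total / BYTES_PER_GB,
        "disk_used": usage.used / BYTES_PER_GB,
        "disk_usedPercent": disk_usage_percent(usage.used, usage.total),
    }
```

For example, `disk_usage_percent(25, 100)` evaluates to `25.0`, and `disk_summary("/")` returns the four figures for the root file system of the machine it runs on.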
OS Metric: Disk I/O
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| disk_agt_read_bytes_rate | (Agent) Disks Read Rate | Volume of data read from the instance per second | ≥ 0 | byte/s | 2.4.5 |
| disk_agt_read_requests_rate | (Agent) Disks Read Requests | Number of read requests sent to the monitored disk per second | ≥ 0 | Request/s | 2.4.5 |
| disk_agt_write_bytes_rate | (Agent) Disks Write Rate | Volume of data written to the instance per second | ≥ 0 | byte/s | 2.4.5 |
| disk_agt_write_requests_rate | (Agent) Disks Write Requests | Number of write requests sent to the monitored disk per second | ≥ 0 | Request/s | 2.4.5 |
| disk_readTime | (Agent) Average Read Request Time | Average time taken for disk read operations | ≥ 0 | ms/count | 2.4.5 |
| disk_writeTime | (Agent) Average Write Request Time | Average time taken for disk write operations | ≥ 0 | ms/count | 2.4.5 |
| disk_ioUtils | (Agent) Disk I/O Usage | Percentage of the total disk operation time during which the disk had I/O requests queued | 0-100 | % | 2.4.1 |
| disk_queue_length | (Agent) Disk Queue Length | Average number of read or write requests queued up for completion for the monitored disk in the monitoring period | ≥ 0 | count | 2.4.5 |
| disk_write_bytes_per_operation | Disk Bytes Per Write Operation | Average number of bytes in an I/O write for the monitored disk in the monitoring period | ≥ 0 | Byte/op | 2.4.5 |
| disk_read_bytes_per_operation | Disk Bytes Per Read Operation | Average number of bytes in an I/O read for the monitored disk in the monitoring period | ≥ 0 | Byte/op | 2.4.5 |
| disk_io_svctm | (Agent) Disk I/O Service Time | Average time taken by an I/O read or write for the monitored disk in the monitoring period | ≥ 0 | ms/op | 2.4.5 |
| disk_device_used_percent | Block Device Usage | Percentage of total disk space that is used. The calculation formula is as follows: Used storage space of all mounted disk partitions/Total disk storage space. | 0-100 | % | 2.5.6 |
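Several metrics in this table are rates or per-operation averages, which are typically derived from the difference between two samples of cumulative counters. The sketch below shows that arithmetic for reads, assuming counters like those in Linux `/proc/diskstats` (reads completed, 512-byte sectors read, milliseconds spent reading). The `DiskSample` type and function names are hypothetical, and this is not necessarily how the Agent computes its values.

```python
from dataclasses import dataclass
from typing import Dict

SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units.


@dataclass
class DiskSample:
    """Cumulative read counters for one disk at one point in time."""
    reads_completed: int
    sectors_read: int
    ms_reading: int


def read_metrics(prev: DiskSample, curr: DiskSample, interval_s: float) -> Dict[str, float]:
    """Derive read rate, request rate, average request time, and bytes per read."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    ops = curr.reads_completed - prev.reads_completed
    nbytes = (curr.sectors_read - prev.sectors_read) * SECTOR_BYTES
    busy_ms = curr.ms_reading - prev.ms_reading
    return {
        "read_bytes_rate": nbytes / interval_s,          # byte/s
        "read_requests_rate": ops / interval_s,          # Request/s
        "read_time": busy_ms / ops if ops else 0.0,      # ms/count
        "read_bytes_per_operation": nbytes / ops if ops else 0.0,  # Byte/op
    }


# Hypothetical samples taken 10 seconds apart.
earlier = DiskSample(reads_completed=100, sectors_read=2000, ms_reading=500)
later = DiskSample(reads_completed=200, sectors_read=4000, ms_reading=700)
result = read_metrics(earlier, later, interval_s=10.0)
```

The write-side metrics follow the same pattern with the corresponding write counters. Guarding the per-operation averages against zero operations avoids a division error on an idle disk.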
OS Metric: File System
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| disk_fs_rwstate | (Agent) File System Read/Write Status | Read and write status of the mounted file system of the monitored object. The status can be 0 (read and write) or 1 (read-only). | 0 or 1 | None | 2.4.5 |
| disk_inodesTotal | (Agent) Disk inode Total | Total number of index nodes on the disk | ≥ 0 | None | 2.4.5 |
| disk_inodesUsed | (Agent) Total inode Used | Number of used index nodes on the disk | ≥ 0 | None | 2.4.5 |
| disk_inodesUsedPercent | (Agent) Percentage of Total inode Used | Percentage of used index nodes on the disk | 0-100 | % | 2.4.1 |
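The inode metrics relate to each other in the same way as the disk space metrics: the percentage is the used count divided by the total. A minimal sketch using `os.statvfs` (available on Unix-like systems) is shown below. The function names are hypothetical, and the Agent may obtain these values differently.

```python
import os
from typing import Dict


def inode_used_percent(total_inodes: int, free_inodes: int) -> float:
    """Percentage of index nodes in use, given the total and free counts."""
    if total_inodes <= 0:
        # Some file systems report zero inodes; avoid dividing by zero.
        return 0.0
    return 100.0 * (total_inodes - free_inodes) / total_inodes


def inode_summary(path: str = "/") -> Dict[str, float]:
    """Read inode counts for the file system containing path (Unix only)."""
    stats = os.statvfs(path)
    used = stats.f_files - stats.f_ffree
    return {
        "disk_inodesTotal": stats.f_files,
        "disk_inodesUsed": used,
        "disk_inodesUsedPercent": inode_used_percent(stats.f_files, stats.f_ffree),
    }
```

For example, a file system with 1,000 inodes of which 750 are free gives `inode_used_percent(1000, 750) == 25.0`.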
OS Metric: NTP
| Metric | Name | Description | Value Range | Unit | Conversion Rule | Earliest Agent Version Required |
|---|---|---|---|---|---|---|
| ntp_offset | (Agent) NTP Offset | NTP offset of the monitored object | ≥ 0 | ms | N/A | 2.7.1 |
OS Metric: TCP Connections
By default, two basic metrics related to TCP connections are collected: (Agent) TCP TOTAL and (Agent) TCP ESTABLISHED.
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| net_tcp_total | (Agent) Total Number of TCP Connections | Total number of TCP connections | ≥ 0 | count | 2.4.1 |
| net_tcp_established | (Agent) Number of connections in the ESTABLISHED state | Number of TCP connections in the ESTABLISHED state | ≥ 0 | count | 2.4.1 |
| net_tcp_sys_sent | (Agent) Number of connections in the TCP SYS_SENT state | Number of TCP connections that are being requested by the client (TCP state SYN_SENT) | ≥ 0 | count | 2.4.5 |
| net_tcp_sys_recv | (Agent) Number of connections in the TCP SYS_RECV state | Number of pending TCP connections received by the server (TCP state SYN_RECV) | ≥ 0 | count | 2.4.5 |
| net_tcp_fin_wait1 | (Agent) Number of TCP connections in the FIN_WAIT1 state | Number of TCP connections waiting for ACK packets when the connections are being actively closed by the client | ≥ 0 | count | 2.4.5 |
| net_tcp_fin_wait2 | (Agent) Number of TCP connections in the FIN_WAIT2 state | Number of TCP connections in the FIN_WAIT2 state | ≥ 0 | count | 2.4.5 |
| net_tcp_time_wait | (Agent) TCP TIME_WAIT Connections | Number of TCP connections in the TIME_WAIT state | ≥ 0 | count | 2.4.5 |
| net_tcp_close | (Agent) Number of TCP connections in the CLOSE state | Number of closed TCP connections | ≥ 0 | count | 2.4.5 |
| net_tcp_close_wait | (Agent) TCP CLOSE_WAIT Connections | Number of TCP connections in the CLOSE_WAIT state | ≥ 0 | count | 2.4.5 |
| net_tcp_last_ack | (Agent) Number of TCP connections in the LAST_ACK state | Number of TCP connections waiting for ACK packets when the connections are being passively closed by the client | ≥ 0 | count | 2.4.5 |
| net_tcp_listen | (Agent) Number of TCP connections in the LISTEN state | Number of TCP connections in the LISTEN state | ≥ 0 | count | 2.4.5 |
| net_tcp_closing | (Agent) Number of TCP connections in the CLOSING state | Number of TCP connections being closed by the server and the client at the same time | ≥ 0 | count | 2.4.5 |
| net_tcp_retrans | (Agent) TCP Retransmission Rate | Percentage of packets that are resent | 0-100 | % | 2.4.5 |
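The per-state counts in this table can be illustrated by tallying the state column of Linux `/proc/net/tcp`, where each connection row carries a two-digit hexadecimal state code. The sketch below is an assumption about one way to obtain such counts, not a description of the Agent's implementation, and the function name is hypothetical.

```python
from collections import Counter

# Hexadecimal state codes used by the Linux kernel in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text: str) -> Counter:
    """Count connections per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields: slot, local address, remote address, state, ...
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


# Hypothetical excerpt with one listening socket and two established connections.
sample = """  sl  local_address rem_address   st tx_queue rx_queue
   0: 00000000:0016 00000000:0000 0A 00000000:00000000
   1: 0100007F:1F90 0100007F:C350 01 00000000:00000000
   2: 0100007F:C350 0100007F:1F90 01 00000000:00000000
"""
states = count_tcp_states(sample)
total_connections = sum(states.values())
```

Here `states["ESTABLISHED"]` is 2, `states["LISTEN"]` is 1, and `total_connections` is 3. On a real host, IPv6 connections are listed separately in `/proc/net/tcp6`.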
OS Metric: NIC
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| net_bitRecv | (Agent) Outbound Bandwidth | Number of bits sent by this NIC per second | ≥ 0 | bit/s | 2.4.1 |
| net_bitSent | (Agent) Inbound Bandwidth | Number of bits received by this NIC per second | ≥ 0 | bit/s | 2.4.1 |
| net_packetRecv | (Agent) NIC Packet Receive Rate | Number of packets received by this NIC per second | ≥ 0 | Count/s | 2.4.1 |
| net_packetSent | (Agent) NIC Packet Send Rate | Number of packets sent by this NIC per second | ≥ 0 | Count/s | 2.4.1 |
| net_errin | (Agent) Receive Error Rate | Percentage of receive errors detected by this NIC per second | 0-100 | % | 2.4.5 |
| net_errout | (Agent) Transmit Error Rate | Percentage of transmit errors detected by this NIC per second | 0-100 | % | 2.4.5 |
| net_dropin | (Agent) Received Packet Drop Rate | Percentage of packets received by this NIC which were dropped per second | 0-100 | % | 2.4.5 |
| net_dropout | (Agent) Transmitted Packet Drop Rate | Percentage of packets transmitted by this NIC which were dropped per second | 0-100 | % | 2.4.5 |
Process Monitoring Metrics
| Metric | Name | Description | Value Range | Unit | Earliest Agent Version Required |
|---|---|---|---|---|---|
| proc_pHashId_cpu | (Agent) CPU Usage | CPU consumed by a process. pHashId is the MD5 hash of the process name and process ID. | 0-1 x Number of vCPUs | % | 2.4.1 |
| proc_pHashId_mem | (Agent) Memory Usage | Memory consumed by a process. pHashId is the MD5 hash of the process name and process ID. | 0-100 | % | 2.4.1 |
| proc_pHashId_file | (Agent) Number of opened files | Number of files opened by a process. pHashId is the MD5 hash of the process name and process ID. | ≥ 0 | Count | 2.4.1 |
| proc_running_count | (Agent) Running processes | Number of processes that are running | ≥ 0 | None | 2.4.1 |
| proc_idle_count | (Agent) Idle Processes | Number of processes that are idle | ≥ 0 | None | 2.4.1 |
| proc_zombie_count | (Agent) Zombie Processes | Number of zombie processes | ≥ 0 | None | 2.4.1 |
| proc_blocked_count | (Agent) Blocked Processes | Number of processes that are blocked | ≥ 0 | None | 2.4.1 |
| proc_sleeping_count | (Agent) Sleeping Processes | Number of processes that are sleeping | ≥ 0 | None | 2.4.1 |
| proc_total_count | (Agent) Total Processes | Total number of processes on the monitored object | ≥ 0 | None | 2.4.1 |
| proc_specified_count | (Agent) Specified Processes | Number of specified processes | ≥ 0 | None | 2.4.1 |
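The process count metrics group processes by state. As a rough illustration, the sketch below tallies the single-letter state field found in Linux `/proc/<pid>/stat` lines. The mapping from kernel state letters to the metric names in this table (for example, `D` to blocked and `I` to idle) is an assumption made for this example, as are the function names. The Agent's actual classification is not documented here.

```python
from typing import Dict, Iterable

# Assumed mapping from Linux process state letters to the counts in this table.
STATE_TO_METRIC = {
    "R": "proc_running_count",
    "S": "proc_sleeping_count",
    "D": "proc_blocked_count",   # uninterruptible sleep, usually waiting on I/O
    "Z": "proc_zombie_count",
    "I": "proc_idle_count",      # idle kernel threads
}


def state_from_stat(stat_line: str) -> str:
    """Extract the state letter from one /proc/<pid>/stat line.

    The process name is wrapped in parentheses and may itself contain spaces
    or parentheses, so the state is the first field after the LAST ')'.
    """
    return stat_line[stat_line.rfind(")") + 1:].split()[0]


def count_process_states(stat_lines: Iterable[str]) -> Dict[str, int]:
    """Tally processes per metric, plus the overall total."""
    counts = {name: 0 for name in STATE_TO_METRIC.values()}
    counts["proc_total_count"] = 0
    for line in stat_lines:
        counts["proc_total_count"] += 1
        metric = STATE_TO_METRIC.get(state_from_stat(line))
        if metric is not None:
            counts[metric] += 1
    return counts


# Hypothetical stat lines, truncated after the fields used here.
lines = [
    "1 (systemd) S 0 1 1",
    "42 (my (odd) proc) R 1 42 42",
    "77 (defunct-child) Z 42 42 42",
    "90 (kworker/0:1) I 2 0 0",
]
summary = count_process_states(lines)
```

With these sample lines, each of the running, sleeping, zombie, and idle counts is 1, the blocked count is 0, and `proc_total_count` is 4.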