diff --git a/docs/ces/api-ref/ErrorCode.html b/docs/ces/api-ref/ErrorCode.html
index 79f0d28e9..740697f8b 100644
--- a/docs/ces/api-ref/ErrorCode.html
+++ b/docs/ces/api-ref/ErrorCode.html
@@ -4,12 +4,10 @@

Function

If an error occurs during API calling, the system returns error information. This section describes the error codes contained in the error information for Cloud Eye APIs.

Example Response

{
-    "code": 400,
-    "element": "Bad Request",
-    "message": "The system received a request which cannot be recognized",
-    "details": {
-        "details": "Some content in message body is not correct",
-        "code": "ces.0014"
+    "http_code":"403",
+    "message": {
+        "details":"Policy doesn't allow [ces:alarms:get] to be performed",
+        "code":"403"
     }
 }
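Review note: this hunk changes the shape of the example error body. The old body carried an integer `code` with the service error nested under `details`, while the new body carries a string `http_code` with the error nested under `message`. A minimal Python sketch of a client-side helper that tolerates both shapes is shown here. `CesError` and `parse_error` are hypothetical names introduced for illustration, not part of any Cloud Eye SDK, and the field layout is assumed only from the two example bodies in this hunk.

```python
import json
from typing import NamedTuple


class CesError(NamedTuple):
    """Normalized view of an error body (hypothetical helper type)."""
    http_status: int
    code: str
    details: str


def parse_error(body: str) -> CesError:
    """Parse either example error shape from this hunk into one structure."""
    doc = json.loads(body)
    if "http_code" in doc:
        # New shape: string "http_code" plus a nested "message" object.
        inner = doc.get("message", {})
        status = int(doc["http_code"])
    else:
        # Old shape: integer "code" plus a nested "details" object.
        inner = doc.get("details", {})
        status = int(doc["code"])
    return CesError(status, str(inner.get("code", "")), str(inner.get("details", "")))
```

Callers that previously read the top-level `code` as an integer would need a branch like this one, because the new example returns the status as a string under a different key.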
diff --git a/docs/ces/api-ref/ces_01_0054.html b/docs/ces/api-ref/ces_01_0054.html
index af0647e28..1e69d9329 100644
--- a/docs/ces/api-ref/ces_01_0054.html
+++ b/docs/ces/api-ref/ces_01_0054.html
@@ -14,7 +14,7 @@

Description

-

Solution

+

Solution

Impact

@@ -84,7 +84,7 @@

Major

-

The ECS was restored to be normal after the automatic migration.

+

The ECS was recovered after the automatic migration.

This event indicates that the ECS has recovered and been working properly.

@@ -137,9 +137,9 @@

Critical

-

The processes of the host accommodating the ECS were abnormal.

+

The host where the ECS resides is faulty. The system will automatically try to start the ECS.

-

Contact O&M personnel.

+

After the ECS is started, check whether this ECS and services on it can run properly.

The ECS is faulty.

@@ -163,92 +163,92 @@

Once a physical host running ECSs breaks down, the ECSs are automatically migrated to a functional physical host. During the migration, the ECSs will be restarted.

-
Table 2 Advanced Anti-DDoS (AAD)

Event Source

+
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - @@ -317,7 +317,7 @@ - @@ -497,7 +497,7 @@ - @@ -541,7 +541,7 @@ - @@ -595,7 +595,7 @@ - @@ -681,7 +681,7 @@ - @@ -1178,7 +1178,7 @@ - @@ -1506,7 +1506,7 @@ - @@ -1639,2456 +1639,1077 @@
Table 2 Advanced Anti-DDoS (AAD)

Event Source

Namespace

+

Namespace

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

Description

+

Description

Solution

+

Solution

Impact

+

Impact

AAD

-

+

AAD

+

SYS.DDOS

+

SYS.DDOS

DDoS Attack Events

+

DDoS Attack Events

ddosAttackEvents

+

ddosAttackEvents

Major

+

Major

A DDoS attack occurs in the AAD protected lines.

+

A DDoS attack occurs in the AAD protected lines.

Judge the impact on services based on the attack traffic and attack type. If the attack traffic exceeds your purchased elastic bandwidth, change to another line or increase your bandwidth.

+

Judge the impact on services based on the attack traffic and attack type. If the attack traffic exceeds your purchased elastic bandwidth, change to another line or increase your bandwidth.

Services may be interrupted.

+

Services may be interrupted.

Domain name scheduling event

+

Domain name scheduling event

domainNameDispatchEvents

+

domainNameDispatchEvents

Major

+

Major

The high-defense CNAME corresponding to the domain name is scheduled, and the domain name is resolved to another high-defense IP address.

+

The high-defense CNAME corresponding to the domain name is scheduled, and the domain name is resolved to another high-defense IP address.

Pay attention to the workloads involving the domain name.

+

Pay attention to the workloads involving the domain name.

Services are not affected.

+

Services are not affected.

Blackhole event

+

Blackhole event

blackHoleEvents

+

blackHoleEvents

Major

+

Major

The attack traffic exceeds the purchased AAD protection threshold.

+

The attack traffic exceeds the purchased AAD protection threshold.

A blackhole is canceled after 30 minutes by default. The actual blackhole duration is related to the blackhole triggering times and peak attack traffic on the current day. The maximum duration is 24 hours. If you need to permit access before a blackhole becomes ineffective, contact technical support.

+

A blackhole is canceled after 30 minutes by default. The actual blackhole duration is related to the blackhole triggering times and peak attack traffic on the current day. The maximum duration is 24 hours. If you need to permit access before a blackhole becomes ineffective, contact technical support.

Services may be interrupted.

+

Services may be interrupted.

Cancel Blackhole

+

Cancel Blackhole

cancelBlackHole

+

cancelBlackHole

Informational

+

Informational

The customer's AAD instance recovers from the black hole state.

+

The customer's AAD instance recovers from the black hole state.

This is only a prompt and no action is required.

+

This is only a prompt and no action is required.

Customer services recover.

+

Customer services recover.

IP address scheduling triggered

+

IP address scheduling triggered

ipDispatchEvents

+

ipDispatchEvents

Major

+

Major

IP route changed

+

IP route changed

Check the workloads of the IP address.

+

Check the workloads of the IP address.

Services are not affected.

+

Services are not affected.

Description

Solution

+

Solution

Impact

Description

Solution

+

Solution

Impact

The standby DB instance does not take over workloads from the primary DB instance due to network or server failures. The original primary DB instance continues to provide services within a short time.

Perform the operation again during off-peak hours.

+

Perform the operation again during off-peak hours.

Read replica promotion failed.

RDS rebuilds the standby DB instance with its high availability. After the instance is rebuilt, this event will be reported.

The DB instance status is normal. Check whether services are running properly.

+

The DB instance status is normal. Check whether services are running properly.

The instance is recovered.

Description

Solution

+

Solution

Impact

Description

Solution

+

Solution

Impact

Description

Solution

+

Solution

Impact

-
Table 10 Layer 2 Connection Gateway (L2CG)

Event Source

+
- - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + + + + + +
Table 10 Elastic Volume Service (EVS)

Event Source

Namespace

+

Namespace

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

Description

+

Description

Solution

+

Solution

Impact

+

Impact

L2CG

+

EVS

SYS.ESW

+

SYS.EVS

IP addresses conflicted

+

Update disk

IPConflict

+

updateVolume

Major

+

Minor

A cloud server and an on-premises server that need to communicate use the same IP address.

+

Update the name and description of an EVS disk.

Check the ARP and switch information to locate the servers that have the same IP address and change the IP address.

+

No further action is required.

The communications between the on-premises and cloud servers may be abnormal.

+

None

+

Expand disk

+

extendVolume

+

Minor

+

Expand an EVS disk.

+

No further action is required.

+

None

+

Delete disk

+

deleteVolume

+

Major

+

Delete an EVS disk.

+

No further action is required.

+

Deleted disks cannot be recovered.

+

QoS upper limit reached

+

reachQoS

+

Major

+

The I/O latency increases as the QoS upper limits of the disk are frequently reached and flow control triggered.

+

Change the disk type to one with a higher specification.

+

The current disk may fail to meet service requirements.

-
Table 11 Elastic IP and bandwidth

Event Source

+
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 11 Key Management Service (KMS)

Event Source

Namespace

+

Namespace

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

Elastic IP and bandwidth

+

KMS

SYS.VPC

+

SYS.KMS

VPC deleted

+

Key disabled

deleteVpc

+

disableKey

Major

+

Major

VPC modified

+

Key deletion scheduled

modifyVpc

+

scheduleKeyDeletion

Minor

+

Minor

Subnet deleted

+

Grant retired

deleteSubnet

+

retireGrant

Minor

+

Major

Subnet modified

+

Grant revoked

modifySubnet

+

revokeGrant

Minor

-

Bandwidth modified

-

modifyBandwidth

-

Minor

-

VPN deleted

-

deleteVpn

-

Major

-

VPN modified

-

modifyVpn

-

Minor

+

Major

-
Table 12 Elastic Volume Service (EVS)

Event Source

+
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 12 Cloud Eye (CES)

Event Source

Namespace

+

Event Name

Event Name

+

Event ID

Event ID

+

Event Severity

Event Severity

+

Description

Description

-

Solution

-

Impact

+

Solution

EVS

+

Cloud Eye

SYS.EVS

+

Agent heartbeat interruption

Update disk

+

agentHeartbeatInterrupted

updateVolume

+

Major

Minor

+

The Agent sends a heartbeat message to Cloud Eye every minute. If Cloud Eye cannot receive a heartbeat for 3 minutes, Agent Status is displayed as Faulty.

Update the name and description of an EVS disk.

-

No further action is required.

-

None

-

Expand disk

-

extendVolume

-

Minor

-

Expand an EVS disk.

-

No further action is required.

-

None

-

Delete disk

-

deleteVolume

-

Major

-

Delete an EVS disk.

-

No further action is required.

-

Deleted disks cannot be recovered.

-

QoS upper limit reached

-

reachQoS

-

Major

-

The I/O latency increases as the QoS upper limits of the disk are frequently reached and flow control triggered.

-

Change the disk type to one with a higher specification.

-

The current disk may fail to meet service requirements.

+
  • Check whether the Agent domain name can be resolved.
  • Check whether your account is in arrears.
  • If the Agent process is faulty, restart the Agent. If the process is still faulty after the restart, the Agent files may be damaged. In this case, reinstall the Agent.
  • Check whether the server time is consistent with the local standard time.
  • Update the Agent to the latest version.
-
Table 13 Key Management Service (KMS)

Event Source

+
- - - - + + + - - - - - + + + - + - - + + + + + + + + + - - - + + + - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 13 Distributed Cache Service (DCS)

Event Source

Namespace

+

Namespace

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

+

Description

+

Solution

+

Impact

KMS

+

DCS

SYS.KMS

+

SYS.DCS

Key disabled

+

Full sync retry during online migration

disableKey

+

migrationFullResync

Major

+

Minor

+

If online migration fails, full synchronization will be triggered because incremental synchronization cannot be performed.

+

Check whether full sync retries are triggered repeatedly. Check whether the source instance is connected and whether it is overloaded. If full sync retries are triggered repeatedly, contact O&M personnel.

+

The migration task is disconnected from the source instance, triggering another full sync. As a result, the CPU usage of the source instance may increase sharply.

Key deletion scheduled

+
  

masterStandbyFailover

scheduleKeyDeletion

+

Minor

Minor

+

The master node was abnormal, promoting a replica to master.

+
    

Memcached master/standby switchover

+

memcachedMasterStandbyFailover

+

Minor

+

The master node was abnormal, promoting the standby node to master.

+

Check whether services can recover by themselves. If applications cannot recover, restart them.

+

Persistent connections to the instance will be interrupted.

Grant retired

+

Redis server abnormal

retireGrant

+

redisNodeStatusAbnormal

Major

+

Major

+

The Redis server status was abnormal.

+

Check whether services are affected. If yes, contact O&M personnel.

+

If the master node is abnormal, an automatic failover is performed. If a standby node is abnormal and the client directly connects to the standby node for read/write splitting, no data can be read.

Grant revoked

+

Redis server recovered

revokeGrant

+

redisNodeStatusNormal

Major

+

Major

+

The Redis server status recovered.

+

Check whether services can recover. If the applications are not reconnected, restart them.

+

Recover from an exception.

+

Sync failure in data migration

+

migrateSyncDataFail

+

Major

+

Online migration failed.

+

Reconfigure the migration task and migrate data again. If the fault persists, contact O&M personnel.

+

Data migration fails.

+

Memcached instance abnormal

+

memcachedInstanceStatusAbnormal

+

Major

+

The Memcached node status was abnormal.

+

Check whether services are affected. If yes, contact O&M personnel.

+

The Memcached instance is abnormal and may not be accessed.

+

Memcached instance recovered

+

memcachedInstanceStatusNormal

+

Major

+

The Memcached node status recovered.

+

Check whether services can recover. If the applications are not reconnected, restart them.

+

Recover from an exception.

+

Instance backup failure

+

instanceBackupFailure

+

Major

+

The DCS instance fails to be backed up due to an OBS access failure.

+

Retry backup manually.

+

Automated backup fails.

+

Instance node abnormal restart

+

instanceNodeAbnormalRestart

+

Major

+

DCS nodes restarted unexpectedly when they became faulty.

+

Check whether services can recover. If the applications are not reconnected, restart them.

+

Persistent connections to the instance will be interrupted.

+

Long-running Lua scripts stopped

+

scriptsStopped

+

Informational

+

Lua scripts that had timed out automatically stopped running.

+

Optimize Lua scripts to prevent execution timeout.

+

If Lua scripts take a long time to execute, they will be forcibly stopped to avoid blocking the entire instance.

+

Node restarted

+

nodeRestarted

+

Informational

+

After write operations had been performed, the node automatically restarted to stop Lua scripts that had timed out.

+

Check whether services can recover by themselves. If applications cannot recover, restart them.

+

Persistent connections to the instance will be interrupted.

-
Table 14 Object Storage Service (OBS)

Event Source

+
- - - - + + - - - - - + + - - - - - - - - - - -
Table 14 Config

Event Source

Namespace

+

Event Name

Event Name

+

Event ID

Event ID

+

Event Severity

Event Severity

+

Description

+

Solution

+

Impact

OBS

+

RMS

SYS.OBS

+

Configuration noncompliance notification

Bucket deleted

+

configurationNoncomplianceNotification

deleteBucket

+

Major

Major

+

The assignment evaluation result is Non-compliant.

+

Modify the noncompliant configuration items of the resource.

+

None

Bucket policy deleted

+

Configuration compliance notification

deleteBucketPolicy

+

configurationComplianceNotification

Major

+

Informational

Bucket ACL configured

+

The assignment evaluation result changed to Compliant.

setBucketAcl

+

None

Minor

-

Bucket policy configured

-

setBucketPolicy

-

Minor

+

None

-
Table 15 Cloud Eye

Event Source

+
- - - - - + + - - - - - - + + + + + + + + +
Table 15 Host Security Service (HSS)

Event Source

Event Name

+

Namespace

Event ID

+

Event Name

Event Severity

+

Event ID

Description

+

Event Severity

Solution

+

Description

+

Solution

+

Impact

Cloud Eye

+

HSS

Agent heartbeat interruption

+

SYS.HSS

agentHeartbeatInterrupted

+

HSS agent disconnected

Major

+

hssAgentAbnormalOffline

The Agent sends a heartbeat message to Cloud Eye every minute. If Cloud Eye cannot receive a heartbeat for 3 minutes, Agent Status is displayed as Faulty.

+

Major

  • Check whether the Agent domain name can be resolved.
  • Check whether your account is in arrears.
  • If the Agent process is faulty, restart the Agent. If the process is still faulty after the restart, the Agent files may be damaged. In this case, reinstall the Agent.
  • Check whether the server time is consistent with the local standard time.
  • Update the Agent to the latest version.
+

The communication between the agent and the server is abnormal, or the agent process on the server is abnormal.

+

Fix your network connection. If the agent is still offline for a long time after the network recovers, the agent process may be abnormal. In this case, log in to the server and restart the agent process.

+

Services are interrupted.

+

Abnormal HSS agent status

+

hssAgentAbnormalProtection

+

Major

+

The agent is abnormal probably because it does not have sufficient resources.

+

Log in to the server and check your resources. If the usage of memory or other system resources is too high, increase their capacity first. If the resources are sufficient but the fault persists after the agent process is restarted, submit a service ticket to the O&M personnel.

+

Services are interrupted.

-
Table 16 DataSpace

Event Source

+
- - - - - - - - - - - - - - - + + + + + + + + + + + + + +
Table 16 Image Management Service (IMS)

Event Source

Namespace

+

Namespace

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

Description

+

Description

Solution

+

Solution

Impact

+

Impact

Data Space

+

IMS

SYS.HWDS

+

SYS.IMS

New revision

+

Create Image

newRevision

+

createImage

Minor

+

Major

An updated version was released.

+

An image was created.

After receiving the notification, export the data of the updated version as required.

+

None

None.

+

You can use this image to create cloud servers.

+

Update Image

+

updateImage

+

Major

+

Metadata of an image was modified.

+

None

+

Cloud servers may fail to be created from this image.

+

Delete Image

+

deleteImage

+

Major

+

An image was deleted.

+

None

+

This image will be unavailable on the management console.

-
Table 17 Distributed Cache Service (DCS)

Event Source

+
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 17 Bare Metal Server (BMS)

Event Source

Namespace

+

Event Name

Event Name

+

Event ID

Event ID

+

Event Severity

Event Severity

+

Description

Description

+

Solution

Solution

-

Impact

+

Impact

DCS

+

BMS

SYS.DCS

+

ECC uncorrectable errors generated on GPU SRAM

Full sync retry during online migration

+

SRAMUncorrectableEccError

migrationFullResync

+

Major

Minor

+

There are ECC uncorrectable errors generated on GPU SRAM.

If online migration fails, full synchronization will be triggered because incremental synchronization cannot be performed.

+

If services are affected, submit a service ticket.

Check whether full sync retries are triggered repeatedly. Check whether the source instance is connected and whether it is overloaded. If full sync retries are triggered repeatedly, contact O&M personnel.

-

The migration task is disconnected from the source instance, triggering another full sync. As a result, the CPU usage of the source instance may increase sharply.

-
  

masterStandbyFailover

-

Minor

-

The master node was abnormal, promoting a replica to master.

-
    

Memcached master/standby switchover

-

memcachedMasterStandbyFailover

-

Minor

-

The master node was abnormal, promoting the standby node to master.

-

Check whether services can recover by themselves. If applications cannot recover, restart them.

-

Persistent connections to the instance are interrupted.

-

Redis server abnormal

-

redisNodeStatusAbnormal

-

Major

-

The Redis server status was abnormal.

-

Check whether services are affected. If yes, contact O&M personnel.

-

If the master node is abnormal, an automatic failover is performed. If a standby node is abnormal and the client directly connects to the standby node for read/write splitting, no data can be read.

-

Redis server recovered

-

redisNodeStatusNormal

-

Major

-

The Redis server status recovered.

-

Check whether services can recover. If the applications are not reconnected, restart them.

-

Recover from an exception.

-

Sync failure in data migration

-

migrateSyncDataFail

-

Major

-

Online migration failed.

-

Reconfigure the migration task and migrate data again. If the fault persists, contact O&M personnel.

-

Data migration fails.

-

Memcached instance abnormal

-

memcachedInstanceStatusAbnormal

-

Major

-

The Memcached node status was abnormal.

-

Check whether services are affected. If yes, contact O&M personnel.

-

The Memcached instance is abnormal and may not be accessed.

-

Memcached instance recovered

-

memcachedInstanceStatusNormal

-

Major

-

The Memcached node status recovered.

-

Check whether services can recover. If the applications are not reconnected, restart them.

-

Recover from an exception.

-

Instance backup failure

-

instanceBackupFailure

-

Major

-

The DCS instance fails to be backed up due to an OBS access failure.

-

Retry backup manually.

-

Automated backup fails.

-

Instance node abnormal restart

-

instanceNodeAbnormalRestart

-

Major

-

DCS nodes restarted unexpectedly when they became faulty.

-

Check whether services can recover. If the applications are not reconnected, restart them.

-

Persistent connections to the instance are interrupted.

-

Long-running Lua scripts stopped

-

scriptsStopped

-

Informational

-

Lua scripts that had timed out automatically stopped running.

-

Optimize Lua scripts to prevent execution timeout.

-

If Lua scripts take a long time to execute, they will be forcibly stopped to avoid blocking the entire instance.

-

Node restarted

-

nodeRestarted

-

Informational

-

After write operations had been performed, the node automatically restarted to stop Lua scripts that had timed out.

-

Check whether services can recover by themselves. If applications cannot recover, restart them.

-

Persistent connections to the instance are interrupted.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 18 Intelligent Cloud Access (ICA)

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

ICA

-

SYS.ICA

-

BGP peer disconnection

-

BgpPeerDisconnection

-

Major

-

The BGP peer is disconnected.

-

Log in to the gateway and locate the cause.

-

Service traffic may be interrupted.

-

BGP peer connection success

-

BgpPeerConnectionSuccess

-

Major

-

The BGP peer is successfully connected.

-

None

-

None

-

Abnormal GRE tunnel status

-

AbnormalGreTunnelStatus

-

Major

-

The GRE tunnel status is abnormal.

-

Log in to the gateway and locate the cause.

-

Service traffic may be interrupted.

-

Normal GRE tunnel status

-

NormalGreTunnelStatus

-

Major

-

The GRE tunnel status is normal.

-

None

-

None

-

WAN interface goes up

-

EquipmentWanGoingOnline

-

Major

-

The WAN interface goes online.

-

None

-

None

-

WAN interface goes down

-

EquipmentWanGoingOffline

-

Major

-

The WAN interface goes offline.

-

Check whether the event is caused by a manual operation or device fault.

-

The device cannot be used.

-

Intelligent enterprise gateway going online

-

IntelligentEnterpriseGatewayGoingOnline

-

Major

-

The intelligent enterprise gateway goes online.

-

None

-

None

-

Intelligent enterprise gateway going offline

-

IntelligentEnterpriseGatewayGoingOffline

-

Major

-

The intelligent enterprise gateway goes offline.

-

Check whether the event is caused by a manual operation or device fault.

-

The device cannot be used.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 19 Multi-Site High Availability Service (MAS)

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

MAS

-

SYS.MAS

-

Abnormal database instance

-

dbError

-

Major

-

Abnormal database instance is detected by MAS.

-

Log in to the MAS console to view the cause and rectify the fault.

-

Services are interrupted.

-

Database instance recovered

-

dbRecovery

-

Major

-

The database instance is recovered.

-

None

-

Services are interrupted.

-

Abnormal Redis instance

-

redisError

-

Major

-

Abnormal Redis instance is detected by MAS.

-

Log in to the MAS console to view the cause and rectify the fault.

-

Services are interrupted.

-

Redis instance recovered

-

redisRecovery

-

Major

-

The Redis instance is recovered.

-

None

-

Services are interrupted.

-

Abnormal MongoDB database

-

mongodbError

-

Major

-

Abnormal MongoDB database is detected by MAS.

-

Log in to the MAS console to view the cause and rectify the fault.

-

Services are interrupted.

-

MongoDB database recovered

-

mongodbRecovery

-

Major

-

The MongoDB database is recovered.

-

None

-

Services are interrupted.

-

Abnormal Elasticsearch instance

-

esError

-

Major

-

Abnormal Elasticsearch instance is detected by MAS.

-

Log in to the MAS console to view the cause and rectify the fault.

-

Services are interrupted.

-

Elasticsearch instance recovered

-

esRecovery

-

Major

-

The Elasticsearch instance is recovered.

-

None

-

Services are interrupted.

-

Abnormal API

-

apiError

-

Major

-

The abnormal API is detected by MAS.

-

Log in to the MAS console to view the cause and rectify the fault.

-

Services are interrupted.

-

API recovered

-

apiRecovery

-

Major

-

The API is recovered.

-

None

-

Services are interrupted.

-

Area status changed

-

netChange

-

Major

-

Area status changes are detected by MAS.

-

Log in to the MAS console to view the cause and rectify the fault.

-

Network of the multi-active areas may change.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - -
Table 20 Config

Event Source

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

RMS

-

Configuration noncompliance notification

-

configurationNoncomplianceNotification

-

Major

-

The assignment evaluation result is Non-compliant.

-

Modify the noncompliant configuration items of the resource.

-

None

-

Configuration compliance notification

-

configurationComplianceNotification

-

Informational

-

The assignment evaluation result changed to Compliant.

-

None

-

None

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 21 SecMaster

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

SecMaster

-

SYS.SecMaster

-

Exclusive engine creation failed

-

createEngineFailed

-

Major

-

The underlying resources are insufficient.

-

Submit a ticket to request sufficient resources from the O&M personnel and try again.

-

The exclusive engine cannot be created.

-

Exclusive engine exception

-

engineException

-

Critical

-

The traffic is too heavy or there are malicious processes or plug-ins.

-
  1. Check the execution of plug-ins and processes to see whether they occupy too many resources.
  2. Check the instance monitoring information to see whether there is a sharp increase in the number of instances.
-

The instance cannot be executed.

-

Playbook instance execution failed

-

playbookInstanceExecFailed

-

Minor

-

Playbooks or processes are incorrectly configured.

-

Check the instance monitoring information to find the cause of the failure, and modify the playbook and process configuration.

-

None

-

Playbook instance increased sharply

-

playbookInstanceIncreaseSharply

-

Minor

-

Playbooks or processes are incorrectly configured.

-

Check the instance monitoring information to find the cause of the increase, and modify the playbook and process configuration.

-

None

-

Log messages increased sharply

-

logIncrease

-

Major

-

The upstream services suddenly generate a large number of log messages.

-

Check whether the upstream services are normal.

-

None

-

Log messages decreased sharply

-

logsDecrease

-

Major

-

Logs generated by the upstream services suddenly decrease.

-

Check whether the upstream services are normal.

-

None

-
-
- -
- - - - - - - - - - - - - - - - - - - -
Table 22 Key Pair Service

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

KPS

-

SYS.KPS

-

Key pair deleted

-

KPSDeleteKeypair

-

Informational

-

A key pair was deleted. This operation cannot be undone.

-

If this event occurred frequently within a short period of time, check whether malicious deletion took place.

-

Deleted key pairs cannot be restored.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 23 Host Security Service

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

HSS

-

SYS.HSS

-

HSS agent disconnected

-

hssAgentAbnormalOffline

-

Major

-

The communication between the agent and the server is abnormal, or the agent process on the server is abnormal.

-

Fix your network connection. If the agent is still offline for a long time after the network recovers, the agent process may be abnormal. In this case, log in to the server and restart the agent process.

-

Services are interrupted.

-

Abnormal HSS agent status

-

hssAgentAbnormalProtection

-

Major

-

The agent is abnormal probably because it does not have sufficient resources.

-

Log in to the server and check your resources. If the usage of memory or other system resources is too high, increase their capacity first. If the resources are sufficient but the fault persists after the agent process is restarted, submit a service ticket to the O&M personnel.

-

Services are interrupted.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 24 Image Management Service

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

IMS

-

SYS.IMS

-

Create Image

-

createImage

-

Major

-

An image was created.

-

None

-

You can use this image to create cloud servers.

-

Update Image

-

updateImage

-

Major

-

Metadata of an image was modified.

-

None

-

Cloud servers may fail to be created from this image.

-

Delete Image

-

deleteImage

-

Major

-

An image was deleted.

-

None

-

This image will be unavailable on the management console.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 25 Cloud Storage Gateway (CSG)

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

CSG

-

SYS.CSG

-

Abnormal CSG process status

-

gatewayProcessStatusAbnormal

-

Major

-

This event is triggered when an exception occurs in the CSG process status.

-

Abnormal CSG connection status

-

gatewayToServiceConnectAbnormal

-

Major

-

This event is triggered when no CSG status report is returned for five consecutive periods.

-

Abnormal connection status between CSG and OBS

-

gatewayToObsConnectAbnormal

-

Major

-

This event is triggered when CSG cannot connect to OBS.

-

Read-only file system

-

gatewayFileSystemReadOnly

-

Major

-

This event is triggered when the partition file system on CSG becomes read-only.

-

Read-only file share

-

gatewayFileShareReadOnly

-

Major

-

This event is triggered when the file share becomes read-only due to insufficient cache disk storage space.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 26 Global Accelerator

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

GA

-

SYS.GA

-

Anycast IP address blocked

-

blockAIP

-

Critical

-

The used bandwidth of an EIP exceeded 5 Gbit/s, so the EIP was blocked and packets were discarded. Such an event may be caused by DDoS attacks.

-

Locate the root cause and rectify the fault.

-

Services are affected. The traffic will not be properly forwarded.

-

Anycast IP address unblocked

-

unblockAIP

-

Critical

-

The anycast IP address was unblocked.

-

Ensure that traffic can be properly forwarded.

-

None

-

Unhealthy endpoint

-

healthCheckError

-

Major

-

The health check detected that the endpoint is unhealthy.

-

Submit a service ticket.

-

-

If an endpoint is considered unhealthy, traffic will not be forwarded to it until the endpoint recovers.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 27 Cloud Certificate Manager (CCM)

Event Source

-

Namespace

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

CCM

-

SYS.CCM

-

Certificate revocation

-

CCMRevokeCertificate

-

Major

-

The certificate enters into the revocation process. Once revoked, the certificate cannot be used anymore.

-

Check whether the certificate revocation is really needed. Certificate revocation can be canceled.

-

If a certificate is revoked, the website is inaccessible using HTTPS.

-

Certificate auto-deployment failure

-

CCMAutoDeploymentFailure

-

Major

-

The certificate fails to be automatically deployed.

-

Check service resources whose certificates need to be replaced.

-

If no new certificate is deployed after a certificate expires, the website is inaccessible using HTTPS.

-

Certificate expiration

-

CCMCertificateExpiration

-

Major

-

An SSL certificate has expired.

-

Purchase a new certificate in a timely manner.

-

If no new certificate is deployed after a certificate expires, the website is inaccessible using HTTPS.

-

Certificate about to expire

-

CCMcertificateAboutToExpiration

-

Major

-

This alarm is generated when an SSL certificate is about to expire in one week, one month, or two months.

-

Renew or purchase a new certificate in a timely manner.

-

If no new certificate is deployed after a certificate expires, the website is inaccessible using HTTPS.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 28 Bare Metal Server (BMS)

Event Source

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

BMS

-

ECC uncorrectable errors generated on GPU SRAM

-

SRAMUncorrectableEccError

-

Major

-

There are ECC uncorrectable errors generated on GPU SRAM.

-

If services are affected, submit a service ticket.

-

The GPU hardware may be faulty. As a result, the GPU memory is faulty, and services exit abnormally.

+

The GPU hardware may be faulty. As a result, the GPU memory is faulty, and services exit abnormally.

osShutdown

+

osShutdown

osShutdown

+

osShutdown

Major

+

Major

The BMS was stopped

+

The BMS was stopped

  • on the management console.
  • by calling APIs.
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.

Services are interrupted.

+

Services are interrupted.

Abnormal shutdown

+

Abnormal shutdown

serverShutdown

+

serverShutdown

Major

+

Major

The BMS was stopped unexpectedly, which may be caused by

+

The BMS was stopped unexpectedly, which may be caused by

  • unexpected power-off.
  • hardware faults.
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.

Services are interrupted.

+

Services are interrupted.

Abnormal reboot

+

Abnormal reboot

serverReboot

+

serverReboot

Major

+

Major

The BMS restarted unexpectedly, which may be caused by

+

The BMS restarted unexpectedly, which may be caused by

  • OS faults.
  • hardware faults.
  • Deploy service applications in HA mode.
  • After the BMS is restarted, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the BMS is restarted, check whether services recover.

Services are interrupted.

+

Services are interrupted.

Network interruption

+

Network interruption

linkDown

+

linkDown

Major

+

Major

The BMS network was disconnected. Possible causes are as follows:

+

The BMS network was disconnected. Possible causes are as follows:

  • The BMS was unexpectedly stopped or restarted.
  • The switch was faulty.
  • The gateway was faulty.
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.

Services are interrupted.

+

Services are interrupted.

PCIE error

+

PCIE error

pcieError

+

pcieError

Major

+

Major

The PCIe devices or main board of the BMS was faulty.

+

The PCIe devices or main board of the BMS was faulty.

  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the BMS is started, check whether services recover.

The network or disk read/write services are affected.

+

The network or disk read/write services are affected.

Disk error

+

Disk error

diskError

+

diskError

Major

+

Major

The disk backplane or disks of the BMS were faulty.

+

The disk backplane or disks of the BMS were faulty.

  • Deploy service applications in HA mode.
  • After the fault is rectified, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the fault is rectified, check whether services recover.

Data read/write services are affected, or the BMS cannot be started.

+

Data read/write services are affected, or the BMS cannot be started.

Storage error

+

Storage error

storageError

+

storageError

Major

+

Major

The BMS failed to connect to EVS disks. Possible causes are as follows:

+

The BMS failed to connect to EVS disks. Possible causes are as follows:

  • The SDI card was faulty.
  • Remote storage devices were faulty.
  • Deploy service applications in HA mode.
  • After the fault is rectified, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the fault is rectified, check whether services recover.

Data read/write services are affected, or the BMS cannot be started.

+

Data read/write services are affected, or the BMS cannot be started.

OS reboot

+

OS reboot

osReboot

+

osReboot

Major

+

Major

The BMS was restarted

+

The BMS was restarted

  • on the management console.
  • by calling APIs.
  • Deploy service applications in HA mode.
  • After the BMS is restarted, check whether services recover.
+
  • Deploy service applications in HA mode.
  • After the BMS is restarted, check whether services recover.

Services are interrupted.

+

Services are interrupted.

Inforom alarm generated on GPU

+

Inforom alarm generated on GPU

gpuInfoROMAlarm

+

gpuInfoROMAlarm

Major

+

Major

The driver failed to read inforom information due to GPU faults.

+

The driver failed to read inforom information due to GPU faults.

Non-critical services can continue to use the GPU card. For critical services, submit a service ticket to resolve this issue.

+

Non-critical services can continue to use the GPU card. For critical services, submit a service ticket to resolve this issue.

Services will not be affected if inforom information cannot be read. If error correction code (ECC) errors are reported on GPU, faulty pages may not be automatically retired and services are affected.

+

Services will not be affected if inforom information cannot be read. If error correction code (ECC) errors are reported on GPU, faulty pages may not be automatically retired and services are affected.

Double-bit ECC alarm generated on GPU

+

Double-bit ECC alarm generated on GPU

doubleBitEccError

+

doubleBitEccError

Major

+

Major

A double-bit ECC error occurred on GPU.

+

A double-bit ECC error occurred on GPU.

  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.
+
  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.

Services may be interrupted. After faulty pages are retired, the GPU card can continue to be used.

+

Services may be interrupted. After faulty pages are retired, the GPU card can continue to be used.

Too many retired pages

+

Too many retired pages

gpuTooManyRetiredPagesAlarm

+

gpuTooManyRetiredPagesAlarm

Major

+

Major

An ECC page retirement error occurred on GPU.

+

An ECC page retirement error occurred on GPU.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

Services may be affected.

+

Services may be affected.

ECC alarm generated on GPU A100

+

ECC alarm generated on GPU A100

gpuA100EccAlarm

+

gpuA100EccAlarm

Major

+

Major

An ECC error occurred on GPU.

+

An ECC error occurred on GPU.

  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.
+
  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.

Services may be interrupted. After faulty pages are retired, the GPU card can continue to be used.

+

Services may be interrupted. After faulty pages are retired, the GPU card can continue to be used.

GPU ECC memory page retirement failure

+

GPU ECC memory page retirement failure

eccPageRetirementRecordingFailure

+

eccPageRetirementRecordingFailure

Major

+

Major

Automatic page retirement failed due to ECC errors.

+

Automatic page retirement failed due to ECC errors.

  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.
+
  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.

Services may be interrupted, and memory page retirement fails. As a result, services can no longer use the GPU card.

+

Services may be interrupted, and memory page retirement fails. As a result, services can no longer use the GPU card.

GPU ECC page retirement alarm generated

+

GPU ECC page retirement alarm generated

eccPageRetirementRecordingEvent

+

eccPageRetirementRecordingEvent

Minor

+

Minor

Memory pages are automatically retired due to ECC errors.

+

Memory pages are automatically retired due to ECC errors.

  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.
+
  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.

Generally, this alarm is generated together with the ECC error alarm. If this alarm is generated independently, services are not affected.

+

Generally, this alarm is generated together with the ECC error alarm. If this alarm is generated independently, services are not affected.

Too many single-bit ECC errors on GPU

+

Too many single-bit ECC errors on GPU

highSingleBitEccErrorRate

+

highSingleBitEccErrorRate

Major

+

Major

There are too many single-bit ECC errors.

+

There are too many single-bit ECC errors.

  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.
+
  1. If services are interrupted, restart the services to restore.
  2. If services cannot be restarted, restart the VM where services are running.
  3. If services still cannot be restored, submit a service ticket.

Single-bit errors can be automatically rectified and do not affect GPU-related applications.

+

Single-bit errors can be automatically rectified and do not affect GPU-related applications.

GPU card not found

+

GPU card not found

gpuDriverLinkFailureAlarm

+

gpuDriverLinkFailureAlarm

Major

+

Major

A GPU link is normal, but the NVIDIA driver cannot find the GPU card.

+

A GPU link is normal, but the NVIDIA driver cannot find the GPU card.

  1. Restart the VM to restore services.
  2. If services still cannot be restored, submit a service ticket.
+
  1. Restart the VM to restore services.
  2. If services still cannot be restored, submit a service ticket.

The GPU card cannot be found.

+

The GPU card cannot be found.

GPU link faulty

+

GPU link faulty

gpuPcieLinkFailureAlarm

+

gpuPcieLinkFailureAlarm

Major

+

Major

GPU hardware information cannot be queried through lspci due to a GPU link fault.

+

GPU hardware information cannot be queried through lspci due to a GPU link fault.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

The driver cannot use GPU.

+

The driver cannot use GPU.

GPU card lost

+

GPU card lost

vmLostGpuAlarm

+

vmLostGpuAlarm

Major

+

Major

The number of GPU cards on the VM is less than the number specified in the specifications.

+

The number of GPU cards on the VM is less than the number specified in the specifications.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

GPU cards get lost.

+

GPU cards get lost.

GPU memory page faulty

+

GPU memory page faulty

gpuMemoryPageFault

+

gpuMemoryPageFault

Major

+

Major

The GPU memory page is faulty, which may be caused by applications, drivers, or hardware.

+

The GPU memory page is faulty, which may be caused by applications, drivers, or hardware.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

The GPU hardware may be faulty. As a result, the GPU memory is faulty, and services exit abnormally.

+

The GPU hardware may be faulty. As a result, the GPU memory is faulty, and services exit abnormally.

GPU image engine faulty

+

GPU image engine faulty

graphicsEngineException

+

graphicsEngineException

Major

+

Major

The GPU image engine is faulty, which may be caused by applications, drivers, or hardware.

+

The GPU image engine is faulty, which may be caused by applications, drivers, or hardware.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

The GPU hardware may be faulty. As a result, the image engine is faulty, and services exit abnormally.

+

The GPU hardware may be faulty. As a result, the image engine is faulty, and services exit abnormally.

GPU temperature too high

+

GPU temperature too high

highTemperatureEvent

+

highTemperatureEvent

Major

+

Major

GPU temperature too high

+

GPU temperature too high

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

If the GPU temperature exceeds the threshold, the GPU performance may deteriorate.

+

If the GPU temperature exceeds the threshold, the GPU performance may deteriorate.

GPU NVLink faulty

+

GPU NVLink faulty

nvlinkError

+

nvlinkError

Major

+

Major

A hardware fault occurs on the NVLink.

+

A hardware fault occurs on the NVLink.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

The NVLink link is faulty and unavailable.

+

The NVLink link is faulty and unavailable.

nvidia-smi suspended

+

nvidia-smi suspended

nvidiaSmiHangEvent

+

nvidiaSmiHangEvent

Major

+

Major

nvidia-smi timed out.

+

nvidia-smi timed out.

If services are affected, submit a service ticket.

+

If services are affected, submit a service ticket.

The driver may report an error during service running.

+

The driver may report an error during service running.

-
Table 29 Elastic IP and bandwidth

Event Source

+
- - - - - - - - - - - - - - - - - - - - - - - - -
Table 18 Virtual Private Cloud (VPC)

Event Source

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

Elastic IP and bandwidth

+

Elastic IP and bandwidth

Delete VPC

+

Delete VPC

deleteVpc

+

deleteVpc

Major

+

Major

Modify VPC

+

Modify VPC

modifyVpc

+

modifyVpc

Minor

+

Minor

Delete subnet

+

Delete subnet

deleteSubnet

+

deleteSubnet

Minor

+

Minor

Modify subnet

+

Modify subnet

modifySubnet

+

modifySubnet

Minor

+

Minor

Modify bandwidth

+

Modify bandwidth

modifyBandwidth

+

modifyBandwidth

Minor

+

Minor

Delete VPN

+

Delete VPN

deleteVpn

+

deleteVpn

Major

+

Major

Modify VPN

+

Modify VPN

modifyVpn

+

modifyVpn

Minor

+

Minor

-
Table 30 Cloud Phone Host (CPH)

Event Source

+
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 19 Object Storage Service (OBS)

Event Source

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

-

Description

-

Solution

-

Impact

+

Event Severity

CPH

+

OBS

Server shutdown

+

Delete bucket

cphServerOsShutdown

+

deleteBucket

Major

-

The cloud phone server was stopped

-
  • on the management console.
  • by calling APIs.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

Services are interrupted.

+

Major

Server abnormal shutdown

+

Delete bucket policy

cphServerShutdown

+

deleteBucketPolicy

Major

-

The cloud phone server was stopped unexpectedly. Possible causes are as follows:

-
  • The cloud phone server was powered off unexpectedly.
  • The cloud phone server was stopped due to hardware faults.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

Services are interrupted.

+

Major

Server reboot

+

Set bucket ACL

cphServerOsReboot

+

setBucketAcl

Major

-

The cloud phone server was rebooted

-
  • on the management console.
  • by calling APIs.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

Services are interrupted.

+

Minor

Server abnormal reboot

+

Set bucket policy

cphServerReboot

+

setBucketPolicy

Major

-

The cloud phone server was rebooted unexpectedly due to

-
  • OS faults.
  • hardware faults.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

-

Services are interrupted.

-

Network interruption

-

cphServerlinkDown

-

Major

-

The network where the cloud phone server was deployed was disconnected. Possible causes are as follows:

-
  • The cloud phone server was stopped unexpectedly and rebooted.
  • The switch was faulty.
  • The gateway node was faulty.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

-

Services are interrupted.

-

PCIE error

-

cphServerPcieError

-

Major

-

The PCIe device or main board on the cloud phone server was faulty.

-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

-

The network or disk read/write is affected.

-

Disk error

-

cphServerDiskError

-

Major

-

The disk on the cloud phone server was faulty due to

-
  • disk backplane faults.
  • disk faults.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

-

Data read/write services are affected, or the BMS cannot be started.

-

Storage error

-

cphServerStorageError

-

Major

-

The cloud phone server could not connect to EVS disks. Possible causes are as follows:

-
  • SDI card faults
  • Remote storage devices were faulty.
-

Deploy service applications in HA mode.

-

After the fault is rectified, check whether services recover.

-

-

Data read/write services are affected, or the BMS cannot be started.

-

GPU offline

-

cphServerGpuOffline

-

Major

-

GPU of the cloud phone server was loose and disconnected.

-

Stop the cloud phone server and reboot it.

-

Faults occur on cloud phones whose GPUs are disconnected. Cloud phones cannot run properly even if they are restarted or reconfigured.

-

GPU timeout

-

cphServerGpuTimeOut

-

Major

-

GPU of the cloud phone server timed out.

-

Reboot the cloud phone server.

-

-

Cloud phones whose GPUs timed out cannot run properly and are still faulty even if they are restarted or reconfigured.

-

Disk full

-

cphServerDiskFull

-

Major

-

Disk space of the cloud phone server was used up.

-

Clear the application data in the cloud phone to release space.

-

-

Cloud phone is sub-healthy, prone to failure, and unable to start.

-

Disk readonly

-

cphServerDiskReadOnly

-

Major

-

The disk of the cloud phone server became read-only.

-

Reboot the cloud phone server.

-

-

Cloud phone is sub-healthy, prone to failure, and unable to start.

-

Phone metadata damage

-

cphPhoneMetaDataDamage

-

Major

-

Cloud phone metadata was damaged.

-

Contact O&M personnel.

-

The cloud phone cannot run properly even if it is restarted or reconfigured.

-

GPU failure

-

gpuAbnormal

-

Critical

-

The GPU was faulty.

-

Submit a service ticket.

-

Services are interrupted.

-

GPU back to normal

-

gpuNormal

-

Informational

-

The GPU was running properly.

-

No further action is required.

-

N/A

-

Kernel crash

-

kernelCrash

-

Critical

-

The kernel log indicated crash.

-

Submit a service ticket.

-

Services are interrupted during the crash.

-

Kernel OOM

-

kernelOom

-

Major

-

The kernel log indicated out of memory.

-

Submit a service ticket.

-

Services are interrupted.

-

Hardware malfunction

-

hardwareError

-

Critical

-

The kernel log indicated Hardware Error.

-

Submit a service ticket.

-

Services are interrupted.

-

PCIE error

-

pcieAer

-

Critical

-

The kernel log indicated PCIe Bus Error.

-

Submit a service ticket.

-

Services are interrupted.

-

SCSI error

-

scsiError

-

Critical

-

The kernel log indicated SCSI Error.

-

Submit a service ticket.

-

Services are interrupted.

-

Image storage became read-only

-

partReadOnly

-

Critical

-

The image storage became read-only.

-

Submit a service ticket.

-

Services are interrupted.

-

Image storage superblock damaged

-

badSuperBlock

-

Critical

-

The superblock of the file system of the image storage was damaged.

-

Submit a service ticket.

-

Services are interrupted.

-

Image storage /.sharedpath/master became read-only

-

isuladMasterReadOnly

-

Critical

-

Mount point /.sharedpath/master of the image storage became read-only.

-

Submit a service ticket.

-

Services are interrupted.

-

Cloud phone data disk became read-only

-

cphDiskReadOnly

-

Critical

-

The cloud phone data disk became read-only.

-

Submit a service ticket.

-

Services are interrupted.

-

Cloud phone data disk superblock damaged

-

cphDiskBadSuperBlock

-

Critical

-

The superblock of the file system of the cloud phone data disk was damaged.

-

Submit a service ticket.

-

Services are interrupted.

+

Minor

-
Table 31 Object Storage Service (OBS)

Event Source

+
- - - + + + - - - - - - - - - - - - - - - - - - -
Table 20 Elastic IP (EIP)

Event Source

Event Name

+

Event Name

Event ID

+

Event ID

Event Severity

+

Event Severity

+

Description

+

Solution

+

Impact

OBS

+

EIP

Delete bucket

+

EIP bandwidth overflow

deleteBucket

+

EIPBandwidthOverflow

Major

+

Major

Delete bucket policy

-

deleteBucketPolicy

-

Major

-

Set bucket ACL

-

setBucketAcl

-

Minor

-

Set bucket policy

-

setBucketPolicy

-

Minor

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 32 Enterprise connection

Event Source

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

EC

-

The WAN interface goes up

-

EquipmentWanGoesOnline

-

Major

-

The WAN interface goes online.

-

None

-

None

-

The WAN interface goes down

-

EquipmentWanGoesOffline

-

Major

-

The WAN interface goes offline.

-

Check whether the event is caused by a manual operation or device fault.

-

The device cannot be used.

-

BGP peer disconnection

-

BgpPeerDisconnection

-

Major

-

BGP peer disconnection

-

Check whether the event is caused by a manual operation or device fault.

-

The device cannot be used.

-

BGP peer connection success

-

BgpPeerConnectionSuccess

-

Major

-

The BGP peer is successfully connected.

-

None

-

None

-

Abnormal GRE tunnel status

-

AbnormalGreTunnelStatus

-

Major

-

Abnormal GRE tunnel status

-

Check whether the event is caused by a manual operation or device fault.

-

The device cannot be used.

-

Normal GRE tunnel status

-

NormalGreTunnelStatus

-

Major

-

The GRE tunnel status is normal.

-

None

-

None

-

The intelligent enterprise gateway goes online

-

IntelligentEnterpriseGatewayGoesOnline

-

Major

-

The intelligent enterprise gateway goes online.

-

None

-

None

-

The intelligent enterprise gateway goes offline

-

IntelligentEnterpriseGatewayGoesOffline

-

Major

-

The intelligent enterprise gateway goes offline.

-

Check whether the event is caused by a manual operation or device fault.

-

The device cannot be used.

-
-
- -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Table 33 Elastic IP (EIP)

Event Source

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

EIP

-

EIP bandwidth overflow

-

EIPBandwidthOverflow

-

Major

-

The used bandwidth exceeded the purchased one, which may slow down the network or cause packet loss. The value of this event is the maximum value in a monitoring period, and the value of the EIP inbound and outbound bandwidth is the value at a specific time point in the period.

+

The used bandwidth exceeded the purchased one, which may slow down the network or cause packet loss. The value of this event is the maximum value in a monitoring period, and the value of the EIP inbound and outbound bandwidth is the value at a specific time point in the period.

The metrics are described as follows:

-

egressDropBandwidth: dropped outbound packets (bytes)

-

egressAcceptBandwidth: accepted outbound packets (bytes)

-

egressMaxBandwidthPerSec: peak outbound bandwidth (byte/s)

-

ingressAcceptBandwidth: accepted inbound packets (bytes)

-

ingressMaxBandwidthPerSec: peak inbound bandwidth (byte/s)

-

ingressDropBandwidth: dropped inbound packets (bytes)

+

egressDropBandwidth: dropped outbound packets (bytes)

+

egressAcceptBandwidth: accepted outbound packets (bytes)

+

egressMaxBandwidthPerSec: peak outbound bandwidth (byte/s)

+

ingressAcceptBandwidth: accepted inbound packets (bytes)

+

ingressMaxBandwidthPerSec: peak inbound bandwidth (byte/s)

+

ingressDropBandwidth: dropped inbound packets (bytes)

Check whether the EIP bandwidth keeps increasing and whether services are normal. Increase bandwidth if necessary.

+

Check whether the EIP bandwidth keeps increasing and whether services are normal. Increase bandwidth if necessary.

The network becomes slow or packets are lost.

+

The network becomes slow or packets are lost.

Delete EIP

+

Delete EIP

deleteEip

+

deleteEip

Minor

+

Minor

The EIP was released.

+

The EIP was released.

Check whether the EIP was released by mistake.

+

Check whether the EIP was released by mistake.

The server that has the EIP bound cannot access the Internet.

+

The server that has the EIP bound cannot access the Internet.

EIP blocked

+

EIP blocked

blockEIP

+

blockEIP

Critical

+

Critical

The used bandwidth of an EIP exceeded 5 Gbit/s, so the EIP was blocked and packets were discarded. Such an event may be caused by DDoS attacks.

+

The used bandwidth of an EIP exceeded 5 Gbit/s, so the EIP was blocked and packets were discarded. Such an event may be caused by DDoS attacks.

Replace the EIP to prevent services from being affected.

+

Replace the EIP to prevent services from being affected.

Locate and deal with the fault.

Services are impacted.

+

Services are impacted.

EIP unblocked

+

EIP unblocked

unblockEIP

+

unblockEIP

Critical

+

Critical

The EIP was unblocked.

+

The EIP was unblocked.

Use the previous EIP again.

+

Use the previous EIP again.

None

+

None

Start DDoS traffic scrubbing

+

Start DDoS traffic scrubbing

ddosCleanEIP

+

ddosCleanEIP

Major

+

Major

Traffic scrubbing on the EIP was started to prevent DDoS attacks.

+

Traffic scrubbing on the EIP was started to prevent DDoS attacks.

Check whether the EIP was attacked.

+

Check whether the EIP was attacked.

Services may be interrupted.

+

Services may be interrupted.

Stop DDoS traffic scrubbing

+

Stop DDoS traffic scrubbing

ddosEndCleanEip

+

ddosEndCleanEip

Major

+

Major

Traffic scrubbing on the EIP to prevent DDoS attacks was ended.

+

Traffic scrubbing on the EIP to prevent DDoS attacks was ended.

Check whether the EIP was attacked.

+

Check whether the EIP was attacked.

Services may be interrupted.

+

Services may be interrupted.

Enterprise-class QoS bandwidth limit exceeded

+

Enterprise-class QoS bandwidth limit exceeded

EIPBandwidthRuleOverflow

+

EIPBandwidthRuleOverflow

Major

+

Major

The used QoS bandwidth exceeded the allocated one, which may slow down the network or cause packet loss. The value of this event is the maximum value in a monitoring period, and the value of the EIP inbound and outbound bandwidth is the value at a specific time point in the period.

-

egressDropBandwidth: dropped outbound packets (bytes)

-

egressAcceptBandwidth: accepted outbound packets (bytes)

-

egressMaxBandwidthPerSec: peak outbound bandwidth (byte/s)

-

ingressAcceptBandwidth: accepted inbound packets (bytes)

-

ingressMaxBandwidthPerSec: peak inbound bandwidth (byte/s)

-

ingressDropBandwidth: dropped inbound packets (bytes)

+

The used QoS bandwidth exceeded the allocated one, which may slow down the network or cause packet loss. The value of this event is the maximum value in a monitoring period, and the value of the EIP inbound and outbound bandwidth is the value at a specific time point in the period.

+

egressDropBandwidth: dropped outbound packets (bytes)

+

egressAcceptBandwidth: accepted outbound packets (bytes)

+

egressMaxBandwidthPerSec: peak outbound bandwidth (byte/s)

+

ingressAcceptBandwidth: accepted inbound packets (bytes)

+

ingressMaxBandwidthPerSec: peak inbound bandwidth (byte/s)

+

ingressDropBandwidth: dropped inbound packets (bytes)

Check whether the EIP bandwidth keeps increasing and whether services are normal. Increase bandwidth if necessary.

+

Check whether the EIP bandwidth keeps increasing and whether services are normal. Increase bandwidth if necessary.

The network becomes slow or packets are lost.

-
-
- -
- - - - - - - - - - - - - - - diff --git a/docs/ces/api-ref/ces_03_0033.html b/docs/ces/api-ref/ces_03_0033.html index d79aa592e..894fa96ca 100644 --- a/docs/ces/api-ref/ces_03_0033.html +++ b/docs/ces/api-ref/ces_03_0033.html @@ -155,11 +155,11 @@
Table 34 Enterprise Switch

Event Source

-

Event Name

-

Event ID

-

Event Severity

-

Description

-

Solution

-

Impact

-

Enterprise Switch

-

IP Address Conflict

-

IPConflict

-

Major

-

A cloud server and an on-premises server that need to communicate use the same IP address.

-

Check the ARP and switch information to locate the servers that have the same IP address and change the IP address.

-

The communications between the on-premises and cloud servers may be abnormal.

+

The network becomes slow or packets are lost.

-
Table 4 datapoints data structure description

Parameter

+
- - diff --git a/docs/ces/api-ref/ces_03_0034.html b/docs/ces/api-ref/ces_03_0034.html index d13e711c2..3dd8e9c8b 100644 --- a/docs/ces/api-ref/ces_03_0034.html +++ b/docs/ces/api-ref/ces_03_0034.html @@ -27,11 +27,11 @@

Request

  1. The size of a POST request cannot exceed 512 KB. Otherwise, the request will be denied.
  2. The default maximum query intervals of different periods are different.

    If period is 1, the maximum interval between from and to is 4 hours. If the interval between from and to is longer than 4 hours, adjust the value of from to to - 4*3600*1000.

    -

    If period is 300, the maximum interval between from and to is one day. If the interval between from and to is longer than one day, adjust the value of from to to - 24*3600*1000.

    -

    If period is 1200, the maximum interval between from and to is three days. If the interval between from and to is longer than three days, adjust the value of from to to - 3*24*3600*1000.

    -

    If period is 3600, the maximum interval between from and to is 10 days. If the interval between from and to is longer than 10 days, adjust the value of from to to - 10*24*3600*1000.

    -

    If period is 14400, the maximum interval between from and to is 30 days. If the interval between from and to is longer than 30 days, adjust the value of from to to - 30*24*3600*1000.

    -

    If period is 86400, the maximum interval between from and to is 180 days. If the interval between from and to is longer than 180 days, adjust the value of from to to - 180*24*3600*1000.

    +

    If period is 300, the maximum interval between from and to is one day. If the interval between from and to is longer than one day, adjust the value of from to to - 24*3600*1000.

    +

    If period is 1200, the maximum interval between from and to is three days. If the interval between from and to is longer than three days, adjust the value of from to to - 3*24*3600*1000.

    +

    If period is 3600, the maximum interval between from and to is 10 days. If the interval between from and to is longer than 10 days, adjust the value of from to to - 10*24*3600*1000.

    +

    If period is 14400, the maximum interval between from and to is 30 days. If the interval between from and to is longer than 30 days, adjust the value of from to to - 30*24*3600*1000.

    +

    If period is 86400, the maximum interval between from and to is 180 days. If the interval between from and to is longer than 180 days, adjust the value of from to to - 180*24*3600*1000.
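The period limits above can be sketched as a small client-side helper. This is a minimal illustration, not part of the Cloud Eye API: the name `clamp_from` and the lookup table are assumptions made for this example, and the values are transcribed from the list above.

```python
# Maximum query interval per period, in milliseconds, transcribed from the list above.
MAX_INTERVAL_MS = {
    1: 4 * 3600 * 1000,
    300: 24 * 3600 * 1000,
    1200: 3 * 24 * 3600 * 1000,
    3600: 10 * 24 * 3600 * 1000,
    14400: 30 * 24 * 3600 * 1000,
    86400: 180 * 24 * 3600 * 1000,
}


def clamp_from(period: int, from_ms: int, to_ms: int) -> int:
    """Return the `from` value to use for a query window that ends at `to_ms`.

    `clamp_from` is a hypothetical helper name. If the requested window is
    longer than the maximum for `period`, `from` is moved to `to` minus the
    maximum interval; otherwise it is returned unchanged.
    """
    max_interval = MAX_INTERVAL_MS[period]
    if to_ms - from_ms > max_interval:
        return to_ms - max_interval
    return from_ms
```

For example, with period 300 and a requested window of two days, the helper returns a `from` value exactly one day before `to`.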

  • Request parameters @@ -100,13 +100,13 @@
Table 4 datapoints data structure description

Parameter

Type

+

Type

Description

+

Description

-
Table 3 metrics data structure description

Parameter

+
- - - @@ -149,13 +149,13 @@
Table 3 metrics data structure description

Parameter

Mandatory

+

Mandatory

Type

+

Type

Description

+

Description

-
Table 4 dimensions data structure description

Parameter

+
- - - @@ -365,7 +365,7 @@

Response

  • Response parameters
Table 4 dimensions data structure description

Parameter

Mandatory

+

Mandatory

Type

+

Type

Description

+

Description

- @@ -383,11 +383,11 @@
Table 5 Parameter description

Parameter

Type

+

Type

Description

-
Table 6 metrics data structure description

Parameter

+
- - @@ -437,11 +437,11 @@
Table 6 metrics data structure description

Parameter

Type

+

Type

Description

+

Description

-
Table 7 dimensions data structure description

Parameter

+
- - @@ -465,11 +465,11 @@
Table 7 dimensions data structure description

Parameter

Type

+

Type

Description

+

Description

-
Table 8 datapoints data structure description

Parameter

+
- - diff --git a/docs/ces/api-ref/ces_03_0046.html b/docs/ces/api-ref/ces_03_0046.html index 634c33b10..5b9a431a8 100644 --- a/docs/ces/api-ref/ces_03_0046.html +++ b/docs/ces/api-ref/ces_03_0046.html @@ -6,7 +6,7 @@

You can grant users permissions by using roles and policies. A policy consists of permissions for an entire service. Users with such a policy assigned are granted all of the permissions required for that service. Policies define API-based permissions for operations on specific resources, allowing for more fine-grained, secure access control of cloud resources.

If you want to allow or deny the access to an API, use policies for authorization.

-

An account has permissions to call all APIs. An IAM user under the account can call specific APIs only after being assigned the required permissions. The permissions required for calling an API are determined by the actions supported by the API. Only users who have been granted permissions allowing the actions can call the API successfully. For example, if an IAM user queries the alarm rule list using an API, the user must have been granted permissions that allow the ces:alarms:list action.

+

An account has all the permissions required to call all APIs, but IAM users must be assigned the required permissions. The permissions required for calling an API are determined by the actions supported by the API. Only users who have been granted permissions allowing the actions can call the API successfully. For example, if an IAM user queries the alarm rule list using an API, the user must have been granted permissions that allow the ces:alarms:list action.

Supported Actions

Cloud Eye provides system-defined policies that can be directly used in IAM. You can also create custom policies and use them to supplement system-defined policies, implementing more refined access control. Operations supported by policies are specific to APIs. The following are common concepts related to policies:

  • Permissions: Defined by actions in a custom policy.
  • Actions: Added to a custom policy to control permissions for specific operations.
  • Related actions: Actions on which a specific action depends to take effect. When assigning permissions for the action to a user, you also need to assign permissions for the dependent actions.
  • Authorization Scope: A custom policy can be applied to IAM projects or enterprise projects or both. Policies that contain actions supporting both IAM and enterprise projects can be assigned to user groups and take effect in both IAM and Enterprise Management. Policies that only contain actions supporting IAM projects can be assigned to user groups and only take effect for IAM. Such policies will not take effect if they are assigned to user groups in Enterprise Management.
  • APIs: REST APIs that can be called in a custom policy.

Cloud Eye supports the following actions that can be defined in custom policies:

diff --git a/docs/ces/api-ref/ces_03_0053.html b/docs/ces/api-ref/ces_03_0053.html index f1fefa14a..bed8908df 100644 --- a/docs/ces/api-ref/ces_03_0053.html +++ b/docs/ces/api-ref/ces_03_0053.html @@ -1,8 +1,11 @@ -

Common Parameters

-

+ +

Common Parameters

+ +

+
- @@ -102,6 +103,13 @@ + + + + - @@ -127,6 +136,13 @@ + + + + - + + + - diff --git a/docs/ces/api-ref/ces_03_0075.html b/docs/ces/api-ref/ces_03_0075.html index a5160486a..93793d248 100644 --- a/docs/ces/api-ref/ces_03_0075.html +++ b/docs/ces/api-ref/ces_03_0075.html @@ -81,7 +81,7 @@ - @@ -90,7 +90,7 @@ - @@ -168,7 +168,7 @@ -
Table 8 datapoints data structure description

Parameter

Type

+

Type

Description

+

Description

CBR metrics

Network

+

Network

+

Elastic IP and bandwidth

NAT Gateway metrics

Virtual Private Network

+

SYS.VPN

+

VPN metrics

+

Security

Web Application Firewall

@@ -111,7 +119,8 @@

WAF metrics

Application

+

Application

+

Distributed Message Service

DCS metrics

API Gateway

+

SYS.APIC

+

API Gateway metrics

+

Database

Relational Database Service

diff --git a/docs/ces/api-ref/ces_03_0060.html b/docs/ces/api-ref/ces_03_0060.html index b7d1f717a..aaa802b4e 100644 --- a/docs/ces/api-ref/ces_03_0060.html +++ b/docs/ces/api-ref/ces_03_0060.html @@ -8,7 +8,13 @@

2024-01-04

+

2025-01-04

+

This release incorporates the following changes:

+

Updated Reporting Events, added the dimensions parameter to Table 4, and added Table 5.

+

2024-01-04

This release incorporates the following changes:

diff --git a/docs/ces/api-ref/ces_03_0074.html b/docs/ces/api-ref/ces_03_0074.html index feaaf7100..26764de8a 100644 --- a/docs/ces/api-ref/ces_03_0074.html +++ b/docs/ces/api-ref/ces_03_0074.html @@ -175,7 +175,7 @@

No

Specifies the event source. If the event is a system event, the value is the namespace of each service. To view the namespace of each service, see Services Interconnected with Cloud Eye.

+

If the event is a system event, the source is the namespace of each service. To view the namespace of each service, see Services Interconnected with Cloud Eye.

If the event is a custom event, the event source is defined by the user.

from

Integer

+

Long

No

to

Integer

+

Long

No

No

Specifies the event source. If the event is a system event, the source is the namespace of each service. To view the namespace of each service, see Services Interconnected with Cloud Eye. If the event is a custom event, the event source is defined by the user.

+

Specifies the event source. For a system event, the source is the namespace of each service. To view the namespace of each service, see Services Interconnected with Cloud Eye. If the event is a custom event, the event source is defined by the user.

event_info

diff --git a/docs/ces/api-ref/en-us_topic_0032831274.html b/docs/ces/api-ref/en-us_topic_0032831274.html index 0af026f9d..b1ab04617 100644 --- a/docs/ces/api-ref/en-us_topic_0032831274.html +++ b/docs/ces/api-ref/en-us_topic_0032831274.html @@ -2,7 +2,7 @@

Adding Monitoring Data

Function

This API is used to add one or more pieces of custom metric monitoring data when system metrics cannot meet specific service requirements.

-

For details about the monitoring data retention period, see How Long Is Metric Data Retained?

+

For details about the monitoring data retention period, see How Long Is Metric Data Retained? in the Cloud Eye User Guide.

URI

POST /V1.0/{project_id}/metric-data

  • Parameter description diff --git a/docs/ces/api-ref/en-us_topic_0109034020.html b/docs/ces/api-ref/en-us_topic_0109034020.html index 35fb6f26d..979d83a32 100644 --- a/docs/ces/api-ref/en-us_topic_0109034020.html +++ b/docs/ces/api-ref/en-us_topic_0109034020.html @@ -27,80 +27,82 @@
  • Example
    POST https://{Cloud Eye endpoint}/V1.0/{project_id}/events
-

Request

  • Request parameters -
    Table 2 Parameter description

    Parameter

    +

    Request

    Events with the same time, project_id, event_source, event_name, event_type, event_state, event_level, event_user, resource_id, and resource_name fields are considered the same event.

    +
    +
    • Request parameters +
      - - - - - - -
      Table 2 Parameter description

      Parameter

      Type

      +

      Type

      Mandatory

      +

      Mandatory

      Description

      +

      Description

      [Array element]

      +

      [Array element]

      Array of EventItem objects

      +

      Array of EventItem objects

      Yes

      +

      Yes

      Specifies the event list.

      +

      Specifies the event list.

      -
      Table 3 Parameter description of the EventItem field

      Parameter

      +
      - - - - - - - - - - - - - - - - - - - @@ -108,125 +110,177 @@
      Table 3 Parameter description of the EventItem field

      Parameter

      Mandatory

      +

      Mandatory

      Type

      +

      Type

      Description

      +

      Description

      event_name

      +

      event_name

      Yes

      +

      Yes

      String

      +

      String

      Specifies the event name.

      +

      Specifies the event name.

      Start with a letter. Enter 1 to 64 characters. Only letters, digits, and underscores (_) are allowed.

      event_source

      +

      event_source

      Yes

      +

      Yes

      String

      +

      String

      Specifies the event source.

      +

      Specifies the event source.

      The format is service.item. Set this parameter based on the site requirements.

      service and item each must be a string that starts with a letter and contains 3 to 32 characters, including only letters, digits, and underscores (_).

      time

      +

      time

      Yes

      +

      Yes

      Long

      +

      Long

      Specifies when the event occurred, which is a UNIX timestamp (ms).

      +

      Specifies when the event occurred, which is a UNIX timestamp (ms).

      NOTE:

      Because of the latency between the client and the server, the timestamp of the data to be inserted must fall within the window from (current time - 1 hour + 20s) to (current time + 10 minutes - 20s). A timestamp in this window can be inserted into the database without being affected by the latency.

      For example, if the current time is 2020.01.30 12:00:30, the timestamp inserted must be within the range [2020.01.30 11:00:50, 2020.01.30 12:10:10]. With the times read as UTC+8, the corresponding UNIX timestamp range in milliseconds is [1580353250000, 1580357410000].

      detail

      +

      detail

      Yes

      +

      Yes

      Detail object

      +

      Detail object

      Specifies the event details.

      +

      Specifies the event details.

      For details, see Table 4.
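The constraints on event_name, event_source, and time described in Table 3 can be checked before a request is sent. The following is a minimal sketch under stated assumptions: `check_event_item` and the two regular expressions are hypothetical names introduced for illustration, the patterns are transcribed from the descriptions above, and the time window is treated as inclusive at both ends, as the bracketed range in the note suggests.

```python
import re

# event_name: starts with a letter, 1 to 64 characters, letters/digits/underscores.
EVENT_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,63}")
# service and item in event_source: start with a letter, 3 to 32 characters each.
SOURCE_PART_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,31}")


def check_event_item(event_name, event_source, time_ms, now_ms):
    """Return the names of the fields that violate the Table 3 constraints.

    `check_event_item` is a hypothetical client-side helper. An empty list
    means the three fields look valid; it does not guarantee that the
    service accepts the event.
    """
    problems = []
    if not EVENT_NAME_RE.fullmatch(event_name):
        problems.append("event_name")

    # event_source must have the form service.item.
    parts = event_source.split(".")
    if len(parts) != 2 or not all(SOURCE_PART_RE.fullmatch(p) for p in parts):
        problems.append("event_source")

    # Accepted window: (now - 1 hour + 20s) to (now + 10 minutes - 20s), in ms.
    earliest = now_ms - 3600_000 + 20_000
    latest = now_ms + 600_000 - 20_000
    if not earliest <= time_ms <= latest:
        problems.append("time")
    return problems
```

Running the check on the client only catches obvious mistakes early; the server-side validation remains authoritative.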

      -
      Table 4 detail data structure description

      Parameter

      +
      - - - - - - - - - - - + + + + + - - - - - - - - - - - - - - - - - - - - + + + + + + +
      Table 4 detail data structure description

      Parameter

      Mandatory

      +

      Mandatory

      Type

      +

      Type

      Description

      +

      Description

      content

      +

      content

      No

      +

      No

      String

      +

      String

      Specifies the event content. Enter up to 4,096 characters.

      +

      Specifies the event content. Enter up to 4,096 characters.

      +
      NOTE:

      In some scenarios, this field does not support \n. When this happens, \n is preferentially converted to \\n.

      +

      resource_id

      +

      group_id

      No

      +

      No

      String

      +

      String

      Specifies the resource ID. Enter up to 128 characters, including letters, digits, underscores (_), hyphens (-), and colon (:).

      +

      Specifies the resource group the event belongs to.

      +

      This ID must be an existing resource group ID.

      +

      To query the group ID, perform the following steps:

      +
      1. Log in to the management console.
      2. Click Cloud Eye.
      3. Choose Resource Groups.

        Obtain the resource group ID in the Name/ID column.

        +
      +

      resource_id

      +

      No

      +

      String

      +

      Specifies the resource ID. Enter up to 128 characters, including letters, digits, underscores (_), hyphens (-), and colons (:).

      Example: 6a69bf28-ee62-49f3-9785-845dacd799ec

      To query the resource ID, perform the following steps:

      1. Log in to the management console.
      2. Under Computing, select Elastic Cloud Server.

        On the Resource Overview page, obtain the resource ID.

      resource_name

      +

      resource_name

      No

      +

      No

      String

      +

      String

      Specifies the resource name. Enter up to 128 characters, including letters, digits, underscores (_), hyphens (-), and periods (.).

      +

      Specifies the resource name. Enter up to 128 characters, including letters, digits, underscores (_), hyphens (-), and periods (.).

      event_state

      +

      event_state

      No

      +

      No

      String

      +

      String

      Specifies the event status.

      +

      Specifies the event status.

      The value can be normal, warning, or incident.

      event_level

      +

      event_level

      No

      +

      No

      String

      +

      String

      Specifies the event severity.

      +

      Specifies the event severity.

      The value can be Critical, Major, Minor, or Info.

      event_user

      +

      event_user

      No

      +

      No

      String

      +

      String

      Specifies the event user.

      +

      Specifies the event user.

      Enter up to 64 characters, including letters, digits, underscores (_), hyphens (-), slashes (/), and spaces.

      event_type

      +

      event_type

      No

      +

      No

      String

      +

      String

      Specifies the event type.

      +

      Specifies the event type.

      Its value can be EVENT.SYS or EVENT.CUSTOM. EVENT.SYS indicates system events that cannot be reported by users. Only custom events can be reported.

      dimensions

      +

      No

      +

      Array of objects

      +

      Specifies the event dimension. Currently, a maximum of four dimensions are supported. Resource information is described by dimension.

      +

      Event alarm rules can be configured by dimension to monitor resources and resource groups.

      +

      For parameter details, see Table 5.

      +
      +
      + +
      + + + + + + + + + + + + + + +
      Table 5 dimensions data structure description

      Parameter

      +

      Type

      +

      Mandatory

      +

      Description

      +

      name

      +

      String

      +

      Yes

      +

      Specifies the dimension. For example, the ECS dimension is instance_id. For details about the dimension of each service, see the key column in Services Interconnected with Cloud Eye.

      +

      value

      +

      String

      +

      Yes

      +

      Specifies the dimension value, for example, an ECS ID.

      +

      The parameter can contain 1 to 256 characters.

      +
      -
      • Example request
        [{
        -    "event_name":"systemInvaded",
        -    "event_source":"financial.System",
        -    "time":1522121194000,
        -    "detail":{
        -        "content":"The financial system was invaded",
        -        "group_id":"rg15221211517051YWWkEnVd",
        -        "resource_id":"1234567890sjgggad",
        -        "resource_name":"ecs001",
        -        "event_state":"normal",
        -        "event_level":"Major",
        -        "event_user":"xiaokong",
        -        "event_type": "EVENT.CUSTOM"
        +
        • Example request
          [
          + {
          +  "event_name": "systemInvaded",
          +  "event_source": "financial.System",
          +  "time": 1742264993000,
          +  "detail": {
          +   "content": "The financial system was invaded",
          +   "group_id": "rg15221211517051YWWkEnVd",
          +   "resource_id": "1234567890sjgggad",
          +   "resource_name": "ecs001",
          +   "event_state": "normal",
          +   "event_level": "Major",
          +   "event_user": "xiaokong",
          +   "event_type": "EVENT.CUSTOM",
          +   "dimensions": [
          +    {
          +     "name": "instance_id",
          +     "value": "instance_xxx"
               }
          -},
          -{
          -    "event_name":"systemInvaded",
          -    "event_source":"financial.System",
          -    "time":1522121194020,
          -    "detail":{
          -        "content":"The financial system was invaded",
          -        "group_id":"rg15221211517051YWWkEnVd",
          -        "resource_id":"1234567890sjgggad",
          -        "resource_name":"ecs001",
          -        "event_state":"normal",
          -        "event_level":"Major",
          -        "event_user":"xihong",
          -        "event_type": "EVENT.CUSTOM"
          -    }
          -}]
          + ] + } + } +]
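A request body like the one above can also be assembled programmatically. The sketch below is illustrative only: `build_event` is a hypothetical helper, not part of any Cloud Eye SDK, and it enforces only the limits stated in Table 4 and Table 5 (at most four dimensions, dimension values of 1 to 256 characters, content of up to 4,096 characters). Sending the body to the events URI and authenticating the request are out of scope here.

```python
import json

MAX_DIMENSIONS = 4        # "a maximum of four dimensions are supported"
MAX_CONTENT_CHARS = 4096  # limit stated for detail.content
MAX_VALUE_CHARS = 256     # limit stated for a dimension value


def build_event(event_name, event_source, time_ms, content, dimensions=None):
    """Build one EventItem dict; `dimensions` maps a dimension name to its value.

    Hypothetical helper for illustration. Only custom events can be reported,
    so event_type is always set to EVENT.CUSTOM.
    """
    dims = [{"name": n, "value": v} for n, v in (dimensions or {}).items()]
    if len(dims) > MAX_DIMENSIONS:
        raise ValueError("at most four dimensions are supported")
    for dim in dims:
        if not 1 <= len(dim["value"]) <= MAX_VALUE_CHARS:
            raise ValueError("dimension value must contain 1 to 256 characters")
    if len(content) > MAX_CONTENT_CHARS:
        raise ValueError("content must not exceed 4,096 characters")

    detail = {"content": content, "event_type": "EVENT.CUSTOM"}
    if dims:
        detail["dimensions"] = dims
    return {
        "event_name": event_name,
        "event_source": event_source,
        "time": time_ms,
        "detail": detail,
    }


event = build_event(
    "systemInvaded",
    "financial.System",
    1742264993000,
    "The financial system was invaded",
    {"instance_id": "instance_xxx"},
)
# The API expects an array of EventItem objects as the request body.
request_body = json.dumps([event])
```

Optional fields such as resource_id or event_level are omitted to keep the sketch short; they would be added to the detail object in the same way.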

      Response

      • Response parameters -
        Table 5 Parameter description

        Parameter

        +
        @@ -239,14 +293,14 @@
        Table 6 Parameter description

        Parameter

        Type

        Array of objects

        Specifies the event list.

        -

        For details, see Table 6.

        +

        For details, see Table 7.

        -
        Table 6 Response parameters

        Parameter

        +
        @@ -283,10 +337,6 @@ { "event_id":"evdgiqwgedkkcvhdjcdu346", "event_name":"systemInvaded" - }, - { - "event_id":"evdgiqwgedkkcvhdjcdu347", - "event_name":"systemParalysis" } ]
        Table 7 Response parameters

        Parameter

        Mandatory