forked from docs/docsportal
updating EpMon and Test Cases
parent
fc2f6fd929
commit
8537442f8f
@ -9,11 +9,16 @@ The authentication is centrally managed by LDAP.
|
|||||||
|
|
||||||
- Alerta is a monitoring tool to integrate alerts from multiple sources.
|
- Alerta is a monitoring tool to integrate alerts from multiple sources.
|
||||||
- The alerts from different sources can be consolidated and de-duplicated.
|
- The alerts from different sources can be consolidated and de-duplicated.
|
||||||
- On ApiMon it is hosted on same instance as Grafana just listening on different port.
|
- On ApiMon it is hosted on same instance as Grafana just listening on
|
||||||
- The Zulip API was integrated with Alerta, to send notification of errors/alerts on zulip stream.
|
different port.
|
||||||
- Alerts displayed on OTC Alerta are generated either by Executor or by Grafana.
|
- The Zulip API was integrated with Alerta, to send notification of
|
||||||
- “Executor alerts” focus on playbook results, whether playbook has completed or failed.
|
errors/alerts on zulip stream.
|
||||||
- “Grafana alerts” focus on breaching the defined thresholds. For example API response time is higher than defined threshold.
|
- Alerts displayed on OTC Alerta are generated either by Executor or by
|
||||||
|
Grafana.
|
||||||
|
- “Executor alerts” focus on playbook results, whether playbook has
|
||||||
|
completed or failed.
|
||||||
|
- “Grafana alerts” focus on breaching the defined thresholds. For example
|
||||||
|
API response time is higher than defined threshold.
|
||||||
|
|
||||||
.. image:: training_images/alerta_dashboard.png
|
.. image:: training_images/alerta_dashboard.png
|
||||||
|
|
||||||
|
@ -8,8 +8,10 @@ The authentication is centrally managed by LDAP.
|
|||||||
|
|
||||||
|
|
||||||
- The ApiMon Dashboards are segregated based on the type of service.
|
- The ApiMon Dashboards are segregated based on the type of service.
|
||||||
- The “OTC KPI” dashboard provides high level overview about OTC stability and reliability for management.
|
- The “OTC KPI” dashboard provides high level overview about OTC stability and
|
||||||
- “Endpoint monitoring” dashboard monitors health of every endpoint url listed by endpoint services catalogue.
|
reliability for management.
|
||||||
|
- “Endpoint monitoring” dashboard monitors health of every endpoint url listed
|
||||||
|
by endpoint services catalogue.
|
||||||
- “Respective service statistics” dashboards provide more detailed overview.
|
- “Respective service statistics” dashboards provide more detailed overview.
|
||||||
- Dashboards can be replicated/customized for individual Squad needs.
|
- Dashboards can be replicated/customized for individual Squad needs.
|
||||||
|
|
||||||
@ -20,3 +22,17 @@ OTC KPI Dashboard
|
|||||||
=================
|
=================
|
||||||
|
|
||||||
.. image:: training_images/kpi_dashboard.png
|
.. image:: training_images/kpi_dashboard.png
|
||||||
|
|
||||||
|
24/7 dashboards
|
||||||
|
===============
|
||||||
|
|
||||||
|
Endpoint Monitoring Dashboard
|
||||||
|
=============================
|
||||||
|
|
||||||
|
Common Test Results Dashboard
|
||||||
|
=============================
|
||||||
|
|
||||||
|
Service Based dashboard
|
||||||
|
=======================
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,21 +10,22 @@ understand what is supported in which mode.
|
|||||||
|
|
||||||
The most important differences are described in the table below:
|
The most important differences are described in the table below:
|
||||||
|
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| **Differences** | **ApiMon (CMO)** | **ApiMon(FMO)** |
|
| **Differences** | **ApiMon (CMO)** | **ApiMon(FMO)** |
|
||||||
+=======================+============================================================================================================+===============================================================+
|
+=======================+============================================================================================================+==========================================================================+
|
||||||
| Playbook scenarios | https://github.com/opentelekomcloud-infra/apimon-test | https://github.com/stackmon/apimon-tests/tree/main/playbooks |
|
| Playbook scenarios | https://github.com/opentelekomcloud-infra/apimon-test | https://github.com/stackmon/apimon-tests/tree/main/playbooks |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| Dashboards | https://github.com/opentelekomcloud-infra/system-config/tree/main/playbooks/templates/grafana/apimon | https://github.com/stackmon/apimon-tests/tree/main/dashboards |
|
| Dashboards setup | https://github.com/opentelekomcloud-infra/system-config/tree/main/playbooks/templates/grafana/apimon | https://github.com/stackmon/apimon-tests/tree/main/dashboards |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| Environment setup | https://github.com/opentelekomcloud-infra/system-config/blob/main/inventory/service/group_vars/apimon.yaml | https://github.com/opentelekomcloud-infra/stackmon-config |
|
| Environment setup | https://github.com/opentelekomcloud-infra/system-config/blob/main/inventory/service/group_vars/apimon.yaml | https://github.com/opentelekomcloud-infra/stackmon-config |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| Implementation mode | standalone app | plugin based |
|
| Implementation mode | standalone app | plugin based |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| Source of information | opentelekomcloud-infra | stackmon |
|
| Source of information | opentelekomcloud-infra | stackmon |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| Portal | https://dashboard.tsi-dev.otc-service.com/ | https://dashboard.tsi-dev.otc-service.com/ |
|
| Dashboards | https://dashboard.tsi-dev.otc-service.com/ | https://dashboard.tsi-dev.otc-service.com/ |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
| | https://dashboard.tsi-dev.otc-service.com/dashboards/f/UaB8meoZk/apimon | https://dashboard.tsi-dev.otc-service.com/dashboards/f/CloudMon/cloudmon |
|
||||||
| Documentation | https://confluence.tsi-dev.otc-service.com/display/ES/API-Monitoring | https://stackmon.github.io/ |
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
| | | https://stackmon-cloudmon.readthedocs.io/en/latest/index.html |
|
| Documentation | https://confluence.tsi-dev.otc-service.com/display/ES/API-Monitoring | https://stackmon.github.io/ |
|
||||||
+-----------------------+------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------+
|
| | | https://stackmon-cloudmon.readthedocs.io/en/latest/index.html |
|
||||||
|
+-----------------------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+
|
||||||
|
@ -1,3 +1,38 @@
|
|||||||
============================
|
============================
|
||||||
Endpoint Monitoring overview
|
Endpoint Monitoring overview
|
||||||
============================
|
============================
|
||||||
|
|
||||||
|
|
||||||
|
EpMon is a standalone Python-based process targeting every OTC service. It
|
||||||
|
finds services in the service catalogs and sends GET requests to the configured
|
||||||
|
endpoints.
|
||||||
|
|
||||||
|
Extensive tests such as provisioning a server give great coverage, but they
|
||||||
|
usually cannot be run very often, which leaves gaps on the monitoring
|
||||||
|
timescale. To cover these gaps, the EpMon component sends GET requests to the
|
||||||
|
given URLs, relying on the API discovery of the OpenStack cloud (for example, a
|
||||||
|
GET request to /servers or the compute endpoint). Such requests are cheap and
|
||||||
|
can be performed in a loop, e.g. every 5 seconds. The latency of those calls
|
||||||
|
and their return codes are captured and sent to the metrics storage.
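
The probing loop described above can be sketched roughly as follows. This is a minimal illustration and not the actual EpMon code: the names ``ProbeResult``, ``http_get_status``, ``probe`` and ``run_probes`` are hypothetical, the ``emit`` callback stands in for whatever metrics storage is used, and the 5 second interval is simply the example value from the text.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class ProbeResult:
    url: str
    status: Optional[int]  # HTTP status code, or None when no response arrived
    latency_ms: float


def http_get_status(url: str, timeout: float = 5.0) -> int:
    """Send one GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers are still answers; keep the code as a data point.
        return err.code


def probe(url: str, fetch: Callable[[str], int] = http_get_status) -> ProbeResult:
    """Time a single GET request and record its outcome."""
    started = time.perf_counter()
    try:
        status: Optional[int] = fetch(url)
    except OSError:
        # URLError, timeouts and refused connections: no response at all.
        status = None
    latency_ms = (time.perf_counter() - started) * 1000.0
    return ProbeResult(url=url, status=status, latency_ms=latency_ms)


def run_probes(
    urls: Iterable[str],
    emit: Callable[[ProbeResult], None],
    rounds: int = 1,
    interval_s: float = 5.0,
    fetch: Callable[[str], int] = http_get_status,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Probe every URL once per round and hand each result to `emit`."""
    targets = list(urls)
    for round_no in range(rounds):
        for url in targets:
            emit(probe(url, fetch))
        if round_no < rounds - 1:
            sleep(interval_s)
```

Passing ``fetch`` and ``sleep`` as parameters is only a design choice for this sketch: it lets the loop be exercised without network access, for example with a fake fetcher that returns a fixed status code.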
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Currently EpMon configuration is located in system-config:
|
||||||
|
https://github.com/opentelekomcloud-infra/system-config/blob/main/inventory/service/group_vars/apimon.yaml
|
||||||
|
(this will change in the future once CloudMon takes over)
|
||||||
|
|
||||||
|
It defines the HTTP query targets for every single OTC service.
|
||||||
|
|
||||||
|
The EpMon dashboard provides the general availability status of every service
|
||||||
|
defined in the service catalog:
|
||||||
|
|
||||||
|
.. image:: training_images/epmon_status_dashboard.jpg
|
||||||
|
|
||||||
|
Additionally it provides further details for the endpoints like response times,
|
||||||
|
detected error codes or no responses at all.
|
||||||
|
|
||||||
|
.. image:: training_images/epmon_dashboard_details.jpg
|
||||||
|
|
||||||
|
EpMon findings are also reported to Alerta, and notifications are sent to the
|
||||||
|
dedicated Zulip topic "apimon_endpoint_monitoring".
|
@ -49,6 +49,9 @@ ApiMon Architecture Summary
|
|||||||
- Alerta further sends error notification on Zulip #Alerts Stream.
|
- Alerta further sends error notification on Zulip #Alerts Stream.
|
||||||
- Log Files are maintained on OTC object storage via swift.
|
- Log Files are maintained on OTC object storage via swift.
|
||||||
|
|
||||||
|
ApiMon features
|
||||||
|
---------------
|
||||||
|
|
||||||
ApiMon comes with the following features:
|
ApiMon comes with the following features:
|
||||||
|
|
||||||
- Support of ansible playbooks for testing scenarios
|
- Support of ansible playbooks for testing scenarios
|
||||||
@ -72,7 +75,9 @@ ApiMon comes with the following features:
|
|||||||
- Every execution of ansible playbooks stores the log file for further
|
- Every execution of ansible playbooks stores the log file for further
|
||||||
investigation/analysis on swift
|
investigation/analysis on swift
|
||||||
|
|
||||||
What ApiMon is NOT:
|
|
||||||
|
What ApiMon is NOT
|
||||||
|
------------------
|
||||||
|
|
||||||
The following items are out of scope (while some of them are technically
|
The following items are out of scope (while some of them are technically
|
||||||
possible):
|
possible):
|
||||||
|
@ -5,13 +5,13 @@ Logs
|
|||||||
|
|
||||||
|
|
||||||
- Every single job run log is stored on object storage
|
- Every single job run log is stored on object storage
|
||||||
- Each single job log file provides unique URL which can be accessed to see log details
|
- Each single job log file provides unique URL which can be accessed to see log
|
||||||
|
details
|
||||||
- These URLs are available on all APIMON levels:
|
- These URLs are available on all APIMON levels:
|
||||||
- In Zulip alarm messages
|
- In Zulip alarm messages
|
||||||
- In Alerta events
|
- In Alerta events
|
||||||
- In Grafana Dashboards
|
- In Grafana Dashboards
|
||||||
- Logs are simple plain text files of the whole playbook output.
|
- Logs are simple plain text files of the whole playbook output::
|
||||||
|
|
||||||
|
|
||||||
2020-07-12 05:54:04.661170 | TASK [List Servers]
|
2020-07-12 05:54:04.661170 | TASK [List Servers]
|
||||||
|
|
||||||
|
@ -2,7 +2,8 @@
|
|||||||
Monitoring coverage
|
Monitoring coverage
|
||||||
===================
|
===================
|
||||||
|
|
||||||
Multiple factors define the monitoring coverage to simulate common customer use cases.
|
Multiple factors define the monitoring coverage to simulate common customer use
|
||||||
|
cases.
|
||||||
|
|
||||||
|
|
||||||
Monitored locations
|
Monitored locations
|
||||||
|
@ -2,8 +2,17 @@
|
|||||||
Notifications
|
Notifications
|
||||||
=============
|
=============
|
||||||
|
|
||||||
You will see notifications of errors on OTC Zulip #Alerts Stream.
|
You will see notifications of errors on OTC Zulip:
|
||||||
|
|
||||||
If the error has been acknowledged on Alerta, the new notification message for repeating error wont get posted again on Zulip.
|
- #Alerts Stream
|
||||||
|
- #Alerts-Hybrid Stream
|
||||||
|
- #Alerts-Preprod Stream
|
||||||
|
|
||||||
|
Every stream contains topics based on the service type (if represented by
|
||||||
|
a standalone ansible playbook) and a general apimon_endpoint_monitor topic which
|
||||||
|
contains alerts of GET queries towards all services.
|
||||||
|
|
||||||
|
If an error has been acknowledged on Alerta, no new notification is posted on
|
||||||
|
Zulip when the same error repeats.
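
The acknowledgement behaviour can be illustrated with a small sketch. It is not Alerta's or the Zulip plugin's real implementation: the ``Notifier`` and ``AlertState`` classes are hypothetical, and identifying a repeating error by the pair (resource, event) is an assumption made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

AlertKey = Tuple[str, str]  # (resource, event): assumed identity of an alert


@dataclass
class AlertState:
    status: str = "open"      # "open" or "ack"
    duplicate_count: int = 0  # how often the same alert was seen again


@dataclass
class Notifier:
    alerts: Dict[AlertKey, AlertState] = field(default_factory=dict)
    sent: List[str] = field(default_factory=list)  # stands in for Zulip messages

    def acknowledge(self, resource: str, event: str) -> None:
        """Mark a known alert as acknowledged; unknown alerts are ignored."""
        state = self.alerts.get((resource, event))
        if state is not None:
            state.status = "ack"

    def receive(self, resource: str, event: str) -> bool:
        """Record an incoming alert and return True if a message was posted."""
        key = (resource, event)
        state = self.alerts.get(key)
        if state is None:
            state = AlertState()
            self.alerts[key] = state
        else:
            # Same alert again: count it instead of creating a new entry.
            state.duplicate_count += 1
        if state.status == "ack":
            # Acknowledged alerts stay silent when they repeat.
            return False
        self.sent.append(f"{resource}: {event}")
        return True
```

In this sketch a repeating alert that has not been acknowledged is still posted each time; whether the real setup behaves that way is not stated here, so treat it as an assumption.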
|
||||||
|
|
||||||
.. image:: training_images/zulip_notifications.png
|
.. image:: training_images/zulip_notifications.png
|
||||||
|
@ -1,3 +1,9 @@
|
|||||||
==============
|
==============
|
||||||
Test Scenarios
|
Test Scenarios
|
||||||
==============
|
==============
|
||||||
|
|
||||||
|
|
||||||
|
Test scenario playbooks are located at
|
||||||
|
https://github.com/opentelekomcloud-infra/apimon-test (the location will change
|
||||||
|
with the CloudMon replacement in the future).
|
||||||
|
|
||||||
|
File diff suppressed because one or more lines are too long
Before Width: | Height: | Size: 59 KiB After Width: | Height: | Size: 247 KiB |
Binary file not shown.
After Width: | Height: | Size: 96 KiB |
Binary file not shown.
After Width: | Height: | Size: 165 KiB |