From 5d25748c82d8bcac3d66f9a1ab31ddc53780dc70 Mon Sep 17 00:00:00 2001
From: Vladimir Hasko
Date: Sun, 21 May 2023 21:05:03 +0000
Subject: [PATCH] fixing wrong bullets in multiple places

---
 .../internal/apimon_training/alerts.rst       | 21 +++----
 .../internal/apimon_training/databases.rst    |  9 ---
 .../internal/apimon_training/introduction.rst |  2 +-
 doc/source/internal/apimon_training/logs.rst  | 18 +++---
 .../internal/apimon_training/metrics.rst      | 59 ++++++++++---------
 .../apimon_training/test_scenarios.rst        |  7 ---
 6 files changed, 53 insertions(+), 63 deletions(-)

diff --git a/doc/source/internal/apimon_training/alerts.rst b/doc/source/internal/apimon_training/alerts.rst
index 293b579..846330a 100644
--- a/doc/source/internal/apimon_training/alerts.rst
+++ b/doc/source/internal/apimon_training/alerts.rst
@@ -6,16 +6,17 @@
 https://alerts.eco.tsi-dev.otc-service.com/
 
 The authentication is centrally managed by LDAP.
 
-- Alerta is a monitoring tool to integrate alerts from multiple sources.
-- The alerts from different sources can be consolidated and de-duplicated.
-- On ApiMon it is hosted on same instance as Grafana just listening on
-  different port.
-- The Zulip API was integrated with Alerta, to send notification of
-  errors/alerts on zulip stream.
-- Alerts displayed on OTC Alerta are generated either by Executor or by
-  Grafana.
-  - “Executor alerts” focus on playbook results, whether playbook has completed or failed.
-  - “Grafana alerts” focus on breaching the defined thresholds. For example API response time is higher than defined threshold.
+- Alerta is a monitoring tool to integrate alerts from multiple sources.
+- The alerts from different sources can be consolidated and de-duplicated.
+- On ApiMon it is hosted on the same instance as Grafana, just listening on a
+  different port.
+- The Zulip API was integrated with Alerta to send notifications of
+  errors/alerts to the Zulip stream.
+- Alerts displayed on OTC Alerta are generated either by Executor or by
+  Grafana.
+
+  - “Executor alerts” focus on playbook results, whether the playbook has completed or failed.
+  - “Grafana alerts” focus on breaching the defined thresholds, for example the API response time exceeding the defined threshold.
 
 .. image:: training_images/alerta_dashboard.png
diff --git a/doc/source/internal/apimon_training/databases.rst b/doc/source/internal/apimon_training/databases.rst
index 5a7ce20..5c8a0b1 100644
--- a/doc/source/internal/apimon_training/databases.rst
+++ b/doc/source/internal/apimon_training/databases.rst
@@ -57,7 +57,6 @@ Counters and timers have following subbranches:
 
 - apimon.metric → specific apimon metrics not gathered by the OpenStack API
   methods
-
 - openstack.api → pure API request metrics
 
 Every section has further following branches:
@@ -81,13 +80,10 @@ OpenStack metrics branch is structured as following:
 
 - response code - received response code
 - count/upper/lower/mean/etc - timer specific metrics (available only under
   stats.timers.openstack.api.$environment.$zone.$service.$request_method.$resource.$status_code.{count,mean,upper,*})
-
 - count/rate - counter specific metrics (available only under
   stats.counters.openstack.api.$environment.$zone.$service.$request_method.$resource.$status_code.{count,mean,upper,*})
 - attempted - counter for the attempted requests (only for counters)
-
 - failed - counter of failed requests (not received response, connection
   problems, etc) (only for counters)
-
 - passed - counter of requests receiving any response back (only for counters)
@@ -106,15 +102,10 @@ apimon.metric
 
 - stats.timers.apimon.metric.$environment.$zone.**csm_lb_timings**.{public,private}.{http,https,tcp}.$az.__VALUE__ - timer values for the loadbalancer test
-
 - stats.counters.apimon.metric.$environment.$zone.**csm_lb_timings**.{public,private}.{http,https,tcp}.$az.{attempted,passed,failed} - counter values for the loadbalancer test
-
 - stats.timers.apimon.metric.$environment.$zone.**curl**.$host.{passed,failed}.__VALUE__ - timer values for the curl test
-
 - stats.counters.apimon.metric.$environment.$zone.**curl**.$host.{attempted,passed,failed} - counter values for the curl test
-
 - stats.timers.apimon.metric.$environment.$zone.**dns**.$ns_name.$host - timer values for the NS lookup test. $ns_name is the DNS servers used to query the records
-
 - stats.counters.apimon.metric.$environment.$zone.**dns**.$ns_name.$host.{attempted,passed,failed} - counter values for the NS lookup test
diff --git a/doc/source/internal/apimon_training/introduction.rst b/doc/source/internal/apimon_training/introduction.rst
index 3b32700..81f437b 100644
--- a/doc/source/internal/apimon_training/introduction.rst
+++ b/doc/source/internal/apimon_training/introduction.rst
@@ -79,7 +79,7 @@ ApiMon comes with the following features:
 - Each squad can control and manage their test scenarios and dashboards
 
 - Every execution of ansible playbooks stores the log file for further
-  investigation/analysis on swift object storage 
+  investigation/analysis on swift object storage
 
 
 What ApiMon is NOT
diff --git a/doc/source/internal/apimon_training/logs.rst b/doc/source/internal/apimon_training/logs.rst
index ef712a6..25b6d68 100644
--- a/doc/source/internal/apimon_training/logs.rst
+++ b/doc/source/internal/apimon_training/logs.rst
@@ -6,14 +6,16 @@ Logs
 
 
- - Every single job run log is stored on OpenStack Swift object storage.
- - Each single job log file provides unique URL which can be accessed to see log
-   details
- - These URLs are available on all APIMON levels:
-   - In Zulip alarm messages
-   - In Alerta events
-   - In Grafana Dashboards
- - Logs are simple plain text files of the whole playbook output::
+- Every single job run log is stored on OpenStack Swift object storage.
+- Each single job log file provides a unique URL which can be accessed to see
+  the log details
+- These URLs are available on all ApiMon levels:
+
+  - In Zulip alarm messages
+  - In Alerta events
+  - In Grafana Dashboards
+
+- Logs are simple plain text files of the whole playbook output::
 
     2020-07-12 05:54:04.661170 | TASK [List Servers]
     2020-07-12 05:54:09.050491 | localhost | ok
 
diff --git a/doc/source/internal/apimon_training/metrics.rst b/doc/source/internal/apimon_training/metrics.rst
index c5c9866..244330f 100644
--- a/doc/source/internal/apimon_training/metrics.rst
+++ b/doc/source/internal/apimon_training/metrics.rst
@@ -6,34 +6,37 @@ Metrics
 
 The ansible playbook scenarios generate metrics in two ways:
 
-- The Ansible playbook internally invokes method calls to **OpenStack SDK
-  libraries.** They in turn generate metrics about each API call they do. This
-  requires some special configuration in the clouds.yaml file (currently
-  exposing metrics into statsd and InfluxDB is supported). For details refer
-  to the [config
-  documentation](https://docs.openstack.org/openstacksdk/latest/user/guides/stats.html)
-  of the OpenStack SDK. The following metrics are captured:
-  - response HTTP code
-  - duration of API call
-  - name of API call
-  - method of API call
-  - service type
-- Ansible plugins may **expose additional metrics** (i.e. whether the overall
-  scenario succeed or not) with help of [callback
-  plugin](https://github.com/stackmon/apimon/tree/main/apimon/ansible/callback).
-  Since sometimes it is not sufficient to know only the timings of each API
-  call, Ansible callbacks are utilized to report overall execution time and
-  result (whether the scenario succeeded and how long it took). The following
-  metrics are captured:
-  - test case
-  - playbook name
-  - environment
-  - action name
-  - result code
-  - result string
-  - service type
-  - state type
-  - total amount of (failed, passed, ignored, skipped tests)
+- The Ansible playbook internally invokes method calls to **OpenStack SDK
+  libraries.** They in turn generate metrics about each API call they make.
+  This requires some special configuration in the clouds.yaml file (currently
+  exposing metrics into statsd and InfluxDB is supported). For details refer
+  to the `config
+  documentation <https://docs.openstack.org/openstacksdk/latest/user/guides/stats.html>`_
+  of the OpenStack SDK. The following metrics are captured:
+
+  - response HTTP code
+  - duration of API call
+  - name of API call
+  - method of API call
+  - service type
+
+- Ansible plugins may **expose additional metrics** (e.g. whether the overall
+  scenario succeeded or not) with the help of the `callback
+  plugin <https://github.com/stackmon/apimon/tree/main/apimon/ansible/callback>`_.
+  Since sometimes it is not sufficient to know only the timings of each API
+  call, Ansible callbacks are utilized to report overall execution time and
+  result (whether the scenario succeeded and how long it took). The following
+  metrics are captured:
+
+  - test case
+  - playbook name
+  - environment
+  - action name
+  - result code
+  - result string
+  - service type
+  - state type
+  - total amount of (failed, passed, ignored, skipped tests)
 
 Custom metrics:
 
diff --git a/doc/source/internal/apimon_training/test_scenarios.rst b/doc/source/internal/apimon_training/test_scenarios.rst
index 3b66956..fea1a47 100644
--- a/doc/source/internal/apimon_training/test_scenarios.rst
+++ b/doc/source/internal/apimon_training/test_scenarios.rst
@@ -66,7 +66,6 @@ ensure sustainability of the endless exceution of such scenarios:
 
     `Openstack.Cloud `_
     collections for native interaction with cloud in ansible.
-
   - In case there are features not supported by collection you can still use
     script module and call directly python SDK script to invoke required
     request towards cloud
@@ -80,7 +79,6 @@ ensure sustainability of the endless exceution of such scenarios:
 
 - Make sure that deletion / cleanup of the resources is triggered even if
   some of the tasks in playbooks will fail
-
 - Make sure that deletion / cleanup is triggered in right order
 
 - **Simplicity**
@@ -92,10 +90,8 @@ ensure sustainability of the endless exceution of such scenarios:
 
 - ApiMon is not supposed to validate full service functionality. For such
   cases we have different team / framework within QA responsibility
-
 - Focus only on core functions which are critical for basic operation /
   lifecycle of the service.
-
 - The less functions you use the less potential failure rate you will have
   on runnign scenario for whatever reasons
 
@@ -103,7 +99,6 @@ ensure sustainability of the endless exceution of such scenarios:
 
 - Every single hardcoded parameter in scenario will later lead to potential
   outage of the scenario's run in future when such parameter might change
-
 - Try to obtain all such parameters dynamically from the cloud directly.
 
 - **Special tags for combined metrics**
@@ -112,7 +107,6 @@ ensure sustainability of the endless exceution of such scenarios:
   metric you can do with using tags parameter in the tasks
 
-
 Custom metrics in Test Scenarios
 ================================
@@ -196,4 +190,3 @@ In following example the custom metric stores the result of multiple tasks in sp
     command: "ssh -o 'UserKnownHostsFile=/dev/null' -o 'StrictHostKeyChecking=no' linux@{{ server_ip }} -i ~/.ssh/{{ test_keypair_name }}.pem"
     tags: ["az=default", "service=compute", "metric=create_server"]
-
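
For squads writing their own scenarios, the "special tags" mechanism that test_scenarios.rst refers to in the hunks above comes down to attaching ``az=…``, ``service=…`` and ``metric=…`` entries to the ``tags`` of the task whose result should be reported as a custom metric, as the final hunk shows for the SSH check. A minimal sketch of such a task, assuming the playbook already defines ``server_ip`` and ``test_keypair_name`` as in that example (the task name and the trailing ``true`` are illustrative)::

    - name: Check SSH login to the created server
      # the metric= tag below is what makes this task's result show up as the
      # create_server custom metric; az= and service= become additional labels
      command: "ssh -i ~/.ssh/{{ test_keypair_name }}.pem -o 'UserKnownHostsFile=/dev/null' -o 'StrictHostKeyChecking=no' linux@{{ server_ip }} true"
      tags: ["az=default", "service=compute", "metric=create_server"]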