Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: yangtong <yangtong2@huawei.com> Co-committed-by: yangtong <yangtong2@huawei.com>
Adding an SQL Inspection
Scenario
You can add rules for specified tenants and SQL engines on MRS Manager. The system displays hints for, intercepts, or blocks SQL requests that match the rules.
Adding a Rule
- Log in to MRS Manager as a user with Manager administrator rights.
- Click Cluster and choose SQL Inspector. The SQL Inspector page is displayed.
You can click View Supported Rules to view all SQL inspection rules supported by the current cluster.
- Click Add Rule. After the password of the current user is verified, the Add Rule page is displayed.
- Set the required parameters and click OK.
Parameter
Description
Name
Name of a SQL inspection rule
ID
Rule ID
For details about the meaning of each rule ID, see Table 1.
Tenant
Click Add to select the tenant with which the current rule will be associated.
If you need to add a new tenant, plan and create a cluster tenant by referring to Tenant Resources.
Services and Actions
Click Add to specify the SQL engine with which this rule will be associated and set the threshold parameters of the rule.
Each rule can be associated with one SQL engine. To apply the same inspection to other SQL engines, add a separate rule for each engine.
- Service: Select the SQL engine associated with the current rule.
- If a SQL request matches the rule, the system performs the selected operation:
  - Hint: Record logs and display a hint for handling the SQL request. If the rule has parameters, configure the threshold.
  - Intercept: Intercept the SQL request that matches the rule. If the rule has parameters, configure the threshold.
  - Block: Block the SQL request that matches the rule. If the rule has parameters, configure the threshold.

NOTE:
For static and dynamic interception rules, Hint and Block operations are supported. For blocking rules, only the Block operation is supported.
- View the added rule on the SQL Inspector page. The rule takes effect dynamically.
To adjust the current rule, click Modify in the Operation column of the row that contains the target rule. After the user password is verified, you can modify rule parameters.
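
The parameters above can be pictured as a small record plus a threshold comparison. The following is a minimal sketch, not MRS Manager's implementation: the `InspectionRule` class, its field names, and the `evaluate` function are hypothetical, and the sketch assumes that a rule matches when the measured value exceeds the configured threshold. The tenant and engine values are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Actions named in the parameter description above.
ACTIONS = ("Hint", "Intercept", "Block")


@dataclass(frozen=True)
class InspectionRule:
    """Hypothetical model of one rule as entered on the Add Rule page."""
    name: str
    rule_id: str                     # for example "static_0004"
    tenant: str
    service: str                     # exactly one SQL engine per rule
    action: str                      # one of ACTIONS
    threshold: Optional[int] = None  # None for rules without parameters

    def __post_init__(self) -> None:
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")


def evaluate(rule: InspectionRule, measured: int) -> Optional[str]:
    """Return the configured action if the rule matches, otherwise None.

    Assumed semantics: a rule with a threshold matches when the measured
    value is strictly greater than the threshold.
    """
    if rule.threshold is None:
        # Rules without a threshold match on presence alone; the caller
        # passes a value greater than 0 when the pattern was found.
        return rule.action if measured > 0 else None
    return rule.action if measured > rule.threshold else None


rule = InspectionRule(
    name="limit-union-all",
    rule_id="static_0004",
    tenant="tenant_a",   # hypothetical tenant name
    service="Hive",      # assumed engine, for illustration only
    action="Hint",
    threshold=20,
)
print(evaluate(rule, 8))    # None: 8 union all operators is within the limit
print(evaluate(rule, 25))   # Hint
```

Because each rule carries a single `service`, covering a second engine in this model means creating a second `InspectionRule`, which mirrors the one-engine-per-rule constraint described above.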
Table 1 MRS SQL Inspection Rules
| ID | Description | Engine | Threshold | Example SQL Statement | Impact |
|---|---|---|---|---|---|
| static_0001 | Check whether the number of count(distinct) functions used in a SQL statement exceeds the preconfigured threshold. | | Number of count(distinct) functions. Recommended value: 10 | SELECT COUNT(DISTINCT deviceId), COUNT(DISTINCT collDeviceId) FROM table GROUP BY deviceName, collDeviceName, collCurrentVersion; | The select count(distinct) syntax generates only one Reduce task. When a large table is processed, the volume of data to shuffle is large and execution is slow. If a statement contains multiple count(distinct) functions, each input record is expanded into multiple records for shuffling, which increases the shuffle volume and slows down the job further. |
| static_0002 | Check whether not in <subquery> is used in a SQL statement. | | N/A | SELECT * FROM Orders o WHERE o.Order_ID not in (SELECT Order_ID FROM HeldOrders h WHERE h.order_id = o.order_id); | The performance of not in subqueries is poor. If the values listed in the not in clause contain null, no data is returned. |
| static_0003 | Check whether the number of joins in a SQL statement exceeds the threshold. | | Number of joins. Recommended value: 20 | N/A | The more tables are joined, the more files, partitions, and data are scanned. As a result, the SQL statement occupies too much memory, affecting cluster stability. |
| static_0004 | Check whether the number of union all operators in a SQL statement exceeds the threshold. | | Number of union all operators in a statement. Recommended value: 20 | select * from tables t1 union all select * from tables t2 union all select * from tables t3 union all select * from tables t4 union all select * from tables t5 union all select * from tables t6 union all select * from tables t7 union all select * from tables t8 union all select * from tables t9; | A large number of union all operations may generate ultra-large result sets. As a result, a large amount of HDFS and Yarn resources is occupied during shuffling. |
| static_0005 | Check whether the number of nested subqueries exceeds the threshold. | | Number of nested subqueries. Recommended value: 20 | select * from ( with temp1 as (select * from tables) select * from temp1); | If a SQL statement has too many nesting layers, temporary data is generated multiple times, and the statement becomes difficult to maintain and modify. Avoid deeply nested queries to improve execution efficiency and SQL maintainability. |
| static_0006 | Check whether the length of a SQL statement exceeds the threshold. | | Length of the SQL statement, in KB. Recommended value: 10 | N/A | If a SQL string is too long, the SQL statement may be too complex, which may cause memory and performance problems. In addition, the SQL statement is difficult to maintain. |
| static_0007 | Check whether a Cartesian product exists when multiple tables are joined. | | N/A | select * from A,B; | A Cartesian product causes data expansion. When the task is running, a large amount of HDFS space and Yarn resources may be occupied, affecting the execution of other tasks. |
| static_0008 | Check whether alter table update is performed at the cluster level (on cluster). | ClickHouse | N/A | alter table testtb1 on cluster default_cluster update price=10.0 where id='100' | Updating and deleting data consume a large amount of CPU and memory resources. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability. |
| static_0009 | Check whether alter table delete is performed at the cluster level (on cluster). | ClickHouse | N/A | alter table testtb1 on cluster default_cluster delete where id ='10' | Updating and deleting data consume a large amount of CPU and memory resources. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability. |
| static_0010 | Check whether alter table add column is performed at the cluster level (on cluster). | ClickHouse | N/A | alter table testtb1 on cluster default_cluster add column testc String | Adding and deleting columns consume a large amount of CPU and memory resources. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, causing metadata inconsistency and affecting cluster stability. |
| static_0011 | Check whether alter table drop column is performed at the cluster level (on cluster). | ClickHouse | N/A | alter table testtb1 on cluster default_cluster drop column testc | Adding and deleting columns consume a large amount of CPU and memory resources. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, causing metadata inconsistency and affecting cluster stability. |
| static_0012 | Check whether optimize final is performed at the cluster level (on cluster). | ClickHouse | N/A | optimize table testtb1 on cluster default_cluster final | Manually triggered merging consumes a large amount of CPU, memory, and disk I/O resources when the table data volume is large. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability. |
| static_0013 | Check whether drop is performed at the cluster level (on cluster). | ClickHouse | N/A | drop table/database test on cluster default_cluster; | Dropping tables consumes a large amount of CPU, memory, and disk I/O resources when the metadata volume and data volume are large. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability. |
| static_0014 | Check whether truncate table is performed at the cluster level (on cluster). | ClickHouse | N/A | truncate table testtb1 on cluster default_cluster; | Deleting table data consumes a large amount of CPU, memory, and disk I/O resources when the table data volume is large. Cluster-level operations put high pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability. |
| dynamic_0001 | Check whether the number of scanned files exceeds the threshold. | | Number of files that will be scanned or have been scanned. Recommended value: 100,000 | SELECT ss_ticket_number FROM store_sales WHERE ss_ticket_number=72291252 LIMIT 10; | Scanning a large number of files with a SQL statement can generate a large number of slices, overloading HiveServer memory and potentially causing the instance to crash. This can also consume a significant amount of cluster resources, delaying other tasks. |
| dynamic_0002 | Check whether the number of partitions involved in a table operation (select, delete, update, or alter) exceeds the threshold. | | Number of partitions involved in the delete or alter operation. Recommended value: 10,000 | DELETE FROM table_name WHERE column_name = value | Scanning too many partitions can overload the database, causing it to run slowly and consume excessive memory in HiveServer and MetaStore. This can lead to frequent GC, disrupting other tasks and potentially causing the instance to restart unexpectedly. |
| dynamic_0003 | When the right table of a join is a distributed table, check whether the data volume of the right table exceeds the threshold. | ClickHouse | Number of rows in the right table of a join. Recommended value: 100,000,000 | SELECT name, text FROM table_1 JOIN table_2 ON table_1.Id = table_2.Id | Large data volumes in the right table can cause the join operation to consume excessive memory, potentially leading to memory insufficiency, service failure, and cluster instability. |
| dynamic_0004 | Check whether a SQL statement overwrites the same table from which it reads data. | | N/A | N/A | Such SQL statements may cause data loss or inconsistency. |
| running_0001 | Check whether the number of rows returned by a select statement to the client exceeds the threshold. | | Number of rows in the query result. Recommended value: 100,000 | select * from table | Large query results can cause server memory overload, leading to OOM exceptions and instability. Very large results also reduce query efficiency. |
| running_0002 | Check whether the peak memory usage of a SQL statement exceeds the threshold (absolute value). | | Memory occupied by a SQL statement during runtime, in MB | N/A | Long-running tasks consume cluster resources, slowing down other tasks and overall performance. |
| running_0003 | Check whether the running duration of a SQL statement exceeds the threshold. | | Running duration of a SQL statement, in seconds | N/A | Long-running tasks can monopolize cluster resources, reducing overall utilization. They also generate a significant amount of intermediate data. To avoid delays, tasks that fail to meet expectations should be adjusted promptly. |
| running_0004 | Check whether the size of data scanned by a SQL statement exceeds the threshold. | | Data scanned by a SQL statement, in GB. Recommended value: 10,240 | N/A | Large datasets can consume significant memory resources, impacting the performance of other tasks. Intermediate data can also fill disk space, compromising cluster stability. |
| running_0005 | Check whether the amount of shuffle data that has been written by a SQL statement exceeds the threshold. | Spark | Amount of shuffle data written by a SQL statement, in GB | N/A | When SQL statements with operators such as join and aggregation are executed, a significant amount of data is shuffled, leading to high disk usage and potentially exhausting disk space, which can compromise cluster stability. |
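
The null behavior described for static_0002 can be reproduced on a small scale. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, so it does not involve any MRS component; the table names `orders` and `held_orders` are modeled on the example statement, and the data is invented for illustration. The result follows from standard SQL three-valued logic, so the engines covered by the rule are expected to behave the same way, but this should be verified on the target cluster. The `NOT EXISTS` form is shown as one common alternative, not as a recommendation from this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER);
    CREATE TABLE held_orders (order_id INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO held_orders VALUES (2), (NULL);
""")

# NOT IN: the NULL in held_orders makes each comparison either false
# (order 2) or unknown (orders 1 and 3), so no row passes the filter.
not_in_rows = conn.execute(
    "SELECT order_id FROM orders "
    "WHERE order_id NOT IN (SELECT order_id FROM held_orders) "
    "ORDER BY order_id"
).fetchall()

# NOT EXISTS: the correlated equality never matches a NULL, so only the
# order that is really held (2) is excluded.
not_exists_rows = conn.execute(
    "SELECT o.order_id FROM orders o "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM held_orders h WHERE h.order_id = o.order_id"
    ") ORDER BY o.order_id"
).fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(1,), (3,)]
conn.close()
```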

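
Threshold rules such as static_0001 and static_0004 compare a count taken from the statement text against the configured value. The following is a rough sketch of such counting, assuming simple regular expressions are good enough for a pre-check before submitting a statement. The function names and patterns are hypothetical and are not how MRS inspects SQL; a real inspector would parse the statement, because regular expressions can miscount text inside string literals or comments.

```python
import re

# Hypothetical patterns for a quick client-side pre-check.
COUNT_DISTINCT = re.compile(r"\bcount\s*\(\s*distinct\b", re.IGNORECASE)
UNION_ALL = re.compile(r"\bunion\s+all\b", re.IGNORECASE)


def count_count_distinct(sql: str) -> int:
    """Number of count(distinct ...) calls found in the statement text."""
    return len(COUNT_DISTINCT.findall(sql))


def count_union_all(sql: str) -> int:
    """Number of union all operators found in the statement text."""
    return len(UNION_ALL.findall(sql))


# Example statement from the static_0001 row.
distinct_sql = (
    "SELECT COUNT(DISTINCT deviceId), COUNT(DISTINCT collDeviceId) "
    "FROM table GROUP BY deviceName, collDeviceName, collCurrentVersion;"
)

# Example statement from the static_0004 row: nine selects joined by union all.
union_sql = " union all ".join(
    f"select * from tables t{i}" for i in range(1, 10)
) + ";"

print(count_count_distinct(distinct_sql))  # 2
print(count_union_all(union_sql))          # 8
```

Both example statements stay below the recommended values of 10 and 20, so a rule configured with those thresholds would not match them under the assumption that a match requires exceeding the threshold.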

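
The data expansion that static_0007 guards against is easy to see with two small tables. This sketch again uses Python's built-in `sqlite3` module as a stand-in engine; the row counts and the `id` column are invented for illustration, and only the table names `A` and `B` come from the example statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (1), (2), (3), (4);
""")

# No join condition: every row of A is paired with every row of B.
cross_rows = conn.execute("SELECT COUNT(*) FROM A, B").fetchone()[0]

# With a join condition, only matching pairs are produced.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM A JOIN B ON A.id = B.id"
).fetchone()[0]

print(cross_rows)   # 12
print(joined_rows)  # 3
conn.close()
```

The output of the unconditioned join grows with the product of the two row counts, which is why a statement that looks harmless on small tables can occupy large amounts of storage and compute on production-sized ones.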