Large query isolation can be configured to manage queries that have high memory usage or take too long to complete. This helps improve the stability of OpenSearch clusters and prevent out-of-memory (OOM) exceptions.
OpenSearch clusters support multiple access methods; this topic uses OpenSearch Dashboards as an example. Log in to OpenSearch Dashboards and go to the command execution page.
The left part of the console is the command input box, and the triangle icon in its upper-right corner is the execution button. The right part shows the execution result.
Large query isolation places large queries in an isolation pool, where they may be canceled based on preset memory or duration thresholds. Large query isolation is enabled by default. Any change takes effect immediately.
PUT _cluster/settings
{
  "persistent": {
    "search.isolator.enabled": true
  }
}
| Parameter | Type | Description |
|---|---|---|
| search.isolator.enabled | Boolean | Whether to enable large query isolation. When enabled, large queries are managed separately from other normal queries. |
PUT _cluster/settings
{
  "persistent": {
    "search.isolator.memory.task.limit": "50MB",
    "search.isolator.time.management": "10s"
  }
}
| Parameter | Type | Description |
|---|---|---|
| search.isolator.memory.task.limit | String | Large query memory threshold: When a query requests more memory than specified by this threshold, it is placed into an isolation pool. Value range: 0b to the maximum heap memory of a node. Unit: MB, GB, or other units supported by OpenSearch. The default value is 50MB. |
| search.isolator.time.management | String | Large query duration threshold: When a query has lasted longer than specified by this threshold, it is placed into an isolation pool. Value range: ≥ 0ms. Unit: s (second), ms (millisecond), m (minute), or h (hour). The default value is 10s. |
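
The two thresholds act independently: exceeding either one is enough to move a query into the isolation pool. The following Python sketch illustrates that rule only; the helper names (`parse_size`, `parse_duration`, `enters_isolation_pool`) and the unit tables are hypothetical and are not part of OpenSearch, where the actual check runs inside the cluster.

```python
import re

# Hypothetical unit tables for this sketch; OpenSearch accepts more units than these.
SIZE_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
TIME_UNITS_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}


def _parse(value, units):
    """Split a setting such as '50MB' or '10s' into number and unit, then scale it."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*", value)
    if not match or match.group(2).lower() not in units:
        raise ValueError(f"unsupported value: {value!r}")
    return float(match.group(1)) * units[match.group(2).lower()]


def parse_size(value):
    """Return a size setting in bytes."""
    return _parse(value, SIZE_UNITS)


def parse_duration(value):
    """Return a duration setting in milliseconds."""
    return _parse(value, TIME_UNITS_MS)


def enters_isolation_pool(requested_bytes, elapsed_ms,
                          memory_limit="50MB", time_limit="10s"):
    """True if the query exceeds the memory threshold or the duration threshold."""
    return (requested_bytes > parse_size(memory_limit)
            or elapsed_ms > parse_duration(time_limit))
```

With the default values, a query that requests 60 MB is treated as a large query regardless of how long it has run, and a query that has run for 15 seconds is treated as one regardless of its memory request.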
PUT _cluster/settings
{
  "persistent": {
    "search.isolator.memory.pool.limit": "50%",
    "search.isolator.memory.heap.limit": "90%",
    "search.isolator.count.limit": 1000
  }
}
| Parameter | Type | Description |
|---|---|---|
| search.isolator.memory.pool.limit | String | Isolation pool memory usage threshold, which indicates the maximum node heap memory usage allowed. When the total memory requested by all large queries in the isolation pool exceeds this threshold, one of the large queries is automatically canceled based on a predefined policy. Value range: 0.0–100.0%. The default value is 50%. |
| search.isolator.memory.heap.limit | String | Heap memory usage threshold, which indicates the actual node heap memory usage, contributed by both writes and queries. When the node heap memory usage exceeds this threshold, one of the large queries is automatically canceled based on a predefined policy. Value range: 0.0–100.0%. The default value is 90%. |
| search.isolator.count.limit | Integer | The maximum number of queries in the isolation pool. When this threshold is reached, query cancelation is triggered, and no new queries will be accepted. Value range: 10–50000. The default value is 1000. |
In addition to search.isolator.memory.pool.limit and search.isolator.count.limit, you can configure search.isolator.memory.task.limit and search.isolator.time.management to control the number of query tasks that enter the isolation pool.
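
The three pool-level thresholds can each trigger cancelation on their own. A minimal Python sketch of those checks follows, assuming the per-query memory requests and the node heap figures are already known; `PoolLimits`, `cancellation_reasons`, and the reason labels are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolLimits:
    pool_percent: float = 50.0  # search.isolator.memory.pool.limit
    heap_percent: float = 90.0  # search.isolator.memory.heap.limit
    max_count: int = 1000       # search.isolator.count.limit


def cancellation_reasons(pool_request_bytes, heap_used_bytes, heap_max_bytes,
                         limits=PoolLimits()):
    """Return the thresholds (as hypothetical labels) that would trigger cancelation.

    pool_request_bytes holds the memory requested by each query in the pool.
    """
    reasons = []
    # Total memory requested by pooled queries versus the pool limit.
    if sum(pool_request_bytes) > heap_max_bytes * limits.pool_percent / 100:
        reasons.append("pool_memory")
    # Actual node heap usage (writes and queries) versus the heap limit.
    if heap_used_bytes > heap_max_bytes * limits.heap_percent / 100:
        reasons.append("heap_usage")
    # Number of pooled queries versus the count limit ("reached" means >=).
    if len(pool_request_bytes) >= limits.max_count:
        reasons.append("query_count")
    return reasons
```

An empty list means none of the thresholds has been crossed, so no large query would be canceled under this model.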
PUT _cluster/settings
{
  "persistent": {
    "search.isolator.strategy": "fair",
    "search.isolator.strategy.ratio": "0.5%"
  }
}
| Parameter | Type | Description |
|---|---|---|
| search.isolator.strategy | String | Policy for selecting which query to cancel when query isolation is triggered. The large query isolation pool is checked every second until the heap memory is within the safe range. |
| search.isolator.strategy.ratio | String | Fair policy threshold, which is the ratio of the memory difference between two queries in the isolation pool to the maximum node heap memory. If this threshold is not reached, the query that has lasted the longest will be canceled. Otherwise, the query that occupies the most memory will be canceled. This parameter is valid only when search.isolator.strategy is set to fair. Value range: 0.0–100.0%. The default value is 1%. |
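
The fair policy can be read as a comparison between a memory-heavy query and a long-running one. The Python sketch below is one possible interpretation, not the cluster's implementation: it assumes the "two queries" being compared are the query using the most memory and the query that has run the longest, and `PooledQuery` and `pick_query_to_cancel` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PooledQuery:
    task_id: str
    memory_bytes: int
    elapsed_ms: int


def pick_query_to_cancel(queries, heap_max_bytes, ratio_percent=1.0):
    """Choose a query to cancel under the assumed reading of the fair policy."""
    if not queries:
        return None
    biggest = max(queries, key=lambda q: q.memory_bytes)
    longest = max(queries, key=lambda q: q.elapsed_ms)
    # Memory difference between the two candidates, as a percentage of max heap.
    diff_percent = (biggest.memory_bytes - longest.memory_bytes) / heap_max_bytes * 100
    # Threshold reached: cancel the memory-heavy query; otherwise the longest-running one.
    return biggest if diff_percent >= ratio_percent else longest
```

Under this reading, a small ratio favors canceling whichever query holds the most memory, while a large ratio favors canceling the oldest query unless one query clearly dominates memory usage.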
A global query timeout applies to all queries. It is disabled by default. Any change takes effect immediately.
PUT _cluster/settings
{
  "persistent": {
    "search.isolator.time.enabled": true,
    "search.isolator.time.limit": "110s"
  }
}
| Parameter | Type | Description |
|---|---|---|
| search.isolator.time.enabled | Boolean | Whether to enable a global query timeout. When enabled, queries are automatically canceled when they last longer than a predefined timeout. |
| search.isolator.time.limit | String | The value of the global query timeout. Value range: ≥ 0ms. The default value is 120s. |
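
Because clusters can be reached in ways other than OpenSearch Dashboards, the same settings update can also be prepared from a script. The Python sketch below only builds the HTTP request using the standard library; the endpoint `https://localhost:9200` and the helper name `build_settings_request` are assumptions, and authentication and TLS configuration are deliberately left out.

```python
import json
import urllib.request


def build_settings_request(base_url, settings):
    """Build a PUT _cluster/settings request that stores the given persistent settings."""
    body = json.dumps({"persistent": settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_settings_request(
    "https://localhost:9200",  # hypothetical endpoint; replace with your cluster address
    {
        "search.isolator.time.enabled": True,
        "search.isolator.time.limit": "110s",
    },
)
# Sending is left to the caller, for example urllib.request.urlopen(request),
# once credentials and certificates for the cluster are configured.
```

Keeping request construction separate from sending makes it easy to inspect the payload before it reaches the cluster.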
PUT _cluster/settings
{
  "persistent": {
    "search.isolator.log.count": 100
  }
}
| Parameter | Type | Description |
|---|---|---|
| search.isolator.log.count | Integer | Maximum number of log records retained for canceled queries. Canceled query requests are recorded in memory so that large queries can be analyzed and optimized. Excess records are discarded. This parameter is valid only when search.isolator.enabled is set to true. Value range: 0–5000. The default value is 100. |

You can use the following API to check query cancelation logs:
In the commands above, nodeId indicates the node ID. Example response:
{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "test",
  "nodes": {
    "CTqrZFXWTzmLonSZyNMKkQ": {
      "name": "test-ess-esn-1-1",
      "host": "172.16.101.116",
      "total_cancel": 0, // Total number of canceled queries
      "isolator_cancel": 0, // Number of queries canceled because isolation pool thresholds were exceeded
      "out_of_time_cancel": 0 // Number of queries canceled due to timeout
    }
  }
}
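
For monitoring, the per-node counters in such a response can be collected into a compact summary. The following Python sketch assumes the response has already been parsed into a dictionary with the field names shown in the example; `summarize_cancellations` and the summary keys are hypothetical.

```python
def summarize_cancellations(response):
    """Map each node name to its three cancelation counters."""
    summary = {}
    for node in response.get("nodes", {}).values():
        summary[node["name"]] = {
            "total": node["total_cancel"],
            "isolation_pool": node["isolator_cancel"],
            "timeout": node["out_of_time_cancel"],
        }
    return summary


# Parsed form of the example response; the // annotations are not valid JSON and are dropped.
example = {
    "_nodes": {"total": 1, "successful": 1, "failed": 0},
    "cluster_name": "test",
    "nodes": {
        "CTqrZFXWTzmLonSZyNMKkQ": {
            "name": "test-ess-esn-1-1",
            "host": "172.16.101.116",
            "total_cancel": 0,
            "isolator_cancel": 0,
            "out_of_time_cancel": 0,
        }
    },
}

print(summarize_cancellations(example))
```

A rising isolation pool counter suggests the memory or count thresholds are being hit, whereas a rising timeout counter points to the global query timeout.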