doc-exports/docs/css/umn/css_01_0120.html
zhengxiu 93d856d5c5 css umn 25.6.0 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: zhengxiu <zhengxiu@huawei.com>
Co-committed-by: zhengxiu <zhengxiu@huawei.com>
2025-11-25 11:34:43 +00:00

<a name="EN-US_TOPIC_0000001992205805"></a><a name="EN-US_TOPIC_0000001992205805"></a>
<h1 class="topictitle1">Optimizing Vector Cluster Performance</h1>
<div id="body0000001992205805"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1531323504315">This topic explains how to optimize the performance of a CSS vector database from two aspects: writes and queries.</p>
<div class="section" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_section739215417110"><h4 class="sectiontitle">Optimizing Write Performance</h4><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p294327181110">Writing vector data incurs three major overheads: replica synchronization, index refresh, and segment merging. When index data is written in real time, frequent index refresh operations generate a large number of small segments. This triggers frequent vector index build and merge operations, which consume excessive CPU/IO resources. You can try the following solutions to optimize write performance.</p>
<div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p7609163161213">Solution 1: temporarily disable replicas</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul88601152191219"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li11861175201212">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p15361105415127"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li11861175201212"></a><a name="en-us_topic_0000001938377528_li11861175201212"></a>Temporarily disable replicas during data ingestion and enable them after data ingestion is complete. Use this solution when importing historical data in batches or performing a full update (for example, when initializing a vector database).</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li286111529126">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p2422175591212"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li286111529126"></a><a name="en-us_topic_0000001938377528_li286111529126"></a>Set the number of replicas:</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen286912451719">PUT <i><span class="varname" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_varname83811536177">my_index</span></i>/_settings
{
"number_of_replicas": 0
}</pre>
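<p>Once data ingestion is complete, replicas can be restored through the same settings API. The following is a minimal sketch; the replica count of 1 is an assumed value, so use the count the index had before ingestion.</p>
<pre class="screen"># Assumed value: restore one replica after ingestion.
PUT <i><span class="varname">my_index</span></i>/_settings
{
  "number_of_replicas": 1
}</pre>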
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1686155201213">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p7335185651212"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1686155201213"></a><a name="en-us_topic_0000001938377528_li1686155201213"></a>Write performance is enhanced by avoiding real-time vector index building on replica nodes.</p>
</li></ul>
</div></div><div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p6190191112133">Solution 2: adjust the refresh interval</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul1019014116138"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1519017119133">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p4190011191311"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1519017119133"></a><a name="en-us_topic_0000001938377528_li1519017119133"></a>Set the index refresh interval to 120s or longer to reduce the number of small segments generated by frequent index refreshes and the vector index building overhead caused by segment merges. You can also disable automatic index refresh by setting the refresh interval to -1. Use this solution in high-throughput write scenarios (for example, when writing vectorized log data).</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li151909113136">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p61902011101314"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li151909113136"></a><a name="en-us_topic_0000001938377528_li151909113136"></a>Set <strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b1454516817553">refresh_interval</strong>.</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen549215389186">PUT <i><span class="varname" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_varname11670153211912">my_index</span></i>/_settings
{
"refresh_interval": "120s"
}</pre>
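<p>For a one-off bulk load, automatic refresh can also be turned off entirely and triggered manually at the end. This sketch assumes standard OpenSearch index settings behavior, where a <strong>refresh_interval</strong> of <strong>-1</strong> disables automatic refresh. The restored value of 120s is only an example.</p>
<pre class="screen"># Disable automatic refresh before the bulk load.
PUT <i><span class="varname">my_index</span></i>/_settings
{
  "refresh_interval": "-1"
}

# After the load, make the new documents searchable.
POST <i><span class="varname">my_index</span></i>/_refresh

# Restore a refresh interval (example value).
PUT <i><span class="varname">my_index</span></i>/_settings
{
  "refresh_interval": "120s"
}</pre>
<p>While automatic refresh is disabled, newly written documents do not appear in search results until a refresh is triggered.</p>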
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li16190161118134">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p10190131151319"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li16190161118134"></a><a name="en-us_topic_0000001938377528_li16190161118134"></a>Index refreshes occur less frequently. This reduces the number of small segments and the overhead of segment merges, which improves write performance.</p>
</li></ul>
</div></div><div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p12102181311311">Solution 3: increase indexing threads</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul171021013111310"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li2102613151318">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1510281313139"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li2102613151318"></a><a name="en-us_topic_0000001938377528_li2102613151318"></a>Increasing the number of threads for vector index building accelerates the indexing process. However, too many such threads will compete with queries for resources. Use this solution when there are sufficient CPU resources but the write latency is high, such as in GPU-accelerated environments.</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1610261351310">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p7102191319134"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1610261351310"></a><a name="en-us_topic_0000001938377528_li1610261351310"></a>The default value of <strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b467253331620">native.vector.index_threads</strong> is 4. Change this value as needed.</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen119751022182320">PUT _cluster/settings
{
"persistent": {
"native.vector.index_threads": 8
}
}</pre>
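<p>Because extra indexing threads compete with queries for resources, it may be worth returning to the default once heavy ingestion is over. A minimal sketch, using the default value of 4 stated above:</p>
<pre class="screen"># Restore the documented default after ingestion.
PUT _cluster/settings
{
  "persistent": {
    "native.vector.index_threads": 4
  }
}</pre>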
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1710219136132">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p31021913201315"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1710219136132"></a><a name="en-us_topic_0000001938377528_li1710219136132"></a>Vector index building is accelerated, and the performance of concurrent writes is enhanced.</p>
</li></ul>
</div></div></div>
<div class="section" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_section1841011015113"><h4 class="sectiontitle">Optimizing Query Performance</h4><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1041532318114">Query performance is affected by the following factors: the number of segments, the memory circuit breaker mechanism, and field recall. An excessively large number of segments impacts search efficiency; when off-heap memory becomes insufficient, vector index data is frequently swapped in and out of the memory; recalling all fields increases the load during the fetch phase. You can optimize query performance by addressing these factors.</p>
<div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p13695272136">Solution 1: perform force merge</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul76962717131"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li669182711311">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p11691027141320"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li669182711311"></a><a name="en-us_topic_0000001938377528_li669182711311"></a>After batch data ingestion, perform the force merge operation to forcibly merge segments, thus reducing the number of segments. Typically, you should perform this operation after data ingestion and before data query (for example, after a scheduled batch ingestion).</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li169162741310">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p169132731313"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li169162741310"></a><a name="en-us_topic_0000001938377528_li169162741310"></a>Perform the force merge operation:</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen58166117301">POST <i><span class="varname" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_varname71217623114">my_index</span></i>/_forcemerge?max_num_segments=1</pre>
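<p>To confirm the effect of the merge, the segments of the index can be listed before and after the operation. This sketch assumes the standard OpenSearch <strong>_cat/segments</strong> API is available on the cluster:</p>
<pre class="screen"># List the segments of each shard; fewer rows per shard are expected after the merge.
GET _cat/segments/<i><span class="varname">my_index</span></i>?v</pre>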
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li569172771318">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p269122710132"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li569172771318"></a><a name="en-us_topic_0000001938377528_li569172771318"></a>Multiple segments are merged into a single segment. This reduces the file scanning overhead and speeds up queries.</p>
</li></ul>
</div></div><div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1840273015134">Solution 2: adjust the upper limit of the segment size</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul24021630201311"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li114021330181318">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p154021130111314"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li114021330181318"></a><a name="en-us_topic_0000001938377528_li114021330181318"></a>During batch writes, the maximum size of segments generated by the system is 5 GB. You can increase this upper limit to reduce the number of segments generated after automatic merging. Typically, you should perform this operation before batch data ingestion starts.</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li7402230121314">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p194021830191318"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li7402230121314"></a><a name="en-us_topic_0000001938377528_li7402230121314"></a>Increase the maximum segment size:</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen539895761520">PUT <i><span class="varname" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_varname1359243824219">my_index</span></i>/_settings
{
"index.merge.policy.max_merged_segment": "10gb"
}</pre>
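<p>To check which merge policy values are currently in effect for the index, the index settings can be read back. This sketch assumes the standard OpenSearch <strong>include_defaults</strong> and <strong>flat_settings</strong> query parameters, which also list settings that have not been set explicitly:</p>
<pre class="screen"># Look for index.merge.policy.max_merged_segment in the response.
GET <i><span class="varname">my_index</span></i>/_settings?include_defaults=true&amp;flat_settings=true</pre>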
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li7402143018136">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1740214307136"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li7402143018136"></a><a name="en-us_topic_0000001938377528_li7402143018136"></a>Increasing the maximum segment size reduces the number of segments and thus improves query performance.</p>
</li></ul>
</div></div><div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1363463101311">Solution 3: adjust the circuit breaker limit for off-heap memory</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul8634153111319"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li86341931181317">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1063413117133"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li86341931181317"></a><a name="en-us_topic_0000001938377528_li86341931181317"></a>When the off-heap memory required by vector indexes exceeds the circuit breaker limit, the index cache manager frequently swaps index data in and out of the cache, which slows down queries. You can raise the circuit breaker limit to reduce circuit breaking (indicated by CircuitBreakingException entries in the log) resulting from insufficient memory.</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li76341831191314">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p2634731171313"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li76341831191314"></a><a name="en-us_topic_0000001938377528_li76341831191314"></a>The default circuit breaker limit for off-heap memory is 80%. You can adjust this limit as required.</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen4351528143020">PUT _cluster/settings
{
"persistent": {
"native.cache.circuit_breaker.cpu.limit": "85%"
}
}</pre>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li363418317132">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p15634123131311"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li363418317132"></a><a name="en-us_topic_0000001938377528_li363418317132"></a>It is less likely for vector index data to be swapped out from the memory, and query jitter is reduced.</p>
</li></ul>
</div></div><div class="dropdownexpand"><div class="dropdowntitle" onclick="ExpandorCollapseNode(this)"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p2822123281319">Solution 4: optimize field recall</p></div>
<div class="dropdowncontext"><ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul3822232131313"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li14822163291310">Description<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p4822133261312"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li14822163291310"></a><a name="en-us_topic_0000001938377528_li14822163291310"></a>If the query result needs to return only a few fields that are either keywords or numeric values, you can use the docvalue_fields parameter to fetch them. Use this method if only numeric or enumerated metadata (such as product IDs and class labels) needs to be fetched. It can significantly reduce overhead during the fetch phase.</p>
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1482211324134">Operation<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p882253216131"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1482211324134"></a><a name="en-us_topic_0000001938377528_li1482211324134"></a>Use the docvalue_fields parameter to fetch only specific fields:</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen13521154942219">POST <i><span class="varname" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_varname1943615212518">my_index</span></i>/_search
{
"size": 2,
"stored_fields": ["_none_"],
"docvalue_fields": ["my_label"],
"query": {
"vector": {
"my_vector": {
"vector": [1, 1],
"topk": 2
}
}
}
}</pre>
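<p>Several fields can be fetched the same way by listing them all in <strong>docvalue_fields</strong>. The sketch below reuses the query above; <strong>my_id</strong> is a hypothetical keyword or numeric field added for illustration only.</p>
<pre class="screen"># my_id is a hypothetical field name; replace it with a keyword or numeric field of your index.
POST <i><span class="varname">my_index</span></i>/_search
{
  "size": 2,
  "stored_fields": ["_none_"],
  "docvalue_fields": ["my_label", "my_id"],
  "query": {
    "vector": {
      "my_vector": {
        "vector": [1, 1],
        "topk": 2
      }
    }
  }
}</pre>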
</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li12822232171319">Result<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p482223212136"><a name="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li12822232171319"></a><a name="en-us_topic_0000001938377528_li12822232171319"></a>There is no need to parse the entire _source document. Column-oriented storage (docvalues) reduces the overhead during the fetch phase and improves query performance.</p>
</li></ul>
</div></div></div>
<div class="section" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_section8992201842518"><h4 class="sectiontitle">Setting Cache Timeout</h4><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1333214011241">When the cluster's memory resources are insufficient, data is updated frequently, or high data freshness is required, you can enable automatic cache expiration so that inactive data is cleared from the cache. This helps to optimize system performance, ensure data consistency, and improve query stability. Use this approach where data is updated frequently or memory resources are stretched thin.</p>
<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p161567308282">Run the following command to set cache timeout:</p>
<pre class="screen" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_screen12549193632718">PUT _cluster/settings
{
"persistent": {
"native.cache.expiry.enabled": "true",
"native.cache.expiry.time": "30m"
}
}</pre>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_table14840054154616" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_row10840205444620"><th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.4.5.2.4.1.1"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_p11840195417463">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.4.5.2.4.1.2"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1788103102914">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.5.2.4.1.3"><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_p584045412460">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_row2840254184612"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.5.2.4.1.1 "><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_p1084075424612">native.cache.expiry.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.5.2.4.1.2 "><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p108817382913">Boolean</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.5.2.4.1.3 "><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p789513454205">Whether to enable automatic cache expiration.</p>
<div class="p" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p136151250122014">Value range:<ul id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_ul1524969182119"><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li18249296212"><strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b378178151417">true</strong>: Enable automatic cache expiration. Inactive data in the cache will be cleared.</li><li id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_li1025013932110"><strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b461132471513">false</strong> (default value): Disable automatic cache expiration.</li></ul>
</div>
</td>
</tr>
<tr id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_row1684035414620"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.5.2.4.1.1 "><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_p2084016540466">native.cache.expiry.time</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.4.5.2.4.1.2 "><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p3889318293">String</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.5.2.4.1.3 "><p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_en-us_topic_0000001309709789_p13840454124613">Timeout of inactive cache items.</p>
<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p1112616312232">This parameter takes effect only when <span class="parmvalue" id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_parmvalue121451214152310"><b>native.cache.expiry.enabled=true</b></span>.</p>
<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p06251346202320">Value: a time string, for example, <strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b52546191716">24h</strong> (24 hours) or <strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b192867571714">30m</strong> (30 minutes).</p>
<p id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_p175999062317">Default value: <strong id="EN-US_TOPIC_0000001992205805__en-us_topic_0000001938377528_b199017263391828">24h</strong>.</p>
</td>
</tr>
</tbody>
</table>
</div>
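<p>If automatic cache expiration is no longer needed, it can be turned off again with the same API. A minimal sketch, using the default value <strong>false</strong> listed in Table 1:</p>
<pre class="screen"># Return to the documented default: automatic cache expiration disabled.
PUT _cluster/settings
{
  "persistent": {
    "native.cache.expiry.enabled": "false"
  }
}</pre>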
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="css_01_0101.html">Configuring Vector Search for OpenSearch Clusters</a></div>
</div>
</div>