doc-exports/docs/css/umn/css_01_0466.html
zhengxiu 2125539080 css umn 25.1.0 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: zhengxiu <zhengxiu@huawei.com>
Co-committed-by: zhengxiu <zhengxiu@huawei.com>
2025-07-04 09:10:17 +00:00

<a name="css_01_0466"></a><a name="css_01_0466"></a>
<h1 class="topictitle1">Creating Vector Indexes in an OpenSearch Cluster</h1>
<div id="body0000001992165597"><p id="css_01_0466__css_01_0121_p1088563214517">To create a vector index, perform the following steps:</p>
<ol id="css_01_0466__css_01_0121_ol1619419559456"><li id="css_01_0466__css_01_0121_li191943552459"><a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_section8992201842518">(Optional) Preparations</a>: Configure advanced cluster settings based on service needs.</li><li id="css_01_0466__css_01_0121_li23091424184611"><a href="#css_01_0466__css_01_0121_section68017273556">(Optional) Pre-Building and Registering a Center Point Vector</a>: If you select the <span class="parmvalue" id="css_01_0466__parmvalue9728132912159"><b>IVF_GRAPH</b></span> or <span class="parmvalue" id="css_01_0466__parmvalue11728152901511"><b>IVF_GRAPH_PQ</b></span> index algorithm when creating a vector index, pre-build and register a center point vector first.</li><li id="css_01_0466__css_01_0121_li171952553459"><a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137344225249">Creating a Vector Index</a>: Create a vector index based on service needs.</li><li id="css_01_0466__css_01_0121_li1619519552452"><a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137931314240">Importing Vector Data</a>: Import vector data into the cluster.</li><li id="css_01_0466__css_01_0121_li7115281857"><a href="css_01_0467.html">Using Vector Indexes for Data Search in an OpenSearch Cluster</a>: Perform a vector search.</li></ol>
<div class="section" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section116717617479"><h4 class="sectiontitle">Prerequisites</h4><p id="css_01_0466__css_01_0121_p17591181212610">You have created an OpenSearch 1.3.6 cluster. For details, see <a href="css_01_0465.html#css_01_0465__css_01_0118_section18221195417136">Cluster Node Specifications Selection for Vector Search</a>.</p>
</div>
<div class="section" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section8992201842518"><a name="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section8992201842518"></a><a name="css_01_0121_en-us_topic_0000001309709789_section8992201842518"></a><h4 class="sectiontitle">(Optional) Preparations</h4><p id="css_01_0466__css_01_0121_p1641214115586">Before creating a vector index, configure advanced settings for the cluster based on service needs.</p>
<ul id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ul15987041576"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li14987041175">When importing data offline, you are advised to set <strong id="css_01_0466__b94754009721741">refresh_interval</strong> of indexes to <strong id="css_01_0466__b138707179421741">-1</strong> to disable automatic index refreshing and thus improve batch write performance.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li10987184111711">You are advised to set <strong id="css_01_0466__b126502817721741">number_of_replicas</strong> to <strong id="css_01_0466__b115920245821741">0</strong>. After the offline data import is complete, you can modify the parameter value again as needed.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li15567154814461"><a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_table14840054154616">Table 1</a> describes the other advanced settings.
<div class="tablenoborder"><a name="css_01_0466__css_01_0121_en-us_topic_0000001309709789_table14840054154616"></a><a name="css_01_0121_en-us_topic_0000001309709789_table14840054154616"></a><table cellpadding="4" cellspacing="0" summary="" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_table14840054154616" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameters for advanced cluster settings</caption><thead align="left"><tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row10840205444620"><th align="left" class="cellrowborder" valign="top" width="28.07%" id="mcps1.3.4.3.3.2.2.3.1.1"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p11840195417463">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="71.93%" id="mcps1.3.4.3.3.2.2.3.1.2"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p584045412460">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row1984014541461"><td class="cellrowborder" valign="top" width="28.07%" headers="mcps1.3.4.3.3.2.2.3.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1184012542463">native.cache.circuit_breaker.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="71.93%" headers="mcps1.3.4.3.3.2.2.3.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1884010543460">Whether to enable the circuit breaker for off-heap memory.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p108401554134616">Default value: <strong id="css_01_0466__b14069602421741">true</strong></p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row2084025454614"><td class="cellrowborder" valign="top" width="28.07%" headers="mcps1.3.4.3.3.2.2.3.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p198401754104612">native.cache.circuit_breaker.cpu.limit</p>
</td>
<td class="cellrowborder" valign="top" width="71.93%" headers="mcps1.3.4.3.3.2.2.3.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p8840165410464">Upper limit of off-heap memory usage of the vector index.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p14840145414615">For example, if the overall memory of a host is 128 GB and the heap memory occupies 31 GB, the default upper limit of the off-heap memory usage is 43.65 GB, that is, (128 - 31) x 45%. If the off-heap memory usage exceeds its upper limit, the circuit breaker will be triggered.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1584075419469">Default value: <strong id="css_01_0466__b127222771221741">45%</strong></p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row2840254184612"><td class="cellrowborder" valign="top" width="28.07%" headers="mcps1.3.4.3.3.2.2.3.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1084075424612">native.cache.expiry.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="71.93%" headers="mcps1.3.4.3.3.2.2.3.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p14840454154620">Whether to enable the cache expiration policy. If this parameter is set to <strong id="css_01_0466__b47483490221741">true</strong>, some cache items that have not been accessed for a long time will be cleared.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p188403541465">Value: <strong id="css_01_0466__b199481026521741">true</strong> or <strong id="css_01_0466__b194472956021741">false</strong>.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p12840854134616">Default value: <strong id="css_01_0466__b145410936221741">false</strong></p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row1684035414620"><td class="cellrowborder" valign="top" width="28.07%" headers="mcps1.3.4.3.3.2.2.3.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p2084016540466">native.cache.expiry.time</p>
</td>
<td class="cellrowborder" valign="top" width="71.93%" headers="mcps1.3.4.3.3.2.2.3.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p13840454124613">Expiration time of cache items that have not been accessed.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p3840954104614">Default value: <strong id="css_01_0466__b149113642121741">24h</strong></p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row18840954194618"><td class="cellrowborder" valign="top" width="28.07%" headers="mcps1.3.4.3.3.2.2.3.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1684015418465">native.vector.index_threads</p>
</td>
<td class="cellrowborder" valign="top" width="71.93%" headers="mcps1.3.4.3.3.2.2.3.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p14840105414463">Number of threads used for building underlying indexes. Each shard uses multiple threads, so set a relatively small value to prevent too many build threads from competing for resources.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p48407544465">Default value: <strong id="css_01_0466__b116521378621741">4</strong></p>
</td>
</tr>
</tbody>
</table>
</div>
</li></ul>
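<p>The settings described above can be applied in Kibana Dev Tools. The following requests are a minimal sketch rather than a verified procedure: they assume that the parameters in <a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_table14840054154616">Table 1</a> can be updated dynamically through the standard cluster settings API, and they use a hypothetical existing index named <strong>my_index</strong> for the index-level settings. The values shown are illustrative (the cluster-level values are the documented defaults).</p>
<pre class="screen">PUT _cluster/settings
{
  "persistent": {
    "native.cache.circuit_breaker.enabled": true,
    "native.cache.circuit_breaker.cpu.limit": "45%",
    "native.vector.index_threads": 4
  }
}

PUT my_index/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}</pre>
<p>After the offline import is complete, the same <strong>_settings</strong> request can be sent again with a positive <strong>refresh_interval</strong> (for example, <strong>1s</strong>) and the number of replicas your service requires.</p>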
</div>
<div class="section" id="css_01_0466__css_01_0121_section68017273556"><a name="css_01_0466__css_01_0121_section68017273556"></a><a name="css_01_0121_section68017273556"></a><h4 class="sectiontitle">(Optional) Pre-Building and Registering a Center Point Vector</h4><p id="css_01_0466__css_01_0121_p98011327145516">If you select the <span class="parmvalue" id="css_01_0466__parmvalue101535892821741"><b>IVF_GRAPH</b></span> or <span class="parmvalue" id="css_01_0466__parmvalue30363563721741"><b>IVF_GRAPH_PQ</b></span> index algorithm when creating a vector index, you need to pre-build and register a center point vector.</p>
<p id="css_01_0466__css_01_0121_p18801627135515">The vector index acceleration algorithms <strong id="css_01_0466__b209720158321741">IVF_GRAPH</strong> and <strong id="css_01_0466__b133156924621741">IVF_GRAPH_PQ</strong> are suitable for ultra-large-scale computing. Both algorithms narrow down the query scope by dividing the vector space into subspaces through clustering or random sampling. Before pre-building, you need to obtain all center point vectors by clustering or random sampling. The center point vectors are then pre-built into a GRAPH or GRAPH_PQ index and registered with the OpenSearch cluster. All nodes in the cluster can share this index file. Reusing the center index across shards reduces the training overhead and the number of center index queries, which improves write and query performance.</p>
<ol id="css_01_0466__css_01_0121_ol20801162712555"><li id="css_01_0466__css_01_0121_li1580172720557">On the <strong id="css_01_0466__b159098141721741">Clusters</strong> page, locate the target cluster, and click <strong id="css_01_0466__b200150525821741">Access Kibana</strong> in the <strong id="css_01_0466__b206682843021741">Operation</strong> column.</li><li id="css_01_0466__css_01_0121_li68018270553">Click <strong id="css_01_0466__b31857702121741">Dev Tools</strong> in the navigation tree on the left.</li><li id="css_01_0466__css_01_0121_li78011227125516">Create a center point index.<ul id="css_01_0466__css_01_0121_ul168013275553"><li id="css_01_0466__css_01_0121_li198011627135518">The following example creates an index named <strong id="css_01_0466__b213430821221741">my_dict</strong>. <strong id="css_01_0466__b111752376821741">number_of_shards</strong> of the index must be set to <strong id="css_01_0466__b43445141821741">1</strong>. Otherwise, the index cannot be registered.</li><li id="css_01_0466__css_01_0121_li4802102714555">If you want to use the <strong id="css_01_0466__b51286844321741">IVF_GRAPH</strong> index, set <strong id="css_01_0466__b43278100621741">algorithm</strong> of the center point index to <strong id="css_01_0466__b19257603621741">GRAPH</strong>.</li><li id="css_01_0466__css_01_0121_li1080282765512">If you want to use the <strong id="css_01_0466__b152929666221741">IVF_GRAPH_PQ</strong> index, set <strong id="css_01_0466__b108082272121741">algorithm</strong> of the center point index to <strong id="css_01_0466__b90683856921741">GRAPH_PQ</strong>.</li></ul>
<pre class="screen" id="css_01_0466__css_01_0121_screen11802172712557">PUT my_dict
{
  "settings": {
    "index": {
      "vector": true
    },
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      }
    }
  }
}</pre>
</li><li id="css_01_0466__css_01_0121_li1580211270559">Write the center point vector to the created index.<p id="css_01_0466__css_01_0121_p18025271559"><a name="css_01_0466__css_01_0121_li1580211270559"></a><a name="css_01_0121_li1580211270559"></a>Write the center point vector obtained through sampling or clustering into the created <strong id="css_01_0466__b118149984921741">my_dict</strong> index. For details, see <a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137931314240">Importing Vector Data</a>.</p>
</li><li id="css_01_0466__css_01_0121_li5802182775514">Call the registration API.<p id="css_01_0466__css_01_0121_p1880262725512"><a name="css_01_0466__css_01_0121_li5802182775514"></a><a name="css_01_0121_li5802182775514"></a>Register the created <strong id="css_01_0466__b85622570021741">my_dict</strong> index as a <strong id="css_01_0466__b187653640921741">Dict</strong> object with a globally unique name (<strong id="css_01_0466__b207917590721741">dict_name</strong>).</p>
<pre class="screen" id="css_01_0466__css_01_0121_screen1680212775516">PUT _vector/register/my_dict
{
  "dict_name": "my_dict"
}</pre>
</li><li id="css_01_0466__css_01_0121_li1780242714555">Create an <strong id="css_01_0466__b85386421421741">IVF_GRAPH</strong> or <strong id="css_01_0466__b96781201621741">IVF_GRAPH_PQ</strong> index.<p id="css_01_0466__css_01_0121_p18025279559">You do not need to specify the dimension or metric information. Simply specify the registered dictionary name.</p>
<pre class="screen" id="css_01_0466__css_01_0121_screen14802112795510">PUT my_index
{
  "settings": {
    "index": {
      "vector": true,
      "sort.field": "my_vector.centroid" # Set the centroid subfield of each vector field as a sorting field.
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "indexing": true,
        "algorithm": "IVF_GRAPH",
        "dict_name": "my_dict",
        "offload_ivf": true
      }
    }
  }
}</pre>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="css_01_0466__css_01_0121_table16803127125519" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Field mappings parameters</caption><thead align="left"><tr id="css_01_0466__css_01_0121_row3803727185512"><th align="left" class="cellrowborder" valign="top" width="27.27%" id="mcps1.3.5.4.6.5.2.3.1.1"><p id="css_01_0466__css_01_0121_p88031327175517">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="72.72999999999999%" id="mcps1.3.5.4.6.5.2.3.1.2"><p id="css_01_0466__css_01_0121_p2080372715512">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0466__css_01_0121_row48032027135518"><td class="cellrowborder" valign="top" width="27.27%" headers="mcps1.3.5.4.6.5.2.3.1.1 "><p id="css_01_0466__css_01_0121_p14803132745520">dict_name</p>
</td>
<td class="cellrowborder" valign="top" width="72.72999999999999%" headers="mcps1.3.5.4.6.5.2.3.1.2 "><p id="css_01_0466__css_01_0121_p780392725511">Name of the registered center point index (Dict) that this index depends on. The index uses the same vector dimension and metric as the Dict index.</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_row2803132718551"><td class="cellrowborder" valign="top" width="27.27%" headers="mcps1.3.5.4.6.5.2.3.1.1 "><p id="css_01_0466__css_01_0121_p1580352716554">offload_ivf</p>
</td>
<td class="cellrowborder" valign="top" width="72.72999999999999%" headers="mcps1.3.5.4.6.5.2.3.1.2 "><p id="css_01_0466__css_01_0121_p18031027175519">Whether to offload the IVF inverted index implemented by the underlying index to OpenSearch. Offloading reduces off-heap memory usage and the overhead of write and merge operations. You are advised to set this parameter to <strong id="css_01_0466__b140276548221741">true</strong>.</p>
<p id="css_01_0466__css_01_0121_p1180382712557">Value: <strong id="css_01_0466__b36025301821741">true</strong> or <strong id="css_01_0466__b18804673321741">false</strong>.</p>
<p id="css_01_0466__css_01_0121_p88031927145512">Default value: <strong id="css_01_0466__b58950339521741">false</strong></p>
</td>
</tr>
</tbody>
</table>
</div>
</li></ol>
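<p>Step 4 above (writing the center point vectors) can be sketched as a bulk write, which must be done before the registration API is called. This is an illustrative example only: the two vectors are made-up center points with the same dimension (2) as the <strong>my_dict</strong> mapping, and it assumes that center point vectors are written through the standard bulk API like any other document. In practice, the vectors come from your own clustering or random sampling.</p>
<pre class="screen">POST my_dict/_bulk
{"index": {}}
{"my_vector": [1.0, 2.0]}
{"index": {}}
{"my_vector": [3.0, 4.0]}</pre>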
</div>
<div class="section" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137344225249"><a name="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137344225249"></a><a name="css_01_0121_en-us_topic_0000001309709789_section137344225249"></a><h4 class="sectiontitle">Creating a Vector Index</h4><ol id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ol927111214106"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li3555152983620">Log in to the CSS management console.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li1274935516811">Choose <strong id="css_01_0466__b128059053321741">Clusters</strong> in the navigation pane. On the <span class="uicontrol" id="css_01_0466__uicontrol109591233921741"><b>Clusters</b></span> page, locate the target cluster and click <span class="uicontrol" id="css_01_0466__uicontrol206109455121741"><b>Access Kibana</b></span> in the <strong id="css_01_0466__b74632166421741">Operation</strong> column.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li927171291011">Click <strong id="css_01_0466__b163708297221741">Dev Tools</strong> in the navigation tree on the left and run the following command to create a vector index.<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p201369284315">Create an index named <strong id="css_01_0466__b138843428121741">my_index</strong> that contains a vector field <strong id="css_01_0466__b166986034121741">my_vector</strong> and a keyword field <strong id="css_01_0466__b79289023321741">my_label</strong>. A graph index is built on the vector field, and Euclidean distance is used to measure similarity.</p>
<pre class="screen" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_screen1122662184413">PUT <strong id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_b36573428595">my_index</strong>
{
  "settings": {
    "index": {
      "vector": true
    }
  },
  "mappings": {
    "properties": {
      "<strong id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_b5599194810597">my_vector</strong>": {
        "type": "vector",
        "dimension": 2,
        "indexing": true,
        "algorithm": "GRAPH",
        "metric": "euclidean"
      },
      "<strong id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_b2666125135918">my_label</strong>": {
        "type": "keyword"
      }
    }
  }
}</pre>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_table189861827103114" frame="border" border="1" rules="all"><caption><b>Table 3 </b>Parameters for creating an index</caption><thead align="left"><tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row313720280311"><th align="left" class="cellrowborder" valign="top" width="15.9%" id="mcps1.3.6.2.3.4.2.4.1.1"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p513762803116">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="23.810000000000002%" id="mcps1.3.6.2.3.4.2.4.1.2"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p161374285318">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="60.29%" id="mcps1.3.6.2.3.4.2.4.1.3"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p15137828193115">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row813742811312"><td class="cellrowborder" valign="top" width="15.9%" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p913772817312">Index settings parameters</p>
</td>
<td class="cellrowborder" valign="top" width="23.810000000000002%" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p21379285311">vector</p>
</td>
<td class="cellrowborder" valign="top" width="60.29%" headers="mcps1.3.6.2.3.4.2.4.1.3 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p161371728183114">To use a vector index, set this parameter to <strong id="css_01_0466__b182299471121741">true</strong>.</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row10137192819314"><td class="cellrowborder" rowspan="7" valign="top" width="15.9%" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p213752833114">Field mappings parameters</p>
</td>
<td class="cellrowborder" valign="top" width="23.810000000000002%" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p3137192833111">type</p>
</td>
<td class="cellrowborder" valign="top" width="60.29%" headers="mcps1.3.6.2.3.4.2.4.1.3 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p10137528193113">Field type, for example, <strong id="css_01_0466__b40776543121741">vector</strong>.</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row11137102816314"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p9137162819314">dimension</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1813722813114">Vector dimensionality. Value range: [1, 4096]</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row11375285319"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p41371928143113">indexing</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p847320141736">Whether to enable vector index acceleration.</p>
<div class="p" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1446833612412">The value can be:<ul id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ul14231831846"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li172319311418"><strong id="css_01_0466__b184076979221741">false</strong>: disables vector index acceleration. If this parameter is set to <strong id="css_01_0466__b200471708321741">false</strong>, vector data is written only to docvalues, and only <strong id="css_01_0466__b169211927421741">ScriptScore</strong> and <strong id="css_01_0466__b101255844721741">Rescore</strong> can be used for vector query.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li1923123120417"><strong id="css_01_0466__b136565770021741">true</strong>: enables vector index acceleration. If this parameter is set to <strong id="css_01_0466__b182454369521741">true</strong>, an extra vector index is created. The index algorithm is specified by the <strong id="css_01_0466__b12478536421741">algorithm</strong> field and <strong id="css_01_0466__b113428452121741">VectorQuery</strong> can be used for data query.</li></ul>
</div>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p9137192817313">Default value: <strong id="css_01_0466__b192797285421741">false</strong></p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row17137928133115"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p20137928103120">algorithm</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p31371028153117">Index algorithm. This parameter is valid only when <span class="parmname" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_parmname1892551140"><b>indexing</b></span> is set to <span class="parmvalue" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_parmvalue34381757842"><b>true</b></span>.</p>
<div class="p" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p59041981956">The value can be:<ul id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ul13137192843118"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li9137102883117"><strong id="css_01_0466__b115925554921741">FLAT</strong>: brute-force algorithm that calculates the distance between the target vector and every stored vector in sequence. It is compute-intensive, but its recall rate reaches 100%. Use this algorithm if you require the highest recall.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li1013752814316"><strong id="css_01_0466__b84794640121741">GRAPH</strong>: Hierarchical Navigable Small World (HNSW) algorithm for graph indexes. This algorithm is mainly used in scenarios where high performance and precision are required and a single shard holds fewer than 10 million records.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li9138128173111"><strong id="css_01_0466__b10365444821741">GRAPH_PQ</strong>: combination of the HNSW algorithm and the product quantization (PQ) algorithm. PQ reduces the storage overhead of original vectors, so that HNSW can search among hundreds of millions of records.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li3138142873115"><strong id="css_01_0466__b96948581121741">IVF_GRAPH</strong>: combination of the inverted file (IVF) algorithm and HNSW. The entire space is divided into multiple clusters, each represented by a centroid, which makes search much faster at a slight cost in accuracy. Use this algorithm if you require high performance when searching among hundreds of millions of records.</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li11138112811312"><strong id="css_01_0466__b39478035921741">IVF_GRAPH_PQ</strong>: combination of the PQ algorithm with the IVF and HNSW algorithms to further improve the system capacity and reduce the system overhead. This algorithm is applicable to scenarios where shards hold more than 1 billion records and high retrieval performance is required.</li></ul>
</div>
<div class="p" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1992814222519">Default value: <strong id="css_01_0466__b115393250921741">GRAPH</strong><div class="note" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_note49515161850"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p19955161253">If <strong id="css_01_0466__b97153489521741">IVF_GRAPH</strong> or <strong id="css_01_0466__b212077091421741">IVF_GRAPH_PQ</strong> is specified, you need to pre-build and register a center point index. For details, see <a href="#css_01_0466__css_01_0121_section68017273556">(Optional) Pre-Building and Registering a Center Point Vector</a>.</p>
</div></div>
</div>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row17361132932517"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1361102910258">Other optional parameters</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p3361172917256">If <strong id="css_01_0466__b144446857421741">indexing</strong> is set to <strong id="css_01_0466__b206826403021741">true</strong>, CSS provides optional parameters for vector search that you can configure to achieve higher query performance or precision. For more information, see <a href="#css_01_0466__css_01_0121_en-us_topic_0000001309709789_table9916164920432">Table 4</a>.</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row1138828193117"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p16138152843114">metric</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p12803936171619">Method of calculating the distance between vectors.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p580012449167">The value can be:</p>
<ul id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ul275905161620"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li3759195117167"><strong id="css_01_0466__b73236964221741">euclidean</strong>: Euclidean distance</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li1976613910173"><strong id="css_01_0466__b85870728421741">inner_product</strong>: inner product distance</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li2983719191718"><strong id="css_01_0466__b11647501721741">cosine</strong>: cosine distance</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li116272039161717"><strong id="css_01_0466__b83041801721741">hamming</strong>: Hamming distance, which can be used only when <strong id="css_01_0466__b139289115821741">dim_type</strong> is set to <strong id="css_01_0466__b44087403221741">binary</strong>.</li></ul>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p713810288312">Default value: <strong id="css_01_0466__b120335799121741">euclidean</strong></p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_row167855103202"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.1 "><p id="css_01_0466__css_01_0121_p478561011201">dim_type</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.4.2.4.1.2 "><p id="css_01_0466__css_01_0121_p778531016206">Type of the vector dimension value.</p>
<p id="css_01_0466__css_01_0121_p13376105710208">The value can be <strong id="css_01_0466__b80022027921741">binary</strong> or <strong id="css_01_0466__b178541233621741">float</strong> (default).</p>
</td>
</tr>
</tbody>
</table>
</div>
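<p>As an illustration of how the parameters in Table 3 combine, the following sketch creates a hypothetical index named <strong>my_flat_index</strong> whose vector field uses the <strong>FLAT</strong> algorithm and cosine distance. The index name and the dimension of 128 are assumptions chosen for this example. Only parameter values listed in Table 3 are used.</p>
<pre class="screen">PUT my_flat_index
{
  "settings": {
    "index": {
      "vector": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "vector",
        "dimension": 128,
        "indexing": true,
        "algorithm": "FLAT",
        "metric": "cosine"
      }
    }
  }
}</pre>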
<div class="tablenoborder"><a name="css_01_0466__css_01_0121_en-us_topic_0000001309709789_table9916164920432"></a><a name="css_01_0121_en-us_topic_0000001309709789_table9916164920432"></a><table cellpadding="4" cellspacing="0" summary="" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_table9916164920432" frame="border" border="1" rules="all"><caption><b>Table 4 </b>Optional parameters</caption><thead align="left"><tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row13952749114315"><th align="left" class="cellrowborder" valign="top" width="15.15%" id="mcps1.3.6.2.3.5.2.4.1.1"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p169521949154312">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="26.82%" id="mcps1.3.6.2.3.5.2.4.1.2"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p109521249144312">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="58.03%" id="mcps1.3.6.2.3.5.2.4.1.3"><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1695214913438">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row49522492436"><td class="cellrowborder" rowspan="5" valign="top" width="15.15%" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p13952144974312">Graph index configuration parameters</p>
</td>
<td class="cellrowborder" valign="top" width="26.82%" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p395214916432">neighbors</p>
</td>
<td class="cellrowborder" valign="top" width="58.03%" headers="mcps1.3.6.2.3.5.2.4.1.3 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p146721318450">Number of neighbors of each vector in a graph index. The default value is <strong id="css_01_0466__b160278737521741">64</strong>. A larger value indicates higher query precision but produces a larger index, which slows down both index build and query.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p195264919438">Value range: [10, 255]</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row10952184915433"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p10952449134318">shrink</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p42315159456">Pruning coefficient used during HNSW build. The default value is <strong id="css_01_0466__b40844209521741">1.0f</strong>.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p13952144910430">Value range: (0.1, 10)</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row12952134919436"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p29521749194312">scaling</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p6353141934511">Scaling ratio of the upper-layer graph nodes during HNSW build. The default value is <strong id="css_01_0466__b209544438821741">50</strong>.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p79527493437">Value range: (0, 128]</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row12952249194310"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1695284974314">efc</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1449215201459">Size of the neighbor node queue used during HNSW build. The default value is <strong id="css_01_0466__b62071752621741">200</strong>. A larger value yields higher precision but a slower build.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p7952349194313">Value range: (0, 100000]</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row1495216496438"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p119529494438">max_scan_num</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_p12112134410225">Maximum number of nodes that can be scanned. The default value is <strong id="css_01_0466__b32485014321741">10000</strong>. A larger value yields higher precision but slower indexing.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p10952114944310">Value range: (0, 1000000]</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row1395216498438"><td class="cellrowborder" rowspan="2" valign="top" width="15.15%" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p18952194914431">PQ index configuration parameters</p>
</td>
<td class="cellrowborder" valign="top" width="26.82%" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p0952149114312">centroid_num</p>
</td>
<td class="cellrowborder" valign="top" width="58.03%" headers="mcps1.3.6.2.3.5.2.4.1.3 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p442382274517">Number of cluster centroids of each fragment. The default value is <strong id="css_01_0466__b8695251421741">255</strong>.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1095234911438">Value range: (0, 65535]</p>
</td>
</tr>
<tr id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_row795254913434"><td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.1 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p395294915433">fragment_num</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.6.2.3.5.2.4.1.2 "><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1631132312459">Number of fragments. The default value is <strong id="css_01_0466__b120628928221741">0</strong>, in which case the plug-in automatically sets the number of fragments based on the vector length.</p>
<p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p1495284910437">Value range: [0, 4096]</p>
</td>
</tr>
</tbody>
</table>
</div>
</li></ol>
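<p>The value ranges in Table 4 can be checked on the client side before an index creation request is sent. The following Python sketch is illustrative only: the function name <strong>check_optional_params</strong> and the dictionary layout are hypothetical and are not part of the plug-in. Only the parameter names and ranges are taken from Table 4.</p>

```python
# Value ranges copied from Table 4.
# Each entry is (low, high, low_inclusive, high_inclusive).
OPTIONAL_PARAM_RANGES = {
    "neighbors": (10, 255, True, True),
    "shrink": (0.1, 10, False, False),
    "scaling": (0, 128, False, True),
    "efc": (0, 100000, False, True),
    "max_scan_num": (0, 1000000, False, True),
    "centroid_num": (0, 65535, False, True),
    "fragment_num": (0, 4096, True, True),
}


def check_optional_params(params):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in params.items():
        if name not in OPTIONAL_PARAM_RANGES:
            problems.append(f"{name} is not listed in Table 4")
            continue
        low, high, low_inclusive, high_inclusive = OPTIONAL_PARAM_RANGES[name]
        # Compare against each bound, honoring open and closed interval ends.
        above_low = value >= low if low_inclusive else value > low
        below_high = high >= value if high_inclusive else high > value
        if not (above_low and below_high):
            problems.append(f"{name}={value} is outside the range in Table 4")
    return problems


# Hypothetical usage: the first call passes, the second reports one problem.
print(check_optional_params({"neighbors": 64, "efc": 200}))
print(check_optional_params({"neighbors": 9}))
```

<p>A check like this only catches out-of-range values early. The cluster remains the authority on which values it accepts.</p>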
</div>
<div class="section" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137931314240"><a name="css_01_0466__css_01_0121_en-us_topic_0000001309709789_section137931314240"></a><a name="css_01_0121_en-us_topic_0000001309709789_section137931314240"></a><h4 class="sectiontitle">Importing Vector Data</h4><p id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p124892526317">Run one of the following commands to import vector data. When writing to the <strong id="css_01_0466__b73807806821741">my_index</strong> index, specify the vector field name and the vector data.</p>
<ul id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ul14457177133419"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li134576710342">If the input vector data is a comma-separated array of floating-point numbers:<pre class="screen" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_screen122366212715">POST my_index/_doc
{
"my_vector": <em id="css_01_0466__css_01_0121_i148491717241">[1.0, 2.0]</em>
}</pre>
</li></ul>
<ul id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_ul2023361373419"><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li1823312136345">If the input vector data is a Base64-encoded string in little-endian byte order:<div class="p" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_p18887114863414"><a name="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li1823312136345"></a><a name="css_01_0121_en-us_topic_0000001309709789_li1823312136345"></a>For binary vectors or high-dimensional vectors that have a large number of valid bits, the Base64 encoding format is more efficient for data transmission and parsing.<pre class="screen" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_screen13775245173413">POST my_index/_doc
{
"my_vector": <em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i17751645123420">"AACAPwAAAEA="</em>
}</pre>
</div>
</li><li id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_li9687175817347">To write a large amount of data, use bulk operations.<pre class="screen" id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_screen17698287350">POST my_index/_bulk
{"index": {}}
{"my_vector": <em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i1374193123516">[1.0, 2.0]</em>, "my_label": "<em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i33288398359">red</em>"}
{"index": {}}
{"my_vector": <em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i7718933113512">[2.0, 2.0]</em>, "my_label": "<em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i1114542173514">green</em>"}
{"index": {}}
{"my_vector": <em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i676403614353">[2.0, 3.0]</em>, "my_label": "<em id="css_01_0466__css_01_0121_en-us_topic_0000001309709789_i11636114583519">red</em>"}</pre>
</li></ul>
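<p>The request bodies shown above can also be generated by a client program. The following Python sketch is a minimal illustration, not an official client: the function names <strong>encode_vector_base64</strong> and <strong>build_bulk_body</strong> are hypothetical. It assumes that each vector element is a 32-bit float, which is consistent with the sample string <strong>AACAPwAAAEA=</strong> decoding to [1.0, 2.0]. Sending the request over HTTP is left out.</p>

```python
import array
import base64
import json
import sys


def encode_vector_base64(vector):
    """Base64-encode a vector as little-endian 32-bit floats (assumed element type)."""
    floats = array.array("f", vector)  # "f" is a C float, 4 bytes on common platforms
    if sys.byteorder == "big":
        floats.byteswap()  # force little-endian byte order on big-endian hosts
    return base64.b64encode(floats.tobytes()).decode("ascii")


def build_bulk_body(docs):
    """Build the newline-delimited body for POST my_index/_bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # action line
        lines.append(json.dumps(doc))            # document line
    # A bulk body is newline-delimited and ends with a newline.
    return "\n".join(lines) + "\n"


print(encode_vector_base64([1.0, 2.0]))  # prints AACAPwAAAEA=

docs = [
    {"my_vector": [1.0, 2.0], "my_label": "red"},
    {"my_vector": [2.0, 2.0], "my_label": "green"},
]
print(build_bulk_body(docs), end="")
```

<p>The field names <strong>my_vector</strong> and <strong>my_label</strong> match the examples above. Replace them with the field names defined in your own index mapping.</p>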
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="css_01_0464.html">Configuring Vector Search for OpenSearch Clusters</a></div>
</div>
</div>