
<a name="css_01_0118"></a><a name="css_01_0118"></a>

<h1 class="topictitle1">About Vector Search</h1>
<div id="body0000001081670505"><p id="css_01_0118__en-us_topic_0000001223594448_p443014484225">Unstructured data, such as images, videos, and language corpora, is converted into vectors, which are searched based on similarity using either an exact or approximate nearest neighbors algorithm.</p>
<div class="section" id="css_01_0118__en-us_topic_0000001223594448_section45446533237"><h4 class="sectiontitle">How It Works</h4><p id="css_01_0118__en-us_topic_0000001223594448_p1892712312611">Vector search works in a way similar to traditional search. To improve vector search performance, we need to:</p>
<ul id="css_01_0118__en-us_topic_0000001223594448_ul18466630152615"><li id="css_01_0118__en-us_topic_0000001223594448_li1466530102616"><strong id="css_01_0118__en-us_topic_0000001223594448_b9200126114717">Narrow down the matched scope</strong><p id="css_01_0118__en-us_topic_0000001223594448_p109274234263">Like traditional text search, vector search uses indexes to accelerate the search instead of scanning all data. Traditional text search uses inverted indexes to filter out irrelevant documents, whereas vector search builds indexes on vectors to bypass irrelevant vectors, narrowing down the search scope.</p>
</li><li id="css_01_0118__en-us_topic_0000001223594448_li148363592612"><strong id="css_01_0118__en-us_topic_0000001223594448_b1729973134720">Reduce the complexity of calculating a single vector</strong><p id="css_01_0118__en-us_topic_0000001223594448_p1892715234266">Vector search can first quantize and approximate high-dimensional vectors to obtain a smaller, more relevant candidate set. More sophisticated algorithms are then applied to this smaller set to perform computation and sorting. Because the complex computation runs on only part of the vectors, efficiency improves.</p>
</li></ul>
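<p>The second idea above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the CSS implementation: the rounding-based <strong>quantize</strong> function, the <strong>step</strong> and <strong>shortlist_size</strong> parameters, and the function names are all hypothetical and chosen only for this example.</p>

```python
import math


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quantize(vec, step=0.5):
    """Coarse integer code: each component rounded to a multiple of step (assumed scheme)."""
    return tuple(round(x / step) for x in vec)


def coarse_distance(code_a, code_b):
    """Cheap squared distance computed on the integer codes only."""
    return sum((x - y) ** 2 for x, y in zip(code_a, code_b))


def two_stage_search(query, vectors, k=1, shortlist_size=3):
    """Return indices of the k best candidates, nearest first."""
    # A real index would store the codes at build time; they are
    # recomputed here only to keep the sketch short.
    codes = [quantize(v) for v in vectors]
    query_code = quantize(query)

    # Stage 1: rank all vectors by the cheap approximate distance
    # and keep a small shortlist.
    by_coarse = sorted(range(len(vectors)),
                       key=lambda i: coarse_distance(codes[i], query_code))
    shortlist = by_coarse[:shortlist_size]

    # Stage 2: run the exact distance on the shortlist only.
    reranked = sorted(shortlist, key=lambda i: euclidean(query, vectors[i]))
    return reranked[:k]
```

<p>A shortlist that is too small can miss a true nearest neighbor, which is the usual trade-off between speed and recall in approximate search.</p>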
<p id="css_01_0118__en-us_topic_0000001223594448_p4672653151113">Vector search retrieves the k-nearest neighbors (KNN) of a query vector from a given vector dataset, using a specific distance or similarity metric. CSS generally focuses on approximate nearest neighbor (ANN) search, because an exact KNN search requires excessive computational resources.</p>
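<p>For reference, an exact KNN search can be sketched as a brute-force scan. This is a minimal illustration that assumes Euclidean distance and in-memory Python lists; the function names are hypothetical and do not correspond to any CSS API. Because every stored vector is compared with the query, the cost grows with the size of the dataset, which is why ANN methods are preferred at scale.</p>

```python
import heapq
import math


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k):
    """Return the indices of the k vectors closest to query, nearest first."""
    # Every vector is visited once, so the work is proportional to
    # len(vectors) multiplied by the vector dimensionality.
    return heapq.nsmallest(
        k,
        range(len(vectors)),
        key=lambda i: euclidean(query, vectors[i]),
    )
```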
</div>
<div class="section" id="css_01_0118__en-us_topic_0000001223594448_section1682114111274"><h4 class="sectiontitle">Vector Search in CSS</h4><p id="css_01_0118__en-us_topic_0000001223594448_p368234122719">The CSS vector search engine integrates a variety of vector indexes, such as brute-force search, Hierarchical Navigable Small World (HNSW) graphs, product quantization, and IVF-HNSW. It also supports multiple similarity calculation methods, such as Euclidean, inner product, cosine, and Hamming. The recall rate and retrieval performance of the engine are better than those of open-source engines. It can meet the requirements for high performance, high precision, low costs, and multi-modal computation.</p>
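<p>The four similarity calculation methods named above can be written out as follows. These are the textbook definitions in plain Python, given only as a sketch; the function names are hypothetical, and how CSS normalizes or scores each metric internally is not described here.</p>

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


def hamming_distance(a, b):
    """Number of positions that differ; typically used for binary vectors."""
    return sum(1 for x, y in zip(a, b) if x != y)
```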
<p id="css_01_0118__en-us_topic_0000001223594448_p3682194119271">The search engine also supports all capabilities of native Elasticsearch, including distribution, multi-replica, error recovery, snapshot, and permission control. It is compatible with the native Elasticsearch ecosystem, including the cluster monitoring tool Cerebro, the visualization tool Kibana, and the real-time data ingestion tool Logstash. Clients in several languages, such as Python, Java, Go, and C++, are supported.</p>
</div>
<div class="section" id="css_01_0118__en-us_topic_0000001223594448_section15508103014206"><h4 class="sectiontitle">Constraints</h4><ul id="css_01_0118__en-us_topic_0000001223594448_ul13628842142018"><li id="css_01_0118__li0239191816816">The built-in CSS vector search engine is available for Elasticsearch 7.6.2 and 7.10.2 clusters only.</li><li id="css_01_0118__en-us_topic_0000001223594448_li1462834282015">The vector search plug-in performs in-memory computing and requires more memory than common indexes do. You are advised to use memory-optimized compute specifications.</li></ul>
</div>
<div class="section" id="css_01_0118__section18221195417136"><a name="css_01_0118__section18221195417136"></a><a name="section18221195417136"></a><h4 class="sectiontitle">Cluster Node Specifications Selection for Vector Search</h4><p id="css_01_0118__en-us_topic_0000001261909812_p2804352104215">Off-heap memory is used for index construction and query in vector search. Therefore, the required cluster capacity is related to the index type and off-heap memory size. You can estimate the off-heap memory required by full indexing to select the appropriate cluster specifications. Due to high memory usage of vector search, CSS disables the vector search plug-in by default for clusters whose memory is 8 GB or less.</p>
<div class="p" id="css_01_0118__en-us_topic_0000001261909812_p1216235123617">There are different methods for estimating the size of off-heap memory required by different types of indexes. The calculation formulas are as follows:<ul id="css_01_0118__en-us_topic_0000001261909812_ul1626234143110"><li id="css_01_0118__en-us_topic_0000001261909812_li122610343315"><strong id="css_01_0118__b2042219402161">GRAPH index</strong><p id="css_01_0118__p3907152119169"><span id="css_01_0118__ph174514153911">mem_needs = (dim x dim_size + neighbors x 4) x num + delta</span></p>
<div class="note" id="css_01_0118__en-us_topic_0000001261909812_note264165903915"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="css_01_0118__en-us_topic_0000001261909812_p1164155923917">If you need to update indexes in real time, also account for the off-heap memory overhead of vector index construction and automatic merging. The actual <strong id="css_01_0118__b1788741085413">mem_needs</strong> is at least 1.5 to 2 times the original estimate.</p>
</div></div>
</li><li id="css_01_0118__en-us_topic_0000001261909812_li4981114418312"><strong id="css_01_0118__en-us_topic_0000001261909812_b11437195014419">PQ index</strong><p id="css_01_0118__p25749308240">mem_needs = frag_num x frag_size x num + delta</p>
</li><li id="css_01_0118__en-us_topic_0000001261909812_li85212188414"><strong id="css_01_0118__en-us_topic_0000001261909812_b16836553134119">FLAT and IVF indexes</strong><p id="css_01_0118__p1318663214249">mem_needs = dim x dim_size x num + delta</p>
</li></ul>

<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="css_01_0118__en-us_topic_0000001261909812_table1875348443" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="css_01_0118__en-us_topic_0000001261909812_row276174134419"><th align="left" class="cellrowborder" valign="top" width="28.89%" id="mcps1.3.5.3.2.2.3.1.1"><p id="css_01_0118__en-us_topic_0000001261909812_p1876841448">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="71.11%" id="mcps1.3.5.3.2.2.3.1.2"><p id="css_01_0118__en-us_topic_0000001261909812_p14761464413">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0118__en-us_topic_0000001261909812_row107613413445"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p0761943444">dim</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p27614484419">Vector dimensionality</p>
</td>
</tr>
<tr id="css_01_0118__en-us_topic_0000001261909812_row47616484416"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p15767474413">neighbors</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p187644154418">Number of neighbors of a graph node. The default value is <strong id="css_01_0118__b116222947734835">64</strong>.</p>
</td>
</tr>
<tr id="css_01_0118__en-us_topic_0000001261909812_row197616434417"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p6763411446">dim_size</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p16769444420">Number of bytes required by each dimension. The default value is 4 bytes, the size of the float type.</p>
</td>
</tr>
<tr id="css_01_0118__en-us_topic_0000001261909812_row6764416440"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p18765444414">num</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p6761845447">Total number of vectors</p>
</td>
</tr>
<tr id="css_01_0118__en-us_topic_0000001261909812_row97613454416"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p137611413447">delta</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p157613464410">Metadata size. This term can be omitted from the estimate.</p>
</td>
</tr>
<tr id="css_01_0118__en-us_topic_0000001261909812_row27612424419"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p076144114413">frag_num</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p276114154418">Number of vector segments during quantization and coding. If this parameter is not specified when an index is created, the value is determined by vector dimensionality <span class="parmname" id="css_01_0118__en-us_topic_0000001261909812_parmname32491715164713"><b>dim</b></span>.</p>
<pre class="screen" id="css_01_0118__en-us_topic_0000001261909812_screen28239236473">if dim &lt;= 256:
  frag_num = dim / 4
elif dim &lt;= 512:
  frag_num = dim / 8
else:
  frag_num = 64</pre>
</td>
</tr>
<tr id="css_01_0118__en-us_topic_0000001261909812_row187684104410"><td class="cellrowborder" valign="top" width="28.89%" headers="mcps1.3.5.3.2.2.3.1.1 "><p id="css_01_0118__en-us_topic_0000001261909812_p776846446">frag_size</p>
</td>
<td class="cellrowborder" valign="top" width="71.11%" headers="mcps1.3.5.3.2.2.3.1.2 "><p id="css_01_0118__en-us_topic_0000001261909812_p87604154416">Size of the center point during quantization and coding. The default value is 1.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
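<p>The three formulas and the default <strong>frag_num</strong> rule can be combined into a small estimator. This is a sketch only, with hypothetical function names. It assumes that results are in bytes (because <strong>dim_size</strong> is given in bytes), that <strong>frag_size</strong> is also measured in bytes, that <strong>delta</strong> defaults to 0, and that the division in the <strong>frag_num</strong> rule is integer division.</p>

```python
def default_frag_num(dim):
    """Default number of vector segments, following the rule in Table 1."""
    if dim <= 256:
        return dim // 4  # integer division is an assumption
    elif dim <= 512:
        return dim // 8
    return 64


def graph_mem_bytes(dim, num, dim_size=4, neighbors=64, delta=0):
    """GRAPH index: (dim x dim_size + neighbors x 4) x num + delta."""
    return (dim * dim_size + neighbors * 4) * num + delta


def pq_mem_bytes(dim, num, frag_num=None, frag_size=1, delta=0):
    """PQ index: frag_num x frag_size x num + delta."""
    if frag_num is None:
        frag_num = default_frag_num(dim)
    return frag_num * frag_size * num + delta


def flat_ivf_mem_bytes(dim, num, dim_size=4, delta=0):
    """FLAT and IVF indexes: dim x dim_size x num + delta."""
    return dim * dim_size * num + delta
```

<p>For 10 million 128-dimensional float vectors with default settings, these functions return 7,680,000,000 bytes for a GRAPH index, 5,120,000,000 bytes for a FLAT or IVF index, and 320,000,000 bytes for a PQ index.</p>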
<p id="css_01_0118__en-us_topic_0000001261909812_p19203914115119">These calculation methods can estimate the size of off-heap memory required by a complete vector index. To determine cluster specifications, you also need to consider the heap memory overhead of each node.</p>
<p id="css_01_0118__en-us_topic_0000001261909812_p53134525012">Heap memory allocation policy: The heap memory of each node is half of the node's physical memory, up to a maximum of <strong id="css_01_0118__b107546537334835">31 GB</strong>.</p>
<p id="css_01_0118__en-us_topic_0000001261909812_p16313452502">For example, if you create a GRAPH index for the SIFT10M dataset, with <span class="parmname" id="css_01_0118__en-us_topic_0000001261909812_parmname1113013488515"><b>dim</b></span> set to <span class="parmvalue" id="css_01_0118__en-us_topic_0000001261909812_parmvalue13734161255216"><b>128</b></span>, <span class="parmname" id="css_01_0118__en-us_topic_0000001261909812_parmname8379010185210"><b>dim_size</b></span> to <span class="parmvalue" id="css_01_0118__en-us_topic_0000001261909812_parmvalue1197581115311"><b>4</b></span>, <span class="parmname" id="css_01_0118__en-us_topic_0000001261909812_parmname2411772538"><b>neighbors</b></span> to the default value <span class="parmvalue" id="css_01_0118__en-us_topic_0000001261909812_parmvalue11371171312536"><b>64</b></span>, and <span class="parmname" id="css_01_0118__en-us_topic_0000001261909812_parmname1086192219533"><b>num</b></span> to <span class="parmvalue" id="css_01_0118__en-us_topic_0000001261909812_parmvalue978262955317"><b>10 million</b></span>, the off-heap memory required by the GRAPH index is around 7.5 GB, with delta omitted. Calculation: <span class="parmvalue" id="css_01_0118__parmvalue6249757174112"><b>mem_needs = (128 x 4 + 64 x 4) x 10000000 = 7,680,000,000 bytes ≈ 7.5 GB</b></span>.</p>
<p id="css_01_0118__en-us_topic_0000001261909812_p531317519506">Considering the heap memory overhead, a single server with <strong id="css_01_0118__b552412312592">8 vCPUs</strong> and <strong id="css_01_0118__b2571330334835">16 GB memory</strong> is recommended. If real-time writes or updates are required, choose a node with more memory.</p>
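<p>The sizing reasoning in this example can be checked with a short sketch. The function names are hypothetical. The sketch also assumes that all memory outside the heap is available to vector indexes, which overstates the real budget because the operating system and other processes need memory too. The <strong>realtime_factor</strong> parameter applies the 1.5 to 2 times multiplier from the note on real-time updates.</p>

```python
def heap_gb(physical_gb, max_heap_gb=31):
    """Heap is half of the node's physical memory, capped at 31 GB."""
    return min(physical_gb / 2, max_heap_gb)


def off_heap_budget_gb(physical_gb):
    """Memory left after the heap (assumed fully usable by vector indexes)."""
    return physical_gb - heap_gb(physical_gb)


def fits(mem_needs_gb, physical_gb, realtime_factor=1.0):
    """True if the estimated off-heap need fits in the off-heap budget."""
    return mem_needs_gb * realtime_factor <= off_heap_budget_gb(physical_gb)
```

<p>With the 7.5 GB estimate above, <strong>fits(7.5, 16)</strong> returns <strong>True</strong> because a 16 GB node keeps 8 GB outside the heap. With <strong>realtime_factor=1.5</strong> the same node no longer fits, while a 32 GB node does.</p>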
</div>
</div>