doc-exports/docs/css/umn/css_01_0264.html
zhengxiu 93d856d5c5 css umn 25.6.0 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: zhengxiu <zhengxiu@huawei.com>
Co-committed-by: zhengxiu <zhengxiu@huawei.com>
2025-11-25 11:34:43 +00:00


<a name="EN-US_TOPIC_0000002152432486"></a><a name="EN-US_TOPIC_0000002152432486"></a>
<h1 class="topictitle1">Using Nested Fields for Vector Search</h1>
<div id="body0000002152432486"><p id="EN-US_TOPIC_0000002152432486__p8060118">Nested fields allow multiple vectorized records to be stored in a single document. For example, in a RAG scenario, a document is usually split by paragraph or into fixed-length segments, and each segment is vectorized into its own semantic vector. With nested fields, all of these vectors can be written into the same Elasticsearch document. When a document contains multiple vector records, it is returned if the query vector matches any of them.</p>
<div class="section" id="EN-US_TOPIC_0000002152432486__section7402052171913"><h4 class="sectiontitle">Constraints</h4><p id="EN-US_TOPIC_0000002152432486__p242820500615">Only Elasticsearch 7.10.2 clusters support this feature.</p>
</div>
<div class="section" id="EN-US_TOPIC_0000002152432486__section920834941814"><h4 class="sectiontitle">Creating a Vector Index</h4><p id="EN-US_TOPIC_0000002152432486__p1814521513234">Create a vector index with nested fields. The index contains an <strong id="EN-US_TOPIC_0000002152432486__b1147823312334">id</strong> field whose type is <strong id="EN-US_TOPIC_0000002152432486__b11727144215333">keyword</strong>, and an <strong id="EN-US_TOPIC_0000002152432486__b10774185012335">embedding</strong> field whose type is <strong id="EN-US_TOPIC_0000002152432486__b386116013417">nested</strong>. The embedding field contains two subfields: <strong id="EN-US_TOPIC_0000002152432486__b922417201355">chunk</strong> and <strong id="EN-US_TOPIC_0000002152432486__b14480924123517">emb</strong>. The <strong id="EN-US_TOPIC_0000002152432486__b4416182815350">chunk</strong> subfield is of the <strong id="EN-US_TOPIC_0000002152432486__b038012119351">keyword</strong> type, and the <strong id="EN-US_TOPIC_0000002152432486__b5241534193514">emb</strong> subfield is of the <strong id="EN-US_TOPIC_0000002152432486__b1496212398351">vector</strong> type.</p>
<pre class="screen" id="EN-US_TOPIC_0000002152432486__screen15862126144714">PUT my_index
{
  "settings": {
    "index.vector": true
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "embedding": {
        "type": "nested",
        "properties": {
          "chunk": {
            "type": "keyword"
          },
          "emb": {
            "type": "vector",
            "dimension": 2,
            "indexing": true,
            "algorithm": "GRAPH",
            "metric": "euclidean"
          }
        }
      }
    }
  }
}</pre>
</div>
<div class="section" id="EN-US_TOPIC_0000002152432486__section1069103718276"><h4 class="sectiontitle">Importing Vector Data</h4><p id="EN-US_TOPIC_0000002152432486__p107491342132810">Use the bulk operation to write data in arrays. Each document contains two vector records.</p>
<pre class="screen" id="EN-US_TOPIC_0000002152432486__screen20150127125014">POST my_index/_bulk
{"index":{}}
{"id": 1, "embedding": [{"chunk":1,"emb": [1, 1]}, {"chunk":2,"emb": [2, 2]}]}
{"index":{}}
{"id": 2, "embedding": [{"chunk":1,"emb": [2, 2]}, {"chunk":2,"emb": [3, 3]}]}
{"index":{}}
{"id": 3, "embedding": [{"chunk":1,"emb": [3, 3]}, {"chunk":2,"emb": [4, 4]}]}</pre>
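<p>The bulk body above is newline-delimited JSON: one action line followed by one source line per document. As a minimal sketch, assuming the chunk vectors are already available in Python, a body in that format could be assembled as follows. The helper name <strong>build_bulk_body</strong> and its input layout are illustrative assumptions, not part of the service API.</p>
<pre class="screen">import json


def build_bulk_body(docs):
    """Build an NDJSON _bulk body from (doc_id, [vector, ...]) pairs.

    Hypothetical helper for illustration only.
    """
    lines = []
    for doc_id, vectors in docs:
        # Action line: index the document and let Elasticsearch assign the _id.
        lines.append(json.dumps({"index": {}}))
        # Source line: one nested object per chunk, numbered from 1.
        embedding = [
            {"chunk": number, "emb": vector}
            for number, vector in enumerate(vectors, start=1)
        ]
        lines.append(json.dumps({"id": doc_id, "embedding": embedding}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    (1, [[1, 1], [2, 2]]),
    (2, [[2, 2], [3, 3]]),
    (3, [[3, 3], [4, 4]]),
])
print(body, end="")</pre>
<p>The resulting string can be sent as the request body of <strong>POST my_index/_bulk</strong>. Each document carries all of its chunk vectors in the single <strong>embedding</strong> array.</p>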
</div>
<div class="section" id="EN-US_TOPIC_0000002152432486__section20812133113112"><h4 class="sectiontitle">Vector Search</h4><p id="EN-US_TOPIC_0000002152432486__p124414255492">Nested fields must be searched with a nested query. In the query, set the path parameter to the nested field path, and set <strong id="EN-US_TOPIC_0000002152432486__b101091310103910">score_mode</strong> to <strong id="EN-US_TOPIC_0000002152432486__b14692812103918">max</strong> so that each document is scored by the highest similarity between the query vector and any vector in the document.</p>
<ul id="EN-US_TOPIC_0000002152432486__ul4274936174213"><li id="EN-US_TOPIC_0000002152432486__li62744361425">Standard query<p id="EN-US_TOPIC_0000002152432486__p1690601511614"><a name="EN-US_TOPIC_0000002152432486__li62744361425"></a><a name="li62744361425"></a>Query the top 10 documents that are most similar to vector [1, 1].</p>
<pre class="screen" id="EN-US_TOPIC_0000002152432486__screen1279552916507">GET my_index/_search
{
  "_source": {"excludes": ["embedding"]},
  "query": {
    "nested": {
      "path": "embedding",
      "score_mode": "max",
      "query": {
        "vector": {
          "embedding.emb": {
            "vector": [1, 1],
            "topk": 10
          }
        }
      }
    }
  }
}</pre>
<p id="EN-US_TOPIC_0000002152432486__p19738125816532">An example of the query result:</p>
<pre class="screen" id="EN-US_TOPIC_0000002152432486__screen103475417404">{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "Hc4Vc5QBSxCnghau22AE",
        "_score" : 1.0,
        "_source" : {
          "id" : 1
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "Hs4Vc5QBSxCnghau22AE",
        "_score" : 0.33333334,
        "_source" : {
          "id" : 2
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "H84Vc5QBSxCnghau22AE",
        "_score" : 0.11111111,
        "_source" : {
          "id" : 3
        }
      }
    ]
  }
}</pre>
</li><li id="EN-US_TOPIC_0000002152432486__li081194014213">Pre-filtering query<p id="EN-US_TOPIC_0000002152432486__p6281329155415"><a name="EN-US_TOPIC_0000002152432486__li081194014213"></a><a name="li081194014213"></a>First keep only the documents whose id is 2 or 3, and then return the top 10 of those documents that are most similar to the query vector [1, 1].</p>
<pre class="screen" id="EN-US_TOPIC_0000002152432486__screen1560120439117">GET my_index/_search
{
  "query": {
    "nested": {
      "path": "embedding",
      "score_mode": "max",
      "query": {
        "vector": {
          "embedding.emb": {
            "vector": [1, 1],
            "topk": 10,
            "filter": {
              "terms": {"id": ["2", "3"]}
            }
          }
        }
      }
    }
  }
}</pre>
<p id="EN-US_TOPIC_0000002152432486__p188900299012">An example of the query result:</p>
<pre class="screen" id="EN-US_TOPIC_0000002152432486__screen1352602315214">{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.33333334,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "3t0ZypcB-Tff59gMTZO2",
        "_score" : 0.33333334,
        "_source" : {
          "id" : 2,
          "embedding" : [
            {
              "chunk" : 1,
              "emb" : [
                2,
                2
              ]
            },
            {
              "chunk" : 2,
              "emb" : [
                3,
                3
              ]
            }
          ]
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "390ZypcB-Tff59gMTZO2",
        "_score" : 0.11111111,
        "_source" : {
          "id" : 3,
          "embedding" : [
            {
              "chunk" : 1,
              "emb" : [
                3,
                3
              ]
            },
            {
              "chunk" : 2,
              "emb" : [
                4,
                4
              ]
            }
          ]
        }
      }
    ]
  }
}</pre>
</li></ul>
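<p>The scores in both sample results are consistent with 1 / (1 + d<sup>2</sup>), where d is the Euclidean distance between the query vector and the closest vector in the document. That formula is inferred from the sample output and is an assumption, not a documented guarantee. Under that assumption, the following minimal Python sketch reproduces locally how <strong>score_mode</strong> set to <strong>max</strong> ranks the three sample documents, with and without the id filter. It is a brute-force illustration with hypothetical function names, and it does not model the GRAPH index.</p>
<pre class="screen">def similarity(query, vector):
    """Assumed score: 1 / (1 + squared Euclidean distance)."""
    squared = sum((q - v) ** 2 for q, v in zip(query, vector))
    return 1.0 / (1.0 + squared)


def nested_max_search(docs, query, topk, allowed_ids=None):
    """Rank documents by their best-matching nested vector.

    docs maps a document id to its list of nested vectors.
    allowed_ids, if given, is applied before scoring (pre-filtering).
    """
    results = []
    for doc_id, vectors in docs.items():
        if allowed_ids is not None and doc_id not in allowed_ids:
            continue
        # score_mode "max": keep only the highest-scoring nested vector.
        best = max(similarity(query, vector) for vector in vectors)
        results.append((doc_id, best))
    results.sort(key=lambda item: item[1], reverse=True)
    return results[:topk]


docs = {
    "1": [[1, 1], [2, 2]],
    "2": [[2, 2], [3, 3]],
    "3": [[3, 3], [4, 4]],
}

standard = nested_max_search(docs, [1, 1], topk=10)
filtered = nested_max_search(docs, [1, 1], topk=10, allowed_ids={"2", "3"})
print(standard)
print(filtered)</pre>
<p>With the sample data, document 1 scores 1.0 because one of its vectors equals the query vector. Documents 2 and 3 score 1/3 and 1/9, from their closest vectors [2, 2] and [3, 3]. The filtered run returns only documents 2 and 3, in the same order as the pre-filtering result above.</p>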
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="css_01_0117.html">Configuring Vector Search for Elasticsearch Clusters</a></div>
</div>
</div>