About CSS Vector Search

In the age of AI, unstructured data (such as images, video, audio, and text) is growing rapidly. Traditional keyword-based search handles such data poorly because it cannot capture deep semantic or visual features. To address this, CSS provides a vector search solution that supports high-performance, high-accuracy nearest neighbor and approximate nearest neighbor search. Typical use cases include image search, video search, similar product recommendations, semantic text search, and cross-modal search (for example, searching for images using text).

Advantages

How It Works

CSS vector search uses approximate nearest neighbor (ANN) search to avoid the heavy computational load of exact k-nearest neighbor (k-NN) search, trading a small amount of accuracy for much higher search efficiency.
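To show where that computational load comes from, the following is a minimal sketch of exact (brute-force) k-NN search in plain Python. It is not the CSS implementation; the function name exact_knn and the sample vectors are made up for illustration. Every query compares the query vector against every stored vector, so the cost grows linearly with both the number of vectors and their dimension. ANN indexes avoid this full scan.

```python
import heapq
import math


def exact_knn(query, vectors, k):
    """Return the k (distance, index) pairs closest to query, nearest first.

    Brute force: one Euclidean distance computation per stored vector.
    """
    scored = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


# Hypothetical 2-dimensional feature vectors; real ones usually have
# hundreds of dimensions.
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
neighbors = exact_knn([1.0, 1.0], vectors, k=2)
```

With millions of high-dimensional vectors, this full scan per query becomes the bottleneck that ANN search is designed to remove.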

Procedure

  1. Data preparation: Use an AI model (such as a CNN or Transformer) to process your unstructured data (such as images, videos, and text) and extract feature vectors.
  2. Index creation: Create vector indexes in your Elasticsearch cluster and define the vector field mappings, including the vector dimension, indexing algorithm, and similarity metric.
  3. Data write: Store the feature vectors (typically along with the original data or metadata) in these indexes.
  4. Vector search: Use the standard Elasticsearch query DSL (such as a KNN query) to submit a query vector generated by the same model, and specify the number (k) of nearest neighbors to return.
  5. Result: The CSS vector search engine performs an efficient ANN search and returns the k most relevant results and their similarity scores. Your application can then process these results (for example, showing similar images or recommending relevant products).
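The five steps above can be walked through with a small in-memory sketch. It does not call Elasticsearch or the CSS vector search API: the embed function is a hypothetical stand-in for a real model (it hashes character trigrams into a fixed-size vector), the Python list named index stands in for a vector index, and search does an exact scan rather than an ANN search. The dimension, the cosine similarity metric, and the sample product texts are all assumptions made for illustration.

```python
import math
import zlib

DIM = 32  # assumed vector dimension; fixed when the index is created (step 2)


def embed(text):
    """Step 1 stand-in: map text to a unit-length feature vector.

    A real deployment would call an AI model here. This toy version
    hashes character trigrams into DIM buckets.
    """
    vec = [0.0] * DIM
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3].encode("utf-8")
        vec[zlib.crc32(trigram) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity; a plain dot product because both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


index = []  # step 2 stand-in: an in-memory "vector index"


def write(doc_id, text):
    """Step 3: store the vector together with the original data."""
    index.append({"id": doc_id, "text": text, "vector": embed(text)})


def search(query_text, k):
    """Steps 4 and 5: embed the query with the same model, return the top k hits."""
    q = embed(query_text)
    hits = sorted(index, key=lambda d: cosine(q, d["vector"]), reverse=True)[:k]
    return [(d["id"], round(cosine(q, d["vector"]), 3)) for d in hits]


for doc_id, text in [
    ("p1", "red running shoes"),
    ("p2", "red trail running shoes"),
    ("p3", "blue denim jacket"),
    ("p4", "stainless steel water bottle"),
]:
    write(doc_id, text)

results = search("red running shoes", k=2)  # list of (id, similarity score)
```

The point the sketch preserves from the procedure is that the query vector must come from the same model, and have the same dimension, as the stored vectors. Otherwise the similarity scores are meaningless.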

Constraints

Only Elasticsearch 7.6.2 and 7.10.2 clusters offer the built-in CSS vector search engine. Elasticsearch clusters in CSS do not support open-source vector search.