
Vector Database

A vector database stores high-dimensional embeddings of data such as text and images and uses approximate nearest neighbor (ANN) search to quickly find the vectors whose meaning is closest to a query vector. Unlike a traditional database that retrieves exact matches, it searches by similarity (distance).

A vector database stores embedding vectors and answers similarity queries using approximate nearest neighbor (ANN) search.
Where a traditional database looks for exact value matches, a vector database finds the "most similar vectors" using similarity measures such as cosine similarity or Euclidean distance.
Because comparing a query against every stored vector is too slow at scale, indexes like HNSW and IVF trade a little accuracy for speed, which can keep searches in the millisecond range even over very large collections.
It serves as the core infrastructure for semantic search, recommendations, and RAG, which grounds an LLM in external knowledge.

Overview

A vector database is a database that stores embedding vectors and handles similarity queries through approximate nearest neighbor (ANN) search. When data such as text, images, or audio is passed through an embedding model, the result is an array of hundreds to thousands of numbers (a vector) that captures meaning. Indexing and storing these vectors and quickly returning the "N closest to a query vector" is the central job of a vector database. Pinecone defines a vector database as a system that "indexes and stores vector embeddings for fast retrieval and similarity search, with capabilities like CRUD operations, metadata filtering, and horizontal scaling."

Because data with similar meaning sits closer together in vector space, you can surface results that are "semantically related" even when the keywords do not match exactly. For this reason, vector databases have become the foundational infrastructure for semantic search, recommendation systems, and retrieval-augmented generation (RAG), where an LLM pulls in external knowledge while composing an answer.

Traditional Database vs. Vector Database

The most fundamental difference lies in what counts as a match. As Pinecone explains, "in traditional databases, we usually search for rows where the value exactly matches our query, whereas in vector databases we apply a similarity metric to find the vectors that are most similar to our query."

Aspect	Traditional (relational/scalar) DB	Vector database
Search method	Exact value matching (=, LIKE, ranges)	Approximate search based on similarity (distance)
Data shape	Scalar values such as numbers and strings, structured rows/columns	High-dimensional embedding vectors (hundreds to thousands of dimensions)
Query result	The exact set of rows that satisfy the conditions	The top N items closest to the query vector (approximate)
Core index	B-tree, hash index	ANN indexes such as HNSW and IVF
Trade-off	Guaranteed accuracy	Trades accuracy for speed (approximate results)
Typical use	Transactions, aggregation, structured queries	Semantic search, recommendations, RAG

Traditional scalar-based databases struggle with the complexity and scale of high-dimensional vectors. A vector database is optimized for a single operation: given a query vector, find the N closest items in a collection. Doing this quickly, without comparing the query against every stored item, is exactly what approximate nearest neighbor search provides.

Approximate Nearest Neighbor (ANN) and Indexes

Brute-force search, which compares the query vector against every stored vector one by one, is exact but becomes far too slow once the number of high-dimensional vectors grows into the millions or billions. Vector databases therefore use ANN algorithms based on hashing, quantization, and graph techniques to find nearby vectors without scanning everything. As Pinecone puts it, "because vector databases return approximate results, the main trade-off we consider is the balance between accuracy and speed." AWS makes the same point, noting that "indexing significantly accelerates similarity search, but it produces results using approximate nearest neighbor (ANN) algorithms, and ANN trades some accuracy for gains in performance and memory efficiency."
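To make the baseline concrete, here is a minimal sketch of the exact brute-force search described above, in plain Python. The function names and the toy two-dimensional vectors are illustrative assumptions, not part of any particular product. Every stored vector is scored against the query, so the cost grows linearly with the collection size, which is the cost ANN indexes are designed to avoid.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_top_n(query, vectors, n):
    """Exact search: score every stored vector and keep the n closest.

    Returns the indices of the n nearest vectors, closest first.
    """
    scored = ((euclidean(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(n, scored)]


# Toy 2-dimensional "embeddings" (illustrative values only).
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
nearest = brute_force_top_n([1.0, 1.0], vectors, n=2)
print(nearest)  # indices of the two closest vectors
```

With four vectors this is instant; with millions of vectors of hundreds of dimensions each, the same loop has to touch every one of them for every query.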

HNSW

HNSW (Hierarchical Navigable Small World) is the most widely used graph-based index. It builds a multi-layer graph whose layers have different resolutions. A search starts from an entry point in the coarsest top layer and descends through progressively finer lower layers, moving at each step toward neighbors that are closer to the query vector. Pinecone rates HNSW as "one of the best-performing indexes for vector similarity search, offering very fast search speeds and excellent recall," while adding that "it is not the best in terms of memory efficiency."
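The layered descent can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the real HNSW algorithm: the graph is built by brute force rather than by incremental insertion, each higher layer simply keeps about half of the points below it, and the search follows a single greedy path instead of keeping a candidate list. All names and parameters (`build_layers`, `greedy_search`, `m`, `num_layers`) are assumptions made for this example.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_layers(vectors, num_layers=3, m=4, seed=0):
    """Build a toy layered graph. layers[0] holds every point; each higher
    layer keeps roughly half of the points in the layer below it."""
    rng = random.Random(seed)
    members = list(range(len(vectors)))
    layers = []
    for _ in range(num_layers):
        # Link each member to its m nearest members of the same layer.
        # (Brute force here; real HNSW links nodes incrementally on insert.)
        graph = {
            i: sorted((j for j in members if j != i),
                      key=lambda j: dist(vectors[i], vectors[j]))[:m]
            for i in members
        }
        layers.append(graph)
        kept = [i for i in members if rng.random() < 0.5]
        members = kept or members[:1]  # never let a layer become empty
    return layers


def greedy_search(query, vectors, layers):
    """Start in the coarsest layer and, in each layer, keep hopping to a
    closer neighbour until none is closer; then drop down one layer."""
    current = next(iter(layers[-1]))  # entry point: a node in the top layer
    for graph in reversed(layers):
        improved = True
        while improved:
            improved = False
            for neighbour in graph[current]:
                if dist(query, vectors[neighbour]) < dist(query, vectors[current]):
                    current = neighbour
                    improved = True
    return current


rng = random.Random(42)
vectors = [[rng.random(), rng.random()] for _ in range(200)]
layers = build_layers(vectors)
query = [0.5, 0.5]
found = greedy_search(query, vectors, layers)
print(found, dist(query, vectors[found]))
```

The result is a local optimum of the bottom-layer graph, which is usually, but not always, the true nearest neighbor. That gap is the "approximate" in approximate nearest neighbor search.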

IVF

IVF (Inverted File) indexes partition the vector space into multiple lists (clusters) and then search only the few lists nearest to the query vector. Because the remaining lists are skipped, a true neighbor that sits in an unsearched list can be missed; searching more lists raises recall at the cost of speed. IVF indexes generally build faster and use less memory than graph-based indexes, though HNSW tends to deliver a better trade-off between query speed and recall. Milvus offers variants such as IVF_FLAT, IVF_SQ8, and IVF_PQ, each striking a different balance among accuracy, speed, and memory.
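A rough sketch of the partition-then-probe idea, in plain Python. It assumes a few rounds of k-means to form the lists and a parameter called `nprobe` (a name chosen for this example) for how many lists to scan per query. Real IVF implementations differ in how they train centroids and store vectors, so treat this as an illustration only.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(vectors, centroids):
    """Put each vector's index into the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def build_ivf(vectors, num_lists=4, iterations=5, seed=0):
    """Train centroids with a few k-means rounds, then build the lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, num_lists)]
    for _ in range(iterations):
        for c, ids in enumerate(assign(vectors, centroids)):
            if ids:  # keep the old centroid if a list ends up empty
                columns = zip(*(vectors[i] for i in ids))
                centroids[c] = [sum(col) / len(ids) for col in columns]
    return centroids, assign(vectors, centroids)


def ivf_search(query, vectors, centroids, lists, n, nprobe=1):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda c: dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    return sorted(candidates, key=lambda i: dist(query, vectors[i]))[:n]


rng = random.Random(1)
vectors = [[rng.random(), rng.random()] for _ in range(100)]
centroids, lists = build_ivf(vectors)
query = [0.2, 0.8]
approx = ivf_search(query, vectors, centroids, lists, n=3, nprobe=1)
print(approx)
```

Setting `nprobe` to the total number of lists scans everything and reproduces the exact result, while `nprobe=1` is fastest but can miss neighbors that sit just across a list boundary.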

Distance Measures

The similarity measure that defines what "close" means is usually one of the following three.

Cosine similarity: the cosine of the angle between two vectors, yielding a value between -1 and 1.
Euclidean distance: the straight-line distance between two points in vector space.
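Dot product: equal to the product of the two vectors' magnitudes and the cosine of the angle between them, so it reflects both length and direction. For unit-length vectors it coincides with cosine similarity.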
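The three measures are easy to compare side by side. The sketch below uses plain Python and hand-picked two-dimensional vectors (illustrative values, not real embeddings) to show how they disagree about what "close" means.

```python
import math


def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (both must be non-zero)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 3.0]  # perpendicular to a

print(cosine_similarity(a, b), euclidean_distance(a, b), dot(a, b))
print(cosine_similarity(a, c), euclidean_distance(a, c), dot(a, c))
```

Cosine similarity treats `a` and `b` as identical because it ignores length, whereas Euclidean distance and the dot product both register that `b` is longer. Which behavior is appropriate depends on how the embedding model was trained.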

Leading Solutions

Vector databases broadly fall into two camps: dedicated vector databases and extensions of existing databases. Pinecone is a leading managed, dedicated vector database, while Milvus and Weaviate are widely used open-source dedicated vector databases. Milvus supports a range of indexes, including IVF, HNSW, DiskANN, and GPU-accelerated indexes, allowing fine-grained tuning of the balance between latency and accuracy.

Alternatively, you can add vector search to an existing PostgreSQL instance instead of adopting a separate dedicated system. The open-source extension pgvector stores embeddings alongside relational data while preserving ACID guarantees, point-in-time recovery, and JOINs, and it supports two index types, HNSW and IVFFlat. For distance operators it provides L2 distance (<->), inner product (<#>, which returns the negative inner product so that smaller values still mean "closer"), and cosine distance (<=>), among others.

-- pgvector: create a 3-dimensional vector column and an HNSW index
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);

-- retrieve the top 5 rows with the smallest L2 distance to the query vector
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

Evidence and Examples

The role of vector databases in generative AI becomes clearest in the RAG pipeline. AWS defines a vector database as a system that "lets you store and query vectors at scale using efficient nearest neighbor query algorithms and appropriate indexes," and explains that in RAG "the user input is turned into a vector using the same embedding model used earlier, and then a nearest neighbor query is run against that input in the vector space." In other words, the user's question is embedded, the most relevant document chunks are pulled from the vector database, and the LLM generates an answer grounded in those chunks.
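The retrieval step of that pipeline can be sketched as follows. Everything here is a stand-in chosen for illustration: `toy_embed` is a hypothetical word-count "embedding" that only captures keyword overlap (a real embedding model captures meaning), the in-memory list plays the role of the vector database, and `build_prompt` shows one possible way to hand the retrieved chunks to an LLM. No model is actually called.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a sparse word-count
    vector. The same function is used for both chunks and questions."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=2):
    """Embed the question and return the top_k most similar chunks."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda ch: cosine(q, toy_embed(ch)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "pgvector adds vector search to PostgreSQL",
    "HNSW is a graph-based index",
    "IVF partitions vectors into lists",
]
question = "which index is graph-based"
top = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a production system the chunks would be embedded once at ingestion time and stored in the vector database, and `retrieve` would be replaced by an ANN query against that index.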

Thanks to this structure, RAG can incorporate up-to-date and domain-specific knowledge into answers without retraining or fine-tuning the model, which has made it a widely adopted pattern for producing fact-grounded responses.