Vector and Similarity Search

Modern apps search by meaning , not just keywords: "find documents like this one", "recommend similar products", "answer from my docs" (RAG). The trick is to store an embedding — a numeric vector from an ML model — next to each row, then find rows whose vectors are closest. This lesson covers the pgvector extension end to end: a vector(1536) column, the distance operators ( <-> , <=> , <#> ), k-nearest-neighbour queries, and the approximate indexes (HNSW, IVFFlat) that make it fast at scale.

Learn Vector and Similarity Search in our free SQL course — an interactive lesson with worked examples, a practice exercise and a quick reference.

Part of the free SQL course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

An embedding is a list of numbers (a vector) that an ML model produces to represent the meaning of a piece of text, an image, or audio. Similar meanings land at nearby points in that high-dimensional space, so "search by meaning" becomes "find the nearest vectors".

Real embeddings have hundreds or thousands of dimensions (e.g. 1536). The model is what assigns the numbers — Postgres just stores and compares them.

Postgres doesn't know about vectors out of the box. The pgvector extension adds a native vector data type and the operators to compare them. You enable it once per database with CREATE EXTENSION vector . After that, vector columns and similarity operators just work.

Think of a library where every book is placed on a giant map by topic, not by title. Books about the same subject sit close together. To "find similar books" you just look at the shelf nearby. Embeddings are those map coordinates; pgvector is the map.

You store embeddings in a column typed vector(n) , where n is the number of dimensions your model outputs (commonly 1536). It lives right alongside your ordinary columns, so each row keeps both the human-readable text and the numeric vector of what that text means.

Pick n to match your embedding model exactly. If the model returns 1536 numbers, the column must be vector(1536) . Mixing dimensions (storing a 768-vector in a 1536 column) is an error.

pgvector gives you three ways to measure how far apart two vectors are — smaller means more similar. <-> is L2 (Euclidean) distance, the straight-line gap. <=> is cosine distance, based on the angle between vectors (it ignores length and is the usual choice for text). <#> is the (negative) inner product .

This is the heart of the whole topic. To find the k most similar rows to a query, you ORDER BY the distance operator (ascending — smallest distance first) and LIMIT k . Embed the user's question with the same model you used for your rows, drop that query vector into the ORDER BY , and you've built semantic search.

Fill in the cosine operator and the clause that keeps only the top results, so the query returns the 3 most similar documents. The expected behaviour is in the comments.

With a few rows, scanning every vector is fine. With millions, it's too slow — so pgvector offers approximate nearest-neighbour (ANN) indexes. HNSW builds a navigable graph: excellent recall and fast queries, slower to build and more memory. IVFFlat groups vectors into clusters ( lists ) and searches the nearest few: cheaper to build, you tune the lists parameter.

The index's operator class must match the distance you query with: vector_cosine_ops for <=> , vector_l2_ops for <-> , vector_ip_ops for <#> . A mismatched index simply won't be used.

Fill in the graph-based ANN index type so cosine searches use an index instead of scanning every row.

Vector search is the backbone of three huge use cases. Semantic search finds results by meaning, even when no keywords match. RAG (retrieval-augmented generation) fetches the most relevant chunks of your documents and feeds them to a language model so it can answer from your data. Recommendations surface "items similar to this" by nearest-neighbour over product or content embeddings. In every case the recipe is the same: embed with a model, store the vector, search by distance.

No. You generate embeddings with an ML model in your application (Python, JavaScript, etc.), then store the returned array of floats in the vector column. Postgres only stores and compares them.

For text embeddings, cosine distance ( <=> ) is the standard default because it compares direction (meaning) and ignores magnitude. Use <-> (L2) or <#> (inner product) only if your model's documentation recommends it.

HNSW usually gives better recall and query speed at the cost of slower builds and more memory — a good default. IVFFlat is cheaper to build and tunable via lists , handy for very large or frequently rebuilt datasets.

Retrieval-augmented generation: you embed the user's question, run a k-NN search to pull the most relevant document chunks, and feed those to a language model as context so it answers from your data. The k-NN step is exactly the query in this lesson.

Put it together — a brief, a blank canvas, and the expected result in the comments. Write it, then run it on a Postgres instance with pgvector to confirm.

Practice quiz

What is an embedding in the context of vector search?

A numeric vector that represents the meaning of text, an image, or other data
A compressed backup of a table
A type of SQL index
A foreign key between two tables

Answer: A numeric vector that represents the meaning of text, an image, or other data. An embedding is a fixed-length vector of numbers produced by an ML model that captures the semantic meaning of the input.

Which Postgres extension adds a native vector data type and similarity operators?

postgis
pg_trgm
pgvector
pgcrypto

Answer: pgvector. The pgvector extension provides the vector type plus distance operators for similarity search.

How do you enable pgvector in a database?

INSTALL vector
CREATE EXTENSION vector
ENABLE pgvector
IMPORT vector

Answer: CREATE EXTENSION vector. CREATE EXTENSION vector installs the extension so you can use vector columns and operators.

What does a column declared as vector(1536) store?

1536 bytes of text
A maximum of 1536 tables
Up to 1536 separate rows
A vector with exactly 1536 dimensions

Answer: A vector with exactly 1536 dimensions. vector(1536) holds a 1536-dimensional vector; the number is the embedding dimensionality from your model.

In pgvector, which operator computes cosine distance?

Answer: <=>. The <=> operator is cosine distance; <-> is L2 (Euclidean) and <#> is negative inner product.

What does the <-> operator measure?

Cosine distance
L2 (Euclidean) distance
Inner product
String concatenation

Answer: L2 (Euclidean) distance. The <-> operator computes the L2 (Euclidean) straight-line distance between two vectors.

How do you find the k nearest neighbours to a query vector?

HAVING embedding < k
GROUP BY embedding
WHERE embedding = query
ORDER BY embedding <=> query LIMIT k

Answer: ORDER BY embedding <=> query LIMIT k. Order by the distance operator ascending and LIMIT k to return the k most similar rows.

Why add an HNSW or IVFFlat index to a vector column?

To allow NULL embeddings
To enforce uniqueness
To enable approximate nearest-neighbour search that's far faster at scale
To compress the table to disk

Answer: To enable approximate nearest-neighbour search that's far faster at scale. HNSW and IVFFlat are approximate (ANN) indexes that trade a little recall for much faster searches on large tables.

Where do the embedding vectors actually come from?

They are computed automatically by Postgres
An external ML model generates them from your text or images
They are random numbers
They are derived from the primary key

Answer: An external ML model generates them from your text or images. You call an ML embedding model (in your app), then store the returned vector in the column.

Which application is a natural fit for vector search?

Semantic search, RAG, and recommendations
Locking rows in a transaction
Exact integer ID lookups
Enforcing referential integrity

Answer: Semantic search, RAG, and recommendations. Vector search powers semantic/meaning-based search, retrieval-augmented generation, and similarity recommendations.