You have spent weeks reading about vector databases. You installed Qdrant, you tried ChromaDB, and now you are staring at another Docker container wondering if you really need a fourth service just to store some embeddings. The honest answer for most teams is: no, you do not.
If you already run PostgreSQL, you can add vector similarity search to it using pgvector, an open-source PostgreSQL extension. You get nearest-neighbor search, filtered queries, indexing, and full SQL semantics, all inside the database you already operate, back up, and monitor. No new port, no new process, no new backup strategy.
This tutorial walks you through installing pgvector on Ubuntu, wiring it up to PostgreSQL, inserting embeddings from Ollama, and running real semantic queries with SQL. By the end you will have a working vector search setup entirely inside PostgreSQL.
What pgvector Is
pgvector adds a vector data type to PostgreSQL. A vector(768) column stores a 768-dimensional floating-point array, the same kind of array that Qdrant calls a “point” and ChromaDB calls an “embedding.”
Once your table has a vector column, you can:
- Insert rows with their embedding values.
- Query for the N rows whose vectors are closest to a given query vector, using the
<->(Euclidean distance),<=>(cosine distance), or<#>(inner product) operators. - Filter with standard SQL
WHEREclauses, by date, category, user ID, anything. - Index vector columns with HNSW or IVFFlat to speed up approximate nearest-neighbor search at scale.
The key insight: vector search becomes one more query type in SQL, not a separate system. You can join vectors with relational tables, apply transactions, use EXPLAIN ANALYZE, and everything else you already know from PostgreSQL.
How Vector Search Works
The closest vector to your query vector is the most semantically similar match. The distance calculation is done at query time across all rows (or the index, when one is built).
For text embeddings you almost always want cosine similarity, it measures the angle between two vectors, ignoring their magnitude. A cosine distance of 0.0 means identical direction (identical meaning). A cosine distance of 1.0 means completely orthogonal (unrelated).
-- cosine distance: smaller = more similar
SELECT title, 1 - (embedding <=> query_vector) AS similarity
FROM documents
ORDER BY embedding <=> query_vector
LIMIT 5;
The <=> operator returns cosine distance. Subtracting from 1 gives cosine similarity, which is more intuitive to read.
Prerequisites
Before starting, make sure you have:
- Ubuntu 20.04, 22.04, or 24.04
- PostgreSQL 14, 15, or 16 installed and running
- Ollama installed and running (for generating embeddings in the tutorial steps)
curlandjqinstalled- Python 3.10+ and
pip(optional, for the Python example at the end) - A user with
sudoprivileges
If you do not have PostgreSQL installed, install it now:
sudo apt update
sudo apt install -y postgresql postgresql-contrib
sudo systemctl enable --now postgresql
Verify PostgreSQL is running:
sudo systemctl status postgresql
Pull an embedding model into Ollama if you have not already:
ollama pull nomic-embed-text
Step 1: Install pgvector
pgvector is available from the PostgreSQL APT repository for PostgreSQL 14, 15, and 16. Find your PostgreSQL major version first:
psql --version
Example output: psql (PostgreSQL) 16.3
Install the matching pgvector package:
# For PostgreSQL 16:
sudo apt install -y postgresql-16-pgvector
# For PostgreSQL 15:
sudo apt install -y postgresql-15-pgvector
# For PostgreSQL 14:
sudo apt install -y postgresql-14-pgvector
If your PostgreSQL is from the upstream APT repository (not Ubuntu’s default), this package will be available. If the package is not found, install from source:
sudo apt install -y build-essential postgresql-server-dev-16 git
git clone https://github.com/pgvector/pgvector.git /tmp/pgvector
cd /tmp/pgvector
make
sudo make install
Replace postgresql-server-dev-16 with the version matching your installation.
Step 2: Enable the Extension in Your Database
pgvector is installed as a shared library, but you must enable it per-database. Switch to the postgres user and create a dedicated database:
sudo -u postgres psql
Inside the PostgreSQL prompt:
CREATE DATABASE vectordb;
\c vectordb
CREATE EXTENSION vector;
\dx
The \dx command lists installed extensions. You should see vector in the output:
List of installed extensions
Name | Version | Schema | Description
---------+---------+------------+--------------------------------
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
vector | 0.8.0 | public | vector data type and ivfflat and hnsw access methods
Exit the prompt:
\q
The extension is now active. Every vector data type and operator (<->, <=>, <#>) is available in vectordb.
Step 3: Create a Table with a Vector Column
Still inside the vectordb database, create a table to store documents alongside their embeddings:
sudo -u postgres psql -d vectordb
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
source TEXT NOT NULL,
content TEXT NOT NULL,
embedding vector(768),
created_at TIMESTAMPTZ DEFAULT NOW()
);
The vector(768) dimension must match the output dimension of your embedding model. nomic-embed-text outputs 768-dimensional vectors. If you use a different model, adjust the number: all-MiniLM-L6-v2 outputs 384, OpenAI text-embedding-3-small outputs 1536.
Getting the dimension wrong at insert time causes an error, PostgreSQL enforces it strictly.
Step 4: Generate and Insert Embeddings
You will now insert some document rows with real embeddings generated by Ollama. Open a terminal (outside psql) and use the Ollama embed API:
curl -s http://localhost:11434/api/embed \
-H "Content-Type: application/json" \
-d '{"model": "nomic-embed-text", "input": "To restart the application server, SSH into app01.internal and run sudo systemctl restart myapp."}' \
| jq '.embeddings[0] | length'
This should print 768, confirming the dimension.
Now insert a few rows. In practice you would write a script to do this at scale, but the manual SQL insert is the fastest way to verify the setup is working.
Get the embedding for your first document:
curl -s http://localhost:11434/api/embed \
-H "Content-Type: application/json" \
-d '{"model": "nomic-embed-text", "input": "To restart the application server, run sudo systemctl restart myapp"}' \
| jq -r '[.embeddings[0]] | @json'
The output is a JSON array of 768 floats. Copy it, you will paste it into a SQL INSERT. In production you would use a language driver so you never copy-paste vectors by hand.
For the tutorial, insert using a Python one-liner so you do not have to paste 768 floats manually:
pip install psycopg2-binary requests
Create a helper script insert_docs.py:
import requests
import psycopg2
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL = "nomic-embed-text"
PG_DSN = "dbname=vectordb user=postgres host=localhost"
documents = [
("runbook-v1", "To restart the application server, SSH into app01.internal and run sudo systemctl restart myapp."),
("runbook-v1", "Logs are at /var/log/myapp/app.log. Rotate them with logrotate if they exceed 500 MB."),
("runbook-v1", "Database backups run at 02:00 UTC via backup-db.sh on db01.internal. Backups are stored in /mnt/backups."),
("runbook-v1", "Prometheus scrapes metrics every 15 seconds. Grafana dashboards are at http://grafana.internal:3000."),
("runbook-v1", "P1 incidents require paging on-call via PagerDuty. P2 and below are handled during business hours."),
("guide-v1", "PostgreSQL vacuums dead tuples automatically via autovacuum. Run VACUUM ANALYZE manually after bulk loads."),
("guide-v1", "Use pg_dump to back up a single database: pg_dump -Fc mydb > mydb.dump"),
]
def embed(text):
r = requests.post(OLLAMA_URL, json={"model": MODEL, "input": text})
r.raise_for_status()
return r.json()["embeddings"][0]
conn = psycopg2.connect(PG_DSN)
cur = conn.cursor()
for source, content in documents:
vec = embed(content)
cur.execute(
"INSERT INTO documents (source, content, embedding) VALUES (%s, %s, %s)",
(source, content, vec)
)
print(f"Inserted: {content[:60]}...")
conn.commit()
cur.close()
conn.close()
print("Done.")
Run it:
python3 insert_docs.py
Each line confirms an inserted row. You now have seven rows with real embeddings stored in PostgreSQL.
Step 5: Run a Similarity Search
Back in psql, query for the most relevant documents:
sudo -u postgres psql -d vectordb
First, you need the embedding for your query text. Do this in a separate terminal:
curl -s http://localhost:11434/api/embed \
-H "Content-Type: application/json" \
-d '{"model": "nomic-embed-text", "input": "how do I restart the server"}' \
| jq -r '.embeddings[0][:5]'
In a real application your language driver passes the vector directly. For this tutorial, add a SQL-level query using the Python script to avoid copy-pasting the full vector:
# query.py
import requests
import psycopg2
from psycopg2.extras import RealDictCursor
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL = "nomic-embed-text"
PG_DSN = "dbname=vectordb user=postgres host=localhost"
def embed(text):
r = requests.post(OLLAMA_URL, json={"model": MODEL, "input": text})
r.raise_for_status()
return r.json()["embeddings"][0]
question = "How do I restart the application server?"
vec = embed(question)
conn = psycopg2.connect(PG_DSN)
cur = conn.cursor(cursor_factory=RealDictCursor)
cur.execute("""
SELECT
source,
content,
1 - (embedding <=> %s::vector) AS similarity
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT 3
""", (vec, vec))
rows = cur.fetchall()
for row in rows:
print(f"[{row['similarity']:.4f}] ({row['source']}) {row['content'][:80]}")
cur.close()
conn.close()
python3 query.py
Expected output (similarity scores will vary slightly):
[0.8912] (runbook-v1) To restart the application server, SSH into app01.internal and run sudo ...
[0.7843] (runbook-v1) Logs are at /var/log/myapp/app.log. Rotate them with logrotate if they...
[0.6511] (runbook-v1) Database backups run at 02:00 UTC via backup-db.sh on db01.internal. Ba...
The restart runbook chunk scores highest because its meaning is closest to the question. That is vector search working correctly.
Filtered Search
One advantage of pgvector over standalone vector databases is that you already have SQL WHERE filtering. Filter by source:
cur.execute("""
SELECT source, content, 1 - (embedding <=> %s::vector) AS similarity
FROM documents
WHERE source = 'guide-v1'
ORDER BY embedding <=> %s::vector
LIMIT 3
""", (vec, vec))
This restricts the search to documents from guide-v1 only, without any extra indexing setup and without learning a new filter syntax.
Step 6: Create a Vector Index
By default, pgvector performs an exact nearest-neighbor search, it scans every row and computes the distance. This is accurate but slow on large tables.
For tables with more than 10,000–50,000 rows, create an index. pgvector supports two types:
IVFFlat, divides vectors into clusters and searches only nearby clusters. Faster to build, lower memory usage.
HNSW, builds a layered graph structure. Slower to build, uses more memory, but gives better recall at query time. Recommended for most production use cases.
Create an HNSW index on the embedding column using cosine distance:
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
For IVFFlat (you must have at least some rows inserted first):
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
The lists parameter controls how many clusters are created. A starting value of rows / 1000 (minimum 10) is reasonable. With an index in place, PostgreSQL automatically uses it for ORDER BY embedding <=> $1 LIMIT N queries.
Confirm the index is being used:
EXPLAIN SELECT source, content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 5;
The query plan should show Index Scan rather than Seq Scan.
Common Mistakes and Troubleshooting
ERROR: type "vector" does not exist
The extension is not enabled in the current database. Run CREATE EXTENSION vector; while connected to that database, not the postgres default database.
ERROR: expected N dimensions, not M
The vector you are inserting has a different number of dimensions than the column was created with. Check nomic-embed-text outputs exactly 768 dimensions. If you switched embedding models, you must DROP TABLE and recreate it with the correct vector(N) dimension.
Exact search is slow on large tables
You need an index. Create the HNSW index (see Step 6). Note that an HNSW index on 1 million rows takes several minutes to build and requires significant RAM. Build it during a maintenance window.
Similarity scores look wrong or random
You are probably mixing embedding models, indexing with one model and querying with another. Cosine similarity only makes sense when both vectors come from the same model. Drop and re-insert with a consistent model.
psycopg2.errors.UndefinedFunction: operator does not exist: vector <=> vector
This can happen when the schema search path is wrong. Ensure public is in the search path: SET search_path TO public; or use SELECT set_config('search_path', 'public', false);.
Best Practices
Match dimensions to your embedding model and never change them. The moment you change the embedding model, the vectors are incompatible with the existing ones. Treat the column’s vector(N) dimension as immutable. Version your tables (documents_v2) when you upgrade models.
Use HNSW for production, IVFFlat for bulk-load scenarios. HNSW gives better recall and handles inserts without reindexing. IVFFlat requires you to specify lists upfront and performs best when the full dataset is loaded before the index is created.
Put vector search inside your existing PostgreSQL backup strategy. The entire setup (schema, rows, and indexes) lives in the database dump. A pg_dump of vectordb captures everything. There is no separate vector store to back up.
Add a GIN or B-tree index on metadata columns you filter by. If your queries always include WHERE source = $1, add CREATE INDEX ON documents (source). PostgreSQL combines the B-tree filter with the vector index scan via a bitmap index scan, making filtered searches faster.
Limit concurrent embedding + insert load. Generating embeddings from Ollama for thousands of documents can saturate your CPU. Process inserts in batches of 50–100, with a short pause between batches, to keep the system responsive while indexing runs in the background.
Conclusion
You now have vector similarity search running entirely inside PostgreSQL:
- pgvector installed and enabled as a database extension
- A
vector(768)column in a regular SQL table - Real embeddings from Ollama’s
nomic-embed-textstored and queried with the<=>cosine distance operator - Filtered semantic search using standard SQL
WHEREclauses - An HNSW index for fast approximate nearest-neighbor queries at scale
The practical implication is significant: if your application already uses PostgreSQL, you do not need to add Qdrant or ChromaDB to your stack to build a RAG pipeline or a semantic search feature. One database, one backup job, one connection pool.
That said, pgvector is not a replacement for a dedicated vector database in every scenario. If your dataset exceeds several million vectors, if you need horizontal scaling, or if you need Qdrant’s built-in dashboard and snapshot API, a standalone vector database is the better fit. The Qdrant vs ChromaDB comparison covers those trade-offs in detail.