https://www.youtube.com/watch?v=fY-pGkrLXg4
You're building a RAG app. Your team says: Postgres for the data, Pinecone for the vector search. You nod — because that's what you always do, one database per job.
Here's the thing nobody tells you up front: Postgres is the database that became a platform. The vector search, the geospatial queries, the time-series rollups, the fuzzy text search — all of it might already be Postgres. And three specific design decisions are what made that possible.
MVCC: Readers Don't Wait for Writers
Two transactions on the same row. One reads. One updates. Same instant.
In a traditional row-lock database, one of them has to wait. In Postgres, for ordinary reads and writes, neither one waits — and the reason is one fact that almost nobody teaches in an intro.
In Postgres, an UPDATE does not change the row. It writes a new row. Every row has two hidden system columns: xmin (the transaction that created it) and xmax (the transaction that ended its visibility by updating or deleting it). When one transaction reads and another updates the same row, each one is looking at a different version.
-- txid 100: reads row. xmin=100 is visible to it.
SELECT email FROM users WHERE id = 42;
-- txid 101 at the same instant: updates. Creates NEW row with xmin=101.
-- Old row gets stamped xmax=101.
UPDATE users SET email = 'new@example.com' WHERE id = 42;
Both queries return without waiting. This is MVCC — multi-version concurrency control. Postgres still uses locks for schema changes and explicit SELECT FOR UPDATE, but for ordinary read-write conflicts on the same row, versioning replaces blocking.
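A toy model makes the visibility rule concrete. This is a sketch, not Postgres's real algorithm (which tracks in-progress transactions, commit status, and hint bits far more carefully); it only shows how xmin/xmax stamps let two transactions see different versions of "the same row":

```python
# Toy MVCC: every version carries xmin (creator txid) and xmax (ender txid).
# A reader with a given snapshot of committed txids never blocks a writer.

class Table:
    def __init__(self):
        self.versions = []  # each version: {"xmin", "xmax", "data"}

    def insert(self, txid, data):
        self.versions.append({"xmin": txid, "xmax": None, "data": data})

    def update(self, txid, match, data):
        # UPDATE never overwrites: stamp xmax on the old version,
        # append a new version with xmin = the updating transaction.
        for v in self.versions:
            if v["xmax"] is None and match(v["data"]):
                v["xmax"] = txid
        self.versions.append({"xmin": txid, "xmax": None, "data": data})

    def read(self, committed):
        # Visible if the creator committed before our snapshot and no
        # committed transaction has ended the version yet.
        return [v["data"] for v in self.versions
                if v["xmin"] in committed
                and (v["xmax"] is None or v["xmax"] not in committed)]

users = Table()
users.insert(txid=90, data={"id": 42, "email": "old@example.com"})

# txid 101 updates while txid 100's snapshot predates 101's commit.
users.update(txid=101, match=lambda d: d["id"] == 42,
             data={"id": 42, "email": "new@example.com"})

print(users.read({90}))       # txid 100's snapshot: old version
print(users.read({90, 101}))  # a later snapshot: new version
```

Each reader filters the version chain against its own snapshot, which is why neither side waits.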
And critically: this behavior isn't reserved for the users table. Every Postgres extension inherits it. Your vector search is lock-free. Your geospatial queries are lock-free. All of it, for free, from the core.
The Planner: EXPLAIN ANALYZE Shows You Everything
Paste EXPLAIN in front of any query and Postgres shows you the plan it's about to execute — not the SQL you wrote, but a tree of operators (scans, joins, sorts, indexes) with estimated costs. Add ANALYZE, and Postgres runs the query and fills in real timing at every node.
EXPLAIN ANALYZE
SELECT u.email, o.total
FROM users u JOIN orders o ON o.user_id = u.id
WHERE u.country = 'CA';
The output tells you: did it use an index on country? Did it hash-join or nested-loop? How many rows did each node actually produce versus how many the planner expected? You don't have to guess.
The mental model: your SQL is the input; the plan is the output. The planner builds the plan, the executor runs it. You wrote what answer you want — the planner picks how.
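That "picks how" step can be sketched in a few lines. A deliberately tiny model, not the real costing code; it borrows Postgres's documented default cost constants (seq_page_cost = 1.0, random_page_cost = 4.0, cpu_tuple_cost = 0.01) and assumes one random page fetch per matching index row:

```python
# Toy planner: estimate the cost of each candidate plan, pick the cheapest.
# Real costing is far richer (correlation, caching, cpu_operator_cost, ...).

SEQ_PAGE_COST = 1.0      # Postgres default seq_page_cost
RANDOM_PAGE_COST = 4.0   # Postgres default random_page_cost
CPU_TUPLE_COST = 0.01    # Postgres default cpu_tuple_cost

def seq_scan_cost(pages, rows):
    # Read every page sequentially, process every row.
    return pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST

def index_scan_cost(rows, selectivity):
    # Fetch only matching rows, but each fetch is a random page read.
    matching = rows * selectivity
    return matching * (RANDOM_PAGE_COST + CPU_TUPLE_COST)

def choose_plan(pages, rows, selectivity):
    plans = {
        "Seq Scan": seq_scan_cost(pages, rows),
        "Index Scan": index_scan_cost(rows, selectivity),
    }
    return min(plans, key=plans.get)

# country = 'CA' matches 0.1% of a 1M-row table: the index wins.
print(choose_plan(pages=10_000, rows=1_000_000, selectivity=0.001))
# country = 'CA' matches 90% of rows: a sequential scan is cheaper.
print(choose_plan(pages=10_000, rows=1_000_000, selectivity=0.9))
```

Same input query, different statistics, different plan: that is the whole trick, and EXPLAIN shows you which branch won.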
Here's the part that matters for the platform argument. The same planner that handles a JOIN on users also handles a pgvector nearest-neighbor query. Same tree, same operator framework. Extensions can even teach the planner how to estimate costs for their own operators. That's why, when you bolt on vector search, geospatial, or time-series, the queries feel native — to the planner, they are native.
Extensions: One Engine, Many Databases
Here's the decision that actually made Postgres what it is: CREATE EXTENSION.
Run this (the extension the pgvector project ships is named vector):
CREATE EXTENSION vector;
That one command installs new functionality into your Postgres instance:
- A new type: vector, for storing arrays of floats
- A new distance operator: <->, computing the L2 distance between two vectors
- New index access methods (ivfflat, hnsw) for nearest-neighbor search
And it all plugs into the same engine. Same transactions. Same MVCC. Same planner.
Now you can do this:
CREATE TABLE docs (
id int PRIMARY KEY,
content text,
embedding vector(1536)
);
CREATE INDEX ON docs USING ivfflat (embedding vector_l2_ops);
SELECT id, content
FROM docs
ORDER BY embedding <-> $1
LIMIT 10;
That's vector similarity search, running inside the same transaction as your users table. No separate service. No separate API. No separate backup story.
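What that query computes is easy to state outside SQL. A brute-force sketch of ORDER BY embedding <-> $1 LIMIT k, assuming only that <-> is L2 distance (ivfflat and hnsw make the scan fast and approximate, but the semantics are a sorted distance scan):

```python
import math

# Brute-force nearest-neighbor: sort every row by L2 distance to the
# query vector, keep the first k. This is the "seq scan" version of
# what a pgvector index accelerates.

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(docs, query, k):
    # docs: list of (id, embedding) pairs; returns the k nearest ids.
    return [doc_id for doc_id, emb in
            sorted(docs, key=lambda d: l2(d[1], query))[:k]]

docs = [(1, [0.0, 0.0]), (2, [1.0, 1.0]), (3, [0.1, 0.0]), (4, [5.0, 5.0])]
print(knn(docs, query=[0.0, 0.1], k=2))  # → [1, 3]
```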
And this is the pattern — not the feature. Once you see it, you see it everywhere:
- PostGIS adds geometry types, spatial operators (ST_DWithin, ST_Intersects), and GiST-based spatial indexes. Same mechanism.
- pg_trgm adds trigram-based fuzzy text matching, so WHERE name % 'databse' matches "database". Same mechanism.
- TimescaleDB goes further — it layers its own chunking machinery for time-series — but it plugs into the same transactions, the same planner, the same write-ahead log.
One engine. Many databases.
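The pg_trgm match above isn't magic, either. A sketch of its similarity score, following the documented trigram rules (lowercase, pad with two spaces in front and one behind, slide a 3-character window, then shared-over-union of trigram sets); the real extension also handles word boundaries, which this skips:

```python
# Trigram similarity in the pg_trgm style: "databse" and "database"
# share most of their 3-character windows, so the score is high.

def trigrams(s):
    padded = "  " + s.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

print(similarity("database", "databse"))  # ≈ 0.55, above the default
                                          # 0.3 threshold the % operator uses
print(similarity("database", "airport"))  # near zero
```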
And because every change to core Postgres storage — whether it's an INSERT into users or an update to your pgvector index — goes through the same write-ahead log before touching the data file, extensions built on that storage inherit the same crash-recovery story as your primary tables. Kill the process mid-write, restart, the log replays.
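The ordering invariant (log first, data file second, replay on restart) is the whole crash-recovery story, and a toy version fits in a screen. A sketch only: real WAL records page-level changes with LSNs, checkpoints, and careful fsync discipline:

```python
# Toy write-ahead log: every change is appended to the durable log
# BEFORE the data file is touched, so a crash between the two steps
# loses nothing. Recovery replays the log.

class MiniDB:
    def __init__(self, wal):
        self.wal = wal        # durable, append-only
        self.data = {}        # the "data file"

    def put(self, key, value, crash_before_data_write=False):
        self.wal.append(("put", key, value))   # 1. log first
        if crash_before_data_write:
            raise RuntimeError("crash!")       # power cut mid-write
        self.data[key] = value                 # 2. then the data file

    @classmethod
    def recover(cls, wal):
        db = cls(wal)
        for op, key, value in wal:             # replay the log
            db.data[key] = value
        return db

wal = []
db = MiniDB(wal)
db.put("users:42", "old@example.com")
try:
    db.put("users:42", "new@example.com", crash_before_data_write=True)
except RuntimeError:
    pass                                       # process died; the log survived

db = MiniDB.recover(wal)                       # restart: the log replays
print(db.data["users:42"])                     # → new@example.com
```

Because extensions write through the same log, the pgvector index in the toy above would recover exactly like the users table does.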
So What Do You Do With This
Next time your team is about to add another datastore, run through four checks:
- Does Postgres have the data type?
- Does it have an index for your query pattern?
- Is there an extension that covers the use case?
- Is the latency acceptable for your workload?
If it passes all four — don't add the service yet. If it fails one, or you need extreme scale, distribution, or operational isolation — then add the separate service, intentionally.
You don't have to like every database. You just have to know what Postgres already does.