Choosing a vector database is the most hotly debated decision in RAG architecture. Four options dominate in 2026: Pinecone (managed SaaS), Weaviate (open source, with a managed cloud offering), pgvector (a PostgreSQL extension), and Azure AI Search.
Pinecone wins for MVPs and small scale: it has the simplest setup, a good API, and a great developer experience. The price grows fast, though: roughly USD 1,500-2,500/month for 5M embeddings, depending on pod tier. Above that scale it becomes economically inefficient.
Self-hosted Weaviate wins at medium scale: you get full control, low operating costs (mainly the GPU/CPU hosting bill), and full hybrid search out of the box. It does require DevOps capacity. Typical TCO for 50M embeddings is USD 1,000-2,500/month.
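To make the gap concrete, the sketch below normalizes the two price ranges quoted above to a monthly cost per million embeddings. It uses only the figures from the text. The helper name `cost_per_million` is made up for illustration, and the result is a rough comparison that leaves out the engineering time self-hosting needs.

```python
# Figures copied from the text above:
# (embeddings in millions, low USD/month, high USD/month).
QUOTED = {
    "Pinecone": (5, 1500, 2500),
    "Weaviate (self-hosted)": (50, 1000, 2500),
}


def cost_per_million(millions, low_usd, high_usd):
    """Return the (low, high) monthly cost per one million embeddings."""
    return low_usd / millions, high_usd / millions


for name, figures in QUOTED.items():
    low, high = cost_per_million(*figures)
    print(f"{name}: USD {low:.0f}-{high:.0f} per million embeddings per month")
```

On these numbers Pinecone comes to about USD 300-500 per million embeddings per month, against about USD 20-50 for self-hosted Weaviate, which is why the balance tips once the corpus grows.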
pgvector wins when you already run PostgreSQL: it has the simplest operational story (one database, one backup process, one monitoring setup). Performance is better than most people expect, and it works well up to about 50M vectors. It lacks some Pinecone and Weaviate features but is sufficient for most use cases.
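As a rough illustration of the "one database" story, here is a minimal sketch of a pgvector setup and a nearest-neighbour query, written as SQL strings plus a small Python helper. The table name `documents`, its columns, and the dimension 1536 are placeholders rather than anything from this article. The `%s` placeholders assume a psycopg-style driver, and the HNSW index assumes pgvector 0.5.0 or later.

```python
def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Hypothetical schema: names and the 1536 dimension are illustrative only.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector(1536)
);
CREATE INDEX IF NOT EXISTS documents_embedding_idx
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, content, embedding <=> %s::vector AS distance
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def search_params(query_embedding, k=5):
    """Build the parameter tuple for SEARCH_SQL (the literal is used twice)."""
    literal = to_vector_literal(query_embedding)
    return (literal, literal, k)
```

With a driver such as psycopg, the query would be run along the lines of `cursor.execute(SEARCH_SQL, search_params(embedding))`. The vectors live in the same database as the rest of the application data, so they are covered by the backups and monitoring already in place.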
Azure AI Search wins in the Microsoft ecosystem: it integrates natively with Azure OpenAI and Microsoft Entra ID and offers SharePoint connectors. Pricing is premium, but organizations in this position usually already have an Azure subscription.
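The rules of thumb above can be condensed into a small decision helper. This is only a sketch. The thresholds (5M and 50M vectors) come from the text, but the function name, its parameters, and above all the order in which the rules are checked are assumptions made for illustration, not a definitive selection procedure.

```python
def pick_vector_store(n_vectors, has_postgres=False, azure_shop=False, has_devops=False):
    """Suggest a vector store from the rules of thumb in this section.

    The precedence of the checks is an assumption; adjust it to your priorities.
    """
    if azure_shop:
        # Microsoft ecosystem: native Azure OpenAI / Entra ID / SharePoint integration.
        return "Azure AI Search"
    if has_postgres and n_vectors <= 50_000_000:
        # Already running PostgreSQL and within the ~50M vector comfort zone.
        return "pgvector"
    if n_vectors <= 5_000_000:
        # MVP / small scale: simplest setup, cost still tolerable.
        return "Pinecone"
    if has_devops:
        # Medium scale with a team able to operate it.
        return "Weaviate (self-hosted)"
    # The section gives no rule for this case.
    return "no clear winner"


print(pick_vector_store(2_000_000))
print(pick_vector_store(30_000_000, has_postgres=True))
print(pick_vector_store(40_000_000, has_devops=True))
```

Under these assumptions the three calls print `Pinecone`, `pgvector`, and `Weaviate (self-hosted)` respectively. A real decision would also weigh which features are needed, compliance requirements, and team experience.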