LangChain + Pinecone (2026): what Pinecone is, and how to build RAG with it
Short answer: Pinecone is a managed vector database that stores embeddings and serves similarity search. LangChain is a framework that orchestrates LLMs, retrievers, and tools. They are complementary, not competitors — in a RAG app, LangChain does the chunking, embedding, and chain logic while Pinecone stores and searches the vectors. You connect them with the official langchain-pinecone package: embed and upsert with PineconeVectorStore.from_documents(...), then retrieve with .as_retriever().
What is a Pinecone vector database?
Pinecone is a fully managed, cloud-hosted vector database. It stores high-dimensional embeddings — numeric vectors that represent the meaning of text, images, or other data — and lets you query them by similarity: given a query vector, it returns the stored items whose vectors are nearest to it. Pinecone handles indexing, scaling, and low-latency approximate nearest-neighbor search, so you don't run and tune your own vector index. It's the storage-and-retrieval layer behind retrieval-augmented generation (RAG), semantic search, and recommendation systems.
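To make "query by similarity" concrete, here is a minimal brute-force sketch in plain Python. It is not how Pinecone works internally (Pinecone relies on approximate nearest-neighbor indexes rather than a full scan), and the three-dimensional vectors, the document ids, and the top_k helper are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, stored, k=2):
    """Return the ids of the k stored vectors most similar to the query.

    `stored` maps an id to its vector. This scans every vector, which is
    exact but slow; a vector database avoids the full scan with an index.
    """
    ranked = sorted(
        stored,
        key=lambda item_id: cosine_similarity(query, stored[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional "embeddings" (hypothetical values; real embeddings
# have hundreds or thousands of dimensions).
stored_vectors = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.8, 0.3, 0.1],
    "doc-taxes": [0.0, 0.2, 0.9],
}
nearest = top_k([1.0, 0.2, 0.0], stored_vectors, k=2)
```

Here the query points in roughly the same direction as the two pet documents, so those rank ahead of the unrelated one.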
What is LangChain?
LangChain is an open-source orchestration framework for building LLM applications. It provides the glue: document loaders and text splitters, a unified interface to embedding models and LLMs (OpenAI, Anthropic, Google, and others), retriever abstractions, and chain/agent logic. LangChain does not store vectors itself — it delegates that to a vector store such as Pinecone, Weaviate, or pgvector.
Why use Pinecone with LangChain (they're not alternatives)
This is the key point that the "Pinecone vs LangChain" framing gets wrong: you don't choose between them; you use both. They sit at different layers of a RAG stack:
| Layer | Pinecone | LangChain |
|---|---|---|
| Role | Vector database (storage + search) | Orchestration framework (app logic) |
| Stores embeddings? | Yes | No (calls a vector store) |
| Handles chunking/embedding calls? | No | Yes |
| In a RAG app | The search index | The pipeline around it |
LangChain connects to Pinecone through the official langchain-pinecone integration package, which is built on Pinecone's Python SDK (published on PyPI as pinecone; older releases were named pinecone-client).
How to build RAG with LangChain and Pinecone (minimal example)
Verified against the official LangChain Pinecone integration docs, June 2026.
1. Install
pip install --upgrade langchain-pinecone langchain-openai langchain-text-splitters langchain
Set three environment variables: PINECONE_API_KEY, PINECONE_INDEX_NAME, and OPENAI_API_KEY (for the embeddings).
2. Create a Pinecone index
The index dimension must match your embedding model's output size. OpenAI's text-embedding-3-small (by default) and the older text-embedding-ada-002 both produce 1536-dimension vectors.
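import os

from pinecone import Pinecone, ServerlessSpec

# read the key from the environment rather than hard-coding it
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = os.environ.get("PINECONE_INDEX_NAME", "langchain-rag-index")

# create the index only if it does not already exist
if index_name not in [i["name"] for i in pc.list_indexes()]:
    pc.create_index(
        name=index_name,
        dimension=1536,  # matches text-embedding-3-small
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )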
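A dimension mismatch is an easy mistake to make when switching embedding models, so it can help to check vector sizes before upserting. The sketch below is plain Python with no Pinecone calls; check_dimensions and INDEX_DIMENSION are hypothetical names, and 1536 is simply the value used when creating the index above.

```python
INDEX_DIMENSION = 1536  # assumed to equal the `dimension` passed to create_index


def check_dimensions(vectors, expected=INDEX_DIMENSION):
    """Raise ValueError if any vector's length differs from the index dimension."""
    for position, vector in enumerate(vectors):
        if len(vector) != expected:
            raise ValueError(
                f"vector {position} has {len(vector)} dimensions, "
                f"but the index expects {expected}"
            )
    return True


# A correctly sized vector passes.
ok = check_dimensions([[0.0] * 1536])

# A 3072-dimension vector (the default size of text-embedding-3-large)
# is rejected before it ever reaches the index.
try:
    check_dimensions([[0.0] * 3072])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```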
3. Embed and upsert your documents
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_text_splitters import CharacterTextSplitter
# split your source text into chunks
# (your_loaded_documents is a placeholder for a list of LangChain Document
# objects, for example the output of a document loader)
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = splitter.split_documents(your_loaded_documents)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# embeds the chunks and upserts them into Pinecone in one call
docsearch = PineconeVectorStore.from_documents(
    docs, embeddings, index_name=index_name
)
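To show what chunk_size and chunk_overlap control, here is a simplified fixed-width chunker in plain Python. It is not LangChain's implementation (CharacterTextSplitter splits on a separator and then merges the pieces up to the size limit); chunk_text is a hypothetical helper for illustration only.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so text near a
    boundary appears in both neighboring chunks.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


no_overlap = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=0)
with_overlap = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

With no overlap the ten characters split into "abcd", "efgh", "ij"; with an overlap of 2, each chunk repeats the last two characters of the previous one. Overlap costs extra storage and embedding calls, which is one reason to keep it modest.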
4. Retrieve relevant context
# simple similarity search
results = docsearch.similarity_search("your question here", k=4)
# or as a retriever to plug into a chain (MMR for diversity)
retriever = docsearch.as_retriever(search_type="mmr")
context_docs = retriever.invoke("your question here")
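search_type="mmr" selects results by maximal marginal relevance, which trades similarity to the query against redundancy with results already chosen. The selection rule can be sketched roughly as below in plain Python; the mmr_select helper, the toy two-dimensional vectors, the ids, and lambda_mult=0.5 are illustrative assumptions, not LangChain's code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Pick k candidate ids by maximal marginal relevance.

    Each round scores every remaining candidate c as
        lambda_mult * sim(query, c)
        - (1 - lambda_mult) * max(sim(c, s) for s already selected)
    and takes the highest-scoring one.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:

        def score(item_id):
            relevance = cosine(query, remaining[item_id])
            redundancy = max(
                (cosine(remaining[item_id], candidates[s]) for s in selected),
                default=0.0,  # nothing selected yet, so nothing to be redundant with
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# "pricing-v1" and "pricing-v2" are near-duplicates; "limits" is less
# similar to the query but adds different information.
toy_query = [1.0, 0.0]
toy_docs = {
    "pricing-v1": [1.0, 0.1],
    "pricing-v2": [1.0, 0.12],
    "limits": [0.7, -0.7],
}
diverse = mmr_select(toy_query, toy_docs, k=2, lambda_mult=0.5)
relevance_only = mmr_select(toy_query, toy_docs, k=2, lambda_mult=1.0)
```

With lambda_mult=1.0 the redundancy term vanishes and the two near-duplicates are returned; at 0.5 the second slot goes to the less similar but non-redundant document.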
From here you pass the retrieved chunks as context to your LLM (via a LangChain chain or your own prompt). To add more data later to an existing index, use PineconeVectorStore(index_name=index_name, embedding=embeddings).add_texts([...]).
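One way to hand the retrieved chunks to a model is to join them into a single prompt string. The sketch below uses plain strings in place of LangChain Document objects (with real results you would pass each document's page_content); build_prompt and the prompt wording are assumptions for illustration, not a LangChain API.

```python
def build_prompt(question, chunks):
    """Number the retrieved chunks and place them ahead of the question."""
    context = "\n\n".join(
        f"[{number}] {chunk.strip()}" for number, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical retrieved chunks; in the pipeline above these would come
# from retriever.invoke(...).
prompt = build_prompt(
    "What metric does the index use?",
    ["The index uses cosine similarity. ", "Vectors have 1536 dimensions."],
)
```

Numbering the chunks makes it possible to ask the model to cite which chunk supports each claim.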
When Pinecone + LangChain is the right choice
- You're building RAG or semantic search and want a managed vector store you don't have to operate.
- You expect scale — millions of vectors, low-latency queries, serverless billing.
- You want to stay framework-flexible — LangChain lets you swap embedding models or LLMs without rewriting your retrieval layer.
If you're prototyping locally or at small scale, a lighter store (FAISS, Chroma, or pgvector) through the same LangChain interface may be cheaper — Pinecone earns its keep at production scale. See the Pinecone review and LangChain review for the deeper take, and Pinecone vs LangChain for why the "vs" framing is misleading.