Chains, agents, memory, tools, RAG, vector stores, LLM integrations, and production patterns.
| Layer | Package | Responsibility |
|---|---|---|
| Core | langchain-core | Base abstractions: LLMs, ChatModels, Tools, Messages, Prompts, Runnables |
| Standard | langchain | Chains, Agents, Retrievers, Memory, higher-level composition |
| Community | langchain-community | 100+ third-party integrations (vector stores, loaders, tools) |
| Partner | langchain-openai, langchain-anthropic, etc. | First-class provider integrations with optimized support |
Use langchain-core types directly and compose with LCEL. Avoid legacy langchain Chain classes — they are deprecated. Always import from the most specific package (e.g., langchain_openai, not langchain.chat_models).

# Core (always install)
pip install langchain langchain-core
# Partner packages (pick what you need)
pip install langchain-openai # OpenAI / Azure OpenAI
pip install langchain-anthropic # Anthropic Claude
pip install langchain-google-genai # Google Gemini
# Community integrations
pip install langchain-community
# Common add-ons
pip install langchain-chroma # Chroma vector store
pip install langchain-text-splitters # Text splitting utilities
pip install langchainhub # Shared prompts from LangSmith Hub
# Full install (everything)
pip install langchain langchain-community langchain-core
# LangSmith (tracing & evaluation)
pip install langsmith

# Core
npm install langchain @langchain/core
# Partner packages
npm install @langchain/openai
npm install @langchain/anthropic
npm install @langchain/google-genai
# Community
npm install @langchain/community
# Vector stores & tools
npm install @langchain/chroma
# Full
npm install langchain @langchain/core @langchain/openai

# Required environment variables
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
LANGCHAIN_API_KEY=... # LangSmith
LANGCHAIN_TRACING_V2=true # Enable tracing
LANGCHAIN_PROJECT=my-project # Project name

| Package | Provider | Models |
|---|---|---|
| langchain-openai | OpenAI / Azure | GPT-4o, GPT-4o-mini, o1, o3-mini, text-embedding-3-small/large |
| langchain-anthropic | Anthropic | Claude 4 Sonnet/Opus, Claude 3.5 Haiku |
| langchain-google-genai | Google | Gemini 2.5 Pro/Flash, Gemini 2.0 |
| langchain-groq | Groq | Llama 3, Mixtral (fast inference) |
| langchain-cohere | Cohere | Command R+, Embed (multilingual) |
| langchain-mistralai | Mistral | Mistral Large, Codestral, Embed |
Use partner-package imports (from langchain_openai import ChatOpenAI) instead of legacy paths (from langchain.chat_models import ChatOpenAI). The old paths still work but emit deprecation warnings.

| Method | Description | Returns |
|---|---|---|
| invoke(input) | Single input → single output | Output type |
| batch(inputs) | Multiple inputs → multiple outputs | List[Output] |
| stream(input) | Stream output chunks | Iterator[OutputChunk] |
| ainvoke(input) | Async single invocation | Awaitable[Output] |
| abatch(inputs) | Async batch invocation | Awaitable[List[Output]] |
| astream(input) | Async streaming | AsyncIterator[OutputChunk] |
| astream_events(input, version) | Stream events from all steps | AsyncIterator[Event] |
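The async variants compose identically to the sync ones; a minimal sketch, assuming a chain like the LCEL example below:

import asyncio

async def main():
    result = await chain.ainvoke({"topic": "LangChain"})
    results = await chain.abatch([{"topic": "AI"}, {"topic": "Rust"}])
    async for chunk in chain.astream({"topic": "LangChain"}):
        print(chunk, end="", flush=True)

asyncio.run(main())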
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
# Build components
prompt = ChatPromptTemplate.from_template("Tell me about {topic}")
model = ChatOpenAI(model="gpt-4o", temperature=0.7)
output_parser = StrOutputParser()
# Compose with LCEL pipe operator
chain = prompt | model | output_parser
# Invoke
result = chain.invoke({"topic": "LangChain"})
print(result)
# Stream
for chunk in chain.stream({"topic": "LangChain"}):
    print(chunk, end="", flush=True)
# Batch
results = chain.batch([{"topic": "AI"}, {"topic": "Python"}])
# Parallel with RunnableParallel
from langchain_core.runnables import RunnableParallel
chain = RunnableParallel(
joke=prompt | model | StrOutputParser(),
poem=prompt | model | StrOutputParser()
) | (lambda x: f"JOKE: {x['joke']}\n\nPOEM: {x['poem']}")

Anything implementing the Runnable interface can be composed with |. This gives you streaming, async, batch, and tracing for free. Prefer LCEL over all legacy chain classes (LLMChain, ConversationChain, etc.).

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
# OpenAI
openai = ChatOpenAI(model="gpt-4o", temperature=0.7)
openai_mini = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# Anthropic
claude = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.7)
claude_haiku = ChatAnthropic(model="claude-3-5-haiku-20241022")
# Google Gemini
gemini = ChatGoogleGenerativeAI(model="gemini-2.5-pro-preview-05-06")
# Streaming
async for chunk in openai.astream("Hello!"):
    print(chunk.content, end="", flush=True)

| Type | Import | Purpose |
|---|---|---|
| SystemMessage | langchain_core.messages | Set AI behavior, persona, rules |
| HumanMessage | langchain_core.messages | User input / question |
| AIMessage | langchain_core.messages | AI response (can include tool_calls) |
| ToolMessage | langchain_core.messages | Result from a tool execution |
| BaseMessage | langchain_core.messages | Base class for all messages |
from langchain_core.messages import (
SystemMessage, HumanMessage, AIMessage, ToolMessage
)
# Direct message invocation
messages = [
SystemMessage(content="You are a helpful Python tutor."),
HumanMessage(content="Explain list comprehensions with examples."),
]
response = model.invoke(messages)
print(response.content)
# Multi-turn conversation
conversation = [
SystemMessage(content="You are a math tutor."),
HumanMessage(content="What is 15 * 23?"),
AIMessage(content="15 * 23 = 345"),
HumanMessage(content="Now add 100 to that result."),
]
response = model.invoke(conversation)
# AI understands context: 345 + 100 = 445

| Type | Import | Use Case |
|---|---|---|
| ChatPromptTemplate | langchain_core.prompts | Multi-message templates (recommended) |
| PromptTemplate | langchain_core.prompts | Single string template (legacy) |
| MessagesPlaceholder | langchain_core.prompts | Inject a list of messages dynamically |
| PipelinePromptTemplate | langchain_core.prompts | Compose multiple prompts in a pipeline |
| Practice | Details |
|---|---|
| Use ChatPromptTemplate | Preferred over PromptTemplate for chat models |
| Use MessagesPlaceholder | For conversation history / dynamic messages |
| Be specific in system messages | Define role, tone, format constraints clearly |
| Few-shot examples | Include 2-5 examples for better output consistency |
| Use partial variables | Pre-fill parts of the template (e.g., date, context; see the sketch after this table) |
| LangSmith Hub | Browse & share prompts at hub.langchain.com |
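The partial-variables row in practice: pre-fill a value once so callers only supply the remaining variables. A minimal sketch (the date variable is illustrative):

from datetime import date
from langchain_core.prompts import ChatPromptTemplate

dated_prompt = ChatPromptTemplate.from_messages([
    ("system", "Today is {today}. Answer concisely."),
    ("human", "{question}"),
]).partial(today=date.today().isoformat())
dated_prompt.invoke({"question": "What day is it?"})  # only {question} is needed now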
from langchain_core.prompts import (
ChatPromptTemplate, MessagesPlaceholder, PromptTemplate
)
# Simple chat prompt
prompt = ChatPromptTemplate.from_messages([
("system", "You are a {role} expert in {domain}."),
("human", "{question}"),
])
chain = prompt | model | StrOutputParser()
chain.invoke({"role": "senior", "domain": "Python", "question": "What is a decorator?"})
# With conversation history
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
MessagesPlaceholder(variable_name="history"),
("human", "{input}"),
])
# Few-shot examples
prompt = ChatPromptTemplate.from_messages([
("system", "Classify the sentiment. Reply: POSITIVE, NEGATIVE, or NEUTRAL."),
("human", "{input}"),
("ai", "{output}"),
])
few_shot_prompt = ChatPromptTemplate.from_messages([
("system", "Classify the sentiment. Reply: POSITIVE, NEGATIVE, or NEUTRAL."),
("human", "I love this product!"),
("ai", "POSITIVE"),
("human", "This is terrible."),
("ai", "NEGATIVE"),
("human", "{input}"),
])

| Parser | Import | Output Type | Description |
|---|---|---|---|
| StrOutputParser | langchain_core.output_parsers | str | Extracts string from AIMessage.content |
| JsonOutputParser | langchain_core.output_parsers | dict/list | Parses JSON from response |
| PydanticOutputParser | langchain_core.output_parsers | Pydantic Model | Structured output with validation |
| CommaSeparatedListOutputParser | langchain_core.output_parsers | list[str] | Parses comma-separated list |
| XMLOutputParser | langchain_core.output_parsers | str (XML) | Parses XML format output |
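A short sketch of CommaSeparatedListOutputParser, whose format instructions steer the model toward a comma-separated reply (assumes model from earlier):

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import CommaSeparatedListOutputParser

list_parser = CommaSeparatedListOutputParser()
list_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer with a list.\n{format_instructions}"),
    ("human", "{request}"),
]).partial(format_instructions=list_parser.get_format_instructions())
list_chain = list_prompt | model | list_parser
list_chain.invoke({"request": "Name three Python web frameworks"})
# e.g. ["Django", "Flask", "FastAPI"]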
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: float = Field(description="Rating 1-10")
    summary: str = Field(description="One paragraph summary")
    recommended: bool = Field(description="Would you recommend it?")
parser = PydanticOutputParser(pydantic_object=MovieReview)
prompt = ChatPromptTemplate.from_messages([
("system", "Extract movie review info.\n{format_instructions}"),
("human", "Review: {review}"),
]).partial(format_instructions=parser.get_format_instructions())
chain = prompt | model | parser
result = chain.invoke({"review": "Inception is a 10/10 masterpiece..."})
print(result.title) # "Inception"
print(result.rating) # 10.0
print(result.recommended) # True

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
parser = JsonOutputParser()
prompt = ChatPromptTemplate.from_messages([
("system", "Extract entities from text as JSON. Keys: people, places, organizations."),
("human", "{text}"),
])
chain = prompt | model | parser
result = chain.invoke({"text": "Tim Cook visited the Apple Park in Cupertino."})
# {"people": ["Tim Cook"], "places": ["Apple Park", "Cupertino"], "organizations": ["Apple"]}| Stage | Component | Description |
|---|---|---|
| 1. Load | Document Loaders | Load docs from web, PDF, DB, API, etc. |
| 2. Split | Text Splitters | Break docs into manageable chunks |
| 3. Embed | Embedding Models | Convert text to vector representations |
| 4. Store | Vector Stores | Persist and index embeddings |
| 5. Retrieve | Retrievers | Fetch relevant chunks for a query |
| 6. Generate | LLM + Prompt | Generate answer using retrieved context |
| Loader | Import | Source |
|---|---|---|
| WebBaseLoader | langchain_community.document_loaders | Web pages (HTML) |
| PyPDFLoader | langchain_community.document_loaders | PDF files (page by page) |
| TextLoader | langchain_community.document_loaders | Plain text / .txt files |
| CSVLoader | langchain_community.document_loaders | CSV files (row by row) |
| JSONLoader | langchain_community.document_loaders | JSON / JSONL files |
| YouTubeLoader | langchain_community.document_loaders | YouTube transcripts |
| GitLoader | langchain_community.document_loaders | Git repositories |
| SeleniumURLLoader | langchain_community.document_loaders | JS-rendered web pages |
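All loaders share the same interface: construct, then call .load() to get a list of Document objects with page_content and metadata. A quick sketch (file paths are placeholders):

from langchain_community.document_loaders import PyPDFLoader, TextLoader, CSVLoader

pdf_docs = PyPDFLoader("report.pdf").load()  # one Document per page
txt_docs = TextLoader("notes.txt").load()    # one Document for the whole file
csv_docs = CSVLoader("data.csv").load()      # one Document per row
print(pdf_docs[0].metadata)                  # e.g. {"source": "report.pdf", "page": 0}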
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
# 1. Load documents
loader = WebBaseLoader("https://docs.python.org/3/tutorial/")
docs = loader.load()
# 2. Split into chunks
splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200,
separators=["\n\n", "\n", ". ", " ", ""]
)
chunks = splitter.split_documents(docs)
# 3. Create embeddings and vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="./chroma_db")
# 4. Create retriever
retriever = vectorstore.as_retriever(
search_type="mmr", # or "similarity"
search_kwargs={"k": 4} # top 4 chunks
)
# 5. Build RAG chain
prompt = ChatPromptTemplate.from_template(
"Answer the question based only on the following context:\n"
"Context: {context}\n\nQuestion: {question}"
)
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
rag_chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| prompt
| ChatOpenAI(model="gpt-4o")
| StrOutputParser()
)
# 6. Query
answer = rag_chain.invoke("How do I install Python packages?")
print(answer)

Start with chunk_size=1000 and chunk_overlap=200. Use RecursiveCharacterTextSplitter as the default — it splits on paragraphs, lines, then sentences for the most natural boundaries.

| Store | Type | Best For | Install |
|---|---|---|---|
| Chroma | Open-source, local | Development & small projects | pip install langchain-chroma |
| FAISS | Open-source, local | Fast similarity search at scale | pip install faiss-cpu |
| Pinecone | Managed cloud | Production apps, auto-scaling | pip install pinecone-client |
| Weaviate | Open-source / cloud | Hybrid search (vector + keyword) | pip install weaviate-client |
| Qdrant | Open-source / cloud | High-performance filtering | pip install qdrant-client |
| PGVector | PostgreSQL extension | Already using Postgres | pip install pgvector |
| LanceDB | Embedded / local | Edge, offline, or zero-config apps | pip install lancedb |
| Model | Dimensions | Cost | Notes |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02/1M tokens | Best value for most tasks |
| text-embedding-3-large | 3072 | $0.13/1M tokens | Higher quality, larger vectors |
| text-embedding-ada-002 | 1536 | $0.10/1M tokens | Legacy (still works) |
| BGE (HuggingFace) | 768-1024 | Free (local) | Great open-source alternative |
| Nomic Embed | 768 | Free (local) | Strong open-source, long context |
| Cohere Embed v3 | 1024 | API pricing | Excellent multilingual support |
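For the local rows, a minimal sketch using the langchain-huggingface partner package (the model name is one common choice, not the only one):

# pip install langchain-huggingface sentence-transformers
from langchain_huggingface import HuggingFaceEmbeddings

local_embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
doc_vecs = local_embeddings.embed_documents(["LangChain is a framework."])
query_vec = local_embeddings.embed_query("What is LangChain?")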
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS, Chroma
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# FAISS (fast, in-memory)
vectorstore = FAISS.from_documents(chunks, embeddings)
vectorstore.save_local("faiss_index")
vectorstore = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)  # flag required in recent versions
# Chroma (persistent)
vectorstore = Chroma.from_documents(
chunks, embeddings, persist_directory="./chroma_db"
)
# Similarity search
results = vectorstore.similarity_search("query", k=5)
# MMR search (maximal marginal relevance — more diverse)
results = vectorstore.max_marginal_relevance_search("query", k=5, fetch_k=20)
# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

from langchain_core.tools import tool, StructuredTool
from pydantic import BaseModel, Field
# Method 1: @tool decorator (simplest)
@tool
def search_web(query: str) -> str:
    """Search the web for information about a topic."""
    # Call your search API here
    return f"Search results for: {query}"
@tool
def calculate(expression: str) -> float:
    """Evaluate a mathematical expression."""
    return eval(expression)  # WARNING: eval is unsafe on untrusted input; use a real parser in production
# Method 2: Pydantic schema for complex tools
class SearchInput(BaseModel):
    query: str = Field(description="Search query")
    num_results: int = Field(default=5, description="Number of results")
@tool(args_schema=SearchInput)
def search_advanced(query: str, num_results: int = 5) -> str:
    """Search with configurable result count."""
    return f"Top {num_results} results for: {query}"
# Method 3: StructuredTool (get_weather and WeatherInput are user-defined placeholders)
weather_tool = StructuredTool.from_function(
    func=get_weather,
    name="get_weather",
    description="Get current weather for a city",
    args_schema=WeatherInput,
)

from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o")
tools = [search_web, calculate, search_advanced]
# Bind tools to model
llm_with_tools = model.bind_tools(tools)
# Invoke — LLM decides which tools to call
response = llm_with_tools.invoke("What is 25 * 47?")
# AIMessage with tool_calls: [
# {"name": "calculate", "args": {"expression": "25 * 47"}}
# ]
# Execute tool calls
for tool_call in response.tool_calls:
    tool_name = tool_call["name"]
    tool_args = tool_call["args"]
    selected_tool = {t.name: t for t in tools}[tool_name]
    result = selected_tool.invoke(tool_args)
    print(f"{tool_name}({tool_args}) = {result}")
# Built-in LangChain tools
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
wiki_tool = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

Write specific, action-oriented tool descriptions; the model picks tools based on them. For example: "Search the web for real-time information. Use when the user asks about current events."

| Agent | Strategy | Best For | Status |
|---|---|---|---|
| create_react_agent | ReAct (Reason + Act) | General-purpose, tool-using agents | Recommended |
| create_openai_functions_agent | OpenAI function calling | When using OpenAI models | Recommended |
| create_openai_tools_agent | OpenAI tool calling | Modern OpenAI tool calling | Recommended |
| create_tool_calling_agent | Universal tool calling | Any model with tool support | Recommended |
| create_structured_chat_agent | Structured multi-input | Complex tool arguments | Legacy |
| Plan-and-Execute | Plan first, then execute | Multi-step reasoning tasks | Advanced |
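The recommended tool-calling agent has a full example below; for the ReAct row above, a minimal sketch that pulls the canonical prompt from the hub (assumes langchainhub is installed, plus model and tools from the tool examples):

from langchain import hub
from langchain.agents import create_react_agent, AgentExecutor

react_prompt = hub.pull("hwchase17/react")  # canonical ReAct prompt
react_agent = create_react_agent(model, tools, react_prompt)
react_executor = AgentExecutor(
    agent=react_agent, tools=tools,
    verbose=True, handle_parsing_errors=True,  # ReAct output is brittle; handle parse failures
)
react_executor.invoke({"input": "What is LangChain?"})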
| Parameter | Type | Description |
|---|---|---|
| agent | Runnable | The agent (LLM + tools + prompt) |
| tools | list[BaseTool] | Available tools for the agent |
| verbose | bool | Print reasoning steps (debugging) |
| max_iterations | int | Max reasoning loops (default: 15) |
| max_execution_time | float | Timeout in seconds |
| early_stopping_method | str | "force" or "generate" |
| handle_parsing_errors | bool/str | Handle output parsing failures |
| return_intermediate_steps | bool | Include reasoning steps in output |
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import create_tool_calling_agent, AgentExecutor
model = ChatOpenAI(model="gpt-4o")
tools = [search_web, calculate]
# Agent prompt (tool_calling format)
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant. Use tools when needed."),
("human", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad"),
])
# Create agent and executor
agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# Run
result = agent_executor.invoke({"input": "What is 25 * 47? Then search for LangChain news."})
# ─── Plan-and-Execute Agent ───
from langchain_experimental.plan_and_execute import (
PlanAndExecute, load_agent_executor, load_chat_planner
)
planner = load_chat_planner(model)
executor = load_agent_executor(model, tools, verbose=True)
agent = PlanAndExecute(planner=planner, executor=executor, verbose=True)
agent.invoke({"input": "Research LangChain and summarize key features."})

For modern models, prefer create_tool_calling_agent or create_openai_tools_agent. They produce more reliable tool calls than the legacy ReAct prompt. Reserve ReAct for older or smaller models.

| Memory | Description | Use Case |
|---|---|---|
| ConversationBufferMemory | Stores full conversation history | Short conversations |
| ConversationBufferWindowMemory | Keeps last k turns | Longer conversations with context limit |
| ConversationSummaryMemory | LLM summarizes past turns | Very long conversations |
| ConversationSummaryBufferMemory | Summary of old + recent messages | Balanced approach |
| VectorStoreRetrieverMemory | Semantic search over history | Complex, long-running conversations |
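These memory classes are legacy; with the modern pattern below, window-style behavior can be approximated by trimming stored history. A sketch using langchain_core's trim_messages (available in recent versions; here counting messages rather than tokens):

from langchain_core.messages import trim_messages

trimmed = trim_messages(
    chat_history.messages,  # chat_history: a ChatMessageHistory (placeholder)
    strategy="last",        # keep the most recent messages
    token_counter=len,      # count messages instead of tokens
    max_tokens=10,          # keep at most 10 messages
    include_system=True,    # always preserve the system message
)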
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
store = {} # session_id -> ChatMessageHistory
def get_session_history(session_id):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
MessagesPlaceholder(variable_name="history"),
("human", "{input}"),
])
chain = prompt | model | StrOutputParser()
chain_with_history = RunnableWithMessageHistory(
chain,
get_session_history,
input_messages_key="input",
history_messages_key="history",
)
# Multi-turn with memory
chain_with_history.invoke(
{"input": "Hi, I'm John. I love Python."},
config={"configurable": {"session_id": "user-123"}}
)
chain_with_history.invoke(
{"input": "What's my name and favorite language?"},
config={"configurable": {"session_id": "user-123"}}
)
# "Your name is John and your favorite language is Python."ConversationChain + ConversationBufferMemory pattern still works but is deprecated. The new RunnableWithMessageHistory integrates natively with LCEL and supports multi-session, async, and streaming.# ─── Sync Streaming ───
for chunk in chain.stream({"topic": "AI"}):
    print(chunk, end="", flush=True)
# ─── Async Streaming ───
async for chunk in chain.astream({"topic": "AI"}):
    print(chunk, end="", flush=True)
# ─── Stream with events (detailed) ───
async for event in chain.astream_events(
    {"topic": "AI"}, version="v2"
):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)
    elif kind == "on_tool_start":
        print(f"\nCalling tool: {event['name']}")
# ─── Stream tokens only ───
for token in model.stream("Tell me about AI"):
    print(token.content, end="", flush=True)

# RAG with streaming
async for chunk in rag_chain.astream("What is Python?"):
    print(chunk, end="", flush=True)
# Stream both retrieved docs AND response
async for event in rag_chain.astream_events(
    "What is Python?", version="v2"  # rag_chain takes the question string directly
):
    kind = event["event"]
    if kind == "on_retriever_end":
        docs = event["data"]["output"]
        print(f"Retrieved {len(docs)} documents")
    elif kind == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="")
# FastAPI streaming endpoint
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
app = FastAPI()
async def stream_response(query: str):
    async for chunk in rag_chain.astream(query):
        yield chunk
@app.get("/chat")
async def chat(q: str):
    return StreamingResponse(stream_response(q))

# ─── Environment Variables ───
export LANGCHAIN_API_KEY="lsv2_..."
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_PROJECT="my-project"
# ─── Auto-tracing (recommended) ───
# Just set the env vars — all LCEL chains
# are automatically traced!
# ─── Manual tracing ───
from langsmith import traceable
@traceable(name="my_custom_function", tags=["custom"])
def my_function(query: str):
result = chain.invoke({"input": query})
return result
# ─── Nested traces ───
@traceable(name="rag_pipeline")
def rag_query(query: str):
docs = retriever.invoke(query)
answer = chain.invoke({"context": docs, "question": query})
    return answer

| Feature | Description |
|---|---|
| Tracing | Full trace of every LLM call, chain, tool invocation |
| Evaluation | Run evaluators (correctness, helpfulness, custom criteria) |
| Prompt Management | Version, test, and deploy prompts |
| Datasets | Create test datasets for consistent evaluation |
| Playground | Interactive testing of chains and prompts |
| Monitoring | Track latency, token usage, error rates, costs |
| Annotation | Human feedback and annotation workflows |
| Self-Hosted | LangSmith Self-Hosted v0.11 (Aug 2025) available |
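A small sketch of the dataset workflow via the langsmith SDK (dataset name and example content are illustrative):

from langsmith import Client

client = Client()
dataset = client.create_dataset(dataset_name="qa-smoke-test")
client.create_example(
    inputs={"question": "What is LCEL?"},
    outputs={"answer": "LangChain Expression Language"},
    dataset_id=dataset.id,
)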
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
primary = ChatOpenAI(model="gpt-4o")
fallback = ChatAnthropic(model="claude-sonnet-4-20250514")
backup = ChatOpenAI(model="gpt-4o-mini")
prompt = ChatPromptTemplate.from_template("{input}")
# Try primary, fall back on error (with_fallbacks needs no extra import)
primary_chain = prompt | primary | StrOutputParser()
fallback_chain = prompt | fallback | StrOutputParser()
backup_chain = prompt | backup | StrOutputParser()
chain = primary_chain.with_fallbacks([fallback_chain, backup_chain])
# Automatically retries with fallback on rate limit / error
result = chain.invoke({"input": "Hello!"})from langchain_core.rate_limiters import InMemoryRateLimiter
# Rate limiting (requests per second)
rate_limiter = InMemoryRateLimiter(
    requests_per_second=5,
    check_every_n_seconds=0.1,  # how often to check for available capacity
    max_bucket_size=10,         # maximum burst size
)
model = ChatOpenAI(
model="gpt-4o",
rate_limiter=rate_limiter,
max_retries=3, # Built-in retry on transient errors
)
# Retry with exponential backoff
from openai import RateLimitError, APIConnectionError
model_with_retry = model.with_retry(
    stop_after_attempt=3,
    retry_if_exception_type=(RateLimitError, APIConnectionError),
)
# Parsing error handling
chain = (
prompt
| model
| output_parser.with_fallbacks([
StrOutputParser(), # fallback to raw string
])
)

| Feature | AgentExecutor | LangGraph |
|---|---|---|
| Control flow | Fixed loop (Thought → Action → Observe) | Custom graph with arbitrary loops |
| State management | Limited (memory only) | Full stateful graph with persistence |
| Multiple agents | Not supported | Multi-agent workflows |
| Human-in-the-loop | Not supported | Interrupt nodes for human approval |
| Branching | Not supported | Conditional edges, parallel nodes |
| Persistence | Memory classes only | Checkpointers (SQLite, Postgres) |
| Streaming | Basic | Per-node streaming events |
| Complexity | Simple | More boilerplate, more power |
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.memory import MemorySaver
class State(TypedDict):
    messages: Annotated[list, add_messages]
    next: str
# Nodes
def chatbot(state: State):
    response = model.invoke(state["messages"])
    return {"messages": [response]}
def should_continue(state: State):
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tools"
    return END
# Build graph
graph = StateGraph(State)
graph.add_node("agent", chatbot)
graph.add_node("tools", tool_node)
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")
# Compile with memory
checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)
# Run
result = app.invoke(
{"messages": [("human", "What is 25 * 47?")]},
config={"configurable": {"thread_id": "session-1"}}
)

| Area | Recommendation |
|---|---|
| Chain composition | Use LCEL pipe operator — never legacy Chain classes |
| Imports | Use specific packages (langchain_openai, not langchain.chat_models) |
| Memory | Use RunnableWithMessageHistory — not ConversationChain + Memory |
| Agents | Use tool_calling agents for modern models, not ReAct prompt |
| Output parsing | Use PydanticOutputParser for structured data |
| Tracing | Enable LangSmith from day one |
| Error handling | Add fallbacks for provider failover |
| RAG chunking | RecursiveCharacterTextSplitter (1000/200 default) |
| Cost control | Set max_tokens, use smaller models for simple tasks |
| Security | Never embed secrets in prompts — use environment variables |
| Legacy (Deprecated) | Modern (v0.3+) |
|---|---|
| from langchain.chat_models import ChatOpenAI | from langchain_openai import ChatOpenAI |
| from langchain.llms import OpenAI | from langchain_openai import OpenAI (legacy) |
| from langchain.embeddings import OpenAIEmbeddings | from langchain_openai import OpenAIEmbeddings |
| from langchain.vectorstores import Chroma | from langchain_chroma import Chroma |
| LLMChain(prompt=prompt, llm=model) | prompt \| model \| parser (LCEL) |
| ConversationChain(llm=model, memory=mem) | RunnableWithMessageHistory(chain, ...) |
| AgentExecutor.from_agent_and_tools(...) | AgentExecutor(agent=agent, tools=tools) |
| ConversationBufferMemory() | ChatMessageHistory() |
| VectorStoreRetrieverMemory(vectorstore) | vectorstore.as_retriever() |
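The LLMChain row above, as a concrete before/after sketch (assumes prompt, model, and StrOutputParser from earlier):

# Before (deprecated)
from langchain.chains import LLMChain
legacy_chain = LLMChain(llm=model, prompt=prompt)
result = legacy_chain.run(topic="AI")

# After (LCEL)
modern_chain = prompt | model | StrOutputParser()
result = modern_chain.invoke({"topic": "AI"})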
Run python -W always your_script.py to surface all deprecation warnings. The langchain package itself is a convenience wrapper — prefer importing from specific packages. LangChain v1.0 will remove all legacy imports.

| What | Import Statement |
|---|---|
| ChatOpenAI | from langchain_openai import ChatOpenAI |
| ChatAnthropic | from langchain_anthropic import ChatAnthropic |
| OpenAIEmbeddings | from langchain_openai import OpenAIEmbeddings |
| ChatPromptTemplate | from langchain_core.prompts import ChatPromptTemplate |
| MessagesPlaceholder | from langchain_core.prompts import MessagesPlaceholder |
| StrOutputParser | from langchain_core.output_parsers import StrOutputParser |
| JsonOutputParser | from langchain_core.output_parsers import JsonOutputParser |
| PydanticOutputParser | from langchain_core.output_parsers import PydanticOutputParser |
| SystemMessage | from langchain_core.messages import SystemMessage |
| HumanMessage | from langchain_core.messages import HumanMessage |
| AIMessage | from langchain_core.messages import AIMessage |
| ToolMessage | from langchain_core.messages import ToolMessage |
| @tool decorator | from langchain_core.tools import tool |
| RunnableParallel | from langchain_core.runnables import RunnableParallel |
| RunnablePassthrough | from langchain_core.runnables import RunnablePassthrough |
| Message history | from langchain_core.runnables.history import RunnableWithMessageHistory |
| Pattern | Code Snippet |
|---|---|
| Simple chain | prompt \| model \| StrOutputParser() |
| RAG chain | {"context": retriever, "q": RunnablePassthrough()} \| prompt \| model \| parser |
| Batch invoke | chain.batch([{"input": "A"}, {"input": "B"}]) |
| Stream | for chunk in chain.stream({...}): print(chunk) |
| Fallback | primary_chain.with_fallbacks([fallback_chain]) |
| Tool binding | model.bind_tools(tools) |
| Parallel | RunnableParallel(a=chain_a, b=chain_b) |
| Map docs | retriever \| (lambda docs: "\n".join(d.page_content for d in docs)) |
| Primitive | Description | Example |
|---|---|---|
| RunnablePassthrough | Passes input through unchanged | Used to forward query in RAG |
| RunnableParallel | Run multiple chains in parallel | RunnableParallel(summary=..., title=...) |
| RunnableLambda | Wrap any function as Runnable | RunnableLambda(lambda x: x.upper()) |
| RunnableBranch | Conditional routing | Route based on input type/content |
| RunnableWithFallbacks | Try chain A, fall back to B | primary.with_fallbacks([backup]) |
| RunnableWithMessageHistory | Add conversation memory | RunnableWithMessageHistory(chain, get_session_history) |
| RunnableConfig | Pass runtime config | config={"tags": ["prod"], "run_name": "demo"} |
| RunnableBinding | Bind default params | model.bind(temperature=0) |
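A compact sketch combining several of these primitives (the routing condition is illustrative; assumes model from earlier):

from langchain_core.runnables import RunnableBranch, RunnableLambda

upper = RunnableLambda(lambda x: x.upper())  # wrap any function as a Runnable
branch = RunnableBranch(
    (lambda x: "?" in x, RunnableLambda(lambda x: f"Question: {x}")),
    upper,  # default branch when no condition matches
)
branch.invoke("hello")         # "HELLO"
branch.invoke("how are you?")  # "Question: how are you?"
deterministic = model.bind(temperature=0)  # bind default params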
| Splitter | Best For | Key Parameters |
|---|---|---|
| RecursiveCharacterTextSplitter | Default choice (most documents) | chunk_size=1000, chunk_overlap=200 |
| TokenTextSplitter | When token count matters | chunk_size=500, chunk_overlap=50 |
| MarkdownTextSplitter | Markdown documents | Splits on headers, code blocks, lists |
| HTMLHeaderTextSplitter | HTML pages | Splits by HTML header tags |
| CodeSplitter | Source code files | Language-aware splitting (Python, JS, etc.) |
| SemanticChunker | Semantic coherence | Uses embeddings for boundary detection |
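For the Markdown row, a minimal header-aware sketch (the header labels are arbitrary metadata keys):

from langchain_text_splitters import MarkdownHeaderTextSplitter

md_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "h1"), ("##", "h2")]
)
sections = md_splitter.split_text("# Intro\nHello.\n## Setup\npip install langchain")
# Each section is a Document carrying header metadata, e.g. {"h1": "Intro"}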
Use LangSmith for debugging, LCEL for composition, and LangGraph for complex agent workflows.