LangChain

LangChain is a framework for building LLM-powered applications. It provides composable building blocks — prompt templates, chains, memory, tools, and agents — and a unified interface to dozens of LLM providers.


Why LangChain?

Without LangChain:                     With LangChain:
──────────────────────                 ─────────────────────
Manual prompt string formatting        ChatPromptTemplate with variables
Re-implement retry logic everywhere    Built-in retry / fallback
Custom parsing for each model output   Output parsers
Manual conversation history mgmt       Memory components
Custom tool execution scaffolding      Standardised tool + agent loop
No streaming abstraction               .stream() on any chain

Use LangChain when a pipeline has several steps, such as prompting, retrieval, tool calls, and output parsing. For a single API call with a simple prompt, the provider's raw SDK is enough.


Core Concepts Mindmap

mindmap
  root((LangChain))
    LCEL Pipe Syntax
      Chain components with |
      Lazy evaluation
      Built-in streaming
      Parallel execution
      Batch support
    Prompt Templates
      ChatPromptTemplate
      SystemMessage
      HumanMessage
      AIMessage
      MessagesPlaceholder
      Variable substitution
    Models
      ChatOpenAI
      ChatAnthropic
      ChatGoogleGenerativeAI
      Ollama local models
      .bind for model config
    Output Parsers
      StrOutputParser
      JsonOutputParser
      PydanticOutputParser
      CommaSeparatedListOutputParser
    Memory
      ConversationBufferMemory
      ConversationSummaryMemory
      ConversationBufferWindowMemory
      VectorStoreRetrieverMemory
    Tools
      Custom Python functions
      DuckDuckGoSearchRun
      WikipediaQueryRun
      PythonREPL
      Structured tool with schema
    Agents
      ReAct agent
      Tool calling agent
      create_react_agent
      AgentExecutor

LCEL — LangChain Expression Language

LCEL is the modern way to compose LangChain components. Chains are built with the | (pipe) operator, and each component's output becomes the next component's input.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o", temperature=0)
parser = StrOutputParser()

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise technical assistant."),
    ("human", "{question}"),
])

# Build the chain with | operator
chain = prompt | llm | parser

# Invoke
result = chain.invoke({"question": "What is a context window?"})

# Stream token by token
for chunk in chain.stream({"question": "Explain embeddings"}):
    print(chunk, end="", flush=True)

# Batch multiple inputs
results = chain.batch([
    {"question": "What is RAG?"},
    {"question": "What is a vector database?"},
])
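
To see why the | operator can chain components, here is a minimal pure-Python sketch of the idea. It is not LangChain's implementation: the Step class and the toy prompt, fake_llm, and parser objects are hypothetical stand-ins, and the only assumption is that every component exposes invoke and overloads __or__.

```python
from typing import Any, Callable, Iterable


class Step:
    """Hypothetical stand-in for a runnable component (not a LangChain class)."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new Step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def batch(self, values: Iterable[Any]) -> list:
        # Sequential here to keep the sketch short
        return [self.invoke(v) for v in values]


# Toy components standing in for a prompt, a model, and a parser
prompt = Step(lambda d: f"Q: {d['question']}")
fake_llm = Step(lambda text: {"content": text.upper()})
parser = Step(lambda msg: msg["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"question": "What is RAG?"}))  # → Q: WHAT IS RAG?
```

Because the composed object is itself a Step, it can be piped into further steps, which is the property that makes long chains read left to right.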

LCEL Parallel Execution

from langchain_core.runnables import RunnableParallel

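# Run two chains concurrently on the same input.
# summary_chain and key_points_chain are assumed to be LCEL chains defined
# elsewhere, each expecting a {"text": ...} input.
parallel_chain = RunnableParallel({
    "summary":    summary_chain,
    "key_points": key_points_chain,
})

# long_article is assumed to be a str holding the text to analyse
result = parallel_chain.invoke({"text": long_article})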
# result = {"summary": "...", "key_points": [...]}
# Both chains ran at the same time ✅
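
The fan-out pattern can be imitated in plain Python. This is a rough sketch of the idea rather than LangChain's code: it assumes thread-based concurrency, and run_parallel, summarise, and key_points are hypothetical names standing in for RunnableParallel and the two chains.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def run_parallel(steps: Dict[str, Callable[[Any], Any]], value: Any) -> Dict[str, Any]:
    """Call every step with the same input and collect the results by key."""
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = {key: pool.submit(fn, value) for key, fn in steps.items()}
        return {key: future.result() for key, future in futures.items()}


# Hypothetical stand-ins for summary_chain and key_points_chain
def summarise(d: dict) -> str:
    return d["text"].split(".")[0] + "."


def key_points(d: dict) -> list:
    return [s.strip() for s in d["text"].split(".") if s.strip()]


result = run_parallel(
    {"summary": summarise, "key_points": key_points},
    {"text": "LCEL composes runnables. Branches share one input."},
)
print(result["summary"])     # → LCEL composes runnables.
print(result["key_points"])  # → ['LCEL composes runnables', 'Branches share one input']
```

Each branch receives the full input dictionary, so branches cannot see one another's output; combine them in a later step if needed.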

Prompt Templates

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import SystemMessage, HumanMessage

# Simple template with variable substitution
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}. Answer in {language}."),
    ("human", "{question}"),
])

formatted = prompt.format_messages(
    role="senior DevOps engineer",
    language="simple English",
    question="What is Kubernetes?"
)

# Template with chat history (for conversational chains)
prompt_with_history = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("chat_history"),   # ← injects history list here
    ("human", "{input}"),
])
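
As a mental model for what the history placeholder does, the following pure-Python sketch splices a list of earlier messages into a template at a marked position. The HISTORY marker and the format_messages function are hypothetical and only mimic the behaviour described above; they are not part of LangChain.

```python
from typing import List, Tuple, Union

Message = Tuple[str, str]   # (role, content)
HISTORY = object()          # hypothetical marker playing the role of MessagesPlaceholder


def format_messages(template: List[Union[Message, object]],
                    history: List[Message],
                    **variables: str) -> List[Message]:
    messages: List[Message] = []
    for item in template:
        if item is HISTORY:
            messages.extend(history)  # earlier turns are inserted unchanged
        else:
            role, text = item
            messages.append((role, text.format(**variables)))
    return messages


template = [
    ("system", "You are a helpful assistant."),
    HISTORY,
    ("human", "{input}"),
]
history = [("human", "Hi, I'm Ada."), ("ai", "Hello Ada!")]

messages = format_messages(template, history, input="What is my name?")
for role, content in messages:
    print(f"{role}: {content}")
```

The model therefore sees the system message first, then every earlier turn in order, and the new user input last.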

Memory

Memory lets a chain "remember" previous turns in a conversation. Different types trade off between completeness and token efficiency.

ConversationBufferMemory
  Stores every message verbatim
  ✅ Perfectly accurate
  ❌ Grows unboundedly → hits context limit eventually
  Best for: short conversations

ConversationBufferWindowMemory(k=5)
  Keeps only the last k turns
  ✅ Bounded token usage
  ❌ Loses old context
  Best for: most chatbots

ConversationSummaryMemory
  Summarises older turns using an LLM
  ✅ Bounded + keeps gist of old context
  ❌ Extra LLM calls for summarisation
  Best for: long conversations where full history matters

ConversationSummaryBufferMemory(max_token_limit=2000)
  Keeps recent turns verbatim, summarises older ones
  ✅ Best of both worlds
  Best for: production chatbots
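
The window strategy is the easiest of these to picture in code. Below is a toy sketch, assuming a turn is a (human, ai) pair; WindowMemory is a hypothetical class written for illustration, not the LangChain one.

```python
from collections import deque
from typing import Deque, List, Tuple

Turn = Tuple[str, str]  # (human message, AI reply)


class WindowMemory:
    """Toy window memory: remembers only the last k turns (hypothetical class)."""

    def __init__(self, k: int):
        self.turns: Deque[Turn] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))  # the oldest turn is dropped once full

    def load(self) -> List[Turn]:
        return list(self.turns)


memory = WindowMemory(k=2)
for i in range(1, 5):
    memory.save(f"question {i}", f"answer {i}")

print(memory.load())
# → [('question 3', 'answer 3'), ('question 4', 'answer 4')]
```

Token usage stays bounded by k, at the price of forgetting turns 1 and 2 entirely; the summary variants replace that hard cut-off with a condensed version of the dropped turns.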

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Modern approach: RunnableWithMessageHistory.
# The wrapped chain needs a prompt with an {input} variable and a
# "chat_history" placeholder, such as prompt_with_history defined earlier.
chain = prompt_with_history | llm | parser

store = {}  # session_id → ChatMessageHistory

def get_session_history(session_id: str) -> ChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

# Each call with same session_id shares history
chain_with_history.invoke(
    {"input": "What is LangChain?"},
    config={"configurable": {"session_id": "user-42"}}
)
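
Conceptually, the history wrapper does three things on every call: look up the session's messages, pass them to the chain together with the new input, and append the new exchange afterwards. The sketch below imitates that flow in plain Python. All names (with_history, get_history, echo_chain) are hypothetical, and the chain is a stub rather than a model call.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]             # (role, content)
store: Dict[str, List[Message]] = {}  # session_id -> message list


def get_history(session_id: str) -> List[Message]:
    return store.setdefault(session_id, [])


def with_history(chain: Callable[[dict], str]) -> Callable[[str, str], str]:
    """Wrap a chain so each call loads, then extends, the session's history."""
    def call(user_input: str, session_id: str) -> str:
        history = get_history(session_id)
        reply = chain({"input": user_input, "chat_history": list(history)})
        history.append(("human", user_input))
        history.append(("ai", reply))
        return reply
    return call


# Hypothetical chain: reports how many earlier messages it was given
def echo_chain(inputs: dict) -> str:
    return f"{len(inputs['chat_history'])} earlier messages; you said: {inputs['input']}"


chat = with_history(echo_chain)
first = chat("What is LangChain?", "user-42")
second = chat("And LCEL?", "user-42")
other = chat("Hello", "user-7")
print(second)  # → 2 earlier messages; you said: And LCEL?
print(other)   # → 0 earlier messages; you said: Hello
```

Sessions are isolated by their id, so the second user starts with an empty history. An in-memory dict like this is lost on restart; a persistent store would be needed for real deployments.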

Tools

Tools give the LLM the ability to interact with the world — search the web, run code, call APIs, query databases.

from langchain_core.tools import tool
from pydantic import BaseModel

# Simple tool — decorate any function
@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # Call a real weather API here; this stub returns a canned value
    return f"{city}: 18°C, rainy"

# Tool with structured input schema
class StockInput(BaseModel):
    ticker: str
    date: str

@tool(args_schema=StockInput)
def get_stock_price(ticker: str, date: str) -> float:
    """Get the closing stock price for a ticker on a date."""
    return 182.50

# Bind tools to the LLM: the model can now request these tools, but it does not run them itself
llm_with_tools = ChatOpenAI(model="gpt-4o").bind_tools([get_weather, get_stock_price])
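
Binding tools only tells the model which tools exist. When the model decides to use one, its reply carries a tool call request, and something on your side has to execute it. The following sketch shows one way that dispatch step could look. It assumes each request is a dict with name, args, and id keys, and REGISTRY and run_tool_calls are hypothetical helpers written for this example.

```python
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"{city}: 18°C, rainy"  # canned value


def get_stock_price(ticker: str, date: str) -> float:
    """Get the closing stock price for a ticker on a date."""
    return 182.50  # canned value


REGISTRY: Dict[str, Callable[..., Any]] = {
    fn.__name__: fn for fn in (get_weather, get_stock_price)
}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Execute each requested call and pair the result with its call id."""
    results = []
    for call in tool_calls:
        fn = REGISTRY.get(call["name"])
        if fn is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = fn(**call["args"])
        results.append({"tool_call_id": call["id"], "content": str(output)})
    return results


# Assumed shape of the requests a tool-calling model hands back
requested = [
    {"name": "get_weather", "args": {"city": "Tokyo"}, "id": "call_1"},
    {"name": "get_stock_price", "args": {"ticker": "AAPL", "date": "2024-01-02"}, "id": "call_2"},
]
results = run_tool_calls(requested)
print(results[0]["content"])  # → Tokyo: 18°C, rainy
```

The results would then be sent back to the model as tool messages so it can write its final answer. An agent executor automates this request, execute, and reply cycle.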

Agents

Agents let the LLM decide which tools to use and in what order, rather than following a fixed pipeline. The LLM is the decision-maker.

from langchain import hub
from langchain.agents import create_react_agent, AgentExecutor
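from langchain_community.tools import DuckDuckGoSearchRun, WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

tools = [
    DuckDuckGoSearchRun(),
    WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper()),  # the API wrapper is required
    get_weather,
]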

# Pull the ReAct prompt from LangChain Hub
react_prompt = hub.pull("hwchase17/react")

# Create agent (the reasoning loop)
agent = create_react_agent(llm, tools, react_prompt)

# Wrap in executor (handles the loop + error recovery)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,          # Prints Thought/Action/Observation
    max_iterations=10,     # Prevent infinite loops
    handle_parsing_errors=True,
)

result = agent_executor.invoke({
    "input": "What's the weather in the capital of Japan?"
})

What Happens Under the Hood

User: "What's the weather in the capital of Japan?"

Thought: I need to find the capital of Japan first.
Action: wikipedia
Action Input: capital of Japan
Observation: "Tokyo is the capital of Japan..."

Thought: Now I have the capital. I can check the weather.
Action: get_weather
Action Input: Tokyo
Observation: "Tokyo: 18°C, rainy"

Thought: I have all the info I need.
Final Answer: The capital of Japan is Tokyo, and the current weather is 18°C with rain.
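
The loop behind such a trace can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not AgentExecutor's code: the model is replaced by a scripted fake_llm, the reply format is assumed to use "Action:" and "Action Input:" lines, and TOOLS, SCRIPT, and run_agent are hypothetical names.

```python
import re
from typing import Callable, Dict

TOOLS: Dict[str, Callable[[str], str]] = {
    "wikipedia": lambda q: "Tokyo is the capital of Japan...",
    "get_weather": lambda city: f"{city}: 18°C, rainy",
}

# Scripted replies standing in for a real model (hypothetical)
SCRIPT = iter([
    "Thought: I need the capital first.\nAction: wikipedia\nAction Input: capital of Japan",
    "Thought: Now check the weather.\nAction: get_weather\nAction Input: Tokyo",
    "Thought: I have all the info.\nFinal Answer: Tokyo is 18°C and rainy.",
])


def fake_llm(scratchpad: str) -> str:
    return next(SCRIPT)


def run_agent(question: str, llm: Callable[[str], str], max_iterations: int = 10) -> str:
    scratchpad = f"Question: {question}\n"
    for _ in range(max_iterations):
        reply = llm(scratchpad)
        final = re.search(r"Final Answer:\s*(.*)", reply, re.DOTALL)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*?)\nAction Input:\s*(.*)", reply, re.DOTALL)
        if not action:
            # Feed the problem back to the model instead of crashing
            scratchpad += reply + "\nObservation: could not parse an action\n"
            continue
        name, arg = action.group(1).strip(), action.group(2).strip()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"{name} is not a valid tool"
        scratchpad += f"{reply}\nObservation: {observation}\n"
    return "Stopped: iteration limit reached"


answer = run_agent("What's the weather in the capital of Japan?", fake_llm)
print(answer)  # → Tokyo is 18°C and rainy.
```

The growing scratchpad is what lets the model build on earlier observations, and the iteration cap plays the same protective role as max_iterations in the executor configuration.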

Output Parsers

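from langchain_core.output_parsers import (
    StrOutputParser,
    JsonOutputParser,
    PydanticOutputParser,
    CommaSeparatedListOutputParser,
)
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

# String (default)
chain = prompt | llm | StrOutputParser()

# JSON: parse directly to dict
chain = prompt | llm | JsonOutputParser()

# Pydantic: typed, validated output
class Review(BaseModel):
    sentiment: str = Field(description="POSITIVE, NEGATIVE, or NEUTRAL")
    score: int = Field(description="Score from 1-10")
    summary: str = Field(description="One sentence summary")

# PydanticOutputParser returns a Review instance;
# JsonOutputParser(pydantic_object=Review) would return a plain dict
parser = PydanticOutputParser(pydantic_object=Review)

# The prompt has to tell the model which format to produce
review_prompt = ChatPromptTemplate.from_messages([
    ("system", "Analyse the product review.\n{format_instructions}"),
    ("human", "{review}"),
]).partial(format_instructions=parser.get_format_instructions())

chain = review_prompt | llm | parser
result = chain.invoke({"review": "Great product, fast shipping!"})
# result is a typed Review object ✅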
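
What a parser does to raw model text can be approximated with the standard library. The sketch below is a simplified illustration, not the LangChain parsers themselves: parse_json, parse_review, and parse_comma_list are hypothetical helpers, and a dataclass stands in for the Pydantic model.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Review:
    sentiment: str
    score: int
    summary: str


def parse_json(text: str) -> dict:
    """Pull the JSON object out of model output, tolerating surrounding chatter."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))


def parse_review(text: str) -> Review:
    review = Review(**parse_json(text))  # TypeError on missing or extra keys
    if review.sentiment not in {"POSITIVE", "NEGATIVE", "NEUTRAL"}:
        raise ValueError(f"unexpected sentiment: {review.sentiment}")
    if not 1 <= int(review.score) <= 10:
        raise ValueError(f"score out of range: {review.score}")
    return review


def parse_comma_list(text: str) -> list:
    return [item.strip() for item in text.split(",") if item.strip()]


raw = 'Sure, here it is: {"sentiment": "POSITIVE", "score": 9, "summary": "Fast shipping."} Hope that helps.'
review = parse_review(raw)
print(review.score)                         # → 9
print(parse_comma_list("red, green ,blue"))  # → ['red', 'green', 'blue']
```

Failing loudly on malformed output is the point: a validation error at the parser is easier to handle (for example by retrying) than a wrong value discovered later in the pipeline.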

Quick Reference

LCEL chain syntax:
  chain = prompt | llm | parser
  chain.invoke({})    → single result
  chain.stream({})    → token-by-token generator
  chain.batch([{}])   → list of results, parallel

Memory types by use case:
  Short chat    → ConversationBufferMemory
  Most chatbots → ConversationBufferWindowMemory(k=5)
  Long sessions → ConversationSummaryBufferMemory(max_token_limit=2000)

Agent components:
  Tools          → what the agent can do
  Prompt         → how the agent reasons (ReAct format)
  LLM            → what decides next action
  AgentExecutor  → runs the Thought→Action→Observation loop

Common chains:
  prompt | llm | parser              → basic chain
  retriever | format | prompt | llm  → RAG chain
  RunnableParallel({a: c1, b: c2})   → run two chains in parallel
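
The RAG line in the list above is schematic, so here is a toy walk-through of the same four stages in plain Python. Everything in it is a stand-in chosen for illustration: the retriever ranks three hard-coded documents by shared words instead of querying a vector store, and fake_llm does not call a model.

```python
from typing import List

DOCS = [
    "LCEL composes runnables with the pipe operator.",
    "A vector database stores embeddings for similarity search.",
    "Agents pick tools at run time.",
]


def retriever(question: str, k: int = 1) -> List[str]:
    """Toy retriever: rank documents by the number of shared lowercase words."""
    q_words = set(question.lower().split())
    ranked = sorted(
        DOCS,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def format_docs(docs: List[str]) -> str:
    return "\n".join(docs)


def build_prompt(context: str, question: str) -> str:
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def fake_llm(prompt_text: str) -> str:
    return f"[model reply to a {len(prompt_text)}-character prompt]"


question = "What is a vector database?"
context = format_docs(retriever(question))
prompt_text = build_prompt(context, question)
reply = fake_llm(prompt_text)
print(context)  # → A vector database stores embeddings for similarity search.
```

Note that the prompt needs both the retrieved context and the original question, so in a real chain the question is usually passed alongside the retriever output rather than replaced by it.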