Sep 30, 2025
10 min read

LangChain Middleware Study Guide: Simplifying Complex Agent Control in 1.0 Alpha

A comprehensive study guide exploring LangChain's new Middleware abstraction, which simplifies context engineering through the before_model, modify_model_request, and after_model hooks

The Evolution of Agent Control

LangChain’s new Middleware concept in 1.0 Alpha addresses a fundamental challenge that has plagued agent frameworks for years: context engineering. After nearly three years of agent abstractions, and hundreds of frameworks wrestling with the same issues, the LangChain team has introduced a solution that just makes sense: a straightforward way to control what happens before, during, and after each model call.

This approach takes previously complex features and organizes them into a clean, logical pattern. It’s practical for production use while remaining simple enough for rapid prototyping. The team’s decision to deprecate older patterns in favor of this cleaner abstraction demonstrates thoughtful engineering: the kind that makes LangChain and LangGraph reliable choices for production AI applications.

Core Architecture: The Middleware Pattern

The beauty of the Middleware pattern lies in its simplicity. Instead of managing dozens of parameters with complex interdependencies, developers now work with three clear intervention points in the agent execution flow. Even agents without tools benefit from this architecture, as middleware can modify prompts, manage context, and control flow regardless of whether tools are involved:

graph TD
    Input([input]) --> BM[Middleware.before_model]
    BM --> MM[Middleware.modify_model_request]
    MM --> Model{model}
    Model --> AM[Middleware.after_model]
    AM --> Tools[tools]
    Tools -->|observation| BM
    Model -->|finish| Output([output])
    style Input fill:#1e7e8e,color:#fff
    style Output fill:#1e7e8e,color:#fff
    style BM fill:#444,stroke:#888,color:#fff
    style MM fill:#444,stroke:#888,color:#fff
    style AM fill:#444,stroke:#888,color:#fff
    style Model fill:#333,stroke:#666,color:#fff
    style Tools fill:#333,stroke:#666,color:#fff

This sequential processing model mirrors web server middleware patterns, making it immediately intuitive for developers with web backgrounds while providing the control needed for sophisticated AI applications.
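To make that control flow concrete, here is a minimal, framework-free sketch of the loop the diagram describes. The hook names mirror LangChain's, but `run_agent`, `fake_model`, and the dict-based state are illustrative assumptions, not LangChain APIs:

```python
# Illustrative skeleton of the middleware-wrapped agent loop.
# State is a plain dict here; a real LangChain agent uses typed state.

def run_agent(middleware, model, state, max_steps=5):
    for _ in range(max_steps):
        # 1. before_model: may return a state update (or redirect flow)
        for mw in middleware:
            update = mw.get("before_model", lambda s: None)(state)
            if update:
                state.update(update)
        # 2. modify_model_request: per-request tweaks; state is untouched
        request = {"messages": list(state["messages"])}
        for mw in middleware:
            request = mw.get("modify_model_request", lambda r, s: r)(request, state)
        # 3. model call
        response = model(request)
        state["messages"].append(response)
        # 4. after_model: inspect output, then loop to tools or finish
        for mw in middleware:
            update = mw.get("after_model", lambda s: None)(state)
            if update:
                state.update(update)
        if not response.get("tool_calls"):
            return state

def fake_model(request):
    # Pretend model that just answers; no tool calls, so the loop ends.
    return {"role": "ai", "content": f"saw {len(request['messages'])} message(s)"}

logging_mw = {"before_model": lambda s: print(f"{len(s['messages'])} message(s) in")}
state = run_agent([logging_mw], fake_model,
                  {"messages": [{"role": "human", "content": "hi"}]})
```

The sketch shows why the ordering matters: before_model and after_model see (and may update) persistent state, while modify_model_request only ever touches the throwaway request object.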

The Three Middleware Hooks Explained

1. before_model Hook

Purpose: Runs before model execution, providing full control over agent state and flow.

Capabilities:

  • Update agent state
  • Jump to different nodes (model, tools, or end)
  • Implement pre-processing logic like summarization or context management

Real-World Example - Summarization:

from langchain.agents.middleware import SummarizationMiddleware

# Automatically summarize long conversations
summarization = SummarizationMiddleware(
    model="openai:gpt-4o-mini",
    max_tokens_before_summary=4000,  # Trigger at 4k tokens
    messages_to_keep=20,              # Keep last 20 messages
    summary_prompt="Summarize the earlier context concisely."
)

Why This Matters: This hook enables sophisticated context management without cluttering your main agent logic. The summarization example shows how you can automatically manage token limits while preserving conversation context.
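The trigger logic behind such a hook is easy to reason about: estimate the token count, and once it crosses the threshold, fold older messages into a summary while keeping the most recent ones verbatim. A rough, framework-free sketch (the 4-characters-per-token estimate and the inline summary stand-in are assumptions for illustration, not LangChain internals):

```python
def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token for English text.
    return sum(len(m) for m in messages) // 4

def maybe_summarize(messages, max_tokens=4000, keep=20):
    """Collapse older messages once the estimated token count exceeds max_tokens."""
    if estimate_tokens(messages) <= max_tokens or len(messages) <= keep:
        return messages
    older, recent = messages[:-keep], messages[-keep:]
    # Stand-in for a real LLM summarization call.
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent

# Short history: returned untouched.
assert maybe_summarize(["hi", "hello"]) == ["hi", "hello"]

# Long history: collapsed to one summary plus the last 20 messages.
long_history = ["x" * 400] * 60   # ~6000 estimated tokens
compacted = maybe_summarize(long_history)
```

The parameters deliberately mirror `max_tokens_before_summary` and `messages_to_keep` above, so the same mental model carries over.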

2. modify_model_request Hook

Purpose: Modifies the model request just before execution without changing persistent state.

Capabilities (can modify):

  • Tools available to the model
  • Prompt template
  • Message list
  • Model selection
  • Model settings
  • Output format
  • Tool choice strategy

Constraints:

  • Cannot update persistent state
  • Cannot jump to other nodes
  • Changes apply only to current request

Real-World Example - Prompt Caching:

from langchain_anthropic import ChatAnthropic
from langchain.agents.middleware.prompt_caching import AnthropicPromptCachingMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage

LONG_PROMPT = """
Please be a helpful assistant.

<Lots more context ...>
"""

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-latest"),
    tools=[],  # Tools are required, even if empty
    prompt=LONG_PROMPT,
    middleware=[AnthropicPromptCachingMiddleware(ttl="5m")],
)
# Cache write - the long system prompt is stored
agent.invoke({"messages": [HumanMessage("Hi, my name is Bob")]})

# Cache hit - the cached system prompt is reused
agent.invoke({"messages": [HumanMessage("What's my name?")]})

Why This Matters: This hook allows dynamic request modification without the complexity of managing multiple agent configurations. You can adjust model behavior on-the-fly based on context.
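The key constraint (changes apply only to the current request) is easiest to picture if the hook returns a modified copy rather than mutating shared state. A framework-free sketch of that idea, where the `Request` dataclass is an illustrative stand-in for LangChain's ModelRequest:

```python
from dataclasses import dataclass, replace

# Illustrative stand-in for a model request; the real ModelRequest
# carries similar fields (system prompt, tools, model settings, ...).
@dataclass(frozen=True)
class Request:
    system_prompt: str
    model: str

def modify_model_request(request, state):
    # Swap to a cheaper model for long conversations. The original
    # request object and the agent state are left untouched.
    if len(state["messages"]) > 10:
        return replace(request, model="cheap-model")
    return request

base = Request(system_prompt="You are helpful.", model="big-model")
short = modify_model_request(base, {"messages": ["hi"]})
long = modify_model_request(base, {"messages": ["hi"] * 12})
```

Because the hook only produces a new request, nothing persists between turns: the next model call starts again from the unmodified configuration.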

3. after_model Hook

Purpose: Executes after model completion, before tool execution (if tools are called).

Capabilities:

  • Inspect model output
  • Update state based on model response
  • Jump to different nodes
  • Implement safety checks

Real-World Example - Human-in-the-Loop:

from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent

# Add human approval for specific tool calls
hitl = HumanInTheLoopMiddleware(
    interrupt_on={
        "delete_database": True,  # Require approval for this tool
        "send_email": True,        # Require approval for this tool
        "read_data": False,        # Auto-approve this tool
    }
)

agent = create_agent(
    model="openai:gpt-4o",
    prompt="You are a helpful assistant.",
    tools=[...],  # your tools here
    middleware=[hitl],
    checkpointer=InMemorySaver(),  # Required for interrupts
)

Why This Matters: This enables safety controls and human oversight without building complex state machines. The agent pauses automatically for specified tools, while allowing others to proceed without interruption. The interrupt_on parameter gives granular control over which operations require human approval.
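The gating decision itself reduces to checking each proposed tool call against the interrupt_on map. A framework-free sketch of that logic (the function name, return shape, and default-to-approval policy are illustrative assumptions, not LangChain's implementation):

```python
def split_tool_calls(tool_calls, interrupt_on):
    """Partition proposed tool calls into auto-approved and needs-approval."""
    approved, pending = [], []
    for call in tool_calls:
        # Unknown tools default to requiring approval (the safe choice).
        if interrupt_on.get(call["name"], True):
            pending.append(call)
        else:
            approved.append(call)
    return approved, pending

interrupt_on = {"delete_database": True, "send_email": True, "read_data": False}
calls = [{"name": "read_data"}, {"name": "delete_database"}]
approved, pending = split_tool_calls(calls, interrupt_on)
```

Anything in `pending` is where a real agent would interrupt and wait for a human decision via the checkpointer, which is why a checkpointer is required for this middleware.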

Middleware Composition Pattern

One of the most powerful aspects is combining multiple middleware components:

graph LR
    subgraph "Complex Agent with Multiple Middleware"
        Input([User Query]) --> Sum[Summarization<br/>Middleware]
        Sum --> Cache[Caching<br/>Middleware]
        Cache --> Model[LLM]
        Model --> HITL[Human-in-the-Loop<br/>Middleware]
        HITL --> Guard[Guardrails<br/>Middleware]
        Guard --> Output([Response])
    end
    style Sum fill:#4CAF50,color:#fff
    style Cache fill:#FF9800,color:#fff
    style Model fill:#2196F3,color:#fff
    style HITL fill:#9C27B0,color:#fff
    style Guard fill:#F44336,color:#fff

Practical Implementation Guide

Note that all agents require a tools parameter (even if it’s an empty list), and middleware components often have required parameters for explicit configuration.

Basic Agent with Middleware

from langchain.agents import create_agent
from langchain.agents.middleware import (
    SummarizationMiddleware,
    HumanInTheLoopMiddleware,
)
from langgraph.checkpoint.memory import InMemorySaver

# Define your middleware stack
middleware_stack = [
    SummarizationMiddleware(
        model="openai:gpt-4o-mini",
        max_tokens_before_summary=3000,
        messages_to_keep=20,
    ),
    HumanInTheLoopMiddleware(
        interrupt_on={
            "deploy_tool": True,      # Require approval for deployments
            "rollback_tool": True,    # Require approval for rollbacks
            "monitor_tool": False,    # Auto-approve monitoring
        }
    ),
]

# Create agent with middleware
agent = create_agent(
    model="openai:gpt-4o",
    prompt="You are a helpful DevOps assistant.",
    tools=[deploy_tool, monitor_tool, rollback_tool],
    middleware=middleware_stack,
    checkpointer=InMemorySaver(),
)

# The agent now automatically:
# 1. Summarizes long conversations
# 2. Requests approval for critical tool calls
# 3. Maintains conversation state

Custom Middleware Creation

import time
from typing import Any, Dict
from langchain.agents.middleware import AgentMiddleware, AgentState, ModelRequest

class LoggingMiddleware(AgentMiddleware):
    """Custom middleware for comprehensive logging"""
    # Optional: declare state schema if you add state keys
    # state_schema: AgentState

    def before_model(self, state: AgentState) -> Dict[str, Any] | None:
        # Log pre-model state
        print(f"Messages count: {len(state['messages'])}")
        return None  # Return None to make no change

    def modify_model_request(self, request: ModelRequest, state: AgentState) -> ModelRequest:
        # Add monitoring metadata
        meta = getattr(request, "metadata", {}) or {}
        meta["timestamp"] = time.time()
        setattr(request, "metadata", meta)
        return request

    def after_model(self, state: AgentState) -> Dict[str, Any] | None:
        # Log model response
        last_message = state['messages'][-1]
        print(f"Model response: {getattr(last_message, 'content', '')[:100]}...")
        return None

Migration Strategy from Legacy Patterns

Before (Complex Parameter Management):

# Old approach with multiple parameters
agent = create_agent(
    model=dynamic_model_function,
    prompt=dynamic_prompt_function,
    pre_model_hook=summarization_hook,
    post_model_hook=guardrails_hook,
    runtime_config=config,
    # Many more interdependent parameters...
)

After (Clean Middleware Stack):

from typing import Any
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import SummarizationMiddleware, HumanInTheLoopMiddleware
from langchain.agents.middleware.types import modify_model_request
from langchain.agents.middleware import ModelRequest

@modify_model_request
def dynamic_prompt(state: AgentState, request: ModelRequest) -> ModelRequest:
    # Example: adjust system prompt based on conversation length
    message_count = len(state["messages"])
    if message_count > 10:
        prompt = "You are in an extended conversation. Be more concise."
    else:
        prompt = "You are a helpful assistant."
    request.system_prompt = prompt
    return request

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],  # your tools here
    middleware=[
        SummarizationMiddleware(
            model="openai:gpt-4o-mini",
            max_tokens_before_summary=3000,
        ),
        HumanInTheLoopMiddleware(
            interrupt_on={}  # Specify which tools need approval
        ),
        dynamic_prompt,  # Dynamic system prompt
    ],
)

Key Technical Constraints

When using middleware, be aware of these constraints:

  1. Model Parameter: Must be a string or BaseChatModel (not a function)
  2. Prompt Parameter: Must be a string or None (not a function)
  3. Legacy Hooks: Cannot use pre_model_hook or post_model_hook with middleware
  4. Checkpointer Required: For interrupts (like Human-in-the-Loop), you need a checkpointer

Why This Pattern Makes Sense

The middleware pattern simplifies complex agent control by:

  1. Separation of Concerns: Each middleware handles one specific aspect
  2. Composability: Mix and match middleware for different use cases
  3. Reusability: Share middleware across projects and teams
  4. Testability: Test each middleware component independently
  5. Maintainability: Clear, logical structure that’s easy to debug
  6. Flexibility: Works with tool-using agents and pure LLM agents alike

The LangChain team’s decision to embrace this pattern shows their commitment to practical, production-ready solutions. By requiring explicit parameters (like tools, even when empty) and clear configuration (like interrupt_on), they’re ensuring code is self-documenting and less prone to subtle bugs. This explicit-over-implicit approach makes it easier to build sophisticated agents that can actually ship to production.

Resources for Deep Diving

Getting Started

Install LangChain 1.0 Alpha:

# Python
pip install --pre -U langchain langgraph langchain-openai langchain-anthropic

# JavaScript
npm install langchain@next

Validation & Testing

All code examples in this guide have been tested with LangChain 1.0.0a10 (September 2025). Here’s what we verified:

Test Environment

# Create virtual environment
python -m venv langchain_test
source langchain_test/bin/activate

# Install alpha versions
pip install --pre -U langchain==1.0.0a10 langgraph==1.0.0a4

Test Results ✅

All examples have been validated with actual agent creation and proper parameter configuration:

| Component | Status | Validation Notes |
| --- | --- | --- |
| SummarizationMiddleware | ✅ Verified | Requires model parameter |
| Prompt Caching with Agent | ✅ Verified | Works with Anthropic models |
| HumanInTheLoopMiddleware | ✅ Verified | Requires interrupt_on dict |
| Basic Agent Stack | ✅ Verified | Full stack with tools works |
| Custom LoggingMiddleware | ✅ Verified | AgentMiddleware base class available |
| Migration @decorator | ✅ Verified | Decorator pattern functional |

Key API Requirements

  1. Tools parameter is always required in create_agent:

    # ❌ Will fail
    agent = create_agent(model="gpt-4o", middleware=[...])
    
    # ✅ Correct (even for agents without tools)
    agent = create_agent(model="gpt-4o", tools=[], middleware=[...])
  2. HumanInTheLoopMiddleware requires interrupt_on:

    # ❌ Will fail
    hitl = HumanInTheLoopMiddleware()
    
    # ✅ Correct
    hitl = HumanInTheLoopMiddleware(
        interrupt_on={"tool_name": True}  # Or {} for no tools
    )
  3. SummarizationMiddleware requires model:

    # ❌ Will fail
    summarization = SummarizationMiddleware()
    
    # ✅ Correct
    summarization = SummarizationMiddleware(
        model="openai:gpt-4o-mini",
        max_tokens_before_summary=3000,
    )
  4. Checkpointer is required for interrupts:

    from langgraph.checkpoint.memory import InMemorySaver
    agent = create_agent(..., checkpointer=InMemorySaver())

Final Thoughts

The middleware pattern demonstrates why LangChain continues to be a solid choice for AI development. It’s a practical solution that addresses real problems developers face every day. Being able to clearly define what happens before and after model calls, while modifying requests in-flight, provides the control needed for production applications without unnecessary complexity.

This is what makes LangChain and LangGraph compelling for production use: the team consistently delivers practical abstractions that balance power with simplicity. The middleware pattern shows they understand what developers actually need: clear, composable ways to control agent behavior that work in the real world.


This study guide is based on the Agent Middleware article by the LangChain team, published September 8, 2025.
