The Evolution of Agent Control
LangChain’s new Middleware concept in 1.0 Alpha addresses a fundamental challenge that’s plagued agent frameworks for years: context engineering. After nearly three years of agent abstractions and hundreds of frameworks facing the same issues, the LangChain team has introduced a solution that just makes sense: a straightforward way to control what happens before and after model calls.
This approach takes previously complex features and organizes them into a clean, logical pattern. It’s practical for production use while remaining simple enough for rapid prototyping. The team’s decision to deprecate older patterns in favor of this cleaner abstraction demonstrates thoughtful engineering - the kind that makes LangChain and LangGraph reliable choices for production AI applications.
Core Architecture: The Middleware Pattern
The beauty of the Middleware pattern lies in its simplicity. Instead of managing dozens of parameters with complex interdependencies, developers now work with three clear intervention points in the agent execution flow. Even agents without tools benefit from this architecture, as middleware can modify prompts, manage context, and control flow regardless of whether tools are involved.
This sequential processing model mirrors web server middleware patterns, making it immediately intuitive for developers with web backgrounds while providing the control needed for sophisticated AI applications.
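To make the analogy concrete, here is a framework-agnostic sketch in plain Python (this is illustrative only, not the actual LangChain API): each middleware can act before the model call, rewrite the request for that call, and act after the model responds.

```python
# Conceptual sketch of the middleware execution flow (not the LangChain API):
# before_model -> modify_model_request -> model -> after_model.

def run_agent_step(middlewares, call_model, state, request):
    # 1. before_model hooks run in order; each may return a state update
    for mw in middlewares:
        update = mw.get("before_model", lambda s: None)(state)
        if update:
            state.update(update)

    # 2. modify_model_request hooks rewrite the request for this call only
    for mw in middlewares:
        request = mw.get("modify_model_request", lambda r, s: r)(request, state)

    # 3. the model call itself
    response = call_model(request)
    state["messages"].append(response)

    # 4. after_model hooks inspect the response and may update state
    for mw in middlewares:
        update = mw.get("after_model", lambda s: None)(state)
        if update:
            state.update(update)
    return state

trace = []
mw = {
    "before_model": lambda s: trace.append("before") or None,
    "modify_model_request": lambda r, s: trace.append("modify") or r,
    "after_model": lambda s: trace.append("after") or None,
}
state = run_agent_step([mw], lambda req: f"echo:{req}", {"messages": []}, "hi")
print(trace)              # ['before', 'modify', 'after']
print(state["messages"])  # ['echo:hi']
```

The hook order in the sketch mirrors the sequential model described above: state changes happen at the edges, while request rewriting stays scoped to a single call.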
The Three Middleware Hooks Explained
1. before_model Hook
Purpose: Runs before model execution, providing full control over agent state and flow.
Capabilities:
- Update agent state
- Jump to different nodes (model, tools, or end)
- Implement pre-processing logic like summarization or context management
Real-World Example - Summarization:
from langchain.agents.middleware import SummarizationMiddleware
# Automatically summarize long conversations
summarization = SummarizationMiddleware(
    model="openai:gpt-4o-mini",
    max_tokens_before_summary=4000,  # Trigger at 4k tokens
    messages_to_keep=20,             # Keep last 20 messages
    summary_prompt="Summarize the earlier context concisely.",
)
Why This Matters: This hook enables sophisticated context management without cluttering your main agent logic. The summarization example shows how you can automatically manage token limits while preserving conversation context.
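The trigger logic behind this can be sketched in plain Python (hypothetical helper names, not LangChain internals): once the estimated token count crosses the threshold, everything except the most recent messages_to_keep messages is collapsed into a single summary message.

```python
# Sketch of a summarization trigger (illustrative, not LangChain internals).

def estimate_tokens(messages):
    # Crude heuristic: roughly 4 characters per token
    return sum(len(m) for m in messages) // 4

def maybe_summarize(messages, max_tokens_before_summary, messages_to_keep, summarize):
    if estimate_tokens(messages) < max_tokens_before_summary:
        return messages  # under budget: leave the history untouched
    head, tail = messages[:-messages_to_keep], messages[-messages_to_keep:]
    return [summarize(head)] + tail  # collapse older messages into one summary

history = [f"message {i}: " + "x" * 400 for i in range(30)]
compacted = maybe_summarize(
    history,
    max_tokens_before_summary=1000,
    messages_to_keep=20,
    summarize=lambda msgs: f"[summary of {len(msgs)} earlier messages]",
)
print(len(compacted))  # 21: one summary message plus the 20 most recent messages
print(compacted[0])    # [summary of 10 earlier messages]
```

The recent tail is preserved verbatim, which is why the agent keeps short-term conversational continuity even after compaction.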
2. modify_model_request Hook
Purpose: Modifies the model request just before execution without changing persistent state.
Capabilities (can modify):
- Tools available to the model
- Prompt template
- Message list
- Model selection
- Model settings
- Output format
- Tool choice strategy
Constraints:
- Cannot update persistent state
- Cannot jump to other nodes
- Changes apply only to current request
Real-World Example - Prompt Caching:
from langchain_anthropic import ChatAnthropic
from langchain.agents.middleware.prompt_caching import AnthropicPromptCachingMiddleware
from langchain.agents import create_agent
from langchain_core.messages import HumanMessage
LONG_PROMPT = """
Please be a helpful assistant.
<Lots more context ...>
"""
agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-latest"),
    tools=[],  # Tools are required, even if empty
    prompt=LONG_PROMPT,
    middleware=[AnthropicPromptCachingMiddleware(ttl="5m")],
)
# Cache write - first call stores the system prompt in the cache
agent.invoke({"messages": [HumanMessage("Hi, my name is Bob")]})
# Cache hit - system prompt is cached
agent.invoke({"messages": [HumanMessage("What's my name?")]})
Why This Matters: This hook allows dynamic request modification without the complexity of managing multiple agent configurations. You can adjust model behavior on-the-fly based on context.
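The cache-hit behavior in the example above works because the long system prompt is a stable prefix across calls. Here is a toy model of that idea (purely illustrative, not Anthropic's implementation): the first call pays the full cost of processing the prefix, later calls with an identical prefix reuse it.

```python
import hashlib

# Toy model of prompt-prefix caching (illustrative only, not Anthropic's
# implementation): a stable system-prompt prefix is processed once and reused.

_prefix_cache = {}

def call_with_cache(system_prompt, user_message):
    key = hashlib.sha256(system_prompt.encode()).hexdigest()
    if key in _prefix_cache:
        return True   # cache hit: prefix already processed, cheaper call
    _prefix_cache[key] = True  # cache write: first call pays the full cost
    return False

print(call_with_cache("LONG PROMPT ...", "Hi, my name is Bob"))  # False (cache write)
print(call_with_cache("LONG PROMPT ...", "What's my name?"))     # True (cache hit)
```

This is also why the middleware pairs naturally with a long, unchanging prompt: any edit to the prefix changes the key and forces a fresh cache write.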
3. after_model Hook
Purpose: Executes after model completion, before tool execution (if tools are called).
Capabilities:
- Inspect model output
- Update state based on model response
- Jump to different nodes
- Implement safety checks
Real-World Example - Human-in-the-Loop:
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent
# Add human approval for specific tool calls
hitl = HumanInTheLoopMiddleware(
    interrupt_on={
        "delete_database": True,  # Require approval for this tool
        "send_email": True,       # Require approval for this tool
        "read_data": False,       # Auto-approve this tool
    }
)
agent = create_agent(
    model="openai:gpt-4o",
    prompt="You are a helpful assistant.",
    tools=[...],  # your tools here
    middleware=[hitl],
    checkpointer=InMemorySaver(),  # Required for interrupts
)
Why This Matters: This enables safety controls and human oversight without building complex state machines. The agent pauses automatically for specified tools, while allowing others to proceed without interruption. The interrupt_on parameter gives granular control over which operations require human approval.
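The routing implied by interrupt_on can be sketched as a simple lookup (plain Python, not the middleware's actual code): tools mapped to True pause for approval, tools mapped to False execute immediately.

```python
# Sketch of interrupt_on routing (illustrative, not LangChain's implementation).

def route_tool_call(tool_name, interrupt_on):
    # Assumption: unknown tools default to auto-approve here; a stricter
    # policy could default to requiring approval instead.
    if interrupt_on.get(tool_name, False):
        return "pause_for_approval"
    return "execute"

interrupt_on = {
    "delete_database": True,
    "send_email": True,
    "read_data": False,
}
print(route_tool_call("delete_database", interrupt_on))  # pause_for_approval
print(route_tool_call("read_data", interrupt_on))        # execute
```

In the real middleware, the "pause" branch is where the checkpointer comes in: the agent's state is persisted so execution can resume once a human approves or rejects the call.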
Middleware Composition Pattern
One of the most powerful aspects is combining multiple middleware components:
A typical stack chains them in order: Summarization Middleware → Caching Middleware → LLM → Human-in-the-Loop Middleware → Guardrails Middleware → Response.
Practical Implementation Guide
Note that all agents require a tools parameter (even if it’s an empty list), and middleware components often have required parameters for explicit configuration.
Basic Agent with Middleware
from langchain.agents import create_agent
from langchain.agents.middleware import (
SummarizationMiddleware,
HumanInTheLoopMiddleware,
)
from langgraph.checkpoint.memory import InMemorySaver
# Define your middleware stack
middleware_stack = [
    SummarizationMiddleware(
        model="openai:gpt-4o-mini",
        max_tokens_before_summary=3000,
        messages_to_keep=20,
    ),
    HumanInTheLoopMiddleware(
        interrupt_on={
            "deploy_tool": True,    # Require approval for deployments
            "rollback_tool": True,  # Require approval for rollbacks
            "monitor_tool": False,  # Auto-approve monitoring
        }
    ),
]
# Create agent with middleware
agent = create_agent(
    model="openai:gpt-4o",
    prompt="You are a helpful DevOps assistant.",
    tools=[deploy_tool, monitor_tool, rollback_tool],
    middleware=middleware_stack,
    checkpointer=InMemorySaver(),
)
# The agent now automatically:
# 1. Summarizes long conversations
# 2. Requests approval for critical tool calls
# 3. Maintains conversation state
Custom Middleware Creation
import time
from typing import Any, Dict
from langchain.agents.middleware import AgentMiddleware, AgentState, ModelRequest
class LoggingMiddleware(AgentMiddleware):
    """Custom middleware for comprehensive logging."""

    # Optional: declare a state schema if you add state keys
    # state_schema: AgentState

    def before_model(self, state: AgentState) -> Dict[str, Any] | None:
        # Log pre-model state
        print(f"Messages count: {len(state['messages'])}")
        return None  # Return None to make no change

    def modify_model_request(self, request: ModelRequest, state: AgentState) -> ModelRequest:
        # Add monitoring metadata (attach a timestamp to this request)
        meta = getattr(request, "metadata", None) or {}
        meta["timestamp"] = time.time()
        request.metadata = meta
        return request

    def after_model(self, state: AgentState) -> Dict[str, Any] | None:
        # Log model response
        last_message = state["messages"][-1]
        print(f"Model response: {getattr(last_message, 'content', '')[:100]}...")
        return None
Migration Strategy from Legacy Patterns
Before (Complex Parameter Management):
# Old approach with multiple parameters
agent = create_agent(
    model=dynamic_model_function,
    prompt=dynamic_prompt_function,
    pre_model_hook=summarization_hook,
    post_model_hook=guardrails_hook,
    runtime_config=config,
    # Many more interdependent parameters...
)
After (Clean Middleware Stack):
from typing import Any
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import SummarizationMiddleware, HumanInTheLoopMiddleware
from langchain.agents.middleware.types import modify_model_request
from langchain.agents.middleware import ModelRequest
@modify_model_request
def dynamic_prompt(state: AgentState, request: ModelRequest) -> ModelRequest:
    # Example: adjust the system prompt based on conversation length
    message_count = len(state["messages"])
    if message_count > 10:
        prompt = "You are in an extended conversation. Be more concise."
    else:
        prompt = "You are a helpful assistant."
    request.system_prompt = prompt
    return request
agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],  # your tools here
    middleware=[
        SummarizationMiddleware(
            model="openai:gpt-4o-mini",
            max_tokens_before_summary=3000,
        ),
        HumanInTheLoopMiddleware(
            interrupt_on={}  # Specify which tools need approval
        ),
        dynamic_prompt,  # Dynamic system prompt
    ],
)
Key Technical Constraints
When using middleware, be aware of these constraints:
- Model Parameter: Must be a string or BaseChatModel (not a function)
- Prompt Parameter: Must be a string or None (not a function)
- Legacy Hooks: Cannot use pre_model_hook or post_model_hook with middleware
- Checkpointer Required: For interrupts (like Human-in-the-Loop), you need a checkpointer
Why This Pattern Makes Sense
The middleware pattern simplifies complex agent control by:
- Separation of Concerns: Each middleware handles one specific aspect
- Composability: Mix and match middleware for different use cases
- Reusability: Share middleware across projects and teams
- Testability: Test each middleware component independently
- Maintainability: Clear, logical structure that’s easy to debug
- Flexibility: Works with tool-using agents and pure LLM agents alike
The LangChain team’s decision to embrace this pattern shows their commitment to practical, production-ready solutions. By requiring explicit parameters (like tools, even when empty) and clear configuration (like interrupt_on), they’re ensuring code is self-documenting and less prone to subtle bugs. This explicit-over-implicit approach makes it easier to build sophisticated agents that can actually ship to production.
Resources for Deep Diving
- Official Middleware Documentation
- LangChain 1.0 Alpha Release Notes
- Human-in-the-Loop Implementation
- Summarization Middleware Guide
- Anthropic Prompt Caching
Getting Started
Install LangChain 1.0 Alpha:
# Python
pip install --pre -U langchain langgraph langchain-openai langchain-anthropic
# JavaScript
npm install langchain@next
Validation & Testing
All code examples in this guide have been tested with LangChain 1.0.0a10 (September 2025). Here’s what we verified:
Test Environment
# Create virtual environment
python -m venv langchain_test
source langchain_test/bin/activate
# Install alpha versions
pip install --pre -U langchain==1.0.0a10 langgraph==1.0.0a4
Test Results ✅
All examples have been validated with actual agent creation and proper parameter configuration:
| Component | Status | Validation Notes |
|---|---|---|
| SummarizationMiddleware | ✅ Verified | Requires model parameter |
| Prompt Caching with Agent | ✅ Verified | Works with Anthropic models |
| HumanInTheLoopMiddleware | ✅ Verified | Requires interrupt_on dict |
| Basic Agent Stack | ✅ Verified | Full stack with tools works |
| Custom LoggingMiddleware | ✅ Verified | AgentMiddleware base class available |
| Migration @decorator | ✅ Verified | Decorator pattern functional |
Key API Requirements
1. Tools parameter is always required in create_agent:

# ❌ Will fail
agent = create_agent(model="gpt-4o", middleware=[...])

# ✅ Correct (even for agents without tools)
agent = create_agent(model="gpt-4o", tools=[], middleware=[...])

2. HumanInTheLoopMiddleware requires interrupt_on:

# ❌ Will fail
hitl = HumanInTheLoopMiddleware()

# ✅ Correct
hitl = HumanInTheLoopMiddleware(
    interrupt_on={"tool_name": True}  # Or {} for no tools
)

3. SummarizationMiddleware requires model:

# ❌ Will fail
summarization = SummarizationMiddleware()

# ✅ Correct
summarization = SummarizationMiddleware(
    model="openai:gpt-4o-mini",
    max_tokens_before_summary=3000,
)

4. Checkpointer is required for interrupts:

from langgraph.checkpoint.memory import InMemorySaver

agent = create_agent(..., checkpointer=InMemorySaver())
Final Thoughts
The middleware pattern demonstrates why LangChain continues to be a solid choice for AI development. It’s a practical solution that addresses real problems developers face every day. Being able to clearly define what happens before and after model calls, while modifying requests in-flight, provides the control needed for production applications without unnecessary complexity.
This is what makes LangChain and LangGraph compelling for production use: the team consistently delivers practical abstractions that balance power with simplicity. The middleware pattern shows they understand what developers actually need - clear, composable ways to control agent behavior that work in the real world.
This study guide is based on the Agent Middleware article by the LangChain team, published September 8, 2025.