LangGraph Integration

Per-invocation mocking via RunnableConfig: no environment variables, no global state, and safe under concurrent invocations.

StuntDouble uses LangGraph’s native ToolNode with an awrap_tool_call wrapper for per-invocation mocking.

Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                         LangGraph Per-Invocation Flow                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   graph.invoke(                                                             │
│     state,                              ┌─────────────────────────┐         │
│     config={                            │ Check scenario_metadata │         │
│       "configurable": {                 └───────────┬─────────────┘         │
│         "scenario_metadata": {...}                  │                       │
│       }                                    ┌────────┴────────┐              │
│     }                                      ▼                 ▼              │
│   )                                   Has mocks?       No mocks?            │
│                                            │                 │              │
│                                            ▼                 ▼              │
│                                     Return MOCK      Call REAL tool         │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Benefit	Description
✅ Concurrent-safe	Each invocation has its own mocks
✅ No global state	No environment variables or singletons
✅ Production-ready	Same graph handles mock and real traffic
✅ Flexible	Different mocks per request
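
The flow above can be sketched in plain Python. This is a minimal illustration, not StuntDouble's implementation: `resolve_tool_call`, the `factories` dict, and `real_get_customer` are hypothetical names, and only the `configurable` / `scenario_metadata` config layout is taken from the diagram.

```python
from typing import Any, Callable, Optional

# Hypothetical stand-in for a real tool.
def real_get_customer(customer_id: str) -> dict[str, Any]:
    return {"id": customer_id, "source": "real"}

# Hypothetical registry: tool name -> factory that takes scenario_metadata
# and returns a mock callable.
factories: dict[str, Callable[[dict], Callable[..., Any]]] = {
    "get_customer": lambda md: lambda customer_id: {"id": customer_id, "source": "mock"},
}

def resolve_tool_call(
    name: str,
    real_tool: Callable[..., Any],
    args: dict[str, Any],
    config: Optional[dict[str, Any]] = None,
) -> Any:
    """Return a mock result when this invocation carries scenario_metadata
    and a mock is registered for the tool; otherwise call the real tool."""
    metadata = (config or {}).get("configurable", {}).get("scenario_metadata")
    factory = factories.get(name)
    if metadata is not None and factory is not None:
        return factory(metadata)(**args)
    return real_tool(**args)

mock_config = {"configurable": {"scenario_metadata": {"scenario_id": "demo"}}}
mocked = resolve_tool_call("get_customer", real_get_customer, {"customer_id": "C1"}, mock_config)
real = resolve_tool_call("get_customer", real_get_customer, {"customer_id": "C1"})
```

Because the decision depends only on the config passed to each call, two invocations running at the same time cannot see each other's mocks.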

ToolNode Wrapper

This approach passes a StuntDouble wrapper to LangGraph’s native ToolNode through its awrap_tool_call parameter.

Prerequisites

The ToolNode Wrapper requires the awrap_tool_call parameter on ToolNode, which was introduced in LangGraph 1.0. Make sure your dependencies meet these minimum versions:

Package	Minimum Version	Why
`langgraph`	>=1.0.0	Provides `ToolNode(awrap_tool_call=...)` parameter
`langchain-core`	>=1.2.5	Required by StuntDouble for `BaseTool`, `RunnableConfig`, and schema inspection

# Verify your versions
pip show langgraph langchain-core

# Upgrade if needed
pip install --upgrade "langgraph>=1.0.0" "langchain-core>=1.2.5"

# Or with uv
uv add "langgraph>=1.0.0" "langchain-core>=1.2.5"

# Or with Poetry
poetry add "langgraph>=1.0.0" "langchain-core>=1.2.5"

See Dependencies Reference for full compatibility details.
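
If you prefer to check the minimum versions from Python (for example in a test session's setup), a rough sketch using the standard library's `importlib.metadata` follows. The helper names `parse`, `meets_minimum`, and `check_dependencies` are hypothetical, and the version parsing is deliberately simple: it compares only the first three numeric components and ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the table above.
MINIMUMS = {"langgraph": "1.0.0", "langchain-core": "1.2.5"}

def parse(v: str) -> tuple[int, ...]:
    """Parse 'major[.minor[.patch]]' into a 3-tuple, padding with zeros."""
    m = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", v)
    if m is None:
        raise ValueError(f"unrecognized version: {v!r}")
    return tuple(int(g or 0) for g in m.groups())

def meets_minimum(installed: str, minimum: str) -> bool:
    return parse(installed) >= parse(minimum)

def check_dependencies() -> dict[str, str]:
    """Return 'ok', 'too old (<version>)', or 'not installed' per package."""
    report = {}
    for package, minimum in MINIMUMS.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            report[package] = "not installed"
            continue
        report[package] = "ok" if meets_minimum(installed, minimum) else f"too old ({installed})"
    return report

print(check_dependencies())
```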

Option A: Default Registry (Simplest) ⭐

Use the pre-configured mockable_tool_wrapper and default_registry for zero-setup mocking:

from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition
from langchain_core.messages import HumanMessage
from stuntdouble import (
    mockable_tool_wrapper,      # Pre-configured wrapper
    default_registry,           # Default mock registry
    inject_scenario_metadata,   # Config helper
)

# Your real tools (production code unchanged)
tools = [get_customer_tool, list_bills_tool, create_invoice_tool]

# Step 1: Register mocks on the default registry
default_registry.mock("get_customer").returns({
    "id": "CUST-001",
    "name": "Test Corp",
    "balance": 1500,
})
default_registry.mock("list_bills").returns({
    "bills": [{"id": "B001", "amount": 500}]
})

# Step 2: Build graph with native ToolNode + mockable wrapper
builder = StateGraph(MessagesState)
builder.add_node("agent", agent_node)
builder.add_node("tools", ToolNode(tools, awrap_tool_call=mockable_tool_wrapper))  # ← Native ToolNode!
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")
graph = builder.compile()

# Step 3: Invoke WITH mocks
config = inject_scenario_metadata({}, {
    "scenario_id": "langgraph-default-registry-demo"
})
result = await graph.ainvoke({"messages": [HumanMessage("List my bills")]}, config=config)
# → Uses mocked get_customer / list_bills

# Step 4: Invoke WITHOUT mocks (no scenario_metadata = real tools)
result = await graph.ainvoke({"messages": [HumanMessage("List my bills")]})
# → Uses real list_bills tool

Using the fluent builder on default_registry (even simpler):

from langgraph.prebuilt import ToolNode
from stuntdouble import default_registry, mockable_tool_wrapper

mock = default_registry.mock  # Convenience: mock("tool").returns(...)

# No registry parameter needed — uses default_registry automatically
mock("get_customer").returns({"id": "123", "name": "Test Corp"})
mock("list_bills").returns({"bills": [{"id": "B001", "amount": 500}]})
mock("get_invoice").when(status={"$in": ["paid", "pending"]}).returns({"priority": "low"})

# Verify registration
print(default_registry.list_registered())  # ['get_customer', 'list_bills', 'get_invoice']

# Use the pre-configured wrapper (reads from default_registry)
tool_node = ToolNode(tools, awrap_tool_call=mockable_tool_wrapper)

Option B: Custom Registry (Full Control)

For advanced scenarios where you need multiple registries, custom wrappers, call recording, or signature validation:

from typing import Any, Callable
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.prebuilt import ToolNode, tools_condition
from langchain_core.messages import HumanMessage
from stuntdouble import (
    MockToolsRegistry,
    create_mockable_tool_wrapper,
    inject_scenario_metadata,
)

# Your real tools
tools = [get_customer_tool, list_bills_tool]

# Step 1: Define mock functions
# Each mock_fn takes scenario_metadata and returns a callable that matches
# the real tool's signature (accepts the same arguments).

def get_customer_mock(scenario_metadata: dict[str, Any]) -> Callable[..., Any]:
    """
    Mock for get_customer tool.
    
    Real tool signature: get_customer(user_id: str) -> dict
    """
    mocks = scenario_metadata.get("mocks", {})
    mock_data = mocks.get("get_customer", [])
    
    if isinstance(mock_data, list) and mock_data:
        data = mock_data[0].get("output", {})
    else:
        data = mock_data if isinstance(mock_data, dict) else {}
    
    # Mock callable matches real tool signature
    def mock_fn(user_id: str) -> dict:
        # Can use input values in response, or return static mock data
        return data or {"id": user_id, "name": "Test Corp", "status": "active"}
    
    return mock_fn

def list_bills_mock(scenario_metadata: dict[str, Any]) -> Callable[..., Any]:
    """
    Mock for list_bills tool.
    
    Real tool signature: list_bills(start_date: str, end_date: str) -> dict
    """
    mocks = scenario_metadata.get("mocks", {})
    mock_data = mocks.get("list_bills", [])
    
    if isinstance(mock_data, list) and mock_data:
        data = mock_data[0].get("output", {})
    else:
        data = {}
    
    bills = data.get("bills", [])
    
    # Mock callable matches real tool signature
    def mock_fn(start_date: str, end_date: str) -> dict:
        # Can filter based on input args if needed
        return {"bills": bills, "start_date": start_date, "end_date": end_date}
    
    return mock_fn

# Step 2: Create registry and register mock functions
registry = MockToolsRegistry()
registry.register("get_customer", mock_fn=get_customer_mock)
registry.register("list_bills", mock_fn=list_bills_mock)

# Step 3: Create wrapper and build graph with native ToolNode
wrapper = create_mockable_tool_wrapper(registry)

builder = StateGraph(MessagesState)
builder.add_node("agent", agent_node)
builder.add_node("tools", ToolNode(tools, awrap_tool_call=wrapper))  # ← Native ToolNode!
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")
graph = builder.compile()

# Step 4: Invoke WITH mocks
config = inject_scenario_metadata({}, {
    "mocks": {
        "list_bills": [{"output": {"bills": [{"id": "B001", "amount": 500}]}}]
    }
})
result = await graph.ainvoke(
    {"messages": [HumanMessage("Get customer CUST-001")]},
    config=config
)

# Step 5: Invoke WITHOUT mocks (production mode - no scenario_metadata)
result = await graph.ainvoke(
    {"messages": [HumanMessage("Get customer CUST-001")]}
)

API Reference

# LangGraph package exports
from stuntdouble import (
    # Pre-configured wrapper and registry
    mockable_tool_wrapper,             # Ready-to-use awrap_tool_call wrapper
    default_registry,                  # Default MockToolsRegistry instance

    # Factory functions
    create_mockable_tool_wrapper,      # Create wrapper with custom registry

    # Mock registration
    MockToolsRegistry,                 # Factory-based mock registration
    MockBuilder,                       # Chainable mock builder (also: from stuntdouble import MockBuilder)

    # Fluent builder: mock = default_registry.mock (no standalone mock function)

    # Call recording
    CallRecorder,                      # Records tool calls for verification
    CallRecord,                        # Individual call record

    # Config utilities
    inject_scenario_metadata,          # Add scenario_metadata to config
    get_scenario_metadata,             # Extract from ToolCallRequest
    get_configurable_context,          # Extract configurable dict from RunnableConfig

    # Validation
    validate_mock_parameters,          # Validate mock inputs match tool schema
    validate_mock_signature,           # Validate mock function signature matches tool
    validate_registry_mocks,           # Validate scenario_metadata mock cases

    # Exceptions
    MissingMockError,                  # Raised when mock not found in strict mode
    SignatureMismatchError,            # Raised when mock signature doesn't match tool
    MockAssertionError,                # Raised by CallRecorder assertions
)

Function	Description
`mockable_tool_wrapper`	Pre-configured wrapper for `ToolNode(tools, awrap_tool_call=...)` using `default_registry`
`default_registry`	Default `MockToolsRegistry` used by `mockable_tool_wrapper`
`create_mockable_tool_wrapper(registry, recorder=, tools=, validate_signatures=, require_mock_when_scenario=, strict_mock_errors=)`	Create wrapper with custom registry, optional recorder, validation, and error-handling controls
`default_registry.mock(tool_name)`	Convenience returning `MockBuilder`. Use `mock = default_registry.mock` for shorthand.
`MockBuilder(tool_name, registry)`	Chainable builder: `.when()`, `.returns()`, `.returns_fn()`, `.echoes_input()`
`MockToolsRegistry()`	Create a registry for mock functions
`registry.register(tool_name, mock_fn, when=None, tool=None)`	Register a mock function. Pass `tool=` for signature validation.
`CallRecorder()`	Records tool calls for test assertions
`CallRecord`	Individual call record with `tool_name`, `args`, `result`, `was_mocked`, etc.
`inject_scenario_metadata(config, metadata)`	Create config with scenario_metadata
`get_configurable_context(config)`	Extract the `configurable` dict from RunnableConfig for context-aware mocks
`validate_mock_signature(tool, mock_fn, scenario_metadata, config)`	Validate mock function signature matches tool
`validate_registry_mocks(tools, scenario_metadata)`	Validate `scenario_metadata["mocks"]` against tool parameters
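
To make the config helpers concrete, here is a plain-Python sketch of what injecting and reading scenario_metadata amounts to, based on the `configurable` / `scenario_metadata` layout shown in the Overview. `inject_sketch` and `extract_sketch` are hypothetical stand-ins, not the library functions. Whether the real `inject_scenario_metadata` copies or mutates its input is not stated on this page; the sketch copies.

```python
import copy
from typing import Any, Optional

def inject_sketch(config: dict[str, Any], metadata: dict[str, Any]) -> dict[str, Any]:
    """Return a new config with metadata stored under
    config["configurable"]["scenario_metadata"]; existing keys are kept."""
    new_config = copy.deepcopy(config)
    new_config.setdefault("configurable", {})["scenario_metadata"] = metadata
    return new_config

def extract_sketch(config: Optional[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return the scenario_metadata dict, or None when the config has none."""
    return (config or {}).get("configurable", {}).get("scenario_metadata")

base = {"configurable": {"thread_id": "t1"}}
config = inject_sketch(base, {"scenario_id": "demo"})
```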

Troubleshooting

Mocks Not Working

Check you’re using the wrapper:

# Default registry approach
from stuntdouble import mockable_tool_wrapper
ToolNode(tools, awrap_tool_call=mockable_tool_wrapper)  # ← Required!

# Or custom registry approach
wrapper = create_mockable_tool_wrapper(registry)
ToolNode(tools, awrap_tool_call=wrapper)  # ← Required!

Check scenario_metadata is passed:

config = inject_scenario_metadata({}, {"mocks": {...}})
result = await graph.ainvoke(state, config=config)

Check mock is registered:

# Default registry
from stuntdouble import default_registry
print(default_registry.list_registered())  # Should include your tool name

# Custom registry
print(registry.list_registered())  # Should include your tool name

Check when predicate returns True:

# If using `when=`, ensure it returns True for your scenario
default_registry.register("tool", mock_fn=..., when=lambda md: md.get("mode") == "test")
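
A quick way to rule out the predicate is to call it directly with the metadata you intend to pass. The helper below is a hypothetical debugging aid in plain Python, not part of StuntDouble; it also surfaces predicates that raise instead of returning a boolean.

```python
from typing import Any, Callable

def explain_predicate(when: Callable[[dict[str, Any]], bool], metadata: dict[str, Any]) -> str:
    """Evaluate a when-predicate against metadata and describe the outcome."""
    try:
        matched = bool(when(metadata))
    except Exception as exc:  # report predicate bugs instead of hiding them
        return f"predicate raised {type(exc).__name__}: {exc}"
    return "mock applies" if matched else "predicate is False; real tool would run"

# Same predicate as in the register() call above.
is_test_mode = lambda md: md.get("mode") == "test"

print(explain_predicate(is_test_mode, {"mode": "test"}))
print(explain_predicate(is_test_mode, {"scenario_id": "demo"}))
```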

CallRecorder: Tool Call Verification

The CallRecorder captures all tool calls during test execution, enabling verification of what tools were called, with what arguments, and how many times.

Quick Start

from stuntdouble import (
    MockToolsRegistry,
    CallRecorder,
    create_mockable_tool_wrapper,
)
from langgraph.prebuilt import ToolNode

# Create registry and recorder
registry = MockToolsRegistry()
recorder = CallRecorder()

# Register mocks
mock = registry.mock
mock("get_customer").returns({"id": "123", "name": "Test Corp"})

# Create wrapper with recorder
wrapper = create_mockable_tool_wrapper(registry, recorder=recorder)

# Build your graph
tools = [get_customer, list_bills, create_invoice]
tool_node = ToolNode(tools, awrap_tool_call=wrapper)

# ... run your agent ...

# Verify calls were made
recorder.assert_called("get_customer")
recorder.assert_not_called("delete_account")
recorder.assert_called_once("list_bills")
recorder.assert_called_times("create_invoice", 2)

# Verify arguments
recorder.assert_called_with("get_customer", customer_id="123")
recorder.assert_last_called_with("list_bills", status="active")

# Verify call order
recorder.assert_call_order("get_customer", "list_bills", "create_invoice")

# Inspect recorded calls
print(recorder.summary())

API Reference

from stuntdouble import CallRecorder, CallRecord, MockAssertionError

recorder = CallRecorder()

Query Methods

Method	Description	Example
`was_called(tool, **args)`	Check if tool was called (optionally with specific args)	`recorder.was_called("get_customer", customer_id="123")`
`call_count(tool)`	Get number of calls to a tool	`recorder.call_count("list_bills")`
`get_calls(tool=None)`	Get list of `CallRecord` objects	`recorder.get_calls("get_customer")`
`get_last_call(tool)`	Get the most recent call	`recorder.get_last_call("list_bills")`
`get_first_call(tool)`	Get the first call	`recorder.get_first_call("get_customer")`
`get_args(tool, index=-1)`	Get arguments from a specific call	`recorder.get_args("get_customer", 0)`
`get_result(tool, index=-1)`	Get result from a specific call	`recorder.get_result("create_invoice")`
`summary()`	Human-readable summary of all calls	`print(recorder.summary())`
`clear()`	Reset recorder for next test	`recorder.clear()`

Assertion Methods

All assertion methods raise MockAssertionError on failure.

Method	Description	Example
`assert_called(tool)`	Assert tool was called at least once	`recorder.assert_called("get_customer")`
`assert_not_called(tool)`	Assert tool was never called	`recorder.assert_not_called("delete_account")`
`assert_called_once(tool)`	Assert tool was called exactly once	`recorder.assert_called_once("list_bills")`
`assert_called_times(tool, n)`	Assert tool was called exactly n times	`recorder.assert_called_times("create_invoice", 2)`
`assert_called_with(tool, **args)`	Assert any call matches the arguments	`recorder.assert_called_with("get_customer", customer_id="123")`
`assert_last_called_with(tool, **args)`	Assert last call matches the arguments	`recorder.assert_last_called_with("list_bills", status="active")`
`assert_call_order(*tools)`	Assert tools were called in order	`recorder.assert_call_order("get_customer", "list_bills")`
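
This page does not say whether `assert_call_order` tolerates other calls interleaved between the named tools. One plausible reading, sketched here as an assumption, is a relative-order (subsequence) check. `in_relative_order` is a hypothetical helper, not the library's implementation.

```python
def in_relative_order(recorded: list[str], expected: list[str]) -> bool:
    """True if `expected` appears within `recorded` as a subsequence,
    i.e. in the same relative order, with other calls allowed in between."""
    remaining = iter(recorded)
    # `name in remaining` advances the iterator until it finds `name`,
    # so each expected tool must occur after the previous one.
    return all(name in remaining for name in expected)

recorded = ["get_customer", "list_bills", "create_invoice", "create_invoice"]
```

If your tests depend on strict adjacency, verify the actual behavior against the library before relying on it.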

CallRecord Properties

Each recorded call is a CallRecord with these properties:

Property	Description	Example
`tool_name`	Name of the tool	`"get_customer"`
`args`	Arguments passed to the tool	`{"customer_id": "123"}`
`result`	Return value (mock or real)	`{"id": "123", "name": "Test Corp"}`
`error`	Exception if call failed	`None` or `ValueError(...)`
`was_mocked`	Whether a mock was used	`True` or `False`
`duration_ms`	Call duration in milliseconds	`5.2`
`timestamp`	Unix timestamp when call was made	`1704567890.123`
`scenario_id`	Scenario ID from metadata	`"test-001"` or `None`

Examples

Basic Verification

recorder = CallRecorder()
wrapper = create_mockable_tool_wrapper(registry, recorder=recorder)

# After running agent
recorder.assert_called("get_customer")
recorder.assert_not_called("delete_account")
assert recorder.call_count("list_bills") == 1

Argument Verification

# Verify any call matches
recorder.assert_called_with("get_customer", customer_id="123")

# Verify last call matches
recorder.assert_last_called_with("list_bills", status="active", limit=10)

# Check if called with specific args
if recorder.was_called("create_invoice", amount=100):
    print("Invoice created for $100")

Call Order Verification

# Verify tools were called in specific order
recorder.assert_call_order("get_customer", "list_bills", "create_invoice")

Inspecting Calls

# Get all calls for a tool
calls = recorder.get_calls("create_invoice")
for call in calls:
    print(f"Amount: {call.args['amount']}, Result: {call.result}")

# Get specific call arguments
first_args = recorder.get_args("get_customer", index=0)
last_result = recorder.get_result("list_bills")

# Get full summary
print(recorder.summary())
# Output:
# Recorded 4 call(s):
#   1. get_customer [MOCKED] args={'customer_id': '123'}
#   2. list_bills [MOCKED] args={'status': 'active'}
#   3. create_invoice [MOCKED] args={'amount': 500}
#   4. create_invoice [MOCKED] args={'amount': 1200}

Testing with pytest

import pytest
from langgraph.prebuilt import ToolNode
from stuntdouble import (
    MockToolsRegistry,
    CallRecorder,
    create_mockable_tool_wrapper,
)

@pytest.fixture
def recorder():
    return CallRecorder()

@pytest.fixture
def mock_wrapper(recorder):
    registry = MockToolsRegistry()
    registry.mock("get_customer").returns({"id": "123", "name": "Test Corp"})
    return create_mockable_tool_wrapper(registry, recorder=recorder)

def test_customer_workflow(mock_wrapper, recorder):
    # Build graph with wrapper
    tool_node = ToolNode(tools, awrap_tool_call=mock_wrapper)
    # ... run agent ...
    
    # Verify behavior
    recorder.assert_called("get_customer")
    recorder.assert_called_with("get_customer", customer_id="123")
    assert recorder.get_result("get_customer")["name"] == "Test Corp"

Thread Safety

CallRecorder is thread-safe and suitable for concurrent test execution. All methods use internal locking to protect the call list during concurrent access.
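
The locking approach described above can be illustrated with a small stand-alone recorder. `RecordSketch` and `RecorderSketch` are hypothetical classes written for this example; they borrow a few field names from the CallRecord table but are not StuntDouble's implementation.

```python
import threading
import time
from dataclasses import dataclass, field
from typing import Any

@dataclass
class RecordSketch:
    tool_name: str
    args: dict[str, Any]
    was_mocked: bool = True
    timestamp: float = field(default_factory=time.time)

class RecorderSketch:
    """Lock-protected call list: every read and write holds the same lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._calls: list[RecordSketch] = []

    def record(self, tool_name: str, **args: Any) -> None:
        with self._lock:
            self._calls.append(RecordSketch(tool_name, args))

    def call_count(self, tool_name: str) -> int:
        with self._lock:
            return sum(1 for call in self._calls if call.tool_name == tool_name)

recorder = RecorderSketch()

def worker(worker_id: int) -> None:
    for i in range(100):
        recorder.record("list_bills", worker_id=worker_id, i=i)

# Eight threads each record 100 calls concurrently.
threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(recorder.call_count("list_bills"))
```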

Shared Concepts

The following concepts apply to the ToolNode Wrapper approach.

MockBuilder: Fluent Mock Registration

The MockBuilder provides a fluent, chainable API for registering mocks. See the MockBuilder Guide for complete documentation and examples.

Quick example with explicit registry:

from stuntdouble import MockToolsRegistry

registry = MockToolsRegistry()
mock = registry.mock

# Simple static mock
mock("get_customer").returns({"id": "123", "name": "Test Corp"})

# Conditional mock with input matching
mock("list_bills").when(status="active").returns({"bills": [{"id": "B001"}]})

# Echo input fields in response
mock("update_customer").echoes_input("customer_id", "name").returns({"updated": True})

# Custom mock function
mock("calculate_total").returns_fn(
    lambda items, tax_rate: {"total": sum(i["price"] for i in items) * (1 + tax_rate)}
)

# Combine conditions with operators
mock("get_invoice").when(
    status={"$in": ["paid", "pending"]},
    amount={"$gt": 100}
).returns({"priority": "high"})

Quick example with default registry:

from stuntdouble import default_registry

mock = default_registry.mock  # Uses default_registry

# Register mocks
mock("get_customer").returns({"id": "123", "name": "Mocked"})
mock("list_bills").when(status="active").returns({"bills": []})

# Verify
assert default_registry.is_registered("get_customer")

→ Full MockBuilder Guide

Mocked Tool Patterns

Mocked tools are functions that receive scenario_metadata and return a mock callable. Below are all supported patterns with sample inputs and outputs.

Pattern 1: Static Mock

Always returns the same response regardless of input.

# Sample input:  get_weather(city="NYC")
# Sample output: {"temp": 72, "conditions": "sunny"}

registry.register(
    "get_weather",
    mock_fn=lambda md: lambda **kwargs: {"temp": 72, "conditions": "sunny"}
)

Pattern 2: Input-Echo Mock

Response includes values from the tool input.

# Sample input:  get_customer(customer_id="CUST-123")
# Sample output: {"id": "CUST-123", "name": "Test Corp", "status": "active"}

registry.register(
    "get_customer",
    mock_fn=lambda md: lambda customer_id, **kw: {
        "id": customer_id,  # Echo the input
        "name": "Test Corp",
        "status": "active"
    }
)

Pattern 3: Mocks-Based Mock (Data-Driven)

Mock data comes from the scenario_metadata.mocks structure. Write a factory that extracts data from scenario_metadata:

from typing import Callable

def list_bills_mock(scenario_metadata: dict) -> Callable:
    mocks = scenario_metadata.get("mocks", {})
    mock_data = mocks.get("list_bills", [])
    
    if isinstance(mock_data, list) and mock_data:
        data = mock_data[0].get("output", {})
    else:
        data = {}
    
    return lambda **kwargs: data

registry.register("list_bills", mock_fn=list_bills_mock)

Pattern 4: Conditional Mock (with `when` predicate)

Only mock under certain conditions; otherwise call the real tool.

# Sample scenario_metadata (MOCKED):
#   {"mode": "test", "mocks": {"send_email": [{"output": {"sent": True}}]}}
#
# Sample scenario_metadata (NOT MOCKED - calls real tool):
#   {"mode": "production"}
#
# Sample input:  send_email(to="user@example.com", body="Hello")
# Sample output: {"sent": True, "message_id": "mock-123"}

registry.register(
    "send_email",
    mock_fn=lambda md: lambda **kw: {"sent": True, "message_id": "mock-123"},
    when=lambda md: md.get("mode") == "test"  # Only mock in test mode
)

Pattern 5: Dynamic Placeholders

Outputs can include dynamic placeholders for timestamps, UUIDs, and input references. Use data-driven mocks via register_data_driven() / DataDrivenMockFactory, or implement placeholder resolution in your own mock factory.

Supported placeholders:

Placeholder	Description	Example Output
`{{now}}`	Current ISO timestamp	`2026-02-12T10:30:00`
`{{now + Nd}}`	N days from now	`{{now + 7d}}` → 7 days later
`{{now - Nd}}`	N days ago	`{{now - 30d}}` → 30 days ago
`{{today}}`	Current date only	`2026-02-12`
`{{input.field}}`	Reference input value	Echoes input
`{{uuid}}`	Random UUID	`a1b2c3d4-e5f6-...`
`{{random_int(min, max)}}`	Random integer	`42`
`{{sequence('prefix')}}`	Incrementing ID	`prefix-001`, `prefix-002`

See Mock Format Reference for all placeholders.

Pattern 6: Context-Aware Mock (Runtime Config Access)

Access runtime context (like user identity from HTTP headers) in your mock factory. This is especially useful for no-argument tools that need to return user-specific data.

from stuntdouble import get_configurable_context

# Sample RunnableConfig (passed by your application):
#   {
#     "configurable": {
#       "agent_context": {
#         "auth_header": {
#           "user_id": "USER-123",
#           "org_id": "ORG-456"
#         }
#       }
#     }
#   }
#
# Sample input:  get_current_user()  # No arguments!
# Sample output: {"user_id": "USER-123", "org_id": "ORG-456"}

def user_context_mock(scenario_metadata: dict, config: dict | None = None):
    """Mock factory that extracts user context from RunnableConfig."""
    ctx = get_configurable_context(config)

    # Application-specific extraction (your app knows its structure)
    agent_context = ctx.get("agent_context", {})
    auth_header = agent_context.get("auth_header", {})
    user_id = auth_header.get("user_id", "unknown")
    org_id = auth_header.get("org_id", "unknown")

    # Return the mock callable
    return lambda: {"user_id": user_id, "org_id": org_id}

registry.register("get_current_user", mock_fn=user_context_mock)

Key Points:

Mock factories can accept an optional second parameter config (the RunnableConfig)
Use get_configurable_context(config) to safely extract the configurable dict
Backward compatible: Existing factories with only scenario_metadata continue to work
The config parameter is detected via signature inspection—no registration changes needed

When to Use:

No-argument tools that need runtime context (user ID, tenant ID, etc.)
Mocks that need to vary based on request headers
Multi-tenant testing scenarios
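
Signature-based detection of the optional `config` parameter can be pictured with `inspect.signature`. The sketch below is an assumption about the general technique, not StuntDouble's code: `call_factory` is a hypothetical helper that simply counts declared parameters, whereas the library may look for the parameter by name.

```python
import inspect
from typing import Any, Callable, Optional

def call_factory(
    factory: Callable[..., Any],
    scenario_metadata: dict[str, Any],
    config: Optional[dict[str, Any]],
) -> Any:
    """Pass config only to factories that declare a second parameter."""
    params = inspect.signature(factory).parameters
    if len(params) >= 2:
        return factory(scenario_metadata, config)
    return factory(scenario_metadata)

# One-parameter factory: keeps working unchanged.
def legacy_factory(scenario_metadata):
    return lambda: {"user_id": "static"}

# Two-parameter factory: receives the runtime config as well.
def context_factory(scenario_metadata, config=None):
    user_id = (config or {}).get("configurable", {}).get("user_id", "unknown")
    return lambda: {"user_id": user_id}

config = {"configurable": {"user_id": "USER-123"}}
legacy_result = call_factory(legacy_factory, {}, config)()
context_result = call_factory(context_factory, {}, config)()
```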

→ Full Context-Aware Mocks Guide

scenario_metadata Structure

scenario_metadata = {
    # Optional: Scenario identifier
    "scenario_id": "test-001",
    
    # Optional: Mode indicator
    "mode": "test",
    
    # Mock definitions by tool name
    "mocks": {
        "tool_name": [
            {
                "input": {...},   # Optional: Input pattern to match
                "output": {...}   # Required: Output to return
            },
            {
                "output": {...}   # Catch-all (no input pattern)
            }
        ]
    }
}

Input Matching

Match inputs with operators for conditional responses:

scenario_metadata = {
    "mocks": {
        "get_bills": [
            {"input": {"status": "overdue", "amount": {"$gt": 5000}}, "output": {"priority": "URGENT"}},
            {"input": {"status": {"$in": ["paid", "pending"]}}, "output": {"priority": "low"}},
            {"output": {"priority": "normal"}}  # Catch-all
        ]
    }
}

See Mock Format Reference for all operators and Matchers and Resolvers Guide for detailed examples.
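
To show how such cases might be evaluated, here is a simplified matcher covering only the two operators used above (`$gt` and `$in`). It assumes cases are tried in order and the first match wins, which this page implies through the catch-all entry but does not state outright. `value_matches` and `select_output` are hypothetical names.

```python
from typing import Any, Optional

def value_matches(pattern: Any, actual: Any) -> bool:
    """Match a literal value, or an operator dict such as {"$gt": 5000}."""
    if isinstance(pattern, dict):
        for op, operand in pattern.items():
            if op == "$gt":
                if actual is None or not actual > operand:
                    return False
            elif op == "$in":
                if actual not in operand:
                    return False
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
        return True
    return pattern == actual

def select_output(cases: list[dict[str, Any]], tool_input: dict[str, Any]) -> Optional[Any]:
    """Return the output of the first case whose input pattern matches."""
    for case in cases:
        pattern = case.get("input", {})  # a case without "input" is a catch-all
        if all(value_matches(p, tool_input.get(k)) for k, p in pattern.items()):
            return case["output"]
    return None

cases = [
    {"input": {"status": "overdue", "amount": {"$gt": 5000}}, "output": {"priority": "URGENT"}},
    {"input": {"status": {"$in": ["paid", "pending"]}}, "output": {"priority": "low"}},
    {"output": {"priority": "normal"}},
]

print(select_output(cases, {"status": "overdue", "amount": 9000}))
```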

Dynamic Placeholders

Outputs can include dynamic placeholders:

scenario_metadata = {
    "mocks": {
        "create_invoice": [{
            "output": {
                "id": "{{uuid}}",
                "created_at": "{{now}}",
                "due_date": "{{now + 30d}}",
                "customer_id": "{{input.customer_id}}"
            }
        }]
    }
}

See Mock Format Reference for all placeholders.
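
If you implement placeholder resolution in your own mock factory, the core is a recursive walk with a regex substitution. The sketch below handles only `{{uuid}}`, `{{now}}`, `{{now + Nd}}` / `{{now - Nd}}`, and `{{input.field}}`; `resolve_value` is a hypothetical function, and the exact timestamp format StuntDouble emits may differ from `datetime.isoformat()`.

```python
import re
import uuid
from datetime import datetime, timedelta
from typing import Any

PLACEHOLDER = re.compile(r"\{\{\s*(.+?)\s*\}\}")
NOW_OFFSET = re.compile(r"now\s*([+-])\s*(\d+)d")

def resolve_value(value: Any, tool_input: dict[str, Any], now: datetime) -> Any:
    """Recursively replace the placeholders this sketch knows about."""
    if isinstance(value, dict):
        return {k: resolve_value(v, tool_input, now) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_value(v, tool_input, now) for v in value]
    if not isinstance(value, str):
        return value

    def substitute(match: re.Match) -> str:
        expr = match.group(1)
        if expr == "uuid":
            return str(uuid.uuid4())
        if expr == "now":
            return now.isoformat()
        offset = NOW_OFFSET.fullmatch(expr)
        if offset:
            sign = 1 if offset.group(1) == "+" else -1
            return (now + timedelta(days=sign * int(offset.group(2)))).isoformat()
        if expr.startswith("input."):
            return str(tool_input.get(expr[len("input."):], ""))
        return match.group(0)  # leave unknown placeholders untouched

    return PLACEHOLDER.sub(substitute, value)

template = {
    "id": "{{uuid}}",
    "created_at": "{{now}}",
    "due_date": "{{now + 30d}}",
    "customer_id": "{{input.customer_id}}",
}
fixed_now = datetime(2026, 2, 12, 10, 30, 0)
resolved = resolve_value(template, {"customer_id": "CUST-1"}, fixed_now)
```

Passing `now` explicitly keeps the resolver deterministic in tests.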

Custom Mocked Tools

StuntDouble’s LangGraph mocking is built on mocked tools: factory functions that receive scenario_metadata and return a mock callable. The format of your mock data is defined by the mocked tool function, not by the framework.

Mocked Tool Signature

Standard (1-parameter) and context-aware (2-parameter) signatures:

from typing import Any, Callable
from stuntdouble import get_configurable_context

# Standard factory: receives scenario_metadata only
def my_mocked_tool(scenario_metadata: dict[str, Any]) -> Callable[..., Any] | None:
    """
    Args:
        scenario_metadata: The scenario configuration for this invocation
        
    Returns:
        A callable mock function, or None to skip mocking (use real tool)
    """
    # 1. Extract your mock data from scenario_metadata
    # 2. Return a callable that handles tool invocations
    # 3. Return None if mocking shouldn't apply
    pass

# Context-aware factory: receives scenario_metadata AND config
def my_context_mock(scenario_metadata: dict[str, Any], config: dict | None = None) -> Callable[..., Any] | None:
    """
    Args:
        scenario_metadata: The scenario configuration for this invocation
        config: The RunnableConfig with runtime context (optional)
        
    Returns:
        A callable mock function, or None to skip mocking (use real tool)
    """
    ctx = get_configurable_context(config)
    # Use ctx for runtime context like user identity, headers, etc.
    pass

Example: Custom Key Name

Use "responses" instead of "mocks":

# Sample scenario_metadata: {"responses": {"my_tool": {"data": "custom"}}}
# Sample input:  my_tool(query="test")
# Sample output: {"data": "custom"}

def responses_factory(tool_name: str):
    def factory(scenario_metadata: dict) -> Callable | None:
        responses = scenario_metadata.get("responses", {})
        tool_data = responses.get(tool_name)
        
        if tool_data is None:
            return None  # No mock, use real tool
        
        return lambda **kwargs: tool_data
    
    return factory

registry.register("my_tool", mock_fn=responses_factory("my_tool"))

Example: Stateful Factory

Return sequences of responses (first call, second call, etc.):

# Sample scenario_metadata: {"sequences": {"api_call": [{"attempt": 1}, {"attempt": 2}, {"attempt": 3}]}}
# Sample input:  api_call(url="https://api.com")
# Call 1 output: {"attempt": 1}
# Call 2 output: {"attempt": 2}
# Call 3 output: {"attempt": 3}
# Call 4 output: {"attempt": 3}  (stays on last)

def sequence_factory(tool_name: str):
    def factory(scenario_metadata: dict) -> Callable | None:
        sequence = scenario_metadata.get("sequences", {}).get(tool_name, [])
        if not sequence:
            return None
        
        call_count = [0]
        
        def sequenced_mock(**kwargs):
            idx = min(call_count[0], len(sequence) - 1)
            call_count[0] += 1
            return sequence[idx]
        
        return sequenced_mock
    
    return factory

registry.register("api_call", mock_fn=sequence_factory("api_call"))

Example: Tenant-Aware Factory

Use runtime config for multi-tenant mocking:

# Sample RunnableConfig: {"configurable": {"agent_context": {"tenant_id": "tenant-a"}}}
# Sample input:  get_tenant_config()
# Sample output: {"plan": "enterprise", "max_users": 1000}

from stuntdouble import get_configurable_context

def tenant_factory(scenario_metadata: dict, config: dict | None = None):
    ctx = get_configurable_context(config)
    tenant_id = ctx.get("agent_context", {}).get("tenant_id", "default")

    tenant_configs = {
        "tenant-a": {"plan": "enterprise", "max_users": 1000},
        "tenant-b": {"plan": "startup", "max_users": 10},
        "default": {"plan": "free", "max_users": 1},
    }

    data = tenant_configs.get(tenant_id, tenant_configs["default"])
    return lambda: data

registry.register("get_tenant_config", mock_fn=tenant_factory)

Best Practices

1. Register at Startup

registry = MockToolsRegistry()

# Register all mocks before graph compilation
registry.register("tool_a", mock_fn=tool_a_mock)
registry.register("tool_b", mock_fn=tool_b_mock)

# Then build graph
wrapper = create_mockable_tool_wrapper(registry)
builder.add_node("tools", ToolNode(tools, awrap_tool_call=wrapper))

2. Keep Mocked Tools Pure

# Good: Fresh callable each time
mock_fn=lambda md: lambda **kw: {"data": md.get("value")}

# Bad: Side effects
def mocked_tool(md):
    print("Creating mock")  # Side effect!
    return lambda **kw: {...}

3. Enable Logging for Debugging

import logging
logging.getLogger("stuntdouble").setLevel(logging.DEBUG)

Mock Signature Validation

StuntDouble can validate that your mock functions have the same parameter signature as the real tools they mock. This catches configuration errors early.

Validation Points

Point	Control	Behavior on Mismatch
Registration	Pass `tool=` to `registry.register()`	Raises `SignatureMismatchError` immediately
Runtime	Pass `tools=` and `validate_signatures=True` to wrapper	Raises `SignatureMismatchError`

Controlling Validation

There are two independent validation points with separate controls:

What you want	Registration	Wrapper
No validation at all	Omit `tool=`	Set `validate_signatures=False` or omit `tools=`
Registration only (fail fast)	Pass `tool=real_tool`	Set `validate_signatures=False` or omit `tools=`
Runtime only (lazy validation)	Omit `tool=`	Pass `tools=all_tools` + `validate_signatures=True`
Both (belt and suspenders)	Pass `tool=real_tool`	Pass `tools=all_tools` + `validate_signatures=True`

Example: Registration-Time Validation

Validates immediately when you register the mock. Raises SignatureMismatchError if signatures don’t match:

from stuntdouble import MockToolsRegistry, SignatureMismatchError

registry = MockToolsRegistry()

# This mock is missing the 'units' parameter
def bad_weather_mock(scenario_metadata):
    def mock_fn(city: str):  # Missing 'units' parameter!
        return {"temp": 72}
    return mock_fn

try:
    registry.register(
        "get_weather",
        mock_fn=bad_weather_mock,
        tool=get_weather_tool,  # ← Passing tool enables validation
    )
except SignatureMismatchError as e:
    print(f"Registration failed: {e}")
    # "Mock for 'get_weather' has mismatched signature.
    #  Missing parameters in mock: units"

Example: Runtime Validation

Validates each time a mock is about to be executed. On failure, raises SignatureMismatchError:

from stuntdouble import create_mockable_tool_wrapper

# Create wrapper with runtime validation
wrapper = create_mockable_tool_wrapper(
    registry,
    tools=all_tools,           # ← List of tools for validation
    validate_signatures=True,  # ← Enable runtime validation (default)
)

# If mock signature doesn't match at runtime:
# → Raises SignatureMismatchError before the mock executes

What Gets Validated

The validate_mock_signature function checks:

Missing parameters: Mock is missing parameters the tool expects
Extra required parameters: Mock requires parameters the tool doesn’t have
Required/optional mismatch: Tool has optional param but mock requires it

# Tool signature: get_weather(city: str, units: str = "celsius")

# ✅ Valid - exact match
def mock_fn(city: str, units: str = "celsius"): ...

# ✅ Valid - optional params can have different defaults
def mock_fn(city: str, units: str = "fahrenheit"): ...

# ❌ Invalid - missing 'units' parameter
def mock_fn(city: str): ...

# ❌ Invalid - extra required parameter
# ('extra' comes before 'units' because Python requires non-default parameters first)
def mock_fn(city: str, extra: str, units: str = "celsius"): ...

# ❌ Invalid - 'units' should be optional but mock requires it
def mock_fn(city: str, units: str): ...
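
The three checks listed above can be approximated with the standard library's `inspect` module. This is a minimal sketch of the idea, not StuntDouble's actual `validate_mock_signature` implementation; `compare_signatures` and the sample functions are hypothetical names introduced for illustration.

```python
import inspect


def compare_signatures(tool_fn, mock_fn):
    """Return a list of mismatch descriptions (empty means compatible)."""
    tool_params = inspect.signature(tool_fn).parameters
    mock_params = inspect.signature(mock_fn).parameters
    empty = inspect.Parameter.empty
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    problems = []

    for name, param in tool_params.items():
        if name not in mock_params:
            problems.append(f"missing parameter: {name}")
        elif param.default is not empty and mock_params[name].default is empty:
            problems.append(f"optional in tool but required in mock: {name}")

    for name, param in mock_params.items():
        if param.kind in variadic:
            continue  # *args / **kwargs never count as required
        if name not in tool_params and param.default is empty:
            problems.append(f"extra required parameter: {name}")

    return problems


def get_weather(city: str, units: str = "celsius"): ...

def exact(city: str, units: str = "celsius"): ...
def other_default(city: str, units: str = "fahrenheit"): ...
def missing_units(city: str): ...
def extra_required(city: str, extra: str, units: str = "celsius"): ...
def units_required(city: str, units: str): ...
```

Calling `compare_signatures(get_weather, missing_units)` yields one "missing parameter" entry, while the `exact` and `other_default` mocks produce an empty list, since differing default values are allowed.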

Opting Out

# Skip registration validation
registry.register("get_weather", mock_fn=my_mock)  # No tool=

# Skip runtime validation
wrapper = create_mockable_tool_wrapper(
    registry,
    validate_signatures=False,  # ← Disable runtime validation
)

MCP Tool Support

Signature validation works with MCP tools loaded via langchain-mcp-adapters. These tools provide JSON Schema dicts instead of Pydantic models, and StuntDouble handles both formats:

Schema Format	Source	Handling
Pydantic model	LangChain native tools	Standard `model_fields` inspection
JSON Schema dict	MCP tools via langchain-mcp-adapters	Parses `properties` and `required` fields

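# MCP tools work seamlessly with validation
# (run this inside an async function; `await` is not valid at module top level)
from langchain_mcp_adapters.client import MultiServerMCPClient

# Recent langchain-mcp-adapters releases expect an explicit "transport" per server
client = MultiServerMCPClient(
    {"mcp-server": {"url": "http://localhost:8080/mcp", "transport": "streamable_http"}}
)
mcp_tools = await client.get_tools()  # Tools have JSON Schema args_schema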

# StuntDouble validates against JSON Schema automatically
wrapper = create_mockable_tool_wrapper(
    registry,
    tools=mcp_tools,          # ← MCP tools with JSON Schema
    validate_signatures=True,  # ← Works with JSON Schema dicts
)
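
To make the JSON Schema handling concrete, here is a minimal sketch of how `properties` and `required` map onto required and optional parameters. It is an illustration under assumptions, not StuntDouble's internal parser; `params_from_json_schema` and `weather_schema` are hypothetical names.

```python
def params_from_json_schema(schema: dict) -> dict:
    """Map each property name to True if required, False if optional."""
    required = set(schema.get("required", []))
    return {name: name in required for name in schema.get("properties", {})}


# Example schema in the shape MCP tools typically expose as args_schema.
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "default": "celsius"},
    },
    "required": ["city"],
}

params = params_from_json_schema(weather_schema)
```

Here `city` is treated as required and `units` as optional, mirroring the `get_weather(city, units="celsius")` signature used earlier.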

→ Full Signature Validation Guide

Using Mirrored Tools

MCP Tool Mirroring auto-generates mock tools from MCP server schemas. These mirrored tools integrate seamlessly with LangGraph’s per-invocation mocking.

Quick Start with Mirroring

from stuntdouble.mirroring import ToolMirror
from stuntdouble import (
    MockToolsRegistry,
    create_mockable_tool_wrapper,
    inject_scenario_metadata,
)
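from langchain_core.messages import HumanMessage
from langgraph.graph import START, MessagesState, StateGraph
from langgraph.prebuilt import ToolNode, tools_condition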

# 1. Mirror tools from MCP server
mirror = ToolMirror()
mirror.mirror(["python", "-m", "my_mcp_server"])
tools = mirror.to_langchain_tools()

# 2. Create registry and wrapper
registry = MockToolsRegistry()
wrapper = create_mockable_tool_wrapper(registry)

# 3. Build graph with mirrored tools (agent_node is your model-calling node, defined elsewhere)
builder = StateGraph(MessagesState)
builder.add_node("agent", agent_node)
builder.add_node("tools", ToolNode(tools, awrap_tool_call=wrapper))
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")
graph = builder.compile()

# 4. Invoke - mirrored tools return generated mock data
result = await graph.ainvoke({"messages": [HumanMessage("Create an invoice")]})

LangGraph-Optimized Mirroring

Use ToolMirror.for_langgraph() for a mirror pre-configured for LangGraph integration:

from stuntdouble.mirroring import ToolMirror

# Optimized for LangGraph: generates mock functions compatible with registry
mirror = ToolMirror.for_langgraph()
mirror.mirror(["python", "-m", "my_mcp_server"])
tools = mirror.to_langchain_tools()

Mirroring with HTTP Authentication

Mirror from remote MCP servers behind authentication:

from stuntdouble.mirroring import ToolMirror

mirror = ToolMirror()

# Bearer token authentication
result = mirror.mirror(
    http_url="https://api.example.com/mcp",
    headers={"Authorization": "Bearer your-token-here"}
)

# API key authentication
result = mirror.mirror(
    http_url="http://localhost:8080",
    headers={"X-API-Key": "abc123", "X-Client-ID": "my-app"}
)

tools = mirror.to_langchain_tools()
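
The tokens above are hardcoded for brevity. A minimal sketch of building the `headers` dict from the environment instead, so credentials stay out of source control; the `MCP_BEARER_TOKEN` variable name and the `build_auth_headers` helper are hypothetical, not part of StuntDouble:

```python
import os


def build_auth_headers(env=os.environ):
    """Build a bearer-token header dict from an environment mapping."""
    token = env.get("MCP_BEARER_TOKEN")  # hypothetical variable name
    if not token:
        raise RuntimeError("MCP_BEARER_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}


# A plain dict stands in for os.environ in this example.
headers = build_auth_headers({"MCP_BEARER_TOKEN": "example-token"})
```

The resulting dict can be passed as the `headers=` argument shown in the calls above.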

Mirroring with LangGraph Registry Integration

For advanced scenarios, mirror tools directly into the LangGraph mock registry:

from stuntdouble.mirroring import ToolMirror
from stuntdouble import MockToolsRegistry

# Create registry first
registry = MockToolsRegistry()

# Mirror tools and register them in the LangGraph registry
mirror = ToolMirror.for_langgraph(registry=registry)
mirror.mirror(["python", "-m", "my_server"])

# Registry now contains mock functions for all mirrored tools
print(registry.list_registered())
# ['create_invoice', 'get_customer', 'list_bills', ...]

Combining with Custom Mocks

Override specific mirrored tools with custom behavior:

from stuntdouble.mirroring import ToolMirror
from stuntdouble import MockToolsRegistry

registry = MockToolsRegistry()

# Mirror all tools
mirror = ToolMirror.for_langgraph(registry=registry)
mirror.mirror(["python", "-m", "my_server"])

# Override specific tool with custom mock
registry.mock("get_customer").returns({
    "id": "CUST-001",
    "name": "Test Corp",
    "tier": "platinum"
})

# get_customer uses custom mock, others use generated mocks

→ See the MCP Tool Mirroring Guide for complete documentation.
