# StuntDouble Documentation

**Tool Mocking Framework for AI Agent Testing**

---

## What is StuntDouble?

StuntDouble is a Python framework for **mocking AI agent tool calls**. Just like a stunt double performs risky scenes in place of an actor, StuntDouble lets you test your AI agents without the risk, cost, and unpredictability of production APIs.

```
┌──────────────────────────────────────────────────────────────────┐
│                                                                  │
│   Production                      Testing with StuntDouble       │
│   ──────────                      ────────────────────────       │
│                                                                  │
│   Agent ──▶ Real API ──▶ $$$      Agent ──▶ StuntDouble ──▶ Mock │
│        ↓                               ↓                         │
│   Slow, Flaky,                    Fast, Reliable,                │
│   Unpredictable                   Deterministic                  │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

## Motivation

### The Problem

AI agents rely heavily on external tools—APIs for customer data, billing systems, communication services, and more. Testing these agents presents unique challenges:

| Challenge | Impact |
|-----------|--------|
| **API Costs** | Every test run hits real APIs, accumulating costs |
| **Slow Execution** | Network latency makes test suites painfully slow |
| **Non-Deterministic** | Real data changes, causing flaky tests |
| **Environment Dependencies** | Tests require production-like environments |
| **Risk of Side Effects** | Tests might accidentally send emails or create invoices |

### The Solution

StuntDouble provides **transparent tool mocking** that intercepts tool calls and returns controlled responses:

```
┌──────────────────────────────────────────────────────────────┐
│                                                              │
│   Before StuntDouble                                         │
│   ──────────────────                                         │
│                                                              │
│   ┌─────────┐     ┌──────────────┐     ┌────────────────┐    │
│   │  Agent  │────▶│ list_bills() │────▶│ Real API       │    │
│   └─────────┘     └──────────────┘     │ • Slow         │    │
│                                        │ • Costly       │    │
│                                        │ • Flaky        │    │
│                                        └────────────────┘    │
│                                                              │
│   After StuntDouble                                          │
│   ─────────────────                                          │
│                                                              │
│   ┌─────────┐     ┌──────────────┐     ┌────────────────┐    │
│   │  Agent  │────▶│ StuntDouble  │────▶│ Mock Function  │    │
│   └─────────┘     └──────┬───────┘     │ • Instant      │    │
│                          │             │ • Free         │    │
│                          ▼             │ • Controlled   │    │
│              scenario_metadata?        └────────────────┘    │
│                ├── YES ──▶ Return mock                       │
│                └── NO ───▶ Call real tool                    │
│                                                              │
└──────────────────────────────────────────────────────────────┘
```

## Key Features

| Feature | Description |
|---------|-------------|
| 🚀 **LangGraph Native** | Per-invocation mocking via `RunnableConfig`—no global state |
| ✨ **Transparent Mocking** | Wrap tools once, toggle between real and mock |
| 🔄 **Zero Code Changes** | Agent code remains unchanged, production-ready |
| 🔍 **Smart Input Matching** | Operators like `$gt`, `$in`, `$regex` for conditional mocking |
| ⏰ **Dynamic Outputs** | Placeholders for `{{now}}`, `{{input.id}}`, `{{uuid}}` |
| 🔗 **Fluent Builder API** | Chainable mock registration: `mock("tool").when(...).returns(...)` |
| 📝 **Call Recording** | Capture and assert on tool calls with `CallRecorder` |
| 🔐 **Signature Validation** | Catch mock/tool signature mismatches at registration or runtime |
| 🪞 **MCP Tool Mirroring** | Auto-generate mocks from MCP server schemas |
| 🔑 **Context-Aware Mocks** | Access runtime config (user ID, headers) in mock factories |

## Getting Started

### Quick Example

Best for LangGraph agents. Per-invocation mocking via `RunnableConfig`.
```python
from langgraph.prebuilt import ToolNode
from stuntdouble import (
    mockable_tool_wrapper,
    default_registry,
    inject_scenario_metadata,
)

mock = default_registry.mock  # Fluent builder on the default registry

# Register mocks on the default registry (fluent API)
mock("list_bills").returns({"bills": [{"id": "B001", "amount": 500}]})

# Build the graph with the native ToolNode + mockable wrapper.
# `builder`, `tools`, `graph`, and `state` come from your existing
# LangGraph setup and are not shown here.
builder.add_node("tools", ToolNode(tools, awrap_tool_call=mockable_tool_wrapper))

# Invoke with mocks via config
config = inject_scenario_metadata({}, {"scenario_id": "landing-page-demo"})
result = await graph.ainvoke(state, config=config)
```

**Benefits:**

- ✅ Fully concurrent—each request gets its own mocks
- ✅ No global state or environment variables
- ✅ Native `ToolNode` integration—minimal code changes
- ✅ Call recording for test assertions
- ✅ Signature validation to catch config errors early

→ [LangGraph Integration Guide](guides/langgraph-integration.md)

### Auto-Generate Mocks from MCP Servers

Already using an MCP server?
StuntDouble can auto-discover its tools and generate mock implementations, which you can then use in your LangGraph agent workflow:

```python
from stuntdouble.mirroring import ToolMirror

mirror = ToolMirror()
mirror.mirror(["python", "-m", "my_mcp_server"])
tools = mirror.to_langchain_tools()
```

→ [MCP Mirroring Guide](guides/mcp-mirroring.md)

## Installation

```shell
# Using uv (recommended)
uv add stuntdouble

# Using pip
pip install stuntdouble

# Using Poetry
poetry add stuntdouble
```

For MCP mirroring support, install the optional extra:

```shell
pip install "stuntdouble[mcp]"
```

## Documentation Structure

```{toctree}
:maxdepth: 2
:caption: Getting Started

guides/quickstart
guides/langgraph-integration
```

```{toctree}
:maxdepth: 2
:caption: Guides

guides/mcp-mirroring
guides/matchers-and-resolvers
guides/mock-builder
guides/call-recording
guides/context-aware-mocks
guides/signature-validation
```

```{toctree}
:maxdepth: 2
:caption: Reference

reference/mock-format
reference/dependencies
reference/schema-validation
architecture/overview
```

## Support

For questions or feedback, please open an issue on the repository.
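As background for the matcher operators mentioned in the feature list (`$gt`, `$in`, `$regex`), the sketch below shows how operator-based input matching can work in general. This is a minimal, self-contained illustration only — it is not StuntDouble's implementation, and the helper names (`matches`, `input_matches`) are made up for this example; see the [Matchers and Resolvers guide](guides/matchers-and-resolvers.md) for the real API.

```python
import re

# Illustrative only: a minimal operator-based matcher in the spirit of
# `$gt` / `$in` / `$regex` conditions. Not StuntDouble's actual code.
def matches(condition: dict, value) -> bool:
    """Return True if `value` satisfies every operator in `condition`."""
    for op, expected in condition.items():
        if op == "$gt":
            if not value > expected:
                return False
        elif op == "$in":
            if value not in expected:
                return False
        elif op == "$regex":
            if not re.search(expected, str(value)):
                return False
        else:
            raise ValueError(f"Unknown operator: {op}")
    return True

def input_matches(conditions: dict, tool_input: dict) -> bool:
    """Check each field's condition against the tool call's input."""
    return all(
        matches(cond, tool_input.get(field)) for field, cond in conditions.items()
    )

# A hypothetical condition: mock only large amounts for known users
conditions = {"amount": {"$gt": 100}, "user_id": {"$in": ["U1", "U2"]}}
print(input_matches(conditions, {"amount": 500, "user_id": "U1"}))  # True
print(input_matches(conditions, {"amount": 50, "user_id": "U1"}))   # False
```

The design point to notice is that each operator is a small predicate over one input field, so conditional mocks compose naturally: a mock applies only when every field condition holds, and unmatched calls can fall through to the real tool.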