Tool calling (function calling) is where many AI agents fail in production. A typical failure looks like this: the model calls a tool with the wrong parameters, the tool throws an error, and the agent has no recovery strategy. The five patterns below make tool-using agents more reliable.
Pattern 1: Strict Schema Definition
The most common failure mode is loose schemas. If a parameter can be a string or an integer, the model will sometimes guess wrong. Define schemas with the strictest possible typing:
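As a rough illustration, a strict definition might look like the sketch below. The tool name `search_orders`, its fields, and the `check_args` helper are all hypothetical; the helper handles only the handful of JSON Schema keywords used here and is not a substitute for a full validator.

```python
from typing import Any, Dict, List

# Hypothetical tool definition in JSON Schema style. Every parameter has
# exactly one type, closed value sets use enums, and unknown keys are rejected.
SEARCH_ORDERS_TOOL: Dict[str, Any] = {
    "name": "search_orders",
    "description": "Search orders by customer and status.",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "status": {"type": "string", "enum": ["open", "shipped", "cancelled"]},
            "limit": {"type": "integer", "minimum": 1, "maximum": 50},
        },
        "required": ["customer_id", "status"],
        "additionalProperties": False,
    },
}

_PYTHON_TYPES = {"string": str, "integer": int}


def check_args(schema: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means the arguments are valid."""
    problems: List[str] = []
    props = schema["properties"]
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter '{name}'")
    for name, value in args.items():
        if name not in props:
            problems.append(f"unknown parameter '{name}'")
            continue
        spec = props[name]
        expected = _PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"'{name}' must be {spec['type']}, got {type(value).__name__}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"'{name}' must be one of {spec['enum']}")
        if "minimum" in spec and value < spec["minimum"]:
            problems.append(f"'{name}' must be >= {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            problems.append(f"'{name}' must be <= {spec['maximum']}")
    return problems
```

Checking arguments against the schema before the tool runs means a string `"10"` passed for `limit` is caught at the boundary instead of failing somewhere inside the tool.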
Pattern 2: Structured Error Feedback
When a tool call fails, return a structured error that tells the model exactly what went wrong and how to fix it:
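One possible shape for such an error is sketched below. The field names (`code`, `parameter`, `expected`, `received`, `retryable`, `hint`) and the `convert_temperature` tool are assumptions chosen for illustration, not a standard format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional

SUPPORTED_UNITS = ("fahrenheit", "kelvin")


@dataclass
class ToolError:
    """Hypothetical error envelope returned to the model as the tool result."""
    code: str
    message: str
    parameter: Optional[str] = None
    expected: Optional[str] = None
    received: Any = None
    retryable: bool = False  # True when a corrected call is likely to succeed
    hint: Optional[str] = None

    def to_tool_result(self) -> str:
        return json.dumps({"ok": False, "error": asdict(self)})


def convert_temperature(celsius: Any, to_unit: Any) -> str:
    """Hypothetical tool: convert a Celsius value to another unit."""
    if isinstance(celsius, bool) or not isinstance(celsius, (int, float)):
        return ToolError(
            code="INVALID_TYPE",
            message="'celsius' must be a number.",
            parameter="celsius",
            expected="number",
            received=celsius,
            retryable=True,
            hint="Pass a bare number such as 21.5, not a string.",
        ).to_tool_result()
    if to_unit not in SUPPORTED_UNITS:
        return ToolError(
            code="INVALID_VALUE",
            message="'to_unit' is not a supported unit.",
            parameter="to_unit",
            expected=f"one of {list(SUPPORTED_UNITS)}",
            received=to_unit,
            retryable=True,
            hint="Retry with to_unit set to 'fahrenheit' or 'kelvin'.",
        ).to_tool_result()
    value = celsius * 9 / 5 + 32 if to_unit == "fahrenheit" else celsius + 273.15
    return json.dumps({"ok": True, "result": {"value": value, "unit": to_unit}})
```

Because the error names the offending parameter, the expected form, and the value actually received, the model has what it needs to correct the call on the next attempt rather than repeating it unchanged.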
Pattern 3: Idempotency Guards
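AI agents retry on failure. If your tools have side effects (writing to a database, sending emails, calling external APIs), they must be idempotent; otherwise a retry after a timeout can repeat an effect that already happened, such as sending the same email twice. Implement idempotency keys for every stateful operation: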
- Generate a unique operation ID before the tool call, derived from the trace ID and operation type
- Pass the ID to the tool; the tool stores completed operations and returns the cached result on duplicate calls
- Set a TTL on idempotency records (24 hours is typical) to prevent unbounded growth
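The three steps above can be sketched with an in-memory store. The names `operation_id` and `IdempotencyStore` are hypothetical, and a real deployment would more likely keep the records in a shared database or cache; this version is single-process and not thread-safe.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24 hours, as suggested above


def operation_id(trace_id: str, operation: str) -> str:
    """Derive a stable ID from the trace ID and operation type."""
    return hashlib.sha256(f"{trace_id}:{operation}".encode("utf-8")).hexdigest()


class IdempotencyStore:
    """In-memory record of completed operations (illustrative only)."""

    def __init__(self, ttl: float = TTL_SECONDS, clock: Callable[[], float] = time.time):
        self._ttl = ttl
        self._clock = clock
        self._records: Dict[str, Tuple[float, Any]] = {}  # op_id -> (expires_at, result)

    def run_once(self, op_id: str, action: Callable[[], Any]) -> Any:
        now = self._clock()
        # Drop expired records so the store does not grow without bound.
        self._records = {k: v for k, v in self._records.items() if v[0] > now}
        if op_id in self._records:
            return self._records[op_id][1]  # duplicate call: return cached result
        result = action()  # if this raises, nothing is stored and a retry re-runs it
        self._records[op_id] = (now + self._ttl, result)
        return result


# Usage: the side effect happens once even though the agent retries.
sent = []
store = IdempotencyStore()
op_id = operation_id("trace-123", "send_welcome_email")


def send_welcome_email() -> dict:
    sent.append("welcome")
    return {"status": "sent"}


first = store.run_once(op_id, send_welcome_email)
second = store.run_once(op_id, send_welcome_email)  # served from the store
```

If a single trace can legitimately perform the same operation type more than once, the key would also need something that distinguishes those calls, such as a hash of the arguments; that extension is not shown here.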
Pattern 4: Parallel Tool Execution
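Many current LLM APIs support parallel function calling, where the model requests several tool calls in a single turn. When the tools are independent, run them concurrently so that total latency is bounded by the slowest call rather than the sum of all calls: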
| Approach | Latency | Use When |
|---|---|---|
| Sequential | Sum of all tool latencies | Tool B depends on Tool A's output |
| Parallel | Latency of the slowest tool | Tools are independent |
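
The difference between the two rows can be sketched with `asyncio`. The tools `fetch_profile` and `fetch_orders` are hypothetical stand-ins that simulate I/O latency with `asyncio.sleep`; real tools would await a network or database call instead.

```python
import asyncio
import time
from typing import Any, Awaitable, Dict, Tuple


async def fetch_profile(user_id: str) -> Dict[str, Any]:
    await asyncio.sleep(0.1)  # simulated I/O latency
    return {"user_id": user_id, "kind": "profile"}


async def fetch_orders(user_id: str) -> Dict[str, Any]:
    await asyncio.sleep(0.2)  # simulated I/O latency
    return {"user_id": user_id, "kind": "orders"}


async def run_sequential(user_id: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    # One call after the other: total latency is roughly the sum.
    profile = await fetch_profile(user_id)
    orders = await fetch_orders(user_id)
    return profile, orders


async def run_parallel(user_id: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    # Independent calls run concurrently: total latency is roughly the slowest call.
    profile, orders = await asyncio.gather(fetch_profile(user_id), fetch_orders(user_id))
    return profile, orders


async def timed(awaitable: Awaitable[Any]) -> Tuple[Any, float]:
    """Await something and return its result with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = await awaitable
    return result, time.perf_counter() - start


seq_result, seq_seconds = asyncio.run(timed(run_sequential("u1")))
par_result, par_seconds = asyncio.run(timed(run_parallel("u1")))
```

By default `asyncio.gather` propagates the first exception raised by any of the calls; passing `return_exceptions=True` returns exceptions alongside successful results, which is often easier to turn into per-tool error feedback.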
Pattern 5: Tool Result Validation
Don't trust tool results blindly. Validate them before passing to the next step:
- Schema validation: Ensure the tool returned the expected structure
- Range checks: Confirm values fall within plausible bounds; a search that returns 0 results may indicate a bad query rather than a genuinely empty set
- Freshness checks: External API data may be stale; check timestamps
- Consistency checks: Cross-validate results from multiple tools when they should agree
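A minimal sketch of these checks for a search-style tool follows. The result shape (`items`, `fetched_at`), the 15-minute freshness budget, and the function names are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, List, Optional

MAX_AGE = timedelta(minutes=15)  # assumed freshness budget


def validate_search_result(result: Any, now: Optional[datetime] = None) -> List[str]:
    """Return a list of issues; an empty list means the result looks usable."""
    now = now or datetime.now(timezone.utc)
    if not isinstance(result, dict):
        return ["schema: result is not an object"]
    issues: List[str] = []

    # Schema validation plus the empty-result check.
    items = result.get("items")
    if not isinstance(items, list):
        issues.append("schema: 'items' must be a list")
    elif not items:
        issues.append("empty: zero results; the query may be malformed")

    # Freshness check on the timestamp reported by the tool.
    fetched_at = result.get("fetched_at")
    if not isinstance(fetched_at, str):
        issues.append("schema: 'fetched_at' must be an ISO 8601 string")
    else:
        try:
            ts = datetime.fromisoformat(fetched_at)
        except ValueError:
            issues.append("schema: 'fetched_at' is not a valid ISO 8601 timestamp")
        else:
            if ts.tzinfo is None:
                ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
            if now - ts > MAX_AGE:
                issues.append("stale: data is older than the freshness budget")
    return issues


def counts_agree(count_a: int, count_b: int, tolerance: int = 0) -> bool:
    """Consistency check: two tools reporting the same quantity should match."""
    return abs(count_a - count_b) <= tolerance
```

An agent loop could feed a non-empty issue list back to the model as a structured error, or retry the tool, instead of passing the suspect result to the next step.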