every agent running on inference.sh is automatically traced. no configuration, no instrumentation code, no separate products.
the visibility problem
agents are opaque by default. a user sends a message, the agent responds. between those events, the agent might have reasoned through multiple approaches, called several tools, and made numerous decisions. all invisible unless you capture it.
without visibility, debugging is speculation. you can't fix what you can't see.
what gets captured
- reasoning chain: every thought, decision point, and branch taken
- tool calls: inputs, outputs, timing, success/failure
- token usage: where context is consumed, which steps are expensive
- state changes: how memory and context evolve through execution
- timing: duration of each step, where bottlenecks occur
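taken together, a captured step is just structured data. here's a minimal sketch of what one trace step could look like — the field names and shapes are illustrative assumptions, not the inference.sh schema:

```python
# hypothetical trace-step records; names are illustrative, not the platform schema
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str            # which tool was invoked
    inputs: dict         # what the agent passed in
    output: str          # what came back
    duration_ms: float   # how long the call took
    ok: bool             # success/failure

@dataclass
class TraceStep:
    step: int
    reasoning: str                  # the thought or decision at this point
    tool_calls: list = field(default_factory=list)
    tokens_in: int = 0              # context consumed by this step
    tokens_out: int = 0
    state_diff: dict = field(default_factory=dict)  # how memory/context changed
    duration_ms: float = 0.0        # where the time went

# one step of a hypothetical run: the agent decides a lookup is needed
step = TraceStep(
    step=1,
    reasoning="user asked for current weather; a web lookup is needed",
    tool_calls=[ToolCall("web_search", {"q": "weather berlin"}, "12C, rain", 340.0, True)],
    tokens_in=412,
    tokens_out=58,
    state_diff={"last_tool": "web_search"},
    duration_ms=1210.0,
)
```

with every step stored like this, "which tool failed" or "which step burned the context window" becomes a lookup instead of a guess.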
real-time streaming
agents often run for minutes. you can't wait until completion to see what's happening.
with inference.sh, you can:
- watch agents think through problems live
- see tool calls as they execute
- catch issues before they compound
- understand why the agent is taking a particular path
users see the agent working, not a blank screen.
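the streaming model is simple: consume events as they arrive and render each one immediately, rather than waiting for the run to finish. a sketch, with a stand-in event source (the event shapes and the stream itself are assumptions for illustration, not the platform api):

```python
# a sketch of watching a live trace stream; event shapes are hypothetical
def fake_event_stream():
    # stand-in for a live trace feed; a real stream would arrive over
    # a socket or sse connection while the agent executes
    yield {"type": "thought", "text": "breaking the task into two lookups"}
    yield {"type": "tool_call", "name": "web_search", "status": "running"}
    yield {"type": "tool_call", "name": "web_search", "status": "ok"}
    yield {"type": "answer", "text": "done"}

def watch(stream):
    # render each event the moment it arrives instead of after completion
    lines = []
    for event in stream:
        if event["type"] == "thought":
            lines.append(f"thinking: {event['text']}")
        elif event["type"] == "tool_call":
            lines.append(f"tool {event['name']}: {event['status']}")
        else:
            lines.append(f"final: {event['text']}")
    return lines

for line in watch(fake_event_stream()):
    print(line)
```

because rendering happens per event, a tool call stuck in "running" is visible the moment it stalls — not minutes later when the run times out.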
built-in, not bolted on
most teams add observability as an afterthought: integrating tracing products, adding instrumentation, managing separate dashboards.
when observability is part of the runtime:
- no configuration required
- every step captured automatically
- no separate products to manage
- no additional cost that scales with trace volume