
visibility into agent behavior

Agent Observability

See exactly what your agents are doing, why they made each decision, and what went wrong when things break. No instrumentation required.

the debugging problem

It's 2am. Your agent is producing bad outputs for a subset of users. You need to understand why.

With traditional agent deployments, you're flying blind. You have logs that show the agent ran. Maybe you captured the final output. But the reasoning chain? The tool calls? The intermediate state? Gone.

You can't reproduce the issue because you don't know what inputs triggered it. You can't fix it because you don't know where in the process things went wrong. You're reduced to guessing, adding print statements, and hoping the issue happens again while you're watching.

Agent observability isn't optional. It's the difference between debugging in minutes and debugging in days.

what you need to see

Agents are different from traditional software. A web request is a function call: input in, output out, maybe some database queries in between. An agent is a reasoning process: multiple steps, external tool calls, decisions based on intermediate results, and behavior that varies based on context.

Effective agent observability captures:

  • the complete reasoning chain: every thought, every decision point, every branch taken
  • tool inputs and outputs: what the agent asked for, what it got back, how long it took
  • token usage: where context is being consumed, which steps are expensive
  • timing: how long each step takes, where bottlenecks occur
  • state changes: how agent memory and context evolve through execution

This isn't just logging. It's a complete trace of agent behavior that you can replay, analyze, and debug.
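
To make that concrete, here is a minimal sketch of what one trace record per agent step might contain. The field names below are illustrative only, not inference.sh's actual schema:

    from dataclasses import dataclass, field
    from typing import Any

    @dataclass
    class TraceEvent:
        step_id: str              # position in the reasoning chain
        kind: str                 # "thought" | "tool_call" | "state_change"
        input: Any                # what the agent sent (e.g. tool arguments)
        output: Any               # what came back
        tokens_in: int = 0        # prompt tokens consumed at this step
        tokens_out: int = 0       # completion tokens produced
        duration_ms: float = 0.0  # wall-clock time for the step
        state_diff: dict = field(default_factory=dict)  # memory/context delta

    @dataclass
    class Trace:
        run_id: str
        events: list[TraceEvent] = field(default_factory=list)

        def total_tokens(self) -> int:
            # "token usage: where context is being consumed"
            return sum(e.tokens_in + e.tokens_out for e in self.events)

        def slowest_step(self) -> TraceEvent:
            # "timing: where bottlenecks occur"
            return max(self.events, key=lambda e: e.duration_ms)

A trace like this is what makes replay and analysis possible: every field a debugger would want is captured at the moment it happened.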

real-time visibility

Agents often run for minutes or hours. You can't wait until completion to see what's happening. You need real-time streaming of agent state.

With real-time observability, you can:

  • watch agents think through problems live
  • see tool calls as they execute
  • catch issues before they compound
  • understand why an agent is taking a particular path

Users also benefit from real-time streaming. Instead of waiting for a final response, they see the agent working. They can interrupt if the agent goes off track. They understand that complex requests take time.
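
As a rough illustration, consuming such a stream can be as simple as reading events as they arrive. The endpoint and payload shape here are assumptions for the sketch, not inference.sh's real API:

    import json
    import urllib.request

    def watch_run(run_id: str, base_url: str) -> None:
        """Print agent steps live, assuming newline-delimited JSON events."""
        with urllib.request.urlopen(f"{base_url}/runs/{run_id}/events") as resp:
            for raw in resp:                      # one line per event
                event = json.loads(raw)
                print(f"[{event['kind']}] {event.get('summary', '')}")
                if event["kind"] == "tool_call":  # catch issues as they happen
                    print(f"  -> took {event['duration_ms']}ms")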

built-in, not bolted on

Most teams add observability as an afterthought. They integrate a separate tracing product, add instrumentation code throughout their agent logic, and hope they captured enough detail.

This approach has problems:

  • instrumentation is tedious and error-prone
  • you only capture what you thought to instrument
  • multiple tools mean multiple dashboards
  • costs scale with volume in unpredictable ways

When observability is built into the runtime, every step is captured automatically. You don't decide what to log; everything is logged. You don't pay for a separate product; it's included.
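
The difference is easy to see in code. Bolted-on tracing only records the calls someone wrapped by hand; a runtime applies the equivalent of this wrapper to every tool automatically. The names below are illustrative:

    import functools
    import time

    TRACE: list[dict] = []  # stand-in for wherever a runtime ships events

    def traced(tool_fn):
        """What a runtime applies to every tool automatically; a bolted-on
        setup only gets this where someone remembered to add the decorator."""
        @functools.wraps(tool_fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = tool_fn(*args, **kwargs)
            TRACE.append({
                "tool": tool_fn.__name__,
                "args": args,
                "result": result,
                "duration_ms": (time.monotonic() - start) * 1000,
            })
            return result
        return wrapper

    @traced
    def search(query: str) -> str:
        return f"results for {query!r}"

Every tool that doesn't get the decorator is a blind spot. That's the failure mode of bolt-on instrumentation in one line.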

observability in inference.sh

Every agent running on inference.sh automatically gets:

  • complete traces: every tool call, decision, and state change captured
  • real-time streaming: watch agent execution as it happens
  • token tracking: understand costs at the step level
  • replay capability: re-run any execution with the same inputs
  • search and filter: find specific executions by user, time, or outcome

No instrumentation code. No separate products. Just visibility into every agent, automatically.
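
For a feel of what "search and filter" and "replay" mean in practice, here are hypothetical helpers over stored traces. The names and trace shape are assumptions, not the actual SDK:

    def failed_runs(traces: list[dict], user: str | None = None) -> list[dict]:
        """Filter executions by outcome and, optionally, by user."""
        return [
            t for t in traces
            if t["outcome"] == "error" and (user is None or t["user"] == user)
        ]

    def replay(trace: dict, agent) -> object:
        """Re-run an agent with the exact inputs recorded at capture time."""
        return agent.run(**trace["inputs"])  # same inputs -> reproducible bug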

start building observable agents →

what you get

the runtime layer

you could build this. but do you want to?

01

durable execution

event-driven, not long-running. if a tool fails, it doesn't crash your agent loop. state persists across invocations.
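
a hedged sketch of the pattern: each step runs as a short invocation that loads state, acts once, and persists before exiting. storage layout and names are made up for illustration.

    import json
    import pathlib

    STATE_DIR = pathlib.Path("./agent_state")

    def handle_event(run_id: str, event: dict) -> dict:
        """one invocation: load state, take one step, persist, exit."""
        path = STATE_DIR / f"{run_id}.json"
        state = json.loads(path.read_text()) if path.exists() else {"steps": []}

        state["steps"].append(event)  # a failed tool is just another event;
                                      # the loop itself survives it

        STATE_DIR.mkdir(exist_ok=True)
        path.write_text(json.dumps(state))
        return {"status": "ok", "steps_so_far": len(state["steps"])}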

02

tool orchestration

150+ apps as tools via one API. structured execution with approvals when needed. full visibility into what ran.
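
illustrative shape only: one call signature for any integration, with an optional human-approval gate. none of these names are the real API.

    def ask_human(tool: str, args: dict) -> bool:
        """approval gate: a human confirms before a sensitive tool runs."""
        return input(f"approve {tool}({args})? [y/N] ").strip().lower() == "y"

    def run_tool(name: str, args: dict, requires_approval: bool = False) -> dict:
        """one signature, whichever of the integrations it dispatches to."""
        if requires_approval and not ask_human(name, args):
            return {"status": "rejected", "tool": name}
        # ... dispatch to the named integration here ...
        return {"status": "ok", "tool": name, "args": args}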

03

observability

real-time streaming and logs for every action. see exactly what your agent is doing.

04

pay-per-execution

no idle costs while tools run or while you wait on results. you're not paying to keep a process alive.

plug any model, swap providers without changing code

openai
anthropic
google
meta
mistral
deepseek
+ 500 more

ready to ship?

start with the hosted platform. deploy your own when you're ready.
