
visibility into agent behavior

Agent Observability

See exactly what your agents are doing, why they made each decision, and what went wrong when things break. No instrumentation required.

the debugging problem

It's 2am. Your agent is producing bad outputs for a subset of users. You need to understand why.

With traditional agent deployments, you're flying blind. You have logs that show the agent ran. Maybe you captured the final output. But the reasoning chain? The tool calls? The intermediate state? Gone.

You can't reproduce the issue because you don't know what inputs triggered it. You can't fix it because you don't know where in the process things went wrong. You're reduced to guessing, adding print statements, and hoping the issue happens again while you're watching.

Agent observability isn't optional. It's the difference between debugging in minutes and debugging in days.

what you need to see

Agents are different from traditional software. A web request is a function call: input in, output out, maybe some database queries in between. An agent is a reasoning process: multiple steps, external tool calls, decisions based on intermediate results, and behavior that varies based on context.

Effective agent observability captures:

  • the complete reasoning chain: every thought, every decision point, every branch taken
  • tool inputs and outputs: what the agent asked for, what it got back, how long it took
  • token usage: where context is being consumed, which steps are expensive
  • timing: how long each step takes, where bottlenecks occur
  • state changes: how agent memory and context evolve through execution

This isn't just logging. It's a complete trace of agent behavior that you can replay, analyze, and debug.
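
To make that concrete, here is a minimal sketch of what one trace record per agent step might contain. The field names below are illustrative only, not inference.sh's actual schema:

    from dataclasses import dataclass, field
    from typing import Any

    @dataclass
    class TraceEvent:
        step_id: str              # position in the reasoning chain
        kind: str                 # "thought" | "tool_call" | "state_change"
        input: Any                # what the agent sent (e.g. tool arguments)
        output: Any               # what came back
        tokens_in: int = 0        # prompt tokens consumed at this step
        tokens_out: int = 0       # completion tokens produced
        duration_ms: float = 0.0  # wall-clock time for the step
        state_diff: dict = field(default_factory=dict)  # memory/context delta

    @dataclass
    class Trace:
        run_id: str
        events: list[TraceEvent] = field(default_factory=list)

        def total_tokens(self) -> int:
            # "token usage: where context is being consumed"
            return sum(e.tokens_in + e.tokens_out for e in self.events)

        def slowest_step(self) -> TraceEvent:
            # "timing: where bottlenecks occur"
            return max(self.events, key=lambda e: e.duration_ms)

A trace like this is what makes replay and analysis possible: every field a debugger would want is captured at the moment it happened.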

real-time visibility

Agents often run for minutes or hours. You can't wait until completion to see what's happening. You need real-time streaming of agent state.

With real-time observability, you can:

  • watch agents think through problems live
  • see tool calls as they execute
  • catch issues before they compound
  • understand why an agent is taking a particular path

Users also benefit from real-time streaming. Instead of waiting for a final response, they see the agent working. They can interrupt if the agent goes off track. They understand that complex requests take time.
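
As a rough illustration, consuming such a stream can be as simple as reading events as they arrive. The endpoint and payload shape here are assumptions for the sketch, not inference.sh's real API:

    import json
    import urllib.request

    def watch_run(run_id: str, base_url: str) -> None:
        """Print agent steps live, assuming newline-delimited JSON events."""
        with urllib.request.urlopen(f"{base_url}/runs/{run_id}/events") as resp:
            for raw in resp:                      # one line per event
                event = json.loads(raw)
                print(f"[{event['kind']}] {event.get('summary', '')}")
                if event["kind"] == "tool_call":  # catch issues as they happen
                    print(f"  -> took {event['duration_ms']}ms")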

built-in, not bolted on

Most teams add observability as an afterthought. They integrate a separate tracing product, add instrumentation code throughout their agent logic, and hope they captured enough detail.

This approach has problems:

  • instrumentation is tedious and error-prone
  • you only capture what you thought to instrument
  • multiple tools mean multiple dashboards
  • costs scale with volume in unpredictable ways

When observability is built into the runtime, every step is captured automatically. You don't decide what to log; everything is logged. You don't pay for a separate product; it's included.
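
The difference is easy to see in code. Bolted-on tracing only records the calls someone wrapped by hand; a runtime applies the equivalent of this wrapper to every tool automatically. The names below are illustrative:

    import functools
    import time

    TRACE: list[dict] = []  # stand-in for wherever a runtime ships events

    def traced(tool_fn):
        """What a runtime applies to every tool automatically; a bolted-on
        setup only gets this where someone remembered to add the decorator."""
        @functools.wraps(tool_fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = tool_fn(*args, **kwargs)
            TRACE.append({
                "tool": tool_fn.__name__,
                "args": args,
                "result": result,
                "duration_ms": (time.monotonic() - start) * 1000,
            })
            return result
        return wrapper

    @traced
    def search(query: str) -> str:
        return f"results for {query!r}"

Every tool that doesn't get the decorator is a blind spot. That's the failure mode of bolt-on instrumentation in one line.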

observability in inference.sh

Every agent running on inference.sh automatically gets:

  • complete traces: every tool call, decision, and state change captured
  • real-time streaming: watch agent execution as it happens
  • token tracking: understand costs at the step level
  • replay capability: re-run any execution with the same inputs
  • search and filter: find specific executions by user, time, or outcome

No instrumentation code. No separate products. Just visibility into every agent, automatically.
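
For a feel of what "search and filter" and "replay" mean in practice, here are hypothetical helpers over stored traces. The names and trace shape are assumptions, not the actual SDK:

    def failed_runs(traces: list[dict], user: str | None = None) -> list[dict]:
        """Filter executions by outcome and, optionally, by user."""
        return [
            t for t in traces
            if t["outcome"] == "error" and (user is None or t["user"] == user)
        ]

    def replay(trace: dict, agent) -> object:
        """Re-run an agent with the exact inputs recorded at capture time."""
        return agent.run(**trace["inputs"])  # same inputs -> reproducible bug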

start building observable agents →

what you get

the runtime layer

you could build this. but do you want to?

01

durable execution

event-driven, not long-running. if a tool fails, it doesn't crash your agent loop. state persists across invocations.
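
a hedged sketch of the pattern: each step runs as a short invocation that loads state, acts once, and persists before exiting. storage layout and names are made up for illustration.

    import json
    import pathlib

    STATE_DIR = pathlib.Path("./agent_state")

    def handle_event(run_id: str, event: dict) -> dict:
        """one invocation: load state, take one step, persist, exit."""
        path = STATE_DIR / f"{run_id}.json"
        state = json.loads(path.read_text()) if path.exists() else {"steps": []}

        state["steps"].append(event)  # a failed tool is just another event;
                                      # the loop itself survives it

        STATE_DIR.mkdir(exist_ok=True)
        path.write_text(json.dumps(state))
        return {"status": "ok", "steps_so_far": len(state["steps"])}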

02

tool orchestration

150+ apps as tools via one API. structured execution with approvals when needed. full visibility into what ran.
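
illustrative shape only: one call signature for any integration, with an optional human-approval gate. none of these names are the real API.

    def ask_human(tool: str, args: dict) -> bool:
        """approval gate: a human confirms before a sensitive tool runs."""
        return input(f"approve {tool}({args})? [y/N] ").strip().lower() == "y"

    def run_tool(name: str, args: dict, requires_approval: bool = False) -> dict:
        """one signature, whichever of the integrations it dispatches to."""
        if requires_approval and not ask_human(name, args):
            return {"status": "rejected", "tool": name}
        # ... dispatch to the named integration here ...
        return {"status": "ok", "tool": name, "args": args}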

03

observability

real-time streaming and logs for every action. see exactly what your agent is doing.

04

pay-per-execution

no idle costs while tools run or while you wait on results. you're not paying to keep a process alive.

plug any model, swap providers without changing code

openai
anthropic
google
meta
mistral
deepseek
+ 500 more

ready to ship?

start with the hosted platform. deploy your own when you're ready.
