inference.sh is an agent runtime: the infrastructure layer that executes your agent code, with built-in solutions for the hard operational problems.
runtime vs framework
a framework gives you building blocks. you import a library, define your agent logic, and figure out where to deploy it yourself.
a runtime is infrastructure that executes your agent with specific guarantees about:
- durability: state persists across failures and restarts
- observability: every decision and action is captured automatically
- tool orchestration: managed integrations, with authentication handled for you
- human oversight: approval gates for consequential actions
the distinction matters because agent workloads have unique characteristics that general-purpose infrastructure handles poorly.
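to make the human oversight guarantee concrete, here is a minimal sketch of an approval gate. it is an illustration only, not the inference.sh API: `ApprovalGate`, `execute`, and the action names are hypothetical, and a real runtime would persist pending approvals rather than hold them in memory.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalGate:
    """Hypothetical gate: holds sensitive actions until a human approves them."""
    sensitive: set
    pending: list = field(default_factory=list)

    def execute(self, action: str, handler: Callable[[], object], approved: bool = False) -> dict:
        # sensitive actions are parked, not run, until approval is given
        if action in self.sensitive and not approved:
            self.pending.append(action)
            return {"status": "pending_approval", "action": action}
        # everything else (or an approved action) runs immediately
        return {"status": "done", "result": handler()}


gate = ApprovalGate(sensitive={"send_refund"})
first = gate.execute("send_refund", lambda: "refunded")
second = gate.execute("send_refund", lambda: "refunded", approved=True)
lookup = gate.execute("lookup_order", lambda: "order #1")
```

the first call returns a pending status without running the handler, the approved retry runs it, and the non-sensitive lookup runs straight away.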
why agents need a runtime
agents are different from traditional software:
- they make long sequences of decisions punctuated by external calls
- they need to maintain conversation context across many interactions
- they call tools that might fail, time out, or return unexpected results
- they run for unpredictable durations: sometimes seconds, sometimes hours
- they need human oversight for sensitive actions
standard web frameworks assume request-response cycles measured in milliseconds. container orchestration assumes stateless workloads. message queues assume independent jobs. agent workloads fit none of these patterns well.
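as an illustration of the plumbing this implies, here is a rough sketch of the retry logic you would otherwise write by hand around every tool call. the helper `call_tool_with_retry` and its parameters are assumptions made for this example, not part of inference.sh.

```python
import time


def call_tool_with_retry(tool, args, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call a tool, retrying transient failures with exponential backoff."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return tool(**args)
        except (TimeoutError, ConnectionError) as err:
            # only transient errors are retried; anything else propagates
            last_error = err
            if attempt < max_attempts:
                # wait base_delay, then 2x, then 4x, ... between attempts
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"tool failed after {max_attempts} attempts") from last_error
```

this covers only one failure mode. unexpected results, hour-long runs, and state that must survive a restart each need their own handling, which is the gap a runtime is meant to fill.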
what the runtime handles
when you run agents on inference.sh, you get:
| capability | what it does |
|---|---|
| durable execution | state checkpoints after each step; resume on failure |
| observability | automatic tracing of every decision and tool call |
| human-in-the-loop | approval gates for consequential actions |
| tool orchestration | 150+ integrations with managed authentication |
you write agent logic. the runtime handles operations.
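the "checkpoint after each step, resume on failure" idea in the table can be sketched in a few lines. this is a simplified illustration under stated assumptions, not how inference.sh implements it: `run_with_checkpoints`, the json file format, and the step signature are all hypothetical.

```python
import json
import os


def run_with_checkpoints(steps, path):
    """Run steps in order, saving state after each so a rerun resumes where it stopped."""
    # load prior state if a checkpoint exists, otherwise start fresh
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "results": []}

    for i in range(state["next_step"], len(steps)):
        # each step receives the results of all earlier steps
        result = steps[i](state["results"])
        state["results"].append(result)
        state["next_step"] = i + 1
        # write to a temp file, then rename, so a crash mid-write
        # cannot leave a half-written checkpoint behind
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)
    return state["results"]
```

if a step raises, the checkpoint still records every step that completed, so calling the function again with the same path skips them and retries only from the failed step.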
learn more
- durable execution: how agents survive failures
- observability: visibility into agent behavior
- human-in-the-loop: approval gates for actions
- tool orchestration: managed integrations
or read the deep dive on agent runtimes.