inference.sh

controlled autonomy

Human-in-the-Loop

Agents that know when to ask. Add approval gates to any action with one flag. The agent pauses, shows what it wants to do, and waits for your confirmation.

the autonomy problem

Autonomous agents are powerful. They can research, draft, analyze, and execute without constant supervision. That's the whole point.

But full autonomy is terrifying in production. An agent that can send emails can send the wrong email. An agent that can modify databases can modify them incorrectly. An agent that can make purchases can make expensive mistakes.

The question isn't whether to give agents autonomy. It's how to give them autonomy with appropriate guardrails. You want agents that can work independently on routine tasks but pause for human judgment on consequential actions.

traditional approaches don't work

Most teams solve this by limiting what agents can do. They remove dangerous tools entirely, or they require approval for everything.

Removing tools reduces capability. If your agent can't send emails, it can't complete workflows that require sending emails. You end up with agents that can research but can't act, which pushes the work back to humans anyway.

Requiring approval for everything destroys the productivity benefit. If a human has to approve every tool call, they might as well do the task themselves. The overhead makes agents impractical.

What you need is selective approval: autonomy for safe actions, human judgment for consequential ones.

how human-in-the-loop works

Human-in-the-loop is a runtime capability, not just a UI pattern. When an agent reaches an action that requires approval:

  • execution pauses: the agent's state is persisted, but no action is taken
  • request surfaces: the user sees what the agent wants to do, with full context
  • human decides: approve, reject, or modify the proposed action
  • execution resumes: the agent continues with the human's decision

This requires durable execution. The agent can't just sit in memory waiting for approval; it might wait hours or days. The runtime must persist state, handle the approval asynchronously, and resume execution when the decision arrives.
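The pause, persist, decide, and resume steps above can be sketched in a few lines. This is a minimal illustration, not the inference.sh runtime: the file-based store and the names `request_approval`, `resume`, and `TOOLS` are assumptions made up for the example.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import Optional

# Assumption: a temp directory stands in for the runtime's durable store.
STORE = Path(tempfile.mkdtemp())


def send_email(to: str, subject: str) -> str:
    """Stand-in for a consequential tool (hypothetical)."""
    return f"sent '{subject}' to {to}"


TOOLS = {"send_email": send_email}


def request_approval(tool: str, args: dict, reason: str) -> str:
    """Pause: persist the proposed call instead of executing it."""
    request_id = uuid.uuid4().hex
    record = {"tool": tool, "args": args, "reason": reason, "status": "pending"}
    (STORE / f"{request_id}.json").write_text(json.dumps(record))
    return request_id


def resume(request_id: str, decision: str, new_args: Optional[dict] = None) -> str:
    """Resume: reload the persisted request and apply the human's decision."""
    path = STORE / f"{request_id}.json"
    record = json.loads(path.read_text())
    if record["status"] != "pending":
        raise ValueError("request already resolved")
    if decision == "reject":
        record["status"] = "rejected"
        result = "rejected: no action taken"
    elif decision in ("approve", "modify"):
        if decision == "modify":
            # Human edits override the agent's proposed parameters.
            record["args"] = {**record["args"], **(new_args or {})}
        record["status"] = "approved"
        result = TOOLS[record["tool"]](**record["args"])
    else:
        raise ValueError(f"unknown decision: {decision}")
    path.write_text(json.dumps(record))
    return result
```

Nothing is held in memory between the two calls, so `resume` could run hours later or in a different process, provided it points at the same store.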

one flag to enable

In inference.sh, human-in-the-loop is a single configuration option:

agent = Agent(
    name="my-agent",
    tools=[send_email, update_database, ...],
    human_in_the_loop=True  # that's it
)

When enabled, tool calls pause for approval before executing. The user sees exactly what the agent wants to do: the tool being called, the parameters being passed, and why the agent chose this action.

You can also configure approval at the tool level. Mark specific tools as requiring approval while letting others execute automatically. Research tools run freely; action tools pause for confirmation.
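As a rough sketch of the tool-level idea, the example below tags each tool with an approval flag and checks it at dispatch time. The `tool` decorator, `dispatch` function, and `approver` callback are hypothetical names for illustration and are not the inference.sh API.

```python
from typing import Callable


def tool(requires_approval: bool = False) -> Callable:
    """Hypothetical decorator that tags a function with an approval policy."""
    def wrap(fn: Callable) -> Callable:
        fn.requires_approval = requires_approval
        return fn
    return wrap


@tool()  # research tool: runs freely
def web_search(query: str) -> str:
    return f"results for {query}"


@tool(requires_approval=True)  # action tool: pauses for confirmation
def send_email(to: str, body: str) -> str:
    return f"emailed {to}"


def dispatch(fn: Callable, approver: Callable[[str, dict], bool], **args) -> str:
    """Run ungated tools directly; ask the approver before running gated ones."""
    if getattr(fn, "requires_approval", False) and not approver(fn.__name__, args):
        return f"{fn.__name__} blocked: approval denied"
    return fn(**args)
```

Here `approver` is a plain callback that returns `True` or `False`; in a real deployment it would be replaced by whatever asynchronous approval mechanism the runtime provides.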

the user experience

Human-in-the-loop isn't just about safety. It's about trust.

When users see what an agent is about to do before it does it, they develop confidence in the system. They understand how the agent thinks. They catch mistakes before they happen. They feel in control even when the agent is doing the work.

Over time, users learn which actions they always approve, and they might choose to auto-approve certain patterns. Autonomy grows as trust is earned rather than being assumed by default.
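One way to picture auto-approval is as a list of user-defined rules that are checked before a request is sent to a human. The sketch below assumes rules are glob patterns over a single tool argument; `AutoApproveRule` and `ApprovalPolicy` are hypothetical names, and a real system may express patterns quite differently.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AutoApproveRule:
    tool: str     # tool name the rule applies to
    arg: str      # argument to inspect
    pattern: str  # glob pattern the argument value must match


@dataclass
class ApprovalPolicy:
    rules: list = field(default_factory=list)

    def needs_human(self, tool: str, args: dict) -> bool:
        """Return False only when some rule explicitly covers this call."""
        for rule in self.rules:
            value = str(args.get(rule.arg, ""))
            if rule.tool == tool and fnmatchcase(value, rule.pattern):
                return False
        return True  # default: anything not covered still pauses


policy = ApprovalPolicy()
# The user has learned to trust emails to an internal domain.
policy.rules.append(AutoApproveRule("send_email", "to", "*@example.com"))
```

The default stays conservative: a call that no rule matches still goes to a human, so adding rules only widens autonomy where the user has opted in.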

This is how you deploy agents that handle real tasks without the anxiety of full autonomy.

built into the runtime

Human-in-the-loop in inference.sh includes:

  • approval ui: clear presentation of what the agent wants to do
  • async handling: approvals can take minutes, hours, or days
  • modification support: adjust parameters before approving
  • audit trail: full record of what was approved and by whom
  • notification options: email, Slack, or webhook when approval is needed
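To make the audit trail item concrete, here is a sketch of what a single record might contain. The field names and the `record_decision` helper are assumptions for illustration; the actual record format is not described here.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    request_id: str
    tool: str
    proposed_args: dict  # what the agent asked for
    final_args: dict     # what ran after any human modification
    decision: str        # "approved" or "rejected"
    decided_by: str
    decided_at: str      # ISO 8601 timestamp, UTC


def record_decision(log: list, request_id: str, tool: str, proposed_args: dict,
                    final_args: dict, decision: str, decided_by: str) -> AuditEntry:
    """Append one decision to the log as a JSON line and return the entry."""
    entry = AuditEntry(
        request_id=request_id,
        tool=tool,
        proposed_args=proposed_args,
        final_args=final_args,
        decision=decision,
        decided_by=decided_by,
        decided_at=datetime.now(timezone.utc).isoformat(),
    )
    log.append(json.dumps(asdict(entry), sort_keys=True))
    return entry
```

Keeping both the proposed and final parameters makes it possible to see later whether a human changed anything before approving.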

start building agents with human oversight →

agent architecture

full agent primitives

the building blocks for production agent systems

deep-agents

agents spawn sub-agents as tools. orchestrator delegates to specialists. results flow back up the chain. the main context stays focused on the original task.

orchestrator
├── research
│   └── web search app
├── analysis
│   └── long-context llm app
└── writer
    └── post to X app

human in the loop

agent pauses, shows what it wants to do, waits for confirmation.

widgets

agents generate interactive UI on the fly with HTML and CSS: forms, selections, charts, and visualizations, rendered inline to display data.

webhooks

call any API, receive async callbacks. your endpoints or third-party services.

client tools

execute on the user's system (browser, local functions). sync request/response.

memory

built-in key-value store per conversation.

planning

built-in multi-step plans that resume after interruption. well suited to deep agents.

structured output

typed results back to orchestrator.

ready to ship?

start with the hosted platform. deploy your own when you're ready.
