# inference.sh

> The AI agent runtime. Build agents that can actually do things—with durable execution, human-in-the-loop approval, and real-time observability.

inference.sh is an open-source agent runtime for deploying production AI agents. Unlike agent frameworks that just orchestrate LLM calls, inference.sh provides the infrastructure agents need: durable execution that survives failures, human-in-the-loop approval for critical actions, real-time streaming and observability, and pre-built tool integrations.

Key differentiators from LangChain/LangGraph:
- **Durable execution**: Every tool call is persisted, with graph-backed visibility into agent decisions
- **Human-in-the-loop**: One flag to require approval before tool execution
- **Real tools**: Pre-built integrations that actually work, not just wrappers
- **Observability built-in**: Every agent call is logged, traced, and streamable
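
As a rough illustration of the durable-execution idea in the list above, the sketch below persists each tool result to an append-only log so that a restarted process reuses stored results instead of re-running tools. It uses only the Python standard library; `ToolJournal`, `add`, and the JSON-lines file layout are hypothetical names chosen for this example and are not part of the inference.sh runtime or SDK.

```python
import json
import os
import tempfile


class ToolJournal:
    """Append-only JSON-lines log of completed tool calls (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self.done = {}
        # Reload results persisted by an earlier run, if any.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    self.done[record["step"]] = record["result"]

    def run(self, step, tool, *args):
        """Return the stored result for `step`, or call the tool and persist it."""
        if step in self.done:
            return self.done[step]
        result = tool(*args)
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"step": step, "result": result}) + "\n")
        self.done[step] = result
        return result


calls = []  # records every real invocation of the tool


def add(a, b):
    """Stand-in tool; a real tool would have side effects worth not repeating."""
    calls.append((a, b))
    return a + b


journal_path = os.path.join(tempfile.mkdtemp(), "journal.jsonl")

first = ToolJournal(journal_path).run("step-1", add, 2, 3)
# A fresh ToolJournal simulates a process restart: the result comes from disk.
second = ToolJournal(journal_path).run("step-1", add, 2, 3)
print(first, second, len(calls))
```

The tool runs once even though `run` is called twice, which is the property a durable runtime needs when it resumes after a failure.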

## Docs

### getting started

- [introduction](https://inference.sh/docs/getting-started/introduction.md)
- [what is inference.sh?](https://inference.sh/docs/getting-started/what-is-inference.md)
- [workspace tour](https://inference.sh/docs/getting-started/workspace-tour.md)
- [your first agent](https://inference.sh/docs/getting-started/your-first-agent.md)

### runtime

- [what is a runtime?](https://inference.sh/docs/runtime/overview.md)
- [durable execution](https://inference.sh/docs/runtime/durable-execution.md)
- [observability](https://inference.sh/docs/runtime/observability.md)
- [human-in-the-loop](https://inference.sh/docs/runtime/human-in-the-loop.md)
- [tool orchestration](https://inference.sh/docs/runtime/tool-orchestration.md)

### concepts

- [apps](https://inference.sh/docs/concepts/apps.md)
- [tasks](https://inference.sh/docs/concepts/tasks.md)
- [flows](https://inference.sh/docs/concepts/flows.md)
- [agents](https://inference.sh/docs/concepts/agents.md)
- [workers](https://inference.sh/docs/concepts/workers.md)
- [sessions](https://inference.sh/docs/concepts/sessions.md)

### agents

- [overview](https://inference.sh/docs/agents/overview.md)
- [creating an agent](https://inference.sh/docs/agents/creating.md)
- [system prompts](https://inference.sh/docs/agents/system-prompts.md)
- [skills](https://inference.sh/docs/agents/skills.md)
- [adding tools](https://inference.sh/docs/agents/adding-tools.md)
- [sub-agents](https://inference.sh/docs/agents/sub-agents.md)
- [chatting](https://inference.sh/docs/agents/chatting.md)
- [webhooks](https://inference.sh/docs/agents/webhooks.md)

**widgets**

- [overview](https://inference.sh/docs/agents/widgets/overview.md)
- [schema](https://inference.sh/docs/agents/widgets/schema.md)
- [actions](https://inference.sh/docs/agents/widgets/actions.md)
- [card](https://inference.sh/docs/agents/widgets/card.md)
- [box](https://inference.sh/docs/agents/widgets/box.md)
- [row](https://inference.sh/docs/agents/widgets/row.md)
- [col](https://inference.sh/docs/agents/widgets/col.md)
- [spacer](https://inference.sh/docs/agents/widgets/spacer.md)
- [divider](https://inference.sh/docs/agents/widgets/divider.md)
- [form](https://inference.sh/docs/agents/widgets/form.md)
- [text](https://inference.sh/docs/agents/widgets/text.md)
- [title](https://inference.sh/docs/agents/widgets/title.md)
- [caption](https://inference.sh/docs/agents/widgets/caption.md)
- [label](https://inference.sh/docs/agents/widgets/label.md)
- [markdown](https://inference.sh/docs/agents/widgets/markdown.md)
- [button](https://inference.sh/docs/agents/widgets/button.md)
- [input](https://inference.sh/docs/agents/widgets/input.md)
- [textarea](https://inference.sh/docs/agents/widgets/textarea.md)
- [select](https://inference.sh/docs/agents/widgets/select.md)
- [checkbox](https://inference.sh/docs/agents/widgets/checkbox.md)
- [radio-group](https://inference.sh/docs/agents/widgets/radio-group.md)
- [date-picker](https://inference.sh/docs/agents/widgets/date-picker.md)
- [image](https://inference.sh/docs/agents/widgets/image.md)
- [icon](https://inference.sh/docs/agents/widgets/icon.md)
- [badge](https://inference.sh/docs/agents/widgets/badge.md)
- [chart](https://inference.sh/docs/agents/widgets/chart.md)
- [transition](https://inference.sh/docs/agents/widgets/transition.md)

### skills

- [overview](https://inference.sh/docs/skills/overview.md)
- [creating skills](https://inference.sh/docs/skills/creating.md)
- [the registry](https://inference.sh/docs/skills/registry.md)
- [adding to agents](https://inference.sh/docs/skills/adding-to-agents.md)
- [using with other agents](https://inference.sh/docs/skills/other-agents.md)

### apps

- [overview](https://inference.sh/docs/apps/overview.md)
- [browsing the grid](https://inference.sh/docs/apps/browsing-grid.md)
- [running an app](https://inference.sh/docs/apps/running.md)
- [setup parameters](https://inference.sh/docs/apps/setup-parameters.md)

### flows

- [overview](https://inference.sh/docs/flows/overview.md)
- [creating a flow](https://inference.sh/docs/flows/creating.md)
- [connecting nodes](https://inference.sh/docs/flows/connecting.md)
- [deploying as app](https://inference.sh/docs/flows/deploying.md)

### extend

- [overview](https://inference.sh/docs/extend/overview.md)
- [coding agents](https://inference.sh/docs/extend/coding-agents.md)
- [cli setup](https://inference.sh/docs/extend/cli-setup.md)
- [creating an app](https://inference.sh/docs/extend/creating-app.md)
- [app code](https://inference.sh/docs/extend/app-code.md)
- [configuration](https://inference.sh/docs/extend/configuration.md)
- [secrets](https://inference.sh/docs/extend/secrets.md)
- [integrations](https://inference.sh/docs/extend/integrations.md)
- [output meta](https://inference.sh/docs/extend/output-meta.md)
- [best practices](https://inference.sh/docs/extend/best-practices.md)
- [cancellation](https://inference.sh/docs/extend/cancellation.md)
- [troubleshooting](https://inference.sh/docs/extend/troubleshooting.md)
- [deploying](https://inference.sh/docs/extend/deploying.md)
- [multi-function apps](https://inference.sh/docs/extend/multi-function-apps.md)
- [sessions](https://inference.sh/docs/extend/sessions.md)

### api & sdk

- [overview](https://inference.sh/docs/api/overview.md)
- [authentication](https://inference.sh/docs/api/authentication.md)

**sdk**

- [overview](https://inference.sh/docs/api/sdk/overview.md)
- [running apps](https://inference.sh/docs/api/sdk/running-apps.md)
- [streaming](https://inference.sh/docs/api/sdk/streaming.md)
- [polling](https://inference.sh/docs/api/sdk/polling.md)
- [files](https://inference.sh/docs/api/sdk/files.md)

**server proxy**

- [overview](https://inference.sh/docs/api/sdk/server-proxy.md)
- [next.js](https://inference.sh/docs/api/sdk/proxy/nextjs.md)
- [express](https://inference.sh/docs/api/sdk/proxy/express.md)
- [hono](https://inference.sh/docs/api/sdk/proxy/hono.md)
- [remix](https://inference.sh/docs/api/sdk/proxy/remix.md)
- [sveltekit](https://inference.sh/docs/api/sdk/proxy/sveltekit.md)

**agent sdk**

- [overview](https://inference.sh/docs/api/agent/overview.md)
- [template agents](https://inference.sh/docs/api/agent/template.md)
- [ad-hoc agents](https://inference.sh/docs/api/agent/adhoc.md)
- [building tools](https://inference.sh/docs/api/agent/tools.md)
- [client tools](https://inference.sh/docs/api/agent/client-tools.md)
- [app tools](https://inference.sh/docs/api/agent/app-tools.md)
- [agent tools](https://inference.sh/docs/api/agent/agent-tools.md)
- [webhook tools](https://inference.sh/docs/api/agent/webhook-tools.md)
- [internal tools](https://inference.sh/docs/api/agent/internal-tools.md)
- [structured output](https://inference.sh/docs/api/agent/structured-output.md)
- [approval](https://inference.sh/docs/api/agent/approval.md)
- [streaming](https://inference.sh/docs/api/agent/streaming.md)

**rest api**

- [overview](https://inference.sh/docs/api/rest/overview.md)
- [tasks](https://inference.sh/docs/api/rest/tasks.md)
- [files](https://inference.sh/docs/api/rest/files.md)
- [agents](https://inference.sh/docs/api/rest/agents.md)
- [streaming](https://inference.sh/docs/api/rest/streaming.md)

### secrets

- [overview](https://inference.sh/docs/secrets/overview.md)
- [environment variables](https://inference.sh/docs/secrets/environment.md)

### integrations

- [overview](https://inference.sh/docs/integrations/overview.md)
- [google service account](https://inference.sh/docs/integrations/google-service-account.md)
- [google oauth](https://inference.sh/docs/integrations/google-oauth.md)
- [google cloud platform](https://inference.sh/docs/integrations/gcp.md)
- [slack](https://inference.sh/docs/integrations/slack.md)
- [discord](https://inference.sh/docs/integrations/discord.md)
- [x.com](https://inference.sh/docs/integrations/x.md)

### private workers

- [why private?](https://inference.sh/docs/private/why.md)
- [installing the engine](https://inference.sh/docs/private/installing.md)
- [configuration](https://inference.sh/docs/private/config.md)
- [using private workers](https://inference.sh/docs/private/using.md)

### examples

- [image generation](https://inference.sh/docs/examples/image-generation.md)
- [audio transcription](https://inference.sh/docs/examples/audio-transcription.md)
- [content pipeline](https://inference.sh/docs/examples/content-pipeline.md)
- [data processing](https://inference.sh/docs/examples/data-processing.md)
- [multi-agent system](https://inference.sh/docs/examples/multi-agent.md)
- [x.com integration](https://inference.sh/docs/examples/x-integration.md)
- [multi-function app](https://inference.sh/docs/examples/multi-function-app.md)
- [stateful sessions](https://inference.sh/docs/examples/stateful-sessions.md)

## Blog

### skills

- [agent skills overview](https://inference.sh/blog/skills/skills-overview.md)

### concepts

- [workflows vs agents](https://inference.sh/blog/concepts/workflows-vs-agents.md)

### agent runtime

- [why agent runtimes matter](https://inference.sh/blog/agent-runtime/why-runtimes-matter.md)
- [durable execution for agents](https://inference.sh/blog/agent-runtime/durable-execution.md)
- [human-in-the-loop in one flag](https://inference.sh/blog/agent-runtime/human-in-the-loop.md)
- [agent memory without boilerplate](https://inference.sh/blog/agent-runtime/agent-memory.md)
- [the real cost of agent infrastructure](https://inference.sh/blog/agent-runtime/infrastructure-cost.md)

### multi-agent

- [hierarchical agent delegation](https://inference.sh/blog/multi-agent/hierarchical-delegation.md)
- [when to use multi-agent](https://inference.sh/blog/multi-agent/when-to-use.md)
- [concurrent agent execution](https://inference.sh/blog/multi-agent/concurrent-execution.md)

### observability

- [built-in observability](https://inference.sh/blog/observability/built-in.md)
- [debugging agents at 2am](https://inference.sh/blog/observability/debugging-guide.md)
- [real-time streaming](https://inference.sh/blog/observability/streaming.md)

### tools & execution

- [the tool integration tax](https://inference.sh/blog/tools/integration-tax.md)
- [sandboxed code execution](https://inference.sh/blog/tools/sandboxed-execution.md)
- [tool approval gates](https://inference.sh/blog/tools/approval-gates.md)
- [building custom apps](https://inference.sh/blog/tools/building-custom-apps.md)
- [client-side tools](https://inference.sh/blog/tools/client-side-tools.md)

### ux & interfaces

- [agents that generate UI](https://inference.sh/blog/ux/generative-ui.md)
- [agent ux patterns](https://inference.sh/blog/ux/agent-ux-patterns.md)

### guides

- [building a research agent](https://inference.sh/blog/guides/research-agent.md)
- [from demo to production](https://inference.sh/blog/guides/demo-to-production.md)
- [introducing ui.inference.sh](https://inference.sh/blog/guides/shadcn-registry.md)
- [seedance 2.0 is coming](https://inference.sh/blog/guides/seedance-2-video-generation.md)


## Quick Reference

### Agent Structure
```python
from inferencesh import Agent, Tool

agent = Agent(
    name="my-agent",
    model="claude-sonnet-4-20250514",
    system_prompt="You are a helpful assistant.",
    tools=[...],
    human_in_the_loop=True,  # Require approval for tool calls
)
```
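
To make the effect of an approval flag such as `human_in_the_loop=True` concrete, here is a minimal, library-independent sketch of an approval gate: a tool is wrapped so it only runs once an approver callback accepts the call. `with_approval`, `delete_file`, and `approver` are hypothetical names for illustration only; the sketch does not show how inference.sh implements approval, and a real approver would typically wait for a human decision rather than apply a rule.

```python
from typing import Any, Callable, Dict, Tuple


def with_approval(
    tool: Callable[..., Any],
    approve: Callable[[str, Tuple[Any, ...]], bool],
) -> Callable[..., Dict[str, Any]]:
    """Wrap `tool` so it only runs when `approve(name, args)` returns True."""

    def gated(*args: Any) -> Dict[str, Any]:
        if not approve(tool.__name__, args):
            # The tool is never invoked for a rejected call.
            return {"status": "rejected", "tool": tool.__name__}
        return {"status": "ok", "result": tool(*args)}

    return gated


def delete_file(path: str) -> str:
    """Stand-in tool: returns a message and deletes nothing."""
    return f"deleted {path}"


def approver(name: str, args: Tuple[Any, ...]) -> bool:
    """Stand-in for a human reviewer: reject anything under /etc."""
    return not str(args[0]).startswith("/etc")


gated_delete = with_approval(delete_file, approver)
allowed = gated_delete("/tmp/report.txt")
blocked = gated_delete("/etc/passwd")
print(allowed["status"], blocked["status"])
```

Returning a structured rejection instead of raising lets the calling agent see that the action was declined and choose another step.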

### API Usage
```python
from inferencesh import inference

client = inference(api_key="inf_...")

# Run an agent
chat = client.agents.chat("my-agent")
response = chat.send("Help me with...")

# Stream responses
for chunk in chat.stream("What can you do?"):
    # flush=True shows each chunk immediately instead of waiting for a newline
    print(chunk.content, end='', flush=True)
```

## Apps

Pre-built AI apps and tools available on the inference.sh platform:

### Featured

- [google/veo-3-1-fast](https://app.inference.sh/apps/google/veo-3-1-fast): Veo 3.1 Fast via Vertex AI - Generate videos from text prompts or images with optional audio
- [infsh/hunyuanvideo-foley](https://app.inference.sh/apps/infsh/hunyuanvideo-foley): Synthesizes realistic sound effects and audio tracks based on your video content and written descriptions.

### All Apps

- [infsh/remotion-render](https://app.inference.sh/apps/infsh/remotion-render): Render videos from React/Remotion component code — pass TSX, get MP4
- [openrouter/minimax-m-25](https://app.inference.sh/apps/openrouter/minimax-m-25): MiniMax-M2.5 is a state-of-the-art large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1 to extend into general office work, reaching fluency in generating and operating Word, Excel, and PowerPoint files, context switching between diverse software environments, and working across different agent and human teams.
- [falai/kokoro-tts](https://app.inference.sh/apps/falai/kokoro-tts): Kokoro TTS - Lightweight text-to-speech with multiple languages and voices
- [xai/grok-imagine-image-pro](https://app.inference.sh/apps/xai/grok-imagine-image-pro): Generate and edit images using xAI's Grok Imagine Pro model. Supports text-to-image and image editing with multiple aspect ratios.
- [openrouter/claude-opus-46](https://app.inference.sh/apps/openrouter/claude-opus-46): Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective for large codebases, complex refactors, and multi-step debugging that unfolds over time.
- [falai/dia-tts](https://app.inference.sh/apps/falai/dia-tts): Dia TTS - Generate realistic dialogue with emotion control, natural nonverbals, and voice cloning
- [infsh/agent-browser](https://app.inference.sh/apps/infsh/agent-browser): Browser automation for AI agents. Navigate, interact with @e refs, take screenshots, record video with cursor indicator, execute JavaScript. Supports proxy configuration.
- [x/post-tweet](https://app.inference.sh/apps/x/post-tweet): Post tweets to X.com with text (280 char limit) and optional media. Supports up to 4 images or 1 video/GIF. Can reply to or quote other tweets. Images over 5MB are auto-resized.
- [x/dm-send](https://app.inference.sh/apps/x/dm-send): Send a direct message on X.com. Requires the recipient's user ID (not username). Text-only messages; media attachments are not supported.
- [x/user-follow](https://app.inference.sh/apps/x/user-follow): Follow a user on X.com by user ID. Succeeds silently if already following.
- [x/user-get](https://app.inference.sh/apps/x/user-get): Get a user profile from X.com by ID or username. Returns bio, follower/following counts, tweet count, verified status, and profile image URL.
- [x/post-retweet](https://app.inference.sh/apps/x/post-retweet): Retweet a post on X.com by post ID. Succeeds silently if already retweeted.
- [x/post-like](https://app.inference.sh/apps/x/post-like): Like a post on X.com by post ID. Succeeds silently if already liked.
- [x/post-delete](https://app.inference.sh/apps/x/post-delete): Delete a post from X.com by post ID. Can only delete posts authored by the authenticated account.
- [x/post-get](https://app.inference.sh/apps/x/post-get): Get a post by ID from X.com. Returns text, author ID, creation date, and engagement metrics (likes, retweets, replies, quotes).
- [x/post-create](https://app.inference.sh/apps/x/post-create): Create posts on X.com with text (280 char limit) and optional media. Supports up to 4 images or 1 video/GIF. Can reply to or quote other posts. Images over 5MB are auto-resized.
- [xai/grok-imagine-video](https://app.inference.sh/apps/xai/grok-imagine-video): Generate and edit videos using xAI's Grok Imagine Video model. Supports text-to-video, image-to-video, and video editing with configurable duration and resolution.
- [xai/grok-imagine-image](https://app.inference.sh/apps/xai/grok-imagine-image): Generate and edit images using xAI's Grok Imagine model. Supports text-to-image and image editing with multiple aspect ratios.
- [google/veo-3-1](https://app.inference.sh/apps/google/veo-3-1): Veo 3.1 via Vertex AI - Advanced video generation with frame interpolation, reference images, and audio generation
- [google/veo-3-fast](https://app.inference.sh/apps/google/veo-3-fast): Veo 3 Fast via Vertex AI - Fast video generation with audio from text prompts and images

## Landing Pages

Focused pages explaining key concepts and capabilities:

- [AI Agent Runtime](https://inference.sh/ai-agent-runtime): The production infrastructure layer for AI agents
- [Durable Execution](https://inference.sh/durable-execution): Agents that survive failures with persistent state and full traceability
- [Agent Observability](https://inference.sh/agent-observability): Real-time visibility into agent behavior and decisions
- [Human-in-the-Loop](https://inference.sh/human-in-the-loop): Approval gates for AI agent actions
- [Self-Hosted Agents](https://inference.sh/self-hosted-agents): Deploy the runtime in your VPC or on-prem
- [Realtime](https://inference.sh/realtime): Real-time streaming and live agent responses
- [Integrations](https://inference.sh/integrations): Pre-built connections to 150+ apps and services
- [x402](https://inference.sh/x402): Agentic payments and autonomous transactions
- [Creators](https://inference.sh/creators): Build and monetize AI apps on the platform
- [About](https://inference.sh/about): About inference.sh and the team
- [Trust](https://inference.sh/trust): The trust manifesto and our design principles

## Links

- [Website](https://inference.sh): Main landing page
- [Documentation](https://inference.sh/docs): Complete documentation
- [Blog](https://inference.sh/blog): Articles and tutorials
- [Apps](https://inference.sh/apps): Browse available AI apps and tools
- [GitHub](https://github.com/inference-sh): Source code and examples
- [Python SDK](https://pypi.org/project/inferencesh/): pip install inferencesh
- [npm SDK](https://www.npmjs.com/package/inferencesh): npm install inferencesh

## Optional

- [llms-full.txt](https://inference.sh/llms-full.txt): Complete documentation and blog content for deeper context