# inference.sh

> The AI agent runtime. Build agents that can actually do things, with durable execution, human-in-the-loop approval, and real-time observability.

inference.sh is an open-source agent runtime for deploying production AI agents. Unlike agent frameworks that just orchestrate LLM calls, inference.sh provides the infrastructure agents need: durable execution that survives failures, human-in-the-loop approval for critical actions, real-time streaming and observability, and tools that actually work.

Key differentiators from LangChain/LangGraph:

- **Durable execution**: Every tool call persisted, graph-backed visibility into agent decisions
- **Human-in-the-loop**: One flag to require approval before tool execution
- **Real tools**: Pre-built integrations that actually work, not just wrappers
- **Observability built-in**: Every agent call is logged, traced, and streamable

## Docs

### getting started

- [introduction](https://inference.sh/docs/getting-started/introduction.md)
- [what is inference.sh?](https://inference.sh/docs/getting-started/what-is-inference.md)
- [workspace tour](https://inference.sh/docs/getting-started/workspace-tour.md)
- [your first agent](https://inference.sh/docs/getting-started/your-first-agent.md)

### runtime

- [what is a runtime?](https://inference.sh/docs/runtime/overview.md)
- [durable execution](https://inference.sh/docs/runtime/durable-execution.md)
- [observability](https://inference.sh/docs/runtime/observability.md)
- [human-in-the-loop](https://inference.sh/docs/runtime/human-in-the-loop.md)
- [tool orchestration](https://inference.sh/docs/runtime/tool-orchestration.md)

### concepts

- [apps](https://inference.sh/docs/concepts/apps.md)
- [tasks](https://inference.sh/docs/concepts/tasks.md)
- [flows](https://inference.sh/docs/concepts/flows.md)
- [agents](https://inference.sh/docs/concepts/agents.md)
- [workers](https://inference.sh/docs/concepts/workers.md)
- [sessions](https://inference.sh/docs/concepts/sessions.md)

### agents

- [overview](https://inference.sh/docs/agents/overview.md)
- [creating an agent](https://inference.sh/docs/agents/creating.md)
- [system prompts](https://inference.sh/docs/agents/system-prompts.md)
- [skills](https://inference.sh/docs/agents/skills.md)
- [adding tools](https://inference.sh/docs/agents/adding-tools.md)
- [sub-agents](https://inference.sh/docs/agents/sub-agents.md)
- [chatting](https://inference.sh/docs/agents/chatting.md)
- [webhooks](https://inference.sh/docs/agents/webhooks.md)

**widgets**

- [overview](https://inference.sh/docs/agents/widgets/overview.md)
- [schema](https://inference.sh/docs/agents/widgets/schema.md)
- [actions](https://inference.sh/docs/agents/widgets/actions.md)
- [card](https://inference.sh/docs/agents/widgets/card.md)
- [box](https://inference.sh/docs/agents/widgets/box.md)
- [row](https://inference.sh/docs/agents/widgets/row.md)
- [col](https://inference.sh/docs/agents/widgets/col.md)
- [spacer](https://inference.sh/docs/agents/widgets/spacer.md)
- [divider](https://inference.sh/docs/agents/widgets/divider.md)
- [form](https://inference.sh/docs/agents/widgets/form.md)
- [text](https://inference.sh/docs/agents/widgets/text.md)
- [title](https://inference.sh/docs/agents/widgets/title.md)
- [caption](https://inference.sh/docs/agents/widgets/caption.md)
- [label](https://inference.sh/docs/agents/widgets/label.md)
- [markdown](https://inference.sh/docs/agents/widgets/markdown.md)
- [button](https://inference.sh/docs/agents/widgets/button.md)
- [input](https://inference.sh/docs/agents/widgets/input.md)
- [textarea](https://inference.sh/docs/agents/widgets/textarea.md)
- [select](https://inference.sh/docs/agents/widgets/select.md)
- [checkbox](https://inference.sh/docs/agents/widgets/checkbox.md)
- [radio-group](https://inference.sh/docs/agents/widgets/radio-group.md)
- [date-picker](https://inference.sh/docs/agents/widgets/date-picker.md)
- [image](https://inference.sh/docs/agents/widgets/image.md)
- [icon](https://inference.sh/docs/agents/widgets/icon.md)
- [badge](https://inference.sh/docs/agents/widgets/badge.md)
- [chart](https://inference.sh/docs/agents/widgets/chart.md)
- [transition](https://inference.sh/docs/agents/widgets/transition.md)

### apps

- [overview](https://inference.sh/docs/apps/overview.md)
- [browsing the grid](https://inference.sh/docs/apps/browsing-grid.md)
- [running an app](https://inference.sh/docs/apps/running.md)
- [setup parameters](https://inference.sh/docs/apps/setup-parameters.md)

### flows

- [overview](https://inference.sh/docs/flows/overview.md)
- [creating a flow](https://inference.sh/docs/flows/creating.md)
- [connecting nodes](https://inference.sh/docs/flows/connecting.md)
- [deploying as app](https://inference.sh/docs/flows/deploying.md)

### extend

- [overview](https://inference.sh/docs/extend/overview.md)
- [coding agents](https://inference.sh/docs/extend/coding-agents.md)
- [cli setup](https://inference.sh/docs/extend/cli-setup.md)
- [creating an app](https://inference.sh/docs/extend/creating-app.md)
- [app code](https://inference.sh/docs/extend/app-code.md)
- [configuration](https://inference.sh/docs/extend/configuration.md)
- [secrets](https://inference.sh/docs/extend/secrets.md)
- [integrations](https://inference.sh/docs/extend/integrations.md)
- [output meta](https://inference.sh/docs/extend/output-meta.md)
- [best practices](https://inference.sh/docs/extend/best-practices.md)
- [cancellation](https://inference.sh/docs/extend/cancellation.md)
- [troubleshooting](https://inference.sh/docs/extend/troubleshooting.md)
- [deploying](https://inference.sh/docs/extend/deploying.md)
- [multi-function apps](https://inference.sh/docs/extend/multi-function-apps.md)
- [sessions](https://inference.sh/docs/extend/sessions.md)

### api & sdk

- [overview](https://inference.sh/docs/api/overview.md)
- [authentication](https://inference.sh/docs/api/authentication.md)

**sdk**

- [overview](https://inference.sh/docs/api/sdk/overview.md)
- [running apps](https://inference.sh/docs/api/sdk/running-apps.md)
- [streaming](https://inference.sh/docs/api/sdk/streaming.md)
- [files](https://inference.sh/docs/api/sdk/files.md)

**server proxy**

- [overview](https://inference.sh/docs/api/sdk/server-proxy.md)
- [next.js](https://inference.sh/docs/api/sdk/proxy/nextjs.md)
- [express](https://inference.sh/docs/api/sdk/proxy/express.md)
- [hono](https://inference.sh/docs/api/sdk/proxy/hono.md)
- [remix](https://inference.sh/docs/api/sdk/proxy/remix.md)
- [sveltekit](https://inference.sh/docs/api/sdk/proxy/sveltekit.md)

**agent sdk**

- [overview](https://inference.sh/docs/api/agent/overview.md)
- [template agents](https://inference.sh/docs/api/agent/template.md)
- [ad-hoc agents](https://inference.sh/docs/api/agent/adhoc.md)
- [building tools](https://inference.sh/docs/api/agent/tools.md)
- [client tools](https://inference.sh/docs/api/agent/client-tools.md)
- [app tools](https://inference.sh/docs/api/agent/app-tools.md)
- [agent tools](https://inference.sh/docs/api/agent/agent-tools.md)
- [webhook tools](https://inference.sh/docs/api/agent/webhook-tools.md)
- [internal tools](https://inference.sh/docs/api/agent/internal-tools.md)
- [approval](https://inference.sh/docs/api/agent/approval.md)
- [streaming](https://inference.sh/docs/api/agent/streaming.md)

**rest api**

- [overview](https://inference.sh/docs/api/rest/overview.md)
- [tasks](https://inference.sh/docs/api/rest/tasks.md)
- [files](https://inference.sh/docs/api/rest/files.md)
- [agents](https://inference.sh/docs/api/rest/agents.md)
- [streaming](https://inference.sh/docs/api/rest/streaming.md)

### secrets

- [overview](https://inference.sh/docs/secrets/overview.md)
- [environment variables](https://inference.sh/docs/secrets/environment.md)

### integrations

- [overview](https://inference.sh/docs/integrations/overview.md)
- [google service account](https://inference.sh/docs/integrations/google-service-account.md)
- [google oauth](https://inference.sh/docs/integrations/google-oauth.md)
- [google cloud platform](https://inference.sh/docs/integrations/gcp.md)
- [slack](https://inference.sh/docs/integrations/slack.md)
- [discord](https://inference.sh/docs/integrations/discord.md)
- [x.com](https://inference.sh/docs/integrations/x.md)

### private workers

- [why private?](https://inference.sh/docs/private/why.md)
- [installing the engine](https://inference.sh/docs/private/installing.md)
- [configuration](https://inference.sh/docs/private/config.md)
- [using private workers](https://inference.sh/docs/private/using.md)

### examples

- [image generation](https://inference.sh/docs/examples/image-generation.md)
- [audio transcription](https://inference.sh/docs/examples/audio-transcription.md)
- [content pipeline](https://inference.sh/docs/examples/content-pipeline.md)
- [data processing](https://inference.sh/docs/examples/data-processing.md)
- [multi-agent system](https://inference.sh/docs/examples/multi-agent.md)
- [x.com integration](https://inference.sh/docs/examples/x-integration.md)
- [multi-function app](https://inference.sh/docs/examples/multi-function-app.md)
- [stateful sessions](https://inference.sh/docs/examples/stateful-sessions.md)

## Blog

### skills

- [agent skills overview](https://inference.sh/blog/skills/agent-skills-overview.md)

### concepts

- [workflows vs agents](https://inference.sh/blog/concepts/workflows-vs-agents.md)

### agent runtime

- [why agent runtimes matter](https://inference.sh/blog/agent-runtime/why-runtimes-matter.md)
- [durable execution for agents](https://inference.sh/blog/agent-runtime/durable-execution.md)
- [human-in-the-loop in one flag](https://inference.sh/blog/agent-runtime/human-in-the-loop.md)
- [agent memory without boilerplate](https://inference.sh/blog/agent-runtime/agent-memory.md)
- [the real cost of agent infrastructure](https://inference.sh/blog/agent-runtime/infrastructure-cost.md)

### multi-agent

- [hierarchical agent delegation](https://inference.sh/blog/multi-agent/hierarchical-delegation.md)
- [when to use multi-agent](https://inference.sh/blog/multi-agent/when-to-use.md)
- [concurrent agent execution](https://inference.sh/blog/multi-agent/concurrent-execution.md)

### observability

- [built-in observability](https://inference.sh/blog/observability/built-in.md)
- [debugging agents at 2am](https://inference.sh/blog/observability/debugging-guide.md)
- [real-time streaming](https://inference.sh/blog/observability/streaming.md)

### tools & execution

- [the tool integration tax](https://inference.sh/blog/tools/integration-tax.md)
- [sandboxed code execution](https://inference.sh/blog/tools/sandboxed-execution.md)
- [tool approval gates](https://inference.sh/blog/tools/approval-gates.md)
- [building custom apps](https://inference.sh/blog/tools/building-custom-apps.md)
- [client-side tools](https://inference.sh/blog/tools/client-side-tools.md)

### ux & interfaces

- [agents that generate UI](https://inference.sh/blog/ux/generative-ui.md)
- [agent ux patterns](https://inference.sh/blog/ux/agent-ux-patterns.md)

### guides

- [building a research agent](https://inference.sh/blog/guides/research-agent.md)
- [from demo to production](https://inference.sh/blog/guides/demo-to-production.md)
- [introducing ui.inference.sh](https://inference.sh/blog/guides/shadcn-registry.md)

## Quick Reference

### Agent Structure

```python
from inferencesh import Agent, Tool

agent = Agent(
    name="my-agent",
    model="claude-sonnet-4-20250514",
    system_prompt="You are a helpful assistant.",
    tools=[...],
    human_in_the_loop=True,  # Require approval for tool calls
)
```

### API Usage

```python
from inferencesh import inference

client = inference(api_key="inf_...")

# Run an agent
chat = client.agents.chat("my-agent")
response = chat.send("Help me with...")

# Stream responses
for chunk in chat.stream("What can you do?"):
    print(chunk.content, end='')
```

## Apps

Pre-built AI apps and tools available on the inference.sh
platform:

### Featured

- [google/veo-3-1-fast](https://app.inference.sh/apps/google/veo-3-1-fast): Veo 3.1 Fast via Vertex AI - Generate videos from text prompts or images with optional audio
- [infsh/hunyuanvideo-foley](https://app.inference.sh/apps/infsh/hunyuanvideo-foley): Synthesizes realistic sound effects and audio tracks based on your video content and written descriptions.

### All Apps

- [x/post-tweet](https://app.inference.sh/apps/x/post-tweet): Post tweets to X.com with text (280 char limit) and optional media. Supports up to 4 images or 1 video/GIF. Can reply to or quote other tweets. Images over 5MB are auto-resized.
- [x/dm-send](https://app.inference.sh/apps/x/dm-send): Send a direct message on X.com. Requires the recipient's user ID (not username). Text-only messages; media attachments are not supported.
- [x/user-follow](https://app.inference.sh/apps/x/user-follow): Follow a user on X.com by user ID. Succeeds silently if already following.
- [x/user-get](https://app.inference.sh/apps/x/user-get): Get a user profile from X.com by ID or username. Returns bio, follower/following counts, tweet count, verified status, and profile image URL.
- [x/post-retweet](https://app.inference.sh/apps/x/post-retweet): Retweet a post on X.com by post ID. Succeeds silently if already retweeted.
- [x/post-like](https://app.inference.sh/apps/x/post-like): Like a post on X.com by post ID. Succeeds silently if already liked.
- [x/post-delete](https://app.inference.sh/apps/x/post-delete): Delete a post from X.com by post ID. Can only delete posts authored by the authenticated account.
- [x/post-get](https://app.inference.sh/apps/x/post-get): Get a post by ID from X.com. Returns text, author ID, creation date, and engagement metrics (likes, retweets, replies, quotes).
- [x/post-create](https://app.inference.sh/apps/x/post-create): Create posts on X.com with text (280 char limit) and optional media. Supports up to 4 images or 1 video/GIF. Can reply to or quote other posts. Images over 5MB are auto-resized.
- [xai/grok-imagine-video](https://app.inference.sh/apps/xai/grok-imagine-video): Generate and edit videos using xAI's Grok Imagine Video model. Supports text-to-video, image-to-video, and video editing with configurable duration and resolution.
- [xai/grok-imagine-image](https://app.inference.sh/apps/xai/grok-imagine-image): Generate and edit images using xAI's Grok Imagine model. Supports text-to-image and image editing with multiple aspect ratios.
- [google/veo-3-1](https://app.inference.sh/apps/google/veo-3-1): Veo 3.1 via Vertex AI - Advanced video generation with frame interpolation, reference images, and audio generation
- [google/veo-3-fast](https://app.inference.sh/apps/google/veo-3-fast): Veo 3 Fast via Vertex AI - Fast video generation with audio from text prompts and images
- [google/veo-3](https://app.inference.sh/apps/google/veo-3): Veo 3 via Vertex AI - Generate videos with audio from text prompts and images
- [google/veo-2](https://app.inference.sh/apps/google/veo-2): Veo 2 via Vertex AI - Generate high-quality realistic videos from text prompts
- [falai/flux-dev-lora](https://app.inference.sh/apps/falai/flux-dev-lora): Text-to-image and image-to-image generation with FLUX.1 [dev] LoRA support. Custom style adaptation and fine-tuned model variations from Black Forest Labs.
- [falai/flux-2-klein-lora](https://app.inference.sh/apps/falai/flux-2-klein-lora): Text-to-image and image-to-image generation with FLUX.2 [klein] LoRA support. Available in 4B and 9B parameter sizes. Custom style adaptation and fine-tuned model variations from Black Forest Labs.
- [bytedance/omnihuman-1-5](https://app.inference.sh/apps/bytedance/omnihuman-1-5): Multi-character audio-driven avatar video generation. Takes a portrait image + audio and generates a video where the person speaks/sings in sync. Supports specifying which character to drive.
- [bytedance/omnihuman-1-0](https://app.inference.sh/apps/bytedance/omnihuman-1-0): Audio-driven avatar video generation. Takes a portrait image + audio and generates a video where the person speaks/sings in sync with the audio.
- [bytedance/seedream-3-0-t2i](https://app.inference.sh/apps/bytedance/seedream-3-0-t2i): Generate cinematic quality images from text prompts with accurate text rendering using ByteDance's Seedream 3.0 T2I model via BytePlus ARK API.

## Landing Pages

Focused pages explaining key concepts and capabilities:

- [AI Agent Runtime](https://inference.sh/ai-agent-runtime): The production infrastructure layer for AI agents
- [Durable Execution](https://inference.sh/durable-execution): Agents that survive failures with persistent state and full traceability
- [Agent Observability](https://inference.sh/agent-observability): Real-time visibility into agent behavior and decisions
- [Human-in-the-Loop](https://inference.sh/human-in-the-loop): Approval gates for AI agent actions
- [Self-Hosted Agents](https://inference.sh/self-hosted-agents): Deploy the runtime in your VPC or on-prem
- [Realtime](https://inference.sh/realtime): Real-time streaming and live agent responses
- [Integrations](https://inference.sh/integrations): Pre-built connections to 150+ apps and services
- [x402](https://inference.sh/x402): Agentic payments and autonomous transactions
- [Creators](https://inference.sh/creators): Build and monetize AI apps on the platform
- [About](https://inference.sh/about): About inference.sh and the team
- [Trust](https://inference.sh/trust): The trust manifesto and our design principles

## Links

- [Website](https://inference.sh): Main landing page
- [Documentation](https://inference.sh/docs): Complete documentation
- [Blog](https://inference.sh/blog): Articles and tutorials
- [Apps](https://inference.sh/apps): Browse available AI apps and tools
- [GitHub](https://github.com/inference-sh): Source code and examples
- [Python SDK](https://pypi.org/project/inferencesh/): pip install inferencesh
- [npm SDK](https://www.npmjs.com/package/inferencesh): npm install inferencesh

## Optional

- [llms-full.txt](https://inference.sh/llms-full.txt): Complete documentation and blog content for deeper context