documentation

everything you need to build production AI agents

quick start

new to inference.sh? start here to get up and running in minutes.

get started →

all pages

What is inference.sh?

inference.sh is an AI workspace where agents can actually do things.

Workspace Tour

A quick look at what you'll find in the workspace.

Your First Agent

Create an agent in 5 minutes.

Concepts Overview

Everything in inference.sh connects to help you get work done with AI.

Apps

Apps are tools that do one thing well.

Tasks

A task is what happens when you run an app.

Flows

Flows chain multiple apps together.

Agents

Agents are AI assistants that use tools.

Workers

Workers are computers that run your tasks.

Agents Overview

Create AI assistants that plan, use tools, and get things done.

Creating an Agent

Step-by-step guide to creating an agent.

System Prompts

The system prompt defines your agent's personality and behavior.

Adding Tools

Tools extend what your agent can do. There are three types of tools.

Sub-Agents

Agents can delegate to other agents. Sub-agents are just another type of tool.

Chatting

How to interact with your agent.

Webhooks

Connect your agent to external services via webhooks.

Apps Overview

Tools that do one thing well.

Browsing the Grid

Find apps to run or add to agents.

Running an App

How to run an app from the workspace.

Variants

Different configurations of the same app.

Flows Overview

Visual workflows that chain apps together.

Creating a Flow

Build a multi-step workflow visually.

Connecting Nodes

Wire data between apps in your flow.

Deploying as App

Turn your flow into a reusable app.

Extending Apps

Build and deploy your own tools.

CLI Setup

Install the command-line tool for creating apps.

Creating an App

Scaffold a new app with the CLI.

App Code

The inference.py file is your app's logic.

Configuration

The inf.yml file defines app settings and resource requirements.

Deploying

Push your app to inference.sh.

API & SDK Overview

Programmatic access for developers.

Authentication

How to authenticate with the API.

Running Apps

Call apps programmatically.

Streaming

Watch task progress in real-time.

Files

Upload and download files with the API.

Why Private Workers?

Reasons to run on your own infrastructure.

Installing the Engine

Set up private workers on your hardware.

Configuration

Configure workers on your engine.

Using Private Workers

Run tasks on your own hardware.

Image Generation

Generate images with AI models.

Audio Transcription

Convert speech to text with Whisper.

Content Pipeline

Build a flow that creates social media content from a prompt.

Data Processing

Process and analyze data with AI.

Multi-Agent System

Build agents that work together.

Introduction

> inference.sh is an AI workspace where agents can actually do things. Put agents in charge of the busywork—they plan steps, pick the right tools or flows, and even generate small UI widgets when th...

Secrets Overview

Secrets and integrations for connecting external services.

Environment Variables

Store encrypted API keys and credentials.

Integrations Overview

Managed connections to external services.

Google Service Account

Connect Google Sheets, Docs, and Drive using a service account.

Google OAuth

Connect your personal Google account for Gmail, Calendar, and Drive.

X.com Integration

Connect X.com (formerly Twitter) to your agents.

Slack Integration

Connect Slack to your agents for messaging, channels, and events.

Discord Integration

Connect Discord to your agents for messaging, events, and bot interactions.

X.com Integration

Build apps that interact with X.com (Twitter).

Python SDK

Python SDK for inference.sh agents and apps.

JavaScript SDK

JavaScript/TypeScript SDK for inference.sh agents and apps.

Tool Builder

Fluent API for defining agent tools.

Google Cloud Platform

Connect your own GCP project to use Vertex AI, BigQuery, Cloud Storage, and other GCP services with your own quotas and billing.

What is a Runtime?

inference.sh is an agent runtime: the infrastructure layer that executes your agent code, with built-in solutions for the hard operational problems.

Durable Execution

durable execution means your agent's state checkpoints after each step. if a connection drops, a process restarts, or a tool times out, execution resumes from the last checkpoint instead of starting o...

Observability

every agent running on inference.sh is automatically traced. no configuration, no instrumentation code, no separate products.

Human-in-the-Loop

human-in-the-loop lets you add approval gates to agent actions with one flag. the agent pauses, shows what it wants to do, and waits for confirmation before executing.

Tool Orchestration

inference.sh provides 150+ pre-built integrations as tools for your agents. oauth flows, token refresh, and credential management are handled by the platform.

Setup Parameters

Setup parameters configure an app's initialization steps. Modifying these parameters triggers a full restart (re-running setup()).

Multi-Function App

Apps can define multiple entry points to support different modes of operation within the same container context. This reduces cold starts and resource duplication.

Graceful Cancellation

Apps typically process requests until completion. However, users may cancel long-running tasks. Implementing graceful cancellation ensures resources are released immediately and partial results are ha...

Secrets

Secrets allow your app to securely access API keys and sensitive values. Users provide their own secrets which are encrypted and injected at runtime.

Integrations

Integrations allow your app to access external services (Google Sheets, Drive, etc.) on behalf of users through OAuth.

Output Metadata

OutputMeta enables usage-based pricing by reporting what your app processes and generates.

Best Practices

Performance optimization and coding patterns for inference.sh apps.

Troubleshooting

Common issues and solutions for inference.sh apps.

Widgets Overview

Learn how to design widgets in your agent experience.

Widget Schema

Complete type definitions for the widget system.

Widget Actions

Trigger actions on the backend from user interactions in your widgets.

Card

A bounded container for widgets. The Card is the primary container for all widget content.

Text

Displays plain text with optional styling variants.

Badge

A small label for status or metadata.

Row

Arranges children horizontally with configurable gap.

Col

Arranges children vertically with configurable gap.

Button

A flexible action button that triggers widget actions.

Input

Text input field for collecting user input.

Select

Dropdown single-select input for choosing from predefined options.

Checkbox

Binary selection control for boolean values.

Image

Displays an image with optional alt text.

Markdown

Renders markdown-formatted text with full styling support.

Box

Flexible container for layout and surface styling with full control.

Spacer

Flexible space to separate content within a layout.

Divider

Separate content with a thin line.

Form

Layout container optimized for form controls and submission.

Title

Section headings with scalable sizes and weights.

Caption

Supplemental text for descriptions, hints, or metadata.

Label

Accessible label for a form field.

Textarea

Multi-line text input control for longer form content.

RadioGroup

Choose a single option from a set of mutually exclusive options.

DatePicker

Select a date from a calendar popover.

Icon

Visual glyphs for actions and status indicators.

Chart

Render simple bar, line, and area charts from tabular data.
