inference.sh

API & SDK Overview

Programmatic access for developers.

Everything in the workspace is available via API. Run apps, execute flows, and interact with agents from your own code.

For direct HTTP access, see REST API overview — including X-API-Version: 2, RFC 9457 errors, and the OpenAPI route catalog at GET /openapi.json.


Quick Start

1. Get an API Key

  1. Go to Settings → API Keys
  2. Click Create API Key
  3. Copy your key (starts with inf_)

2. Install an SDK

    pip install inferencesh

    # With async support
    pip install inferencesh[async]

3. Run Your First App

    from inferencesh import inference

    client = inference(api_key="inf_your_key")

    result = client.run({
        "app": "infsh/echo",
        "input": {"message": "Hello from the API!"}
    })

    print(result["output"])

What's Available?

Building Apps

Extend overview — CLI setup, app code, deploy, and related guides

SDK Reference

Full SDK documentation with tabbed Python/JavaScript examples.

Agent SDK

Build headless AI agents with tools and multi-turn conversations.

REST API

Use any language with the REST API.

  • REST Overview — Base URL, X-API-Version: 2, errors, OpenAPI catalog
  • Entitlements — Plan limits and feature flags (GET /entitlements)
  • Tasks — Run apps, status, webhooks, cancellation
  • Flow runs — Execute saved flows programmatically
  • Apps — App metadata, versions, set active version
  • Agents — Chat sessions and structured output
  • Search — Discover apps, skills, and knowledge
  • Instance types — Public GPU catalog and hourly pricing
  • Skills · Knowledge — Registry and knowledge entries
  • Streaming — SSE for tasks and agent chats
  • Billing · Subscription — Balance top-ups and plan management

API version

Official SDKs and the belt/infsh CLIs send X-API-Version: 2 for bare JSON responses and RFC 9457 errors. For curl and custom HTTP clients, add the same header or use the legacy wrapped format. See REST API version.
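
For a custom HTTP client, the header setup and error parsing described here might look like the sketch below, which uses only the Python standard library. The base URL, the Authorization: Bearer scheme, the helper names, and the sample error body are assumptions for illustration and are not taken from this page. The problem-detail member names (type, title, status, detail, instance) come from RFC 9457 itself.

```python
import json
import urllib.request
from dataclasses import dataclass, field

# Placeholder only: the real base URL is documented in the REST overview.
BASE_URL = "https://api.example.invalid"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that opts into API version 2."""
    headers = {
        "X-API-Version": "2",  # bare JSON responses and RFC 9457 errors
        # Assumption: bearer-token auth; check the authentication guide.
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json, application/problem+json",
    }
    return urllib.request.Request(BASE_URL + path, headers=headers, method="GET")


@dataclass
class Problem:
    """Standard RFC 9457 problem-detail members, plus any extension members."""
    type: str = "about:blank"
    title: str = ""
    status: int = 0
    detail: str = ""
    instance: str = ""
    extensions: dict = field(default_factory=dict)


def parse_problem(body: bytes) -> Problem:
    """Parse an application/problem+json body into a Problem."""
    data = json.loads(body)
    known = {"type", "title", "status", "detail", "instance"}
    core = {k: v for k, v in data.items() if k in known}
    extra = {k: v for k, v in data.items() if k not in known}
    return Problem(**core, extensions=extra)


# Hypothetical usage: nothing is sent over the network here.
req = build_request("/openapi.json", "inf_your_key")
sample = b'{"title": "Not Found", "status": 404, "detail": "No such task"}'
problem = parse_problem(sample)
print(req.get_method(), req.full_url)
print(problem.status, problem.title)
```

Keeping request construction separate from sending makes the header logic easy to check without network access. Members the server adds beyond the five standard ones land in extensions.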


Authentication

All API calls require an API key in the header:

    import os

    from inferencesh import inference

    client = inference(api_key="inf_your_key")
    # Or read the key from an environment variable:
    # client = inference(api_key=os.environ["INFERENCE_API_KEY"])

Full Authentication Guide
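
Reading the key from the environment keeps it out of source control. A minimal sketch of a loader that fails early on a missing or malformed key follows. The helper name load_api_key is hypothetical; the INFERENCE_API_KEY variable and the inf_ prefix are the ones mentioned on this page.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key from INFERENCE_API_KEY, or raise if it is missing or malformed."""
    key = env.get("INFERENCE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INFERENCE_API_KEY is not set")
    if not key.startswith("inf_"):
        raise RuntimeError("INFERENCE_API_KEY does not start with 'inf_'")
    return key


# Hypothetical usage with the SDK client shown above:
# client = inference(api_key=load_api_key())
```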


Task Status Codes

Numeric status codes are returned on task objects and webhook payloads. Common values:

Status      Code   Meaning
Queued      2      Waiting for worker
Running     7      Executing
Completed   10     Done
Failed      11     Error occurred
Cancelled   12     Cancelled

Full status code list
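
In client code, the common values above can be given names so that comparisons stay readable. The sketch below is one way to do that, not part of the SDK: the TaskStatus enum and both helpers are hypothetical. Treating Completed, Failed, and Cancelled as terminal, and any unlisted code as non-terminal, is an assumption, since the table covers only common values.

```python
from enum import IntEnum


class TaskStatus(IntEnum):
    """Common task status codes from the table above (not the full list)."""
    QUEUED = 2
    RUNNING = 7
    COMPLETED = 10
    FAILED = 11
    CANCELLED = 12


# Assumption: these three states are final and the task will not change again.
TERMINAL = frozenset(
    {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELLED}
)


def describe(code: int) -> str:
    """Return a lowercase name for a known code, or 'unknown(<code>)'."""
    try:
        return TaskStatus(code).name.lower()
    except ValueError:
        return f"unknown({code})"


def is_terminal(code: int) -> bool:
    """True if the code is one of the assumed terminal states."""
    return code in TERMINAL


print(describe(7), is_terminal(7))
print(describe(10), is_terminal(10))
```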


Rate Limits

Endpoint      Limit
Run task      100/minute
Get task      1000/minute
Upload file   50/minute
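
To stay under these limits, a client can throttle itself before sending requests. The sketch below is a simple sliding-window limiter using only the standard library. The class, the LIMITS keys, and the choice of a sliding 60-second window are assumptions; this page does not say how the server measures a minute.

```python
import time
from collections import deque
from typing import Callable

# Per-minute limits from the table above; the key names are hypothetical.
LIMITS = {"run_task": 100, "get_task": 1000, "upload_file": 50}


class SlidingWindowLimiter:
    """Allow at most max_calls calls within any window of `period` seconds."""

    def __init__(self, max_calls: int, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def try_acquire(self) -> bool:
        """Record a call and return True if allowed; otherwise return False."""
        if self.wait_time() > 0:
            return False
        self._calls.append(self.clock())
        return True

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        while (delay := self.wait_time()) > 0:
            time.sleep(delay)
        self._calls.append(self.clock())


# Hypothetical usage: call run_limiter.acquire() before each task submission.
run_limiter = SlidingWindowLimiter(LIMITS["run_task"])
print(run_limiter.try_acquire())
```

The clock is injectable so the limiter can be exercised in tests without real waiting. A client-side limiter only reduces the chance of hitting the server limit, so rate-limit error responses still need handling.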
