
Extending Apps

Build and deploy your own tools.

The Grid is open for extension. If you need something that doesn't exist, create it.


Overview

Apps are Python scripts with:

  • Typed inputs — What the app accepts
  • Typed outputs — What it returns
  • Lifecycle methods — Setup, run, cleanup

The CLI helps you create, test, and deploy.


Quick Start

Install the CLI

```bash
curl -fsSL https://cli.inference.sh | sh
```

Login

```bash
infsh login
```

Create an App

```bash
mkdir my-app && cd my-app
infsh init
```

Answer the prompts:

  • Name: my-app
  • Description: "What my app does"
  • Category: image, text, audio, etc.
  • Python version: 3.10, 3.11, or 3.12
  • GPU required: Yes or No

This creates:

```
my-app/
  inf.yml           # Configuration
  inference.py      # Your code
  requirements.txt  # Dependencies
  ...
```

The App Template

inference.py

```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput, File
from pydantic import Field
from typing import Optional

class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="default", description="Output style")

class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: Optional[File] = Field(default=None, description="Generated image")

class App(BaseApp):
    async def setup(self, metadata):
        """Called once when the worker starts.
        Load models, initialize resources here."""
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        """Called for each request.
        Process input and return output."""

        # Access inputs
        prompt = input_data.prompt
        style = input_data.style

        # Log progress (visible in real-time)
        metadata.log("Processing...")

        # Do your work
        result = f"Generated: {prompt} in {style} style"

        # Return output
        return AppOutput(result=result)

    async def unload(self):
        """Called on shutdown. Cleanup resources."""
        pass
```

Input/Output Types

| Type | Example | Description |
|------|---------|-------------|
| `str` | `"hello"` | Text |
| `int` | `42` | Integer |
| `float` | `3.14` | Decimal |
| `bool` | `true` | Boolean |
| `List[T]` | `[1, 2, 3]` | Array |
| `Optional[T]` | `null` or value | Nullable |
| `File` | `{"uri": "..."}` | File reference |
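These annotations map directly onto an input model. A minimal sketch of the same types, using a plain dataclass as a stand-in for `BaseAppInput` so it runs without the SDK (field names here are illustrative, not part of the template):

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative stand-in for BaseAppInput; the annotations mirror the table above.
@dataclass
class ExampleInput:
    prompt: str                                    # str: text
    count: int = 1                                 # int: integer
    strength: float = 0.5                          # float: decimal
    upscale: bool = False                          # bool: boolean
    tags: List[str] = field(default_factory=list)  # List[T]: array
    seed: Optional[int] = None                     # Optional[T]: nullable

inp = ExampleInput(prompt="hello", tags=["art"])
```

In the real template you would declare the same fields with pydantic's `Field`, as shown in `inference.py` above.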

Working with Files

Input files are downloaded automatically:

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # File is ready to use
    image_path = input_data.image.path

    with open(image_path, "rb") as f:
        data = f.read()
```

Output files are uploaded automatically:

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Write to a file
    output_path = "/tmp/result.png"
    save_image(output_path)

    # Return it
    return AppOutput(image=File(path=output_path))
```

Configuration

inf.yml

```yaml
name: my-app
description: What my app does
category: image

resources:
  gpu:
    count: 1
    vram: 8000       # MB
    type: nvidia
  ram: 16000         # MB

python: "3.11"
```

Setup Parameters

Use the AppSetup class in Python to define runtime configuration (e.g. model selection). See Setup Parameters.

Dependencies

requirements.txt — Python packages:

```
torch>=2.0.0
transformers>=4.30.0
pillow>=9.0.0
```

packages.txt — System packages (apt):

```
ffmpeg
libgl1-mesa-glx
```

Requirements

If your app needs external API keys or integrations, declare them:

```yaml
# inf.yml
requirements:
  # Environment secrets (user provides their own keys)
  secrets:
    - key: OPENAI_API_KEY
      description: For GPT-4 API calls

    - key: REPLICATE_API_TOKEN
      description: For model inference
      optional: true

  # Integrations (managed OAuth connections)
  integrations:
    - key: google.sheets
      description: Read/write spreadsheets

    - key: x.tweet.write
      description: Post tweets
      optional: true
```

Required items must be configured before the app runs.

Optional items won't block execution if missing.
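Inside the app, declared secrets can then be read at runtime. A sketch under the assumption that secrets surface as environment variables (check the Secrets & Integrations docs for the actual mechanism); `get_secret` is a hypothetical helper, not part of the SDK:

```python
import os
from typing import Optional

def get_secret(key: str, optional: bool = False) -> Optional[str]:
    """Read a declared secret.

    Assumption: secrets declared in inf.yml are exposed to the app
    as environment variables under the same key.
    """
    value = os.environ.get(key)
    if value is None and not optional:
        raise RuntimeError(f"required secret {key} is not configured")
    return value
```

This mirrors the required/optional split above: a missing required secret fails loudly, while a missing optional one just returns `None`.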

Learn more about Secrets & Integrations


Deploy

```bash
infsh deploy
```

The CLI:

  1. Validates your app
  2. Installs dependencies locally
  3. Tests the structure
  4. Packages everything
  5. Uploads to inference.sh

Output:

```
         deployment summary

 final status: success
 duration: 45s
 user: yourname
 app: my-app
 link: https://app.inference.sh/...
```

Testing Locally

Generate Example Input

```bash
infsh run --save-example
# Creates input.json
```
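For the template above, the generated `input.json` would contain one value per `AppInput` field (the prompt value here is a placeholder):

```json
{
  "prompt": "a lighthouse at dusk",
  "style": "default"
}
```

Edit the values by hand between runs to test different inputs.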

Run Locally

```bash
infsh run --input-file input.json
```

Iterate Quickly

```bash
# Skip validation for faster deploys
infsh deploy --skip-checks
```

AI-Friendly Development

The CLI creates a CLAUDE.md file with instructions for AI assistants. This means you can:

  1. Describe what you want to an AI (ChatGPT, Claude, etc.)
  2. Share your app structure
  3. Get working code
  4. Deploy with infsh deploy

The template and structure are designed for AI code generation.


Editing in the Workspace

After deploying, edit settings in the web interface:

  • Setup Parameters — Configure initialization defaults
  • Images — Card, thumbnail, banner
  • Description — Help others find your app
  • Visibility — Public or private

These changes don't require the CLI.


Best Practices

Keep Apps Focused

```
Don't: One app that does everything
Do:    Single-purpose apps that compose well
```

Download Models in Setup

```python
async def setup(self, metadata):
    # Good: Download on first run, cached afterward
    from huggingface_hub import hf_hub_download
    self.model_path = hf_hub_download("model-name", "model.bin")
```

Don't bundle large models — they're cached after first download.

Log Progress

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    metadata.log("Step 1: Loading image...")
    # ...
    metadata.log("Step 2: Processing...")
    # ...
    metadata.log("Step 3: Saving result...")
```

Users see these in real-time.

Handle Errors Gracefully

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    if not input_data.image.exists():
        raise ValueError("Image file not found")

    try:
        result = process(input_data)
    except ProcessingError as e:
        metadata.log(f"Error: {e}")
        raise RuntimeError(f"Processing failed: {e}")
```
