
Extending Apps

Build and deploy your own tools.

The Grid is open for extension. If you need something that doesn't exist, create it.


Overview

Apps are Python scripts with:

  • Typed inputs — What the app accepts
  • Typed outputs — What it returns
  • Lifecycle methods — Setup, run, cleanup

The CLI helps you create, test, and deploy.


Quick Start

Install the CLI

```bash
curl -fsSL https://cli.inference.sh | sh
```

Login

```bash
infsh login
```

Create an App

```bash
mkdir my-app && cd my-app
infsh init
```

Answer the prompts:

  • Name: my-app
  • Description: "What my app does"
  • Category: image, text, audio, etc.
  • Python version: 3.10, 3.11, or 3.12
  • GPU required: Yes or No

This creates:

```
my-app/
  inf.yml           # Configuration
  inference.py      # Your code
  requirements.txt  # Dependencies
  ...
```

The App Template

inference.py

```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput, File
from pydantic import Field
from typing import Optional

class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="default", description="Output style")

class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: Optional[File] = Field(default=None, description="Generated image")

class App(BaseApp):
    async def setup(self, metadata):
        """Called once when the worker starts.
        Load models, initialize resources here."""
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        """Called for each request.
        Process input and return output."""

        # Access inputs
        prompt = input_data.prompt
        style = input_data.style

        # Log progress (visible in real-time)
        metadata.log("Processing...")

        # Do your work
        result = f"Generated: {prompt} in {style} style"

        # Return output
        return AppOutput(result=result)

    async def unload(self):
        """Called on shutdown. Cleanup resources."""
        pass
```

Input/Output Types

| Type | Example | Description |
|------|---------|-------------|
| `str` | `"hello"` | Text |
| `int` | `42` | Integer |
| `float` | `3.14` | Decimal |
| `bool` | `true` | Boolean |
| `List[T]` | `[1, 2, 3]` | Array |
| `Optional[T]` | `null` or value | Nullable |
| `File` | `{"uri": "..."}` | File reference |
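These annotations map directly onto an input model. A minimal sketch of the same types, using a plain dataclass as a stand-in for `BaseAppInput` so it runs without the SDK (field names here are illustrative, not part of the template):

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative stand-in for BaseAppInput; the annotations mirror the table above.
@dataclass
class ExampleInput:
    prompt: str                                    # str: text
    count: int = 1                                 # int: integer
    strength: float = 0.5                          # float: decimal
    upscale: bool = False                          # bool: boolean
    tags: List[str] = field(default_factory=list)  # List[T]: array
    seed: Optional[int] = None                     # Optional[T]: nullable

inp = ExampleInput(prompt="hello", tags=["art"])
```

In the real template you would declare the same fields with pydantic's `Field`, as shown in `inference.py` above.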

Working with Files

Input files are downloaded automatically:

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # File is ready to use
    image_path = input_data.image.path

    with open(image_path, "rb") as f:
        data = f.read()
```

Output files are uploaded automatically:

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Write to a file
    output_path = "/tmp/result.png"
    save_image(output_path)

    # Return it
    return AppOutput(image=File(path=output_path))
```

Configuration

inf.yml

```yaml
name: my-app
description: What my app does
category: image

resources:
  gpu:
    count: 1
    vram: 8000       # MB
    type: nvidia
  ram: 16000         # MB

python: "3.11"
```

Setup Parameters

Use the AppSetup class in Python to define runtime configuration (e.g. model selection). See Setup Parameters.

Dependencies

requirements.txt — Python packages:

```
torch>=2.0.0
transformers>=4.30.0
pillow>=9.0.0
```

packages.txt — System packages (apt):

```
ffmpeg
libgl1-mesa-glx
```

Requirements

If your app needs external API keys or integrations, declare them:

```yaml
# inf.yml
requirements:
  # Environment secrets (user provides their own keys)
  secrets:
    - key: OPENAI_API_KEY
      description: For GPT-4 API calls

    - key: REPLICATE_API_TOKEN
      description: For model inference
      optional: true

  # Integrations (managed OAuth connections)
  integrations:
    - key: google.sheets
      description: Read/write spreadsheets

    - key: x.tweet.write
      description: Post tweets
      optional: true
```

Required items must be configured before the app runs.

Optional items won't block execution if missing.
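Inside the app, declared secrets can then be read at runtime. A sketch under the assumption that secrets surface as environment variables (check the Secrets & Integrations docs for the actual mechanism); `get_secret` is a hypothetical helper, not part of the SDK:

```python
import os
from typing import Optional

def get_secret(key: str, optional: bool = False) -> Optional[str]:
    """Read a declared secret.

    Assumption: secrets declared in inf.yml are exposed to the app
    as environment variables under the same key.
    """
    value = os.environ.get(key)
    if value is None and not optional:
        raise RuntimeError(f"required secret {key} is not configured")
    return value
```

This mirrors the required/optional split above: a missing required secret fails loudly, while a missing optional one just returns `None`.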

Learn more about Secrets & Integrations


Deploy

```bash
infsh deploy
```

The CLI:

  1. Validates your app
  2. Installs dependencies locally
  3. Tests the structure
  4. Packages everything
  5. Uploads to inference.sh

Output:

```
         deployment summary

 final status: success
 duration: 45s
 user: yourname
 app: my-app
 link: https://app.inference.sh/...
```

Testing Locally

Generate Example Input

```bash
infsh run --save-example
# Creates input.json
```
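For the template above, the generated `input.json` would contain one value per `AppInput` field (the prompt value here is a placeholder):

```json
{
  "prompt": "a lighthouse at dusk",
  "style": "default"
}
```

Edit the values by hand between runs to test different inputs.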

Run Locally

```bash
infsh run --input-file input.json
```

Iterate Quickly

```bash
# Skip validation for faster deploys
infsh deploy --skip-checks
```

AI-Friendly Development

The CLI creates a CLAUDE.md file with instructions for AI assistants. This means you can:

  1. Describe what you want to an AI (ChatGPT, Claude, etc.)
  2. Share your app structure
  3. Get working code
  4. Deploy with infsh deploy

The template and structure are designed for AI code generation.


Editing in the Workspace

After deploying, edit settings in the web interface:

  • Setup Parameters — Configure initialization defaults
  • Images — Card, thumbnail, banner
  • Description — Help others find your app
  • Visibility — Public or private

These changes don't require the CLI.


Best Practices

Keep Apps Focused

```
Don't: One app that does everything
Do:    Single-purpose apps that compose well
```

Download Models in Setup

```python
async def setup(self, metadata):
    # Good: Download on first run, cached afterward
    from huggingface_hub import hf_hub_download
    self.model_path = hf_hub_download("model-name", "model.bin")
```

Don't bundle large models — they're cached after first download.

Log Progress

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    metadata.log("Step 1: Loading image...")
    # ...
    metadata.log("Step 2: Processing...")
    # ...
    metadata.log("Step 3: Saving result...")
```

Users see these in real-time.

Handle Errors Gracefully

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    if not input_data.image.exists():
        raise ValueError("Image file not found")

    try:
        result = process(input_data)
    except ProcessingError as e:
        metadata.log(f"Error: {e}")
        raise RuntimeError(f"Processing failed: {e}")
```
