
Extending Apps

Build and deploy your own tools.

The Grid is open for extension. If you need something that doesn't exist, create it.


Overview

Apps are Python scripts with:

  • Typed inputs — What the app accepts
  • Typed outputs — What it returns
  • Lifecycle methods — Setup, run, cleanup

The CLI helps you create, test, and deploy.


Quick Start

Install the CLI

bash
curl -fsSL https://cli.inference.sh | sh

Login

bash
infsh login

Create an App

bash
mkdir my-app && cd my-app
infsh init

Answer the prompts:

  • Name: my-app
  • Description: "What my app does"
  • Category: image, text, audio, etc.
  • Python version: 3.10, 3.11, or 3.12
  • GPU required: Yes or No

This creates:

code
my-app/
  inf.yml           # Configuration
  inference.py      # Your code
  requirements.txt  # Dependencies
  ...

The App Template

inference.py

python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput, File
from pydantic import Field
from typing import Optional

class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="default", description="Output style")

class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: Optional[File] = Field(default=None, description="Generated image")

class App(BaseApp):
    async def setup(self, metadata):
        """Called once when the worker starts.
        Load models, initialize resources here."""
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        """Called for each request.
        Process input and return output."""

        # Access inputs
        prompt = input_data.prompt
        style = input_data.style

        # Log progress (visible in real-time)
        metadata.log("Processing...")

        # Do your work
        result = f"Generated: {prompt} in {style} style"

        # Return output
        return AppOutput(result=result)

    async def unload(self):
        """Called on shutdown. Cleanup resources."""
        pass

Input/Output Types

Type        | Example        | Description
str         | "hello"        | Text
int         | 42             | Integer
float       | 3.14           | Decimal
bool        | true           | Boolean
List[T]     | [1, 2, 3]      | Array
Optional[T] | null or value  | Nullable
File        | {"uri": "..."} | File reference

Working with Files

Input files are downloaded automatically:

python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # File is ready to use
    image_path = input_data.image.path

    with open(image_path, "rb") as f:
        data = f.read()

Output files are uploaded automatically:

python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Write to a file
    output_path = "/tmp/result.png"
    save_image(output_path)

    # Return it
    return AppOutput(image=File(path=output_path))

Configuration

inf.yml

yaml
name: my-app
description: What my app does
category: image

resources:
  gpu:
    count: 1
    vram: 8000       # MB
    type: nvidia
  ram: 16000         # MB

python: "3.11"

Setup Parameters

Use the AppSetup class in Python to define runtime configuration (e.g. model selection), as sketched below. See Setup Parameters.
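
A minimal sketch of the declaration, assuming AppSetup follows the same pydantic Field pattern as AppInput (the import path and exact API are documented on the Setup Parameters page):

python
from inferencesh import AppSetup  # import path assumed
from pydantic import Field

class AppSetupParams(AppSetup):
    # Hypothetical field: lets users pick a model variant before the worker starts
    model_variant: str = Field(default="base", description="Model variant to load in setup()")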

Dependencies

requirements.txt — Python packages:

code
torch>=2.0.0
transformers>=4.30.0
pillow>=9.0.0

packages.txt — System packages (apt):

code
ffmpeg
libgl1-mesa-glx

Requirements

If your app needs external API keys or integrations, declare them:

yaml
# inf.yml
requirements:
  # Environment secrets (user provides their own keys)
  secrets:
    - key: OPENAI_API_KEY
      description: For GPT-4 API calls

    - key: REPLICATE_API_TOKEN
      description: For model inference
      optional: true

  # Integrations (managed OAuth connections)
  integrations:
    - key: google.sheets
      description: Read/write spreadsheets

    - key: x.tweet.write
      description: Post tweets
      optional: true

Required items must be configured before the app runs.

Optional items won't block execution if missing.
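
Inside the app, a sketch of reading these values, assuming declared secrets are injected as environment variables under their declared keys:

python
import os

# Assumption: secrets declared in inf.yml are exposed as environment variables
api_key = os.environ["OPENAI_API_KEY"]              # required: configured before run
replicate_token = os.getenv("REPLICATE_API_TOKEN")  # optional: may be None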

Learn more about Secrets & Integrations


Multi-Function Apps

By default, apps expose a single run method. You can define multiple functions to handle different operations in one app:

python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput, File
from pydantic import Field

class UpscaleInput(BaseAppInput):
    image: File = Field(description="Image to upscale")
    scale: int = Field(default=2, description="Scale factor")

class UpscaleOutput(BaseAppOutput):
    image: File = Field(description="Upscaled image")

class RemoveBgInput(BaseAppInput):
    image: File = Field(description="Image to process")

class RemoveBgOutput(BaseAppOutput):
    image: File = Field(description="Image with background removed")

class App(BaseApp):
    async def setup(self, metadata):
        # Load models for all functions
        pass

    async def upscale(self, input_data: UpscaleInput, metadata) -> UpscaleOutput:
        # Upscale logic
        ...

    async def remove_bg(self, input_data: RemoveBgInput, metadata) -> RemoveBgOutput:
        # Background removal logic
        ...

Each public method (other than setup/unload) becomes a callable function with its own input/output schema.

Testing Multi-Function Apps

bash
# Test a specific function
infsh app test --function upscale --input upscale_input.json

# Test another function
infsh app test --function remove_bg --input removebg_input.json

Running Multi-Function Apps in the Cloud

bash
infsh app run user/image-tools --function upscale --input input.json

When used as an agent tool, each function appears as a separate tool the agent can call.


Deploy

bash
infsh app deploy

The CLI:

  1. Validates your app
  2. Installs dependencies locally
  3. Tests the structure
  4. Packages everything
  5. Uploads to inference.sh

Output:

code
deployment summary

final status: success
duration: 45s
user: yourname
app: my-app
link: https://app.inference.sh/...

Testing Locally

Generate Example Input

bash
infsh app test --save-example
# Creates input.json
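
The generated file mirrors your AppInput schema. For the template app above it would contain the prompt and style fields; the values here are illustrative:

code
{
  "prompt": "a lighthouse at dusk",
  "style": "default"
}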

Run Locally

bash
infsh app test --input input.json

# Multi-function apps: specify which function to call
infsh app test --function upscale --input input.json

Debug Mode

bash
infsh app test --input input.json --debug

AI-Friendly Development

The CLI creates a CLAUDE.md file with instructions for AI assistants. This means you can:

  1. Describe what you want to an AI (ChatGPT, Claude, etc.)
  2. Share your app structure
  3. Get working code
  4. Deploy with infsh app deploy

The template and structure are designed for AI code generation.


Editing in the Workspace

After deploying, edit settings in the web interface:

  • Setup Parameters — Configure initialization defaults
  • Images — Card, thumbnail, banner
  • Description — Help others find your app
  • Visibility — Public or private

These changes don't require the CLI.


Best Practices

Keep Apps Focused

code
Don't: One app that does everything
Do: Single-purpose apps that compose well

Download Models in Setup

python
async def setup(self, metadata):
    # Good: Download on first run, cached afterward
    from huggingface_hub import hf_hub_download
    self.model_path = hf_hub_download("model-name", "model.bin")

Don't bundle large models with your app; downloading them in setup means they're cached after the first run.

Log Progress

python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    metadata.log("Step 1: Loading image...")
    # ...
    metadata.log("Step 2: Processing...")
    # ...
    metadata.log("Step 3: Saving result...")

Users see these in real-time.

Handle Errors Gracefully

python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Validate early, with a clear message
    if not input_data.image.exists():
        raise ValueError("Image file not found")

    try:
        result = process(input_data)
    except ProcessingError as e:
        metadata.log(f"Error: {e}")
        raise RuntimeError(f"Processing failed: {e}")

    return AppOutput(result=result)

