Build and deploy your own tools.
The Grid is open for extension. If you need something that doesn't exist, create it.
## Overview
Apps are Python scripts with:
- Typed inputs — What the app accepts
- Typed outputs — What it returns
- Lifecycle methods — Setup, run, cleanup
The CLI helps you create, test, and deploy.
## Quick Start

### Install the CLI

```bash
curl -fsSL https://cli.inference.sh | sh
```

### Login

```bash
infsh login
```

### Create an App

```bash
mkdir my-app && cd my-app
infsh init
```

Answer the prompts:
- Name: `my-app`
- Description: "What my app does"
- Category: image, text, audio, etc.
- Python version: 3.10, 3.11, or 3.12
- GPU required: Yes or No
This creates:
```
my-app/
├── inf.yml           # Configuration
├── inference.py      # Your code
├── requirements.txt  # Dependencies
└── ...
```

## The App Template
`inference.py`
```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput, File
from pydantic import Field
from typing import Optional

class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="default", description="Output style")

class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: Optional[File] = Field(default=None, description="Generated image")

class App(BaseApp):
    async def setup(self, metadata):
        """Called once when the worker starts.
        Load models, initialize resources here."""
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        """Called for each request.
        Process input and return output."""

        # Access inputs
        prompt = input_data.prompt
        style = input_data.style

        # Log progress (visible in real-time)
        metadata.log("Processing...")

        # Do your work
        result = f"Generated: {prompt} in {style} style"

        # Return output
        return AppOutput(result=result)

    async def unload(self):
        """Called on shutdown. Cleanup resources."""
        pass
```

## Input/Output Types
| Type | Example | Description |
|---|---|---|
| `str` | `"hello"` | Text |
| `int` | `42` | Integer |
| `float` | `3.14` | Decimal |
| `bool` | `true` | Boolean |
| `List[T]` | `[1, 2, 3]` | Array |
| `Optional[T]` | `null` or value | Nullable |
| `File` | `{"uri": "..."}` | File reference |
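To make the table concrete, here is an illustrative input model combining these types (the field names are invented for this example):

```python
from typing import List, Optional

from inferencesh import BaseAppInput, File
from pydantic import Field

class ExampleInput(BaseAppInput):
    prompt: str = Field(description="Text prompt")
    steps: int = Field(default=20, description="Number of iterations")
    strength: float = Field(default=0.8, description="Effect strength")
    upscale: bool = Field(default=False, description="Upscale the result")
    tags: List[str] = Field(default_factory=list, description="Free-form tags")
    image: Optional[File] = Field(default=None, description="Optional reference image")
```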
## Working with Files
Input files are downloaded automatically:
```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # File is ready to use
    image_path = input_data.image.path

    with open(image_path, "rb") as f:
        data = f.read()
```

Output files are uploaded automatically:
```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Write to a file
    output_path = "/tmp/result.png"
    save_image(output_path)

    # Return it
    return AppOutput(image=File(path=output_path))
```

## Configuration
`inf.yml`
```yaml
name: my-app
description: What my app does
category: image

resources:
  gpu:
    count: 1
    vram: 8000  # MB
    type: nvidia
  ram: 16000  # MB

python: "3.11"
```

## Setup Parameters
Use the `AppSetup` class in Python to define runtime configuration (e.g. model selection).
See Setup Parameters.
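As a rough, hypothetical sketch of the pattern (the `BaseAppSetup` import and the field shown here are assumptions for illustration, not the confirmed API; the Setup Parameters guide has the real details):

```python
from pydantic import Field
# Hypothetical import: the actual base class is documented in Setup Parameters.
from inferencesh import BaseAppSetup

class AppSetup(BaseAppSetup):
    # Example runtime option: which checkpoint setup() should load
    model_variant: str = Field(default="base", description="Model variant to load")
```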
## Dependencies
`requirements.txt` — Python packages:
```
torch>=2.0.0
transformers>=4.30.0
pillow>=9.0.0
```

`packages.txt` — System packages (apt):
```
ffmpeg
libgl1-mesa-glx
```

## Requirements
If your app needs external API keys or integrations, declare them:
```yaml
# inf.yml
requirements:
  # Environment secrets (user provides their own keys)
  secrets:
    - key: OPENAI_API_KEY
      description: For GPT-4 API calls

    - key: REPLICATE_API_TOKEN
      description: For model inference
      optional: true

  # Integrations (managed OAuth connections)
  integrations:
    - key: google.sheets
      description: Read/write spreadsheets

    - key: x.tweet.write
      description: Post tweets
      optional: true
```

Required items must be configured before the app runs.
Optional items won't block execution if missing.
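How declared secrets reach your code isn't shown above; here is a minimal sketch, assuming (as the "environment secrets" wording suggests) that configured values are injected as environment variables:

```python
import os

from inferencesh import BaseApp

class App(BaseApp):
    async def setup(self, metadata):
        # Assumption: declared secrets are exposed as environment variables.
        self.openai_key = os.environ["OPENAI_API_KEY"]  # required: configured before run
        # Optional secrets may be absent, so fall back gracefully.
        self.replicate_token = os.environ.get("REPLICATE_API_TOKEN")
```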
→ Learn more about Secrets & Integrations
## Deploy
```bash
infsh deploy
```

The CLI:
- Validates your app
- Installs dependencies locally
- Tests the structure
- Packages everything
- Uploads to inference.sh
Output:
```
┌──────────────────────────────────────┐
│ deployment summary                   │
├──────────────────────────────────────┤
│                                      │
│ final status: success                │
│ duration: 45s                        │
│ user: yourname                       │
│ app: my-app                          │
│ link: https://app.inference.sh/...   │
│                                      │
└──────────────────────────────────────┘
```

## Testing Locally
### Generate Example Input
```bash
infsh run --save-example
# Creates input.json
```

### Run Locally
```bash
infsh run --input-file input.json
```
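For the template app above, the input file is just the JSON form of `AppInput`; it should look something like:

```json
{
  "prompt": "a lighthouse at dusk",
  "style": "default"
}
```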
### Iterate Quickly

```bash
# Skip validation for faster deploys
infsh deploy --skip-checks
```

## AI-Friendly Development
The CLI creates a `CLAUDE.md` file with instructions for AI assistants. This means you can:
- Describe what you want to an AI (ChatGPT, Claude, etc.)
- Share your app structure
- Get working code
- Deploy with `infsh deploy`
The template and structure are designed for AI code generation.
## Editing in the Workspace
After deploying, edit settings in the web interface:
- Setup Parameters — Configure initialization defaults
- Images — Card, thumbnail, banner
- Description — Help others find your app
- Visibility — Public or private
These changes don't require the CLI.
## Best Practices
### Keep Apps Focused

```
❌ Don't: One app that does everything
✓ Do: Single-purpose apps that compose well
```

### Download Models in Setup
```python
async def setup(self, metadata):
    # Good: Download on first run, cached afterward
    from huggingface_hub import hf_hub_download
    self.model_path = hf_hub_download("model-name", "model.bin")
```

Don't bundle large models — they're cached after first download.
### Log Progress
```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    metadata.log("Step 1: Loading image...")
    # ...
    metadata.log("Step 2: Processing...")
    # ...
    metadata.log("Step 3: Saving result...")
```

Users see these in real-time.
### Handle Errors Gracefully
```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    if not input_data.image.exists():
        raise ValueError("Image file not found")

    try:
        result = process(input_data)
    except ProcessingError as e:
        metadata.log(f"Error: {e}")
        raise RuntimeError(f"Processing failed: {e}")
```

## What's Next?
- Apps — See your app in the Grid
- Flows — Use your app in workflows
- Agents — Add your app as an agent tool
- Secrets & Integrations — Declare external requirements
- API & SDK — Run your app programmatically