
Configuration

The inf.yml file defines app settings and resource requirements.


Project Structure

```
my-app/
  inf.yml           # Configuration
  inference.py      # App logic
  requirements.txt  # Python packages (pip)
  packages.txt      # System packages (apt), optional
```

Basic structure

```yaml
name: my-app
description: What my app does
category: image
kernel: python-3.11

resources:
  gpu:
    count: 1
    vram: 24    # 24GB (auto-converted to bytes)
    type: any
  ram: 32       # 32GB
```

Fields

| Field | Required | Description |
|---|---|---|
| name | Yes | App identifier (slug format) |
| description | Yes | What it does |
| category | Yes | App category |
| kernel | Yes | Runtime: python-3.10, python-3.11, python-3.12 |
| resources | Yes | Hardware requirements |

Resources

The CLI automatically converts human-friendly values to bytes:

- < 1000 → treated as GB (e.g., 80 = 80GB)
- 1000 to 1 billion → treated as MB (e.g., 80000 = 80GB)
```yaml
resources:
  gpu:
    count: 1        # Number of GPUs
    vram: 24        # 24GB
    type: any       # GPU type
  ram: 32           # 32GB
```
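The conversion rule above can be sketched in Python. This is an illustrative reconstruction, not the CLI's actual code; decimal units (1 GB = 10⁹ bytes) are assumed because the docs equate 80000 MB with 80GB:

```python
def to_bytes(value: int) -> int:
    """Sketch of the CLI's unit heuristic (assumed; decimal units).

    < 1000            -> interpreted as GB
    1000 .. 1 billion -> interpreted as MB
    anything larger   -> taken as bytes
    """
    if value < 1000:
        return value * 10**9   # GB -> bytes
    if value < 10**9:
        return value * 10**6   # MB -> bytes
    return value               # already bytes

# Both spellings of 80GB resolve to the same byte count:
print(to_bytes(80) == to_bytes(80000))  # True
```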

GPU Types

| Value | Description |
|---|---|
| any | Any GPU will work |
| nvidia | Requires NVIDIA GPU |
| amd | Requires AMD GPU |
| apple | Requires Apple Silicon |
| none | No GPU needed (CPU only) |

Note: Currently only NVIDIA CUDA GPUs are supported.

For CPU-only apps:

```yaml
resources:
  gpu:
    count: 0
    type: none
  ram: 4
```

Categories

| Category | Use For |
|---|---|
| image | Image generation, editing |
| video | Video generation, processing |
| audio | Audio generation, TTS |
| text | Text generation |
| chat | Conversational AI |
| 3d | 3D model generation |
| other | Everything else |

Dependencies

Python Packages (requirements.txt)

```
torch>=2.0
transformers
accelerate
```

System Packages (packages.txt)

For apt-installable system dependencies:

```
ffmpeg
libgl1-mesa-glx
```

Base Images

Apps run in containers with these base images:

| Type | Image |
|---|---|
| GPU | docker.inference.sh/gpu:latest-cuda |
| CPU | docker.inference.sh/cpu:latest |

Environment Variables

```yaml
env:
  MODEL_NAME: gpt-4
  MAX_TOKENS: "2000"
  HF_HUB_ENABLE_HF_TRANSFER: "1"
```

Access in code:

```python
import os

model = os.environ["MODEL_NAME"]
```
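Environment values always arrive as strings, so numeric settings like MAX_TOKENS need explicit conversion. A minimal sketch, using `os.environ.get` to fall back to a default when a variable is unset:

```python
import os

# env values are always strings; convert numeric settings explicitly
model = os.environ.get("MODEL_NAME", "gpt-4")
max_tokens = int(os.environ.get("MAX_TOKENS", "2000"))
```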

Secrets and Integrations

Declare required secrets and OAuth integrations:

```yaml
secrets:
  - key: HF_TOKEN
    description: HuggingFace token for gated models
    optional: false

integrations:
  - key: google.sheets
    description: Access to Google Sheets
    optional: true
```

See Secrets and Integrations for details.
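Assuming declared secrets are injected as environment variables at runtime (an assumption — see the Secrets docs for how delivery actually works), a defensive lookup might look like:

```python
import os

def require_secret(key: str) -> str:
    """Fetch a declared secret, assuming it is injected as an env var."""
    value = os.environ.get(key)
    if value is None:
        raise RuntimeError(f"required secret {key} is not configured")
    return value

# HF_TOKEN was declared with optional: false in inf.yml,
# so treat its absence as a configuration error:
# hf_token = require_secret("HF_TOKEN")
```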


Next

Deploying
