The inference.py file contains your app's logic.
Structure
```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput
from pydantic import Field

class AppSetup(BaseAppInput):
    model_id: str = Field(default="gpt2", description="Model to load")

class AppInput(BaseAppInput):
    # Define inputs here
    pass

class AppOutput(BaseAppOutput):
    # Define outputs here
    pass

class App(BaseApp):
    async def setup(self, config: AppSetup):
        # Runs once when worker starts or config changes
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        # Runs for each request
        pass

    async def unload(self):
        # Runs on shutdown
        pass
```
Defining inputs
```python
class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="modern", description="Style to use")
    count: int = Field(default=1, description="How many to generate")
```

| Type | Description |
|---|---|
| `str` | Text |
| `int` | Whole number |
| `float` | Decimal |
| `bool` | True/false |
| `File` | Uploaded file |
| `Optional[T]` | Can be null |
| `List[T]` | Array |
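As a rough illustration of how the types in the table combine, here is a minimal sketch. It uses plain `pydantic.BaseModel` instead of `BaseAppInput` so that it runs without the SDK installed, and it leaves out `File`. The class name `ExampleInput` and its fields are made up for this example.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


# Hypothetical input model; in an app this would subclass BaseAppInput.
class ExampleInput(BaseModel):
    prompt: str = Field(description="Required text")
    temperature: float = Field(default=0.7, description="Decimal with a default")
    seed: Optional[int] = Field(default=None, description="May be null")
    tags: List[str] = Field(default_factory=list, description="Array of strings")
    upscale: bool = Field(default=False, description="True/false flag")


# Only the required field and one optional field are supplied.
ok = ExampleInput(prompt="a red fox", tags=["animal", "forest"])
print(ok.seed, ok.tags)

# A field without a default is required, so omitting it fails validation.
try:
    ExampleInput()
    missing_prompt_rejected = False
except ValidationError:
    missing_prompt_rejected = True
```

Fields declared without a default are required; everything else falls back to the default when the caller omits it.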
Defining outputs
```python
class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: File = Field(description="Generated image")
```
The run method
This is where your logic goes:
```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Log progress
    metadata.log("Processing...")

    # Do work
    result = process(input_data.prompt)

    # Return output
    return AppOutput(result=result)
```
Setup for models
Load heavy resources in setup. Use AppSetup to define configurable parameters:
```python
class AppSetup(BaseAppInput):
    model_id: str = Field(default="gpt2", description="Model to load")
    precision: str = Field(default="fp16", description="Model precision")

class App(BaseApp):
    async def setup(self, config: AppSetup):
        from transformers import AutoModel
        self.model = AutoModel.from_pretrained(config.model_id)
```
This runs once per configuration. If AppSetup values change between requests, the app re-initializes.
For more details, see Setup Parameters.
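One way to picture the "once per configuration" behavior is a worker that remembers the last configuration it set up with and calls `setup` again only when that configuration changes. The sketch below illustrates the idea only and is not the platform's actual worker code: `Worker`, `ensure_setup`, and `DemoApp` are hypothetical names, and a plain dict stands in for `AppSetup`.

```python
import asyncio


class DemoApp:
    """Hypothetical stand-in for an App; counts how often setup runs."""

    def __init__(self):
        self.setup_calls = 0
        self.model_id = None

    async def setup(self, config: dict):
        # A real app would load its model here.
        self.setup_calls += 1
        self.model_id = config["model_id"]


class Worker:
    """Hypothetical worker that re-runs setup only when the config changes."""

    def __init__(self, app):
        self.app = app
        self._active_config = None  # config used by the most recent setup call

    async def ensure_setup(self, config: dict):
        if config != self._active_config:
            await self.app.setup(config)
            self._active_config = dict(config)


async def demo():
    app = DemoApp()
    worker = Worker(app)
    await worker.ensure_setup({"model_id": "gpt2"})        # first request: setup runs
    await worker.ensure_setup({"model_id": "gpt2"})        # same config: skipped
    await worker.ensure_setup({"model_id": "distilgpt2"})  # changed config: setup runs again
    return app


app = asyncio.run(demo())
print(app.setup_calls, app.model_id)
```

Under this model, two requests with identical setup values share one loaded model, which is why heavy resources belong in `setup` rather than `run`.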
Multi-function apps
Apps can expose multiple functions, each with its own input/output types:
```python
from pydantic import BaseModel

class GreetInput(BaseModel):
    name: str = "World"

class GreetOutput(BaseModel):
    message: str

class ReverseInput(BaseModel):
    text: str

class ReverseOutput(BaseModel):
    reversed_text: str

class App:
    async def run(self, input_data: GreetInput) -> GreetOutput:
        """Default function - says hello."""
        return GreetOutput(message=f"Hello, {input_data.name}!")

    async def greet(self, input_data: GreetInput) -> GreetOutput:
        """Custom greeting."""
        return GreetOutput(message=f"Welcome, {input_data.name}!")

    async def reverse(self, input_data: ReverseInput) -> ReverseOutput:
        """Reverse text."""
        return ReverseOutput(reversed_text=input_data.text[::-1])
```
Functions are discovered automatically if they:
- Are public methods (no `_` prefix)
- Have type hints for input and return value
- Use Pydantic models for input/output
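The three rules above can be approximated with the standard `inspect` and `typing` modules. This is a sketch of the idea only, not the SDK's real discovery code: `discover_functions` is a hypothetical helper, and the `Model` class is a stand-in for `pydantic.BaseModel` so the example runs without extra packages.

```python
import inspect
from typing import get_type_hints


class Model:
    """Stand-in base class; a real app would use pydantic.BaseModel."""


class GreetInput(Model):
    pass


class GreetOutput(Model):
    pass


class App:
    async def run(self, input_data: GreetInput) -> GreetOutput:
        return GreetOutput()

    async def greet(self, input_data: GreetInput) -> GreetOutput:
        return GreetOutput()

    async def _helper(self, input_data: GreetInput) -> GreetOutput:
        return GreetOutput()  # leading underscore: not public

    async def untyped(self, input_data):
        return None  # no type hints


def discover_functions(app_cls, model_base):
    """Return the names of methods that satisfy all three rules."""
    found = []
    for name, fn in inspect.getmembers(app_cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # rule 1: public methods only
        params = [p for p in inspect.signature(fn).parameters if p != "self"]
        if not params:
            continue
        hints = get_type_hints(fn)
        in_type = hints.get(params[0])
        out_type = hints.get("return")
        # rules 2 and 3: both hints must exist and be model classes
        if not (inspect.isclass(in_type) and issubclass(in_type, model_base)):
            continue
        if not (inspect.isclass(out_type) and issubclass(out_type, model_base)):
            continue
        found.append(name)
    return found


functions = discover_functions(App, Model)
print(functions)
```

Here `_helper` is skipped for its underscore prefix and `untyped` for its missing hints, so only `greet` and `run` are reported.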
Call specific functions via the API:
```bash
curl -X POST https://api.inference.sh/v1/apps/{app_id}/run \
  -d '{"function": "reverse", "input": {"text": "hello"}}'
```
Working with files
Input files are downloaded for you:
```python
image_path = input_data.image.path
```
Output files are uploaded for you:
```python
return AppOutput(image=File(path="/tmp/output.png"))
```
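Because input files arrive as local paths and output files are returned as local paths, the processing step can be written as an ordinary path-in, path-out function and tried locally before wiring it into `run`. The sketch below assumes a text file for simplicity; `uppercase_copy` is a hypothetical helper, not part of the SDK.

```python
import tempfile
from pathlib import Path


def uppercase_copy(input_path: str) -> str:
    """Read a text file and write an upper-cased copy to a new temp directory."""
    text = Path(input_path).read_text(encoding="utf-8")
    out_path = Path(tempfile.mkdtemp()) / "output.txt"
    out_path.write_text(text.upper(), encoding="utf-8")
    return str(out_path)


# Local check: a throwaway file stands in for a downloaded input file.
src = Path(tempfile.mkdtemp()) / "input.txt"
src.write_text("hello", encoding="utf-8")
result_path = uppercase_copy(str(src))
print(Path(result_path).read_text(encoding="utf-8"))  # HELLO
```

Inside `run`, the same function would take the input file's `.path` and its return value would be wrapped in `File(path=...)`, following the two snippets above.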