The inference.py file contains your app's logic.
Structure
```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput
from pydantic import Field

class AppInput(BaseAppInput):
    pass

class AppOutput(BaseAppOutput):
    # Define outputs here
    pass

class App(BaseApp):
    async def setup(self, metadata):
        # Runs once when worker starts
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        # Runs for each request
        pass

    async def unload(self):
        # Runs on shutdown
        pass
```

Defining inputs
```python
class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="modern", description="Style to use")
    count: int = Field(default=1, description="How many to generate")
```

| Type | Description |
|---|---|
| `str` | Text |
| `int` | Whole number |
| `float` | Decimal |
| `bool` | True/false |
| `File` | Uploaded file |
| `Optional[T]` | Can be null |
| `List[T]` | Array |
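
To see how these field types behave at validation time, here is a minimal sketch. It assumes `BaseAppInput` validates like a plain pydantic `BaseModel`, which is used here as a stand-in so the snippet runs without the SDK. The `negative_prompt` and `tags` fields are hypothetical and exist only to illustrate `Optional[T]` and `List[T]`.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


# Stand-in for BaseAppInput (assumption: it validates like a pydantic model).
class AppInput(BaseModel):
    prompt: str = Field(description="What to generate")
    count: int = Field(default=1, description="How many to generate")
    # Hypothetical fields, added only to illustrate Optional[T] and List[T].
    negative_prompt: Optional[str] = Field(default=None, description="What to avoid")
    tags: List[str] = Field(default_factory=list, description="Free-form labels")


# Only the required field is supplied; the rest fall back to their defaults.
inp = AppInput(prompt="a red bicycle")
defaults = (inp.count, inp.negative_prompt, inp.tags)

# A missing required field is rejected at construction time.
try:
    AppInput()
    missing_prompt_rejected = False
except ValidationError:
    missing_prompt_rejected = True
```

Fields without a default (like `prompt`) are required; fields with a default can be omitted by the caller.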
Defining outputs
```python
# File is assumed to be exported by the inferencesh package.
from inferencesh import File

class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: File = Field(description="Generated image")
```

The run method
This is where your logic goes:
```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Log progress
    metadata.log("Processing...")

    # Do work (process() is a placeholder for your own function)
    result = process(input_data.prompt)

    # Return output
    return AppOutput(result=result)
```

Setup for models
Load heavy resources in setup:
```python
async def setup(self, metadata):
    from transformers import AutoModel
    self.model = AutoModel.from_pretrained("model-name")
```

This runs once when the worker starts, not on every request.
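
The difference between `setup` and `run` can be illustrated with a small stand-alone sketch. It does not use the inferencesh SDK: `DemoApp` and `serve` are hypothetical stand-ins that only mimic the call order described in this page (setup once, run per request, unload at shutdown). How the real worker schedules these calls is an assumption.

```python
import asyncio


class DemoApp:
    """Hypothetical stand-in for a BaseApp subclass; no SDK required."""

    def __init__(self):
        self.setup_calls = 0
        self.model = None

    async def setup(self, metadata):
        # Pretend this is an expensive model load.
        self.setup_calls += 1
        self.model = str.upper

    async def run(self, input_data, metadata):
        # Reuses the resource that setup stored on self.
        return self.model(input_data)

    async def unload(self):
        # Release the resource on shutdown.
        self.model = None


async def serve(app, requests):
    """Mimic a worker: setup once, run for each request, then unload."""
    await app.setup(metadata=None)
    try:
        return [await app.run(r, metadata=None) for r in requests]
    finally:
        await app.unload()


app = DemoApp()
results = asyncio.run(serve(app, ["a", "b", "c"]))
```

After three requests, `app.setup_calls` is still 1, which is why expensive loading belongs in `setup` rather than in `run`.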
Working with files
Input files are downloaded for you:
```python
# Assumes AppInput declares an `image: File` field
image_path = input_data.image.path
```

Output files are uploaded for you:
```python
return AppOutput(image=File(path="/tmp/output.png"))
```
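
A minimal sketch of the file flow inside `run`, using only the standard library. `make_output_file` is a hypothetical helper, not part of the SDK. It assumes the input path comes from something like `input_data.image.path` and that the returned path would then be wrapped as `File(path=...)` in the output.

```python
import tempfile
from pathlib import Path


def make_output_file(input_path: str, suffix: str = ".out") -> str:
    """Read a local input file and write a processed copy to a fresh temp file."""
    data = Path(input_path).read_bytes()
    processed = data[::-1]  # placeholder for real processing: reverse the bytes
    # delete=False keeps the file on disk after the handle is closed,
    # so it still exists when its path is handed back to the caller.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(processed)
        return tmp.name


# Usage with a throwaway input file standing in for a downloaded upload.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as src:
    src.write(b"abc")

output_path = make_output_file(src.name)
output_bytes = Path(output_path).read_bytes()
```

Writing to a uniquely named temp file instead of a fixed path such as `/tmp/output.png` is one way to avoid collisions if several requests write output at the same time; whether that matters depends on how your worker handles concurrency.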