App Code

The inference.py file contains your app's logic: the input and output schemas and the App class that handles requests.


Structure

```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput
from pydantic import Field

class AppSetup(BaseAppInput):
    model_id: str = Field(default="gpt2", description="Model to load")

class AppInput(BaseAppInput):
    # Define inputs here
    pass

class AppOutput(BaseAppOutput):
    # Define outputs here
    pass

class App(BaseApp):
    async def setup(self, config: AppSetup):
        # Runs once when worker starts or config changes
        pass

    async def run(self, input_data: AppInput, metadata) -> AppOutput:
        # Runs for each request
        pass

    async def unload(self):
        # Runs on shutdown
        pass
```

Defining inputs

```python
class AppInput(BaseAppInput):
    prompt: str = Field(description="What to generate")
    style: str = Field(default="modern", description="Style to use")
    count: int = Field(default=1, description="How many to generate")
```

| Type | Description |
| --- | --- |
| str | Text |
| int | Whole number |
| float | Decimal |
| bool | True/false |
| File | Uploaded file |
| Optional[T] | Can be null |
| List[T] | Array |
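
To make the table concrete, the sketch below declares one field per row. It uses a standard-library dataclass as a stand-in for BaseAppInput so that it runs without the SDK; the field names and the `FileRef` class are hypothetical. In a real app you would subclass BaseAppInput, declare fields with Field(...), and use the SDK's File type.

```python
from dataclasses import dataclass, field
from typing import List, Optional, get_type_hints


@dataclass
class FileRef:
    """Hypothetical stand-in for the SDK's File type."""
    path: str


@dataclass
class ExampleInput:
    """Stand-in for a BaseAppInput subclass; all field names are illustrative."""
    prompt: str                                    # str: text
    image: FileRef                                 # File: uploaded file
    count: int = 1                                 # int: whole number
    strength: float = 0.5                          # float: decimal
    upscale: bool = False                          # bool: true/false
    negative_prompt: Optional[str] = None          # Optional[T]: can be null
    tags: List[str] = field(default_factory=list)  # List[T]: array


example = ExampleInput(prompt="a red bicycle", image=FileRef(path="/tmp/in.png"))
hints = get_type_hints(ExampleInput)
for name, hint in hints.items():
    print(f"{name}: {hint}")
```

Fields without a default (here `prompt` and `image`) must be supplied by the caller; the rest fall back to their defaults.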

Defining outputs

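```python
# File is assumed to be exported by the SDK alongside BaseAppOutput
from inferencesh import BaseAppOutput, File
from pydantic import Field

class AppOutput(BaseAppOutput):
    result: str = Field(description="Generated text")
    image: File = Field(description="Generated image")
```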

The run method

This is where your logic goes:

```python
async def run(self, input_data: AppInput, metadata) -> AppOutput:
    # Log progress
    metadata.log("Processing...")

    # Do work (process is a placeholder for your own function)
    result = process(input_data.prompt)

    # Return output
    return AppOutput(result=result)
```
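
One way to exercise a run method locally is to call it with a fake metadata object that records log messages. The sketch below is an illustration only: `FakeMetadata`, `DemoInput`, `DemoOutput`, and `DemoApp` are hypothetical stand-ins built from the standard library, and the real metadata object may offer more than `log`.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DemoInput:
    prompt: str


@dataclass
class DemoOutput:
    result: str


class FakeMetadata:
    """Hypothetical stand-in for the metadata argument; only log() is modelled."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def log(self, message: str) -> None:
        self.messages.append(message)


class DemoApp:
    async def run(self, input_data: DemoInput, metadata: FakeMetadata) -> DemoOutput:
        metadata.log("Processing...")
        result = input_data.prompt.upper()  # placeholder for real work
        return DemoOutput(result=result)


async def main() -> Tuple[DemoOutput, List[str]]:
    metadata = FakeMetadata()
    output = await DemoApp().run(DemoInput(prompt="hello"), metadata)
    return output, metadata.messages


output, messages = asyncio.run(main())
print(output.result, messages)
```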

Setup for models

Load heavy resources in setup. Use AppSetup to define configurable parameters:

```python
class AppSetup(BaseAppInput):
    model_id: str = Field(default="gpt2", description="Model to load")
    precision: str = Field(default="fp16", description="Model precision")

class App(BaseApp):
    async def setup(self, config: AppSetup):
        from transformers import AutoModel
        self.model = AutoModel.from_pretrained(config.model_id)
```

The setup method runs once per configuration. If the AppSetup values change between requests, the app re-initializes.
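
The "once per configuration" behaviour can be pictured as a host that remembers the last configuration and calls setup again only when a different one arrives. The sketch below is a simplified model, not the platform's actual worker: `DemoHost`, `DemoApp`, and `DemoSetup` are hypothetical, and the "model" is just a string.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DemoSetup:
    model_id: str = "gpt2"
    precision: str = "fp16"


class DemoApp:
    def __init__(self) -> None:
        self.setup_calls = 0
        self.model: Optional[str] = None

    async def setup(self, config: DemoSetup) -> None:
        self.setup_calls += 1
        self.model = f"{config.model_id}:{config.precision}"  # pretend to load a model


class DemoHost:
    """Hypothetical host: re-runs setup only when the config differs."""

    def __init__(self, app: DemoApp) -> None:
        self.app = app
        self.active_config: Optional[DemoSetup] = None

    async def ensure_setup(self, config: DemoSetup) -> None:
        if config != self.active_config:
            await self.app.setup(config)
            self.active_config = config


async def main() -> DemoApp:
    app = DemoApp()
    host = DemoHost(app)
    await host.ensure_setup(DemoSetup())                  # first request: setup runs
    await host.ensure_setup(DemoSetup())                  # same config: setup skipped
    await host.ensure_setup(DemoSetup(precision="fp32"))  # changed config: setup runs again
    return app


app = asyncio.run(main())
print(app.setup_calls, app.model)
```

Because a changed configuration triggers a full re-initialization, it makes sense to keep AppSetup limited to values that actually require reloading resources, and to pass per-request options through AppInput instead.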

For more details, see Setup Parameters.


Multi-function apps

Apps can expose multiple functions, each with its own input and output types:

```python
from pydantic import BaseModel

class GreetInput(BaseModel):
    name: str = "World"

class GreetOutput(BaseModel):
    message: str

class ReverseInput(BaseModel):
    text: str

class ReverseOutput(BaseModel):
    reversed_text: str

class App:
    async def run(self, input_data: GreetInput) -> GreetOutput:
        """Default function - says hello."""
        return GreetOutput(message=f"Hello, {input_data.name}!")

    async def greet(self, input_data: GreetInput) -> GreetOutput:
        """Custom greeting."""
        return GreetOutput(message=f"Welcome, {input_data.name}!")

    async def reverse(self, input_data: ReverseInput) -> ReverseOutput:
        """Reverse text."""
        return ReverseOutput(reversed_text=input_data.text[::-1])
```

Functions are discovered automatically if they:

  • Are public methods (no _ prefix)
  • Have type hints for input and return value
  • Use Pydantic models for input/output
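
The first two rules above can be approximated with the standard library's `inspect` and `typing` modules. This is a rough sketch of how such discovery could work, not the SDK's actual implementation: `discover_functions` and the demo classes are hypothetical, dataclasses stand in for Pydantic models, and the Pydantic check from the third rule is not modelled.

```python
import inspect
from dataclasses import dataclass
from typing import Dict, Tuple, get_type_hints


@dataclass
class GreetInput:
    name: str = "World"


@dataclass
class GreetOutput:
    message: str


class DemoApp:
    async def greet(self, input_data: GreetInput) -> GreetOutput:
        return GreetOutput(message=f"Welcome, {input_data.name}!")

    async def _helper(self, input_data: GreetInput) -> GreetOutput:
        return GreetOutput(message="private")  # skipped: leading underscore

    async def untyped(self, input_data):
        return None  # skipped: no type hints


def discover_functions(cls: type) -> Dict[str, Tuple[type, type]]:
    """Return {method name: (input type, return type)} for eligible methods."""
    found: Dict[str, Tuple[type, type]] = {}
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # rule 1: public methods only
        hints = get_type_hints(member)
        if "input_data" in hints and "return" in hints:
            # rule 2: both the input and the return value are annotated
            found[name] = (hints["input_data"], hints["return"])
    return found


functions = discover_functions(DemoApp)
print(sorted(functions))
```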

Call specific functions via the API:

```bash
curl -X POST https://api.inference.sh/v1/apps/{app_id}/run \
  -d '{"function": "reverse", "input": {"text": "hello"}}'
```
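
The same request can be built from Python with the standard library. In this sketch, `build_run_request` and the app ID `my-app-id` are hypothetical, the `Content-Type` header is an assumption (the curl example does not show one), and any authentication the API may require is left out because it is not described here. The request is only constructed, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.inference.sh/v1"


def build_run_request(app_id: str, function: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that selects a function and passes its input."""
    body = json.dumps({"function": function, "input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/apps/{app_id}/run",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not shown in the curl example
        method="POST",
    )


request = build_run_request("my-app-id", "reverse", {"text": "hello"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it: urllib.request.urlopen(request)
```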

Working with files

Input files are downloaded for you:

```python
image_path = input_data.image.path
```

Output files are uploaded for you:

```python
return AppOutput(image=File(path="/tmp/output.png"))
```
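
Putting both directions together, a function typically reads from the input file's path, writes its result to a new local file, and returns a file object pointing at it. The sketch below shows that round trip with a hypothetical `FileRef` dataclass standing in for the SDK's File type; it models only the `path` attribute, and the download and upload steps the platform performs are not simulated.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileRef:
    """Hypothetical stand-in for the SDK's File type; only `path` is modelled."""
    path: str


def make_uppercase_copy(source: FileRef) -> FileRef:
    """Read the input file, write a transformed copy, and return a reference to it."""
    text = Path(source.path).read_text(encoding="utf-8")
    out_path = Path(tempfile.mkdtemp()) / "output.txt"
    out_path.write_text(text.upper(), encoding="utf-8")
    return FileRef(path=str(out_path))


# Simulate an input file that has already been downloaded.
input_path = Path(tempfile.mkdtemp()) / "input.txt"
input_path.write_text("hello", encoding="utf-8")

result = make_uppercase_copy(FileRef(path=str(input_path)))
print(Path(result.path).read_text(encoding="utf-8"))
```

Writing to a fresh temporary directory rather than a fixed path such as /tmp/output.png avoids two requests overwriting each other's output if they happen to run on the same machine.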
