
Setup Parameters

Setup parameters configure an app's initialization steps. Modifying these parameters triggers a full restart (re-running setup()).

Unlike input parameters, which change per request, setup parameters control expensive, static resources:

  • Model weights
  • Hardware precision (fp16/fp32)
  • Database connections
  • System prompts or behavior flags

Defining Setup Parameters

Define an AppSetup class in inference.py. This schema validates configuration on startup.

python
from inferencesh import BaseApp, BaseAppInput
from pydantic import Field

class AppSetup(BaseAppInput):
    model_id: str = Field(default="gpt2", description="HuggingFace model ID")
    precision: str = Field(default="fp16", description="Model precision")
    enable_cache: bool = Field(default=True, description="Enable KV cache")

class App(BaseApp):
    async def setup(self, config: AppSetup):
        # Access parameters via config object.
        # load_model is not defined in this snippet; substitute your own loader.
        self.model = load_model(
            config.model_id,
            precision=config.precision
        )
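To illustrate what validating configuration on startup can look like, the sketch below uses plain pydantic instead of the inferencesh base classes, so it runs without the platform. `SetupSketch`, `parse_setup`, and the `Literal` restriction on `precision` are assumptions made for illustration; the example above declares `precision` as a plain `str`.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class SetupSketch(BaseModel):
    """Hypothetical stand-in for AppSetup, built on plain pydantic."""

    model_id: str = Field(default="gpt2", description="HuggingFace model ID")
    # Assumption: only the two precisions named on this page are allowed.
    precision: Literal["fp16", "fp32"] = Field(default="fp16", description="Model precision")
    enable_cache: bool = Field(default=True, description="Enable KV cache")


def parse_setup(raw: dict) -> SetupSketch:
    """Build a validated config; missing keys fall back to the field defaults."""
    return SetupSketch(**raw)


defaults = parse_setup({})
custom = parse_setup({"model_id": "meta-llama/Llama-2-7b", "precision": "fp32"})

try:
    parse_setup({"precision": "int3"})  # not an allowed value
    rejected = False
except ValidationError:
    rejected = True
```

With a schema like this, a bad value is rejected before any expensive resource is loaded, and omitted fields take their declared defaults.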

Usage

Provide setup parameters when starting an app. If parameters match an existing running instance for that app, the request executes immediately. If they differ, the instance re-initializes.
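
The matching rule can be pictured as a lookup keyed by app name and setup parameters. The following is a minimal sketch of that idea, not the platform's actual implementation; `InstanceManager`, `setup_key`, and `setup_calls` are hypothetical names.

```python
import json


def setup_key(app: str, setup: dict) -> str:
    """Canonical key: the same app and parameters always give the same string."""
    return app + ":" + json.dumps(setup, sort_keys=True)


class InstanceManager:
    """Hypothetical manager that keeps one live instance per app."""

    def __init__(self):
        self.active = {}      # app name -> key of the running configuration
        self.setup_calls = 0  # how many times the expensive setup step ran

    def run(self, app: str, setup: dict, payload: dict) -> dict:
        key = setup_key(app, setup)
        if self.active.get(app) != key:
            # Parameters differ from the running instance: re-initialize.
            self.setup_calls += 1
            self.active[app] = key
        # At this point the running instance matches the requested setup.
        return {"app": app, "input": payload}


manager = InstanceManager()
fp16 = {"model_id": "gpt2", "precision": "fp16"}
manager.run("my-app", fp16, {"prompt": "Hello"})           # first start: runs setup
manager.run("my-app", dict(fp16), {"prompt": "Hi again"})  # same parameters: reuses the instance
manager.run("my-app", {**fp16, "precision": "fp32"}, {"prompt": "Hi"})  # changed: runs setup again
```

In this sketch three requests trigger only two setup runs, because the second request matches the configuration that is already running.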

CLI

bash
infsh run input.json --setup setup.json
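
The page does not show the contents of the two files. Assuming `setup.json` holds a flat JSON object whose keys are the `AppSetup` field names, and `input.json` holds the request input in the same way, they could be generated like this. The file layout is an assumption, not confirmed by the source.

```python
import json
from pathlib import Path

# Assumption: keys mirror the AppSetup fields defined in inference.py.
setup = {"model_id": "meta-llama/Llama-2-7b", "precision": "fp16"}
# Assumption: the input file is a flat object of per-request input fields.
payload = {"prompt": "Hello world"}

Path("setup.json").write_text(json.dumps(setup, indent=2))
Path("input.json").write_text(json.dumps(payload, indent=2))
```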

Python SDK

python
client.run({
    "app": "my-app",
    "setup": {
        "model_id": "meta-llama/Llama-2-7b",
        "precision": "fp16"
    },
    "input": {
        "prompt": "Hello world"
    }
})
