Run tasks on your own hardware.
In the workspace
Toggle to "Private" when running an app:
```
Run on: [○ Cloud] [● Private]
```

Via API
```python
result = client.run({
    "app": "my-app",
    "input": {...},
    "infra": "private"
})
```

Specific workers
Target exact workers:
```python
result = client.run({
    "app": "my-app",
    "input": {...},
    "infra": "private",
    "workers": ["my-server-gpu-0"]
})
```

Agents on private
Agents can use private workers too.
When an agent calls a tool, it respects your infra setting.
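For example, here is a sketch of running an agent on private infra. The call shape follows the examples above; `my-agent` and the input are placeholder names, not taken from this page:

```python
# Sketch: "my-agent" is a placeholder app name. Because "infra" is set
# to "private", any tool calls the agent makes during this run are
# also executed on your private workers.
result = client.run({
    "app": "my-agent",
    "input": {"prompt": "Summarize last week's logs"},
    "infra": "private"
})
```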
Monitoring
Check your engines on the Engines page:
- Online/offline status
- Resource usage
- Running tasks
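If you want the same information from code rather than the dashboard, a polling sketch could look like the following. Note that `client.engines.list()` and the field names are assumptions for illustration; this page only documents the Engines page itself:

```python
# Hypothetical sketch: client.engines.list() and these field names are
# assumptions, not a documented API. The Engines page shows the same
# online/offline status, resource usage, and running tasks in the UI.
for engine in client.engines.list():
    status = engine["status"]  # e.g. "online" / "offline"
    print(f"{engine['name']}: {status}, "
          f"GPU {engine['gpu_util']}%, "
          f"{len(engine['running_tasks'])} tasks running")
```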
Caching
The engine caches:
- App code
- Downloaded models
- Container images
Second runs are much faster.
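A quick way to see the cache at work is to time the same call twice. A minimal sketch, reusing the `client` from the examples above; the app name and input are placeholders, and absolute timings depend on model and image size:

```python
import time

def timed_run():
    # Placeholder app and input, as in the examples above.
    start = time.perf_counter()
    client.run({
        "app": "my-app",
        "input": {"prompt": "hello"},
        "infra": "private"
    })
    return time.perf_counter() - start

# First run: the engine pulls app code, models, and container images.
cold = timed_run()
# Second run: those are cached, so mostly inference time remains.
warm = timed_run()
print(f"cold start: {cold:.1f}s, warm: {warm:.1f}s")
```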
That's it!
You now know how to use inference.sh.