Workers are computers that run your tasks.
Two types
| Type | Description |
|---|---|
| Cloud | Managed by inference.sh |
| Private | Your own hardware |
Cloud workers
- Pay-per-use
- Auto-scaling
- No setup required
Good for getting started and variable workloads.
Private workers
- Run on your hardware
- Managed through the inference.sh Engine
- Data stays on your network
Good for data privacy, dedicated resources, or cost control.
How tasks find workers
1. You run an app
2. Task goes to the queue
3. A worker picks it up
4. Worker runs it and returns results
When you run an app, you can choose whether the task goes to a cloud worker or a private worker.
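The flow above can be pictured with a small in-memory sketch. This is only an illustration of the queue-and-worker idea, not how inference.sh implements it: the names `Task`, `Worker`, `TaskQueue`, `submit`, and `pick_up` are hypothetical, and the assumption that a worker only takes tasks matching its own type (cloud or private) is made here for the sake of the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    """A unit of work created when you run an app (hypothetical shape)."""
    app: str
    payload: dict
    target: str  # "cloud" or "private", chosen when running


class Worker:
    """A machine that runs tasks; `kind` is "cloud" or "private"."""

    def __init__(self, name: str, kind: str, handler: Callable[[Task], str]):
        self.name = name
        self.kind = kind
        self.handler = handler

    def run(self, task: Task) -> str:
        # Run the task and return its result.
        return self.handler(task)


class TaskQueue:
    """First-in, first-out queue that workers pull tasks from."""

    def __init__(self) -> None:
        self._tasks: List[Task] = []

    def submit(self, task: Task) -> None:
        # Step 2: the task goes to the queue.
        self._tasks.append(task)

    def pick_up(self, worker: Worker) -> Optional[Task]:
        # Step 3: hand the worker the oldest task aimed at its type.
        for index, task in enumerate(self._tasks):
            if task.target == worker.kind:
                return self._tasks.pop(index)
        return None  # nothing queued for this type of worker


if __name__ == "__main__":
    queue = TaskQueue()
    # Step 1: you run an app, choosing cloud or private.
    queue.submit(Task("demo-app", {"prompt": "hi"}, target="private"))
    queue.submit(Task("demo-app", {"prompt": "hello"}, target="cloud"))

    cloud_worker = Worker("cloud-1", "cloud", lambda t: f"cloud ran {t.app}")
    task = queue.pick_up(cloud_worker)
    if task is not None:
        # Step 4: the worker runs the task and returns the result.
        print(cloud_worker.run(task))
```

In this sketch the cloud worker skips the private task at the front of the queue and takes the cloud task instead, which mirrors the idea that the cloud-or-private choice made at run time decides which workers can pick the task up.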
Engines
An engine manages workers on your hardware.
Install it on your server, and your GPUs become available for tasks.
Next
Now you know the concepts! Let's build.