
Instance Types

Browse GPU instance types and hourly pricing across cloud providers. This endpoint powers comparison views in the workspace and is available without authentication.


List instance types

GET /instances/types

Returns GPU instance types aggregated from cloud providers (via Shadeform). Results are filtered to common GPU families (for example A100, H100, L40S, RTX series).

No API key is required. Responses are cached for one hour (Cache-Control: public, max-age=3600).
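Because responses carry a one-hour `max-age`, a client that polls this endpoint can reuse a previous result instead of refetching. The following is a minimal sketch of that idea, not part of the API: `parse_max_age` and `TTLCache` are hypothetical names, and the cache is an in-memory illustration only.

```python
import time


def parse_max_age(cache_control):
    """Return the max-age value (seconds) from a Cache-Control header, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


class TTLCache:
    """Tiny in-memory cache whose entries expire after max_age seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def put(self, key, value, max_age):
        self._entries[key] = (self._clock() + max_age, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it so the caller refetches.
            del self._entries[key]
            return None
        return value


ttl = parse_max_age("public, max-age=3600")
cache = TTLCache()
cache.put("/instances/types?gpu_type=H100", [], ttl)
```

Keying the cache on the full request URL keeps differently filtered queries separate.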

Query parameters

| Parameter | Type | Description |
| --- | --- | --- |
| gpu_type | string | Filter by GPU model name (for example H100, A100) |
| num_gpus | string | Filter by GPU count (for example 1, 8) |
| cloud | string | Filter by provider key (for example aws, runpod, lambdalabs) |
| region | string | Filter by region code |
| available | string | When true, return only types with at least one available region |
| sort | string | Sort order (passed through to the provider catalog) |
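As an illustration of how these filters combine into a single request, the sketch below builds the query string with Python's standard library. The base URL is the one used in the curl examples on this page; the helper name `build_instance_types_url` is hypothetical, and lowercasing booleans is an assumption that mirrors the `available=true` form shown in those examples.

```python
from urllib.parse import urlencode

# Base URL as it appears in the curl examples on this page.
BASE_URL = "https://api.inference.sh"


def build_instance_types_url(**filters):
    """Build a GET /instances/types URL from optional filters.

    Filters set to None are skipped. Booleans are sent as lowercase
    "true"/"false" (assumption); all other values are stringified,
    since every parameter is documented as a string.
    """
    params = {}
    for name, value in filters.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[name] = str(value)
    query = urlencode(params)
    url = f"{BASE_URL}/instances/types"
    return f"{url}?{query}" if query else url


print(build_instance_types_url(gpu_type="H100", num_gpus=8, available=True))
```

Omitting every filter yields the bare endpoint URL, which requests the unfiltered catalog.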

Response

With X-API-Version: 2, the body is a JSON array of instance type objects. Without that header, the same array is nested under data in the legacy envelope (success, status, data).

json
[
  {
    "id": "runpod.NVIDIA H100 80GB HBM3",
    "cloud": "runpod",
    "cloud_logo_url": "https://cloud.inference.sh/logos/runpod.com.png",
    "region": "us-east-1",
    "shade_instance_type": "NVIDIA H100 80GB HBM3",
    "cloud_instance_type": "gpu_1x_h100_sxm5",
    "deployment_type": "vm",
    "hourly_price": 299,
    "configuration": {
      "gpu_type": "H100",
      "gpu_manufacturer": "nvidia",
      "interconnect": "sxm5",
      "memory_in_gb": 180,
      "num_gpus": 1,
      "nvlink": false,
      "os_options": ["ubuntu22.04"],
      "storage_in_gb": 500,
      "vcpus": 26,
      "vram_per_gpu_in_gb": 80
    },
    "availability": [
      { "available": true, "region": "us-east-1" }
    ],
    "boot_time": {
      "average_seconds": 420,
      "updated_at": "2026-05-20T12:00:00Z",
      "sample_size": 120
    }
  }
]
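
A client that cannot be sure which response shape it will receive can normalize both. The sketch below assumes only what is described above: a bare array with `X-API-Version: 2`, or an envelope with the array under `data` otherwise. The function name `unwrap_instance_types` is hypothetical, and the `status` value in the sample envelope is a placeholder rather than a documented value.

```python
def unwrap_instance_types(body):
    """Return the list of instance type objects from either response shape."""
    if isinstance(body, list):
        # X-API-Version: 2 -> the body is already the array.
        return body
    if isinstance(body, dict) and isinstance(body.get("data"), list):
        # Legacy envelope -> the array is nested under "data".
        return body["data"]
    raise ValueError("unexpected response shape for /instances/types")


v2_body = [{"id": "runpod.NVIDIA H100 80GB HBM3"}]
# Placeholder envelope values; only the key names come from this page.
legacy_body = {"success": True, "status": 200, "data": v2_body}

assert unwrap_instance_types(v2_body) == unwrap_instance_types(legacy_body)
```

Raising on an unrecognized shape, rather than returning an empty list, keeps an error payload from being mistaken for an empty catalog.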

Fields

| Field | Description |
| --- | --- |
| id | Stable ID ({cloud}.{shade_instance_type}) |
| cloud | Provider key (aws, runpod, lambdalabs, …) |
| cloud_logo_url | Provider logo URL (https://cloud.inference.sh/logos/{domain}.png) |
| region | Default or selected region for this row |
| shade_instance_type | Provider catalog name |
| cloud_instance_type | Provider-specific SKU or machine type |
| deployment_type | vm, container, or baremetal |
| hourly_price | Price in cents per hour (divide by 100 for USD) |
| configuration | GPU and machine specs (see below) |
| availability | Per-region availability flags |
| boot_time | Optional average boot time stats |
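
To make the `hourly_price` and `availability` fields concrete, here is a small sketch that converts the price to USD and lists the regions flagged as available. Both helper names are hypothetical; the sample object reuses values from the response example above.

```python
def hourly_price_usd(instance_type):
    """Convert hourly_price (cents per hour) to US dollars per hour."""
    return instance_type["hourly_price"] / 100


def available_regions(instance_type):
    """Return the region codes whose availability flag is true."""
    return [
        entry["region"]
        for entry in instance_type.get("availability", [])
        if entry.get("available")
    ]


sample = {
    "hourly_price": 299,
    "availability": [{"available": True, "region": "us-east-1"}],
}
print(f"${hourly_price_usd(sample):.2f}/hour in {available_regions(sample)}")
```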

configuration

| Field | Description |
| --- | --- |
| gpu_type | GPU model name |
| gpu_manufacturer | Chip vendor (for example nvidia, amd) |
| interconnect | Interconnect type when applicable |
| num_gpus | Number of GPUs |
| nvlink | Whether NVLink is available on this type |
| vram_per_gpu_in_gb | VRAM per GPU |
| memory_in_gb | System RAM |
| vcpus | vCPU count |
| storage_in_gb | Attached storage |
| os_options | Supported OS images |
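
When comparing multi-GPU types, it can help to derive totals from these fields. The sketch below computes total VRAM and a per-GPU hourly price. It assumes that `hourly_price` covers the whole instance rather than a single GPU, which this page does not state explicitly; `summarize_configuration` is a hypothetical helper.

```python
def summarize_configuration(instance_type):
    """Derive comparison figures from one instance type object."""
    config = instance_type["configuration"]
    num_gpus = config["num_gpus"]
    return {
        "gpu": f'{num_gpus}x {config["gpu_type"]}',
        "total_vram_gb": num_gpus * config["vram_per_gpu_in_gb"],
        # Assumption: hourly_price (in cents) is for the whole instance.
        "usd_per_gpu_hour": instance_type["hourly_price"] / 100 / num_gpus,
    }


# Illustrative values only, not taken from the live catalog.
example = {
    "hourly_price": 2400,
    "configuration": {"gpu_type": "H100", "num_gpus": 8, "vram_per_gpu_in_gb": 80},
}
print(summarize_configuration(example))
```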

Example

bash
curl "https://api.inference.sh/instances/types?gpu_type=H100&available=true"

bash
curl -H "X-API-Version: 2" \
  "https://api.inference.sh/instances/types?cloud=runpod&num_gpus=1"
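
For callers working from a script rather than a shell, the second curl request could look roughly like this in Python, using only the standard library. This is a sketch: `make_request` and `fetch_instance_types` are hypothetical names, and the `Accept` header is an added convenience, not something this page requires.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base URL as it appears in the curl examples above.
API_ROOT = "https://api.inference.sh"


def make_request(params, api_version="2"):
    """Build (but do not send) a GET /instances/types request."""
    query = urlencode(params)
    url = f"{API_ROOT}/instances/types" + (f"?{query}" if query else "")
    headers = {"Accept": "application/json"}
    if api_version is not None:
        # With this header the body is a bare JSON array.
        headers["X-API-Version"] = api_version
    return Request(url, headers=headers)


def fetch_instance_types(params, timeout=10):
    """Send the request and decode the JSON body. No API key is needed."""
    with urlopen(make_request(params), timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request):
# items = fetch_instance_types({"cloud": "runpod", "num_gpus": "1"})
```

Separating request construction from sending makes the URL and headers easy to inspect without touching the network.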

Authenticated catalog (workspace)

When provisioning remote engines in the workspace, the authenticated endpoint GET /engines/types (requires the engines:read scope) returns a deduplicated catalog for engine setup flows: the cheapest available option for each GPU family.

Use GET /instances/types for the full public catalog with filters; use GET /engines/types inside authenticated integrations tied to engine provisioning.
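
To illustrate what "one cheapest available option per GPU family" means, the sketch below applies a similar reduction client-side to results from the public catalog. It is not the server's implementation: it assumes that a GPU family corresponds to `configuration.gpu_type` and that "available" means at least one region is flagged available. `cheapest_per_gpu_family` is a hypothetical helper.

```python
def cheapest_per_gpu_family(instance_types):
    """Keep the lowest-priced available instance type for each gpu_type."""
    best = {}
    for item in instance_types:
        # Skip types with no region currently flagged as available.
        if not any(a.get("available") for a in item.get("availability", [])):
            continue
        # Assumption: the GPU family is the configuration.gpu_type value.
        family = item["configuration"]["gpu_type"]
        current = best.get(family)
        if current is None or item["hourly_price"] < current["hourly_price"]:
            best[family] = item
    return best
```

Integrations tied to engine provisioning should still call GET /engines/types, since the server-side grouping rules may differ from this approximation.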


Workers concept — cloud vs private workers for running apps
Private engine configuration — GPU workers on your own hardware
REST overview — base URL and authentication
