Most developer tools ask you to click through a dashboard, read docs for twenty minutes, and then figure out authentication before you run anything. Belt takes a different approach. Install it, log in, and run an AI app in under a minute. It is a single CLI that connects you to the entire inference.sh platform - apps, skills, MCP servers, and knowledge - from your terminal.
This guide covers what Belt does, how to get started, and the commands you will actually use day to day.
Installing Belt
Three options depending on your preference:
    curl -fsSL https://cli.inference.sh | sh
Or install through npm:
    npm i -g @inference/belt
Or run it without installing:
    npx @inference/belt
After installation, Belt registers three aliases: belt, infsh, and inferencesh. Use whichever feels natural. The current version is v1.9.1.
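Because three names are registered, a script can look for whichever one is present instead of hard-coding one. A minimal sketch in POSIX shell; `find_belt` is a hypothetical helper name, and only the three alias names come from this guide:

```shell
# find_belt: print the first Belt alias that resolves on this machine.
# (Hypothetical helper, not part of Belt itself.)
find_belt() {
  for name in belt infsh inferencesh; do
    if command -v "$name" >/dev/null 2>&1; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  echo "Belt is not installed or not on PATH" >&2
  return 1
}
```

A script could then start with `BELT=$(find_belt) || exit 1` and call `"$BELT" me` afterwards.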
Once installed, log in to connect your inference.sh account:
    belt login
Verify everything is set up:
    belt me
That is it. No config files, no environment variables to export, no YAML to write.
Running Your First App
The core action most people want is simple: run an AI model and get output. Belt makes this a single command.
    belt app run pruna/flux-dev -i '{"prompt": "a cat"}'
Belt streams the task status in real time so you can watch progress:
    Running > Queued > dispatched > preparing > serving > running > uploading > Completed in 7.9s
When the task finishes, you get JSON output with your results - in this case, an image URL you can open directly. The whole cycle from command to output takes seconds, not minutes.
This pattern works across every app on the platform. Image generation, video creation, language models, search tools, audio synthesis - same command structure, same streaming output.
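Hand-writing the -i JSON gets awkward once a prompt contains quotes. The sketch below wraps the same `belt app run ... -i` form in a small POSIX shell function that escapes the prompt first. `run_prompt` is a hypothetical helper, and it assumes the app takes a single `prompt` field, as pruna/flux-dev does in the example above:

```shell
# run_prompt APP PROMPT: run an app whose input has a "prompt" field.
# (Hypothetical helper; it only wraps the `belt app run APP -i JSON` form.)
run_prompt() {
  app=$1
  # Escape backslashes first, then double quotes, so the prompt is a valid
  # JSON string. Newlines and control characters are not handled here.
  escaped=$(printf '%s' "$2" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  belt app run "$app" -i "{\"prompt\": \"$escaped\"}"
}
```

For example, `run_prompt pruna/flux-dev 'a "studio" photo of a cat'` passes the inner quotes through safely. For anything beyond a single string field, a real JSON tool is a better fit than sed.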
Exploring Apps
Before running an app, you might want to know what is available. Belt gives you several ways to find and inspect apps.
Searching for Apps
    belt app search "video"
This returns a list of apps matching your query. If you want to browse everything:
    belt app list
Inspecting an App
Before you send a request, you probably want to know what inputs an app accepts, what it outputs, and what it costs. The get command shows you everything:
    belt app get pruna/flux-dev
This prints the full input schema - every field the app accepts, its type, whether it is required, and default values. For pruna/flux-dev, you will see fields like prompt, aspect_ratio, guidance, and others. It also shows the output schema so you know the shape of what comes back. Pricing information is included too, so there are no surprises.
This is one of the most useful commands in Belt. Instead of hunting through documentation, you get the complete interface definition right in your terminal. Pipe it to a file, reference it while building, or just glance at it before writing your -i input JSON.
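Saving that output next to your project can be sketched as below. `save_app_spec` is a hypothetical helper, and the file naming is an illustrative choice. The sketch makes no assumption about the format `belt app get` prints; it just captures whatever comes back:

```shell
# save_app_spec APP [DIR]: write the output of `belt app get APP` to a file.
# (Hypothetical helper; the .txt name is an illustrative choice.)
save_app_spec() {
  spec_dir=${2:-.}
  # pruna/flux-dev -> pruna_flux-dev, so the app name works as a file name
  spec_file=$spec_dir/$(printf '%s' "$1" | tr '/' '_').txt
  belt app get "$1" > "$spec_file" && printf '%s\n' "$spec_file"
}
```

Running `save_app_spec pruna/flux-dev docs` would leave the spec in `docs/pruna_flux-dev.txt`, assuming the `docs` directory already exists.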
Building and Deploying Apps
Belt is not just for consuming apps. If you are building on inference.sh, the CLI handles your full development workflow.
Starting a New App
    belt app init
This scaffolds a new app project with the right structure and files. You get a working template that you can modify and test immediately.
Testing Locally
    belt app test
Run your app locally to verify it works before pushing anything to the platform. Catch problems early, iterate fast.
Deploying
    belt app deploy
One command to push your app live. Belt handles packaging, uploading, and registration. Your app becomes available on the platform for anyone (or just your team) to use.
The workflow is tight: init, write your logic, test, deploy. No separate CI pipeline to configure, no container registry to manage, no infrastructure to provision.
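The test and deploy steps can be chained so that a failing test blocks the deploy. This is a sketch, and it rests on one assumption this guide does not confirm: that `belt app test` exits with a non-zero status when the test fails. `test_then_deploy` is a hypothetical helper name:

```shell
# test_then_deploy: deploy only when the local test run succeeds.
# Assumes `belt app test` exits non-zero on failure (not confirmed here).
test_then_deploy() {
  if belt app test; then
    belt app deploy
  else
    echo "Local test failed; skipping deploy" >&2
    return 1
  fi
}
```

Run it from the app's project directory after editing your logic.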
Skills
Skills are reusable agent capabilities - self-contained modules that give agents specific abilities. Belt manages the full lifecycle.
Using a Skill
    belt skill use inferencesh/web-search
This fetches a skill from the inference.sh store or directly from GitHub. Once fetched, the skill is available for your agents to use. Skills package up prompts, tool definitions, and configuration into a single portable unit.
Finding Skills
    belt skill search
Browse available skills to find capabilities you need. Web search, code analysis, data extraction - the store grows as the community builds.
Publishing Your Own
    belt skill upload
Belt scans your skill definition, validates it, and publishes it to the store. Other developers can then pull it into their own agent setups.
Installing Locally
    belt skill install
Pull a skill down to your machine for local use or modification. This is useful when you want to customize an existing skill or study how it works before building your own.
MCP Servers
MCP (Model Context Protocol) lets agents connect to external services through a standardized interface. Belt acts as your control plane for managing these connections.
Listing Connected Servers
    belt mcp list
This shows all your connected MCP servers. You might see services like Linear, Notion, Slack, and others depending on what you have set up.
Connecting a New Server
    belt mcp connect linear
One command to establish a connection. Belt handles the authentication flow and stores the connection details.
Inspecting Available Tools
    belt mcp tools linear
See exactly what a connected server offers. Linear, for example, exposes 35 tools covering issues, projects, comments, labels, and more. This command lists every tool with its description so you know what operations are available.
Running MCP Tools
    belt mcp run
Execute MCP tools directly from your terminal. This is useful for testing integrations, running one-off operations, or scripting workflows that combine multiple services.
The MCP commands turn Belt into a hub for all your agent integrations. Instead of configuring each service separately in every agent framework, you manage connections once through Belt and they are available everywhere.
Knowledge Management
Knowledge in inference.sh gives your agents access to persistent information - documents, data, reference material. Belt manages this from the command line.
    belt knowledge list
See what knowledge bases are available.
    belt knowledge search
Search across your knowledge stores to find specific information.
    belt knowledge upload
Add new documents or data to your knowledge bases. This is how you feed information to your agents without stuffing everything into prompts.
Knowledge management through the CLI means you can script uploads, automate index updates, and integrate knowledge operations into your existing workflows.
Using Belt with AI Coding Tools
Belt works with Claude Code, Cursor, Cline, Windsurf, and Codex. This is where things get interesting - Belt becomes the bridge between your coding environment and the inference.sh platform.
Your AI coding assistant can call Belt commands to run apps, fetch skills, connect to MCP servers, and manage knowledge. Instead of the coding tool needing native integrations with every service, it uses Belt as a universal interface.
A practical example: you are working in Claude Code and need to generate an image for a UI mockup. Your agent runs belt app run pruna/flux-dev -i '{"prompt": "dashboard with dark theme"}', gets back an image URL, and continues working. No context switching, no browser tabs, no copy-pasting API keys.
The same pattern applies to any capability on the platform. Need web search results? Run a search skill through Belt. Need to create a Linear ticket? Use the MCP connection. Need to check a knowledge base? Search through Belt. Everything stays in the terminal, in the flow.
Keeping Belt Updated
Belt updates itself:
    belt update
Check your current version:
    belt version
Updates ship frequently. Running the latest version ensures you have access to new apps, bug fixes, and performance improvements.
Practical Patterns
Here are some patterns that work well once you have Belt set up.
Scripting with Belt
Because Belt is a CLI, it composes with standard Unix tools. Pipe output to jq for JSON processing. Chain commands in shell scripts. Use it in CI/CD pipelines. Wrap it in Makefiles.
    belt app run pruna/flux-dev -i '{"prompt": "a mountain lake at sunset"}' | jq '.output'
Exploring Before Building
When starting a new project, use belt app search and belt app get to survey what already exists. You might find an app that does exactly what you need, saving you from building from scratch. The get command is especially valuable here - it gives you the complete API surface without leaving your terminal.
Testing MCP Integrations
Before wiring MCP tools into an agent, test them manually through Belt. Run belt mcp tools <server> to see what is available, then belt mcp run to execute specific tools and verify the output format. This is much faster than debugging inside an agent loop.
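The first half of that check is easy to script. The sketch below greps the tool listing for a name before anything tries to call it. It assumes `belt mcp tools` prints tool names as plain text, which this guide implies but does not spell out. `mcp_has_tool` is a hypothetical helper:

```shell
# mcp_has_tool SERVER TEXT: succeed if the server's tool listing mentions TEXT.
# Assumes `belt mcp tools` prints names as plain text (format not confirmed here).
mcp_has_tool() {
  # -F: fixed string, -i: ignore case, -q: exit status only
  belt mcp tools "$1" | grep -qiF -- "$2"
}
```

For example, `mcp_has_tool linear issue || echo "no issue tools found"` gives a quick sanity check before an agent depends on the connection.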
The Big Picture
Belt ties together everything on inference.sh into a single command-line interface. Apps for running AI models. Skills for packaging agent capabilities. MCP for connecting external services. Knowledge for persistent information. Account management for authentication and configuration.
Each piece works independently, but they are strongest together. An agent that can run apps, use skills, query knowledge, and operate MCP tools through one CLI has access to a broad set of capabilities without complex integration work.
The belt CLI is the entry point. One install, one login, and you have the full platform in your terminal.
FAQ
Do I need an inference.sh account to use Belt?
Yes. Run belt login after installation to authenticate. Belt connects to the inference.sh platform, so an account is required to run apps, manage skills, and use MCP connections. Creating an account is free and takes a few seconds.
Can I use Belt in CI/CD pipelines?
Belt is a standard CLI tool, so it works anywhere you can run shell commands. Use it in GitHub Actions, GitLab CI, or any automation pipeline. Authenticate using environment variables or a pre-configured login, and script Belt commands just like any other CLI tool.
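In a pipeline it helps to fail fast when the runner is not authenticated. A minimal sketch, assuming `belt me` exits with a non-zero status when there is no valid login (this guide does not confirm that behavior). `require_belt_login` is a hypothetical helper:

```shell
# require_belt_login: stop early when the runner is not authenticated.
# Assumes `belt me` exits non-zero without a valid login (not confirmed here).
require_belt_login() {
  if ! belt me >/dev/null 2>&1; then
    echo "Belt is not logged in; run 'belt login' or configure credentials" >&2
    return 1
  fi
}
```

Calling `require_belt_login || exit 1` as the first step of a job keeps later Belt commands from failing with less obvious errors.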
What is the difference between apps, skills, and MCP servers?
Apps are AI models and tools you run - they take input and produce output (images, text, video, audio). Skills are reusable agent capabilities that package prompts, tools, and configuration into portable modules. MCP servers are connections to external services (like Linear, Notion, or Slack) that expose their functionality through a standardized protocol. Belt manages all three from the same interface.