realtime streaming
Stream Agent Execution to Your Frontend
Subscribe to chat sessions and receive live updates as your agent thinks, calls tools, and generates responses.
why streaming matters
AI agents take time. They think, they call tools, they process data. Without streaming, users stare at a spinner wondering if anything is happening.
With inference.sh, every update streams to your frontend in real-time. Users see the agent working—what tools it's calling, what it's thinking, what it's producing.
the agent-sdk
Our React SDK handles streaming automatically. Wrap your chat UI in a provider and use hooks to access state and actions.
import { AgentChatProvider, useAgentChat, useAgentActions } from '@inference/agent-sdk';

function MyChat() {
  return (
    <AgentChatProvider agentConfig={{ core_app_ref: 'infsh/claude-sonnet-4@abc123' }}>
      <ChatUI />
    </AgentChatProvider>
  );
}

function ChatUI() {
  const { messages, status, isGenerating } = useAgentChat();
  const { sendMessage, stopGeneration } = useAgentActions();
  // messages update in real-time as the agent responds
  // status shows: 'idle' | 'connecting' | 'streaming' | 'error'
}

pre-built chat component
Don't want to build your own UI? Use the pre-composed AgentChat component.
import { AgentChat } from '@inference/agent-sdk';

<AgentChat
  agentConfig={{
    core_app_ref: 'infsh/claude-opus-45@abc123',
    name: 'My Assistant',
  }}
/>

what streams
Every part of the agent's execution is streamed (a rendering sketch follows the list):
- messages. User messages and assistant responses as they're generated.
- tool invocations. See which tools the agent is calling and their results.
- status updates. Know when the agent is thinking, waiting for approval, or done.
- llm tokens. Stream text as it's generated, not just when it's complete.
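To give a rough idea of how these streamed parts come together in a custom UI, here is a minimal rendering sketch. The message fields it reads (id, role, content, toolInvocations) are assumptions for illustration, not the documented schema; check the SDK's exported types for the actual shape.

import { useAgentChat } from '@inference/agent-sdk';

function StreamedMessages() {
  const { messages, status, isGenerating } = useAgentChat();
  return (
    <div>
      {/* status updates: idle / connecting / streaming / error */}
      <p>{status}{isGenerating ? ' (generating…)' : ''}</p>
      {messages.map((message: any) => (
        <div key={message.id}>
          {/* messages and llm tokens: text fills in as it streams */}
          <strong>{message.role}</strong>: {message.content}
          {/* tool invocations: which tools ran and what they returned (field name assumed) */}
          {message.toolInvocations?.map((call: any) => (
            <pre key={call.id}>{call.name}: {JSON.stringify(call.result)}</pre>
          ))}
        </div>
      ))}
    </div>
  );
}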
client tools
Define tools that execute in the browser. The SDK automatically handles invocations and submits results back to the agent.
import { tool, string } from '@inference/agent-sdk';

const scanUI = tool('scan_ui', 'Scan the current page')
  .input({ selector: string('CSS selector to scan') })
  .handler(async ({ selector }) => {
    const element = document.querySelector(selector);
    return JSON.stringify({ found: !!element });
  });

<AgentChat agentConfig={{ tools: [scanUI] }} />

under the hood
Streaming uses Server-Sent Events (SSE). A single connection per chat session receives typed events for both chat state and message updates. Auto-reconnect handles network interruptions.
For non-React environments, you can use the StreamManager class directly or connect to the SSE endpoint with any HTTP client.
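As a rough sketch of the plain-HTTP option, any SSE-capable client can subscribe to a session's event stream. The endpoint URL below is a placeholder rather than the documented path; typed events arrive as JSON payloads on the open connection.

// Non-React sketch using the browser's built-in EventSource (placeholder URL).
const sessionId = 'your-session-id'; // placeholder
const stream = new EventSource(`https://api.example.com/v1/chat-sessions/${sessionId}/events`);

stream.onmessage = (event) => {
  // Each message is a typed event covering chat state or message updates.
  const update = JSON.parse(event.data);
  console.log(update);
};

stream.onerror = () => {
  // EventSource retries automatically, matching the SDK's auto-reconnect behavior.
};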
get started
Install the SDK and start building streaming agent interfaces.
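A minimal quick start, assuming the package name shown in the imports above and a #root element to mount into, might look like this:

// npm install @inference/agent-sdk
import { createRoot } from 'react-dom/client';
import { AgentChat } from '@inference/agent-sdk';

createRoot(document.getElementById('root')!).render(
  <AgentChat agentConfig={{ core_app_ref: 'infsh/claude-sonnet-4@abc123', name: 'My Assistant' }} />
);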
durable execution
event-driven, not long-running. if a tool fails, it doesn't crash your agent loop. state persists across invocations.
tool orchestration
150+ apps as tools via one API. structured execution with approvals when needed. full visibility into what ran.
observability
real-time streaming and logs for every action. see exactly what your agent is doing.
pay-per-execution
no idle costs while tools run or waiting for results. you're not paying to keep a process alive.
plug any model, swap providers without changing code