
Agent UX Patterns That Work

Users interacting with agents have different needs than users interacting with traditional software. Agents think, which takes time. Agents take actions, which carry consequences. Agents make mistakes, which require recovery. The user experience patterns that work for conventional interfaces often fail for agent interfaces. Understanding what patterns actually help users interact with agents successfully - and what patterns create frustration - makes the difference between agents that feel reliable and agents that feel broken.

The Fundamental Challenges

Agent interactions present challenges that traditional interfaces do not.

Unpredictable timing means users cannot know how long they will wait. A simple question might resolve in seconds. A complex task might take minutes. Unlike a progress bar moving toward a known completion, agent work has no predetermined endpoint. Users need feedback that helps them understand whether waiting is normal.

Opaque reasoning means users cannot see why things happen. The agent decides what tools to use, what information to gather, what approach to take. These decisions happen invisibly unless the interface reveals them. Users faced with unexplained behavior lose confidence.

Consequential actions mean mistakes matter. When an agent sends an email, deletes data, or makes a purchase, real things happen in the world. Users need confidence that they understand and control what agents do on their behalf.

Imperfect reliability means things will go wrong. Agents misunderstand requests, call wrong tools, produce incorrect outputs. Users need ways to catch problems and recover from them.

These challenges shape what UX patterns work. Patterns designed for deterministic, instant, transparent systems need adaptation for probabilistic, slow, opaque ones.

Pattern: Streaming Progress

When agents work, show progress continuously rather than presenting results only after completion.

Why it works: Streaming transforms dead time into engaged time. Users watching progress understand that work is happening. The same duration feels shorter when filled with activity than when empty. Streaming also reveals problems early - if an agent takes a wrong turn, users see it immediately rather than waiting until the end.

How to implement: Stream agent reasoning as it generates. Show tool calls starting and completing. Display intermediate outputs as they become available. Use visual indicators that distinguish in-progress content from completed content.

What to stream: The agent's stated thinking gives users insight into approach. Tool invocations show what information is being gathered or what actions are being taken. Partial results let users start processing output before completion.

What not to stream: Internal implementation details that users cannot interpret meaningfully. Extremely rapid updates that create visual noise. Sensitive information that should not be visible until complete.

The goal is keeping users informed without overwhelming them. A status update every few seconds during active work maintains engagement without creating distraction.
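The throttling idea above can be sketched in a few lines. This is a minimal illustration, not a real SDK API: the AgentEvent shape and makeStatusRenderer are hypothetical names, and the rule it encodes is the one from the text - always surface tool boundaries, but drop partial updates that arrive faster than a minimum interval.

```typescript
// Hypothetical event shape; real agent SDKs differ.
type AgentEvent =
  | { kind: "thinking" }
  | { kind: "tool_start"; tool: string }
  | { kind: "tool_done"; tool: string }
  | { kind: "partial"; text: string };

// Render a one-line status per event, skipping updates that arrive
// faster than minIntervalMs to avoid visual noise. `now` is injected
// so the behavior is testable.
function makeStatusRenderer(minIntervalMs: number, now: () => number) {
  let last = -Infinity;
  return (event: AgentEvent): string | null => {
    const t = now();
    // Tool boundaries always render; they carry real information.
    const important = event.kind === "tool_start" || event.kind === "tool_done";
    if (!important && t - last < minIntervalMs) return null; // throttled
    last = t;
    switch (event.kind) {
      case "thinking": return "Thinking...";
      case "tool_start": return `Running ${event.tool}...`;
      case "tool_done": return `Finished ${event.tool}`;
      case "partial": return event.text;
    }
  };
}
```

Injecting the clock keeps the throttle deterministic in tests; in production you would pass Date.now.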

Pattern: Explicit Status Indicators

Show clearly what state the agent is in at every moment.

Why it works: Ambiguity causes anxiety. When users cannot tell whether an agent is thinking, waiting, stuck, or done, they lose confidence. Explicit status removes this ambiguity. Users know what to expect and when to expect it.

States to indicate: Thinking (agent is reasoning), executing (tool is running), waiting (paused for input or approval), completed (ready for next input), and error (something went wrong). Each state should have distinct visual representation.

Transitions matter: Moving between states should be visible. If the agent shifts from thinking to executing a tool, that transition should be clear. State changes are information users can act on.

Duration context: When possible, indicate whether current state is expected to be brief or extended. "Searching..." sets different expectations than "Running analysis (may take a minute)...". Context helps users calibrate their waiting.

Status indicators work best when they are always present and always accurate. An indicator that sometimes disappears or lags behind actual state undermines the confidence it is meant to build.
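One way to keep indicators always present and always accurate is to derive the label from a single state value, so the UI can never show a stale or missing status. The state names below mirror the five states listed above; the AgentState type and statusLabel function are illustrative, not a standard API.

```typescript
// One explicit state per moment, matching the five states in the text.
type AgentState =
  | { phase: "thinking" }
  | { phase: "executing"; tool: string; expectedMs?: number }
  | { phase: "waiting"; reason: string }
  | { phase: "completed" }
  | { phase: "error"; message: string };

// Map each state to an indicator label, adding duration context when
// we can estimate it (the "may take a minute" idea from the text).
function statusLabel(state: AgentState): string {
  switch (state.phase) {
    case "thinking":
      return "Thinking...";
    case "executing":
      return state.expectedMs && state.expectedMs > 10_000
        ? `Running ${state.tool} (may take about ${Math.round(state.expectedMs / 1000)}s)...`
        : `Running ${state.tool}...`;
    case "waiting":
      return `Waiting: ${state.reason}`;
    case "completed":
      return "Done";
    case "error":
      return `Something went wrong: ${state.message}`;
  }
}
```

Because the switch is exhaustive over the union, adding a new state without a label becomes a compile error rather than a silently blank indicator.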

Pattern: Action Transparency

Show users what the agent intends to do before it does consequential things.

Why it works: Agents that take action without warning feel unpredictable and uncontrollable. Agents that show intent before acting let users verify and correct. This builds trust through accountability rather than requiring blind faith.

Levels of transparency: For low-consequence actions, showing what happened is sufficient - "Searched for quarterly reports." For medium-consequence actions, showing intent before execution helps - "I'll send this email to [email protected]" with a brief pause. For high-consequence actions, explicit approval gates stop execution until the user confirms.

What to show: The specific action, not just the category. "Sending email" is less useful than showing the recipient, subject, and preview of content. Specificity enables meaningful review.

After the fact: Even for actions that proceed without approval, showing what happened maintains transparency. Users should be able to see a complete history of agent actions, not just final outputs.

Transparency scales with consequence. Not everything needs approval, but everything that happens should be visible.
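The three-tier policy above can be expressed as a small routing function. The names here (ActionPlan, Disposition, the tier labels) are made up for illustration; the point is that the consequence level, not the action type, selects the transparency mode.

```typescript
// Consequence tiers from the text; names are illustrative.
type Consequence = "low" | "medium" | "high";

interface ActionPlan {
  // Specific, reviewable detail - recipient, subject, preview -
  // not just a category like "send email".
  description: string;
  consequence: Consequence;
}

type Disposition =
  | { mode: "execute_then_report" }                     // low: show what happened
  | { mode: "announce_then_execute"; pauseMs: number }  // medium: show intent, brief pause
  | { mode: "require_approval" };                       // high: block until confirmed

function disposition(plan: ActionPlan): Disposition {
  switch (plan.consequence) {
    case "low":
      return { mode: "execute_then_report" };
    case "medium":
      return { mode: "announce_then_execute", pauseMs: 3000 };
    case "high":
      return { mode: "require_approval" };
  }
}
```

Whatever mode is chosen, the executed action still lands in a visible history, which is what keeps the "everything that happens should be visible" rule intact.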

Pattern: Graceful Error Recovery

When things go wrong, help users understand and recover rather than just reporting failure.

Why it works: Errors are inevitable with agents. What matters is how errors feel. A cryptic error message leaves users stuck. A helpful error message guides them forward. Good error handling turns potential frustration into manageable interruption.

Error categories: Distinguish between different kinds of problems. A misunderstood request suggests rephrasing. A tool failure might resolve with retry. A capability limitation means trying a different approach. Different problems have different solutions.

Recovery guidance: Suggest what to try next. "I couldn't access that file. Would you like to upload it directly?" is more helpful than "File access failed." Give users actionable paths forward.

Partial success: When some parts of a task succeed while others fail, preserve the successes. Do not throw away completed work because a later step failed. Show what was accomplished and what remains.

Retry options: When transient failures are possible, offer easy retry. A button to "Try again" is better than asking users to retype their request.

The goal is keeping users moving forward rather than leaving them stuck at errors.
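A sketch of the category-to-guidance mapping described above, assuming three error kinds that match the text. The AgentError and Recovery shapes are hypothetical; the useful property is that every category produces an actionable message and an explicit decision about offering one-click retry.

```typescript
// Error categories from the text, each with a distinct recovery path.
type AgentError =
  | { kind: "misunderstood" }
  | { kind: "tool_failure"; transient: boolean }
  | { kind: "capability_limit" };

interface Recovery {
  message: string;      // user-facing and actionable, not a raw error code
  offerRetry: boolean;  // easy "Try again" only where retry can help
}

function recoveryFor(err: AgentError): Recovery {
  switch (err.kind) {
    case "misunderstood":
      return {
        message: "I may have misread your request. Could you rephrase it?",
        offerRetry: false,
      };
    case "tool_failure":
      return err.transient
        ? { message: "That tool failed temporarily. Want me to try again?", offerRetry: true }
        : { message: "I couldn't complete that step. Would you like a different approach?", offerRetry: false };
    case "capability_limit":
      return {
        message: "That's beyond what I can do directly. I can suggest an alternative.",
        offerRetry: false,
      };
  }
}
```

A fuller version would also carry the partial results that succeeded before the failure, so completed work is shown rather than discarded.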

Pattern: Controllable Autonomy

Let users adjust how much the agent does independently versus how much it checks in.

Why it works: Different tasks warrant different autonomy levels. A quick question does not need approval gates. A complex task with real consequences benefits from checkpoints. Letting users control this calibration matches agent behavior to user comfort.

Autonomy settings: Some interfaces let users specify how proactive the agent should be. "Just answer my question" versus "Take action if needed" versus "Ask before doing anything." These settings shape agent behavior.

Task-level adjustment: Within a conversation, users might want to tighten or loosen control. "Go ahead with that" grants permission for immediate action. "Wait, let me review first" adds a checkpoint. Responsive agents adapt to these cues.

Default calibration: For most users, moderate autonomy works best. Agents that do too little feel unhelpful. Agents that do too much feel unpredictable. Start in the middle and let users adjust.

The pattern recognizes that appropriate autonomy varies by context, user, and task. Flexibility serves better than fixed behavior.
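The calibration logic above fits in one decision function. The autonomy levels and cue names are invented for this sketch; the rule it encodes is that an explicit in-conversation cue ("go ahead", "let me review") overrides the standing setting.

```typescript
// Three autonomy levels roughly matching the settings in the text.
type Autonomy = "answer_only" | "act_with_checkpoints" | "act_freely";

// Decide whether a proposed action pauses for the user. Mid-conversation
// cues take precedence over the standing autonomy setting.
function shouldPause(
  autonomy: Autonomy,
  actionIsConsequential: boolean,
  userCue?: "go_ahead" | "review_first",
): boolean {
  if (userCue === "go_ahead") return false;   // permission granted for this action
  if (userCue === "review_first") return true; // user added a checkpoint
  switch (autonomy) {
    case "answer_only":
      return true;                     // never act without asking
    case "act_with_checkpoints":
      return actionIsConsequential;    // the moderate default
    case "act_freely":
      return false;
  }
}
```

Starting users at "act_with_checkpoints" matches the recommendation to begin in the middle and let them adjust in either direction.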

Pattern: Conversation Continuity

Maintain context across interactions so users do not repeat themselves.

Why it works: Every time a user must re-explain something, friction increases. Agents that remember what was discussed, what was learned, and what the user prefers feel more capable and easier to work with.

Conversation history: Recent exchanges should inform current responses without users explicitly referencing them. "Do the same thing for the other file" should work when context makes clear what "the same thing" means.

Learned preferences: Patterns in user behavior can inform agent defaults. If a user always wants summaries in bullet points, the agent might learn to default to that format.

Explicit memory: For important information, explicit storage ensures persistence. "Remember that my team meets on Tuesdays" creates lasting context that the agent can reference.

Memory visibility: Users should be able to see what the agent remembers about them. This transparency supports both trust and correction when the agent has something wrong.

Continuity makes interactions feel like an ongoing relationship rather than disconnected transactions.
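The explicit-memory and memory-visibility points can be sketched together as a tiny store. AgentMemory is a hypothetical class, not a platform API; the design point is that listing and correcting memories are first-class operations, not afterthoughts.

```typescript
// Minimal explicit-memory sketch: store, inspect, and correct.
class AgentMemory {
  private facts = new Map<string, string>();

  // "Remember that my team meets on Tuesdays"
  remember(key: string, value: string): void {
    this.facts.set(key, value);
  }

  recall(key: string): string | undefined {
    return this.facts.get(key);
  }

  // Memory visibility: users can see everything the agent remembers.
  list(): Array<[string, string]> {
    return [...this.facts.entries()];
  }

  // ...and fix it when the agent has something wrong.
  forget(key: string): boolean {
    return this.facts.delete(key);
  }
}
```

A production version would persist across sessions and scope memories per user, but the interface - remember, list, forget - is the part that shapes the user experience.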

Anti-Pattern: Silent Processing

The opposite of streaming - showing nothing while the agent works.

Why it fails: Users interpret silence as failure. After a few seconds with no feedback, users start wondering if something is broken. They might retry, navigate away, or lose confidence even when the agent is working correctly.

The fix: Always show something during processing. A simple animated indicator is better than nothing, streaming content is better than an animation, and contextual status is better than a generic one.

Anti-Pattern: Unexplained Delays

Long waits without explanation of why.

Why it fails: A twenty-second wait might be completely normal for a complex search, but users do not know that. Without explanation, every delay feels like a problem.

The fix: Explain what is happening during delays. "Searching five databases..." explains why search takes time. "Analyzing document (this usually takes about 30 seconds)..." sets expectations. Context transforms confusing waits into understood processes.

Anti-Pattern: Hidden Failures

Errors that the agent handles internally without telling users.

Why it fails: When agents silently work around problems, users do not learn that problems occurred. If the workaround produces inferior results, users blame the agent for poor performance rather than understanding that the preferred path failed.

The fix: Report significant failures even when workarounds exist. "I couldn't access the primary source, so I'm using cached data from last week." Users can then decide if that is acceptable or if they want to address the underlying problem.

Anti-Pattern: Approval Fatigue

Requiring approval for everything, making the approval process meaningless.

Why it fails: When every action needs approval, users stop reading the details. They click approve habitually. The approval gates meant to catch problems become rubber stamps that catch nothing.

The fix: Reserve approval for genuinely consequential actions. Let routine operations proceed automatically. When approval is needed, it should be rare enough that users pay attention.

For teams building agent interfaces, inference.sh provides infrastructure that enables these patterns - streaming for real-time progress, observability for status tracking, approval gates for controlled actions, and durable state for continuity. The platform handles the mechanics so you can focus on creating experiences that work for your users.

Good agent UX is not about making agents seem perfect. It is about making agent behavior understandable, predictable in its unpredictability, and recoverable when things go wrong. Users who understand what is happening can work with agents effectively even when the agents are not perfect.

FAQ

How do I balance transparency with overwhelming users with information?

Layer information by importance. Surface the essential status that all users need - is the agent working, done, or stuck? Make details available on demand for users who want them - expand to see reasoning, click to view tool parameters. Use progressive disclosure rather than dumping everything into the primary interface. Most users most of the time need only high-level awareness. Power users investigating specific situations need access to details. Design serves both without forcing either to accept the other's preferred level.

Should error messages explain technical details or hide them?

Show users what they can act on, offer technical details for those who want them. The primary message should describe the situation in terms users understand and suggest what they can do about it. A disclosure or "more details" option can reveal technical information useful for debugging or reporting issues. Never show only technical details - most users cannot interpret them and feel helpless. Never hide technical details completely - they are valuable for users who can use them and for support conversations.

How do I handle situations where the agent needs to wait for external events?

Set clear expectations about the wait, provide status during it, and offer options. "Waiting for approval from your manager - I'll continue once they respond. You can also cancel if you'd like to proceed differently." Long waits should include reminders that the agent is still engaged, not stuck. Consider offering to notify the user when the wait ends rather than requiring them to stay on the page. If external dependencies are unreliable, consider timeout behavior and alternatives when waits exceed reasonable durations.
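The timeout behavior mentioned above can be sketched as a race between the external event and a timer. This is a generic illustration (waitForEvent is an invented name), showing only that a bounded wait resolves to an explicit outcome instead of hanging silently.

```typescript
// Wait for an external event, but resolve with an explicit timeout
// outcome so the UI can offer alternatives instead of hanging.
async function waitForEvent<T>(
  event: Promise<T>,
  timeoutMs: number,
): Promise<{ status: "done"; value: T } | { status: "timed_out" }> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<{ status: "timed_out" }>((resolve) => {
    timer = setTimeout(() => resolve({ status: "timed_out" }), timeoutMs);
  });
  const result = await Promise.race([
    event.then((value) => ({ status: "done" as const, value })),
    timeout,
  ]);
  if (timer) clearTimeout(timer); // don't leave the timer running after resolution
  return result;
}
```

On a "timed_out" result, the interface can surface the options from the answer above - keep waiting, get notified later, or cancel and proceed differently.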
