# inference.sh - Complete Documentation & Blog

> The AI agent runtime. Build agents that can actually do things, with durable execution, human-in-the-loop approval, and real-time observability.

This file contains the complete text content of all inference.sh documentation and blog pages.
For the summary version, see: https://inference.sh/llms.txt

---

# DOCUMENTATION

---

# BLOG

Articles, tutorials, and insights on building production AI agents.

---

# APPS

Pre-built AI apps and tools available on the inference.sh platform.

---

## Featured Apps

### pruna/p-video-avatar

**URL:** https://app.inference.sh/apps/pruna/p-video-avatar
**Category:** video

Generate talking head videos from a portrait image with text or audio-driven speech

---

## All Apps

### reve/remix

**URL:** https://app.inference.sh/apps/reve/remix
**Category:** image

Reve Remix - Create images from text and 1-6 reference images combined.

---

### reve/edit

**URL:** https://app.inference.sh/apps/reve/edit
**Category:** image

Reve Edit - Edit images with natural language instructions. Top 3 on LMArena leaderboard.

---

### reve/create

**URL:** https://app.inference.sh/apps/reve/create
**Category:** image

Reve Create - Generate images from text with best-in-class prompt adherence and text rendering.

---

### pruna/p-video-replace

**URL:** https://app.inference.sh/apps/pruna/p-video-replace
**Category:** video

Replace characters in videos using reference images. Preserves motion, timing, camera, and scene.

---

### anthropic/claude-mythos-5

**URL:** https://app.inference.sh/apps/anthropic/claude-mythos-5
**Category:** chat

Claude Mythos 5 - Project Glasswing. Successor to Claude Mythos Preview. 1M context, 128k output, adaptive thinking, vision, tool use. Direct API.

---

### anthropic/claude-fable-5

**URL:** https://app.inference.sh/apps/anthropic/claude-fable-5
**Category:** chat

Claude Fable 5 - Anthropic's most capable widely released model. 1M context, 128k output, adaptive thinking, vision, tool use.
Direct API.

---

### elevenlabs/voice-remix

**URL:** https://app.inference.sh/apps/elevenlabs/voice-remix
**Category:** audio

ElevenLabs Voice Remix - Modify voice characteristics like accent, gender, style, pacing

---

### elevenlabs/voice-clone

**URL:** https://app.inference.sh/apps/elevenlabs/voice-clone
**Category:** audio

ElevenLabs Voice Clone - Instantly clone a voice from audio samples

---

### elevenlabs/voice-design

**URL:** https://app.inference.sh/apps/elevenlabs/voice-design
**Category:** audio

ElevenLabs Voice Design - Create custom AI voices from text descriptions

---

### x/post-search

**URL:** https://app.inference.sh/apps/x/post-search
**Category:** social

Search recent posts on X.com. Use conversation_id to get replies to a tweet, or any X search query. Returns up to 100 posts with text, author, and engagement metrics.

---

### infsh/deepseek-ocr-2

**URL:** https://app.inference.sh/apps/infsh/deepseek-ocr-2
**Category:** text

Next-gen document OCR with improved math, tables, and reading order. Converts images and PDFs to structured markdown.

---

### heygen/create-avatar

**URL:** https://app.inference.sh/apps/heygen/create-avatar
**Category:** video

Create HeyGen avatars from video footage (digital twin), a photo (photo avatar), or a text prompt (AI-generated). Returns a look ID for use with avatar-video.

---

### openrouter/qwen3-32b

**URL:** https://app.inference.sh/apps/openrouter/qwen3-32b
**Category:** chat

Qwen3 32B - powerful dense language model with reasoning and tool use capabilities via OpenRouter

---

### openrouter/qwen3-8b

**URL:** https://app.inference.sh/apps/openrouter/qwen3-8b
**Category:** chat

Qwen3 8B - efficient dense language model with reasoning and tool use capabilities via OpenRouter

---

### anthropic/claude-haiku-45

**URL:** https://app.inference.sh/apps/anthropic/claude-haiku-45
**Category:** chat

Claude Haiku 4.5 - Fastest and most affordable Claude. 200k context, 64k output, vision, extended thinking, tool use. Direct API.

---

### anthropic/claude-sonnet-45

**URL:** https://app.inference.sh/apps/anthropic/claude-sonnet-45
**Category:** chat

Claude Sonnet 4.5 - Previous generation Sonnet. 200k context, 64k output, vision, extended thinking, tool use. Direct API.

---

### anthropic/claude-sonnet-46

**URL:** https://app.inference.sh/apps/anthropic/claude-sonnet-46
**Category:** chat

Claude Sonnet 4.6 - Best balance of speed and intelligence. 1M context, 64k output, vision, extended thinking, tool use. Direct API.

---

### anthropic/claude-opus-46

**URL:** https://app.inference.sh/apps/anthropic/claude-opus-46
**Category:** chat

Claude Opus 4.6 - Previous generation Opus. 1M context, 128k output, vision, extended thinking, tool use. Direct API.

---

### anthropic/claude-opus-47

**URL:** https://app.inference.sh/apps/anthropic/claude-opus-47
**Category:** chat

Claude Opus 4.7 - Anthropic's most capable model. 1M context, 128k output, vision, extended thinking, tool use. Direct API.

---

### heygen/text-to-speech

**URL:** https://app.inference.sh/apps/heygen/text-to-speech
**Category:** audio

Generate natural speech audio from text using HeyGen's Starfish TTS engine. Supports configurable voice, speed, SSML input, and multiple languages.

---

### heygen/lipsync

**URL:** https://app.inference.sh/apps/heygen/lipsync
**Category:** video

Re-sync video lip movements to new audio using HeyGen's lipsync technology. Supports speed and precision modes with optional captioning.

---

### heygen/video-translate

**URL:** https://app.inference.sh/apps/heygen/video-translate
**Category:** video

Translate videos into 30+ languages with voice cloning and lip-sync using HeyGen. Supports speed and precision modes with optional captioning.

---

### heygen/video-agent

**URL:** https://app.inference.sh/apps/heygen/video-agent
**Category:** video

Generate complete videos from natural language prompts using HeyGen's AI video agent. The agent handles avatar selection, scripting, and production automatically.

---

### heygen/photo-video

**URL:** https://app.inference.sh/apps/heygen/photo-video
**Category:** video

Animate portrait photos into talking videos using HeyGen. Upload a face image and add speech with configurable voice, motion prompts, and expressiveness.

---

### heygen/avatar-video

**URL:** https://app.inference.sh/apps/heygen/avatar-video
**Category:** video

Generate talking avatar videos using HeyGen's digital and photo avatars with Avatar IV or V engines, configurable voice, resolution up to 4K, and expressiveness.

---

### veed/subtitles

**URL:** https://app.inference.sh/apps/veed/subtitles
**Category:** video

Add professional burned-in subtitles to videos with 25+ style presets. Supports 100+ languages with automatic transcription or custom SRT files.

---

### infsh/html-to-video

**URL:** https://app.inference.sh/apps/infsh/html-to-video
**Category:** video

Render HTML/CSS/JS animations to video - supports GSAP timelines, CSS animations, Web Animations API

---

### klingai/image-v2

**URL:** https://app.inference.sh/apps/klingai/image-v2
**Category:** image

Kling Image V2 (Kolors V2.0) - text-to-image with 2K resolution, multi-image reference, and restyle. Restyle output matches input resolution.

---

### klingai/image-o1

**URL:** https://app.inference.sh/apps/klingai/image-o1
**Category:** image

Kling Image O1 (Kolors Image-O1) - omni image generation with element control. Text-to-image and image-to-image at 1K/2K. $0.028/image.

---

### klingai/image-3o

**URL:** https://app.inference.sh/apps/klingai/image-3o
**Category:** image

Kling Image 3O (Kolors Image-3O) - most capable image model with native 4K, series-image generation, and element control. $0.028/image (4K $0.056).

---

### klingai/image-v1

**URL:** https://app.inference.sh/apps/klingai/image-v1
**Category:** image

Kling Image V1 (Kolors V1.0) - basic text-to-image and image-to-image generation. Cheapest option at $0.0035/image.

---

### klingai/image-v1-5

**URL:** https://app.inference.sh/apps/klingai/image-v1-5
**Category:** image

Kling Image V1.5 (Kolors V1.5) - text-to-image with subject and face reference for character consistency. Generate images preserving a person's appearance.

---

### klingai/image-v2-1

**URL:** https://app.inference.sh/apps/klingai/image-v2-1
**Category:** image

Kling Image V2.1 (Kolors V2.1) - text-to-image and multi-image reference generation. Combine multiple images for complex compositions.

---

### klingai/image-v3

**URL:** https://app.inference.sh/apps/klingai/image-v3
**Category:** image

Kling Image V3 (Kolors V3.0) - latest image generation model with 1K/2K resolution support. Highest quality text-to-image.

---

### klingai/video-v3

**URL:** https://app.inference.sh/apps/klingai/video-v3
**Category:** video

Kling V3.0 - latest and most capable video generation model. Native 4K output, multi-shot generation, flexible 3-15s duration billed per second, element control, motion control, and synchronized audio.

---

### klingai/video-v2-6

**URL:** https://app.inference.sh/apps/klingai/video-v2-6
**Category:** video

Kling V2.6 video generation with native sound and voice control. Supports text-to-video and image-to-video with start/end frames, synchronized audio generation, and voice-driven character animation.

---

### klingai/video-o1

**URL:** https://app.inference.sh/apps/klingai/video-o1
**Category:** video

Kling Video O1 (Omni) - unified video generation with text, image references, start/end frames, element references, and video references for editing and style transfer. The most capable Kling model.

---

### klingai/video-v2-5

**URL:** https://app.inference.sh/apps/klingai/video-v2-5
**Category:** video

Kling V2.5 Turbo - fast video generation from text and images. Supports start/end frame interpolation in pro mode. Optimized for speed while maintaining high quality at up to 1080p.

---

### klingai/lip-sync

**URL:** https://app.inference.sh/apps/klingai/lip-sync
**Category:** video

Kling Lip Sync - drive mouth movements in videos using text or audio. Ideal for dubbing, adding speech to silent videos, or replacing dialogue.

---

### klingai/avatar

**URL:** https://app.inference.sh/apps/klingai/avatar
**Category:** video

Kling Avatar - generate digital human broadcast-style talking head videos from a single face photo. Provide text or audio for the avatar to speak.

---

### klingai/video-to-audio

**URL:** https://app.inference.sh/apps/klingai/video-to-audio
**Category:** audio

Kling Video-to-Audio - add generated sound effects, ambient audio, or music to any video. Works with Kling-generated and user-uploaded videos (3-20s).

---

### klingai/virtual-tryon

**URL:** https://app.inference.sh/apps/klingai/virtual-tryon
**Category:** image

Kling Virtual Try-On - AI clothing try-on from a person photo and clothing image. Supports single items and upper+lower combos (v1.5). $0.07 per generation.

---

### inworld/voice-cloning

**URL:** https://app.inference.sh/apps/inworld/voice-cloning
**Category:** audio

Clone a voice from 5-15 seconds of audio using Inworld instant voice cloning. Use the cloned voice ID with any Inworld TTS model.

---

### inworld/voice-design

**URL:** https://app.inference.sh/apps/inworld/voice-design
**Category:** audio

Design a custom voice from a text description using Inworld AI. Describe the voice you want and get up to 3 previews. Publish the one you like to use with any Inworld TTS model.

---

### inworld/text-to-speech-2

**URL:** https://app.inference.sh/apps/inworld/text-to-speech-2
**Category:** audio

Inworld TTS-2 - High-quality multilingual text-to-speech with 100+ languages and natural-language steering

---

### inworld/text-to-speech-1-5-max

**URL:** https://app.inference.sh/apps/inworld/text-to-speech-1-5-max
**Category:** audio

Inworld TTS 1.5 Max - Low-latency text-to-speech with 15 languages (<200ms P50)

---

### inworld/speech-to-text

**URL:** https://app.inference.sh/apps/inworld/speech-to-text
**Category:** audio

Inworld Speech to Text - Multi-provider speech transcription with word timestamps

---

### inworld/text-to-speech-1-5-mini

**URL:** https://app.inference.sh/apps/inworld/text-to-speech-1-5-mini
**Category:** audio

Inworld TTS 1.5 Mini - Ultra-low-latency text-to-speech with 15 languages (~120ms P50)

---

### openrouter/claude-sonnet-46

**URL:** https://app.inference.sh/apps/openrouter/claude-sonnet-46
**Category:** chat

Sonnet 4.

---

### openrouter/hy3-preview

**URL:** https://app.inference.sh/apps/openrouter/hy3-preview
**Category:** chat

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use.

---

### openrouter/gemini-3-flash-preview

**URL:** https://app.inference.sh/apps/openrouter/gemini-3-flash-preview
**Category:** chat

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance.

---

### openrouter/kimi-k26

**URL:** https://app.inference.sh/apps/openrouter/kimi-k26
**Category:** chat

Kimi K2.

---

### openrouter/claude-opus-47

**URL:** https://app.inference.sh/apps/openrouter/claude-opus-47
**Category:** chat

Opus 4.

---

### xai/grok-imagine-image-quality

**URL:** https://app.inference.sh/apps/xai/grok-imagine-image-quality
**Category:** image

Generate and edit high-quality images using xAI's Grok Imagine Quality model.
Supports 1K and 2K output resolutions with text-to-image and image editing.

---

### infsh/hyperframes-render

**URL:** https://app.inference.sh/apps/infsh/hyperframes-render
**Category:** video

Render HeyGen Hyperframes compositions to video - supports clips, GSAP timelines, track layering

---

### bria/rmbg

**URL:** https://app.inference.sh/apps/bria/rmbg
**Category:** image

Remove the background from an image, producing a transparent cutout. The general-purpose background removal; for product-specific cutouts, use product-cutout instead. Output can be passed to replace-background, blur-background, or any editing app.

---

### bria/increase-resolution

**URL:** https://app.inference.sh/apps/bria/increase-resolution
**Category:** image

Upscale images 2x or 4x (max 8192x8192) while preserving original content

---

### bria/expand

**URL:** https://app.inference.sh/apps/bria/expand
**Category:** image

Expand image canvas with AI-generated content matching the original scene

---

### bria/erase

**URL:** https://app.inference.sh/apps/bria/erase
**Category:** image

Remove objects from images using mask-based inpainting while preserving quality

---

### bria/generate

**URL:** https://app.inference.sh/apps/bria/generate
**Category:** image

Generate images from text prompts using Bria Fibo

---

### bria/generate-lite

**URL:** https://app.inference.sh/apps/bria/generate-lite
**Category:** image

Fast image generation from text prompts using Bria Fibo Lite

---

### bria/structured-prompt

**URL:** https://app.inference.sh/apps/bria/structured-prompt
**Category:** text

Generate structured prompt JSON from text or images using Bria

---

### bria/ads-generate

**URL:** https://app.inference.sh/apps/bria/ads-generate
**Category:** image

Generate multiple ads in various sizes from templates and brand assets

---

### bria/product-cutout

**URL:** https://app.inference.sh/apps/bria/product-cutout
**Category:** image

Cut out product from image with transparent background

---

### bria/gen-fill
**URL:** https://app.inference.sh/apps/bria/gen-fill
**Category:** image

Generative fill - replace masked regions with AI-generated content guided by a text prompt

---

### bria/product-packshot

**URL:** https://app.inference.sh/apps/bria/product-packshot
**Category:** image

Generate professional 2000x2000 product packshot images

---

### bria/replace-background

**URL:** https://app.inference.sh/apps/bria/replace-background
**Category:** image

Replace image background with AI-generated content from a text prompt or reference image

---

### bria/video-rmbg

**URL:** https://app.inference.sh/apps/bria/video-rmbg
**Category:** video

Remove background from videos with optional color replacement

---

### bria/product-shadow

**URL:** https://app.inference.sh/apps/bria/product-shadow
**Category:** image

Add realistic shadows to product cutout images

---

### bria/video-eraser

**URL:** https://app.inference.sh/apps/bria/video-eraser
**Category:** video

Erase objects from video using a mask with inpainting

---

### bria/video-replace-background

**URL:** https://app.inference.sh/apps/bria/video-replace-background
**Category:** video

Replace video background with an image or another video

---

### bria/edit

**URL:** https://app.inference.sh/apps/bria/edit
**Category:** image

Edit an image using natural language text instructions

---

### bria/video-increase-resolution

**URL:** https://app.inference.sh/apps/bria/video-increase-resolution
**Category:** video

Upscale video resolution up to 8K using AI super-resolution

---

### bria/video-green-screen

**URL:** https://app.inference.sh/apps/bria/video-green-screen
**Category:** video

Apply green or blue screen effect to video foreground

---

### bytedance/seedance-2-0

**URL:** https://app.inference.sh/apps/bytedance/seedance-2-0
**Category:** video

Professional multimodal video generation from text, images, video, and audio references using ByteDance's Seedance 2.0 model via BytePlus ARK API.
Supports up to 1080p, text-to-video, image-to-video, and multimodal reference-to-video with synchronized audio.

---

### bytedance/seedance-2-0-fast

**URL:** https://app.inference.sh/apps/bytedance/seedance-2-0-fast
**Category:** video

Fast multimodal video generation from text, images, video, and audio references using ByteDance's Seedance 2.0 Fast model via BytePlus ARK API. Supports text-to-video, image-to-video, and multimodal reference-to-video with synchronized audio.

---

### alibaba/happyhorse-1-0-video-edit

**URL:** https://app.inference.sh/apps/alibaba/happyhorse-1-0-video-edit
**Category:** video

HappyHorse 1.0 Video Edit supports advanced video editing through natural language instructions with up to 5 reference images, preserving original motion dynamics via DashScope API

---

### alibaba/happyhorse-1-0-t2v

**URL:** https://app.inference.sh/apps/alibaba/happyhorse-1-0-t2v
**Category:** video

HappyHorse 1.0 Text-to-Video generates physically realistic videos with smooth motion from text prompts via DashScope API, supporting 720P/1080P resolution and up to 15 seconds duration

---

### alibaba/happyhorse-1-0-i2v

**URL:** https://app.inference.sh/apps/alibaba/happyhorse-1-0-i2v
**Category:** video

HappyHorse 1.0 Image-to-Video generates physically realistic videos with smooth motion from a single image and optional text description via DashScope API, supporting 720P/1080P resolution

---

### alibaba/happyhorse-1-0-r2v

**URL:** https://app.inference.sh/apps/alibaba/happyhorse-1-0-r2v
**Category:** video

HappyHorse 1.0 Reference-to-Video generates videos preserving subject characters from up to 9 reference images, with enhanced stability in subject and scene referencing via DashScope API

---

### openai/gpt-image-2

**URL:** https://app.inference.sh/apps/openai/gpt-image-2
**Category:** image

Generate and edit images using OpenAI's GPT Image 2 model. Supports text-to-image, image editing with reference images, and mask-based inpainting.

---

### falai/patina-extract-material

**URL:** https://app.inference.sh/apps/falai/patina-extract-material
**Category:** image

Extracts a seamlessly tiling texture plus PBR material maps from a region of a source image described by a text prompt, via fal.ai PATINA.

---

### falai/patina-text-to-material

**URL:** https://app.inference.sh/apps/falai/patina-text-to-material
**Category:** image

Generates seamlessly tiling PBR materials up to 8K from a text prompt (optional image-to-image and inpainting) via fal.ai PATINA.

---

### falai/patina-image-to-material

**URL:** https://app.inference.sh/apps/falai/patina-image-to-material
**Category:** image

Predicts seamless high-resolution PBR material maps (basecolor, normal, roughness, metalness, height) from a single input image via fal.ai PATINA.

---

### infsh/omnivoice

**URL:** https://app.inference.sh/apps/infsh/omnivoice
**Category:** audio

Zero-shot text-to-speech with voice cloning for 600+ languages.

---

### alibaba/wan-2-7-videoedit

**URL:** https://app.inference.sh/apps/alibaba/wan-2-7-videoedit
**Category:** video

Wan 2.7 Video Edit performs instruction-based video editing and style transfer using multimodal inputs (text, images, video) via DashScope API with 720P/1080P output

---

### alibaba/wan-2-7-r2v

**URL:** https://app.inference.sh/apps/alibaba/wan-2-7-r2v
**Category:** video

Wan 2.7 Reference-to-Video generates videos featuring characters from reference images and videos, supporting multi-character interaction, voice timbre cloning, and first-frame control

---

### alibaba/wan-2-7-i2v

**URL:** https://app.inference.sh/apps/alibaba/wan-2-7-i2v
**Category:** video

Wan 2.7 Image-to-Video generates videos from images using multi-modal input (text, images, audio, video).
Supports first frame generation, first+last frame, and video continuation with 720P/1080P resolution

---

### alibaba/wan-2-7-t2v

**URL:** https://app.inference.sh/apps/alibaba/wan-2-7-t2v
**Category:** video

Wan 2.7 Text-to-Video generates high-quality videos from text prompts using Alibaba's latest video generation model via DashScope API, supporting 720P/1080P resolution and up to 15 seconds duration

---

### infsh/image-resize

**URL:** https://app.inference.sh/apps/infsh/image-resize
**Category:** image

Resize images by width, height, scale factor, or megapixel target

---

### pruna/p-image-upscale

**URL:** https://app.inference.sh/apps/pruna/p-image-upscale
**Category:** image

AI-powered image upscaling up to 128 megapixels with detail and realism enhancement

---

### alibaba/wan-2-7-image-pro

**URL:** https://app.inference.sh/apps/alibaba/wan-2-7-image-pro
**Category:** image

Wan 2.7 Image Pro is Alibaba's professional image generation model supporting text-to-image, image editing, and multi-reference generation with up to 4K high-definition output

---

### alibaba/wan-2-7-image

**URL:** https://app.inference.sh/apps/alibaba/wan-2-7-image
**Category:** image

Wan 2.7 Image is Alibaba's fast image generation model supporting text-to-image, image editing, and multi-reference image generation with up to 2K resolution

---

### google/veo-3-1-lite

**URL:** https://app.inference.sh/apps/google/veo-3-1-lite
**Category:** video

Veo 3.1 Lite via Gemini API - Lightweight video generation with text and image input, audio support

---

### phota/train

**URL:** https://app.inference.sh/apps/phota/train
**Category:** other

Train a Phota identity profile from 30-50 face images, poll status, list and delete profiles

---

### x/post-thread

**URL:** https://app.inference.sh/apps/x/post-thread
**Category:** social

Create threaded posts on X.com. Provide 2-25 tweets that are posted sequentially as a reply chain.
Each tweet supports text (280 char limit) and optional media (up to 4 images or 1 video/GIF). Images over 5MB are auto-resized.

---

### phota/edit

**URL:** https://app.inference.sh/apps/phota/edit
**Category:** image

Edit images with text prompts while preserving identity of known subjects

---

### phota/generate

**URL:** https://app.inference.sh/apps/phota/generate
**Category:** image

Generate images from text prompts with identity-preserved subjects via [[profile_id]] syntax

---

### phota/enhance

**URL:** https://app.inference.sh/apps/phota/enhance
**Category:** image

Automatically enhance photo quality: lighting, composition, color, and sharpness

---

# Additional Resources

- Website: https://inference.sh
- Documentation: https://inference.sh/docs
- Blog: https://inference.sh/blog
- Apps: https://inference.sh/apps
- GitHub: https://github.com/inference-sh
- Python SDK: https://pypi.org/project/inferencesh/
- npm SDK: https://www.npmjs.com/package/inferencesh/
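
Every app entry in this catalog pairs an identifier of the form `namespace/name` (for example `bria/rmbg`) with a URL under `https://app.inference.sh/apps/`. The sketch below converts between the two, assuming that pattern holds for every app. The helper names `app_url` and `app_id_from_url` are hypothetical and are not part of the inference.sh SDKs; the code uses only the Python standard library.

```python
from urllib.parse import urlparse

# Prefix observed on every app URL listed in this catalog (an assumption
# drawn from the listing, not a documented guarantee).
APPS_HOST = "app.inference.sh"
APPS_BASE_URL = f"https://{APPS_HOST}/apps"


def app_url(app_id: str) -> str:
    """Build the catalog URL for an id such as 'bria/rmbg' (hypothetical helper)."""
    namespace, sep, name = app_id.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"expected 'namespace/name', got {app_id!r}")
    return f"{APPS_BASE_URL}/{namespace}/{name}"


def app_id_from_url(url: str) -> str:
    """Recover 'namespace/name' from a catalog URL (hypothetical helper)."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    if (
        parsed.netloc != APPS_HOST
        or len(parts) != 3
        or parts[0] != "apps"
        or not all(parts)
    ):
        raise ValueError(f"not a catalog app URL: {url!r}")
    return f"{parts[1]}/{parts[2]}"


if __name__ == "__main__":
    print(app_url("klingai/video-v3"))
    print(app_id_from_url("https://app.inference.sh/apps/phota/enhance"))
```

Keeping the identifier as the primary key and deriving the URL from it may make it easier to cross-reference entries that mention each other by short name, such as `product-cutout` in the `bria/rmbg` description.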