app library
the grid
150+ tools that run serverless on CPU or GPU.
call directly via API or let agents orchestrate them.
featured

gemini-3-1-flash-image-preview
Gemini 3.1 Flash Image Preview (NanoBanana 2) via Vertex AI - Advanced image generation model powered by Google Cloud

veo-3-1-fast
Veo 3.1 Fast via Vertex AI - Generate videos from text prompts or images with optional audio

hunyuanvideo-foley
Synthesizes realistic sound effects and audio tracks based on your video content and written descriptions.

flux-1-kontext-dev
Edits existing images using text instructions, allowing for changes in style, characters, or objects, and reliably handles multiple edits while maintaining image coherence.
all apps

infsh/shell
execute shell commands in a sandboxed environment. run grep, sed, ls, find, and other cli tools with configurable working directory and timeout.

bytedance/seedream-5-lite
generate high-quality 2k-3k images from text prompts with single or multi-image input. supports text-to-image, image-to-image, and multi-reference image blending using bytedance's seedream 5 lite model via byteplus ark api.

alibaba/qwen-image-2-pro
qwen-image-2.0 pro offers enhanced text rendering, fine-grained realism, photorealistic scenes, and stronger semantic adherence for professional image generation and editing

alibaba/qwen-image-2
qwen-image-2.0 is alibaba's multimodal image generation model that integrates image generation and editing with enhanced text-rendering, realistic textures, and photorealistic scenes

google/gemini-3-1-flash-image-preview
gemini 3.1 flash image preview (nanobanana 2) via vertex ai - advanced image generation model powered by google cloud

openrouter/gpt-oss-safeguard-20b
gpt-oss safeguard 20b

infsh/remotion-render
render videos from react/remotion component code — pass tsx, get mp4

openrouter/minimax-m-25
minimax-m2.5 is a sota large language model designed for real-world productivity. trained in a diverse range of complex real-world digital working environments, m2.5 builds upon the coding expertise of m2.1 to extend into general office work, reaching fluency in generating and operating word, excel, and powerpoint files, context switching between diverse software environments, and working across different agent and human teams.

falai/kokoro-tts
kokoro tts - lightweight text-to-speech with multiple languages and voices

xai/grok-imagine-image-pro
generate and edit images using xai's grok imagine pro model. supports text-to-image and image editing with multiple aspect ratios.

openrouter/claude-opus-46
opus 4.6 is anthropic’s strongest model for coding and long-running professional tasks. it is built for agents that operate across entire workflows rather than single prompts, making it especially effective for large codebases, complex refactors, and multi-step debugging that unfolds over time.

falai/dia-tts
dia tts - generate realistic dialogue with emotion control, natural nonverbals, and voice cloning

infsh/agent-browser
browser automation for ai agents. navigate, interact with @e refs, take screenshots, record video with cursor indicator, execute javascript. supports proxy configuration.

x/post-tweet
post tweets to x.com with text (280 char limit) and optional media. supports up to 4 images or 1 video/gif. can reply to or quote other tweets. images over 5mb are auto-resized.

x/dm-send
send a direct message on x.com. requires the recipient's user id (not username). text-only messages; media attachments are not supported.

x/user-follow
follow a user on x.com by user id. succeeds silently if already following.

x/user-get
get a user profile from x.com by id or username. returns bio, follower/following counts, tweet count, verified status, and profile image url.

x/post-retweet
retweet a post on x.com by post id. succeeds silently if already retweeted.

x/post-like
like a post on x.com by post id. succeeds silently if already liked.

x/post-delete
delete a post from x.com by post id. can only delete posts authored by the authenticated account.

x/post-get
get a post by id from x.com. returns text, author id, creation date, and engagement metrics (likes, retweets, replies, quotes).

x/post-create
create posts on x.com with text (280 char limit) and optional media. supports up to 4 images or 1 video/gif. can reply to or quote other posts. images over 5mb are auto-resized.

xai/grok-imagine-video
generate and edit videos using xai's grok imagine video model. supports text-to-video, image-to-video, and video editing with configurable duration and resolution.

xai/grok-imagine-image
generate and edit images using xai's grok imagine model. supports text-to-image and image editing with multiple aspect ratios.

google/veo-3-1
veo 3.1 via vertex ai - advanced video generation with frame interpolation, reference images, and audio generation

google/veo-3-fast
veo 3 fast via vertex ai - fast video generation with audio from text prompts and images

google/veo-3
veo 3 via vertex ai - generate videos with audio from text prompts and images

google/veo-2
veo 2 via vertex ai - generate high-quality realistic videos from text prompts

google/veo-3-1-fast
veo 3.1 fast via vertex ai - generate videos from text prompts or images with optional audio

falai/flux-dev-lora
text-to-image and image-to-image generation with flux.1 [dev] lora support. custom style adaptation and fine-tuned model variations from black forest labs.

falai/flux-2-klein-lora
text-to-image and image-to-image generation with flux.2 [klein] lora support. available in 4b and 9b parameter sizes. custom style adaptation and fine-tuned model variations from black forest labs.

bytedance/omnihuman-1-5
multi-character audio-driven avatar video generation. takes a portrait image + audio and generates a video where the person speaks/sings in sync. supports specifying which character to drive.

bytedance/omnihuman-1-0
audio-driven avatar video generation. takes a portrait image + audio and generates a video where the person speaks/sings in sync with the audio.

bytedance/seedream-3-0-t2i
generate cinematic quality images from text prompts with accurate text rendering using bytedance's seedream 3.0 t2i model via byteplus ark api.

bytedance/seedream-4-0
generate high-quality 2k-4k images from text prompts with optional image-to-image generation using bytedance's seedream 4.0 model via byteplus ark api.

bytedance/seedream-4-5
generate high-quality 2k-4k images from text prompts with optional image-to-image generation using bytedance's seedream 4.5 model via byteplus ark api.

bytedance/seedance-1-0-lite
lightweight 720p video generation. automatically uses image-to-video mode when an image is provided, or text-to-video mode otherwise.

bytedance/seedance-1-0-pro
generate high-quality videos up to 1080p from text prompts with optional first-frame image control using bytedance's seedance 1.0 pro model.

bytedance/seedance-1-0-pro-fast
fast high-quality video generation up to 1080p from text prompts with optional first-frame image control using bytedance's seedance 1.0 pro fast model.

bytedance/seedance-1-5-pro
generate high-quality videos from text prompts with optional first-frame image control using bytedance's seedance 1.5 pro model via byteplus ark api.

falai/imagine-art-1-5-pro-preview
advanced text-to-image model creating ultra-high-fidelity 4k visuals with lifelike realism and refined aesthetics.

google/gemini-2-5-flash-image
gemini 2.5 flash image (nanobanana) via vertex ai - advanced image generation model powered by google cloud

google/gemini-3-pro-image-preview
gemini 3 pro image preview (nanobanana pro) via vertex ai - advanced image generation model powered by google cloud

infsh/post-tweet
post tweets to x.com (twitter)

tavily/search-assistant
a search assistant that browses the internet to deliver comprehensive results, including ai-generated answers, images, and detailed sources.

openrouter/kimi-k2-thinking
a powerful open-source thinking agent that excels at complex, multi-step problem-solving and consistently uses tools effectively over extended operations.

infsh/caption-videos
add captions to videos using an existing caption file, such as those generated by a speech-to-text service.

infsh/hunyuanvideo-foley
synthesizes realistic sound effects and audio tracks based on your video content and written descriptions.

infsh/qwen3-30b-a3b
a powerful language application that excels at multilingual communication and complex task execution, designed for fast performance.

infsh/hidream-i1-full
generates high-quality images with state-of-the-art results.

infsh/flux-1-kontext-dev
edits existing images using text instructions, allowing for changes in style, characters, or objects, and reliably handles multiple edits while maintaining image coherence.

infsh/sdxl
generates and modifies images from text prompts, producing high-resolution, photorealistic results with superior detail and accuracy compared to previous versions.

infsh/flux-1-fill
fills designated areas in existing images based on a descriptive text input.

infsh/hidream-e1-full
edit images by simply telling it what changes you want, such as altering colors, backgrounds, or accessories with precision.

infsh/real-esrgan
enhances low-quality, degraded images and videos by upscaling resolution, reducing noise, and restoring fine details.

tavily/extract
extracts clean, readable content, including text and images, from specified webpages, supporting batch processing for multiple urls.

infsh/hidream-i1
an open-source tool for generating high-quality images in seconds.

infsh/array-switch
allows you to choose between two different inputs based on a condition applied to an array of data.

infsh/extract-last-frame
save a specific frame from the end of a video as a static image file.

infsh/fast-whisper-large-v3
quickly converts audio files into written text or translates them to a different language.

infsh/thera
upscales images to any size without blurriness or jagged edges, maintaining high detail through its unique neural heat field technology.

infsh/dia-tts
dia tts - generate realistic dialogue with emotion control, natural nonverbals, and voice cloning

infsh/media-merger
merges multiple videos and images together using customized transitions.

infsh/hunyuan-image-to-3d
generates detailed, high-resolution 3d assets quickly from simple text descriptions or reference images.

infsh/gemma-3-12b-it
developed by google, this open-source tool processes both text and images to answer questions, summarize content, perform reasoning tasks, and understand images.

infsh/flux-1-dev-upscaler
increases the resolution of images and videos, enhancing clarity and detail by up to 4x.

infsh/array-element-switch
selects one of two possible inputs based on a comparison check within an array.

infsh/xlam-2-32b-fc-r-i1
a system capable of advanced multi-step reasoning and a strong understanding of language and context to create actionable plans.

infsh/diffrythm
generates complete songs quickly and simply using advanced latent diffusion technology.

infsh/mask-image
combines two images—a main image and a semi-transparent mask—to selectively hide or reveal parts of the main image, creating a partially transparent result.

infsh/falconsai-nsfw-detection
detects nsfw content in images and videos using falconsai/nsfw_image_detection model. for videos, samples frames at configurable intervals.

infsh/phi-4-14b
a powerful tool developed by microsoft and trained on high-quality data to excel at complex tasks like advanced math, coding, and general problem-solving, offering detailed reasoning alongside solutions.

infsh/audio-x
generates audio from any input using a unified framework.

infsh/video-audio-merger
merge video and audio files easily, with the flexibility to keep the original audio from the video.

infsh/hunyuan-image-to-3d-2
create high-quality, photorealistic 3d models and textures from simple text prompts or 2d images.

infsh/text-to-file
creates a new document using the text and file name you specify.

infsh/devstral-small-2505
an agent for software engineering tasks, created by mistral ai and all hands ai.

infsh/python-executor
runs python code in a safe environment.

infsh/sdxl-controlnet
generates images with high quality and precise control over the composition and structure.

infsh/search-assistant
helps users create and refine search queries, retrieve relevant results from various sources, and generate overviews or summaries of the information found.

exa/answer
generate llm-powered answers informed by exa search results

infsh/instant-character
generates images featuring a character with a consistent visual style across different scenes or poses.

falai/fabric-1-0
creates videos where an image appears to talk using advanced lip-sync technology.

infsh/cogview4-6b
generates high-quality images from text, capable of producing detailed visuals up to 2048x2048 resolution.

infsh/ltx-video
create high-quality, realistic, and customizable videos quickly, with capabilities for producing detailed, high-resolution content.

infsh/kokoro-tts
converts text into spoken audio.

infsh/boolean-switch
selects one of two possible inputs based on whether a condition is true or false.

infsh/wan2-1-t2v-effects
generates high-quality videos from text descriptions with dynamic motion and supports both chinese and english language prompts at 720p resolution.

falai/pixverse-lipsync
generates highly realistic lipsync animations from any audio input.

infsh/mistral-small-3-2-24b-it-2506
follows precise instructions, excels at function/tool calling, and can process both text and images for tasks like document understanding and content generation.

openrouter/intellect-3
intellect 3

infsh/gemma-3-27b-it
handles complex tasks like question answering, summarizing, and reasoning across both text and image inputs, with support for multiple languages.

infsh/wan2-2-i2i-a14b
creates videos from images and enhances video quality using a built-in upscaler.

exa/extract
extract and analyze web page content using exa's advanced content retrieval

openrouter/claude-opus-45
claude opus 4.5

openrouter/gemini-3-pro-preview
gemini 3 pro preview

infsh/gemma-3n-e4b-it
a fast and versatile tool that can analyze and respond to information from text, images, and audio, designed to run efficiently on small or limited devices.

openrouter/claude-sonnet-45
claude sonnet 4.5 is anthropic’s most advanced sonnet model to date, optimized for real-world agents and coding workflows. it delivers state-of-the-art performance on coding benchmarks such as swe-bench verified, with improvements across system design, code security, and specification adherence. the model is designed for extended autonomous operation, maintaining task continuity across sessions and providing fact-based progress tracking.

infsh/omni-zero
creates stylized portraits instantly without needing specific training data.

infsh/numerical-switch
selects one of two inputs based on a condition involving numerical comparison.
not enough? create new apps fast. templates + coding agents make it insanely extensible.
create your own apps
start from templates. add code, packages, docs. deploy in minutes.
schemas become tool parameters automatically. your app shows up in the grid and can be used by agents and workflows.
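
as a rough illustration of how a schema could become tool parameters, the sketch below turns a python dataclass into a json-schema-style parameter object. everything in it is an assumption made for the example: `UpscaleInput`, `schema_to_tool_parameters`, and the type mapping are hypothetical names, not the platform's actual sdk or schema format.

```python
from dataclasses import MISSING, dataclass, fields
from typing import get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class UpscaleInput:
    """Hypothetical input schema for an image-upscaling app."""
    image_url: str
    scale: int = 2
    denoise: bool = False


def schema_to_tool_parameters(schema_cls) -> dict:
    """Derive a JSON-Schema-style parameter object from a dataclass."""
    hints = get_type_hints(schema_cls)
    properties = {}
    required = []
    for f in fields(schema_cls):
        prop = {"type": JSON_TYPES[hints[f.name]]}
        if f.default is not MISSING:
            # Fields with a default are optional and advertise that default.
            prop["default"] = f.default
        elif f.default_factory is MISSING:
            # No default of any kind: the caller must supply this field.
            required.append(f.name)
        properties[f.name] = prop
    return {"type": "object", "properties": properties, "required": required}


tool_parameters = schema_to_tool_parameters(UpscaleInput)
print(tool_parameters)
```

in this sketch only `image_url` ends up in `required`, while `scale` and `denoise` are optional and carry their defaults, which is the kind of information an agent needs to call a tool correctly.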
create workflows
build a graph of apps. deploy as a single callable app.

drag and drop to build the graph. map inputs and outputs to connect steps. deploy as an app.
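
conceptually, a workflow like this is a directed graph of steps wrapped in one callable. the sketch below shows that idea in plain python, assuming each app can be modeled as a function that takes keyword inputs and returns a dict of outputs. `build_workflow`, the `wiring` format, and the two stand-in steps are hypothetical and only illustrate the concept; they are not the platform's workflow engine.

```python
from graphlib import TopologicalSorter


def build_workflow(steps, wiring, output_step):
    """Combine steps into a single callable.

    steps:       step name -> callable(**inputs) returning a dict of outputs
    wiring:      step name -> {input name: (source step, output name)};
                 the source "input" refers to the workflow's own inputs
    output_step: name of the step whose outputs the workflow returns
    """
    # Each step depends on every step it reads an output from.
    deps = {
        name: {src for src, _ in wiring.get(name, {}).values() if src != "input"}
        for name in steps
    }
    # static_order() yields dependencies before the steps that use them.
    order = list(TopologicalSorter(deps).static_order())

    def run(**workflow_inputs):
        results = {"input": workflow_inputs}
        for name in order:
            kwargs = {
                arg: results[src][out]
                for arg, (src, out) in wiring.get(name, {}).items()
            }
            results[name] = steps[name](**kwargs)
        return results[output_step]

    return run


# Two stand-in "apps" (hypothetical): one cleans text, one counts words.
steps = {
    "clean": lambda text: {"text": text.strip()},
    "count": lambda text: {"words": len(text.split())},
}
wiring = {
    "clean": {"text": ("input", "text")},
    "count": {"text": ("clean", "text")},
}

workflow = build_workflow(steps, wiring, output_step="count")
result = workflow(text="  render a short video  ")
print(result)  # {'words': 4}
```

the `wiring` dict plays the role of the input/output mapping between steps, and the returned `run` function is the single callable that stands in for the deployed workflow.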