app library
the grid
250+ tools that run serverless on CPU or GPU.
call them directly via API or let agents orchestrate them.
all apps

openai/gpt-image-2
generate and edit images using openai's gpt image 2 model. supports text-to-image, image editing with reference images, and mask-based inpainting.

falai/patina-extract-material
extracts a seamlessly tiling texture plus pbr material maps from a region of a source image described by a text prompt, via fal.ai patina.

falai/patina-text-to-material
generates seamlessly tiling pbr materials up to 8k from a text prompt (optional image-to-image and inpainting) via fal.ai patina.

falai/patina-image-to-material
predicts seamless high-resolution pbr material maps (basecolor, normal, roughness, metalness, height) from a single input image via fal.ai patina.

falai/seedance-2-t2v
generate videos with synchronized audio from text prompts using bytedance's seedance 2.0. supports quality and fast modes.

falai/seedance-2-r2v
generate videos from reference images, videos, and audio using bytedance's seedance 2.0. reference inputs as @image1, @video1, @audio1 in the prompt. supports quality and fast modes.

falai/seedance-2-i2v
generate videos with synchronized audio from images using bytedance's seedance 2.0. supports start/end frame control and quality/fast modes.

alibaba/wan-2-7-videoedit
wan 2.7 video edit performs instruction-based video editing and style transfer using multimodal inputs (text, images, video) via dashscope api with 720p/1080p output

alibaba/wan-2-7-r2v
wan 2.7 reference-to-video generates videos featuring characters from reference images and videos, supporting multi-character interaction, voice timbre cloning, and first-frame control

alibaba/wan-2-7-i2v
wan 2.7 image-to-video generates videos from images using multimodal input (text, images, audio, video). supports first-frame generation, first-and-last-frame control, and video continuation at 720p or 1080p resolution.

alibaba/wan-2-7-t2v
wan 2.7 text-to-video generates high-quality videos from text prompts using alibaba's latest video generation model via dashscope api, supporting 720p/1080p resolution and up to 15 seconds duration

infsh/image-resize
resize images by width, height, scale factor, or megapixel target
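
as a rough illustration of the four sizing modes named above, the sketch below computes target dimensions for each one. the function name `target_size`, its parameters, and the choice to preserve aspect ratio are assumptions made for illustration; the app's actual inputs and rounding behavior are not documented here and may differ.

```python
import math


def target_size(width, height, *, new_width=None, new_height=None,
                scale=None, megapixels=None):
    """Return (w, h) for exactly one sizing mode, keeping the aspect ratio.

    Hypothetical helper: parameter names are illustrative, not the app's API.
    """
    modes = [m for m in (new_width, new_height, scale, megapixels) if m is not None]
    if len(modes) != 1:
        raise ValueError("specify exactly one of new_width, new_height, scale, megapixels")

    if new_width is not None:
        factor = new_width / width
    elif new_height is not None:
        factor = new_height / height
    elif scale is not None:
        factor = scale
    else:
        # Megapixel target: (w * f) * (h * f) == megapixels * 1e6, solved for f.
        factor = math.sqrt(megapixels * 1_000_000 / (width * height))

    # Never return a zero-sized dimension.
    return max(1, round(width * factor)), max(1, round(height * factor))


print(target_size(4000, 3000, megapixels=3))  # (2000, 1500)
```

a 4000x3000 image is 12 megapixels, so a 3-megapixel target halves each side.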

pruna/p-image-upscale
ai-powered image upscaling with detail and realism enhancement

alibaba/wan-2-7-image-pro
wan 2.7 image pro is alibaba's professional image generation model supporting text-to-image, image editing, and multi-reference generation with up to 4k high-definition output

alibaba/wan-2-7-image
wan 2.7 image is alibaba's fast image generation model supporting text-to-image, image editing, and multi-reference image generation with up to 2k resolution

google/veo-3-1-lite
veo 3.1 lite via gemini api - lightweight video generation with text and image input, audio support

phota/train
train a phota identity profile from 30-50 face images. also polls training status and lists or deletes profiles.

x/post-thread
create threaded posts on x.com. provide 2-25 tweets that are posted sequentially as a reply chain. each tweet supports text (280 char limit) and optional media (up to 4 images or 1 video/gif). images over 5mb are auto-resized.
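
the limits listed above can be checked client-side before a thread is submitted. the sketch below is a minimal example under stated assumptions: the `Tweet` class, its field names, and `validate_thread` are hypothetical and not the app's input schema, and `len()` only approximates how x.com counts characters.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits taken from the app description above.
MIN_TWEETS, MAX_TWEETS = 2, 25
MAX_CHARS = 280
MAX_IMAGES = 4


@dataclass
class Tweet:
    """Hypothetical container for one post in the thread."""
    text: str
    images: List[str] = field(default_factory=list)
    video: Optional[str] = None  # a single video or gif


def validate_thread(tweets: List[Tweet]) -> List[str]:
    """Return a list of human-readable problems; empty means the thread looks valid."""
    errors = []
    if not MIN_TWEETS <= len(tweets) <= MAX_TWEETS:
        errors.append(f"thread needs {MIN_TWEETS}-{MAX_TWEETS} tweets, got {len(tweets)}")
    for i, tweet in enumerate(tweets, start=1):
        # len() is an approximation of the platform's character counting.
        if len(tweet.text) > MAX_CHARS:
            errors.append(f"tweet {i}: text exceeds {MAX_CHARS} characters")
        if len(tweet.images) > MAX_IMAGES:
            errors.append(f"tweet {i}: more than {MAX_IMAGES} images")
        if tweet.video is not None and tweet.images:
            errors.append(f"tweet {i}: use images or one video/gif, not both")
    return errors


thread = [Tweet("first post"), Tweet("second post", images=["a.png"])]
print(validate_thread(thread))  # []
```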

phota/edit
edit images with text prompts while preserving identity of known subjects

phota/generate
generate images from text prompts with identity-preserved subjects via [[profile_id]] syntax

phota/enhance
automatically enhance photo quality — lighting, composition, color, and sharpness

xai/grok-extend-video
extend existing videos using xai's grok imagine video model. takes an existing video and generates additional frames to continue it with prompt guidance.

xai/grok-reference-video
generate videos using reference images for style and content guidance with xai's grok imagine video model. provide reference images to influence the visual style of generated videos.

xai/grok-tts
convert text into natural speech using xai's text to speech api. supports multiple voices, expressive speech tags, and mp3/wav/pcm output formats.

elevenlabs/forced-alignment
elevenlabs forced alignment - align text to audio with word timestamps

elevenlabs/text-to-dialogue
elevenlabs text to dialogue - generate immersive multi-voice dialogue

elevenlabs/dubbing
elevenlabs dubbing - automatically dub audio/video to other languages

elevenlabs/music
elevenlabs music - generate studio-quality music from text prompts

elevenlabs/sound-effects
elevenlabs sound effects - generate custom sound effects from text

elevenlabs/voice-isolator
elevenlabs voice isolator - remove background noise from audio

elevenlabs/voice-changer
elevenlabs voice changer - transform voice in audio to a different voice

elevenlabs/stt
elevenlabs speech to text (scribe) - high-accuracy transcription with diarization

elevenlabs/tts
elevenlabs text to speech - high-quality multilingual voice synthesis

pruna/wan-i2v
transform static images into animated videos with text prompts

pruna/wan-t2v
generate videos directly from text descriptions in 480p or 720p

pruna/qwen-image-edit-plus
edit images using text instructions with multi-image support and pose transfer

pruna/flux-klein-4b
lightweight 4b parameter model with excellent speed-to-quality ratio

pruna/z-image-turbo-lora
fast generation with lora support for unique styles and personalized outputs

pruna/z-image-turbo
ultra-fast turbo image generation with minimal latency

pruna/qwen-image-fast
fast qwen-based image generation with creativity control

pruna/qwen-image
advanced text-to-image generation with optional lora weights and prompt enhancement

pruna/wan-image-small
fast, efficient text-to-image optimized for rapid prototyping and batch generation

pruna/flux-dev-lora
text-to-image and image-to-image generation with custom lora weights from huggingface

pruna/flux-dev
advanced text-to-image generation with multiple aspect ratios, speed optimizations, and high-quality outputs

pruna/p-video
fast text-to-video and image-to-video in 720p/1080p with audio support

pruna/p-image-edit-lora
fast image editing with custom lora styles for unique transformations

pruna/p-image-lora
pruna's flagship fast text-to-image with custom lora style support

pruna/p-image
pruna's flagship fast text-to-image with multiple aspect ratios and prompt enhancement

pruna/p-image-edit
fast image editing with text instructions and multi-image support

infsh/shell
execute shell commands in a sandboxed environment. run grep, sed, ls, find, and other cli tools with configurable working directory and timeout.

bytedance/seedream-5-lite
generate high-quality 2k-3k images from text prompts with single or multi-image input. supports text-to-image, image-to-image, and multi-reference image blending using bytedance's seedream 5 lite model via byteplus ark api.

alibaba/qwen-image-2-pro
qwen-image-2.0 pro offers enhanced text rendering, fine-grained realism, photorealistic scenes, and stronger semantic adherence for professional image generation and editing

alibaba/qwen-image-2
qwen-image-2.0 is alibaba's multimodal image generation model that integrates image generation and editing with enhanced text-rendering, realistic textures, and photorealistic scenes

google/gemini-3-1-flash-image-preview
gemini 3.1 flash image preview (nanobanana 2) via vertex ai - advanced image generation model powered by google cloud

openrouter/gpt-oss-safeguard-20b
gpt-oss safeguard 20b

infsh/remotion-render
render videos from react/remotion component code — pass tsx, get mp4

openrouter/minimax-m-25
minimax-m2.5 is a sota large language model designed for real-world productivity. trained in a diverse range of complex real-world digital working environments, m2.5 builds upon the coding expertise of m2.1 to extend into general office work, reaching fluency in generating and operating word, excel, and powerpoint files, context switching between diverse software environments, and working across different agent and human teams.

falai/kokoro-tts
kokoro tts - lightweight text-to-speech with multiple languages and voices

xai/grok-imagine-image-pro
generate and edit images using xai's grok imagine pro model. supports text-to-image and image editing with multiple aspect ratios.

openrouter/claude-opus-46
opus 4.6 is anthropic’s strongest model for coding and long-running professional tasks. it is built for agents that operate across entire workflows rather than single prompts, making it especially effective for large codebases, complex refactors, and multi-step debugging that unfolds over time.

falai/dia-tts
dia tts - generate realistic dialogue with emotion control, natural nonverbals, and voice cloning

infsh/agent-browser
browser automation for ai agents. navigate, interact with @e refs, take screenshots, record video with cursor indicator, execute javascript. supports proxy configuration.

x/dm-send
send a direct message on x.com. requires the recipient's user id (not username). text-only messages; media attachments are not supported.

x/user-follow
follow a user on x.com by user id. succeeds silently if already following.

x/user-get
get a user profile from x.com by id or username. returns bio, follower/following counts, tweet count, verified status, and profile image url.

x/post-retweet
retweet a post on x.com by post id. succeeds silently if already retweeted.

x/post-like
like a post on x.com by post id. succeeds silently if already liked.

x/post-delete
delete a post from x.com by post id. can only delete posts authored by the authenticated account.

x/post-get
get a post by id from x.com. returns text, author id, creation date, and engagement metrics (likes, retweets, replies, quotes).

x/post-create
create posts on x.com with text (280 char limit) and optional media. supports up to 4 images or 1 video/gif. can reply to or quote other posts. images over 5mb are auto-resized.

xai/grok-imagine-video
generate and edit videos using xai's grok imagine video model. supports text-to-video, image-to-video, and video editing with configurable duration and resolution.

xai/grok-imagine-image
generate and edit images using xai's grok imagine model. supports text-to-image and image editing with multiple aspect ratios.

google/veo-3-1
veo 3.1 via vertex ai - advanced video generation with frame interpolation, reference images, and audio generation

google/veo-3-fast
veo 3 fast via vertex ai - fast video generation with audio from text prompts and images

google/veo-3
veo 3 via vertex ai - generate videos with audio from text prompts and images

google/veo-2
veo 2 via vertex ai - generate high-quality realistic videos from text prompts

google/veo-3-1-fast
veo 3.1 fast via vertex ai - generate videos from text prompts or images with optional audio

falai/flux-dev-lora
text-to-image and image-to-image generation with flux.1 [dev] lora support. custom style adaptation and fine-tuned model variations from black forest labs.

falai/flux-2-klein-lora
text-to-image and image-to-image generation with flux.2 [klein] lora support. available in 4b and 9b parameter sizes. custom style adaptation and fine-tuned model variations from black forest labs.

bytedance/omnihuman-1-5
multi-character audio-driven avatar video generation. takes a portrait image + audio and generates a video where the person speaks/sings in sync. supports specifying which character to drive.

bytedance/omnihuman-1-0
audio-driven avatar video generation. takes a portrait image + audio and generates a video where the person speaks/sings in sync with the audio.

bytedance/seedream-3-0-t2i
generate cinematic quality images from text prompts with accurate text rendering using bytedance's seedream 3.0 t2i model via byteplus ark api.

bytedance/seedream-4-0
generate high-quality 2k-4k images from text prompts with optional image-to-image generation using bytedance's seedream 4.0 model via byteplus ark api.

bytedance/seedream-4-5
generate high-quality 2k-4k images from text prompts with optional image-to-image generation using bytedance's seedream 4.5 model via byteplus ark api.

bytedance/seedance-1-0-pro
generate high-quality videos up to 1080p from text prompts with optional first-frame image control using bytedance's seedance 1.0 pro model.

bytedance/seedance-1-0-pro-fast
fast high-quality video generation up to 1080p from text prompts with optional first-frame image control using bytedance's seedance 1.0 pro fast model.

bytedance/seedance-1-5-pro
generate high-quality videos from text prompts with optional first-frame image control using bytedance's seedance 1.5 pro model via byteplus ark api.

falai/imagine-art-1-5-pro-preview
advanced text-to-image model creating ultra-high-fidelity 4k visuals with lifelike realism and refined aesthetics.

google/gemini-2-5-flash-image
gemini 2.5 flash image (nanobanana) via vertex ai - advanced image generation model powered by google cloud

google/gemini-3-pro-image-preview
gemini 3 pro image preview (nanobanana pro) via vertex ai - advanced image generation model powered by google cloud

tavily/search-assistant
a search assistant that browses the internet to deliver comprehensive results, including ai-generated answers, images, and detailed sources.

openrouter/kimi-k2-thinking
a powerful open-source thinking agent that excels at complex, multi-step problem-solving and consistently uses tools effectively over extended operations.

infsh/caption-videos
add captions to videos using an existing caption file, such as those generated by a speech-to-text service.

tavily/extract
extracts clean, readable content, including text and images, from specified webpages, supporting batch processing for multiple urls.

infsh/array-switch
allows you to choose between two different inputs based on a condition applied to an array of data.

infsh/extract-last-frame
save a specific frame from the end of a video as a static image file.

infsh/dia-tts
dia tts - generate realistic dialogue with emotion control, natural nonverbals, and voice cloning

infsh/media-merger
merges multiple videos and images together using customized transitions.

infsh/array-element-switch
selects one of two possible inputs based on a comparison check within an array.

infsh/mask-image
combines two images—a main image and a semi-transparent mask—to selectively hide or reveal parts of the main image, creating a partially transparent result.
not enough? create new apps fast. templates + coding agents make the grid insanely extensible.
create your own apps
start from templates. add code, packages, docs. deploy in minutes.
schemas become tool parameters automatically. your app shows up in the grid and can be used by agents and workflows.
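
one way to picture a schema becoming tool parameters is shown below. this is only a sketch: the platform's real schema format is not described on this page, so the `ResizeInput` dataclass, the `tool_parameters` helper, and the json-schema-style output shape are all assumptions.

```python
import dataclasses
from dataclasses import dataclass

# Map Python annotations to JSON-Schema-style type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


@dataclass
class ResizeInput:
    """Hypothetical input schema for an image-resizing app."""
    image_url: str
    width: int = 0
    scale: float = 1.0


def tool_parameters(schema_cls):
    """Derive a tool-parameter description from a dataclass's fields."""
    properties, required = {}, []
    for f in dataclasses.fields(schema_cls):
        prop = {"type": _JSON_TYPES[f.type]}
        if f.default is not dataclasses.MISSING:
            prop["default"] = f.default  # optional parameter with a default
        elif f.default_factory is dataclasses.MISSING:
            required.append(f.name)      # no default at all: caller must supply it
        properties[f.name] = prop
    return {"type": "object", "properties": properties, "required": required}


params = tool_parameters(ResizeInput)
print(params["required"])  # ['image_url']
```

fields without a default become required parameters, while defaults are carried through so an agent can omit them.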
create workflows
build a graph of apps. deploy as a single callable app.

drag and drop to build the graph. map io to connect steps. deploy as an app.
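
conceptually, a workflow like this is a directed graph whose edges map one step's output to another step's input. the sketch below runs such a graph in dependency order using python's standard `graphlib`. the `run_workflow` function, the edge tuple layout, and the two stand-in steps are hypothetical and do not represent the platform's own workflow format or any real app.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, edges, initial):
    """Run steps in dependency order.

    steps:   name -> callable(**inputs) returning a dict of outputs
    edges:   list of (src_step, output_key, dst_step, input_key) io mappings
    initial: name -> dict of inputs supplied by the caller
    """
    # Each step depends on every step that feeds one of its inputs.
    deps = {name: set() for name in steps}
    for src, _out_key, dst, _in_key in edges:
        deps[dst].add(src)

    outputs = {}
    for name in TopologicalSorter(deps).static_order():
        kwargs = dict(initial.get(name, {}))
        for src, out_key, dst, in_key in edges:
            if dst == name:
                kwargs[in_key] = outputs[src][out_key]  # map io between steps
        outputs[name] = steps[name](**kwargs)
    return outputs


# Stand-in steps: plain functions, not real apps from the grid.
steps = {
    "upper": lambda text: {"text": text.upper()},
    "exclaim": lambda text: {"text": text + "!"},
}
edges = [("upper", "text", "exclaim", "text")]


def workflow(text):
    """The whole graph wrapped as a single callable."""
    return run_workflow(steps, edges, {"upper": {"text": text}})["exclaim"]["text"]


print(workflow("hi"))  # HI!
```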

