apps/klingai

klingai

15 apps available on inference.sh.
run via API, SDK, or belt CLI.

15apps

3categories

image-v2

Kling Image V2 (Kolors V2.0) - text-to-image with 2K resolution, multi-image reference, and restyle. Restyle output matches input resolution.

image

image-o1

Kling Image O1 (Kolors Image-O1) - omni image generation with element control. Text-to-image and image-to-image at 1K/2K. $0.028/image.

image

image-v2

kling image v2 (kolors v2.0) - text-to-image with 2k resolution, multi-image reference, and restyle. restyle output matches input resolution.

image

image-o1

kling image o1 (kolors image-o1) - omni image generation with element control. text-to-image and image-to-image at 1k/2k. $0.028/image.

image

image-3o

kling image 3o (kolors image-3o) - most capable image model with native 4k, series-image generation, and element control. $0.028/image (4k $0.056).

image

image-v1

kling image v1 (kolors v1.0) - basic text-to-image and image-to-image generation. cheapest option at $0.0035/image.

image

image-v1-5

kling image v1.5 (kolors v1.5) - text-to-image with subject and face reference for character consistency. generate images preserving a person's appearance.

image

image-v2-1

kling image v2.1 (kolors v2.1) - text-to-image and multi-image reference generation. combine multiple images for complex compositions.

image

image-v3

kling image v3 (kolors v3.0) - latest image generation model with 1k/2k resolution support. highest quality text-to-image.

image

virtual-tryon

kling virtual try-on - ai clothing try-on from a person photo and clothing image. supports single items and upper+lower combos (v1.5). $0.07 per generation.

image

video

video-v3

kling v3.0 - latest and most capable video generation model. native 4k output, multi-shot generation, flexible 3-15s duration billed per second, element control, motion control, and synchronized audio.

video

video-v2-6

kling v2.6 video generation with native sound and voice control. supports text-to-video and image-to-video with start/end frames, synchronized audio generation, and voice-driven character animation.

video

video-o1

kling video o1 (omni) - unified video generation with text, image references, start/end frames, element references, and video references for editing and style transfer. the most capable kling model.

video

video-v2-5

kling v2.5 turbo - fast video generation from text and images. supports start/end frame interpolation in pro mode. optimized for speed while maintaining high quality at up to 1080p.

video

lip-sync

kling lip sync - drive mouth movements in videos using text or audio. ideal for dubbing, adding speech to silent videos, or replacing dialogue.

video

avatar

kling avatar - generate digital human broadcast-style talking head videos from a single face photo. provide text or audio for the avatar to speak.

video

audio

video-to-audio

kling video-to-audio - add generated sound effects, ambient audio, or music to any video. works with kling-generated and user-uploaded videos (3-20s).

audio

explore more on inference.sh

klingai is one of many providers on the grid. discover 250+ apps across image, video, audio, and more.

browse all apps api docs

we use cookies

we use cookies to ensure you get the best experience on our website. for more information on how we use cookies, please see our cookie policy.

by clicking "accept", you agree to our use of cookies.
learn more.