Supported Models
25 models across 10 providers. Last updated March 26, 2026.
Video Generation
9 models
Veo 3.1
Google AI Studio
State-of-the-art video generation with audio. Text-to-video and image-to-video.
Veo 3.1 Fast
Google AI Studio
Faster inference variant of Veo 3.1 for rapid iteration.
Sora 2 Standard
OpenAI
High-quality video generation from text and image inputs.
Sora 2 Pro
OpenAI
Premium video generation with higher resolution and longer durations.
Kling 3.0
fal.ai
High-quality video generation with strong motion consistency.
Seedance 2.0
fal.ai
Dance and motion-focused video generation with natural movement.
Wan 2.2 (A14B)
fal.ai
Text-to-video with strong character consistency. 14B-parameter model.
Wan 2.6
fal.ai
Latest Wan model with improved quality and longer durations.
HunyuanVideo (13B)
fal.ai
Tencent's 13B-parameter video generation model, served via fal.ai.
Image Generation
4 models
Nano Banana Pro
Google AI Studio
High-quality image generation with excellent prompt adherence.
Nano Banana 2
Google AI Studio
Next-gen image generation with improved detail and consistency.
FLUX.2 Pro
Black Forest Labs
High-quality image generation with excellent prompt adherence and detail.
FLUX.2 klein 4B
Local (Sidecar)
Lightweight FLUX model for local image generation on Mac (Metal/MPS).
LLMs
5 models
Gemini Pro (latest)
Google AI Studio
Advanced reasoning and multimodal understanding with 1M token context.
Gemini Flash (latest)
Google AI Studio
Fast, cost-efficient model for everyday tasks. 1M token context.
GPT-5
OpenAI
Advanced LLM with strong reasoning and 128K context window.
Claude Sonnet 4.6
Anthropic
High-capability model with excellent instruction following and 200K context.
Qwen3 4B
Local (Ollama)
Compact LLM for script writing and AI assistant tasks. Runs fully on-device.
Audio / TTS
2 models
Multilingual v2
ElevenLabs
High-quality multilingual voice cloning and text-to-speech.
Turbo v2.5
ElevenLabs
Low-latency voice synthesis for fast iteration and previews.
Transcription
5 models
Gemini Transcription
Google AI Studio
Cloud-based transcription via Gemini with high accuracy.
GPT-5 Transcription
OpenAI
Cloud-based transcription via OpenAI with word-level timestamps.
Scribe v1
ElevenLabs
ElevenLabs transcription with speaker detection.
Whisper Tiny
Local (bundled)
Lightweight on-device transcription. Bundled with Skia, no download required.
Whisper Small
Local (download)
Higher-accuracy on-device transcription. One-time download (~500 MB).