Model Library
Browse all available AI models
seedance-2-0
Seedance 2.0 is a multimodal controllable video generation model developed by ByteDance’s Seed Team. Launched in early February 2026, it is now integrated into Doubao, Jimeng AI, and Volcano Engine (Model ID: doubao-seedance-2-0-260128), with an accelerated version—Seedance 2.0 Fast—available for low-latency scenarios.
gpt-image-2
OpenAI's latest image generation and editing model, with a reasoning ("thinking") mode that achieves 99%+ text rendering accuracy. It supports up to 2K resolution, flexible aspect ratios, and multilingual text, and is deeply integrated into ChatGPT and the API.
kling-v3
Kuaishou's unified multimodal video model with native 4K/60fps output, AI Director multi-shot storyboarding, multilingual native audio, and ultimate character consistency, unifying video understanding, generation, and editing in one workflow.
gemini-3.1-pro-preview
Google's frontier reasoning model, leading on novel logic (ARC-AGI-2 77.1%), graduate-level science (GPQA Diamond 94.3%) and agentic reliability, with a 1M-token context and full multimodal input — built for complex software engineering, long-document analysis and autonomous agent workflows.
claude-opus-4-8
Anthropic's flagship LLM for agentic coding, multidisciplinary reasoning, and knowledge work, with sharper judgment and stronger honesty, a 1M-token context window, and fast mode—built for code migration, deep research, and legal/financial analysis.
deepseek-v4-flash
DeepSeek's open-source efficiency-tier flagship: a 284B MoE with 13B active parameters, 1M context, and dual thinking/non-thinking modes, priced at a fraction of the cost of closed frontier models for top price-performance.
claude-opus-4-7
Anthropic's flagship LLM excelling at long-horizon agents and coding, scoring 87.6% on SWE-bench Verified, with 1M context, adaptive thinking, and high-res vision — for complex coding and enterprise workflows.
claude-opus-4-6
Anthropic's frontier flagship model built for deep reasoning, long-horizon agentic coding, and enterprise knowledge work — tops Terminal-Bench 2.0 and HLE with 1M context and adaptive thinking for the hardest tasks.
claude-opus-4-5-20251101
Anthropic's 4th-gen flagship setting state-of-the-art on SWE-Bench Verified and ARC-AGI-2, introducing effort control and context compaction, with industry-leading prompt-injection robustness — built for coding, reasoning, and enterprise agents.
claude-sonnet-4-6
Anthropic's high-value mid-tier model approaching Opus-level performance on coding, computer use, and agent orchestration, with 1M context and adaptive thinking — built for scaled coding, desktop automation, and enterprise agents.
glm-5.1
Z.ai's open-source flagship LLM with 754B MoE architecture, purpose-built for agentic coding and long-horizon tasks, capable of 8-hour autonomous execution and topping SWE-Bench Pro.
happyhorse-1.0
Alibaba ATH Innovation Unit's 15B video model — the first with native joint audio-video generation including dialogue, ambient sound, and lip-sync — #1 on the Artificial Analysis Video Arena, built for short-form, ads, and dialogue-driven video.
deepseek-v4-pro
DeepSeek's open-weight flagship 1.6T MoE model with hybrid sparse attention, 1M-token context, competition-grade coding (LiveCodeBench 93.5), and ultra-low pricing — built for code, reasoning, and long-horizon agents.
kimi-k2.6
Moonshot AI's open-weight 1T-parameter MoE multimodal agentic model, excelling in long-horizon coding, 300-sub-agent swarm orchestration, and native tool use — built for autonomous software engineering and agent workflows.
gpt-5.5
OpenAI's flagship frontier model — the first fully retrained base model since GPT-4.5, natively omnimodal with a 1M-token context, purpose-built for agentic coding, computer use, and knowledge work.
gemini-3.1-flash-image-preview
Nano Banana (gemini-3.1-flash-image-preview) is a versatile generative model designed for speed, accuracy, and creative flexibility. It turns text prompts into images and provides advanced composition and style transfer capabilities.
gemini-3-pro-image-preview
Google's flagship image generation and editing model built on Gemini 3 Pro, featuring up to 4K resolution, precise multilingual text rendering, real-time Google Search grounding, and studio-quality creative controls.
gemini-2.5-flash-image
Nano Banana (gemini-2.5-flash-image) is Google's natively multimodal image generation and editing model designed for speed and efficiency, supporting text-to-image, conversational editing, multi-image composition, and character consistency.
seedream-5.0-lite
ByteDance's intelligent image model with Chain of Thought visual reasoning and real-time web search, comprehensively upgraded in understanding, reasoning, and generation, supporting up to 4K output and 14-image reference editing.
seedream-4.0
ByteDance's multimodal image creation engine unifying text-to-image generation and editing in one architecture, supporting up to 4K output, switching among 30+ art styles, and multi-reference batch generation with 10x faster inference.
seedream-4.5
ByteDance's unified image generation and editing model with precise text rendering, native 4K resolution, and multi-image reference editing, supporting professional typography, material fidelity, and brand visual consistency for commercial design.
MiniMax-Hailuo-2.3
MiniMax's upgraded video model built on Hailuo 02, enhancing complex body motion, facial micro-expressions, and physical realism, with expanded style support including anime, ink wash, and game CG. It offers better quality at the same price as Hailuo 02.
MiniMax-Hailuo-02
MiniMax's cinematic AI video generation model built on its proprietary NCR architecture with 2.5x efficiency gains, supporting native 1080p output, extreme physics simulation, and precise instruction following, ranked among the top global video models.
wan2.6
Alibaba's multimodal video generation model series supporting role-play (reference-to-video), multi-shot narrative, audio-visual sync, and up to 15-second output, enabling creators to star in AI videos with their own appearance and voice.
gemini-2.5-flash
Google's next-gen lightweight multimodal model featuring ultra-fast inference, native multimodal fusion, and massive context capacity, optimized for high-frequency interactions, real-time responses, and cost-efficient data processing.
minimax-m2.7
MiniMax's self-evolving 230B MoE model (10B active) with native Agent Teams, dynamic tool search, and end-to-end engineering delivery. Its 66.6% on MLE Bench Lite trails only Opus 4.6 and GPT-5.4. Built for SRE, ML research, and office automation.
sora-2
OpenAI's flagship video and audio generation model built on a diffusion transformer architecture, featuring realistic physics simulation, synchronized audio-visual generation, and multi-shot controllability for cinematic content creation.