Model Library

Browse all available AI models

Volcengine
Bytedance

seedance-2-0

Text → VideoImage → Video

Seedance 2.0 is a multimodal controllable video generation model developed by ByteDance’s Seed Team. Launched in early February 2026, it is now integrated into Doubao, Jimeng AI, and Volcano Engine (Model ID: doubao-seedance-2-0-260128), with an accelerated version—Seedance 2.0 Fast—available for low-latency scenarios.

OpenAI
OpenAI

gpt-image-2

Text → ImageImage → Image

OpenAI's latest image generation and editing model with a reasoning thinking mode achieving 99%+ text rendering accuracy, supporting up to 2K resolution, flexible aspect ratios, and multilingual text, deeply integrated into ChatGPT and API.

Kling
Kling

kling-v3

Text → VideoImage → Video

Kuaishou's unified multimodal video model with native 4K/60fps output, AI Director multi-shot storyboarding, multilingual native audio, and ultimate character consistency, unifying video understanding, generation, and editing in one workflow.

Gemini
Google

gemini-3.1-pro-preview

Chat

Google's frontier reasoning model, leading on novel logic (ARC-AGI-2 77.1%), graduate-level science (GPQA Diamond 94.3%) and agentic reliability, with a 1M-token context and full multimodal input — built for complex software engineering, long-document analysis and autonomous agent workflows.

Anthropic
Anthropic

claude-opus-4-8

Chat

Anthropic's flagship LLM for agentic coding, multidisciplinary reasoning, and knowledge work, with sharper judgment and stronger honesty, a 1M-token context window, and fast mode—built for code migration, deep research, and legal/financial analysis.

DeepSeek
DeepSeek

deepseek-v4-flash

Chat

DeepSeek's open-source efficiency-tier flagship — 284B MoE with 13B active parameters, 1M context, dual thinking/non-thinking modes, priced at a fraction of closed frontier models for top price-performance.

Anthropic
Anthropic

claude-opus-4-7

Chat

Anthropic's flagship LLM excelling at long-horizon agents and coding, scoring 87.6% on SWE-bench Verified, with 1M context, adaptive thinking, and high-res vision — for complex coding and enterprise workflows.

Anthropic
Anthropic

claude-opus-4-6

Chat

Anthropic's frontier flagship model built for deep reasoning, long-horizon agentic coding, and enterprise knowledge work — tops Terminal-Bench 2.0 and HLE with 1M context and adaptive thinking for the hardest tasks.

Anthropic
Anthropic

claude-opus-4-5-20251101

Chat

Anthropic's 4th-gen flagship setting state-of-the-art on SWE-Bench Verified and ARC-AGI-2, introducing effort control and context compaction, with industry-leading prompt-injection robustness — built for coding, reasoning, and enterprise agents.

Anthropic
Anthropic

claude-sonnet-4-6

Chat

Anthropic's high-value mid-tier model approaching Opus-level performance on coding, computer use, and agent orchestration, with 1M context and adaptive thinking — built for scaled coding, desktop automation, and enterprise agents.

Z.aiZ.ai

glm-5.1

Chat

Z.ai's open-source flagship LLM with 754B MoE architecture, purpose-built for agentic coding and long-horizon tasks, capable of 8-hour autonomous execution and topping SWE-Bench Pro.

Qwen
Alibaba

happyhorse-1.0

Text → VideoImage → Video

Alibaba ATH Innovation Unit's 15B video model — the first with native joint audio-video generation including dialogue, ambient sound, and lip-sync — #1 on the Artificial Analysis Video Arena, built for short-form, ads, and dialogue-driven video.

DeepSeek
DeepSeek

deepseek-v4-pro

Chat

DeepSeek's open-weight flagship 1.6T MoE model with hybrid sparse attention, 1M-token context, competition-grade coding (LiveCodeBench 93.5), and ultra-low pricing — built for code, reasoning, and long-horizon agents.

MoonshotAI
Moonshot

kimi-k2.6

Chat

Moonshot AI's open-weight 1T-parameter MoE multimodal agentic model, excelling in long-horizon coding, 300-sub-agent swarm orchestration, and native tool use — built for autonomous software engineering and agent workflows.

OpenAI
OpenAI

gpt-5.5

Chat

OpenAI's flagship frontier model — the first fully retrained base model since GPT-4.5, natively omnimodal with a 1M-token context, purpose-built for agentic coding, computer use, and knowledge work.

Gemini
Google

gemini-3.1-flash-image-preview

Text → ImageImage → Image

Nano Banana (gemini-3.1-flash-image-preview) is a highly multifunctional generative model designed for speed, accuracy, and creative flexibility. It excels at transforming text prompts into stunning visual effects while providing advanced composition and style transfer capabilities.

Gemini
Google

gemini-3-pro-image-preview

Text → ImageImage → Image

Google's flagship image generation and editing model built on Gemini 3 Pro, featuring up to 4K resolution, precise multilingual text rendering, real-time Google Search grounding, and studio-quality creative controls.

Gemini
Google

gemini-2.5-flash-image

Text → ImageImage → Image

Nano Banana (`gemini-2.5-flash-image`) is Google's natively multimodal image generation and editing model designed for speed and efficiency, supporting text-to-image, conversational editing, multi-image composition, and character consistency.

Volcengine
Bytedance

seedream-5.0-lite

Text → ImageImage → Image

ByteDance's intelligent image model with Chain of Thought visual reasoning and real-time web search, comprehensively upgraded in understanding, reasoning, and generation, supporting up to 4K output and 14-image reference editing.

Volcengine
Bytedance

seedream-4.0

Text → ImageImage → Image

ByteDance's multimodal image creation engine unifying text-to-image generation and editing in one architecture, supporting up to 4K output, 30+ art style switching, and multi-reference batch generation with 10x faster inference.

Volcengine
Bytedance

seedream-4.5

Text → ImageImage → Image

ByteDance's unified image generation and editing model with precise text rendering, native 4K resolution, and multi-image reference editing, supporting professional typography, material fidelity, and brand visual consistency for commercial design.

Minimax
MiniMax

MiniMax-Hailuo-2.3

Text → VideoImage → Video

MiniMax's upgraded video model built on Hailuo 02, enhancing complex body motion, facial micro-expressions, and physical realism, with expanded style support including anime, ink wash, and game CG — better quality at the same price.

Minimax
MiniMax

MiniMax-Hailuo-02

Text → VideoImage → Video

MiniMax's cinematic AI video generation model built on its proprietary NCR architecture with 2.5x efficiency gains, supporting native 1080p output, extreme physics simulation, and precise instruction following, ranked among the top global video models.

Qwen
Alibaba

wan2.6

Text → VideoImage → Video

Alibaba's multimodal video generation model series supporting role-play (reference-to-video), multi-shot narrative, audio-visual sync, and up to 15-second output, enabling creators to star in AI videos with their own appearance and voice.

Gemini
Google

gemini-2.5-flash

Chat

Google's next-gen lightweight multimodal model featuring ultra-fast inference, native multimodal fusion, and massive context capacity, optimized for high-frequency interactions, real-time responses, and cost-efficient data processing.

Minimax
MiniMax

minimax-m2.7

Chat

MiniMax's self-evolving 230B MoE model (10B active) with native Agent Teams, dynamic tool search, and end-to-end engineering delivery — MLE Bench Lite 66.6% trails only Opus 4.6/GPT-5.4, built for SRE, ML research, and office automation.

OpenAI
OpenAI

sora-2

Text → VideoImage → Video

OpenAI's flagship video and audio generation model built on a diffusion transformer architecture, featuring realistic physics simulation, synchronized audio-visual generation, and multi-shot controllability for cinematic content creation.