Model Library

OpenAI's latest image generation and editing model with a reasoning thinking mode achieving 99%+ text rendering accuracy, supporting up to 2K resolution, flexible aspect ratios, and multilingual text, deeply integrated into ChatGPT and API.

Kling

kling-v3

Kuaishou's unified multimodal video model with native 4K/60fps output, AI Director multi-shot storyboarding, multilingual native audio, and ultimate character consistency, unifying video understanding, generation, and editing in one workflow.

gemini-3.1-pro-preview

Google's frontier reasoning model, leading on novel logic (ARC-AGI-2 77.1%), graduate-level science (GPQA Diamond 94.3%) and agentic reliability, with a 1M-token context and full multimodal input — built for complex software engineering, long-document analysis and autonomous agent workflows.

claude-opus-4-8

Anthropic's flagship LLM for agentic coding, multidisciplinary reasoning, and knowledge work, with sharper judgment and stronger honesty, a 1M-token context window, and fast mode—built for code migration, deep research, and legal/financial analysis.

DeepSeek

deepseek-v4-flash

DeepSeek's open-source efficiency-tier flagship — 284B MoE with 13B active parameters, 1M context, dual thinking/non-thinking modes, priced at a fraction of closed frontier models for top price-performance.

claude-opus-4-7

Anthropic's flagship LLM excelling at long-horizon agents and coding, scoring 87.6% on SWE-bench Verified, with 1M context, adaptive thinking, and high-res vision — for complex coding and enterprise workflows.

claude-opus-4-6

Anthropic's frontier flagship model built for deep reasoning, long-horizon agentic coding, and enterprise knowledge work — tops Terminal-Bench 2.0 and HLE with 1M context and adaptive thinking for the hardest tasks.

claude-opus-4-5-20251101

Anthropic's 4th-gen flagship setting state-of-the-art on SWE-Bench Verified and ARC-AGI-2, introducing effort control and context compaction, with industry-leading prompt-injection robustness — built for coding, reasoning, and enterprise agents.

claude-sonnet-4-6

Anthropic's high-value mid-tier model approaching Opus-level performance on coding, computer use, and agent orchestration, with 1M context and adaptive thinking — built for scaled coding, desktop automation, and enterprise agents.

Z.ai

glm-5.1

Z.ai's open-source flagship LLM with 754B MoE architecture, purpose-built for agentic coding and long-horizon tasks, capable of 8-hour autonomous execution and topping SWE-Bench Pro.

Alibaba

happyhorse-1.0

Alibaba ATH Innovation Unit's 15B video model — the first with native joint audio-video generation including dialogue, ambient sound, and lip-sync — #1 on the Artificial Analysis Video Arena, built for short-form, ads, and dialogue-driven video.

DeepSeek

deepseek-v4-pro

DeepSeek's open-weight flagship 1.6T MoE model with hybrid sparse attention, 1M-token context, competition-grade coding (LiveCodeBench 93.5), and ultra-low pricing — built for code, reasoning, and long-horizon agents.

Moonshot

kimi-k2.6

Moonshot AI's open-weight 1T-parameter MoE multimodal agentic model, excelling in long-horizon coding, 300-sub-agent swarm orchestration, and native tool use — built for autonomous software engineering and agent workflows.

OpenAI

gpt-5.5

OpenAI's flagship frontier model — the first fully retrained base model since GPT-4.5, natively omnimodal with a 1M-token context, purpose-built for agentic coding, computer use, and knowledge work.

gemini-3.1-flash-image-preview

Nano Banana (gemini-3.1-flash-image-preview) is a highly multifunctional generative model designed for speed, accuracy, and creative flexibility. It excels at transforming text prompts into stunning visual effects while providing advanced composition and style transfer capabilities.

gemini-3-pro-image-preview

Google's flagship image generation and editing model built on Gemini 3 Pro, featuring up to 4K resolution, precise multilingual text rendering, real-time Google Search grounding, and studio-quality creative controls.

gemini-2.5-flash-image

Nano Banana (`gemini-2.5-flash-image`) is Google's natively multimodal image generation and editing model designed for speed and efficiency, supporting text-to-image, conversational editing, multi-image composition, and character consistency.

Bytedance

seedream-5.0-lite

ByteDance's intelligent image model with Chain of Thought visual reasoning and real-time web search, comprehensively upgraded in understanding, reasoning, and generation, supporting up to 4K output and 14-image reference editing.

Bytedance

seedream-4.0

ByteDance's multimodal image creation engine unifying text-to-image generation and editing in one architecture, supporting up to 4K output, 30+ art style switching, and multi-reference batch generation with 10x faster inference.

Bytedance

seedream-4.5

ByteDance's unified image generation and editing model with precise text rendering, native 4K resolution, and multi-image reference editing, supporting professional typography, material fidelity, and brand visual consistency for commercial design.

MiniMax

MiniMax-Hailuo-2.3

MiniMax's upgraded video model built on Hailuo 02, enhancing complex body motion, facial micro-expressions, and physical realism, with expanded style support including anime, ink wash, and game CG — better quality at the same price.

MiniMax

MiniMax-Hailuo-02

MiniMax's cinematic AI video generation model built on its proprietary NCR architecture with 2.5x efficiency gains, supporting native 1080p output, extreme physics simulation, and precise instruction following, ranked among the top global video models.

Alibaba

wan2.6

Alibaba's multimodal video generation model series supporting role-play (reference-to-video), multi-shot narrative, audio-visual sync, and up to 15-second output, enabling creators to star in AI videos with their own appearance and voice.

gemini-2.5-flash

Google's next-gen lightweight multimodal model featuring ultra-fast inference, native multimodal fusion, and massive context capacity, optimized for high-frequency interactions, real-time responses, and cost-efficient data processing.

MiniMax

minimax-m2.7

MiniMax's self-evolving 230B MoE model (10B active) with native Agent Teams, dynamic tool search, and end-to-end engineering delivery — MLE Bench Lite 66.6% trails only Opus 4.6/GPT-5.4, built for SRE, ML research, and office automation.

OpenAI

sora-2