OpenAI/gpt-5
OpenAI's unified AI system combining a fast-response model and a deep-reasoning model via real-time routing, delivering state-of-the-art coding, math, multimodal understanding, and health reasoning for chat, development, and agentic tasks.
Supported Functionality
| Item | Specification |
|---|---|
| Input | Text, Image |
| Output | Text |
| Context | Up to 400,000 tokens (combined input + output) via API; ChatGPT product tiers vary (Free 8,000 / Plus 32,000 / Pro 128,000) |
| Max Output | 128,000 tokens |
| Vision | ✓ Supported |
| Function Calling | ✓ Supported |
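Because the API context figure is a combined input + output budget, reserving output tokens reduces the room left for the prompt. The sketch below works through that arithmetic with the numbers from the table. The helper names are illustrative, and it assumes the limit is a simple sum; any separate input cap the API may enforce is not covered by this page.

```python
# Figures from the specification table; helper names are illustrative only.
CONTEXT_WINDOW = 400_000  # combined input + output tokens via API
MAX_OUTPUT = 128_000      # maximum output tokens


def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return how many input tokens fit once output tokens are reserved."""
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


def fits(input_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Check whether a prompt fits alongside the reserved output budget."""
    return 0 <= input_tokens <= max_input_tokens(reserved_output)


print(max_input_tokens())     # 272000
print(fits(300_000, 64_000))  # True: 300,000 + 64,000 <= 400,000
```

Reserving the full 128,000 output tokens leaves 272,000 tokens for input; reserving less frees a correspondingly larger prompt.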
Description
Released by OpenAI in August 2025, GPT-5 was introduced as the company's "best AI system yet," consolidating the previously fragmented lineup of GPT-4o, o3, and other models into a single entry point. Rather than a single model, GPT-5 is a unified system composed of a fast, efficient model that handles most requests, a deeper reasoning model (GPT-5 thinking) for harder problems, and a real-time router that automatically selects which sub-model to invoke based on conversation type, task complexity, tool needs, and explicit user intent (e.g., "think hard about this").
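As a purely illustrative toy, the routing idea can be pictured as a function that maps request signals to one of two sub-models. Nothing below reflects OpenAI's actual router: the `Request` fields, the hint phrases, the threshold, and the rule that tool use selects the reasoning model are all assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request signals; not an OpenAI data structure."""
    text: str
    needs_tools: bool = False
    estimated_complexity: float = 0.0  # made-up score in [0, 1]


# Assumed phrases that signal explicit user intent to reason more deeply.
THINK_HINTS = ("think hard", "think carefully", "step by step")


def route(request: Request, complexity_threshold: float = 0.6) -> str:
    """Return "reasoning" or "fast" using invented, simplified rules."""
    text = request.text.lower()
    if any(hint in text for hint in THINK_HINTS):
        return "reasoning"  # explicit user intent
    if request.needs_tools:
        return "reasoning"  # assumed: tool use goes to the deeper model
    if request.estimated_complexity >= complexity_threshold:
        return "reasoning"  # task complexity
    return "fast"           # default for routine requests


print(route(Request("Think hard about this proof")))    # reasoning
print(route(Request("What is the capital of France?")))  # fast
```

The real router is described as a learned, real-time component; a fixed rule list like this only conveys the shape of the decision, not how it is made.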
Compared to prior models, GPT-5 delivers a substantial leap in coding, math, multimodal understanding, and real-world agentic tool use, while needing markedly less thinking time and fewer tool calls to reach comparable results. With web search enabled, GPT-5's responses are roughly 45% less likely to contain a factual error than GPT-4o's; in thinking mode, they are roughly 80% less likely to contain one than OpenAI o3's, a notable reduction in hallucination.
Key Capabilities
- Code generation & debugging: Scores 74.9% on SWE-bench Verified and 88% on Aider Polyglot multi-language code editing, well ahead of GPT-4.1 and o3.
- Mathematical reasoning: Achieves 94.6% on AIME 2025 (competition-level math, no tools), demonstrating strong pure reasoning ability.
- Multimodal understanding: Reaches 84.2% on MMMU, accurately interpreting charts, interface screenshots, and complex visual information.
- Health-related Q&A: Scores 46.2% on HealthBench Hard, proactively flagging potential health concerns and asking clarifying questions; the strongest health-focused model at release.
- Agentic tool use: Chains together dozens of tool calls, sequentially or in parallel, maintaining context across long, multi-step workflows while using fewer calls and less thinking time.
- Front-end development: In internal testing, GPT-5 outperformed o3 in front-end scenarios 70% of the time, producing cleaner layouts, typography, and spacing.
- Intelligent routing: A real-time router automatically judges task complexity and switches between fast response and deep reasoning without requiring the user to pick a model manually.
Technical Strengths
| Feature | Benefit |
|---|---|
| Unified system architecture | Eliminates the need to manually switch between models; the router dispatches tasks automatically for a more coherent experience |
| Improved reasoning efficiency | Uses 50–80% fewer output tokens than o3 for comparable results, lowering cost and latency |
| Lower hallucination rate | ~45% fewer factual errors with search, ~80% fewer in thinking mode, improving answer trustworthiness |
| Strong code comprehension | Substantial gains on SWE-bench and Aider Polyglot make it well-suited to real software engineering work |
| Health-domain tuning | Continuously optimized against HealthBench, adapting responses to user knowledge level and geography |
| GPT-5 pro extended reasoning | Offers longer reasoning chains for Pro users, reaching 88.4% on GPQA for the hardest professional questions |
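To make the "50–80% fewer output tokens than o3" figure concrete, the sketch below converts a baseline o3 output-token count into the implied GPT-5 range. The function name and the sample baseline are hypothetical; the percentages come from the table, and real savings will vary by task.

```python
def gpt5_output_token_range(o3_output_tokens: int) -> tuple[int, int]:
    """Return (low, high) output tokens implied by an 80% and a 50% reduction."""
    low = o3_output_tokens * (100 - 80) // 100   # 80% fewer tokens
    high = o3_output_tokens * (100 - 50) // 100  # 50% fewer tokens
    return low, high


# Hypothetical baseline: a task on which o3 emitted 10,000 output tokens.
print(gpt5_output_token_range(10_000))  # (2000, 5000)
```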
Capability Ratings
| Dimension | Rating | Notes |
|---|---|---|
| Reasoning | Top-tier | New highs on demanding benchmarks such as AIME and GPQA at release |
| Coding | Top-tier | Leading results on SWE-bench Verified and Aider Polyglot |
| Creative Writing | Strong | No dedicated creative-writing benchmarks published, but solid as a general-purpose model |
| Multimodal | Excellent | 84.2% on MMMU; handles charts, screenshots, and complex visuals |
| Response Speed | Fast | Router defaults to the efficient model for routine requests, reserving deep reasoning for complex tasks |
| Context Window | Medium | 400,000 tokens combined input/output via API; product tiers are more limited |
Use Cases
- Software development & code review: Leverages strong SWE-bench performance for feature implementation, debugging, and multi-language editing.
- Math & scientific computation: Handles competition-level math and graduate-level scientific reasoning step by step.
- Multimodal content analysis: Interprets charts, interface screenshots, and technical documents alongside text.
- Health information support: Helps users understand test results and prepare questions for doctors, as a decision-support aid (not a replacement for professional medical advice).
- Agentic automation: Chains multiple tool calls to complete complex, multi-step workflows such as data processing and cross-application tasks.
- Front-end interface development: Generates well-laid-out, visually polished web and app UI prototypes.
- General conversation & writing: Serves as a unified entry point for everyday Q&A, writing, and information retrieval.
Pricing
| Token Type | LinkAI Price | Official Price |
|---|---|---|
| input | $0.9375 / 1M tokens | $1.25 / 1M tokens |
| output | $7.50 / 1M tokens | $10.00 / 1M tokens |
| reasoning_tokens | $7.50 / 1M tokens | $10.00 / 1M tokens |
| cache_read | $0.09375 / 1M tokens | $0.125 / 1M tokens |
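A request's cost under either price column can be estimated by multiplying each token count by its per-million rate. The sketch below is a minimal estimator using the prices from the table. It assumes the four token categories are counted separately (for example, that cached reads are not also billed as regular input); the function name and the sample usage figures are hypothetical.

```python
# USD per 1M tokens, copied from the pricing table.
PRICES_PER_MILLION = {
    "official": {"input": 1.25, "output": 10.0,
                 "reasoning_tokens": 10.0, "cache_read": 0.125},
    "linkai": {"input": 0.9375, "output": 7.5,
               "reasoning_tokens": 7.5, "cache_read": 0.09375},
}


def estimate_cost(usage: dict[str, int], price_list: str = "official") -> float:
    """Return the estimated cost in USD for token counts keyed by token type."""
    prices = PRICES_PER_MILLION[price_list]
    return sum(prices[kind] * count for kind, count in usage.items()) / 1_000_000


# Hypothetical usage for one long agentic session.
usage = {"input": 200_000, "cache_read": 800_000,
         "output": 50_000, "reasoning_tokens": 150_000}

print(f"{estimate_cost(usage, 'official'):.4f}")  # 2.3500
print(f"{estimate_cost(usage, 'linkai'):.4f}")    # 1.7625
```

In this example the reasoning tokens account for most of the bill, which is why the output-token efficiency noted earlier matters for cost.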