
Claude Sonnet 4 Introduces 1 Million Token Context Support


Claude Sonnet 4 now supports up to 1 million tokens of context on the Anthropic API—a fivefold increase that makes it possible to process an entire codebase of more than 75,000 lines of code, or dozens of research papers, in a single request.

Long Context Support in Public Beta

Long context support for Sonnet 4 is currently in public beta on the Anthropic API and in Amazon Bedrock, with Google Cloud’s Vertex AI expected soon.

Longer Context, More Use Cases

With enhanced context capabilities, developers can implement more comprehensive and data-intensive use cases with Claude, including:

  • Large-scale code analysis: Load entire codebases including source files, tests, and documentation. Claude can understand project architecture, identify cross-file dependencies, and suggest improvements that consider the complete system design.
  • Document synthesis: Process extensive document sets like legal contracts, research papers, or technical specifications. Analyze relationships across hundreds of documents while maintaining full context.
  • Context-aware agents: Build agents that retain context across numerous tool calls and multi-step workflows. Incorporate complete API documentation, tool definitions, and interaction histories without sacrificing coherence.
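As a concrete sketch of the first use case, the request below loads a whole codebase into a single long-context call. The beta header value `context-1m-2025-08-07` and the model id `claude-sonnet-4-20250514` are assumptions based on the public beta; check the current API documentation before relying on them.

```python
# Sketch: assembling a whole-codebase analysis request for the
# 1M-token long-context beta. No network call is made here; the
# returned kwargs would be passed to the official SDK's
# messages.create(). Header and model names are assumptions.

def build_long_context_request(system_prompt: str, codebase_files: dict) -> dict:
    """Assemble Messages API kwargs for a whole-codebase request."""
    # Concatenate source files into one user turn so Claude sees the
    # full project: architecture, tests, and cross-file dependencies.
    corpus = "\n\n".join(
        f"=== {path} ===\n{text}" for path, text in codebase_files.items()
    )
    return {
        "model": "claude-sonnet-4-20250514",  # assumed model id
        "max_tokens": 4096,
        "system": system_prompt,
        "messages": [{"role": "user", "content": corpus}],
        # Opt in to the long-context public beta (assumed header value):
        "extra_headers": {"anthropic-beta": "context-1m-2025-08-07"},
    }

request = build_long_context_request(
    "Review this project for cross-file issues.",
    {"src/app.py": "def main(): ...", "tests/test_app.py": "def test_main(): ..."},
)
# With the official SDK: anthropic.Anthropic().messages.create(**request)
```

Keeping the request assembly separate from the SDK call makes the payload easy to inspect and test before spending tokens on a million-token prompt.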

API Pricing Adjustments

To reflect the increased computational requirements, prompts over 200K tokens are priced as follows:

                        Input        Output
Prompts ≤ 200K tokens   $3 / MTok    $15 / MTok
Prompts > 200K tokens   $6 / MTok    $22.50 / MTok
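A minimal sketch of the tiered pricing above, assuming (as the table implies) that a request whose input exceeds 200K tokens is billed at the higher rate for all of its tokens:

```python
# Tiered pricing from the table above, in USD per million tokens.
# Assumption: crossing the 200K-token input threshold moves the
# entire request to the higher tier.

LONG_CONTEXT_THRESHOLD = 200_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request under the tiered pricing."""
    if input_tokens <= LONG_CONTEXT_THRESHOLD:
        in_rate, out_rate = 3.00, 15.00    # <= 200K tier
    else:
        in_rate, out_rate = 6.00, 22.50    # > 200K tier
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 500K-token prompt with a 10K-token reply costs about $3.23:
print(round(request_cost(500_000, 10_000), 4))
```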

Using prompt caching, users can reduce latency and costs for Claude Sonnet 4 with long context. The 1M context window can also be paired with batch processing for an additional 50% cost savings.
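Prompt caching pays off with long context because the large, stable prefix (for example, the codebase) can be reused across requests. The sketch below marks that prefix with an `ephemeral` `cache_control` block, following the structure of the Anthropic Messages API; the helper name is hypothetical.

```python
# Sketch of prompt caching with long context: mark the large, stable
# prefix as cacheable so repeat requests over the same codebase
# reuse it, and only the trailing question varies per call.

def cached_codebase_turn(corpus: str, question: str) -> list:
    """Build a user turn whose large prefix is marked cacheable."""
    return [{
        "role": "user",
        "content": [
            # Big, stable prefix: cached across requests.
            {"type": "text", "text": corpus,
             "cache_control": {"type": "ephemeral"}},
            # Small, varying suffix: not cached.
            {"type": "text", "text": question},
        ],
    }]

messages = cached_codebase_turn(
    "<contents of all source files>",
    "Where is authentication handled?",
)
```

Placing the cache breakpoint after the corpus and before the question keeps the expensive part of the prompt byte-identical between calls, which is what allows the cache to hit.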

Customer Spotlight: Bolt.new

Bolt.new is reimagining web development by incorporating Claude into its browser-based development platform.

“Claude Sonnet 4 remains our go-to model for code generation workflows, consistently outperforming other leading models in production. With the 1M context window, developers can now work on significantly larger projects while maintaining the high accuracy we need for real-world coding.”