Here's everything that moved the needle in artificial intelligence this week — model releases, funding rounds, policy updates, and developer tools worth watching. We read everything so you don't have to.

OpenAI Ships GPT-5 With 256K Context and Native Tool Use

OpenAI's most significant release in two years landed quietly on a Tuesday. GPT-5 raises the ceiling on context length to 256,000 tokens — enough to process an entire codebase or a full book in a single prompt. The model ships with native tool use baked in rather than bolted on, meaning it can browse the web, execute code, and call APIs without separate orchestration layers.

Early developer benchmarks put GPT-5 ahead of Claude 3 Opus on MMLU and HumanEval, though Anthropic disputes several of the testing methodologies. The largest jump shows up in real-world multi-step reasoning: GPT-5 completes complex agent workflows that caused GPT-4 to loop or hallucinate intermediate steps.

Pricing has been a surprise: GPT-5 input tokens cost $0.01/1K — a 40% reduction from GPT-4 Turbo at launch. OpenAI is clearly betting on volume over margin as the inference cost curve continues to fall.

Anthropic Raises $4B Series D at $18.4B Valuation

Anthropic closed its largest funding round yet, with Google leading and Spark Capital, Salesforce Ventures, and a handful of sovereign wealth funds participating. The $18.4B valuation makes Anthropic the second-most valuable private AI company globally, behind only OpenAI.

CEO Dario Amodei framed the round entirely around compute: "The gap between what current models can do and what frontier models need to do requires infrastructure investment that pure revenue can't sustain yet." The funds will accelerate training runs for Claude 4 and expand Anthropic's AWS-based inference cluster.

Importantly, Anthropic renegotiated its enterprise terms with several Fortune 500 clients as part of the round, locking in multi-year contracts for Constitutional AI safety auditing — a new revenue line the company is quietly building alongside its model business.

Google DeepMind's Gemini 2 Ultra Claims Every Major Benchmark

Gemini 2 Ultra is the first model to surpass human-expert performance across MMLU, HumanEval, and MATH simultaneously, according to DeepMind's internal evaluations. Third-party testing from Stanford HAI broadly confirmed the results, with some caveats around prompting methodology for the MATH benchmark.

What's more interesting than the benchmark numbers is the multimodal architecture. Gemini 2 Ultra handles video, audio, and code in a single unified model — no separate specialized models stitched together. This matters for product development: one API, one context window, one billing line.

Google is offering Gemini 2 Ultra through Vertex AI at $0.0125/1K tokens for input, positioning it directly against GPT-5. Expect a pricing war through Q3 2026.
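
To make the price gap concrete, here is a minimal Python sketch that compares input-token spend at the two per-1K prices quoted in this digest. The workload size (one million prompts of 2,000 input tokens each) is a hypothetical figure chosen for illustration, and output-token pricing is ignored because it is not reported here.

```python
from decimal import Decimal

# Input prices per 1K tokens as quoted in this digest (USD).
PRICE_PER_1K = {
    "GPT-5": Decimal("0.01"),
    "Gemini 2 Ultra": Decimal("0.0125"),
}


def input_cost(tokens: int, price_per_1k: Decimal) -> Decimal:
    """Cost in USD of sending `tokens` input tokens at a per-1K price."""
    return Decimal(tokens) / Decimal(1000) * price_per_1k


# Hypothetical workload: one million prompts of 2,000 input tokens each.
monthly_tokens = 1_000_000 * 2_000

for model, price in PRICE_PER_1K.items():
    print(f"{model}: ${input_cost(monthly_tokens, price):,.2f}")
```

At these list prices, Gemini 2 Ultra input costs 25% more than GPT-5 input for the same token volume, so the gap scales linearly with usage.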

EU AI Act Enters Enforcement Phase

The EU's Artificial Intelligence Act began its first enforcement phase this week, activating requirements for high-risk AI systems in employment, credit scoring, and critical infrastructure. Companies deploying AI in these categories must now maintain conformity assessments, register their systems with the EU AI Office, and implement human oversight mechanisms.

Fines for non-compliance reach up to 3% of global annual revenue — or €15 million, whichever is higher. Legal teams at major technology companies have been quietly working through compliance checklists since March. Several US-based AI startups have paused EU market expansion pending clearer guidance from the AI Office on what constitutes "minimal risk."
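
The penalty ceiling is a simple maximum of two quantities. A minimal sketch, assuming revenue is given in whole euros; the function and constant names are hypothetical, and this is an illustration of the arithmetic, not legal guidance.

```python
FLAT_CAP_EUR = 15_000_000   # fixed amount quoted above
REVENUE_SHARE_PCT = 3       # percent of global annual revenue quoted above


def max_fine_eur(global_annual_revenue_eur: int) -> int:
    """Return the fine ceiling in euros: the higher of the two quoted limits."""
    revenue_based = global_annual_revenue_eur * REVENUE_SHARE_PCT // 100
    return max(revenue_based, FLAT_CAP_EUR)


# Illustrative revenue figures, not taken from any real company.
for revenue in (100_000_000, 500_000_000, 2_000_000_000):
    print(f"revenue EUR {revenue:,} -> ceiling EUR {max_fine_eur(revenue):,}")
```

The two branches cross at €500 million in revenue: below that the €15 million figure applies, above it the 3% figure does.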

The practical impact on developers: if you're building an AI application that makes decisions about people in the EU in one of the high-risk categories above, such as employment or credit scoring, you need a legal review before your next deployment.

Mistral Releases Mistral Large 2 — Open Weights, 128K Context

Paris-based Mistral AI released Mistral Large 2 this week with open weights under the Mistral Research License. The model has 123B parameters, supports a 128K-token context, and benchmarks competitively with Claude 3 Sonnet on most tasks, at a fraction of the inference cost when self-hosted.

For engineering teams running AI on their own infrastructure, Mistral Large 2 is the most capable open-weights option available. It requires two H100s for comfortable inference, but the absence of per-token licensing fees changes the unit economics substantially for high-volume applications.
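
A rough back-of-the-envelope check on the hardware claim, offered as a sketch and not a sizing guide. It assumes the 80 GB H100 variant, counts model weights only (no KV cache, activations, or framework overhead), and treats the listed precisions as illustrative choices that this digest does not specify.

```python
PARAMS = 123_000_000_000      # parameter count quoted above
H100_MEMORY_GB = 80           # assumption: 80 GB H100 variant
NUM_GPUS = 2                  # GPU count quoted above

# Assumed storage cost per parameter at common precisions (bytes).
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


budget_gb = NUM_GPUS * H100_MEMORY_GB
for precision, nbytes in BYTES_PER_PARAM.items():
    needed = weight_memory_gb(PARAMS, nbytes)
    verdict = "fits" if needed <= budget_gb else "does not fit"
    print(f"{precision}: {needed:.1f} GB of weights, {verdict} in {budget_gb} GB")
```

Under these assumptions, 16-bit weights (246 GB) exceed the 160 GB available on two cards, so a two-GPU deployment implies 8-bit or lower precision, with the remaining memory shared by the KV cache.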

Mistral continues to thread a needle between open-weights credibility and commercial viability: the Research License covers research and non-commercial use, while commercial self-deployment requires a separate commercial license from Mistral. This limits cloud hyperscaler adoption but keeps a licensed path open for the enterprise self-hosting market.

This Week's Funding Rounds

Company         Round           Amount   Lead Investor
Cohere          Series E        $450M    Salesforce Ventures
ElevenLabs      Series B        $180M    a16z
Perplexity AI   Series C        $250M    SoftBank
Runway ML       Series D        $308M    General Atlantic
Replit          Series B ext.   $97M     Google Ventures

Developer Tools Worth Your Time This Week

LangChain v2.0 shipped a major overhaul of its agent framework. Streaming-first architecture, a 50% reduction in base memory footprint, and a cleaner interface for tool registration make this worth upgrading. The migration guide is thorough — expect 2–4 hours for a medium-sized agent project.

Cursor 0.42 introduced multi-file agentic editing: the AI can now plan and execute changes across multiple files simultaneously based on a single instruction. For refactoring and feature implementation tasks, this is a step-change improvement over one-file-at-a-time edits. See our full Cursor AI review for the complete feature breakdown.

Vercel AI SDK 3.5 adds native streaming support for tool calls, dramatically simplifying the code required to build streaming agents in Next.js applications. If you're building AI-powered applications on the Vercel stack, the upgrade is an easy win.

Where can I follow AI news daily?

The fastest sources: Hugging Face's daily papers feed for research, X/Twitter for real-time model release announcements, and AIBeat for curated daily briefings without the noise. For deeper analysis, The Batch (DeepLearning.AI), Import AI (Jack Clark), and Nathan Benaich's State of AI report are worth reading weekly.

How do I track AI funding rounds?

Crunchbase and PitchBook are the primary sources for verified deal data, though both require paid subscriptions for full access. For free tracking, the AIBeat weekly digest, sent every Monday, aggregates significant rounds over $50M alongside model releases and research.