Leaked internal documents analyzed by TechCrunch show OpenAI paid Microsoft billions under a revenue-share arrangement and spent roughly $12.4 billion on Azure inference from 2024 to 2025. The leaks highlight recurring AI compute costs, margin pressure, and the strategic role of cloud partners.

Leaked internal documents, analyzed in a TechCrunch-led series of reports, reveal the scale of OpenAI's payments to Microsoft under a revenue-share arrangement and detail the company's inference spend on Azure. Public summaries and independently circulated copies place OpenAI's inference costs in the billions for 2024 to 2025, with some coverage citing a figure near $12.4 billion. These numbers expose the raw infrastructure bills behind large language models and sharpen the debate over AI cloud costs, pricing, and margins.
AI platforms incur two distinct kinds of compute spend. Training costs are episodic and capital intensive. Inference costs are ongoing and accrue each time a deployed model answers a user query. For commercial AI providers, inference can become the dominant recurring expense as usage scales. The leaked documents put a spotlight on those recurring bills and on how close ties to a cloud provider can reshape unit economics.
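To see why inference becomes the dominant line item, a minimal back-of-envelope sketch helps. Every number below is an illustrative assumption, not a figure from the leaked documents: the per-token rate, request volume, and token counts are placeholders chosen to show how spend scales linearly with usage.

```python
# Back-of-envelope model of recurring inference spend.
# All figures are illustrative assumptions, not numbers from the leaks.

def monthly_inference_cost(requests_per_day: float,
                           tokens_per_request: float,
                           dollars_per_million_tokens: float) -> float:
    """Estimate monthly inference spend from usage and a per-token rate."""
    tokens_per_month = requests_per_day * 30 * tokens_per_request
    return tokens_per_month / 1_000_000 * dollars_per_million_tokens

# Hypothetical scale: 1 billion requests/day, ~1,000 tokens each,
# at an assumed blended rate of $2 per million tokens.
cost = monthly_inference_cost(1e9, 1_000, 2.0)
print(f"Estimated monthly inference spend: ${cost:,.0f}")  # $60,000,000
```

At those assumed rates the bill lands around $60 million a month, or over $700 million a year, and unlike a training run it never stops accruing while the service is live.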
These findings matter for industry players, customers, and investors. If inference bills reach multi-billion-dollar levels, then gross margins on AI services may be thinner than subscription prices imply. Companies will need to optimize their models, rethink pricing, and scrutinize the cloud contracts that drive those bills.
For cloud providers like Microsoft, the revenue-share arrangement aligns incentives and secures recurring cloud revenue. For model makers, the deal reduces capital expenditure but increases dependency on a single provider.
This reporting aligns with broader trends in automation and cloud economics, where platform-scale players consolidate infrastructure to capture performance and pricing advantages. Rising attention to per-inference costs is prompting engineers to optimize models, switch to more efficient runtimes, and experiment with on-premise or hybrid architectures. Conversations around AI compute pricing, cloud billing for AI workloads, and cost optimization are accelerating across the sector.
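The on-premise question usually comes down to a break-even calculation. The sketch below shows the shape of that comparison; the cloud spend, hardware capex, and operating costs are hypothetical placeholders, not sourced from the reporting.

```python
# Illustrative break-even comparison: cloud inference vs. self-hosted hardware.
# Every number here is a hypothetical placeholder, not sourced from the leaks.

def months_to_break_even(cloud_cost_per_month: float,
                         hardware_capex: float,
                         onprem_opex_per_month: float) -> float:
    """Months until cumulative cloud spend exceeds capex plus on-prem opex."""
    monthly_saving = cloud_cost_per_month - onprem_opex_per_month
    if monthly_saving <= 0:
        return float("inf")  # cloud is cheaper at this usage level
    return hardware_capex / monthly_saving

# Assumed: $500k/month on cloud inference, $6M of GPU servers,
# $150k/month for power, space, and operations.
print(f"Break-even after ~{months_to_break_even(500_000, 6_000_000, 150_000):.1f} months")
# -> Break-even after ~17.1 months
```

The design point is that break-even arrives faster as cloud bills grow, which is why multi-billion-dollar inference spend makes hybrid architectures worth a look even for cloud-committed companies.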
How much did OpenAI spend on Azure inference?
Public summaries of the leaks estimate OpenAI's inference costs in the billions for 2024 to 2025, with some reports citing roughly $12.4 billion for that period.
What is inference cost?
Inference cost is the compute consumed each time a deployed model responds to a user request. It is a recurring operational expense that scales with usage.
How does this affect pricing for customers?
If inference bills are high, providers may adopt tiered pricing, pay-as-you-go models, or rate limits to manage costs and preserve margins.
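A brief sketch of how tiered, pay-as-you-go billing works in practice: usage is charged at marginal rates that fall as volume rises. The tier boundaries and rates below are invented for illustration and do not reflect any provider's actual price list.

```python
# Sketch of tiered, pay-as-you-go billing such as a provider might adopt.
# Tier boundaries and rates are invented for illustration.

TIERS = [  # (tokens up to this cap, $ per million tokens)
    (10_000_000, 3.00),    # first 10M tokens
    (100_000_000, 2.00),   # next 90M tokens
    (float("inf"), 1.25),  # everything beyond 100M tokens
]

def bill(tokens_used: float) -> float:
    """Compute a monthly bill by walking the marginal-rate tiers."""
    total, prev_cap = 0.0, 0.0
    for cap, rate in TIERS:
        in_tier = min(tokens_used, cap) - prev_cap
        if in_tier <= 0:
            break
        total += in_tier / 1_000_000 * rate
        prev_cap = cap
    return total

print(f"250M tokens -> ${bill(250_000_000):,.2f}")  # 250M tokens -> $397.50
```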
The leaks do more than reveal dollar amounts. They force a reckoning with the business model behind modern AI: models create demand, but the compute required to serve that demand creates substantial recurring costs. For businesses building or buying AI solutions, the takeaway is practical. Evaluate model performance alongside the infrastructure arrangements and cloud billing terms that shape long-term margins. Watch OpenAI and Microsoft for any public clarifications and for competitive moves that could reshape AI compute pricing and cloud cost management.