In March 2026, OpenAI officially shut down Sora, both the consumer app and the developer API. As a social video app, Sora’s failure was hardly surprising. Day-30 user retention was below 5%, App Store downloads plummeted from 3.3 million in November 2025 to 1.1 million in February 2026, and its ranking slid from #1 to #172. Disney’s seemingly massive $1 billion partnership was also cancelled, with the BBC reporting that no funds had actually changed hands between the two parties. This is a textbook consumer product failure.
But the real signal is buried in a different decision: OpenAI killed the API too.
An API can continue generating value through the developer ecosystem even after the consumer app behind it fails. OpenAI itself operates exactly this way in text and code — ChatGPT is the traffic funnel, the API is the revenue engine, and the two can have completely independent lifecycles. The video API took a different path, and the reasons behind that are far more worth unpacking than the app’s failure itself. They point to a set of deep tensions between compute economics, market commoditization, and GPU resource allocation. For developers currently using or planning to use AI video APIs, and for anyone evaluating the investment thesis of this space, this case carries far more information density than a product shutdown headline.
Start with unit economics. Cantor Fitzgerald analyst Deepak Mathivanan estimated that generating a single 10-second Sora video costs roughly $1.30 in compute, requiring about 40 minutes of GPU time (or 4 GPUs in parallel for 8–10 minutes). SemiAnalysis’s AJ Kourabi confirmed this estimate (Forbes).
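As a sanity check, the GPU rental rate implied by those figures is easy to back out. This is a rough sketch; the $1.30 and 40-minute numbers are the analyst estimates cited above, nothing more.

```python
# Back out the GPU-hour rate implied by the Cantor Fitzgerald estimate:
# $1.30 of compute spread over ~40 GPU-minutes per 10-second video.
cost_per_video = 1.30   # USD, estimated compute cost per 10s clip
gpu_minutes = 40        # total GPU-minutes per clip (e.g. 4 GPUs x ~10 min)

implied_gpu_hour_rate = cost_per_video / (gpu_minutes / 60)
print(f"implied rate: ${implied_gpu_hour_rate:.2f}/GPU-hour")  # ≈ $1.95/GPU-hour
```

That roughly $2/GPU-hour figure is in the plausible range for rented data-center GPUs, which is consistent with the estimate rather than proof of it.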
Why is video inference so expensive? Text generation’s core operation is token-by-token autoregressive sampling, where each step involves relatively modest computation. Video generation requires multi-step diffusion sampling over high-dimensional spatiotemporal latents, with each step involving a full U-Net or DiT forward pass, and resolution and frame count multiply directly into cost. The inference compute for a single 10-second 720p video is roughly 10x or more that of a GPT-4-class text request.
Sora 2’s API was priced at $1/10s video on the standard tier and $3/10s on Pro. The standard tier lost over $0.30 per video generated, before accounting for storage, bandwidth, and safety review overhead. The consumer side was even more extreme: at peak, Sora was generating roughly 11.3 million videos per day. At $1.30 per video, that’s a daily burn rate of $15 million, annualizing to $5.4 billion (Remio). Meanwhile, Sora’s cumulative revenue as a standalone app was just $2.1 million, putting the annualized-cost-to-revenue ratio at roughly 2500:1.
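The burn-rate arithmetic above can be reproduced in a few lines, using only the figures already quoted:

```python
# Peak consumer burn rate vs. cumulative revenue, figures as cited above.
videos_per_day = 11.3e6          # peak daily generations
cost_per_video = 1.30            # USD, estimated compute cost per video

daily_burn = videos_per_day * cost_per_video   # ~ $14.7M/day
annualized_burn = daily_burn * 365             # ~ $5.36B/year

cumulative_revenue = 2.1e6                     # Sora's standalone revenue
ratio = annualized_burn / cumulative_revenue   # ~ 2550:1
print(f"daily burn ${daily_burn/1e6:.1f}M, ratio {ratio:,.0f}:1")
```

Note the ratio compares an annualized run rate against cumulative revenue, so it is a headline illustration, not a matched-period accounting figure.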
These numbers are already fatal, but unit economics alone don’t explain why the API was shut down. The entire industry’s unit economics are similarly grim: Runway’s 2024 revenue was $44 million against an EBITDA loss of $155 million — losses running 3.5x revenue (TechCrunch). Global VCs poured $4.7 billion into AI video startups in 2025, while the entire industry’s actual software revenue was under $1 billion (Crunchbase). Everyone is losing money. The question becomes: why did OpenAI choose to exit while others chose to stay?
GPU opportunity cost is the fundamental constraint. GPUs are OpenAI’s scarcest resource. The same GPUs running Codex generate annualized revenue exceeding $10 billion; running the Sora API, they generated $2.1 million total. The core calculation here: every GPU-hour allocated to video generation directly cannibalizes resources from OpenAI’s text/code core business. Against the backdrop of OpenAI’s 2025 actual revenue of $13.1 billion and projected 2026 losses of $14 billion (CNBC), pulling GPUs away from product lines generating tens of billions in annual revenue to sustain an API with negligible revenue produces real opportunity cost every single day.
IPO discipline provided the execution window. OpenAI is pushing toward an IPO, with CFO Sarah Friar defining 2026 as the year of practical adoption, focused on health, science, and enterprise. Apps head Fidji Simo was more blunt at an all-hands: “We cannot miss this moment because we are distracted by side quests” (WIRED). For a money-losing consumer video product with collapsing retention and copyright litigation risk, shutting down the app is stanching the bleeding; shutting down the API signals focus to investors.
The internal value of world model technology far exceeds external API revenue. The Sora team wasn’t disbanded — leader Bill Peebles stayed on, with the direction pivoting to world simulation research (VentureBeat). Sam Altman publicly confirmed the robotics pivot. A January 2026 arXiv paper detailed the technical pathway from video generation models to robotics world models (arXiv). A rough conversion: the compute consumed by 1 minute of HD video generation is approximately equivalent to 1,000 hours of robotics simulation. If this technology’s endgame is modeling the physical world, fighting a price war with competitors at $1 per clip becomes a misallocation of resources.
The competitive landscape eliminated the case for staying. By the time Sora shut down, it had been surpassed on Artificial Analysis’s quality rankings by Kling 3.0, and Runway Gen-4.5 scored 41 Elo points above Sora. The quality gap was narrowing or even reversing, while the cost disadvantage was widening — Kling’s API pricing was 40–70% cheaper than Sora’s. Continuing to invest in the API meant fighting a price war on a track where OpenAI had no cost advantage, against competitors with either government-subsidized compute (Kling) or custom silicon cost advantages (Google).
These four forces don’t carry equal weight. Sora’s market performance was the trigger, GPU opportunity cost was the decisive constraint, the IPO provided the execution window, and world model internalization ensured a recovery path for the technical assets. A simpler explanation might suffice: Sora failed as a product, API demand never materialized, and OpenAI chose to cut losses. But if it were purely a product failure, OpenAI could have kept the API (which costs far less than the consumer app), scaled down investment, and waited for the market to mature. The choice to shut down the API entirely is better read as OpenAI concluding that video generation as a standalone API category has limited long-term value — especially when world model technology has higher-value internal applications. Supporting evidence: OpenAI is already developing a next-generation video model codenamed Spud, built from scratch on a new architecture (MindStudio). Shutting down the API avoids accumulating technical debt and customer commitments on an architecture that’s about to be superseded internally.
While OpenAI exited, Google, Kuaishou, and Runway stayed. These opposing decisions reflect fundamentally different cost functions.
Google’s Veo runs on TPUs, with inference costs roughly 80% lower than Nvidia GPUs (CNBC). Midjourney saw a 65% inference cost reduction after migrating from GPUs to TPUs, suggesting Google’s per-unit video generation cost could be as low as 1/3 to 1/5 of OpenAI’s. Veo’s Fast mode is priced as low as $0.15/second, and Google can still maintain positive unit economics at that level. Add YouTube’s 20 billion videos of training data, Google Cloud’s $15.15 billion in quarterly revenue (up 34% YoY), and a total annual revenue base exceeding $350 billion — Veo is a platform feature for Google, and its profit-or-loss outcome doesn’t change the strategic logic. Veo commands 96.4% model share on Google’s platform, a monopoly position Sora never came close to.
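The 1/3-to-1/5 range follows directly from the two data points, as a quick sketch shows. Both percentages are from the reporting cited above; the GPU baseline is normalized to 1.

```python
# Two independent estimates of TPU cost relative to an Nvidia-GPU baseline.
gpu_baseline = 1.0
veo_claim = gpu_baseline * (1 - 0.80)            # "roughly 80% lower" -> 0.20x, ~1/5
midjourney_observed = gpu_baseline * (1 - 0.65)  # 65% reduction -> 0.35x, ~1/3

print(f"TPU cost range: {veo_claim:.2f}x to {midjourney_observed:.2f}x of GPU")
```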
Kuaishou’s Kling took a different path. Kling reached $240 million in ARR by December 2025 with over 60 million creators (Yahoo Finance) — two orders of magnitude above Sora’s $2.1 million in cumulative revenue. On the cost side, Kuaishou benefits from government-subsidized compute and Huawei Ascend chips (~$6,900/chip vs. $25,000–30,000 for H100). ByteDance’s Seedance is even more aggressive, with API pricing as low as $0.0247/second, backed by ByteDance’s 4 billion yuan chip contract with Huawei. For these companies, short-form video generation is a natural extension of their product lines, with almost no resource competition against text/code AI platforms.
Runway’s situation is the most fragile and the most straightforward: it’s an all-in video generation pure play, and exiting means the company ceases to exist. Runway’s annualized 2025 revenue was approximately $300 million, and it raised $315 million at a $5.3 billion valuation in February 2026. Its bet is that declining hardware costs and improving model efficiency will make unit economics sustainable within 2–3 years. Gen-4.5 leading Sora by 41 Elo on quality rankings demonstrates that a company solely focused on one direction can indeed outperform a distracted giant on product quality.
The contrast across these three survival strategies is clear: Google wins on cost advantage (custom silicon), Kling wins on business synergy (video is core), and Runway survives on VC funding and focus. OpenAI had none of the three — renting Nvidia GPUs at the highest cost, video as a peripheral business, and GPU resources competing against text/code core products.
Was shutting down the API the right call? The answer depends on two variables.
The first is the ultimate scale and margin structure of the video generation market. Current narrow estimates put it at roughly $3.4 billion by 2033 (Fortune Business Insights). Even doubling that to $7 billion, it’s insufficient to become a core business for an OpenAI that already generated $13.1 billion in 2025 revenue. But if penetration into film production, advertising, and gaming pushes the market past $50 billion, then an early exit could prove to be a costly misjudgment.
The second is how quickly world model technology delivers in robotics. If OpenAI’s robotics business generates meaningful revenue between 2028 and 2030, then redirecting video generation compute and research resources toward robotics was the right call. Conversely, if robotics commercialization takes longer than expected — which is the historical norm in robotics — then OpenAI chose a category with zero revenue over one that was at least generating some.
My read is that this was the right decision. Video generation is commoditizing rapidly, with multiple models (Kling, Veo, Runway Gen-4.5, Seedance) matching or exceeding Sora on quality at lower cost. In a commoditizing market, the winners are the lowest-cost player (Google) or the player with the strongest core business synergy (Kuaishou, ByteDance) — and OpenAI is neither. Spud’s existence shows OpenAI has preserved the option to re-enter: if the market proves large enough, it can come back on a better architecture at a more favorable point in time (lower GPU costs, more efficient models). The time value of the IPO window is also practical — completing an IPO in 2026, versus going public in 2028 still carrying a money-losing video business, could affect valuation by more than the total lifetime profit of the video generation market itself.
The risk worth flagging: if a competitor — most likely Google — establishes brand monopoly in video generation comparable to ChatGPT’s position in text, Spud’s re-entry will face entrenched user habits and developer ecosystems. Veo’s 96.4% model share on its own platform doesn’t yet equate to cross-platform brand monopoly, but if Google deeply integrates Veo into YouTube, Google Workspace, and other products, the resulting distribution moat would be very difficult for a latecomer to breach.
If your product depends on the Sora API, migration is mandatory. Target options are Veo (lowest cost, largest platform), Kling (best price-performance ratio, China market coverage), or Runway (quality-first, independent platform). Pay attention to differences in resolution, duration, and style control parameters across providers — these will directly impact product experience.
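When comparing providers, normalize everything to dollars per second first. The prices quoted in this piece convert as follows (a sketch; Kling’s price is given only as a discount range relative to Sora’s standard tier, so it appears as a range):

```python
# Normalize the API prices quoted in this piece to USD per second.
sora_standard = 1.00 / 10      # $1 per 10s clip -> $0.10/s (for reference)
veo_fast = 0.15                # Veo Fast mode, $/s
seedance = 0.0247              # ByteDance Seedance, $/s
# "40-70% cheaper than Sora" -> 30-60% of Sora's per-second price
kling_low, kling_high = sora_standard * 0.30, sora_standard * 0.60

for name, price in [("Sora std", sora_standard), ("Veo Fast", veo_fast),
                    ("Seedance", seedance)]:
    print(f"{name}: ${price:.4f}/s")
print(f"Kling: ${kling_low:.3f}-${kling_high:.3f}/s")
```

One caveat visible in the numbers: Veo’s Fast tier is priced above Sora’s old standard tier per second, so “lowest cost” refers to Google’s internal economics, not necessarily to the sticker price you will pay.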
If you’re building an AI video product, the competitive landscape is being redrawn. OpenAI’s exit removes a major competitor but also means this category loses its entry point on the largest AI platform. The developer ecosystem’s center of gravity will shift toward Google and Chinese vendors. For independent developers and small teams, compute cost should be weighted more heavily than model quality when choosing an API provider — quality gaps are closing fast, while cost gaps reflect chip-level and infrastructure-level advantages that won’t reverse in the near term.
If you’re evaluating AI investments, this case offers a clear sample on unit economics and compute allocation. Video inference costs roughly 10x that of text, while revenue is a fraction. For any company renting GPUs, video generation is nearly impossible to run profitably on the current cost curve. The variables worth watching: the rate of chip cost decline (especially pricing pressure from Google TPU and Huawei Ascend on Nvidia), the pace of model efficiency improvements (fewer diffusion steps, better distillation), and whether the market can break through the ceiling of current estimates.
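The cost sensitivity described above can be sketched as a one-line margin model. This is illustrative only: the 40-minute and $1.95/GPU-hour inputs are derived from the analyst estimates earlier in this piece, and overhead is ignored.

```python
def video_unit_margin(price_per_video, gpu_minutes, gpu_hour_rate):
    """Gross margin per video: price minus raw compute cost.

    Ignores storage, bandwidth, and safety-review overhead.
    """
    compute_cost = gpu_hour_rate * gpu_minutes / 60
    return price_per_video - compute_cost

# Sora's standard tier at the implied rate: loses ~$0.30 per video.
print(video_unit_margin(1.00, 40, 1.95))
# Halve the GPU rate (chip cost decline) and the same tier turns positive.
print(video_unit_margin(1.00, 40, 0.975))
```

This is why the watch-list items are chip cost and model efficiency: either a cheaper GPU-hour or fewer GPU-minutes per video moves the margin, and both are moving in the same direction.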
The core logic behind OpenAI shutting down the Sora API is that GPU opportunity cost was too high: maintaining an API in a commoditizing market where it had no cost advantage meant every GPU-hour consumed eroded the revenue potential of its core business. IPO discipline provided the execution window, and the world model–to–robotics technology pathway ensured partial recovery of sunk costs. The only real risk is timing: if robotics commercialization arrives later than video generation market maturation, OpenAI will find it redirected resources from a category that could have made money to one that temporarily cannot. Based on current evidence, the expected value of this bet is positive.