
They Build Connectors. Google Calls the OS. Why Model Quality Won't Close the AI Platform Gap

Last month, Google released an update. You can now long-press the power button on your phone, say something to Gemini—“Check that course email in my Gmail and add the listed books to my shopping cart”—and watch it open the email, extract the book list, find the products, and add them one by one.

This article isn’t about that feature itself. I want to make a deeper comparison: why OpenAI can’t do this, Anthropic can’t do this—only Google can. And the answer has little to do with model quality.

Three Ways to Let AI Operate Another App

Every AI system that wants to work across apps faces the same problem: how does the AI reach data inside another application?

There are three approaches. Going down the list, the results get worse and the costs get higher.

Approach One: OS-level direct access. Gemini is part of Android, just like the notification system or the camera. It can read and write your Gmail without requiring separate authorization. It can check your Calendar without a separate login. These permissions were baked in when the system was designed.

Approach Two: Build a separate Connector for each service. This is how OpenAI and Anthropic do it. One for Slack, one for GitHub, one for Google Drive. Each Connector is its own engineering pipeline: register a developer account, set up OAuth, manage token expiration, handle rate limits. Every new service means hundreds of lines of code and its own authentication stack.
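
To make the per-connector cost concrete, here is a minimal sketch of what one connector’s plumbing looks like. The class and token shapes are hypothetical (only the Slack chat.postMessage endpoint is real), and actual connector frameworks differ in detail:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import java.time.Instant

// Hypothetical token record; every connector keeps its own copy of this state.
data class OAuthToken(val accessToken: String, val refreshToken: String, val expiresAt: Instant)

// One connector = one developer-account registration + one OAuth flow + one rate-limit policy.
class SlackConnector(private var token: OAuthToken) {
    private val http: HttpClient = HttpClient.newHttpClient()

    private fun refreshIfExpired() {
        if (Instant.now().isAfter(token.expiresAt)) {
            // Each provider has its own refresh endpoint, parameters, and failure modes.
            token = refreshWithProvider(token.refreshToken)
        }
    }

    fun postMessage(channel: String, text: String): Int {
        refreshIfExpired()
        val request = HttpRequest.newBuilder(URI.create("https://slack.com/api/chat.postMessage"))
            .header("Authorization", "Bearer ${token.accessToken}")
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("""{"channel":"$channel","text":"$text"}"""))
            .build()
        val response = http.send(request, HttpResponse.BodyHandlers.ofString())
        // A 429 status means rate-limited: each connector also needs its own backoff policy.
        return response.statusCode()
    }

    private fun refreshWithProvider(refreshToken: String): OAuthToken {
        // Stubbed: in a real connector this is another provider-specific HTTP round trip,
        // with its own client secret, scopes, and error handling.
        return token.copy(expiresAt = Instant.now().plusSeconds(3600))
    }
}
```

And this is one service. Gmail, Calendar, GitHub, and Drive each repeat the whole stack with different endpoints, scopes, and failure modes.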

Approach Three: Simulate human GUI interaction. When a service offers neither system-level access nor an API, the only option left is to take a screenshot, identify where the buttons are, simulate clicks, then screenshot again to confirm the click registered. Every step is a screenshot-plus-inference cycle: it is slow, a layout change breaks the remembered positions, and every extra step burns another round of tokens.
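
The loop in Approach Three looks roughly like this. Screen and VisionModel are hypothetical stand-ins for a screen-capture API and a vision model; no real library is implied:

```kotlin
// Hypothetical stand-ins for a screen-capture API and a vision model.
interface Screen {
    fun capture(): ByteArray
    fun tap(x: Int, y: Int)
}

interface VisionModel {
    // Returns the (x, y) position of a described UI element, or null if not found.
    fun locate(screenshot: ByteArray, target: String): Pair<Int, Int>?
}

// Each step pays for a screenshot, a model inference, and a verification pass.
fun clickThrough(screen: Screen, model: VisionModel, steps: List<String>): Boolean {
    for (target in steps) {
        val before = screen.capture()                           // screenshot
        val pos = model.locate(before, target) ?: return false  // inference (tokens)
        screen.tap(pos.first, pos.second)                       // simulated click
        val after = screen.capture()                            // screenshot again to verify
        // Naive check: if the tapped element is still on screen, assume the click failed.
        if (model.locate(after, target) != null) return false
    }
    return true
}
```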

The efficiency gaps between these approaches are large. Approach Three is an order of magnitude slower than Approach Two. Approach Two is an order of magnitude slower than Approach One. And Approach One’s integration cost is close to zero—no need to negotiate permissions service by service.

Google can use Approach One. OpenAI and Anthropic are stuck with Approaches Two and Three.

The Real Cost of Building Connectors

OpenAI Codex and Claude Code have rich Connector ecosystems. But each Connector is independent. If the AI wants to check Gmail, read Calendar, and send a Slack message in a single task, it has to call three separate Connectors, each verifying that its own login hasn’t expired.
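
Concretely, a single user request fans out into independent sessions, something like this sketch (all names hypothetical):

```kotlin
// Hypothetical minimal connector contract.
interface Connector {
    fun ensureAuthenticated()    // every connector re-validates its own token
    fun execute(action: String)  // every call crosses a separate OAuth boundary
}

// One user request, N services, N separate auth checks.
class AgentSession(private val connectors: Map<String, Connector>) {
    fun run(task: List<Pair<String, String>>) {  // (service, action) pairs
        for ((service, action) in task) {
            val connector = connectors.getValue(service)
            connector.ensureAuthenticated()
            connector.execute(action)
        }
    }
}

// Usage: session.run(listOf("gmail" to "find course email",
//                           "calendar" to "check today", "slack" to "post summary"))
```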

This isn’t an implementation problem. Any AI not running inside the operating system has to knock on each door, one service at a time.

And knocking requires two preconditions, neither of which the knocker controls.

First precondition: there has to be a lock. Many services don’t expose APIs at all. Concur won’t let you write expense reports programmatically, so you can’t auto-submit reimbursements. Bloomberg has an API, but it starts at twenty-five thousand dollars a year. Your remaining option is reverse engineering, which carries legal risk and breaks every time the service updates.

Second precondition: someone has to make the key. Most established companies won’t build a Connector just for AI agents. Microsoft’s own Copilot connector took a long time to ship, and it doesn’t even work for personal accounts—enterprise only.

How many services your AI can operate doesn’t depend on how smart your AI is. It depends on two conditions you don’t control: whether they’ve left an interface, and whether anyone has done the integration work.

At this layer, Google is in roughly the same position as you. It also has to build Connectors and handle token expiration. Google’s advantage here is simply scale—service providers are more willing to build Connectors for it proactively.

But Approach One is completely different.

Integration Is Unavoidable. Who Does It Changes Everything.

Approach One isn’t free either. For Gemini to place an order on DoorDash, DoorDash still has to implement the corresponding interface in its app. Google calls this interface AppFunctions. Developers mark their exposed functions with a few annotations, the system registers them automatically, and Gemini calls them when needed.
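
Based on Google’s published androidx.appfunctions library, exposing a function looks roughly like the sketch below. Exact annotation and type names may vary by release, and CartItem and addToCart are invented here for illustration:

```kotlin
import androidx.appfunctions.AppFunction
import androidx.appfunctions.AppFunctionContext
import androidx.appfunctions.AppFunctionSerializable

// A parameter type annotated so the platform can marshal it across the system boundary.
@AppFunctionSerializable
class CartItem(val productId: String, val quantity: Int)

class ShoppingFunctions {
    // The annotation is the integration point: build tooling generates the schema,
    // the system registers the function, and the assistant invokes it on the user's behalf.
    @AppFunction
    suspend fun addToCart(
        appFunctionContext: AppFunctionContext,
        item: CartItem,
    ): Boolean {
        // App-internal logic: the same code path the app's own UI would use.
        return true
    }
}
```

Compare this with the connector sketch earlier: no developer-account registration, no OAuth, no token refresh. The platform handles identity and invocation.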

Integration is unavoidable for everyone. Whether you’re Google or OpenAI, if you want AI to operate Uber, Uber has to provide an entry point.

The difference is who can make developers do that work.

Google can send an email to Android developers: “The next Android release includes a platform feature called Gemini Intelligence. Implement this interface and your app gets called directly by AI. Don’t implement it, and your competitor will.” Developers have strong incentives to comply—not because Google’s documentation is excellent, but because Google sets the admission standards for this platform.

OpenAI sends the same email and gets a completely different result. “Please integrate with our Connector protocol.” The developer asks: “Why? My users aren’t going to download my app any more often because of this.”

A platform owner’s requirement and a third-party AI company’s request differ in persuasive power by orders of magnitude.

The Sandbox Protects Users—and Google

This position comes with a side benefit.

Android’s security model prohibits ordinary apps from reading other apps’ screen content or intercepting other apps’ operations. These restrictions have entirely legitimate justifications—preventing malware from stealing photos, blocking trojans from reading banking SMS.

But the same rules don’t apply to Google itself. Gemini is a system component, outside the scope of those restrictions. ChatGPT is an ordinary app downloaded from the Play Store, subject to every limitation.

This is why the EU’s DMA ruling, reported by Android Headlines in April, matters so much. The EU is demanding that Google grant third-party AIs—ChatGPT, Claude, and others—the same system-level permissions as Gemini, including custom wake words and app control capabilities. If the ruling passes, the long-press power button entry point might have to open to ChatGPT.

As of now, the DMA ruling is still pending.

The Model Arms Race Isn’t the Main Battlefield

In my analysis of Manus earlier this year, I categorized AI product advantages into three compounding dimensions: tools, data, and intelligence. I argued that tool compounding is the easiest to replicate and doesn’t constitute a moat.

That judgment holds at the application layer. Writing one more API connector isn’t expensive.

But at the OS layer, the barrier to adding one more tool isn’t engineering; it’s ownership. If you don’t own Android, you will never get the interfaces that need no Connector and no separate authorization, the ones wired in at the system level by default. It’s a switch, not a dial: either you have it or you don’t.

So in AI OS competition, model quality may not be decisive. OpenAI can lead on benchmarks; Google can catch up with Gemini 3.1. But compare an AI that can use fifteen system-level tools with one limited to a browser and a Python environment: even if the latter’s reasoning is a tier stronger, the gap in delivered experience is hard to close.

Google initially opened this to food delivery, rideshare, and grocery—not because the technology can only handle those. The Gmail-to-cart demo at the beginning proves it can do far more. Those three categories were chosen because they face the least ecosystem resistance—Uber and DoorDash already need external traffic, so being called by Gemini doesn’t threaten their core business. Google is testing the boundaries step by step.

For thirty years, the operating system’s job was managing hardware. The OS of the future has one more thing to manage: what the user wants to do next, and which apps can help. What’s being scheduled has changed, but the source of power hasn’t—whoever controls the bottom layer sets the rules.

Humane’s AI Pin and Rabbit’s R1 saw this direction too. They didn’t fail because their AI wasn’t good enough—the AI Pin used GPT-4. What they didn’t have was an operating system already running on billions of devices, with millions of apps, holding every low-level permission.

Let’s return to the power button. Google knows users are unhappy—1,665 people on Pixel’s official forum want their power button back. But Google persists. Because the power button is the first thing you touch when you pick up your phone. Before the screen lights up, before any app icon appears. Whoever occupies this position knows first what the user wants to do.

Google has decided the criticism is worth it.