AI agents can think and code. But they still can't reliably work together -- and that gap is where billions of dollars will be won or lost.
Imagine you're building a city out of Lego. You've got great individual buildings -- a hospital, a school, a fire station. But you have no roads, no traffic lights, and no one directing which truck goes where. Each building works fine on its own, but the city falls apart because nothing connects them reliably.
That's what's happening with AI agents right now. We've gotten really good at making individual agents that can write code, send emails, search the web. But when you need many agents working together -- one agent hands off to another, they share information, they don't step on each other's toes -- there's almost nothing built to manage that. Teams are gluing it together by hand, and it's breaking.
Nate B Jones maps out six layers that agents need, from basic things like "a safe place to run code" up to "how do 50 agents coordinate on a real project." The bottom layers exist. The top layer -- orchestration -- is almost completely missing. And that's where the next giant company will be built.
Agents need a safe, isolated place to run code. E2B ($32M funding) uses Firecracker microVMs -- the same tech behind AWS Lambda -- giving each agent its own disposable kernel. Daytona ($24M Series A) bets on Docker containers with a shared kernel, claiming 90ms cold starts and persistent state. Modal targets GPU-heavy workloads. Browserbase ($300M+ valuation) gives agents headless browser access to interact with the web like a human. The key philosophical split: ephemeral vs. persistent sandboxes -- a bet on how long agent sessions will run.
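The ephemeral-vs-persistent split can be made concrete with a toy Python sketch. This is process-level isolation only -- a stand-in for the real thing, since actual sandboxes like Firecracker isolate the kernel itself -- but it shows why the distinction matters: in an ephemeral sandbox, state written in one run is gone by the next; in a persistent one, it survives.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_ephemeral(code: str) -> str:
    """E2B-style bet, in miniature: a throwaway environment per execution.

    The working directory exists only for one run and is destroyed after.
    """
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            [sys.executable, "-c", code],
            cwd=workdir, capture_output=True, text=True, timeout=10,
        )
        return result.stdout.strip()


class PersistentSandbox:
    """Daytona-style bet, in miniature: state survives between runs."""

    def __init__(self) -> None:
        self.workdir = Path(tempfile.mkdtemp())

    def run(self, code: str) -> str:
        result = subprocess.run(
            [sys.executable, "-c", code],
            cwd=self.workdir, capture_output=True, text=True, timeout=10,
        )
        return result.stdout.strip()


# Ephemeral: a file written in one run is gone by the next.
print(run_ephemeral("open('state.txt', 'w').write('x'); print('wrote')"))
print(run_ephemeral("import os; print(os.path.exists('state.txt'))"))  # False

# Persistent: the same file survives across runs.
box = PersistentSandbox()
box.run("open('state.txt', 'w').write('x')")
print(box.run("import os; print(os.path.exists('state.txt'))"))  # True
```

Which model wins is the bet each vendor is making: ephemeral favors short, disposable agent tasks; persistent favors long-running sessions that accumulate working state.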
Agent Mail ($6M seed, backed by General Catalyst, Paul Graham, and HubSpot CTO Dharmesh Shah) lets you programmatically create real email inboxes for agents. The thesis: email is a universal identity key because every SaaS service requires one. But Jones argues email is a shim, not an architecture. Threading is brittle, rate limits fight automation, and the signal-to-noise ratio murders context windows. The real need is agent-native identity -- on-chain options, A2A protocols, and MCP-based discovery are all competing. No clear winner yet.
Mem0 (14M downloads, chosen by AWS as exclusive memory provider for its agent SDK) treats memory as active curation, not conversation logging. It stores what matters, forgets what's outdated, and recalls only relevant context at inference. Architecture: graph DB + vector DB + key-value store. It beats OpenAI's built-in memory by 26% on the LoCoMo benchmark. The risk: every frontier lab (OpenAI, Anthropic) is building memory directly into models. If it becomes a model-level feature, standalone memory companies face platform risk. Counter-thesis: portability -- no one should own your memory.
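"Active curation, not conversation logging" is easier to see in code. Here is a deliberately tiny sketch of the idea -- nothing like Mem0's actual graph + vector + key-value architecture, just keyword overlap -- showing the two behaviors that matter: newer facts supersede stale ones, and recall returns only what's relevant to the query rather than the whole history.

```python
import re
from dataclasses import dataclass


@dataclass
class Fact:
    key: str      # what the fact is about, e.g. "user.editor"
    value: str
    version: int


def _words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


class AgentMemory:
    """Toy memory layer: store what matters, supersede what's outdated,
    recall only what's relevant at inference time."""

    def __init__(self) -> None:
        self.facts: dict[str, Fact] = {}
        self.clock = 0

    def remember(self, key: str, value: str) -> None:
        # Newer facts about the same key replace older ones ("forgetting").
        self.clock += 1
        self.facts[key] = Fact(key, value, self.clock)

    def recall(self, query: str, k: int = 2) -> list[str]:
        # Rank facts by naive word overlap with the query; return top-k.
        qwords = _words(query)
        scored = [
            (len(qwords & _words(f"{f.key} {f.value}")), f)
            for f in self.facts.values()
        ]
        scored = [(s, f) for s, f in scored if s > 0]
        scored.sort(key=lambda sf: (-sf[0], -sf[1].version))
        return [f.value for _, f in scored[:k]]


mem = AgentMemory()
mem.remember("user.editor", "prefers vim")
mem.remember("user.timezone", "works in UTC+2")
mem.remember("user.editor", "prefers vscode")  # supersedes the vim fact

print(mem.recall("which editor does the user prefer?"))
```

The point of the sketch: the vim preference never reaches the context window, because memory here is a curated state store, not an append-only log.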
Composio ($29M from Lightspeed) provides managed auth, 200+ pre-built connectors, and observability for every tool call. Without it, every agent builder independently manages credentials, OAuth, rate limits, and API changes for every tool -- an N*M combinatorial nightmare. Long-term risk: if MCP becomes truly universal, managed integration loses value. But enterprises move slowly, and that gap is where the entire thesis sits.
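The N*M point is worth making concrete: N agents each wiring up M tools means N*M bespoke integrations, while a shared hub reduces it to N + M -- each tool registers one connector, and every agent calls through the same interface. A hypothetical sketch (not Composio's actual API):

```python
from typing import Callable


class IntegrationHub:
    """One connector per tool; every agent calls through the hub."""

    def __init__(self) -> None:
        self.connectors: dict[str, Callable[..., str]] = {}
        self.audit_log: list[str] = []

    def register(self, tool: str, connector: Callable[..., str]) -> None:
        # The connector owns auth, rate limits, and API quirks for its tool.
        self.connectors[tool] = connector

    def call(self, agent: str, tool: str, **kwargs) -> str:
        if tool not in self.connectors:
            raise KeyError(f"no connector registered for {tool!r}")
        # Observability for every tool call comes for free at the hub.
        self.audit_log.append(f"{agent} -> {tool}({kwargs})")
        return self.connectors[tool](**kwargs)


hub = IntegrationHub()
hub.register("email", lambda to, body: f"sent to {to}")
hub.register("crm", lambda contact: f"looked up {contact}")

# Any agent can use any registered tool without managing credentials itself.
print(hub.call("agent-1", "email", to="a@example.com", body="hi"))
print(hub.call("agent-2", "crm", contact="Acme"))
print(len(hub.audit_log))  # 2
```

This is also why universal MCP adoption is the long-term threat: if every tool ships its own standard connector, the hub's M-side work largely evaporates.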
Stripe Projects (launched April 2026) is the first credible trust layer for agent-to-service transactions. Agents use CLI commands to provision databases (ready in ~350ms), upgrade hosting, and pay for services -- with tokenized payment credentials that never expose raw card details. Coming next: agent-to-agent payments, metered billing, dynamic budget allocation, and FinOps observability for agent spend.
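The core trick -- tokenized credentials that never expose raw card details -- is a pattern worth sketching. This is a hypothetical design, not Stripe's actual API: the agent holds a single-use, budget-capped token, and only the trust layer ever sees the card.

```python
import secrets


class TrustLayer:
    """Issues budget-capped tokens so agents can spend without holding
    raw payment credentials."""

    def __init__(self, card_number: str) -> None:
        self._card = card_number              # never exposed to agents
        self._tokens: dict[str, float] = {}   # token -> remaining budget

    def issue_token(self, budget: float) -> str:
        token = secrets.token_hex(8)
        self._tokens[token] = budget
        return token

    def charge(self, token: str, amount: float) -> bool:
        # Decline unknown tokens and anything over the remaining budget.
        if self._tokens.get(token, 0.0) < amount:
            return False
        self._tokens[token] -= amount
        return True


layer = TrustLayer("4242-4242-4242-4242")
token = layer.issue_token(budget=10.00)

print(layer.charge(token, 4.00))    # True  -- within budget
print(layer.charge(token, 7.00))    # False -- would exceed remaining 6.00
print(layer.charge("forged", 1.00)) # False -- unknown token
```

Metered billing and dynamic budget allocation are the natural extensions: instead of a fixed cap per token, the trust layer would adjust budgets as agents earn trust or burn through spend.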
This is the biggest opportunity and the biggest problem. Gartner reported a 1,445% surge in multi-agent orchestration inquiries (Q1 2024 to Q2 2025). Current tooling like LangChain operates at the framework level -- fine for spinning up 3 agents in a notebook, useless for running 50 agents across enterprise systems with failure recovery, cost controls, and audit logging.
What's missing and needs to exist is a dedicated orchestration layer: scheduling and hand-offs across agents, failure recovery, cost controls, and audit logging at fleet scale.
Jones draws a direct analogy to Kubernetes: not the compute itself, but the scheduling, scaling, and lifecycle management that made compute usable at enterprise scale. Whoever solves this owns the most valuable position in the agent stack.
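The Kubernetes analogy can be shrunk to a few dozen lines. This is an illustrative sketch, not a product: a control loop that schedules tasks across agents with retry-on-failure, a spend cap, and an audit trail -- the three things framework-level tooling doesn't give you at fleet scale.

```python
from collections import deque
from typing import Callable


class Orchestrator:
    """Minimal control loop: run tasks with retries, a budget cap,
    and an audit log."""

    def __init__(self, budget: float, max_retries: int = 2) -> None:
        self.budget = budget
        self.max_retries = max_retries
        self.audit: list[str] = []

    def run(self, tasks: list[tuple[str, Callable[[], str], float]]) -> list[str]:
        """Each task is (name, work function, cost per attempt)."""
        queue = deque((name, fn, cost, 0) for name, fn, cost in tasks)
        results = []
        while queue:
            name, fn, cost, attempts = queue.popleft()
            if cost > self.budget:
                self.audit.append(f"{name}: skipped (budget exhausted)")
                continue
            self.budget -= cost                # cost control
            try:
                results.append(fn())
                self.audit.append(f"{name}: ok (attempt {attempts + 1})")
            except Exception as exc:
                self.audit.append(f"{name}: failed ({exc})")
                if attempts < self.max_retries:
                    queue.append((name, fn, cost, attempts + 1))  # retry
        return results


# A task that fails once, then succeeds -- the retry path in action.
flaky_calls = iter([RuntimeError("timeout"), "report done"])
def flaky() -> str:
    item = next(flaky_calls)
    if isinstance(item, Exception):
        raise item
    return item


orch = Orchestrator(budget=1.00)
print(orch.run([("summarize", lambda: "summary done", 0.10),
                ("report", flaky, 0.25)]))
print(orch.audit)
```

Kubernetes didn't invent containers; it made them manageable at scale. The bet is that the same holds here: the winner won't run the agents, it will schedule, recover, meter, and audit them.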