What we're building, and why

A new class of actor is at work in production now — autonomous AI agents. Most of them are running through tool-call wrappers, limited to a handful of functions your team hand-wrapped for them. Evershell runs them differently: a real computer for the agent to work on, and a network that decides what it can reach.

How we got here

We started Evershell because the same two wrong instincts kept winning in every engagement we touched.

While consulting for companies adopting AI agents, we saw two patterns everywhere. Ad-hoc agent runtimes spun up against production credentials, with no governance and no audit. Or, at the other extreme, agents wrapped in a tangle of bespoke MCP servers and tool definitions, reinventing protocols for things a shell and an HTTP client have done well for decades.

Both miss the point. The agent needs two things at once: a real computer to do the work, and an enforced boundary on what it can affect outside the sandbox. Neither alone is enough — and there's no standard runtime for that. So we started building one.

The thesis

The ceiling on autonomous AI isn't model intelligence. It's the runtime around the model.

Today's runtimes split into two failures. Tool-wrapping platforms — MCP, SDKs, function calling — lock the model into a handful of pre-defined actions. Capable models, narrow reach. Permissive containers go the other way: raw API keys, ad-hoc sandboxes, no enforcement at the boundary. Neither shape is one a team can actually trust with real work.

What's missing is a runtime that does both halves at once. Real capability inside, governed at the edge. The same shape that made cloud servers acceptable at enterprise scale — applied to a new class of actor.

What we believe

Give agents a real computer, not a custom protocol.

A tangle of MCP servers and bespoke tool definitions is the industry's current detour. Agents have judgment and a shell — let them use both. We're betting the protocol overhead falls away as the boundary gets real.

Hide credentials from the agent — don't just protect them.

App-layer secret management is a wish. A network that can't expose your keys is an answer. Same logic for identity, audit, and policy: runtime properties, not application-code conventions.
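
As a rough illustration of what hiding credentials at the network layer could look like, here is a minimal Python sketch. It is not Evershell's implementation: `EgressProxy`, `Request`, `EgressDenied`, the vault contents, and the host names are all hypothetical. The agent builds a request with no secret in it, and a proxy outside the sandbox decides whether the host is reachable and attaches the key on the way out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Request:
    """An outbound HTTP request as the agent sees it: no secrets attached."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)


class EgressDenied(Exception):
    """Raised when the agent tries to reach a host outside the policy."""


class EgressProxy:
    """Hypothetical boundary component that lives outside the agent's sandbox."""

    def __init__(self, vault: dict):
        # host -> API key; held by the proxy only, never handed to the agent.
        self._vault = dict(vault)

    def forward(self, request: Request) -> Request:
        host = urlsplit(request.url).hostname
        if host not in self._vault:
            raise EgressDenied(f"{host} is not an allowed destination")
        # Drop any Authorization header the agent supplied, then attach the real one.
        headers = {k: v for k, v in request.headers.items()
                   if k.lower() != "authorization"}
        headers["Authorization"] = f"Bearer {self._vault[host]}"
        return Request(request.method, request.url, headers)


# Assumed example values: one allowed host and a placeholder key.
proxy = EgressProxy({"api.example.com": "sk-hypothetical-key"})
agent_request = Request("GET", "https://api.example.com/v1/items")
outbound = proxy.forward(agent_request)
```

In this shape the key exists only on the proxy side of the boundary: nothing the agent can read or log contains it, and a request to an unlisted host fails at the network rather than in application code.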

Without a runtime boundary, every autonomy decision is a tradeoff.

That tradeoff is the ceiling on what teams will let agents do. The boundary isn't what restricts agents; its absence is.

Define the boundary in the runtime, not in app code.

Every other modern security model lives at the runtime layer — kernel, network, IAM. That's where rules become enforcement, not just declarations.

How we're building it

  1. Hosted product first.

    A real platform that real customers run real workloads on. Get the abstraction right by working with people who feel the pain.

  2. Then open the foundation.

    Once the model is proven, open-source the runtime, the proxy, and the capability spec. Governance infrastructure that requires trust in the vendor's binary is not, ultimately, governance infrastructure.

  3. Then a community marketplace.

    Capability packs for the APIs people actually use. Compliance templates you don't have to write from scratch. The platform curates; the community contributes.

We're at step one — hosted runtime live, working with early customers.

The team

Built by engineers with backgrounds in machine learning at Apple, production LLM systems and distributed-systems architecture at Piano, and over a decade of platform engineering across European product companies.

Evershell sits between two worlds: AI systems that don't behave predictably, and infrastructure that has to. The team's shared bias: care more about what survives production than what survives a demo.

Talk to us

If you're running 5 to 20 agents and feeling where your current runtime starts to bend, we'd like to hear about your setup.