Agent roles & providers

An agent role is the configuration unit an operator ships to their agents. It binds together everything a workspace needs to run: which LLM provider it talks to, which model, which capabilities (allow rules + transforms + budgets), which content packs, how long tasks may run, how long workspaces sit idle, and whether to keep warm pods ready to claim. Every workspace points at exactly one role.

An agent provider is the LLM backend the role dispatches to. It carries the upstream domain, the credential the proxy mints tokens from, and a kind discriminator that tells the agent how to format requests (Anthropic Messages, OpenAI Chat/Responses, Google Gemini, Vertex AI, Azure OpenAI).

This page covers both, plus the canvases the console gives you for editing them and reviewing the access posture they produce.

A provider row carries:

| Field | Purpose |
| --- | --- |
| name | Slug used by the role’s provider_id reference. Unique within the org. |
| display_name | Operator-visible label. Cosmetic, not unique. |
| kind | Closed enum (see below). Determines wire shape. |
| domain | Upstream hostname the proxy admits. The role’s capabilities can reach this domain on this provider’s behalf. |
| credential_name | Which credential the proxy uses to authenticate to the upstream. The credential’s provider must match kind. |
| default_model | Used when a role under this provider doesn’t override model. |
| capabilities | YAML the role merges its own capabilities onto (see Capabilities below). |
| config | Per-kind extras keyed by kind: api_version / api_type for Azure OpenAI, project / location for Vertex. |
| archived_at | Set when the provider was archived. |

The closed-enum provider kinds:

| Kind | Wire | config keys |
| --- | --- | --- |
| anthropic | Anthropic Messages API | |
| openai | OpenAI Chat/Responses | |
| gemini | Google Gemini API | |
| vertex | Vertex AI | project, location |
| azure-openai | Azure OpenAI Service | api_version, api_type |
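
As an illustration of how the kind determines which config keys apply, here is a minimal sketch of a per-kind check. The CONFIG_KEYS mapping and the validate_provider_config helper are hypothetical names, not platform API, and the sketch assumes the listed keys are required and that no other keys are accepted, which the table does not state.

```python
# Per-kind config keys, copied from the table above.
# CONFIG_KEYS and validate_provider_config are hypothetical names.
CONFIG_KEYS = {
    "anthropic": set(),
    "openai": set(),
    "gemini": set(),
    "vertex": {"project", "location"},
    "azure-openai": {"api_version", "api_type"},
}


def validate_provider_config(kind: str, config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    if kind not in CONFIG_KEYS:
        return [f"unknown kind: {kind!r}"]
    expected = CONFIG_KEYS[kind]
    present = set(config)
    # Assumption: listed keys are required, anything else is rejected.
    problems = [f"missing config key: {k}" for k in sorted(expected - present)]
    problems += [f"unexpected config key: {k}" for k in sorted(present - expected)]
    return problems
```

Returning a list of problems rather than raising on the first one mirrors how an editor would surface several field errors at once.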

The same kind can appear multiple times in an org — for example, an anthropic provider for production with a service-account credential and a second anthropic provider for staging with a developer’s credential. Roles pick a provider; the provider determines which credential the proxy injects.

Archiving a provider is destructive in scope: in one transaction the platform archives every role that references it and every stopped workspace under those roles. The archive is blocked if any of those roles has an active (non-stopped) workspace, so the operator can’t accidentally tear down running agents. The audit row’s cascaded_roles count records how many roles were swept in the same transaction.

Unarchive only restores the provider itself. Roles and workspaces stay archived until you unarchive them individually.
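
The archive and unarchive rules above can be sketched with an in-memory model. The dataclasses and function names here are hypothetical, and the check-then-mutate ordering only stands in for the single database transaction the platform uses.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    state: str  # e.g. "stopped", "busy", "idle"
    archived: bool = False


@dataclass
class Role:
    name: str
    workspaces: list = field(default_factory=list)
    archived: bool = False


@dataclass
class Provider:
    name: str
    roles: list = field(default_factory=list)
    archived: bool = False


def archive_provider(provider: Provider) -> dict:
    # Check everything first so a blocked archive changes nothing.
    for role in provider.roles:
        if any(ws.state != "stopped" for ws in role.workspaces):
            raise RuntimeError(f"role {role.name!r} has an active workspace")
    cascaded = 0
    for role in provider.roles:
        if not role.archived:
            role.archived = True
            cascaded += 1
        for ws in role.workspaces:
            ws.archived = True
    provider.archived = True
    return {"cascaded_roles": cascaded}


def unarchive_provider(provider: Provider) -> None:
    # Only the provider comes back; roles and workspaces stay archived.
    provider.archived = False
```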

A role row carries the full configuration:

| Field | Purpose |
| --- | --- |
| name | Slug: evershell ps --role <name>, POST /v1/tasks {role_name}. Unique within the org. Routing only; not shown to the agent. |
| display_name | Operator-visible label for the role list and the role node on the canvases. Not shown to the agent. |
| description | Free-form prose passed to the agent in its system prompt as operator-provided role instructions. Actively shapes how the agent behaves: what its job is, what to prioritise, what tone to take. Use it like a role-level briefing. |
| provider_id | Which provider the role dispatches to. |
| model | Optional. Empty falls back to the provider’s default_model. |
| capabilities | Inline capability YAML (see Capabilities below). |
| environment | Free-form slug: production / staging / development / your own. Surfaced on audit rows and tracing spans for filtering; not injected into the agent’s prompt. |
| environment_description | Free-form prose passed to the agent in its system prompt as operator-provided environment context. Use it to tell the agent how to behave in this environment (caution levels, billing implications, who sees the output, etc.). |
| task_timeout_seconds, idle_timeout_seconds | Per-task time limit; idle-stop deadline. Both fall back to system defaults when unset. |
| thinking_effort | Reasoning-budget knob. Closed enum per provider kind (see below). |
| max_continuations | How many compaction rounds a single task may chain before the agent wraps up. |
| max_context_tokens | The context ceiling the runtime watches for the ~80% wrap-up trigger. |
| pool_policy, pool_size, pool_max_size | Warm-pool configuration (see Pool policy below). |
| pack_ids[] | Ordered list of content packs attached to this role. |
| archived_at | Soft-delete tombstone. |

Reasoning-budget control is per-provider-kind, and each kind has a different closed set:

| Provider kind | thinking_effort values | Default |
| --- | --- | --- |
| anthropic | off, low, medium, high, xhigh, max | medium |
| openai | none, low, medium, high, xhigh | medium |
| gemini | minimal, low, medium, high | medium |
| vertex | same as anthropic | medium |
| azure-openai | same as openai | medium |

Some specific models override this (e.g. older models without thinking support, or preview models with a different shape) — the editor surfaces the right options for the picked provider/model pair, and rejects mismatches at save time.

A role’s effective capability set is the merge of:

  1. The provider’s capabilities YAML.
  2. The role’s own inline capabilities YAML.

The merge is by capability name: a role-defined capability with the same name as a provider-defined one overrides the provider’s; capabilities the provider didn’t define are appended. The provider’s role is to ship the baseline admission for its own upstream (e.g. an anthropic provider ships an anthropic-llm capability that admits POST /v1/messages with appropriate budget counters); roles override or extend on top.
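
The merge-by-name rule can be sketched in a few lines. This assumes each capability is a dict with a name key, which is a simplification of the real YAML shape; merge_capabilities is a hypothetical helper.

```python
def merge_capabilities(provider_caps: list[dict], role_caps: list[dict]) -> list[dict]:
    """Merge by capability name: role entries override, new names are appended."""
    # Dicts keep insertion order, so provider capabilities keep their position
    # even when a role capability of the same name replaces them.
    merged = {cap["name"]: cap for cap in provider_caps}
    for cap in role_caps:
        merged[cap["name"]] = cap
    return list(merged.values())


provider_caps = [{"name": "anthropic-llm", "budget": 100}]
role_caps = [
    {"name": "anthropic-llm", "budget": 20},  # overrides the provider's entry
    {"name": "ticket-api"},                    # appended
]
effective = merge_capabilities(provider_caps, role_caps)
```

Here effective holds the role's anthropic-llm entry first, followed by ticket-api. The budget field and the ticket-api name are placeholders for illustration only.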

Content packs aren’t a separate runtime merge tier. Pack capabilities reach the role’s policy through the editor, not through a runtime merge: when the operator attaches a pack on the role-editor canvas, the pack’s capabilities are auto-copied into the role’s in-memory capability list (skipping any names that already exist on the role), and the operator can edit or remove them before saving. Detaching the pack later removes its file mount from future workspaces but leaves the copied capabilities on the role — the operator removes them explicitly if they want them gone. See Content packs — Capabilities for the full story.
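
The attach and detach behaviour described above can be sketched as follows, again assuming capabilities are dicts with a name key. attach_pack and detach_pack are hypothetical helpers that model the editor's in-memory state, not a platform API.

```python
def attach_pack(role_caps: list[dict], pack_ids: list[str],
                pack_id: str, pack_caps: list[dict]) -> tuple[list[dict], list[str]]:
    """Attach a pack: copy its capabilities, skipping names the role already has."""
    existing = {cap["name"] for cap in role_caps}
    copied = [dict(cap) for cap in pack_caps if cap["name"] not in existing]
    return role_caps + copied, pack_ids + [pack_id]


def detach_pack(role_caps: list[dict], pack_ids: list[str],
                pack_id: str) -> tuple[list[dict], list[str]]:
    """Detach a pack: the pack id goes away, the copied capabilities stay."""
    return role_caps, [p for p in pack_ids if p != pack_id]
```

Because the copy is a one-time editor action rather than a runtime merge, detaching is not the inverse of attaching; the operator has to remove the copied capabilities by hand.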

The merged YAML (provider + role) compiles to a Rego policy + a digested wire form at workspace activate-time and is handed to the proxy, which enforces it for the lifetime of the workspace.

For the YAML shape itself — allow rules, transforms, budget counters, save-time validation — see caps.yaml.

Cold-starting a workspace pod takes 20–30 seconds. For roles where task-submit latency matters, set a pool policy and the platform keeps pre-warmed pods ready for sub-second claim.

Three policies:

| Policy | Behaviour |
| --- | --- |
| "" (disabled) | No pre-warming. Every workspace under this role cold-starts. |
| static | Keep exactly pool_size ready pods. The pool refills as workspaces claim out of it. |
| dynamic | Keep at least pool_size, up to pool_max_size. Scales with demand. |

For dynamic, the platform measures claim rate over a 5-minute rolling window. When claims happen, the target lifts to recent_claims + 50% buffer, clamped to [pool_size, pool_max_size]. When the pool has been idle for 5 minutes with no claims, it drains back to pool_size. If pool_max_size is unset, it defaults to 3 × pool_size.

Pool pods are full workspace pods (same image, same packs mounted, agent process running), but with the external proxy in deny-all mode because it has not yet been activated for any workspace. On claim, the control plane (CP) relabels the pod and activates the proxy with the workspace’s config; from that moment on, the pod is the workspace.

Pool state is visible per role in the console as the pool chip on each role card: ready / creating / active / target / max. The chip is driven by an SSE stream so it stays current without polling.

A role has two removal verbs:

  • Archive — soft-delete with cascade. Archiving a role archives every stopped workspace under it in the same transaction. The archive is blocked if any of those workspaces is still active (provisioning / busy / idle / stopping); stop them first — either individually or via the role’s “stop all workspaces” action (POST /v1/agent-roles/{id}/stop-workspaces) which stops everything under the role in one shot. Archived roles disappear from default listings; an Active/Archived toggle on the role list brings them back.
  • Delete — hard delete from the database. Rejected with 409 role_in_use (foreign-key enforced) if any workspace still references the role, including stopped ones. Use it only when there are zero workspaces under the role; for retiring an established role, archive instead.

Provider archive cascades into role archive, as noted above. Unarchiving a role is blocked if its parent provider is archived — unarchive the provider first.

Role Topology (Agent Roles → View Topology button) is a graph rendering of the org’s roles and every upstream service their capabilities reach. In short, it shows what is configured to talk to what.

  • Left panel: a compact list of every role in the org. A filter input narrows the list; each row shows the role’s display name, the bound provider and model, an edit pencil (when you have capability-write permission), and a live active-agents count chip when there’s at least one workspace in provisioning / busy / idle under that role. Clicking a row focuses that role on the canvas.
  • Right canvas: a graph of the roles and the services they touch, with pan / zoom / fullscreen controls, a layout toggle between hierarchical and force-directed, and a Graph search input that narrows the canvas to nodes and edges matching the query (role names, domains, capabilities). Clicking a role node focuses it the same way the left panel does.

Two node types:

  • Role nodes — one per agent role. Show the role’s display name and slug, the bound provider’s name + model, a live count of active agents (workspaces in provisioning / busy / idle) when greater than zero, and chips for the attached content packs. When the Risk overlay is on, a capability-breadth badge (narrow / moderate / broad / unrestricted) sits in the top-right corner. Right-clicking a role node opens a context menu with Edit role (when you have capability-write permission) and View live agents (jumps to the Live Agents view focused on that role’s running workspaces, when there are any).
  • API-service nodes — one per upstream domain any role’s capabilities reach.

Edges:

  • Implicit channel edges between roles that share access to the same writable domain (e.g. two roles can both write to the same Slack channel). These are the channels the leak-path detector walks.
  • Capability edges between roles and the API-service nodes their capabilities admit. Each edge carries methods, paths, the capabilities authorising the call, the credentials and secrets the proxy injects, and the budget counters that meter it.

The topology overlay toolbar adds five risk-aware layers you can toggle on top of the static graph:

| Toggle | What it shows |
| --- | --- |
| Labels | Data classification (public / internal / confidential / restricted), network exposure (internet-facing / internal-only), and access patterns (read / read-write / delete / bulk) on nodes and edges. |
| Risk | Capability-breadth badge (narrow / moderate / broad / unrestricted) and sensitivity-exposure badge on each role node. |
| Leaks | Highlights leak paths. A leak path is a sensitive-data domain reachable on the same edge or implicit-channel network as an internet-facing one. Direct leaks are a single role with both kinds of access; transitive leaks bridge two roles via a shared writable domain. The toggle’s badge counts the paths the analyser found. |
| Defense | Per-edge defense-depth score: how many of six layers (HTTPS, domain pinning, method pinning, path pinning, credentials, rate limiting) are active vs. applicable. Edge styling and edge tooltips break it down. |
| Blast | Hover a role to see its blast radius: every domain it reaches directly, plus every domain reachable transitively via implicit channels. Colored by data sensitivity. |
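
To make the direct versus transitive distinction in the Leaks row concrete, here is a simplified sketch. It is not the platform's analyser: find_leak_paths is a hypothetical function, roles are modelled as a mapping from domain to a can-write flag, and an implicit channel is assumed to exist only when both roles can write to the shared domain. All domain names are placeholders.

```python
from itertools import combinations


def find_leak_paths(roles: dict, sensitive: set, internet_facing: set) -> list[tuple]:
    """roles maps role name -> {domain: can_write}. Returns (kind, via, src, dst)."""
    leaks = []
    # Direct: one role reaches both a sensitive and an internet-facing domain.
    for name, domains in roles.items():
        for s in sorted(set(domains) & sensitive):
            for i in sorted(set(domains) & internet_facing):
                if s != i:
                    leaks.append(("direct", name, s, i))
    # Transitive: two roles bridged by a domain both can write to.
    for (a, da), (b, db) in combinations(roles.items(), 2):
        if not any(da[d] and db[d] for d in set(da) & set(db)):
            continue
        for src, dst, sd, dd in ((a, b, da, db), (b, a, db, da)):
            for s in sorted(set(sd) & sensitive):
                for i in sorted(set(dd) & internet_facing):
                    if s != i:
                        leaks.append(("transitive", f"{src}->{dst}", s, i))
    return leaks


roles = {
    "triage": {"hr-db.internal": False, "chat.internal": True},
    "notifier": {"chat.internal": True, "paste.example": True},
}
paths = find_leak_paths(roles, {"hr-db.internal"}, {"paste.example"})
```

Neither role leaks on its own here, but the shared writable chat.internal domain bridges them, so paths contains a single transitive entry from hr-db.internal to paste.example.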

When roles or providers change in another tab or by another operator while you have the topology open, a Data changed — update graph prompt appears on the canvas. Clicking it re-lays the canvas against the latest state; the graph stays on its previous frame until you do (so a parallel rename or delete doesn’t jitter the canvas under you mid-inspection).

The topology canvas has a Risk Posture panel (icon button on the canvas, opens a slide-out on the right). The panel renders a generated markdown report that summarises:

  • Roles analysed — counts by capability breadth (unrestricted / broad / moderate / narrow) and by sensitivity exposure (restricted / confidential / internal / public).
  • Domains — every domain any role reaches, grouped by classification.
  • Network — internet-facing vs. internal-only domain counts.
  • Defense depth — connection counts in strong (5–6 layers active) / moderate (3–4) / weak (0–2) buckets, plus a top-N list of the weakest edges with the specific missing layers named.
  • Leak paths — every direct and transitive leak the analyser found, with the sensitive domain, its classification, and the internet-facing domain it’s reachable from.
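
The defense-depth buckets in the report follow directly from the layer count. A minimal sketch, with hypothetical layer identifiers and function names, and counting active layers only (the active vs. applicable distinction is left out for brevity):

```python
# The six layers named in the Defense overlay; identifiers are hypothetical.
LAYERS = ("https", "domain_pinning", "method_pinning",
          "path_pinning", "credentials", "rate_limiting")


def defense_bucket(active: set[str]) -> str:
    """strong = 5-6 layers active, moderate = 3-4, weak = 0-2."""
    count = len(active & set(LAYERS))
    if count >= 5:
        return "strong"
    if count >= 3:
        return "moderate"
    return "weak"


def missing_layers(active: set[str]) -> list[str]:
    return [layer for layer in LAYERS if layer not in active]


def weakest_edges(edges: dict[str, set[str]], n: int = 3) -> list[str]:
    """Top-N edges with the fewest active layers (ties keep input order)."""
    return sorted(edges, key=lambda e: len(edges[e] & set(LAYERS)))[:n]
```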

The report is a snapshot of what the canvas is showing, in markdown — useful for handing to a security reviewer or pasting into a ticket.

The role editor at /agent-roles/new (or /agent-roles/:id/edit) is canvas-based — same graph rendering as the topology view, but scoped to one role and with the capabilities editable in place.

For a new role, the editor opens on a Role Template Picker first. Templates are pre-configured starting points; each carries a difficulty badge (beginner / intermediate / advanced) and pre-seeds defaults — name, description, provider kind, recommended packs, thinking effort, timeouts. The current templates:

  • Hello World — beginner. The smallest possible agent, for first-tenant onboarding.
  • Script Runner — runs scripts and processes data.
  • DevOps Monitor — pages, dashboards, deploy operations.
  • SecOps Triage — security alert triage and enrichment.
  • Incident Responder — investigating live incidents.

A separate Start from Scratch card opens the editor with an empty role.

Once a template is picked (or skipped), the canvas opens focused on the role node with its surroundings around it — attached packs and the API-service nodes the role’s capabilities reach — laid out the same way the topology view draws them, just scoped to this one role and editable.

The editor has four side panels the operator opens and closes independently — at most three are visible at once, with FIFO eviction when a fourth gets opened:

  • Properties — display name, description, provider, model, environment + environment description, task timeout, idle timeout, reasoning effort, max continuations, max context tokens, pool policy + size + max size. Validated as you type; field errors surface inline.
  • Capability Editor — the inline capability YAML for the role. A YAML pane and a structured form, switchable; the parser runs continuously and surfaces validation errors alongside save-time warnings (missing credentials, schema-format mismatches, etc.).
  • Templates — a library to pull capabilities from. Three sources: a curated capability preset library (separate from the role templates above), From Roles (capabilities defined on other roles in the org), and From Providers (capabilities defined on other providers). Drag a card onto the canvas to add it to the role; a blank-capability option is there for hand-writing from scratch.
  • Content Packs — toggles attached packs on or off for this role. Attaching a pack auto-copies its capabilities into the role’s in-memory list (see Capabilities above); detaching doesn’t undo that copy. Unattached packs named in an attached pack’s suggested_packs are highlighted with a suggested badge (e.g. attach google-search and web-fetch lights up) — see Content packs — Pack-to-pack suggestions.
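
The three-visible-panels rule with FIFO eviction behaves like a bounded queue. The PanelStack class below is a hypothetical sketch of that behaviour, not the console's implementation.

```python
from collections import deque


class PanelStack:
    """At most max_visible panels; opening another evicts the oldest."""

    def __init__(self, max_visible: int = 3):
        self.max_visible = max_visible
        self.visible = deque()

    def open(self, panel: str) -> None:
        if panel in self.visible:
            return  # already open, nothing to evict
        if len(self.visible) == self.max_visible:
            self.visible.popleft()  # FIFO: the earliest-opened panel closes
        self.visible.append(panel)

    def close(self, panel: str) -> None:
        if panel in self.visible:
            self.visible.remove(panel)


stack = PanelStack()
for name in ("Properties", "Capability Editor", "Templates", "Content Packs"):
    stack.open(name)
```

After the loop, Properties has been evicted and the other three panels remain visible in the order they were opened.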

Each pre-built capability is also draggable directly onto the canvas, and clicking a capability edge in the graph opens that capability (and any siblings on the same edge) in the Capability Editor.

Beyond auto-copying the attached pack’s own capabilities into the role (see Capabilities), the editor also cross-suggests neighbouring things in both directions:

  • Attaching a pack opens the Templates panel and briefly glows capabilities (from other packs or the built-in template library) whose domains overlap the just-attached pack — so you see what else is available against the same domain.
  • Adding a capability with a new domain opens the Content Packs panel and briefly glows any packs whose capabilities overlap that domain — so you see whether attaching a whole pack would be a better shortcut than writing capabilities one at a time.

These cross-suggestions don’t auto-apply — they’re nudges.

  • Capabilities the provider defines can’t be deleted from the UI. The backend merges the provider’s base capabilities back in on every save regardless of what’s in the role YAML — so the editor blocks deletes of those names rather than letting you author something the server will silently undo.
  • Risk Posture and Access Review panels are mutually exclusive with the editing panels. Opening either of those closes (and remembers) whatever editing panels were open; closing them restores the stack.

Save fires a single API call — create for a new role, an only-changed-fields update for an existing one.

The editor polls the role + provider state while open. If another operator edits the same role in parallel, the header surfaces a “modified externally” marker and a Data changed button next to Save, and Save itself is blocked until the divergence is reconciled. Clicking Data changed:

  • No conflict (their edits are disjoint from yours): the remote update auto-applies under you, your edits stay intact, and Save unblocks.
  • Conflict (you both touched the same fields or capability names): a ConflictResolver opens with a three-way diff — your local change, the remote change, the common ancestor. Pick per-field which side wins; the merged result becomes the new baseline and Save unblocks.
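
The two outcomes above correspond to a field-level three-way merge. The sketch below is a generic version of that technique over flat dicts; three_way_merge is a hypothetical name, and treating identical edits on both sides as a non-conflict is a simplifying assumption.

```python
_MISSING = object()  # sentinel for "field absent on this side"


def three_way_merge(base: dict, local: dict, remote: dict) -> tuple[dict, list[str]]:
    """Return (merged fields, conflicting field names)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(local) | set(remote)):
        b = base.get(key, _MISSING)
        l = local.get(key, _MISSING)
        r = remote.get(key, _MISSING)
        if l == r:
            value = l            # same on both sides (assumed non-conflicting)
        elif l == b:
            value = r            # only the remote side changed
        elif r == b:
            value = l            # only the local side changed
        else:
            conflicts.append(key)  # both changed, differently: needs a pick
            continue
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts
```

An empty conflicts list corresponds to the auto-apply case; a non-empty one is where a per-field resolver would ask which side wins.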

The role editor canvas also has an Access Review panel (bottom slide-up, opened from the Shield icon next to the Risk Posture button). It’s the workflow for closing the loop on real-world agent traffic.

The panel queries the role’s recent HTTP requests through the proxy and aggregates them by (method, destination, path, decision):

  • A row for every distinct request the agent attempted, with count (how often), last-seen timestamp, and the policy decision (allow or deny — with the deny reason).
  • Filter by decision, search by domain / method / path / capability / reason, sort by count or recency or destination.
  • Each denied row has a + Add capability button that produces a capability skeleton matching the destination, the method, and the path, and inserts it into the role’s inline YAML. From there you can refine the path patterns, add a schema, attach budget counters, and save — turning a stream of repeated denies into an explicit admission.

A custom time range narrows what the panel aggregates over.
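
The aggregation step can be sketched as a group-by over request records. Everything here is illustrative: the record fields (method, destination, path, decision, ts), the function names, and the skeleton's keys are hypothetical and do not reflect the real caps.yaml shape.

```python
def aggregate_requests(requests: list[dict]) -> dict:
    """Group by (method, destination, path, decision) with count and last-seen."""
    rows = {}
    for req in requests:
        key = (req["method"], req["destination"], req["path"], req["decision"])
        row = rows.setdefault(key, {"count": 0, "last_seen": req["ts"]})
        row["count"] += 1
        row["last_seen"] = max(row["last_seen"], req["ts"])
    return rows


def capability_skeleton(method: str, destination: str, path: str) -> dict:
    # Placeholder shape only; the real skeleton follows the caps.yaml schema.
    return {"destination": destination, "method": method, "path": path}


def skeletons_for_denies(rows: dict) -> list[dict]:
    """One skeleton per denied row, most frequent first."""
    denied = [k for k in rows if k[3] == "deny"]
    denied.sort(key=lambda k: -rows[k]["count"])
    return [capability_skeleton(m, d, p) for (m, d, p, _) in denied]
```

Sorting denied rows by count puts the most frequently blocked requests first, which is usually where an explicit admission pays off soonest.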