Agency System — Workgraph

Core concepts

Instead of every agent being a generic assistant, workgraph gives agents composable identities built from three primitives:

Roles

A role defines what an agent does — its skills and desired outcome. Example: a "Programmer" role has skills in code-writing and testing, with a desired outcome of "working, tested code."

Tradeoffs

A tradeoff defines how an agent prioritizes — what compromises are acceptable and what constraints are non-negotiable. Example: a "Careful" tradeoff accepts being slow or verbose, but rejects producing unreliable or untested output.

Agents

An agent pairs a role with a tradeoff to form a complete identity. The same role with different tradeoffs produces different behavior: a "Careful Programmer" works differently than a "Fast Programmer."

Agents can be human or AI. AI agents require both a role and a tradeoff, which are injected into the prompt. For human agents, who use their own judgment, both are optional.

The agency loop

The agency system runs as a four-stage loop:

  1. Assign         2. Execute         3. Evaluate        4. Evolve
  identity          task               results            agency
  to task           (agent runs)       (score agent)      (recombine)
       │                 │                  │                  │
       └────────────────>└─────────────────>└─────────────────>┘
                                                               │
       ┌───────────────────────────────────────────────────────┘
       │             performance data feeds back
       ▼

Getting started

Seed the built-in starter roles and tradeoffs:

wg agency init

This creates four starter roles (Programmer, Reviewer, Documenter, Architect) and four tradeoffs (Careful, Fast, Thorough, Balanced), then pairs them into a default agent.

Or create your own:

# Create a role
wg role add "Programmer" \
  --outcome "Working, tested code" \
  --skill code-writing \
  --skill testing \
  --description "Writes, tests, and debugs code"

# Create a tradeoff
wg tradeoff add "Careful" \
  --accept "Slow" \
  --accept "Verbose" \
  --reject "Unreliable" \
  --reject "Untested" \
  --description "Prioritizes reliability over speed"

# Pair them into an agent
wg agent create "Careful Programmer" \
  --role <role-hash> \
  --tradeoff <tradeoff-hash>

Assigning agents to tasks

wg assign <task-id> <agent-hash>

When the service spawns that task, the agent's role and tradeoff are rendered into the prompt as an identity section. The agent sees its skills, desired outcome, acceptable tradeoffs, and non-negotiable constraints.
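
To make the identity section concrete, here is a minimal Python sketch of how a role and a tradeoff might be rendered into prompt text. It is an illustration only: the `Role` and `Tradeoff` classes, the `render_identity` function, and the exact wording of the output are assumptions, not workgraph's actual data model or prompt template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    # Hypothetical shape: mirrors the fields passed to `wg role add`.
    name: str
    outcome: str
    skills: tuple[str, ...]


@dataclass(frozen=True)
class Tradeoff:
    # Hypothetical shape: mirrors the fields passed to `wg tradeoff add`.
    name: str
    accepts: tuple[str, ...]
    rejects: tuple[str, ...]


def render_identity(role: Role, tradeoff: Tradeoff) -> str:
    """Build an illustrative identity section (the layout is assumed)."""
    lines = [
        f"Role: {role.name}",
        f"Skills: {', '.join(role.skills)}",
        f"Desired outcome: {role.outcome}",
        f"Tradeoff: {tradeoff.name}",
        f"Acceptable tradeoffs: {', '.join(tradeoff.accepts)}",
        f"Non-negotiable constraints: {', '.join(tradeoff.rejects)}",
    ]
    return "\n".join(lines)


programmer = Role("Programmer", "Working, tested code", ("code-writing", "testing"))
careful = Tradeoff("Careful", ("Slow", "Verbose"), ("Unreliable", "Untested"))
identity = render_identity(programmer, careful)
print(identity)
```

Swapping `careful` for a different tradeoff changes only the last three lines of the section, which is the sense in which the same role yields different agents.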

Enable automatic assignment:

wg config --auto-assign true

The coordinator creates .assign-* meta-tasks that use an LLM to match the best agent to each task based on required skills and agent capabilities.

Evaluation

After a task completes, evaluate the agent's work across four dimensions:

  Dimension        Weight  Description
  ───────────────────────────────────────────────────────────────────
  Correctness      40%     Does the output match the desired outcome?
  Completeness     30%     Were all aspects of the task addressed?
  Efficiency       15%     Was work done without unnecessary steps?
  Style adherence  15%     Were conventions and constraints followed?
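
As a worked example of these weights, the sketch below combines four per-dimension scores into one overall score. It assumes that each dimension is scored from 0 to 1 and that the weights are applied as a plain weighted sum; the source does not spell out the formula, and `WEIGHTS` and `overall_score` are hypothetical names.

```python
# Weights taken from the table above, expressed as fractions of 1.
WEIGHTS = {
    "correctness": 0.40,
    "completeness": 0.30,
    "efficiency": 0.15,
    "style_adherence": 0.15,
}


def overall_score(dimensions: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores (assumed combination rule)."""
    missing = WEIGHTS.keys() - dimensions.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)


score = overall_score(
    {
        "correctness": 1.0,
        "completeness": 0.8,
        "efficiency": 0.6,
        "style_adherence": 0.9,
    }
)
# 0.40*1.0 + 0.30*0.8 + 0.15*0.6 + 0.15*0.9 = 0.865
print(round(score, 3))
```

Because correctness carries 40% of the weight, a fully correct but inefficient result still scores well under this rule.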

# LLM-based evaluation
wg evaluate run <task-id>

# Record from external sources
wg evaluate record --task <task-id> --score 0.9 --source "manual"

# View evaluation history
wg evaluate show --agent <agent-hash>

Scores propagate to three levels: the agent's performance record, the role's record, and the tradeoff's record. Over time, this shows which roles, tradeoffs, and role+tradeoff combinations perform best.
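
The three-level propagation can be sketched as follows. This is a simplified in-memory model, not workgraph's storage format: the `records` mapping, the `record_score` function, and the readable IDs are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical store: entity ID -> list of scores recorded against it.
records: dict[str, list[float]] = defaultdict(list)


def record_score(agent_id: str, role_id: str, tradeoff_id: str, score: float) -> None:
    """Append one evaluation score to the agent, its role, and its tradeoff."""
    for entity_id in (agent_id, role_id, tradeoff_id):
        records[entity_id].append(score)


# Two agents share the Programmer role but use different tradeoffs.
record_score("agent-careful-programmer", "role-programmer", "tradeoff-careful", 0.9)
record_score("agent-fast-programmer", "role-programmer", "tradeoff-fast", 0.7)

# The role's record now aggregates evaluations from both agents.
role_avg = mean(records["role-programmer"])
print(f"{role_avg:.2f}")
```

Recording at all three levels is what lets a role or tradeoff be compared independently of the specific agents that used it.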

Enable automatic evaluation:

wg config --auto-evaluate true

Evolution

Once you have enough evaluations, use the evolver to improve the agency:

wg evolve run

The evolver uses accumulated performance data to recombine the best-performing role+tradeoff pairs and retire underperformers. It supports multiple strategies and run modes:

# Targeted mutations
wg evolve run --strategy mutation --budget 3

# Preview without applying
wg evolve run --dry-run

# Autopoietic cycle — evolution as a recurring process
wg evolve run --autopoietic --max-iterations 5

Agency stats

View aggregate performance data:

$ wg agency stats

  Role        Tradeoff    Avg Score  Tasks  Rating
  ─────────────────────────────────────────────────
  Programmer  Thorough    0.93        63     HIGH
  Writer      Balanced    0.88        19     HIGH
  Analyst     Balanced    0.88        22     HIGH
  Programmer  Fast        0.71        12     MED
  Analyst     Cautious    0.54         8     LOW

Stats include role and tradeoff leaderboards, a synergy matrix showing which combinations perform best, and under-explored pairings.
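
For intuition about the synergy matrix and under-explored pairings, here is a small sketch that averages scores per role+tradeoff pair and lists the pairs with too few evaluations. The sample evaluations are made up for illustration, and `synergy_matrix`, `under_explored`, and the `min_tasks` threshold are hypothetical; how `wg agency stats` computes these is not described in this document.

```python
from collections import defaultdict
from itertools import product

# Illustrative sample data: (role, tradeoff, score). Not real results.
evaluations = [
    ("Programmer", "Careful", 0.9),
    ("Programmer", "Careful", 0.7),
    ("Programmer", "Fast", 0.6),
    ("Reviewer", "Careful", 0.8),
]


def synergy_matrix(evals):
    """Map each (role, tradeoff) pair to (average score, task count)."""
    totals = defaultdict(lambda: [0.0, 0])
    for role, tradeoff, score in evals:
        totals[(role, tradeoff)][0] += score
        totals[(role, tradeoff)][1] += 1
    return {pair: (total / count, count) for pair, (total, count) in totals.items()}


def under_explored(evals, min_tasks=1):
    """List pairs of known roles and tradeoffs with fewer than min_tasks evaluations."""
    roles = sorted({role for role, _, _ in evals})
    tradeoffs = sorted({tradeoff for _, tradeoff, _ in evals})
    counts = {pair: count for pair, (_, count) in synergy_matrix(evals).items()}
    return [pair for pair in product(roles, tradeoffs) if counts.get(pair, 0) < min_tasks]


matrix = synergy_matrix(evaluations)
gaps = under_explored(evaluations)
print(gaps)
```

In this sample, the Reviewer role has never been paired with the Fast tradeoff, so that pair is the only one reported as under-explored.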

Content-hash IDs

Every role, tradeoff, and agent is identified by a SHA-256 content hash of its identity-defining fields. This gives you:

Deterministic identity: Same content always produces the same ID
Automatic deduplication: Can't create two identical entities
Immutable identity: Changing identity fields creates a new entity — the old one stays

IDs are displayed as 8-character prefixes (e.g. a3f7c21d). All commands accept unique prefixes for convenience.
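
The properties above follow from hashing content rather than assigning IDs. The sketch below shows the idea with Python's standard `hashlib`. The canonical JSON serialization, the choice of fields, and the `content_id` and `resolve_prefix` helpers are assumptions; workgraph's actual serialization and identity-defining fields may differ.

```python
import hashlib
import json


def content_id(fields: dict) -> str:
    """SHA-256 over a canonical JSON encoding (the encoding is an assumption)."""
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_prefix(prefix: str, known_ids: list[str]) -> str:
    """Return the single known ID starting with prefix, or raise if none or several match."""
    matches = [full_id for full_id in known_ids if full_id.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(f"prefix {prefix!r} matched {len(matches)} IDs")
    return matches[0]


role = {"outcome": "Working, tested code", "skills": ["code-writing", "testing"]}
role_id = content_id(role)

# Same content in a different key order produces the same ID.
same_role_id = content_id({"skills": ["code-writing", "testing"], "outcome": "Working, tested code"})

# Changing an identity field produces a different ID, so it is a new entity.
changed_id = content_id({**role, "outcome": "Working, documented code"})

print(role_id[:8])  # 8-character display prefix
print(resolve_prefix(role_id[:8], [role_id, changed_id]) == role_id)
```

Sorting keys before hashing is one way to keep the ID independent of field order; any stable serialization would serve the same purpose.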

Human agents

Agents with human executors (matrix, email, shell) don't need a role or tradeoff, because real people bring their own judgment. They still share the same identity model for capabilities, trust levels, and performance tracking.

wg agent create "Erik" \
  --executor matrix \
  --contact "@erik:server" \
  --capabilities rust,python,architecture \
  --trust-level verified