Merge pull request #129 from laurentenhoor/claude/update-docs-benefits-BhoG1

2026-02-11 15:48:12 +08:00
parent 163ac6ed3d b8ea37189b
commit 1e15c42657
12 changed files with 1501 additions and 896 deletions
--- a/README.md
+++ b/README.md
@@ -2,393 +2,235 @@
  <img src="assets/DevClaw.png" width="300" alt="DevClaw Logo">
 </p>
-# DevClaw - Development Plugin for OpenClaw
+# DevClaw — Development Plugin for OpenClaw
-**Every group chat becomes an autonomous development team.**
+**Turn any group chat into a dev team that ships.**
-Add the agent to a Telegram/WhatsApp group, point it at a GitLab/GitHub repo — that group now has an **orchestrator** managing the backlog, a **DEV** worker session writing code, and a **QA** worker session reviewing it. All autonomous. Add another group, get another team. Each project runs in complete isolation with its own task queue, workers, and session state.
+DevClaw is a plugin for [OpenClaw](https://openclaw.ai) that turns your orchestrator agent into a development manager. It hires developers, assigns tasks, reviews code, and keeps the pipeline moving — across as many projects as you have group chats. [Get started &rarr;](#getting-started)
-DevClaw is the [OpenClaw](https://openclaw.ai) plugin that makes this work.
+---
-## Why
+## What it looks like
-[OpenClaw](https://openclaw.ai) is great at giving AI agents the ability to develop software — spawn worker sessions, manage sessions, work with code. But running a real multi-project development pipeline exposes a gap: the orchestration layer between "agent can write code" and "agent reliably manages multiple projects" is brittle. Every task involves 10+ coordinated steps across GitLab labels, session state, model selection, and audit logging. Agents forget steps, corrupt state, null out session IDs they should preserve, or pick the wrong model for the job.
+You have two projects in two Telegram groups. You go to bed. You wake up:
-DevClaw fills that gap with guardrails. It gives the orchestrator atomic tools that make it impossible to forget a label transition, lose a session reference, or skip an audit log entry. The complexity of multi-project orchestration moves from agent instructions (that LLMs follow imperfectly) into deterministic code (that runs the same way every time).
+```
 ── Group: "Dev - My Webapp" ──────────────────────────────
-## The idea
+Agent:  "⚡ Sending DEV (medior) for #42: Add login page"
 Agent:  "✅ DEV DONE #42 — Login page with OAuth. Moved to QA."
 Agent:  "🔍 Sending QA (reviewer) for #42: Add login page"
 Agent:  "🎉 QA PASS #42. Issue closed."
 Agent:  "⚡ Sending DEV (junior) for #43: Fix button color on /settings"
 Agent:  "✅ DEV DONE #43 — Updated to brand blue. Moved to QA."
 Agent:  "❌ QA FAIL #43 — Color doesn't match dark mode. Back to DEV."
 Agent:  "⚡ Sending DEV (junior) for #43: Fix button color on /settings"
-One orchestrator agent manages all your projects. It reads task backlogs, creates issues, decides priorities, and delegates work. For each task, DevClaw assigns a developer from your **team** — a junior, medior, or senior dev writes the code, then a QA engineer reviews it. Every Telegram/WhatsApp group is a separate project — the orchestrator keeps them completely isolated while managing them all from a single process.
+  You:  "Create an issue for refactoring the profile page, pick it up."
-DevClaw gives the orchestrator nine tools that replace hundreds of lines of manual orchestration logic. Instead of following a 10-step checklist per task (fetch issue, check labels, pick model, check for existing session, transition label, dispatch task, update state, log audit event...), it calls `task_pickup` and the plugin handles everything atomically — including session dispatch. Workers call `task_complete` themselves for atomic state updates, and can file follow-up issues via `task_create`.
+Agent:  created #44 "Refactor user profile page" on GitHub — To Do
 Agent:  "⚡ Sending DEV (medior) for #44: Refactor user profile page"
-## Developer tiers
+Agent:  "✅ DEV DONE #43 — Fixed dark-mode color. Back to QA."
 Agent:  "🎉 QA PASS #43. Issue closed."
-DevClaw uses a developer seniority model. Each tier maps to a configurable LLM model:
+── Group: "Dev - My API" ─────────────────────────────────
-| Tier       | Role                | Default model                 | Assigns to                                        |
+Agent:  "🧠 Spawning DEV (senior) for #18: Migrate auth to OAuth2"
-| ---------- | ------------------- | ----------------------------- | ------------------------------------------------- |
+Agent:  "✅ DEV DONE #18 — OAuth2 provider with refresh tokens. Moved to QA."
-| **junior** | Junior developer    | `anthropic/claude-haiku-4-5`  | Typos, single-file fixes, simple changes          |
+Agent:  "🎉 QA PASS #18. Issue closed."
-| **medior** | Mid-level developer | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes           |
+Agent:  "⚡ Sending DEV (medior) for #19: Add rate limiting to /api/search"
 | **senior** | Senior developer    | `anthropic/claude-opus-4-5`   | Architecture, migrations, system-wide refactoring |
 | **qa**     | QA engineer         | `anthropic/claude-sonnet-4-5` | Code review, test validation                      |
 Configure which model each tier uses during setup or in `openclaw.json` plugin config.
 ## How it works
 ```mermaid
 graph TB
    subgraph "Group Chat A"
        direction TB
        A_O["🎯 Orchestrator"]
        A_GL[GitLab Issues]
        A_DEV["🔧 DEV (worker session)"]
        A_QA["🔍 QA (worker session)"]
        A_O -->|task_pickup| A_GL
        A_O -->|task_pickup dispatches| A_DEV
        A_O -->|task_pickup dispatches| A_QA
    end
    subgraph "Group Chat B"
        direction TB
        B_O["🎯 Orchestrator"]
        B_GL[GitLab Issues]
        B_DEV["🔧 DEV (worker session)"]
        B_QA["🔍 QA (worker session)"]
        B_O -->|task_pickup| B_GL
        B_O -->|task_pickup dispatches| B_DEV
        B_O -->|task_pickup dispatches| B_QA
    end
    subgraph "Group Chat C"
        direction TB
        C_O["🎯 Orchestrator"]
        C_GL[GitLab Issues]
        C_DEV["🔧 DEV (worker session)"]
        C_QA["🔍 QA (worker session)"]
        C_O -->|task_pickup| C_GL
        C_O -->|task_pickup dispatches| C_DEV
        C_O -->|task_pickup dispatches| C_QA
    end
    AGENT["Single OpenClaw Agent"]
    AGENT --- A_O
    AGENT --- B_O
    AGENT --- C_O
 ```
-It's the same agent process — but each group chat gives it a different project context. The orchestrator role, the workers, the task queue, and all state are fully isolated per group.
+Multiple issues shipped, a QA failure automatically retried, and a second project's migration completed — all while you slept. When you dropped in mid-stream to create an issue, the scheduler kept going before, during, and after.
-## Task lifecycle
+---
-Every task (GitLab issue) moves through a fixed pipeline of label states. Issues are created by the orchestrator agent or by worker sessions — not manually. DevClaw tools handle every transition atomically — label change, state update, audit log, and session management in a single call.
+## Why DevClaw
 ### Autonomous multi-project development
 Each project is fully isolated — own queue, workers, sessions, and state. DEV and QA execute in parallel within each project, and multiple projects run simultaneously. A token-free scheduling engine drives it all autonomously:
 - **[Scheduling engine](#automatic-scheduling)** — `work_heartbeat` continuously scans queues, dispatches workers, and drives DEV → QA → DEV [feedback loops](#how-tasks-flow-between-roles)
 - **[Project isolation](#execution-modes)** — parallel workers per project, parallel projects across the system
 - **[Role instructions](#custom-instructions-per-project)** — per-project, per-role prompts injected at dispatch time
 ### Process enforcement
 GitHub/GitLab issues are the single source of truth — not an internal database. Every tool call wraps the full operation into deterministic code with rollback on failure:
 - **[External task state](#your-issues-stay-in-your-tracker)** — labels, transitions, and status queries go through your issue tracker
 - **[Atomic operations](#what-atomic-means-here)** — label transition + state update + session dispatch + audit log in one call
 - **[Tool-based guardrails](#the-toolbox)** — 11 tools enforce the process; the agent provides intent, the plugin handles mechanics
 ### ~60-80% token savings
 Three mechanisms compound to cut token usage dramatically versus running one large model with fresh context each time:
 - **[Tier selection](#meet-your-team)** — Haiku for typos, Sonnet for features, Opus for architecture (~30-50% on simple tasks)
 - **[Session reuse](#sessions-accumulate-context)** — workers accumulate codebase knowledge across tasks (~40-60% per task)
 - **[Token-free scheduling](#automatic-scheduling)** — `work_heartbeat` runs on pure CLI calls, zero LLM tokens for orchestration
 ---
 ## The problem DevClaw solves
 OpenClaw is a great multi-agent runtime. It handles sessions, tools, channels, gateway RPC — everything you need to run AI agents. But it's a general-purpose platform. It has no opinion about how software gets built.
 Without DevClaw, your orchestrator agent has to figure out on its own how to:
 - Pick the right model for the task complexity
 - Create or reuse the right worker session
 - Transition issue labels in the right order
 - Track which worker is doing what across projects
 - Schedule QA after DEV completes, and re-schedule DEV after QA fails
 - Detect crashed workers and recover
 - Log everything for auditability
 That's a lot of reasoning per task. LLMs do it imperfectly — they forget steps, corrupt state, pick the wrong model, lose session references. You end up babysitting the thing you built to avoid babysitting.
 DevClaw moves all of that into deterministic plugin code. The agent says "pick up issue #42." The plugin handles the other 10 steps atomically. Every time, the same way, zero reasoning tokens spent on orchestration.
 ---
 ## Meet your team
 DevClaw doesn't think in model IDs. It thinks in people.
 When a task comes in, you don't configure `anthropic/claude-sonnet-4-5` — you assign a **medior developer**. The orchestrator evaluates task complexity and picks the right person for the job:
 ### Developers
 | Level | Assigns to | Model |
 |---|---|---|
 | **Junior** | Typos, CSS fixes, renames, single-file changes | Haiku |
 | **Medior** | Features, bug fixes, multi-file changes | Sonnet |
 | **Senior** | Architecture, migrations, system-wide refactoring | Opus |
 ### QA
 | Level | Assigns to | Model |
 |---|---|---|
 | **Reviewer** | Code review, test validation, PR inspection | Sonnet |
 | **Tester** | Manual testing, smoke tests | Haiku |
 A CSS typo gets the intern. A database migration gets the architect. You're not burning Opus tokens on a color change, and you're not sending Haiku to redesign your auth system.
 Every mapping is [configurable](docs/CONFIGURATION.md#model-tiers) — swap in any model you want per level.
 ---
 ## How a task moves through the pipeline
 Every issue follows the same path, no exceptions. DevClaw enforces it:
 ```
 Planning → To Do → Doing → To Test → Testing → Done
 ```
 ```mermaid
 stateDiagram-v2
    [*] --> Planning
    Planning --> ToDo: Ready for development
-    ToDo --> Doing: task_pickup (DEV) ⇄ blocked
+    ToDo --> Doing: DEV picks up
-    Doing --> ToTest: task_complete (DEV done)
+    Doing --> ToTest: DEV done
-    ToTest --> Testing: task_pickup (QA) / auto-chain ⇄ blocked
+    ToTest --> Testing: Scheduler picks up QA
-    Testing --> Done: task_complete (QA pass)
+    Testing --> Done: QA pass (issue closed)
-    Testing --> ToImprove: task_complete (QA fail)
+    Testing --> ToImprove: QA fail (back to DEV)
-    Testing --> Refining: task_complete (QA refine)
+    Testing --> Refining: QA needs human input
-    ToImprove --> Doing: task_pickup (DEV fix) or auto-chain
+    ToImprove --> Doing: Scheduler picks up DEV fix
-    Refining --> ToDo: Human decision
+    Refining --> ToDo: Human decides
    Done --> [*]
 ```
-### Worker self-reporting
+These labels live on your actual GitHub/GitLab issues. Not in some internal database — in the tool you already use. Filter by `Doing` in GitHub to see what's in progress. Set up a webhook on `Done` to trigger deploys. The issue tracker is the source of truth.
-Workers (DEV/QA sub-agent sessions) call `task_complete` directly when they finish — no orchestrator involvement needed for the state transition. Workers can also call `task_create` to file follow-up issues they discover during work.
+### What "atomic" means here
-### Completion enforcement
+When you say "pick up #42 for DEV", the plugin does all of this in one operation:
 1. Verifies the issue is in the right state
 2. Picks the developer level (or uses what you specified)
 3. Transitions the label (`To Do` → `Doing`)
 4. Creates or reuses the right worker session
 5. Dispatches the task with project-specific instructions
 6. Updates internal state
 7. Logs an audit entry
-Three layers guarantee that `task_complete` always runs, preventing tasks from getting stuck in "Doing" or "Testing" forever:
+If step 4 fails, step 3 is rolled back. No half-states, no orphaned labels, no "the issue says Doing but nobody's working on it."
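The rollback described above can be sketched as follows. This is a minimal illustration, not DevClaw's actual code: `FakeTracker`, `pickUpForDev`, and the `dispatch` callback are hypothetical names, and an in-memory map stands in for the GitHub/GitLab issue tracker. Only label verification, the label transition, and dispatch are modeled.

```typescript
// Hypothetical sketch: none of these names come from DevClaw's source.
type Label = "To Do" | "Doing";

interface Tracker {
  getLabel(issueId: number): Label;
  setLabel(issueId: number, label: Label): void;
}

// In-memory stand-in for GitHub/GitLab, used only for illustration.
class FakeTracker implements Tracker {
  private labels = new Map<number, Label>();

  constructor(initial: Array<[number, Label]>) {
    for (const [id, label] of initial) this.labels.set(id, label);
  }

  getLabel(issueId: number): Label {
    const label = this.labels.get(issueId);
    if (label === undefined) throw new Error(`unknown issue #${issueId}`);
    return label;
  }

  setLabel(issueId: number, label: Label): void {
    this.labels.set(issueId, label);
  }
}

type PickupResult =
  | { ok: true; session: string }
  | { ok: false; error: string };

function pickUpForDev(
  tracker: Tracker,
  issueId: number,
  dispatch: (issueId: number) => string, // returns a session key, may throw
): PickupResult {
  // Verify the issue is in the right state before touching anything.
  if (tracker.getLabel(issueId) !== "To Do") {
    return { ok: false, error: `#${issueId} is not in To Do` };
  }

  // Transition the label first, then try to hand the task to a worker.
  tracker.setLabel(issueId, "Doing");
  try {
    const session = dispatch(issueId);
    return { ok: true, session };
  } catch (err) {
    // Roll the label back so the issue never claims work nobody is doing.
    tracker.setLabel(issueId, "To Do");
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning a result object instead of rethrowing keeps the caller's handling simple: either a session key comes back, or the issue is in `To Do` again with an error message to report.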
-1. **Completion contract** — Every task message includes a mandatory section requiring the worker to call `task_complete`, even on failure. Workers use `"blocked"` if stuck.
+---
 2. **Blocked result** — Both DEV and QA can return `"blocked"` to gracefully put a task back in queue (`Doing → To Do`, `Testing → To Test`) instead of silently dying.
 3. **Stale worker watchdog** — The heartbeat health check detects workers active >2 hours and auto-reverts labels to queue, catching sessions that crashed or ran out of context.
-### Auto-chaining
+## What happens behind the scenes
-When a project has `autoChain: true`, `task_complete` automatically dispatches the next step:
+### Workers report back themselves
- **DEV "done"** → QA is dispatched immediately (using the qa tier)
+When a developer finishes, they call `work_finish` directly — no orchestrator involved:
 - **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV tier)
 - **QA "pass" / "refine" / "blocked"** → no chaining (pipeline done, needs human input, or returned to queue)
 - **DEV "blocked"** → no chaining (returned to queue for retry)
-When `autoChain` is false, `task_complete` returns a `nextAction` hint for the orchestrator to act on.
+- **DEV "done"** → label moves to `To Test`, scheduler picks up QA on next tick
 - **DEV "blocked"** → label moves back to `To Do`, task returns to queue
 - **QA "pass"** → label moves to `Done`, issue closes
 - **QA "fail"** → label moves to `To Improve`, scheduler picks up DEV on next tick
-## Session reuse
+The orchestrator doesn't need to poll, check, or coordinate. Workers are self-reporting.
-Worker sessions are expensive to start — each new spawn requires the session to read the full codebase (~50K tokens). DevClaw maintains **separate sessions per tier per role** (session-per-tier design). When a medior dev finishes task A and picks up task B on the same project, the plugin detects the existing session and sends the task directly — no new session needed.
+### Sessions accumulate context
-The plugin handles session dispatch internally via OpenClaw CLI. The orchestrator agent never calls `sessions_spawn` or `sessions_send` — it just calls `task_pickup` and the plugin does the rest.
+Each developer level gets its own persistent session per project. Your medior dev that's done 5 features on `my-app` already knows the codebase — it doesn't re-read 50K tokens of source code every time it picks up a new task.
-```mermaid
+That's a **~40-60% token saving per task** from session reuse alone.
 sequenceDiagram
    participant O as Orchestrator
    participant DC as DevClaw Plugin
    participant GL as GitLab
    participant S as Worker Session
-    O->>DC: task_pickup({ issueId: 42, role: "dev" })
+Combined with tier selection (not using Opus when Haiku will do) and the token-free heartbeat (more on that next), DevClaw significantly reduces your token bill versus running everything through one large model.
    DC->>GL: Fetch issue, verify label
    DC->>DC: Assign tier (junior/medior/senior)
    DC->>DC: Check existing session for assigned tier
    DC->>GL: Transition label (To Do → Doing)
    DC->>S: Dispatch task via CLI (create or reuse session)
    DC->>DC: Update projects.json, write audit log
    DC-->>O: { success: true, announcement: "🔧 DEV (medior) picking up #42" }
 ```
-## Developer assignment
+### Everything is logged
-The orchestrator LLM evaluates each issue's title, description, and labels to assign the appropriate developer tier, then passes it to `task_pickup` via the `model` parameter. This gives the LLM full context for the decision — it can weigh factors like codebase familiarity, task dependencies, and recent failure history that keyword matching would miss.
+Every tool call writes an NDJSON line to `audit.log`:
 The keyword heuristic in `model-selector.ts` serves as a **fallback only**, used when the orchestrator omits the `model` parameter.
 | Tier   | Role                | When                                                        |
 | ------ | ------------------- | ----------------------------------------------------------- |
 | junior | Junior developer    | Typos, CSS, renames, copy changes                           |
 | medior | Mid-level developer | Features, bug fixes, multi-file changes                     |
 | senior | Senior developer    | Architecture, migrations, security, system-wide refactoring |
 | qa     | QA engineer         | All QA tasks (code review, test validation)                 |
 ## State management
 All project state lives in a single `projects/projects.json` file in the orchestrator's workspace, keyed by Telegram group ID:
 ```json
 {
  "projects": {
    "-1234567890": {
      "name": "my-webapp",
      "repo": "~/git/my-webapp",
      "groupName": "Dev - My Webapp",
      "baseBranch": "development",
      "autoChain": true,
      "dev": {
        "active": false,
        "issueId": null,
        "model": "medior",
        "sessions": {
          "junior": "agent:orchestrator:subagent:a9e4d078-...",
          "medior": "agent:orchestrator:subagent:b3f5c912-...",
          "senior": null
        }
      },
      "qa": {
        "active": false,
        "issueId": null,
        "model": "qa",
        "sessions": {
          "qa": "agent:orchestrator:subagent:18707821-..."
        }
      }
    }
  }
 }
 ```
 Key design decisions:
 - **Session-per-tier** — each tier gets its own worker session, accumulating context independently. Tier selection maps directly to a session key.
 - **Sessions preserved on completion** — when a worker completes a task, `sessions` map is **preserved** (only `active` and `issueId` are cleared). This enables session reuse on the next pickup.
 - **Plugin-controlled dispatch** — the plugin creates and dispatches to sessions via OpenClaw CLI (`sessions.patch` + `openclaw agent`). The orchestrator agent never calls `sessions_spawn` or `sessions_send`.
 - **Sessions persist indefinitely** — no auto-cleanup. `session_health` handles manual cleanup when needed.
 All writes go through atomic temp-file-then-rename to prevent corruption.
 ## Tools
 ### `devclaw_setup`
 Set up DevClaw in an agent's workspace. Creates AGENTS.md, HEARTBEAT.md, role templates, and configures models. Can optionally create a new agent.
 **Parameters:**
 - `newAgentName` (string, optional) — Create a new agent with this name
 - `models` (object, optional) — Model overrides per tier: `{ junior, medior, senior, qa }`
 ### `task_pickup`
 Pick up a task from the issue queue for a DEV or QA worker.
 **Parameters:**
 - `issueId` (number, required) — Issue ID
 - `role` ("dev" | "qa", required) — Worker role
 - `projectGroupId` (string, required) — Telegram group ID
 - `model` (string, optional) — Developer tier (junior, medior, senior, qa). The orchestrator should evaluate the task complexity and choose. Falls back to keyword heuristic if omitted.
 **What it does atomically:**
 1. Resolves project from `projects.json`
 2. Validates no active worker for this role
 3. Fetches issue from issue tracker, verifies correct label state
 4. Assigns tier (LLM-chosen via `model` param, keyword heuristic fallback)
 5. Loads prompt instructions from `projects/prompts/<project>/<role>.md`
 6. Looks up existing session for assigned tier (session-per-tier)
 7. Transitions label (e.g. `To Do` → `Doing`)
 8. Creates session via Gateway RPC if new (`sessions.patch`)
 9. Dispatches task to worker session via CLI (`openclaw agent`) with role instructions appended
 10. Updates `projects.json` state (active, issueId, tier, session key)
 11. Writes audit log entry
 12. Returns announcement text for the orchestrator to post
 ### `task_complete`
 Complete a task with a result. Called by workers (DEV/QA sub-agent sessions) directly, or by the orchestrator.
 **Parameters:**
 - `role` ("dev" | "qa", required)
 - `result` ("done" | "pass" | "fail" | "refine" | "blocked", required)
 - `projectGroupId` (string, required)
 - `summary` (string, optional) — For the Telegram announcement
 **Results:**
 - **DEV "done"** — Pulls latest code, moves label `Doing` → `To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA.
 - **DEV "blocked"** — Moves label `Doing` → `To Do`, deactivates worker. Task returns to queue for retry.
 - **QA "pass"** — Moves label `Testing` → `Done`, closes issue, deactivates worker
 - **QA "fail"** — Moves label `Testing` → `To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous DEV tier).
 - **QA "refine"** — Moves label `Testing` → `Refining`, awaits human decision
 - **QA "blocked"** — Moves label `Testing` → `To Test`, deactivates worker. Task returns to QA queue for retry.
 ### `task_update`
 Change an issue's state label programmatically without going through the full pickup/complete flow.
 **Parameters:**
 - `projectGroupId` (string, required) — Telegram/WhatsApp group ID
 - `issueId` (number, required) — Issue ID to update
 - `state` (string, required) — New state label (Planning, To Do, Doing, To Test, Testing, Done, To Improve, Refining)
 - `reason` (string, optional) — Audit log reason for the change
 **Use cases:**
 - Manual state adjustments (e.g., Planning → To Do after approval)
 - Failed auto-transitions that need correction
 - Bulk state changes by orchestrator
 ### `task_comment`
 Add a comment to an issue for feedback, notes, or discussion.
 **Parameters:**
 - `projectGroupId` (string, required) — Telegram/WhatsApp group ID
 - `issueId` (number, required) — Issue ID to comment on
 - `body` (string, required) — Comment body in markdown
 - `authorRole` ("dev" | "qa" | "orchestrator", optional) — Attribution role
 **Use cases:**
 - QA adds review feedback without blocking pass/fail
 - DEV posts implementation notes or progress updates
 - Orchestrator adds summary comments
 ### `task_create`
 Create a new issue in the project's issue tracker. Used by workers to file follow-up bugs, or by the orchestrator to create tasks from chat.
 **Parameters:**
 - `projectGroupId` (string, required) — Telegram group ID
 - `title` (string, required) — Issue title
 - `description` (string, optional) — Full issue body in markdown
 - `label` (string, optional) — State label (defaults to "Planning")
 - `assignees` (string[], optional) — Usernames to assign
 - `pickup` (boolean, optional) — If true, immediately pick up for DEV after creation
 ### `queue_status`
 Returns task queue counts and worker status across all projects (or a specific one).
 **Parameters:**
 - `projectGroupId` (string, optional) — Omit for all projects
 ### `session_health`
 Detects and optionally fixes state inconsistencies.
 **Parameters:**
 - `autoFix` (boolean, optional) — Auto-fix zombies and stale state
 **What it does:**
 - Queries live sessions via Gateway RPC (`sessions.list`)
 - Cross-references with `projects.json` worker state
 **Checks:**
 - Active worker with no session key (critical, auto-fixable)
 - Active worker whose session is dead — zombie (critical, auto-fixable)
 - Worker active for >2 hours — stale watchdog (warning, auto-fixable: reverts label to queue)
 - Inactive worker with lingering issue ID (warning, auto-fixable)
 ### `project_register`
 Register a new project with DevClaw. Creates all required issue tracker labels (idempotent), scaffolds role instruction files, and adds the project to `projects.json`. One-time setup per project. Auto-detects GitHub/GitLab from git remote.
 **Parameters:**
 - `projectGroupId` (string, required) — Telegram group ID (key in projects.json)
 - `name` (string, required) — Short project name
 - `repo` (string, required) — Path to git repo (e.g. `~/git/my-project`)
 - `groupName` (string, required) — Telegram group display name
 - `baseBranch` (string, required) — Base branch for development
 - `deployBranch` (string, optional) — Defaults to baseBranch
 - `deployUrl` (string, optional) — Deployment URL
 **What it does atomically:**
 1. Validates project not already registered
 2. Resolves repo path, auto-detects GitHub/GitLab, and verifies access
 3. Creates all 8 state labels (idempotent — safe to run on existing projects)
 4. Adds project entry to `projects.json` with empty worker state and `autoChain: false`
 5. Scaffolds prompt instruction files: `projects/prompts/<project>/dev.md` and `projects/prompts/<project>/qa.md`
 6. Writes audit log entry
 7. Returns announcement text
 ## Audit logging
 Every tool call automatically appends an NDJSON entry to `log/audit.log`. No manual logging required from the orchestrator agent.
 ```jsonl
 {"ts":"2026-02-08T10:30:00Z","event":"task_pickup","project":"my-webapp","issue":42,"role":"dev","tier":"medior","sessionAction":"send"}
 {"ts":"2026-02-08T10:30:01Z","event":"model_selection","issue":42,"role":"dev","tier":"medior","reason":"Standard dev task"}
 {"ts":"2026-02-08T10:45:00Z","event":"task_complete","project":"my-webapp","issue":42,"role":"dev","result":"done"}
 ```
 ## Quick start
 ```bash
-# 1. Install the plugin
+jq 'select(.event=="work_start")' audit.log
 cp -r devclaw ~/.openclaw/extensions/
 # 2. Run setup (interactive — creates agent, configures models, writes workspace files)
 openclaw devclaw setup
 # 3. Add bot to Telegram group, then register a project
 # (via the agent in Telegram)
 ```
-See the [Onboarding Guide](docs/ONBOARDING.md) for detailed instructions.
+Full trace of every task, every level selection, every label transition, every health fix. No manual logging needed.
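For readers who want to do the same filtering in code, here is a small sketch of reading an NDJSON audit log and selecting entries by event. The `.event` field is taken from the `jq` filter above; the `AuditEntry` shape beyond that field and the function names `parseAuditLog` and `selectEvent` are assumptions made for illustration.

```typescript
// Assumed entry shape: only `event` is confirmed by the jq example.
interface AuditEntry {
  event: string;
  [key: string]: unknown;
}

// NDJSON is one JSON object per line; blank lines are skipped.
function parseAuditLog(ndjson: string): AuditEntry[] {
  return ndjson
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as AuditEntry);
}

// Equivalent of jq 'select(.event=="<name>")'.
function selectEvent(entries: AuditEntry[], event: string): AuditEntry[] {
  return entries.filter((entry) => entry.event === event);
}
```

In practice the input string would come from reading `audit.log`, for example with `fs.readFileSync(path, "utf8")` in Node.js.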
-## Configuration
+---
-Model tier configuration in `openclaw.json`:
+## Automatic scheduling
 DevClaw doesn't wait for you to tell it what to do next. A background scheduling system continuously scans for available work and dispatches workers — zero LLM tokens, pure deterministic code. This is the engine that keeps the pipeline moving: when DEV finishes, the scheduler sees a `To Test` issue and dispatches QA. When QA fails, the scheduler sees a `To Improve` issue and dispatches DEV. No hand-offs, no orchestrator reasoning — just label-driven scheduling.
 ### The `work_heartbeat`
 Every tick (default: 60 seconds), the scheduler runs two passes:
 1. **Health pass** — detects workers stuck for >2 hours, reverts their labels back to queue, deactivates them. Catches crashed sessions, context overflows, or workers that died without reporting back.
 2. **Queue pass** — scans for available tasks by priority (`To Improve` > `To Test` > `To Do`), fills free worker slots. DEV and QA slots are filled independently.
 Both passes consist only of CLI calls and JSON reads. Workers only consume tokens when they actually start coding or reviewing. The scheduler also runs a tick immediately after every `work_finish`, so transitions happen without waiting for the next interval.
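The queue pass can be pictured with the sketch below. It is an illustration under stated assumptions, not the plugin's implementation: `queuePass`, `ROLE_FOR`, and the data shapes are hypothetical, it covers a single project with one DEV slot and one QA slot, and the pickup cap is applied per call.

```typescript
type Role = "dev" | "qa";
type QueueLabel = "To Improve" | "To Test" | "To Do";

interface QueuedIssue {
  id: number;
  label: QueueLabel;
}

interface Pickup {
  issueId: number;
  role: Role;
}

// Priority order from the text: To Improve > To Test > To Do.
const PRIORITY: QueueLabel[] = ["To Improve", "To Test", "To Do"];

// Which role serves each queue: QA takes To Test, DEV takes the rest.
const ROLE_FOR: Record<QueueLabel, Role> = {
  "To Improve": "dev",
  "To Test": "qa",
  "To Do": "dev",
};

function queuePass(
  issues: QueuedIssue[],
  free: Record<Role, boolean>,
  maxPickups: number,
): Pickup[] {
  const slots = { ...free }; // copy so the caller's state is not mutated
  const pickups: Pickup[] = [];

  for (const label of PRIORITY) {
    for (const issue of issues) {
      if (pickups.length >= maxPickups) return pickups;
      if (issue.label !== label) continue;

      const role = ROLE_FOR[label];
      if (!slots[role]) continue; // DEV and QA slots fill independently

      slots[role] = false;
      pickups.push({ issueId: issue.id, role });
    }
  }
  return pickups;
}
```

With one `To Do`, one `To Test`, and one `To Improve` issue and both slots free, this sketch dispatches DEV to the `To Improve` issue and QA to the `To Test` issue, and the `To Do` issue waits for a later tick.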
 ### How tasks flow between roles
 When a worker calls `work_finish`, the plugin transitions the label. The scheduler picks up the rest:
 - **DEV "done"** → label moves to `To Test` → next tick dispatches QA
 - **QA "fail"** → label moves to `To Improve` → next tick dispatches DEV (reuses previous level)
 - **QA "pass"** → label moves to `Done`, issue closes
 - **"blocked"** → label reverts to queue (`To Do` or `To Test`) for retry
 No orchestrator involvement. Workers self-report, the scheduler fills free slots.
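The four outcomes above amount to a small lookup table. The sketch below encodes them; the label names and results come from the list, while `applyWorkFinish`, `Transition`, and the `"role:result"` key format are hypothetical and chosen only for illustration. The `Refining` path from the state diagram is left out because the list does not cover it.

```typescript
type WorkerRole = "dev" | "qa";
type WorkResult = "done" | "pass" | "fail" | "blocked";

interface Transition {
  from: string;
  to: string;
  closesIssue: boolean;
}

// One entry per valid role/result pair described in the list above.
const TRANSITIONS = new Map<string, Transition>([
  ["dev:done", { from: "Doing", to: "To Test", closesIssue: false }],
  ["dev:blocked", { from: "Doing", to: "To Do", closesIssue: false }],
  ["qa:pass", { from: "Testing", to: "Done", closesIssue: true }],
  ["qa:fail", { from: "Testing", to: "To Improve", closesIssue: false }],
  ["qa:blocked", { from: "Testing", to: "To Test", closesIssue: false }],
]);

function applyWorkFinish(role: WorkerRole, result: WorkResult): Transition {
  const transition = TRANSITIONS.get(`${role}:${result}`);
  if (transition === undefined) {
    // For example, DEV cannot report "pass" and QA cannot report "done".
    throw new Error(`invalid result "${result}" for role "${role}"`);
  }
  return transition;
}
```

Because the function only returns the next label, the scheduler stays the single place where workers get dispatched: it sees `To Test` or `To Improve` on its next tick and fills the free slot.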
 ### Execution modes
 Each project is fully isolated — its own queue, workers, sessions, state. No cross-project contamination. Two levels of parallelism control how work gets scheduled:
 - **Project-level (`roleExecution`)** — DEV and QA work simultaneously on different tasks (default: `parallel`) or take turns (`sequential`)
 - **Plugin-level (`projectExecution`)** — all registered projects dispatch workers independently (default: `parallel`) or only one project runs at a time (`sequential`)
 ### Configuration
 All scheduling behavior is configurable in `openclaw.json`:
 ```json
 {
@@ -396,12 +238,12 @@ Model tier configuration in `openclaw.json`:
    "entries": {
      "devclaw": {
        "config": {
-          "models": {
+          "work_heartbeat": {
-            "junior": "anthropic/claude-haiku-4-5",
+            "enabled": true,
-            "medior": "anthropic/claude-sonnet-4-5",
+            "intervalSeconds": 60,
-            "senior": "anthropic/claude-opus-4-5",
+            "maxPickupsPerTick": 4
-            "qa": "anthropic/claude-sonnet-4-5"
+          },
-          }
+          "projectExecution": "parallel"
        }
      }
    }
@@ -409,61 +251,156 @@ Model tier configuration in `openclaw.json`:
 }
 ```
-Restrict tools to your orchestrator agent only:
+Per-project settings live in `projects.json`:
 ```json
 {
-  "agents": {
+  "-1234567890": {
-    "list": [
+    "name": "my-app",
-      {
+    "roleExecution": "parallel"
        "id": "my-orchestrator",
        "tools": {
          "allow": [
            "devclaw_setup",
            "task_pickup",
            "task_complete",
            "task_update",
            "task_comment",
            "task_create",
            "queue_status",
            "session_health",
            "project_register"
          ]
        }
      }
    ]
  }
 }
 ```
-> DevClaw uses an `IssueProvider` interface to abstract issue tracker operations. GitLab (via `glab` CLI) and GitHub (via `gh` CLI) are supported — the provider is auto-detected from the git remote URL. Jira is planned.
+| Setting | Where | Default | What it controls |
 |---|---|---|---|
 | `work_heartbeat.enabled` | `openclaw.json` | `true` | Turn the heartbeat on/off |
 | `work_heartbeat.intervalSeconds` | `openclaw.json` | `60` | Seconds between ticks |
 | `work_heartbeat.maxPickupsPerTick` | `openclaw.json` | `4` | Max workers dispatched per tick |
 | `projectExecution` | `openclaw.json` | `"parallel"` | All projects at once, or one at a time |
 | `roleExecution` | `projects.json` | `"parallel"` | DEV+QA at once, or one role at a time |
-## Prompt instructions
+See the [Configuration reference](docs/CONFIGURATION.md) for the full schema.
-Workers receive role-specific instructions appended to their task message. `project_register` scaffolds editable files:
+---
 ## Task management
 ### Your issues stay in your tracker
 DevClaw doesn't have its own task database. All task state lives in **GitHub Issues** or **GitLab Issues** — auto-detected from your git remote. The eight pipeline labels are created on your repo when you register a project. Your project manager sees progress in GitHub without knowing DevClaw exists. Your CI/CD can trigger on label changes. If you stop using DevClaw, your issues and labels stay exactly where they are.
 The provider is pluggable (`IssueProvider` interface). GitHub and GitLab work today. Jira, Linear, or anything else just needs to implement the same interface.
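To make the pluggable provider idea concrete, here is a hedged sketch of what such an interface could look like. The method names mirror the calls shown in the architecture diagrams (`getIssue`, `transitionLabel`, `closeIssue`, `reopenIssue`); the signatures, the `Issue` shape, and the `detectProvider` helper are assumptions for illustration, not the plugin's actual code.

```typescript
// Assumed issue shape, based on the fields visible in the diagrams.
interface Issue {
  id: number;
  title: string;
  labels: string[];
}

// Method names follow the architecture diagrams; signatures are assumed.
interface IssueProvider {
  getIssue(id: number): Promise<Issue>;
  transitionLabel(id: number, from: string, to: string): Promise<void>;
  closeIssue(id: number): Promise<void>;
  reopenIssue(id: number): Promise<void>;
}

type ProviderKind = "github" | "gitlab";

// Hypothetical auto-detection: inspect the host of an HTTPS or SSH git remote.
function detectProvider(remoteUrl: string): ProviderKind | null {
  const match = remoteUrl.match(/^(?:https?:\/\/|ssh:\/\/)?(?:[^@\/]+@)?([^:\/]+)/);
  const host = (match?.[1] ?? "").toLowerCase();
  if (host.indexOf("github") !== -1) return "github";
  if (host.indexOf("gitlab") !== -1) return "gitlab";
  return null; // unknown host: a real implementation would need explicit configuration
}
```

A new tracker would then only need a class that implements `IssueProvider`; the rest of the pipeline would keep calling the same four methods.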
 ### Creating, updating, and commenting
 Tasks can come from anywhere — the orchestrator creates them from chat, workers file bugs they discover mid-task, or you create them directly in GitHub/GitLab:
 ```
-workspace/
+You:    "Create an issue: fix the broken OAuth redirect"
-├── projects/
+Agent:  creates issue #43 with label "Planning"
-│   ├── projects.json     ← project state
+
-│   └── prompts/
+You:    "Move #43 to To Do"
-│       ├── my-webapp/    ← per-project prompts (edit to customize)
+Agent:  transitions label Planning → To Do
-│       │   ├── dev.md
+
-│       │   └── qa.md
+You:    "Add a comment on #42: needs to handle the edge case for expired tokens"
-│       └── another-project/
+Agent:  adds comment attributed to "orchestrator"
-│           ├── dev.md
-│           └── qa.md
-├── log/
-│   └── audit.log         ← NDJSON event log
 ```
-`task_pickup` loads `projects/prompts/<project>/<role>.md`. Edit these files to customize worker behavior per project — for example, adding project-specific deployment steps or test commands.
+Workers can also comment during work — QA leaves review feedback, DEV posts implementation notes. Every comment carries role attribution so you know who said what.
-## Requirements
+### Custom instructions per project
- [OpenClaw](https://openclaw.ai)
+Each project gets instruction files that workers receive with every task they pick up:
 ```
 workspace/projects/roles/
 ├── my-webapp/
 │   ├── dev.md     "Run npm test before committing. Deploy URL: staging.example.com"
 │   └── qa.md      "Check OAuth flow. Verify mobile responsiveness."
 ├── my-api/
 │   ├── dev.md     "Run cargo test. Follow REST conventions in CONTRIBUTING.md"
 │   └── qa.md      "Verify all endpoints return correct status codes."
 └── default/
    ├── dev.md     (fallback for projects without custom instructions)
    └── qa.md
 ```
 Deployment steps, test commands, coding standards, acceptance criteria — all injected at dispatch time, per project, per role.
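The fallback to `default/` can be sketched as a small lookup. This is an illustrative sketch only: the directory layout comes from the tree above, while the function names and the injected `exists` callback are hypothetical (the callback keeps the example free of file-system calls).

```typescript
type Role = "dev" | "qa";

// Candidate files in priority order: project-specific first, then the shared default.
function instructionCandidates(
  project: string,
  role: Role,
  root: string = "workspace/projects/roles"
): string[] {
  return [`${root}/${project}/${role}.md`, `${root}/default/${role}.md`];
}

// Return the first candidate that exists, or null if neither file is present.
function resolveInstructions(
  project: string,
  role: Role,
  exists: (path: string) => boolean
): string | null {
  const candidates = instructionCandidates(project, role);
  for (let i = 0; i < candidates.length; i++) {
    const candidate = candidates[i];
    if (candidate !== undefined && exists(candidate)) return candidate;
  }
  return null;
}
```

Under this assumption, a project with only a custom `dev.md` would still get QA instructions from `default/qa.md`.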
 ---
 ## Getting started
 ### Prerequisites
 - [OpenClaw](https://openclaw.ai) installed (`openclaw --version`)
 - Node.js >= 20
- [`glab`](https://gitlab.com/gitlab-org/cli) CLI installed and authenticated (GitLab provider), or [`gh`](https://cli.github.com) CLI (GitHub provider)
+- `gh` CLI ([GitHub](https://cli.github.com)) or `glab` CLI ([GitLab](https://gitlab.com/gitlab-org/cli)), authenticated
 ### Install
 ```bash
 cp -r devclaw ~/.openclaw/extensions/
 ```
 ### Set up through conversation
 The easiest way to configure DevClaw is to just talk to your agent:
 ```
 You:   "Help me set up DevClaw"
 Agent: "I'll walk you through it. Should I use this agent as the
        orchestrator, or create a new one?"
 You:   "Use this one"
 Agent: "Want to bind a messaging channel?"
 You:   "Telegram"
 Agent: "Here are the default developer assignments:
        Junior → Haiku, Medior → Sonnet, Senior → Opus
        Reviewer → Sonnet, Tester → Haiku
        Keep these or customize?"
 You:   "Keep them"
 Agent: "Done. Want to register a project?"
 You:   "Yes — my-app at ~/git/my-app, main branch"
 Agent: "Project registered. 8 labels created on your repo.
        Role instructions scaffolded. Try: 'check the queue'"
 ```
 You can also use the [CLI wizard or non-interactive setup](docs/ONBOARDING.md#step-2-run-setup) for scripted environments.
 ---
 ## The toolbox
 DevClaw gives the orchestrator 11 tools. These aren't just convenience wrappers — they're **guardrails**. Each tool encodes a complex multi-step operation into a single atomic call. The agent provides intent, the plugin handles mechanics. The agent physically cannot skip a label transition, forget to update state, or dispatch to the wrong session — those decisions are made by deterministic code, not LLM reasoning.
 | Tool | What it does |
 |---|---|
 | `work_start` | Pick up a task — resolves level, transitions label, dispatches session, logs audit |
 | `work_finish` | Complete a task — transitions label, updates state, ticks queue for next dispatch |
 | `task_create` | Create a new issue (used by workers to file bugs they discover) |
 | `task_update` | Manually change an issue's state label |
 | `task_comment` | Add a comment to an issue (with role attribution) |
 | `status` | Dashboard: queue counts + who's working on what |
 | `health` | Detect zombie workers, stale sessions, state inconsistencies |
 | `work_heartbeat` | Manually trigger a health check + queue dispatch cycle |
 | `project_register` | One-time project setup: creates labels, scaffolds instructions, initializes state |
 | `setup` | Agent + workspace initialization |
 | `onboard` | Conversational setup guide |
 Full parameters and usage in the [Tools Reference](docs/TOOLS.md).
 ---
 ## Documentation
 | | |
 |---|---|
 | **[Architecture](docs/ARCHITECTURE.md)** | System design, session model, data flow, end-to-end diagrams |
 | **[Tools Reference](docs/TOOLS.md)** | Complete reference for all 11 tools |
 | **[Configuration](docs/CONFIGURATION.md)** | `openclaw.json`, `projects.json`, heartbeat, notifications |
 | **[Onboarding Guide](docs/ONBOARDING.md)** | Full step-by-step setup |
 | **[QA Workflow](docs/QA_WORKFLOW.md)** | QA process and review templates |
 | **[Context Awareness](docs/CONTEXT-AWARENESS.md)** | How tools adapt to group vs. DM vs. agent context |
 | **[Testing](docs/TESTING.md)** | Test suite, fixtures, CI/CD |
 | **[Management Theory](docs/MANAGEMENT.md)** | The delegation model behind the design |
 | **[Roadmap](docs/ROADMAP.md)** | What's coming next |
 ---
 ## License
--- a/VERIFICATION.md
+++ b/VERIFICATION.md
@@ -1,45 +0,0 @@
 # Verification: task_create Default State
 ## Issue #115 Request
 Change default state for new tasks from "To Do" to "Planning"
 ## Current Implementation Status
 **Already implemented** - The default has been "Planning" since initial commit.
 ### Code Evidence
 File: `lib/tools/task-create.ts` (line 68)
 ```typescript
 const label = (params.label as StateLabel) ?? "Planning";
 ```
 ### Documentation Evidence
 File: `README.md` (line 308)
 ```
 - `label` (string, optional) — State label (defaults to "Planning")
 ```
 ### Tool Description
 The tool description itself states:
 ```
 The issue is created with a state label (defaults to "Planning").
 ```
 ## Timeline
 - **Feb 9, 2026** (commit 8a79755e): Initial task_create implementation with "Planning" default
 - **Feb 10, 2026**: Issue #115 created requesting this change (already done)
 ## Verification Test
 Default behavior can be verified by calling task_create without specifying a label:
 ```javascript
 task_create({
  projectGroupId: "-5239235162",
  title: "Test Issue"
  // label parameter omitted - should default to "Planning"
 })
 ```
 Expected result: Issue created with "Planning" label, NOT "To Do"
 ## Conclusion
 The requested feature is already fully implemented. No code changes needed.
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -1,64 +1,116 @@
 # DevClaw — Architecture & Component Interaction
 ## How it works
 One OpenClaw agent process serves multiple group chats — each group gives it a different project context. The orchestrator role, the workers, the task queue, and all state are fully isolated per group.
 ```mermaid
 graph TB
    subgraph "Group Chat A"
        direction TB
        A_O["Orchestrator"]
        A_GL[GitHub/GitLab Issues]
        A_DEV["DEV (worker session)"]
        A_QA["QA (worker session)"]
        A_O -->|work_start| A_GL
        A_O -->|dispatches| A_DEV
        A_O -->|dispatches| A_QA
    end
    subgraph "Group Chat B"
        direction TB
        B_O["Orchestrator"]
        B_GL[GitHub/GitLab Issues]
        B_DEV["DEV (worker session)"]
        B_QA["QA (worker session)"]
        B_O -->|work_start| B_GL
        B_O -->|dispatches| B_DEV
        B_O -->|dispatches| B_QA
    end
    AGENT["Single OpenClaw Agent"]
    AGENT --- A_O
    AGENT --- B_O
 ```
 Worker sessions are expensive to start — each new spawn reads the full codebase (~50K tokens). DevClaw maintains **separate sessions per level per role** ([session-per-level design](#session-per-level-design)). When a medior dev finishes task A and picks up task B on the same project, the accumulated context carries over — no re-reading the repo. The plugin handles all session dispatch internally via OpenClaw CLI; the orchestrator agent never calls `sessions_spawn` or `sessions_send`.
 ```mermaid
 sequenceDiagram
    participant O as Orchestrator
    participant DC as DevClaw Plugin
    participant IT as Issue Tracker
    participant S as Worker Session
    O->>DC: work_start({ issueId: 42, role: "dev" })
    DC->>IT: Fetch issue, verify label
    DC->>DC: Assign level (junior/medior/senior)
    DC->>DC: Check existing session for assigned level
    DC->>IT: Transition label (To Do → Doing)
    DC->>S: Dispatch task via CLI (create or reuse session)
    DC->>DC: Update projects.json, write audit log
    DC-->>O: { success: true, announcement: "..." }
 ```
 ## Agents vs Sessions
 Understanding the OpenClaw model is key to understanding how DevClaw works:
 - **Agent** — A configured entity in `openclaw.json`. Has a workspace, model, identity files (SOUL.md, IDENTITY.md), and tool permissions. Persists across restarts.
 - **Session** — A runtime conversation instance. Each session has its own context window and conversation history, stored as a `.jsonl` transcript file.
- **Sub-agent session** — A session created under the orchestrator agent for a specific worker role. NOT a separate agent — it's a child session running under the same agent, with its own isolated context. Format: `agent:<parent>:subagent:<uuid>`.
+- **Sub-agent session** — A session created under the orchestrator agent for a specific worker role. NOT a separate agent — it's a child session running under the same agent, with its own isolated context. Format: `agent:<parent>:subagent:<project>-<role>-<level>`.
-### Session-per-tier design
+### Session-per-level design
-Each project maintains **separate sessions per developer tier per role**. A project's DEV might have a junior session, a medior session, and a senior session — each accumulating its own codebase context over time.
+Each project maintains **separate sessions per developer level per role**. A project's DEV might have a junior session, a medior session, and a senior session — each accumulating its own codebase context over time.
 ```
 Orchestrator Agent (configured in openclaw.json)
  └─ Main session (long-lived, handles all projects)
       │
       ├─ Project A
-       │    ├─ DEV sessions: { junior: <uuid>, medior: <uuid>, senior: null }
+       │    ├─ DEV sessions: { junior: <key>, medior: <key>, senior: null }
-       │    └─ QA sessions:  { qa: <uuid> }
+       │    └─ QA sessions:  { reviewer: <key>, tester: null }
       │
       └─ Project B
-            ├─ DEV sessions: { junior: null, medior: <uuid>, senior: null }
+            ├─ DEV sessions: { junior: null, medior: <key>, senior: null }
-            └─ QA sessions:  { qa: <uuid> }
+            └─ QA sessions:  { reviewer: <key>, tester: null }
 ```
-Why per-tier instead of switching models on one session:
+Why per-level instead of switching models on one session:
 - **No model switching overhead** — each session always uses the same model
 - **Accumulated context** — a junior session that's done 20 typo fixes knows the project well; a medior session that's done 5 features knows it differently
 - **No cross-model confusion** — conversation history stays with the model that generated it
- **Deterministic reuse** — tier selection directly maps to a session key, no patching needed
+- **Deterministic reuse** — level selection directly maps to a session key, no patching needed
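The "level maps directly to a session key" idea can be sketched as follows. The key format is the one given for sub-agent sessions (`agent:<parent>:subagent:<project>-<role>-<level>`); the `SessionMap` type, `planDispatch`, and its return shape are hypothetical names used only for this illustration.

```typescript
// Per-role map of level name to stored session key (null means never spawned).
type SessionMap = Record<string, string | null>;

// Builds a key in the documented format: agent:<parent>:subagent:<project>-<role>-<level>
function sessionKey(parent: string, project: string, role: string, level: string): string {
  return `agent:${parent}:subagent:${project}-${role}-${level}`;
}

interface DispatchPlan {
  action: "spawn" | "send";
  key: string;
}

// Reuse the stored key for this level if there is one; otherwise plan a new session.
function planDispatch(
  sessions: SessionMap,
  parent: string,
  project: string,
  role: string,
  level: string
): DispatchPlan {
  const existing = sessions[level];
  if (existing) return { action: "send", key: existing };
  return { action: "spawn", key: sessionKey(parent, project, role, level) };
}
```

Because the key depends only on project, role, and level, the same inputs always resolve to the same session, which is what makes reuse deterministic.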
 ### Plugin-controlled session lifecycle
 DevClaw controls the **full** session lifecycle end-to-end. The orchestrator agent never calls `sessions_spawn` or `sessions_send` — the plugin handles session creation and task dispatch internally using the OpenClaw CLI:
 ```
-Plugin dispatch (inside task_pickup):
+Plugin dispatch (inside work_start):
-  1. Assign tier, look up session, decide spawn vs send
+  1. Assign level, look up session, decide spawn vs send
  2. New session:  openclaw gateway call sessions.patch → create entry + set model
-                   openclaw agent --session-id <key> --message "task..."
+                   openclaw gateway call agent → dispatch task
-  3. Existing:     openclaw agent --session-id <key> --message "task..."
+  3. Existing:     openclaw gateway call agent → dispatch task to existing session
  4. Return result to orchestrator (announcement text, no session instructions)
 ```
-The agent's only job after `task_pickup` returns is to post the announcement to Telegram. Everything else — tier assignment, session creation, task dispatch, state update, audit logging — is deterministic plugin code.
+The agent's only job after `work_start` returns is to post the announcement to Telegram. Everything else — level assignment, session creation, task dispatch, state update, audit logging — is deterministic plugin code.
 **Why this matters:** Previously the plugin returned instructions like `{ sessionAction: "spawn", model: "sonnet" }` and the agent had to correctly call `sessions_spawn` with the right params. This was the fragile handoff point where agents would forget `cleanup: "keep"`, use wrong models, or corrupt session state. Moving dispatch into the plugin eliminates that entire class of errors.
-**Session persistence:** Sessions created via `sessions.patch` persist indefinitely (no auto-cleanup). The plugin manages lifecycle explicitly through `session_health`.
+**Session persistence:** Sessions created via `sessions.patch` persist indefinitely (no auto-cleanup). The plugin manages lifecycle explicitly through the `health` tool.
 **What we trade off vs. registered sub-agents:**
 | Feature | Sub-agent system | Plugin-controlled | DevClaw equivalent |
 |---|---|---|---|
 | Auto-reporting | Sub-agent reports to parent | No | Heartbeat polls for completion |
-| Concurrency control | `maxConcurrent` | No | `task_pickup` checks `active` flag |
+| Concurrency control | `maxConcurrent` | No | `work_start` checks `active` flag |
 | Lifecycle tracking | Parent-child registry | No | `projects.json` tracks all sessions |
-| Timeout detection | `runTimeoutSeconds` | No | `session_health` flags stale >2h |
+| Timeout detection | `runTimeoutSeconds` | No | `health` flags stale >2h |
-| Cleanup | Auto-archive | No | `session_health` manual cleanup |
+| Cleanup | Auto-archive | No | `health` manual cleanup |
 DevClaw provides equivalent guardrails for everything except auto-reporting, which the heartbeat handles.
@@ -74,22 +126,22 @@ graph TB
    subgraph "OpenClaw Runtime"
        MS[Main Session<br/>orchestrator agent]
        GW[Gateway RPC<br/>sessions.patch / sessions.list]
-        CLI[openclaw agent CLI]
+        CLI[openclaw gateway call agent]
        DEV_J[DEV session<br/>junior]
        DEV_M[DEV session<br/>medior]
        DEV_S[DEV session<br/>senior]
-        QA_E[QA session<br/>qa]
+        QA_R[QA session<br/>reviewer]
    end
    subgraph "DevClaw Plugin"
-        TP[task_pickup]
+        WS[work_start]
-        TC[task_complete]
+        WF[work_finish]
        TCR[task_create]
-        QS[queue_status]
+        ST[status]
-        SH[session_health]
+        SH[health]
        PR[project_register]
-        DS[devclaw_setup]
+        DS[setup]
-        TIER[Tier Resolver]
+        TIER[Level Resolver]
        PJ[projects.json]
        AL[audit.log]
    end
@@ -103,34 +155,34 @@ graph TB
    TG -->|delivers| MS
    MS -->|announces| TG
-    MS -->|calls| TP
+    MS -->|calls| WS
-    MS -->|calls| TC
+    MS -->|calls| WF
    MS -->|calls| TCR
-    MS -->|calls| QS
+    MS -->|calls| ST
    MS -->|calls| SH
    MS -->|calls| PR
    MS -->|calls| DS
-    TP -->|resolves tier| TIER
+    WS -->|resolves level| TIER
-    TP -->|transitions labels| GL
+    WS -->|transitions labels| GL
-    TP -->|reads/writes| PJ
+    WS -->|reads/writes| PJ
-    TP -->|appends| AL
+    WS -->|appends| AL
-    TP -->|creates session| GW
+    WS -->|creates session| GW
-    TP -->|dispatches task| CLI
+    WS -->|dispatches task| CLI
-    TC -->|transitions labels| GL
+    WF -->|transitions labels| GL
-    TC -->|closes/reopens| GL
+    WF -->|closes/reopens| GL
-    TC -->|reads/writes| PJ
+    WF -->|reads/writes| PJ
-    TC -->|git pull| REPO
+    WF -->|git pull| REPO
-    TC -->|auto-chain dispatch| CLI
+    WF -->|tick dispatch| CLI
-    TC -->|appends| AL
+    WF -->|appends| AL
    TCR -->|creates issue| GL
    TCR -->|appends| AL
-    QS -->|lists issues by label| GL
+    ST -->|lists issues by label| GL
-    QS -->|reads| PJ
+    ST -->|reads| PJ
-    QS -->|appends| AL
+    ST -->|appends| AL
    SH -->|reads/writes| PJ
    SH -->|checks sessions| GW
@@ -144,12 +196,12 @@ graph TB
    CLI -->|sends task| DEV_J
    CLI -->|sends task| DEV_M
    CLI -->|sends task| DEV_S
-    CLI -->|sends task| QA_E
+    CLI -->|sends task| QA_R
    DEV_J -->|writes code, creates MRs| REPO
    DEV_M -->|writes code, creates MRs| REPO
    DEV_S -->|writes code, creates MRs| REPO
-    QA_E -->|reviews code, tests| REPO
+    QA_R -->|reviews code, tests| REPO
 ```
 ## End-to-end flow: human to sub-agent
@@ -163,7 +215,7 @@ sequenceDiagram
    participant MS as Main Session<br/>(orchestrator)
    participant DC as DevClaw Plugin
    participant GW as Gateway RPC
-    participant CLI as openclaw agent CLI
+    participant CLI as openclaw gateway call agent
    participant DEV as DEV Session<br/>(medior)
    participant GL as Issue Tracker
@@ -171,34 +223,34 @@ sequenceDiagram
    H->>TG: "check status" (or heartbeat triggers)
    TG->>MS: delivers message
-    MS->>DC: queue_status()
+    MS->>DC: status()
-    DC->>GL: glab issue list --label "To Do"
+    DC->>GL: list issues by label "To Do"
    DC-->>MS: { toDo: [#42], dev: idle }
    Note over MS: Decides to pick up #42 for DEV as medior
-    MS->>DC: task_pickup({ issueId: 42, role: "dev", model: "medior", ... })
+    MS->>DC: work_start({ issueId: 42, role: "dev", level: "medior", ... })
-    DC->>DC: resolve tier "medior" → model ID
+    DC->>DC: resolve level "medior" → model ID
    DC->>DC: lookup dev.sessions.medior → null (first time)
-    DC->>GL: glab issue update 42 --unlabel "To Do" --label "Doing"
+    DC->>GL: transition label "To Do" → "Doing"
    DC->>GW: sessions.patch({ key: new-session-key, model: "anthropic/claude-sonnet-4-5" })
-    DC->>CLI: openclaw agent --session-id <key> --message "Build login page for #42..."
+    DC->>CLI: openclaw gateway call agent --params { sessionKey, message }
    CLI->>DEV: creates session, delivers task
    DC->>DC: store session key in projects.json + append audit.log
-    DC-->>MS: { success: true, announcement: "🔧 DEV (medior) picking up #42" }
+    DC-->>MS: { success: true, announcement: "🔧 Spawning DEV (medior) for #42" }
-    MS->>TG: "🔧 DEV (medior) picking up #42: Add login page"
+    MS->>TG: "🔧 Spawning DEV (medior) for #42: Add login page"
    TG->>H: sees announcement
    Note over DEV: Works autonomously — reads code, writes code, creates MR
-    Note over DEV: Calls task_complete when done
+    Note over DEV: Calls work_finish when done
-    DEV->>DC: task_complete({ role: "dev", result: "done", ... })
+    DEV->>DC: work_finish({ role: "dev", result: "done", ... })
-    DC->>GL: glab issue update 42 --unlabel "Doing" --label "To Test"
+    DC->>GL: transition label "Doing" → "To Test"
    DC->>DC: deactivate worker (sessions preserved)
-    DC-->>DEV: { announcement: "✅ DEV done #42" }
+    DC-->>DEV: { announcement: "✅ DEV DONE #42" }
-    MS->>TG: "✅ DEV done #42 — moved to QA queue"
+    MS->>TG: "✅ DEV DONE #42 — moved to QA queue"
    TG->>H: sees announcement
 ```
@@ -208,16 +260,16 @@ On the **next DEV task** for this project that also assigns medior:
 sequenceDiagram
    participant MS as Main Session
    participant DC as DevClaw Plugin
-    participant CLI as openclaw agent CLI
+    participant CLI as openclaw gateway call agent
    participant DEV as DEV Session<br/>(medior, existing)
-    MS->>DC: task_pickup({ issueId: 57, role: "dev", model: "medior", ... })
+    MS->>DC: work_start({ issueId: 57, role: "dev", level: "medior", ... })
-    DC->>DC: resolve tier "medior" → model ID
+    DC->>DC: resolve level "medior" → model ID
    DC->>DC: lookup dev.sessions.medior → existing key!
    Note over DC: No sessions.patch needed — session already exists
-    DC->>CLI: openclaw agent --session-id <key> --message "Fix validation for #57..."
+    DC->>CLI: openclaw gateway call agent --params { sessionKey, message }
    CLI->>DEV: delivers task to existing session (has full codebase context)
-    DC-->>MS: { success: true, announcement: "⚡ DEV (medior) picking up #57" }
+    DC-->>MS: { success: true, announcement: "⚡ Sending DEV (medior) for #57" }
 ```
 Session reuse saves ~50K tokens per task by not re-reading the codebase.
@@ -228,149 +280,144 @@ This traces a single issue from creation to completion, showing every component
 ### Phase 1: Issue created
-Issues are created by the orchestrator agent or by sub-agent sessions via `glab`. The orchestrator can create issues based on user requests in Telegram, backlog planning, or QA feedback. Sub-agents can also create issues when they discover bugs or related work during development.
+Issues are created by the orchestrator agent or by sub-agent sessions via `task_create` or directly via `gh`/`glab`. The orchestrator can create issues based on user requests in Telegram, backlog planning, or QA feedback. Sub-agents can also create issues when they discover bugs during development.
 ```
-Orchestrator Agent → Issue Tracker: creates issue #42 with label "To Do"
+Orchestrator Agent → Issue Tracker: creates issue #42 with label "Planning"
 ```
-**State:** Issue tracker has issue #42 labeled "To Do". Nothing in DevClaw yet.
+**State:** Issue tracker has issue #42 labeled "Planning". Nothing in DevClaw yet.
 ### Phase 2: Heartbeat detects work
 ```
-Heartbeat triggers → Orchestrator calls queue_status()
+Heartbeat triggers → Orchestrator calls status()
 ```
 ```mermaid
 sequenceDiagram
    participant A as Orchestrator
-    participant QS as queue_status
+    participant QS as status
    participant GL as Issue Tracker
    participant PJ as projects.json
    participant AL as audit.log
-    A->>QS: queue_status({ projectGroupId: "-123" })
+    A->>QS: status({ projectGroupId: "-123" })
    QS->>PJ: readProjects()
    PJ-->>QS: { dev: idle, qa: idle }
-    QS->>GL: glab issue list --label "To Do"
+    QS->>GL: list issues by label "To Do"
    GL-->>QS: [{ id: 42, title: "Add login page" }]
-    QS->>GL: glab issue list --label "To Test"
+    QS->>GL: list issues by label "To Test"
    GL-->>QS: []
-    QS->>GL: glab issue list --label "To Improve"
+    QS->>GL: list issues by label "To Improve"
    GL-->>QS: []
-    QS->>AL: append { event: "queue_status", ... }
+    QS->>AL: append { event: "status", ... }
    QS-->>A: { dev: idle, queue: { toDo: [#42] } }
 ```
-**Orchestrator decides:** DEV is idle, issue #42 is in To Do → pick it up. Evaluates complexity → assigns medior tier.
+**Orchestrator decides:** DEV is idle, issue #42 is in To Do → pick it up. Evaluates complexity → assigns medior level.
 ### Phase 3: DEV pickup
-The plugin handles everything end-to-end — tier resolution, session lookup, label transition, state update, **and** task dispatch to the worker session. The agent's only job after is to post the announcement.
+The plugin handles everything end-to-end: level resolution, session lookup, label transition, state update, **and** task dispatch to the worker session. Afterward, the agent's only job is to post the announcement.
 ```mermaid
 sequenceDiagram
    participant A as Orchestrator
-    participant TP as task_pickup
+    participant WS as work_start
    participant GL as Issue Tracker
-    participant TIER as Tier Resolver
+    participant TIER as Level Resolver
    participant GW as Gateway RPC
-    participant CLI as openclaw agent CLI
+    participant CLI as openclaw gateway call agent
    participant PJ as projects.json
    participant AL as audit.log
-    A->>TP: task_pickup({ issueId: 42, role: "dev", projectGroupId: "-123", model: "medior" })
+    A->>WS: work_start({ issueId: 42, role: "dev", projectGroupId: "-123", level: "medior" })
-    TP->>PJ: readProjects()
+    WS->>PJ: readProjects()
-    TP->>GL: glab issue view 42 --output json
+    WS->>GL: getIssue(42)
-    GL-->>TP: { title: "Add login page", labels: ["To Do"] }
+    GL-->>WS: { title: "Add login page", labels: ["To Do"] }
-    TP->>TP: Verify label is "To Do" ✓
+    WS->>WS: Verify label is "To Do"
-    TP->>TIER: resolve "medior" → "anthropic/claude-sonnet-4-5"
+    WS->>TIER: resolve "medior" → "anthropic/claude-sonnet-4-5"
-    TP->>PJ: lookup dev.sessions.medior
+    WS->>PJ: lookup dev.sessions.medior
-    TP->>GL: glab issue update 42 --unlabel "To Do" --label "Doing"
+    WS->>GL: transitionLabel(42, "To Do", "Doing")
    alt New session
-        TP->>GW: sessions.patch({ key: new-key, model: "anthropic/claude-sonnet-4-5" })
+        WS->>GW: sessions.patch({ key: new-key, model: "anthropic/claude-sonnet-4-5" })
    end
-    TP->>CLI: openclaw agent --session-id <key> --message "task..."
+    WS->>CLI: openclaw gateway call agent --params { sessionKey, message }
-    TP->>PJ: activateWorker + store session key
+    WS->>PJ: activateWorker + store session key
-    TP->>AL: append task_pickup + model_selection
+    WS->>AL: append work_start + model_selection
-    TP-->>A: { success: true, announcement: "🔧 ..." }
+    WS-->>A: { success: true, announcement: "🔧 ..." }
 ```
 **Writes:**
 - `Issue Tracker`: label "To Do" → "Doing"
- `projects.json`: dev.active=true, dev.issueId="42", dev.model="medior", dev.sessions.medior=key
+- `projects.json`: dev.active=true, dev.issueId="42", dev.level="medior", dev.sessions.medior=key
- `audit.log`: 2 entries (task_pickup, model_selection)
+- `audit.log`: 2 entries (work_start, model_selection)
 - `Session`: task message delivered to worker session via CLI
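The audit log is described as NDJSON, meaning one JSON object per line. A minimal sketch of how the two entries from this phase could be serialized is shown below; the `event` field appears in the diagrams, while the `ts` timestamp field and the `toNdjson` helper are assumptions for illustration.

```typescript
// Each entry has an event name plus arbitrary extra fields.
interface AuditEvent {
  event: string;
  [field: string]: unknown;
}

// NDJSON: one JSON object per line, every line terminated by a newline.
// The `ts` field is an assumed timestamp, not confirmed by the source.
function toNdjson(events: AuditEvent[], now: Date = new Date()): string {
  return events
    .map((e) => JSON.stringify({ ts: now.toISOString(), ...e }) + "\n")
    .join("");
}
```

Appending the returned string to `audit.log` keeps the file parseable line by line, so tools can stream it without loading the whole log.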
 ### Phase 4: DEV works
 ```
 DEV sub-agent session → reads codebase, writes code, creates MR
-DEV sub-agent session → calls task_complete({ role: "dev", result: "done", ... })
+DEV sub-agent session → calls work_finish({ role: "dev", result: "done", ... })
 ```
-This happens inside the OpenClaw session. The worker calls `task_complete` directly for atomic state updates. If the worker discovers unrelated bugs, it calls `task_create` to file them.
+This happens inside the OpenClaw session. The worker calls `work_finish` directly for atomic state updates. If the worker discovers unrelated bugs, it calls `task_create` to file them.
 ### Phase 5: DEV complete (worker self-reports)
 ```mermaid
 sequenceDiagram
    participant DEV as DEV Session
-    participant TC as task_complete
+    participant WF as work_finish
    participant GL as Issue Tracker
    participant PJ as projects.json
    participant AL as audit.log
    participant REPO as Git Repo
-    participant QA as QA Session (auto-chain)
+    participant QA as QA Session
-    DEV->>TC: task_complete({ role: "dev", result: "done", projectGroupId: "-123", summary: "Login page with OAuth" })
+    DEV->>WF: work_finish({ role: "dev", result: "done", projectGroupId: "-123", summary: "Login page with OAuth" })
-    TC->>PJ: readProjects()
+    WF->>PJ: readProjects()
-    PJ-->>TC: { dev: { active: true, issueId: "42" } }
+    PJ-->>WF: { dev: { active: true, issueId: "42" } }
-    TC->>REPO: git pull
+    WF->>REPO: git pull
-    TC->>PJ: deactivateWorker(-123, dev)
+    WF->>PJ: deactivateWorker(-123, dev)
    Note over PJ: active→false, issueId→null<br/>sessions map PRESERVED
-    TC->>GL: transition label "Doing" → "To Test"
+    WF->>GL: transitionLabel "Doing" → "To Test"
-    TC->>AL: append { event: "task_complete", role: "dev", result: "done" }
+    WF->>AL: append { event: "work_finish", role: "dev", result: "done" }
-    alt autoChain enabled
+    WF->>WF: tick queue (fill free slots)
-        TC->>GL: transition label "To Test" → "Testing"
+    Note over WF: Scheduler sees "To Test" issue, QA slot free → dispatches QA
-        TC->>QA: dispatchTask(role: "qa", tier: "qa")
+    WF-->>DEV: { announcement: "✅ DEV DONE #42", tickPickups: [...] }
-        TC->>PJ: activateWorker(-123, qa)
-        TC-->>DEV: { announcement: "✅ DEV done #42", autoChain: { dispatched: true, role: "qa" } }
-    else autoChain disabled
-        TC-->>DEV: { announcement: "✅ DEV done #42", nextAction: "qa_pickup" }
-    end
 ```
 **Writes:**
 - `Git repo`: pulled latest (has DEV's merged code)
 - `projects.json`: dev.active=false, dev.issueId=null (sessions map preserved for reuse)
- `Issue Tracker`: label "Doing" → "To Test" (+ "To Test" → "Testing" if auto-chain)
+- `Issue Tracker`: label "Doing" → "To Test"
- `audit.log`: 1 entry (task_complete) + optional auto-chain entries
+- `audit.log`: 1 entry (work_finish) + tick entries if workers dispatched
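The state changes in Phases 3 and 5 can be sketched as two pure functions. The field names (`active`, `issueId`, `level`, `sessions`) follow the Writes lists; the simplified signatures are hypothetical (the diagrams show `deactivateWorker(-123, dev)` operating on `projects.json`, whereas this sketch works on an in-memory object).

```typescript
// Assumed per-role worker record, based on the fields named in the Writes lists.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  level: string | null;
  sessions: Record<string, string | null>;
}

// Phase 3: mark the worker busy and remember the session key for the assigned level.
function activateWorker(
  w: WorkerState,
  issueId: string,
  level: string,
  sessionKey: string
): WorkerState {
  return { active: true, issueId, level, sessions: { ...w.sessions, [level]: sessionKey } };
}

// Phase 5: clear the active flag and issue, but keep the sessions map for reuse.
// The source only specifies active and issueId, so level is left untouched here.
function deactivateWorker(w: WorkerState): WorkerState {
  return { ...w, active: false, issueId: null };
}
```

Keeping `sessions` intact on deactivation is what lets the next task at the same level skip session creation.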
 ### Phase 6: QA pickup
-Same as Phase 3, but with `role: "qa"`. Label transitions "To Test" → "Testing". Uses the qa tier.
+Same as Phase 3, but with `role: "qa"`. Label transitions "To Test" → "Testing". Uses the reviewer level.
-### Phase 7: QA result (3 possible outcomes)
+### Phase 7: QA result (4 possible outcomes)
 #### 7a. QA Pass
 ```mermaid
 sequenceDiagram
-    participant A as Orchestrator
+    participant QA as QA Session
-    participant TC as task_complete
+    participant WF as work_finish
    participant GL as Issue Tracker
    participant PJ as projects.json
    participant AL as audit.log
-    A->>TC: task_complete({ role: "qa", result: "pass", projectGroupId: "-123" })
+    QA->>WF: work_finish({ role: "qa", result: "pass", projectGroupId: "-123" })
-    TC->>PJ: deactivateWorker(-123, qa)
+    WF->>PJ: deactivateWorker(-123, qa)
-    TC->>GL: glab issue update 42 --unlabel "Testing" --label "Done"
+    WF->>GL: transitionLabel(42, "Testing", "Done")
-    TC->>GL: glab issue close 42
+    WF->>GL: closeIssue(42)
-    TC->>AL: append { event: "task_complete", role: "qa", result: "pass" }
+    WF->>AL: append { event: "work_finish", role: "qa", result: "pass" }
-    TC-->>A: { announcement: "🎉 QA PASS #42. Issue closed." }
+    WF-->>QA: { announcement: "🎉 QA PASS #42. Issue closed." }
 ```
 **Ticket complete.** Issue closed, label "Done".
@@ -379,18 +426,18 @@ sequenceDiagram
 ```mermaid
 sequenceDiagram
-    participant A as Orchestrator
+    participant QA as QA Session
-    participant TC as task_complete
+    participant WF as work_finish
    participant GL as Issue Tracker
    participant PJ as projects.json
    participant AL as audit.log
-    A->>TC: task_complete({ role: "qa", result: "fail", projectGroupId: "-123", summary: "OAuth redirect broken" })
+    QA->>WF: work_finish({ role: "qa", result: "fail", projectGroupId: "-123", summary: "OAuth redirect broken" })
-    TC->>PJ: deactivateWorker(-123, qa)
+    WF->>PJ: deactivateWorker(-123, qa)
-    TC->>GL: glab issue update 42 --unlabel "Testing" --label "To Improve"
+    WF->>GL: transitionLabel(42, "Testing", "To Improve")
-    TC->>GL: glab issue reopen 42
+    WF->>GL: reopenIssue(42)
-    TC->>AL: append { event: "task_complete", role: "qa", result: "fail" }
+    WF->>AL: append { event: "work_finish", role: "qa", result: "fail" }
-    TC-->>A: { announcement: "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." }
+    WF-->>QA: { announcement: "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." }
 ```
 **Cycle restarts:** Issue goes to "To Improve". Next heartbeat, DEV picks it up again (Phase 3, but from "To Improve" instead of "To Do").
@@ -410,43 +457,39 @@ DEV Blocked: "Doing" → "To Do"
 QA Blocked:  "Testing" → "To Test"
 ```
-Worker cannot complete (missing info, environment errors, etc.). Issue returns to queue for retry. No auto-chain — the task is available for the next heartbeat pickup.
+Worker cannot complete (missing info, environment errors, etc.). Issue returns to queue for retry. The task is available for the next heartbeat pickup.
 ### Completion enforcement
-Three layers guarantee that `task_complete` always runs:
+Three layers guarantee that `work_finish` always runs:
-1. **Completion contract** — Every task message sent to a worker session includes a mandatory `## MANDATORY: Task Completion` section listing available results and requiring `task_complete` even on failure. Workers are instructed to use `"blocked"` if stuck.
+1. **Completion contract** — Every task message sent to a worker session includes a mandatory `## MANDATORY: Task Completion` section listing available results and requiring `work_finish` even on failure. Workers are instructed to use `"blocked"` if stuck.
 2. **Blocked result** — Both DEV and QA can use `"blocked"` to gracefully return a task to queue without losing work. DEV blocked: `Doing → To Do`. QA blocked: `Testing → To Test`. This gives workers an escape hatch instead of silently dying.
-3. **Stale worker watchdog** — The heartbeat's health check detects workers active for >2 hours. With `autoFix=true`, it deactivates the worker and reverts the label back to queue. This catches sessions that crashed, ran out of context, or otherwise failed without calling `task_complete`. The `session_health` tool provides the same check for manual invocation.
+3. **Stale worker watchdog** — The heartbeat's health check detects workers active for >2 hours. With `fix=true`, it deactivates the worker and reverts the label back to queue. This catches sessions that crashed, ran out of context, or otherwise failed without calling `work_finish`. The `health` tool provides the same check for manual invocation.
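The watchdog rule in layer 3 can be sketched as a pure predicate. This is a minimal illustration, not the plugin's code: `isStaleWorker`, `WorkerState`, and `STALE_AFTER_MS` are hypothetical names, and only the 2-hour threshold and the `active` / `startTime` fields come from the documentation.

```typescript
// Hypothetical subset of the per-role worker state stored in projects.json.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  startTime: string | null; // ISO timestamp set when the worker became active
}

// ">2 hours" threshold taken from the text above.
const STALE_AFTER_MS = 2 * 60 * 60 * 1000;

// True when the worker is active and has been running longer than the threshold.
function isStaleWorker(worker: WorkerState, now: Date): boolean {
  if (!worker.active || worker.startTime === null) return false;
  const started = Date.parse(worker.startTime);
  if (Number.isNaN(started)) return false; // unparseable timestamp: leave for manual review
  return now.getTime() - started > STALE_AFTER_MS;
}
```

A health pass would presumably call a check like this per project and per role, and only apply the deactivate-and-revert fix when `fix=true`.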
 ### Phase 8: Heartbeat (continuous)
-The heartbeat runs periodically (triggered by the agent or a scheduled message). It combines health check + queue scan:
+The heartbeat runs periodically (via background service or manual `work_heartbeat` trigger). It combines health check + queue scan:
 ```mermaid
 sequenceDiagram
-    participant A as Orchestrator
+    participant HB as Heartbeat Service
-    participant SH as session_health
+    participant SH as health check
-    participant QS as queue_status
+    participant TK as projectTick
-    participant TP as task_pickup
+    participant WS as work_start (dispatch)
-    Note over A: Heartbeat triggered
+    Note over HB: Tick triggered (every 60s)
-    A->>SH: session_health({ autoFix: true })
+    HB->>SH: checkWorkerHealth per project per role
-    Note over SH: Checks sessions via Gateway RPC (sessions.list)
+    Note over SH: Checks for zombies, stale workers
-    SH-->>A: { healthy: true }
+    SH-->>HB: { fixes applied }
-    A->>QS: queue_status()
+    HB->>TK: projectTick per project
-    QS-->>A: { projects: [{ dev: idle, queue: { toDo: [#43], toTest: [#44] } }] }
+    Note over TK: Scans queue: To Improve > To Test > To Do
-
+    TK->>WS: dispatchTask (fill free slots)
-    Note over A: DEV idle + To Do #43 → assign medior
+    WS-->>TK: { dispatched }
-    A->>TP: task_pickup({ issueId: 43, role: "dev", model: "medior", ... })
+    TK-->>HB: { pickups, skipped }
-    Note over TP: Plugin handles everything:<br/>tier resolve → session lookup →<br/>label transition → dispatch task →<br/>state update → audit log
-    Note over A: QA idle + To Test #44 → assign qa
-    A->>TP: task_pickup({ issueId: 44, role: "qa", model: "qa", ... })
 ```
 ## Data flow map
@@ -455,25 +498,27 @@ Every piece of data and where it lives:
 ```
 ┌─────────────────────────────────────────────────────────────────┐
-│ Issue Tracker (source of truth for tasks)                        │
+│ Issue Tracker (source of truth for tasks)                       │
 │                                                                 │
 │  Issue #42: "Add login page"                                    │
-│  Labels: [To Do | Doing | To Test | Testing | Done | ...]       │
+│  Labels: [Planning | To Do | Doing | To Test | Testing | ...]   │
 │  State: open / closed                                           │
 │  MRs/PRs: linked merge/pull requests                            │
 │  Created by: orchestrator (task_create), workers, or humans     │
 └─────────────────────────────────────────────────────────────────┘
-        ↕ glab/gh CLI (read/write, auto-detected)
+        ↕ gh/glab CLI (read/write, auto-detected)
 ┌─────────────────────────────────────────────────────────────────┐
 │ DevClaw Plugin (orchestration logic)                            │
 │                                                                 │
-│  devclaw_setup  → agent creation + workspace + model config    │
+│  setup          → agent creation + workspace + model config     │
-│  task_pickup    → tier + label + dispatch + role instr (e2e)   │
+│  work_start     → level + label + dispatch + role instr (e2e)   │
-│  task_complete  → label + state + git pull + auto-chain        │
+│  work_finish    → label + state + git pull + tick queue         │
-│  task_create    → create issue in tracker                      │
+│  task_create    → create issue in tracker                       │
-│  queue_status   → read labels + read state                     │
+│  task_update    → manual label state change                     │
-│  session_health → check sessions + fix zombies                 │
+│  task_comment   → add comment to issue                          │
-│  project_register → labels + prompts + state init (one-time)   │
+│  status         → read labels + read state                      │
 │  health         → check sessions + fix zombies                  │
 │  project_register → labels + prompts + state init (one-time)    │
 └─────────────────────────────────────────────────────────────────┘
        ↕ atomic file I/O          ↕ OpenClaw CLI (plugin shells out)
 ┌────────────────────────────────┐ ┌──────────────────────────────┐
@@ -481,39 +526,40 @@ Every piece of data and where it lives:
 │                                │ │ (called by plugin, not agent)│
 │  Per project:                  │ │                              │
 │    dev:                        │ │  openclaw gateway call       │
-│      active, issueId, model    │ │    sessions.patch → create   │
+│      active, issueId, level    │ │    sessions.patch → create   │
 │      sessions:                 │ │    sessions.list  → health   │
 │        junior: <key>           │ │    sessions.delete → cleanup │
 │        medior: <key>           │ │                              │
-│        senior: <key>           │ │  openclaw agent              │
+│        senior: <key>           │ │  openclaw gateway call agent │
-│    qa:                         │ │    --session-id <key>        │
+│    qa:                         │ │    --params { sessionKey,    │
-│      active, issueId, model    │ │    --message "task..."       │
+│      active, issueId, level    │ │      message, agentId }      │
 │      sessions:                 │ │    → dispatches to session   │
-│        qa: <key>               │ │                              │
+│        reviewer: <key>         │ │                              │
 │        tester: <key>           │ │                              │
 └────────────────────────────────┘ └──────────────────────────────┘
        ↕ append-only
 ┌─────────────────────────────────────────────────────────────────┐
 │ log/audit.log (observability)                                   │
 │                                                                 │
 │  NDJSON, one line per event:                                    │
-│  task_pickup, task_complete, model_selection,                   │
+│  work_start, work_finish, model_selection,                      │
-│  queue_status, health_check, session_spawn, session_reuse,     │
+│  status, health, task_create, task_update,                      │
-│  project_register, devclaw_setup                                │
+│  task_comment, project_register, setup, heartbeat_tick          │
 │                                                                 │
-│  Query with: cat audit.log | jq 'select(.event=="task_pickup")' │
+│  Query: cat audit.log | jq 'select(.event=="work_start")'       │
 └─────────────────────────────────────────────────────────────────┘
 ┌─────────────────────────────────────────────────────────────────┐
-│ Telegram (user-facing messages)                                 │
+│ Telegram / WhatsApp (user-facing messages)                      │
 │                                                                 │
 │  Per group chat:                                                │
-│    "🔧 Spawning DEV (medior) for #42: Add login page"           │
+│    "🔧 Spawning DEV (medior) for #42: Add login page"          │
 │    "⚡ Sending DEV (medior) for #57: Fix validation"            │
-│    "✅ DEV done #42 — Login page with OAuth. Moved to QA queue."│
+│    "✅ DEV DONE #42 — Login page with OAuth."                   │
 │    "🎉 QA PASS #42. Issue closed."                              │
-│    "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV."  │
+│    "❌ QA FAIL #42 — OAuth redirect broken."                    │
-│    "🚫 DEV BLOCKED #42 — Missing dependencies. Returned to queue."│
+│    "🚫 DEV BLOCKED #42 — Missing dependencies."                │
-│    "🚫 QA BLOCKED #42 — Env not available. Returned to QA queue."│
+│    "🚫 QA BLOCKED #42 — Env not available."                    │
 └─────────────────────────────────────────────────────────────────┘
 ┌─────────────────────────────────────────────────────────────────┐
@@ -521,7 +567,7 @@ Every piece of data and where it lives:
 │                                                                 │
 │  DEV sub-agent sessions: read code, write code, create MRs      │
 │  QA sub-agent sessions: read code, run tests, review MRs        │
-│  task_complete (DEV done): git pull to sync latest               │
+│  work_finish (DEV done): git pull to sync latest                │
 └─────────────────────────────────────────────────────────────────┘
 ```
@@ -537,7 +583,7 @@ graph LR
        PR[Project registration]
        SETUP[Agent + workspace setup]
        SD[Session dispatch<br/>create + send via CLI]
-        AC[Auto-chaining<br/>DEV→QA, QA fail→DEV]
+        AC[Scheduling<br/>tick queue after work_finish]
        RI[Role instructions<br/>loaded per project]
        A[Audit logging]
        Z[Zombie cleanup]
@@ -553,7 +599,7 @@ graph LR
    subgraph "Sub-agent sessions handle"
        CR[Code writing]
        MR[MR creation/review]
-        TC_W[Task completion<br/>via task_complete]
+        WF_W[Task completion<br/>via work_finish]
        BUG[Bug filing<br/>via task_create]
    end
@@ -565,20 +611,22 @@ graph LR
 ## IssueProvider abstraction
-All issue tracker operations go through the `IssueProvider` interface, defined in `lib/issue-provider.ts`. This abstraction allows DevClaw to support multiple issue trackers without changing tool logic.
+All issue tracker operations go through the `IssueProvider` interface, defined in `lib/providers/provider.ts`. This abstraction allows DevClaw to support multiple issue trackers without changing tool logic.
 **Interface methods:**
 - `ensureLabel` / `ensureAllStateLabels` — idempotent label creation
 - `createIssue` — create issue with label and assignees
 - `listIssuesByLabel` / `getIssue` — issue queries
 - `transitionLabel` — atomic label state transition (unlabel + label)
 - `closeIssue` / `reopenIssue` — issue lifecycle
 - `hasStateLabel` / `getCurrentStateLabel` — label inspection
- `hasMergedMR` — MR/PR verification
+- `hasMergedMR` / `getMergedMRUrl` — MR/PR verification
 - `addComment` — add comment to issue
 - `healthCheck` — verify provider connectivity
 **Current providers:**
-- **GitLab** (`lib/providers/gitlab.ts`) — wraps `glab` CLI
 - **GitHub** (`lib/providers/github.ts`) — wraps `gh` CLI
+- **GitLab** (`lib/providers/gitlab.ts`) — wraps `glab` CLI
 **Planned providers:**
 - **Jira** — via REST API
@@ -589,16 +637,16 @@ Provider selection is handled by `createProvider()` in `lib/providers/index.ts`.
 | Failure | Detection | Recovery |
 |---|---|---|
-| Session dies mid-task | `session_health` checks via `sessions.list` Gateway RPC | `autoFix`: reverts label, clears active state, removes dead session from sessions map. Next heartbeat picks up task again (creates fresh session for that tier). |
+| Session dies mid-task | `health` checks via `sessions.list` Gateway RPC | `fix=true`: reverts label, clears active state. Next heartbeat picks up task again (creates fresh session for that level). |
-| glab command fails | Plugin tool throws error, returns to agent | Agent retries or reports to Telegram group |
+| gh/glab command fails | Plugin tool throws error, returns to agent | Agent retries or reports to Telegram group |
-| `openclaw agent` CLI fails | Plugin catches error during dispatch | Plugin rolls back: reverts label, clears active state. Returns error to agent for reporting. |
+| `openclaw gateway call agent` fails | Plugin catches error during dispatch | Plugin rolls back: reverts label, clears active state. Returns error. No orphaned state. |
-| `sessions.patch` fails | Plugin catches error during session creation | Plugin rolls back label transition. Returns error. No orphaned state. |
+| `sessions.patch` fails | Plugin catches error during session creation | Plugin rolls back label transition. Returns error. |
 | projects.json corrupted | Tool can't parse JSON | Manual fix needed. Atomic writes (temp+rename) prevent partial writes. |
-| Label out of sync | `task_pickup` verifies label before transitioning | Throws error if label doesn't match expected state. Agent reports mismatch. |
+| Label out of sync | `work_start` verifies label before transitioning | Throws error if label doesn't match expected state. |
-| Worker already active | `task_pickup` checks `active` flag | Throws error: "DEV worker already active on project". Must complete current task first. |
+| Worker already active | `work_start` checks `active` flag | Throws error: "DEV already active on project". Must complete current task first. |
-| Stale worker (>2h) | `session_health` and heartbeat health check | `autoFix`: deactivates worker, reverts label to queue (To Do / To Test). Task available for next pickup. |
+| Stale worker (>2h) | `health` and heartbeat health check | `fix=true`: deactivates worker, reverts label to queue. Task available for next pickup. |
-| Worker stuck/blocked | Worker calls `task_complete` with `"blocked"` | Deactivates worker, reverts label to queue. Issue available for retry. |
+| Worker stuck/blocked | Worker calls `work_finish` with `"blocked"` | Deactivates worker, reverts label to queue. Issue available for retry. |
-| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. No partial state — labels are idempotent, projects.json not written until all labels succeed. |
+| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. Labels are idempotent, projects.json not written until all labels succeed. |
 ## File locations
@@ -606,8 +654,9 @@ Provider selection is handled by `createProvider()` in `lib/providers/index.ts`.
 |---|---|---|
 | Plugin source | `~/.openclaw/extensions/devclaw/` | Plugin code |
 | Plugin manifest | `~/.openclaw/extensions/devclaw/openclaw.plugin.json` | Plugin registration |
-| Agent config | `~/.openclaw/openclaw.json` | Agent definition + tool permissions + tier config |
+| Agent config | `~/.openclaw/openclaw.json` | Agent definition + tool permissions + model config |
 | Worker state | `~/.openclaw/workspace-<agent>/projects/projects.json` | Per-project DEV/QA state |
 | Role instructions | `~/.openclaw/workspace-<agent>/projects/roles/<project>/` | Per-project `dev.md` and `qa.md` |
 | Audit log | `~/.openclaw/workspace-<agent>/log/audit.log` | NDJSON event log |
 | Session transcripts | `~/.openclaw/agents/<agent>/sessions/<uuid>.jsonl` | Conversation history per session |
 | Git repos | `~/git/<project>/` | Project source code |
--- a/docs/CONFIGURATION.md
+++ b/docs/CONFIGURATION.md
@@ -0,0 +1,354 @@
 # DevClaw — Configuration Reference
 All DevClaw configuration lives in two places: `openclaw.json` (plugin-level settings) and `projects.json` (per-project state).
 ## Plugin Configuration (`openclaw.json`)
 DevClaw is configured under `plugins.entries.devclaw.config` in `openclaw.json`.
 ### Model Tiers
 Override which model powers each worker level (DEV and QA):
 ```json
 {
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "models": {
            "dev": {
              "junior": "anthropic/claude-haiku-4-5",
              "medior": "anthropic/claude-sonnet-4-5",
              "senior": "anthropic/claude-opus-4-5"
            },
            "qa": {
              "reviewer": "anthropic/claude-sonnet-4-5",
              "tester": "anthropic/claude-haiku-4-5"
            }
          }
        }
      }
    }
  }
 }
 ```
 **Resolution order** (per `lib/tiers.ts:resolveModel`):
 1. Plugin config `models.<role>.<level>` — explicit override
 2. `DEFAULT_MODELS[role][level]` — built-in defaults (table below)
 3. Passthrough — treat the level string as a raw model ID
 **Default models:**
 | Role | Level | Default model |
 |---|---|---|
 | dev | junior | `anthropic/claude-haiku-4-5` |
 | dev | medior | `anthropic/claude-sonnet-4-5` |
 | dev | senior | `anthropic/claude-opus-4-5` |
 | qa | reviewer | `anthropic/claude-sonnet-4-5` |
 | qa | tester | `anthropic/claude-haiku-4-5` |
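The three-step resolution order can be sketched as follows. This is an illustrative reimplementation under assumptions, not the contents of `lib/tiers.ts`: the name `resolveModelSketch` and the `ModelOverrides` type are hypothetical, while the default model IDs are copied from the table above.

```typescript
type Role = "dev" | "qa";
type LevelMap = Partial<Record<string, string>>;

// Built-in defaults, copied from the "Default models" table.
const DEFAULT_MODELS: Record<Role, LevelMap> = {
  dev: {
    junior: "anthropic/claude-haiku-4-5",
    medior: "anthropic/claude-sonnet-4-5",
    senior: "anthropic/claude-opus-4-5",
  },
  qa: {
    reviewer: "anthropic/claude-sonnet-4-5",
    tester: "anthropic/claude-haiku-4-5",
  },
};

// Assumed shape of the `models` key in the plugin config.
type ModelOverrides = Partial<Record<Role, LevelMap>>;

function resolveModelSketch(role: Role, level: string, overrides: ModelOverrides = {}): string {
  const override = overrides[role]?.[level];
  if (override !== undefined) return override; // 1. explicit plugin config override
  const builtIn = DEFAULT_MODELS[role][level];
  if (builtIn !== undefined) return builtIn; // 2. built-in default
  return level; // 3. passthrough: treat the level string as a raw model ID
}
```

With this ordering, an unknown level such as a full model ID falls through to step 3 unchanged, which is what lets a caller pass a raw model ID where a level is expected.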
 ### Project Execution Mode
 Controls cross-project parallelism:
 ```json
 {
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "projectExecution": "parallel"
        }
      }
    }
  }
 }
 ```
 | Value | Behavior |
 |---|---|
 | `"parallel"` (default) | Multiple projects can have active workers simultaneously |
 | `"sequential"` | Only one project's workers active at a time. Useful for single-agent deployments. |
 Enforced in `work_heartbeat` and the heartbeat service before dispatching.
 ### Heartbeat Service
 Token-free interval-based health checks + queue dispatch:
 ```json
 {
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "work_heartbeat": {
            "enabled": true,
            "intervalSeconds": 60,
            "maxPickupsPerTick": 4
          }
        }
      }
    }
  }
 }
 ```
 | Setting | Type | Default | Description |
 |---|---|---|---|
 | `enabled` | boolean | `true` | Enable the heartbeat service |
 | `intervalSeconds` | number | `60` | Seconds between ticks |
 | `maxPickupsPerTick` | number | `4` | Maximum worker dispatches per tick (budget control) |
 **Source:** [`lib/services/heartbeat.ts`](../lib/services/heartbeat.ts)
 The heartbeat service runs as a plugin service tied to the gateway lifecycle. Every tick: health pass (auto-fix zombies, stale workers) → tick pass (fill free slots by priority). Zero LLM tokens consumed.
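The tick pass for a single project might look roughly like the sketch below. It assumes the queue priority `To Improve > To Test > To Do` and one worker slot per role; `planTick`, `QueuedIssue`, and `Pickup` are hypothetical names, and the real service may structure this differently (for example, by sharing the `maxPickupsPerTick` budget across projects).

```typescript
type QueueLabel = "To Improve" | "To Test" | "To Do";
type Role = "dev" | "qa";

interface QueuedIssue { id: number; label: QueueLabel; }
interface Pickup { issueId: number; role: Role; }

// Highest priority first.
const PRIORITY: QueueLabel[] = ["To Improve", "To Test", "To Do"];

// Which role works each queue label (DEV fixes and builds, QA tests).
const ROLE_FOR_LABEL: Record<QueueLabel, Role> = {
  "To Improve": "dev",
  "To Test": "qa",
  "To Do": "dev",
};

// Decide which issues to dispatch this tick without exceeding the pickup budget.
function planTick(queue: QueuedIssue[], idle: Record<Role, boolean>, maxPickups: number): Pickup[] {
  const free = { ...idle };
  const pickups: Pickup[] = [];
  for (const label of PRIORITY) {
    if (pickups.length >= maxPickups) break;
    const role = ROLE_FOR_LABEL[label];
    if (!free[role]) continue; // slot already taken
    const issue = queue.find((candidate) => candidate.label === label);
    if (issue === undefined) continue; // nothing queued under this label
    pickups.push({ issueId: issue.id, role });
    free[role] = false; // one active worker per role
  }
  return pickups;
}
```

Because the plan is computed from labels and worker state alone, a pass like this needs no LLM call, which is consistent with the "zero tokens" claim above.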
 ### Notifications
 Control which lifecycle events send notifications:
 ```json
 {
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "notifications": {
            "heartbeatDm": true,
            "workerStart": true,
            "workerComplete": true
          }
        }
      }
    }
  }
 }
 ```
 | Setting | Default | Description |
 |---|---|---|
 | `heartbeatDm` | `true` | Send heartbeat summary to orchestrator DM |
 | `workerStart` | `true` | Announce when a worker picks up a task |
 | `workerComplete` | `true` | Announce when a worker finishes a task |
 ### DevClaw Agent IDs
 List which agents are recognized as DevClaw orchestrators (used for context detection):
 ```json
 {
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "devClawAgentIds": ["my-orchestrator"]
        }
      }
    }
  }
 }
 ```
 ### Agent Tool Permissions
 Restrict DevClaw tools to your orchestrator agent:
 ```json
 {
  "agents": {
    "list": [
      {
        "id": "my-orchestrator",
        "tools": {
          "allow": [
            "work_start",
            "work_finish",
            "task_create",
            "task_update",
            "task_comment",
            "status",
            "health",
            "work_heartbeat",
            "project_register",
            "setup",
            "onboard"
          ]
        }
      }
    ]
  }
 }
 ```
 ---
 ## Project State (`projects.json`)
 All project state lives in `<workspace>/projects/projects.json`, keyed by group ID.
 **Source:** [`lib/projects.ts`](../lib/projects.ts)
 ### Schema
 ```json
 {
  "projects": {
    "<groupId>": {
      "name": "my-webapp",
      "repo": "~/git/my-webapp",
      "groupName": "Dev - My Webapp",
      "baseBranch": "development",
      "deployBranch": "development",
      "deployUrl": "https://my-webapp.example.com",
      "channel": "telegram",
      "roleExecution": "parallel",
      "dev": {
        "active": false,
        "issueId": null,
        "startTime": null,
        "level": null,
        "sessions": {
          "junior": null,
          "medior": "agent:orchestrator:subagent:my-webapp-dev-medior",
          "senior": null
        }
      },
      "qa": {
        "active": false,
        "issueId": null,
        "startTime": null,
        "level": null,
        "sessions": {
          "reviewer": "agent:orchestrator:subagent:my-webapp-qa-reviewer",
          "tester": null
        }
      }
    }
  }
 }
 ```
 ### Project fields
 | Field | Type | Description |
 |---|---|---|
 | `name` | string | Short project name |
 | `repo` | string | Path to git repo (supports `~/` expansion) |
 | `groupName` | string | Group display name |
 | `baseBranch` | string | Base branch for development |
 | `deployBranch` | string | Branch that triggers deployment |
 | `deployUrl` | string | Deployment URL |
 | `channel` | string | Messaging channel (`"telegram"`, `"whatsapp"`, etc.) |
 | `roleExecution` | `"parallel"` \| `"sequential"` | DEV/QA parallelism for this project |
 ### Worker state fields
 Each project has `dev` and `qa` worker state objects:
 | Field | Type | Description |
 |---|---|---|
 | `active` | boolean | Whether this role has an active worker |
 | `issueId` | string \| null | Issue being worked on (as string) |
 | `startTime` | string \| null | ISO timestamp when worker became active |
 | `level` | string \| null | Current level (`junior`, `medior`, `senior`, `reviewer`, `tester`) |
 | `sessions` | Record<string, string \| null> | Per-level session keys |
 **DEV session keys:** `junior`, `medior`, `senior`
 **QA session keys:** `reviewer`, `tester`
 ### Key design decisions
 - **Session-per-level** — each level gets its own worker session, accumulating context independently. Level selection maps directly to a session key.
 - **Sessions preserved on completion** — when a worker completes a task, the sessions map is preserved (only `active`, `issueId`, and `startTime` are cleared). This enables session reuse.
 - **Atomic writes** — all writes go through temp-file-then-rename to prevent corruption.
 - **Sessions persist indefinitely** — no auto-cleanup. The `health` tool handles manual cleanup.
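The "sessions preserved on completion" decision can be illustrated with a small sketch. The function names `completeWorkerSketch` and `sessionForLevel` are hypothetical; the fields mirror the worker state table above, and the sketch clears exactly the three fields the text names.

```typescript
// Worker state as documented in the "Worker state fields" table.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  startTime: string | null;
  level: string | null;
  sessions: Record<string, string | null>;
}

// On completion, clear only active/issueId/startTime; the sessions map is kept for reuse.
function completeWorkerSketch(worker: WorkerState): WorkerState {
  return { ...worker, active: false, issueId: null, startTime: null };
}

// Returns the existing session key for a level, or null when a new session would be needed.
function sessionForLevel(worker: WorkerState, level: string): string | null {
  return worker.sessions[level] ?? null;
}
```

Keeping the map intact is what makes the next dispatch at the same level a reuse rather than a fresh spawn.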
 ---
 ## Workspace File Layout
 ```
 <workspace>/
 ├── projects/
 │   ├── projects.json          ← Project state (auto-managed)
 │   └── roles/
 │       ├── my-webapp/         ← Per-project role instructions (editable)
 │       │   ├── dev.md
 │       │   └── qa.md
 │       ├── another-project/
 │       │   ├── dev.md
 │       │   └── qa.md
 │       └── default/           ← Fallback role instructions
 │           ├── dev.md
 │           └── qa.md
 ├── log/
 │   └── audit.log              ← NDJSON event log (auto-managed)
 ├── AGENTS.md                  ← Agent identity documentation
 └── HEARTBEAT.md               ← Heartbeat operation guide
 ```
 ### Role instruction files
 `work_start` loads role instructions from `projects/roles/<project>/<role>.md` at dispatch time, falling back to `projects/roles/default/<role>.md`. These files are appended to the task message sent to worker sessions.
 Edit to customize: deployment steps, test commands, acceptance criteria, coding standards.
 **Source:** [`lib/dispatch.ts:loadRoleInstructions`](../lib/dispatch.ts)
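The fallback lookup can be sketched as below. This is an assumption-laden illustration rather than the actual `loadRoleInstructions`: the function names are hypothetical, and the file-existence check is injected as a callback so the sketch stays independent of any filesystem API.

```typescript
type Role = "dev" | "qa";

// Candidate paths in lookup order: project-specific first, then the default fallback.
function roleInstructionCandidates(project: string, role: Role): string[] {
  return [
    `projects/roles/${project}/${role}.md`,
    `projects/roles/default/${role}.md`,
  ];
}

// Returns the first candidate that exists, or null when neither file is present.
function pickRoleInstructions(
  project: string,
  role: Role,
  exists: (path: string) => boolean,
): string | null {
  return roleInstructionCandidates(project, role).find((path) => exists(path)) ?? null;
}
```

Paths are relative to the workspace root shown in the layout above.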
 ---
 ## Audit Log
 Append-only NDJSON at `<workspace>/log/audit.log`. Auto-truncated to 250 lines.
 **Source:** [`lib/audit.ts`](../lib/audit.ts)
 ### Event types
 | Event | Trigger |
 |---|---|
 | `work_start` | Task dispatched to worker |
 | `model_selection` | Level resolved to model ID |
 | `work_finish` | Task completed |
 | `work_heartbeat` | Heartbeat tick completed |
 | `task_create` | Issue created |
 | `task_update` | Issue state changed |
 | `task_comment` | Comment added to issue |
 | `status` | Queue status queried |
 | `health` | Health scan completed |
 | `heartbeat_tick` | Heartbeat service tick (background) |
 | `project_register` | Project registered |
 ### Querying
 ```bash
 # All task dispatches
 cat audit.log | jq 'select(.event=="work_start")'
 # All completions for a project
 cat audit.log | jq 'select(.event=="work_finish" and .project=="my-webapp")'
 # Model selections
 cat audit.log | jq 'select(.event=="model_selection")'
 ```
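The same queries can be run from code. The sketch below parses NDJSON text into entries and tolerates malformed lines; `parseAuditLog` and `AuditEntry` are hypothetical names, and the only fields assumed are `event` and `project`, which appear in the `jq` examples above.

```typescript
// Minimal entry shape: every line is assumed to carry an `event` string.
interface AuditEntry {
  event: string;
  [key: string]: unknown;
}

// Parse NDJSON (one JSON object per line), skipping blank or malformed lines.
function parseAuditLog(ndjson: string): AuditEntry[] {
  const entries: AuditEntry[] = [];
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (typeof parsed === "object" && parsed !== null && typeof (parsed as AuditEntry).event === "string") {
        entries.push(parsed as AuditEntry);
      }
    } catch {
      // Ignore lines that are not valid JSON, e.g. a partially written tail.
    }
  }
  return entries;
}
```

For example, `parseAuditLog(text).filter((e) => e.event === "work_finish" && e.project === "my-webapp")` mirrors the second `jq` query.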
 ---
 ## Issue Provider
 DevClaw uses an `IssueProvider` interface (`lib/providers/provider.ts`) to abstract issue tracker operations. The provider is auto-detected from the git remote URL.
 **Supported providers:**
 | Provider | CLI | Detection |
 |---|---|---|
 | GitHub | `gh` | Remote contains `github.com` |
 | GitLab | `glab` | Remote contains `gitlab` |
 **Planned:** Jira (via REST API)
 **Source:** [`lib/providers/index.ts`](../lib/providers/index.ts)
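The detection rules in the table can be sketched as a simple substring check on the remote URL. This is an illustration only: `detectProvider` and `ProviderKind` are hypothetical names, and the actual `createProvider()` may apply additional logic.

```typescript
type ProviderKind = "github" | "gitlab";

// Map a git remote URL to a provider using the rules from the table above.
function detectProvider(remoteUrl: string): ProviderKind | null {
  const url = remoteUrl.toLowerCase();
  if (url.includes("github.com")) return "github";
  if (url.includes("gitlab")) return "gitlab";
  return null; // unknown host: caller decides how to report it
}
```

Matching on `gitlab` rather than `gitlab.com` means self-hosted instances such as `gitlab.example.com` are also recognized.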
--- a/docs/CONTEXT-AWARENESS.md
+++ b/docs/CONTEXT-AWARENESS.md
@@ -1,6 +1,6 @@
-# Context-Aware DevClaw
+# DevClaw — Context Awareness
-DevClaw now adapts its behavior based on how you interact with it.
+DevClaw adapts its behavior based on how you interact with it.
 ## Design Philosophy
@@ -12,170 +12,122 @@ DevClaw enforces strict boundaries between projects:
 - Project work happens **inside that project's group**
 - Setup and configuration happen **outside project groups**
-This design prevents:
+This prevents:
- ❌ Cross-project contamination (workers picking up wrong project's tasks)
+- Cross-project contamination (workers picking up wrong project's tasks)
- ❌ Confusion about which project you're working on
+- Confusion about which project you're working on
- ❌ Accidental registration of wrong groups
+- Accidental registration of wrong groups
- ❌ Setup discussions cluttering project work channels
+- Setup discussions cluttering project work channels
 This enables:
- ✅ Clear mental model: "This group = this project"
+- Clear mental model: "This group = this project"
- ✅ Isolated work streams: Each project progresses independently
+- Isolated work streams: Each project progresses independently
- ✅ Dedicated teams: Workers focus on one project at a time
+- Dedicated teams: Workers focus on one project at a time
- ✅ Clean separation: Setup vs. operational work
+- Clean separation: Setup vs. operational work
 ## Three Interaction Contexts
-### 1. **Via Another Agent** (Setup Mode)
+### 1. Via Another Agent (Setup Mode)
-When you talk to your main agent (like Henk) about DevClaw:
+
- ✅ Use: `devclaw_onboard`, `devclaw_setup`
+When you talk to your main agent about DevClaw:
- ❌ Avoid: `task_pickup`, `queue_status` (operational tools)
+- Use: `onboard`, `setup`
 - Avoid: `work_start`, `status` (operational tools)
 **Example:**
 ```
-User → Henk: "Can you help me set up DevClaw?"
+User → Main Agent: "Can you help me set up DevClaw?"
-Henk → Calls devclaw_onboard
+Main Agent → Calls onboard
 ```
-### 2. **Direct Message to DevClaw Agent**
+### 2. Direct Message to DevClaw Agent
 When you DM the DevClaw agent directly on Telegram/WhatsApp:
- ✅ Use: `queue_status` (all projects), `session_health` (system overview)
+- Use: `status` (all projects), `health` (system overview)
- ❌ Avoid: `task_pickup` (project-specific work), setup tools
+- Avoid: `work_start` (project-specific work), setup tools
 **Example:**
 ```
 User → DevClaw DM: "Show me the status of all projects"
-DevClaw → Calls queue_status (shows all projects)
+DevClaw → Calls status (shows all projects)
 ```
-### 3. **Project Group Chat**
+### 3. Project Group Chat
 When you message in a Telegram/WhatsApp group bound to a project:
- ✅ Use: `task_pickup`, `task_complete`, `task_create`, `queue_status` (auto-filtered)
+- Use: `work_start`, `work_finish`, `task_create`, `status` (auto-filtered)
- ❌ Avoid: Setup tools, system-wide queries
+- Avoid: Setup tools, system-wide queries
 **Example:**
 ```
-User → OpenClaw Dev Group: "@henk pick up issue #42"
+User → Project Group: "pick up issue #42"
-DevClaw → Calls task_pickup (only works in groups)
+DevClaw → Calls work_start (only works in groups)
 ```
 ## How It Works
 ### Context Detection
 Each tool automatically detects:
- **Agent ID** - Is this the DevClaw agent or another agent?
+- **Agent ID** — Is this the DevClaw agent or another agent?
- **Message Channel** - Telegram, WhatsApp, or CLI?
+- **Message Channel** — Telegram, WhatsApp, or CLI?
- **Session Key** - Is this a group chat or direct message?
+- **Session Key** — Is this a group chat or direct message?
  - Format: `agent:{agentId}:{channel}:{type}:{id}`
  - Telegram group: `agent:devclaw:telegram:group:-5266044536`
  - WhatsApp group: `agent:devclaw:whatsapp:group:120363123@g.us`
  - DM: `agent:devclaw:telegram:user:657120585`
- **Project Binding** - Which project is this group bound to?
+- **Project Binding** — Which project is this group bound to?
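To make the session key format concrete, here is a small parsing sketch. `parseSessionKey` and `SessionKeyParts` are hypothetical names, not the API of `lib/context-guard.ts`; the sketch only covers the chat key format listed above and treats `type === "group"` as a group chat.

```typescript
interface SessionKeyParts {
  agentId: string;
  channel: string;
  type: string;
  id: string;
}

// Parse `agent:{agentId}:{channel}:{type}:{id}`; returns null for any other shape.
function parseSessionKey(key: string): SessionKeyParts | null {
  const match = /^agent:([^:]+):([^:]+):([^:]+):(.+)$/.exec(key);
  if (match === null) return null;
  const [, agentId = "", channel = "", type = "", id = ""] = match;
  return { agentId, channel, type, id };
}

// Group chats are identified by the `type` segment.
function isGroupChat(key: string): boolean {
  return parseSessionKey(key)?.type === "group";
}
```

The final segment is captured greedily so that IDs containing punctuation, such as WhatsApp's `120363123@g.us`, survive intact.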
 ### Guardrails
 Tools include context-aware guidance in their responses:
 ```json
 {
-  "contextGuidance": "🛡️ Context: Project Group Chat (telegram)\n
+  "contextGuidance": "Context: Project Group Chat (telegram)\n    You're in a Telegram group for project 'my-webapp'.\n    Use work_start, work_finish for project work.",
-    You're in a Telegram group for project 'openclaw-core'.\n
-    Use task_pickup, task_complete for project work.",
  ...
 }
 ```
-## Integrated Tools
+## Tool Context Requirements
-### ✅ `devclaw_onboard`
+| Tool | Group chat | Direct DM | Via agent |
- **Works best:** Via another agent or direct DM
+|---|---|---|---|
- **Blocks:** Group chats (setup shouldn't happen in project groups)
+| `onboard` | Blocked | Works | Works |
 | `setup` | Works | Works | Works |
 | `work_start` | Works | Blocked | Blocked |
 | `work_finish` | Works | Works | Works |
 | `task_create` | Works | Works | Works |
 | `task_update` | Works | Works | Works |
 | `task_comment` | Works | Works | Works |
 | `status` | Auto-filtered | All projects | Suggests onboard |
 | `health` | Auto-filtered | All projects | Works |
 | `work_heartbeat` | Single project | All projects | Works |
 | `project_register` | Works (required) | Blocked | Blocked |
-### ✅ `queue_status`
+**Why `project_register` requires group context:**
- **Group context:** Auto-filters to that project
+- Forces deliberate project registration from within the project's space
- **Direct context:** Shows all projects
+- You're physically in the group when binding it, making the connection explicit
- **Via-agent context:** Suggests using devclaw_onboard instead
+- Impossible to accidentally register the wrong group
-### ✅ `task_pickup`
-- **ONLY works:** In project group chats
-- **Blocks:** Direct DMs and setup conversations
-### ✅ `project_register`
-- **ONLY works:** In the Telegram/WhatsApp group you're registering
-- **Blocks:** Direct DMs and via-agent conversations
-- **Auto-detects:** Group ID from current chat (projectGroupId parameter now optional)
-**Why this matters:**
-- **Project Isolation**: Each group = one project = one dedicated team
-- **Clear Boundaries**: Forces deliberate project registration from within the project's space
-- **Team Clarity**: You're physically in the group when binding it, making the connection explicit
-- **No Mistakes**: Impossible to accidentally register the wrong group when you're in it
-- **Natural Workflow**: "This group is for Project X" → register Project X here
 ## Testing
 ### Debug Tool
 Use `context_test` to see what context is detected:
 ```
 # In any context:
 context_test
 # Returns:
 {
  "detectedContext": { "type": "group", "projectName": "openclaw-core" },
  "guardrails": "🛡️ Context: Project Group Chat..."
 }
 ```
 ### Manual Testing
 1. **Setup Mode:** Message your main agent → "Help me configure DevClaw"
 2. **Status Check:** DM DevClaw agent (Telegram/WhatsApp) → "Show me the queue"
 3. **Project Work:** Post in project group (Telegram/WhatsApp) → "@henk pick up #42"
 Each context should trigger different guardrails.
 ## Configuration
 Add to `~/.openclaw/openclaw.json`:
 ```json
 "plugins": {
  "entries": {
    "devclaw": {
      "config": {
        "devClawAgentIds": ["henk-development", "devclaw-test"],
        "models": { ... }
      }
    }
  }
 }
 ```
 The `devClawAgentIds` array lists which agents are DevClaw orchestrators.
 ## Implementation Details
 - **Module:** [lib/context-guard.ts](../lib/context-guard.ts)
 - **Tests:** [tests/unit/context-guard.test.ts](../tests/unit/context-guard.test.ts) (15 passing)
 - **Integrated tools:** 4 key tools (`devclaw_onboard`, `queue_status`, `task_pickup`, `project_register`)
 - **Detection logic:** Checks agentId, messageChannel, sessionKey pattern matching
 ## WhatsApp Support
-DevClaw **fully supports WhatsApp** groups with the same architecture as Telegram:
+DevClaw fully supports WhatsApp groups with the same architecture as Telegram:
- ✅ WhatsApp group detection via `sessionKey.includes("@g.us")`
+- WhatsApp group detection via `sessionKey.includes("@g.us")`
- ✅ Projects keyed by WhatsApp group ID (e.g., `"120363123@g.us"`)
+- Projects keyed by WhatsApp group ID (e.g., `"120363123@g.us"`)
- ✅ Context-aware tools work identically for both channels
+- Context-aware tools work identically for both channels
- ✅ One project = one group (Telegram OR WhatsApp)
+- One project = one group (Telegram OR WhatsApp)
 **To register a WhatsApp project:**
 1. Go to the WhatsApp group chat
 2. Call `project_register` from within the group
 3. Group ID auto-detected from context
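
The detection inputs named in this document (`agentId`, `messageChannel`, `sessionKey`) could be combined roughly as follows. This is a sketch under stated assumptions, not the code in `lib/context-guard.ts`: the `ToolCallInfo` shape, the rule that non-DevClaw agents count as "via agent", and the Telegram session key pattern are all hypothetical. Only the `@g.us` check comes from the text above.

```typescript
// Hypothetical input shape; field names follow the detection logic bullet.
interface ToolCallInfo {
  agentId: string;
  messageChannel?: string; // e.g. "telegram" or "whatsapp"
  sessionKey: string;
}

type DetectedContext =
  | { type: "group"; channel: string }
  | { type: "direct"; channel: string }
  | { type: "via-agent" };

function detectContext(info: ToolCallInfo, devClawAgentIds: string[]): DetectedContext {
  // Assumption: calls from agents outside devClawAgentIds are "via agent".
  if (!devClawAgentIds.includes(info.agentId)) return { type: "via-agent" };

  const channel = info.messageChannel ?? "unknown";
  // WhatsApp group IDs contain "@g.us" (stated in this document).
  const isWhatsAppGroup = info.sessionKey.includes("@g.us");
  // Assumption: Telegram group session keys embed the negative group ID, e.g. ":-1234567890".
  const isTelegramGroup = channel === "telegram" && /:-\d+/.test(info.sessionKey);

  return isWhatsAppGroup || isTelegramGroup
    ? { type: "group", channel }
    : { type: "direct", channel };
}
```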
-The architecture treats Telegram and WhatsApp identically - the only difference is the group ID format.
+## Implementation
-## Future Enhancements
+- **Module:** [`lib/context-guard.ts`](../lib/context-guard.ts)
 - **Detection logic:** Checks agentId, messageChannel, sessionKey pattern matching
 - **Configuration:** `devClawAgentIds` in plugin config lists which agents are DevClaw orchestrators
- [ ] Integrate into remaining tools (`task_complete`, `session_health`, `task_create`, `devclaw_setup`)
+## Related
- [ ] System prompt injection (requires OpenClaw core support)
+
- [ ] Context-based tool filtering (hide irrelevant tools)
+- [Configuration — devClawAgentIds](CONFIGURATION.md#devclaw-agent-ids)
- [ ] Per-project context overrides
+- [Architecture — Scope boundaries](ARCHITECTURE.md#scope-boundaries)
--- a/docs/MANAGEMENT.md
+++ b/docs/MANAGEMENT.md
@@ -12,14 +12,14 @@ DevClaw exists because of a gap that management theorists identified decades ago
 In 1969, Paul Hersey and Ken Blanchard published what would become Situational Leadership Theory. The central idea is deceptively simple: the way you delegate should match the capability and reliability of the person doing the work. You don't hand an intern the system architecture redesign. You don't ask your principal engineer to rename a CSS class.
-DevClaw's model selection does exactly this. When a task comes in, the plugin evaluates complexity from the issue title and description, then routes it to the cheapest model that can handle it:
+DevClaw's level selection does exactly this. When a task comes in, the plugin routes it to the cheapest model that can handle it:
-| Complexity                       | Model  | Analogy                     |
+| Complexity                       | Level    | Analogy                     |
-| -------------------------------- | ------ | --------------------------- |
+| -------------------------------- | -------- | --------------------------- |
-| Simple (typos, renames, copy)    | Haiku  | Junior dev — just execute   |
+| Simple (typos, renames, copy)    | Junior   | The intern — just execute   |
-| Standard (features, bug fixes)   | Sonnet | Mid-level — think and build |
+| Standard (features, bug fixes)   | Medior   | Mid-level — think and build |
-| Complex (architecture, security) | Opus   | Senior — design and reason  |
+| Complex (architecture, security) | Senior   | The architect — design and reason |
-| Review                           | Grok   | Independent reviewer        |
+| Review                           | Reviewer | Independent code reviewer   |
 This isn't just cost optimization. It mirrors what effective managers do instinctively: match the delegation level to the task, not to a fixed assumption about the delegate.
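
The onboarding guide describes level assignment as chosen by the orchestrator, with a keyword heuristic as fallback. A fallback of that kind might look like the sketch below. The keyword lists are hypothetical, seeded only from the examples in the table above, and `pickLevel` is an illustrative name rather than a DevClaw function.

```typescript
type DevLevel = "junior" | "medior" | "senior";

// Hypothetical keyword lists taken from the examples in the table above.
const SIMPLE_KEYWORDS = ["typo", "rename", "copy"];
const COMPLEX_KEYWORDS = ["architecture", "security"];

function pickLevel(title: string, description = ""): DevLevel {
  const text = `${title} ${description}`.toLowerCase();
  // Check complex keywords first so a task like "rename the security module"
  // is routed up rather than down.
  if (COMPLEX_KEYWORDS.some((k) => text.includes(k))) return "senior";
  if (SIMPLE_KEYWORDS.some((k) => text.includes(k))) return "junior";
  // Everything else is treated as standard work.
  return "medior";
}
```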
@@ -27,11 +27,11 @@ This isn't just cost optimization. It mirrors what effective managers do instinc
 Classical management theory — later formalized by Bernard Bass in his work on Transformational Leadership — introduced a concept called Management by Exception (MBE). The principle: a manager should only be pulled back into a workstream when something deviates from the expected path.
-DevClaw's task lifecycle is built on this. The orchestrator delegates a task via `task_pickup`, then steps away. It only re-engages in three scenarios:
+DevClaw's task lifecycle is built on this. The orchestrator delegates a task via `work_start`, then steps away. It only re-engages in three scenarios:
-1. **DEV completes work** → The task moves to QA automatically. No orchestrator involvement needed.
+1. **DEV completes work** → The label moves to `To Test`. The scheduler dispatches QA on the next tick. No orchestrator involvement needed.
 2. **QA passes** → The issue closes. Pipeline complete.
-3. **QA fails** → The task cycles back to DEV with a fix request. The orchestrator may need to adjust the model tier.
+3. **QA fails** → The label moves to `To Improve`. The scheduler dispatches DEV on the next tick. The orchestrator may need to adjust the model level.
 4. **QA refines** → The task enters a holding state that _requires human decision_. This is the explicit escalation boundary.
 The "refine" state is the most interesting from a delegation perspective. It's a conscious architectural decision that says: some judgments should not be automated. When the QA agent determines that a task needs rethinking rather than just fixing, it escalates to the only actor who has the full business context — the human.
@@ -61,7 +61,7 @@ One of the most common delegation failures is self-review. You don't ask the per
 DevClaw enforces structural separation between development and review by design:
 - DEV and QA are separate sub-agent sessions with separate state.
- QA uses a different model entirely (Grok), introducing genuine independence.
+- QA uses the reviewer level, which can be a different model entirely, introducing genuine independence.
 - The review happens after a clean label transition — QA picks up from `To Test`, not from watching DEV work in real time.
 This mirrors a principle from organizational design: effective controls require independence between execution and verification. It's the same reason companies separate their audit function from their operations.
@@ -72,7 +72,7 @@ Ronald Coase won a Nobel Prize for explaining why firms exist: transaction costs
 DevClaw applies the same logic to AI sessions. Spawning a new sub-agent session costs approximately 50,000 tokens of context loading — the agent needs to read the full codebase before it can do useful work. That's the onboarding cost.
-The plugin tracks session IDs across task completions. When a DEV finishes task A and task B is ready on the same project, DevClaw detects the existing session and returns `"sessionAction": "send"` instead of `"spawn"`. The orchestrator routes the new task to the running session. No re-onboarding. No context reload.
+The plugin tracks session keys across task completions. When a DEV finishes task A and task B is ready on the same project, DevClaw detects the existing session and reuses it instead of spawning a new one. No re-onboarding. No context reload.
 In management terms: keep your team stable. Reassigning the same person to the next task on their project is almost always cheaper than bringing in someone new — even if the new person is theoretically better qualified.
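
The reuse decision can be sketched against the `sessions` map that `projects.json` keeps per role. This is an illustration only: `planDispatch` and the `Dispatch` type are hypothetical, and the `"send"` and `"spawn"` labels are borrowed from the earlier wording of this section.

```typescript
// Mirrors the per-role "sessions" map in projects.json: level -> session key or null.
type Sessions = Record<string, string | null>;

type Dispatch =
  | { action: "send"; sessionKey: string }
  | { action: "spawn" };

function planDispatch(sessions: Sessions, level: string): Dispatch {
  const existing = sessions[level];
  // Reuse the recorded session for this level when there is one;
  // otherwise a new session has to be spawned (and pays the context-loading cost).
  return existing ? { action: "send", sessionKey: existing } : { action: "spawn" };
}
```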
@@ -101,11 +101,11 @@ This is the deepest lesson from delegation theory: **good delegation isn't about
 Management research points to a few directions that could extend DevClaw's delegation model:
-**Progressive delegation.** Blanchard's model suggests increasing task complexity for delegates as they prove competent. DevClaw could track QA pass rates per model tier and automatically promote — if Haiku consistently passes QA on borderline tasks, start routing more work to it. This is how good managers develop their people, and it reduces cost over time.
+**Progressive delegation.** Blanchard's model suggests increasing task complexity for delegates as they prove competent. DevClaw could track QA pass rates per model level and automatically promote — if junior consistently passes QA on borderline tasks, start routing more work to it. This is how good managers develop their people, and it reduces cost over time.
 **Delegation authority expansion.** The Vroom-Yetton decision model maps when a leader should decide alone versus consulting the team. Currently, sub-agents have narrow authority — they execute tasks but can't restructure the backlog. Selectively expanding this (e.g., allowing a DEV agent to split a task it judges too large) would reduce orchestrator bottlenecks, mirroring how managers gradually give high-performers more autonomy.
-**Outcome-based learning.** Delegation research emphasizes that the _delegator_ learns from outcomes too. Aggregated metrics — QA fail rate by model tier, average cycles to Done, time-in-state distributions — would help both the orchestrator agent and the human calibrate their delegation patterns over time.
+**Outcome-based learning.** Delegation research emphasizes that the _delegator_ learns from outcomes too. Aggregated metrics — QA fail rate by model level, average cycles to Done, time-in-state distributions — would help both the orchestrator agent and the human calibrate their delegation patterns over time.
 ---
--- a/docs/ONBOARDING.md
+++ b/docs/ONBOARDING.md
@@ -1,18 +1,18 @@
 # DevClaw — Onboarding Guide
-## What you need before starting
+Step-by-step setup: install the plugin, configure an agent, register projects, and run your first task.
 ## Prerequisites
 | Requirement | Why | How to check |
 |---|---|---|
 | [OpenClaw](https://openclaw.ai) installed | DevClaw is an OpenClaw plugin | `openclaw --version` |
 | Node.js >= 20 | Runtime for plugin | `node --version` |
-| [`glab`](https://gitlab.com/gitlab-org/cli) or [`gh`](https://cli.github.com) CLI | Issue tracker provider (auto-detected from remote) | `glab --version` or `gh --version` |
+| [`gh`](https://cli.github.com) or [`glab`](https://gitlab.com/gitlab-org/cli) CLI | Issue tracker provider (auto-detected from git remote) | `gh --version` or `glab --version` |
-| CLI authenticated | Plugin calls glab/gh for every label transition | `glab auth status` or `gh auth status` |
+| CLI authenticated | Plugin calls gh/glab for every label transition | `gh auth status` or `glab auth status` |
-| A GitLab/GitHub repo with issues | The task backlog lives in the issue tracker | `glab issue list` or `gh issue list` from your repo |
+| A GitHub/GitLab repo with issues | The task backlog lives in the issue tracker | `gh issue list` or `glab issue list` from your repo |
-## Setup
+## Step 1: Install the plugin
 ### 1. Install the plugin
 ```bash
 # Copy to extensions directory (auto-discovered on next restart)
@@ -25,21 +25,21 @@ openclaw plugins list
 # Should show: DevClaw | devclaw | loaded
 ```
-### 2. Run setup
+## Step 2: Run setup
 There are three ways to set up DevClaw:
-#### Option A: Conversational onboarding (recommended)
+### Option A: Conversational onboarding (recommended)
-Call the `devclaw_onboard` tool from any agent that has the DevClaw plugin loaded. The agent will walk you through configuration step by step — asking about:
+Call the `onboard` tool from any agent that has the DevClaw plugin loaded. The agent walks you through configuration step by step — asking about:
 - Agent selection (current or create new)
 - Channel binding (telegram/whatsapp/none) — for new agents only
- Model tiers (accept defaults or customize)
+- Model levels (accept defaults or customize)
 - Optional project registration
 The tool returns instructions that guide the agent through the Q&A-style setup conversation.
-#### Option B: CLI wizard
+### Option B: CLI wizard
 ```bash
 openclaw devclaw setup
@@ -48,12 +48,13 @@ openclaw devclaw setup
 The setup wizard walks you through:
 1. **Agent** — Create a new orchestrator agent or configure an existing one
-2. **Developer team** — Choose which LLM model powers each developer tier:
+2. **Developer team** — Choose which LLM model powers each developer level:
-   - **Junior** (fast, cheap tasks) — default: `anthropic/claude-haiku-4-5`
+   - **DEV junior** (fast, cheap tasks) — default: `anthropic/claude-haiku-4-5`
-   - **Medior** (standard tasks) — default: `anthropic/claude-sonnet-4-5`
+   - **DEV medior** (standard tasks) — default: `anthropic/claude-sonnet-4-5`
-   - **Senior** (complex tasks) — default: `anthropic/claude-opus-4-5`
+   - **DEV senior** (complex tasks) — default: `anthropic/claude-opus-4-5`
-   - **QA** (code review) — default: `anthropic/claude-sonnet-4-5`
+   - **QA reviewer** (code review) — default: `anthropic/claude-sonnet-4-5`
-3. **Workspace** — Writes AGENTS.md, HEARTBEAT.md, role templates, and initializes memory
+   - **QA tester** (manual testing) — default: `anthropic/claude-haiku-4-5`
 3. **Workspace** — Writes AGENTS.md, HEARTBEAT.md, role templates, and initializes state
 Non-interactive mode:
 ```bash
@@ -66,45 +67,45 @@ openclaw devclaw setup --agent my-orchestrator \
  --senior "anthropic/claude-opus-4-5"
 ```
-#### Option C: Tool call (agent-driven)
+### Option C: Tool call (agent-driven)
 **Conversational onboarding via tool:**
 ```json
-devclaw_onboard({ mode: "first-run" })
+onboard({ "mode": "first-run" })
 ```
-The tool returns step-by-step instructions that guide the agent through the QA-style setup conversation.
+The tool returns step-by-step instructions that guide the agent through the setup conversation.
 **Direct setup (skip conversation):**
 ```json
-{
+setup({
  "newAgentName": "My Dev Orchestrator",
  "channelBinding": "telegram",
  "models": {
-    "junior": "anthropic/claude-haiku-4-5",
+    "dev": {
-    "senior": "anthropic/claude-opus-4-5"
+      "junior": "anthropic/claude-haiku-4-5",
      "senior": "anthropic/claude-opus-4-5"
    },
    "qa": {
      "reviewer": "anthropic/claude-sonnet-4-5"
    }
  }
-}
+})
 ```
-This calls `devclaw_setup` directly without conversational prompts.
+## Step 3: Channel binding (optional, for new agents)
-### 3. Channel binding (optional, for new agents)
+If you created a new agent during conversational onboarding and selected a channel binding (telegram/whatsapp), the agent is automatically bound. **Skip to step 4.**
 If you created a new agent during conversational onboarding and selected a channel binding (telegram/whatsapp), the agent is automatically bound and will receive messages from that channel. **Skip to step 4.**
 **Smart Migration**: If an existing agent already has a channel-wide binding (e.g., the old orchestrator receives all telegram messages), the onboarding agent will:
-1. Call `analyze_channel_bindings` to detect the conflict
+1. Detect the conflict
 2. Ask if you want to migrate the binding from the old agent to the new one
 3. If you confirm, the binding is automatically moved — no manual config edit needed
-This is useful when you're replacing an old orchestrator with a new one.
+If you didn't bind a channel during setup:
-If you didn't bind a channel during setup, you have two options:
+**Option A: Manually edit `openclaw.json`**
 **Option A: Manually edit `openclaw.json`** (for existing agents or post-creation binding)
 Add an entry to the `bindings` array:
 ```json
 {
  "bindings": [
@@ -136,131 +137,115 @@ Restart OpenClaw after editing.
 **Option B: Add bot to Telegram/WhatsApp group**
-If using a channel-wide binding (no peer filter), the agent will receive all messages from that channel. Add your orchestrator bot to the relevant Telegram group for the project.
+If using a channel-wide binding (no peer filter), the agent receives all messages from that channel. Add your orchestrator bot to the relevant Telegram group.
-### 4. Register your project
+## Step 4: Register your project
-Tell the orchestrator agent to register a new project:
+Go to the Telegram/WhatsApp group for the project and tell the orchestrator agent:
-> "Register project my-project at ~/git/my-project for group -1234567890 with base branch development"
+> "Register project my-project at ~/git/my-project with base branch development"
 The agent calls `project_register`, which atomically:
 - Validates the repo and auto-detects GitHub/GitLab from remote
 - Creates all 8 state labels (idempotent)
- Scaffolds prompt instruction files (`projects/prompts/<project>/dev.md` and `qa.md`)
+- Scaffolds role instruction files (`projects/roles/<project>/dev.md` and `qa.md`)
- Adds the project entry to `projects.json` with `autoChain: false`
+- Adds the project entry to `projects.json`
 - Logs the registration event
 **Initial state in `projects.json`:**
 ```json
 {
  "projects": {
    "-1234567890": {
      "name": "my-project",
      "repo": "~/git/my-project",
-      "groupName": "Dev - My Project",
+      "groupName": "Project: my-project",
      "deployUrl": "",
      "baseBranch": "development",
      "deployBranch": "development",
-      "autoChain": false,
+      "channel": "telegram",
      "roleExecution": "parallel",
      "dev": {
        "active": false,
        "issueId": null,
        "startTime": null,
-        "model": null,
+        "level": null,
        "sessions": { "junior": null, "medior": null, "senior": null }
      },
      "qa": {
        "active": false,
        "issueId": null,
        "startTime": null,
-        "model": null,
+        "level": null,
-        "sessions": { "qa": null }
+        "sessions": { "reviewer": null, "tester": null }
      }
    }
  }
 }
 ```
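
For readers who work with this file from code, the initial state above can be described with a small type. The sketch below is inferred from the JSON example only; `WorkerState`, `ProjectEntry`, and `emptyWorker` are hypothetical names, and the plugin's real types may carry more fields.

```typescript
// Inferred from the projects.json example above; not the plugin's own types.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  startTime: string | null;
  level: string | null;
  sessions: Record<string, string | null>;
}

interface ProjectEntry {
  name: string;
  repo: string;
  groupName: string;
  deployUrl: string;
  baseBranch: string;
  deployBranch: string;
  channel: string;
  roleExecution: string;
  dev: WorkerState;
  qa: WorkerState;
}

// Builds the idle worker state shown above for a given list of levels.
function emptyWorker(levels: string[]): WorkerState {
  const sessions: Record<string, string | null> = {};
  for (const level of levels) sessions[level] = null;
  return { active: false, issueId: null, startTime: null, level: null, sessions };
}
```

Calling `emptyWorker(["junior", "medior", "senior"])` gives the `dev` block from the example, and `emptyWorker(["reviewer", "tester"])` gives the `qa` block.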
-**Manual fallback:** If you prefer CLI control, you can still create labels manually with `glab label create` and edit `projects.json` directly. See the [Architecture docs](ARCHITECTURE.md) for label names and colors.
+**Finding the Telegram group ID:** The group ID is the numeric ID of your Telegram supergroup (a negative number like `-1234567890`). When you call `project_register` from within the group, the ID is auto-detected from context.
-**Finding the Telegram group ID:** The group ID is the numeric ID of your Telegram supergroup (a negative number like `-1234567890`). You can find it via the Telegram bot API or from message metadata in OpenClaw logs.
+## Step 5: Create your first issue
 ### 5. Create your first issue
 Issues can be created in multiple ways:
 - **Via the agent** — Ask the orchestrator in the Telegram group: "Create an issue for adding a login page" (uses `task_create`)
 - **Via workers** — DEV/QA workers can call `task_create` to file follow-up bugs they discover
- **Via CLI** — `cd ~/git/my-project && glab issue create --title "My first task" --label "To Do"` (or `gh issue create`)
+- **Via CLI** — `cd ~/git/my-project && gh issue create --title "My first task" --label "To Do"` (or `glab issue create`)
 - **Via web UI** — Create an issue and add the "To Do" label
-### 6. Test the pipeline
+Note: `task_create` defaults to the "Planning" label. Use "To Do" explicitly when the task is ready for immediate work.
 ## Step 6: Test the pipeline
 Ask the agent in the Telegram group:
 > "Check the queue status"
-The agent should call `queue_status` and report the "To Do" issue. Then:
+The agent should call `status` and report the "To Do" issue. Then:
 > "Pick up issue #1 for DEV"
-The agent calls `task_pickup`, which assigns a developer tier, transitions the label to "Doing", creates or reuses a worker session, and dispatches the task — all in one call. The agent just posts the announcement.
+The agent calls `work_start`, which assigns a developer level, transitions the label to "Doing", creates or reuses a worker session, and dispatches the task — all in one call. The agent posts the announcement.
 ## Adding more projects
-Tell the agent to register a new project (step 3) and add the bot to the new Telegram group (step 4). That's it — `project_register` handles labels and state setup.
+Tell the agent to register a new project (step 4) from within the new project's Telegram group. That's it — `project_register` handles labels and state setup.
 Each project is fully isolated — separate queue, separate workers, separate state.
-## Developer tiers
+## Developer levels
-DevClaw assigns tasks to developer tiers instead of raw model names. This makes the system intuitive — you're assigning a "junior dev" to fix a typo, not configuring model parameters.
+DevClaw assigns tasks to developer levels instead of raw model names. This makes the system intuitive — you're assigning a "junior dev" to fix a typo, not configuring model parameters.
-| Tier | Role | Default model | When to assign |
+| Role | Level | Default model | When to assign |
-|------|------|---------------|----------------|
+|------|-------|---------------|----------------|
-| **junior** | Junior developer | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, CSS changes |
+| DEV | **junior** | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, CSS changes |
-| **medior** | Mid-level developer | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
+| DEV | **medior** | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
-| **senior** | Senior developer | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
+| DEV | **senior** | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
-| **qa** | QA engineer | `anthropic/claude-sonnet-4-5` | Code review, test validation |
+| QA | **reviewer** | `anthropic/claude-sonnet-4-5` | Code review, test validation |
 | QA | **tester** | `anthropic/claude-haiku-4-5` | Manual testing, smoke tests |
-Change which model powers each tier in `openclaw.json`:
+Change which model powers each level in `openclaw.json` — see [Configuration](CONFIGURATION.md#model-tiers).
 ```json
 {
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "models": {
            "junior": "anthropic/claude-haiku-4-5",
            "medior": "anthropic/claude-sonnet-4-5",
            "senior": "anthropic/claude-opus-4-5",
            "qa": "anthropic/claude-sonnet-4-5"
          }
        }
      }
    }
  }
 }
 ```
 ## What the plugin handles vs. what you handle
 | Responsibility | Who | Details |
 |---|---|---|
 | Plugin installation | You (once) | `cp -r devclaw ~/.openclaw/extensions/` |
-| Agent + workspace setup | Plugin (`devclaw_setup`) | Creates agent, configures models, writes workspace files |
+| Agent + workspace setup | Plugin (`setup`) | Creates agent, configures models, writes workspace files |
-| Channel binding analysis | Plugin (`analyze_channel_bindings`) | Detects channel conflicts, validates channel configuration |
+| Channel binding migration | Plugin (`setup` with `migrateFrom`) | Automatically moves channel-wide bindings between agents |
-| Channel binding migration | Plugin (`devclaw_setup` with `migrateFrom`) | Automatically moves channel-wide bindings between agents |
+| Label setup | Plugin (`project_register`) | 8 labels, created idempotently via IssueProvider |
-| Label setup | Plugin (`project_register`) | 8 labels, created idempotently via `IssueProvider` |
+| Prompt file scaffolding | Plugin (`project_register`) | Creates `projects/roles/<project>/dev.md` and `qa.md` |
 | Prompt file scaffolding | Plugin (`project_register`) | Creates `projects/prompts/<project>/dev.md` and `qa.md` |
 | Project registration | Plugin (`project_register`) | Entry in `projects.json` with empty worker state |
 | Telegram group setup | You (once per project) | Add bot to group |
 | Issue creation | Plugin (`task_create`) | Orchestrator or workers create issues from chat |
-| Label transitions | Plugin | Atomic label transitions via issue tracker CLI |
+| Label transitions | Plugin | Atomic transitions via issue tracker CLI |
-| Developer assignment | Plugin | LLM-selected tier by orchestrator, keyword heuristic fallback |
+| Developer assignment | Plugin | LLM-selected level by orchestrator, keyword heuristic fallback |
 | State management | Plugin | Atomic read/write to `projects.json` |
 | Session management | Plugin | Creates, reuses, and dispatches to sessions via CLI. Agent never touches session tools. |
-| Task completion | Plugin (`task_complete`) | Workers self-report. Auto-chains if enabled. |
+| Task completion | Plugin (`work_finish`) | Workers self-report. Scheduler dispatches next role. |
-| Prompt instructions | Plugin (`task_pickup`) | Loaded from `projects/prompts/<project>/<role>.md`, appended to task message |
+| Prompt instructions | Plugin (`work_start`) | Loaded from `projects/roles/<project>/<role>.md`, appended to task message |
 | Audit logging | Plugin | Automatic NDJSON append per tool call |
-| Zombie detection | Plugin | `session_health` checks active vs alive |
+| Zombie detection | Plugin | `health` checks active vs alive |
-| Queue scanning | Plugin | `queue_status` queries issue tracker per project |
+| Queue scanning | Plugin | `status` queries issue tracker per project |
--- a/docs/QA_WORKFLOW.md
+++ b/docs/QA_WORKFLOW.md
@@ -1,8 +1,6 @@
-# QA Workflow
+# DevClaw — QA Workflow
-## Overview
+Quality Assurance in DevClaw follows a structured workflow that ensures every review is documented and traceable.
 Quality Assurance (QA) in DevClaw follows a structured workflow that ensures every review is documented and traceable.
 ## Required Steps
@@ -28,10 +26,10 @@ task_comment({
 ### 3. Complete the Task
-After posting your comment, call `task_complete`:
+After posting your comment, call `work_finish`:
 ```javascript
-task_complete({
+work_finish({
  role: "qa",
  projectGroupId: "<group-id>",
  result: "pass",  // or "fail", "refine", "blocked"
@@ -39,15 +37,24 @@ task_complete({
 })
 ```
 ## QA Results
 | Result | Label transition | Meaning |
 |---|---|---|
 | `"pass"` | Testing → Done | Approved. Issue closed. |
 | `"fail"` | Testing → To Improve | Issues found. Issue reopened, sent back to DEV. |
 | `"refine"` | Testing → Refining | Needs human decision. Pipeline pauses. |
 | `"blocked"` | Testing → To Test | Cannot complete (env issues, etc.). Returns to QA queue. |
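
The results table maps directly onto a small transition table. The sketch below restates it in TypeScript. `QA_TRANSITIONS`, `nextLabel`, and `pausesPipeline` are illustrative names and are not part of the DevClaw API.

```typescript
type QaResult = "pass" | "fail" | "refine" | "blocked";

// One entry per row of the table above; every QA result starts from "Testing".
const QA_TRANSITIONS: Record<QaResult, { from: "Testing"; to: string }> = {
  pass: { from: "Testing", to: "Done" },
  fail: { from: "Testing", to: "To Improve" },
  refine: { from: "Testing", to: "Refining" },
  blocked: { from: "Testing", to: "To Test" },
};

function nextLabel(result: QaResult): string {
  return QA_TRANSITIONS[result].to;
}

// Per the table, only "refine" waits for a human decision.
function pausesPipeline(result: QaResult): boolean {
  return result === "refine";
}
```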
 ## Why Comments Are Required
-1. **Audit Trail**: Every review decision is documented
+1. **Audit Trail** — Every review decision is documented in the issue tracker
-2. **Knowledge Sharing**: Future reviewers understand what was tested
+2. **Knowledge Sharing** — Future reviewers understand what was tested
-3. **Quality Metrics**: Enables tracking of test coverage
+3. **Quality Metrics** — Enables tracking of test coverage
-4. **Debugging**: When issues arise later, we know what was checked
+4. **Debugging** — When issues arise later, we know what was checked
-5. **Compliance**: Some projects require documented QA evidence
+5. **Compliance** — Some projects require documented QA evidence
-## Comment Template
+## Comment Templates
 ### For Passing Reviews
@@ -72,15 +79,14 @@ task_complete({
 ### For Failing Reviews
 ```markdown
-## QA Review - Issues Found
+## QA Review — Issues Found
 **Tested:**
 - [What you tested]
 **Issues Found:**
 1. [Issue description with steps to reproduce]
-2. [Issue description with steps to reproduce]
+2. [Issue description with expected vs actual behavior]
 3. [Issue description with expected vs actual behavior]
 **Environment:**
 - [Test environment details]
@@ -90,25 +96,25 @@ task_complete({
 ## Enforcement
-As of [current date], QA workers are instructed via role templates to:
+QA workers receive instructions via role templates to:
- Always call `task_comment` BEFORE `task_complete`
+- Always call `task_comment` BEFORE `work_finish`
 - Include specific details about what was tested
 - Document results, environment, and any notes
 Prompt templates affected:
- `projects/prompts/<project>/qa.md`
+- `projects/roles/<project>/qa.md`
 - All project-specific QA templates should follow this pattern
 ## Best Practices
-1. **Be Specific**: Don't just say "tested the feature" - list what you tested
+1. **Be Specific** — Don't just say "tested the feature" — list what you tested
-2. **Include Environment**: Version numbers, browser, OS can matter
+2. **Include Environment** — Version numbers, browser, OS can matter
-3. **Document Edge Cases**: If you tested special scenarios, note them
+3. **Document Edge Cases** — If you tested special scenarios, note them
-4. **Use Screenshots**: For UI issues, screenshots help (link in comment)
+4. **Reference Requirements** — Link back to acceptance criteria from the issue
-5. **Reference Requirements**: Link back to acceptance criteria from the issue
+5. **Use Screenshots** — For UI issues, screenshots help (link in comment)
 ## Related
- Issue #103: Enforce QA comment on every review (pass or fail)
+- Tool: [`task_comment`](TOOLS.md#task_comment) — Add comments to issues
- Tool: `task_comment` - Add comments to issues
+- Tool: [`work_finish`](TOOLS.md#work_finish) — Complete QA tasks
- Tool: `task_complete` - Complete QA tasks
+- Config: [`projects/roles/<project>/qa.md`](CONFIGURATION.md#role-instruction-files) — QA role instructions
--- a/docs/ROADMAP.md
+++ b/docs/ROADMAP.md
@@ -15,35 +15,35 @@ This works for the common case but breaks down when you want:
 Roles become a configurable list instead of a hardcoded pair. Each role defines:
 - **Name** — e.g. `design`, `dev`, `qa`, `devops`
- **Tiers** — which developer tiers can be assigned (e.g. design only needs `medior`)
+- **Levels** — which developer levels can be assigned (e.g. design only needs `medior`)
 - **Pipeline position** — where it sits in the task lifecycle
 - **Worker count** — how many concurrent workers (default: 1)
 ```json
 {
  "roles": {
-    "dev": { "tiers": ["junior", "medior", "senior"], "workers": 1 },
+    "dev": { "levels": ["junior", "medior", "senior"], "workers": 1 },
-    "qa": { "tiers": ["qa"], "workers": 1 },
+    "qa": { "levels": ["reviewer", "tester"], "workers": 1 },
-    "devops": { "tiers": ["medior", "senior"], "workers": 1 }
+    "devops": { "levels": ["medior", "senior"], "workers": 1 }
  },
  "pipeline": ["dev", "qa", "devops"]
 }
 ```
-The pipeline definition replaces the hardcoded `Doing → To Test → Testing → Done` flow. Labels and transitions are generated from the pipeline config. Auto-chaining follows the pipeline order.
+The pipeline definition replaces the hardcoded `Doing → To Test → Testing → Done` flow. Labels and transitions are generated from the pipeline config. The scheduler follows the pipeline order when filling free slots.
 ### Open questions
 - How do custom labels map? Generate from role names, or let users define?
- Should roles have their own instruction files (`projects/prompts/<project>/<role>.md`) — yes, this already works
+- Should roles have their own instruction files (`projects/roles/<project>/<role>.md`)? Yes, this already works.
 - How to handle parallel roles (e.g. frontend + backend DEV in parallel before QA)?
 ---
-## Channel-agnostic groups
+## Channel-agnostic Groups
 Currently DevClaw maps projects to **Telegram group IDs**. The `projectGroupId` is a Telegram-specific negative number. This means:
- WhatsApp groups can't be used as project channels
+- WhatsApp groups can't be used as project channels (partially supported now via `channel` field)
 - Discord, Slack, or other channels are excluded
 - The naming (`groupId`, `groupName`) is Telegram-specific
@@ -77,19 +77,20 @@ Key changes:
 - All tool params, state keys, and docs updated accordingly
 - Backward compatible: existing Telegram-only keys migrated on read
-This enables any OpenClaw channel (Telegram, WhatsApp, Discord, Slack, etc.) to host a project — each group chat becomes an autonomous dev team regardless of platform.
+This enables any OpenClaw channel (Telegram, WhatsApp, Discord, Slack, etc.) to host a project.
 ### Open questions
 - Should one project be bindable to multiple channels? (e.g. Telegram for devs, WhatsApp for stakeholder updates)
- How does the orchestrator agent handle cross-channel context? (OpenClaw bindings already route by channel)
+- How does the orchestrator agent handle cross-channel context?
 ---
-## Other ideas
+## Other Ideas
 - **Jira provider** — `IssueProvider` interface already abstracts GitHub/GitLab; Jira is the obvious next addition
- **Deployment integration** — `task_complete` QA pass could trigger a deploy step via webhook or CLI
+- **Deployment integration** — `work_finish` QA pass could trigger a deploy step via webhook or CLI
- **Cost tracking** — log token usage per task/tier, surface in `queue_status`
+- **Cost tracking** — log token usage per task/level, surface in `status`
 - **Priority scoring** — automatic priority assignment based on labels, age, and dependencies
 - **Session archival** — auto-archive idle sessions after configurable timeout (currently indefinite)
 - **Progressive delegation** — track QA pass rates per level and auto-promote (see [Management Theory](MANAGEMENT.md))
--- a/docs/TESTING.md
+++ b/docs/TESTING.md
@@ -59,10 +59,15 @@ npm run test:ui
      "devclaw": {
        "config": {
          "models": {
-            "junior": "anthropic/claude-haiku-4-5",
+            "dev": {
-            "medior": "anthropic/claude-sonnet-4-5",
+              "junior": "anthropic/claude-haiku-4-5",
-            "senior": "anthropic/claude-opus-4-5",
+              "medior": "anthropic/claude-sonnet-4-5",
-            "qa": "anthropic/claude-sonnet-4-5"
+              "senior": "anthropic/claude-opus-4-5"
            },
            "qa": {
              "reviewer": "anthropic/claude-sonnet-4-5",
              "tester": "anthropic/claude-haiku-4-5"
            }
          }
        }
      }
--- a/docs/TOOLS.md
+++ b/docs/TOOLS.md
@@ -0,0 +1,361 @@
 # DevClaw — Tools Reference
 Complete reference for all 11 tools registered by DevClaw. See [`index.ts`](../index.ts) for registration.
 ## Worker Lifecycle
 ### `work_start`
 Pick up a task from the issue queue. Handles level assignment, label transition, session creation/reuse, task dispatch, and audit logging — all in one call.
 **Source:** [`lib/tools/work-start.ts`](../lib/tools/work-start.ts)
 **Context:** Only works in project group chats.
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `issueId` | number | No | Issue ID. If omitted, picks next by priority. |
 | `role` | `"dev"` \| `"qa"` | No | Worker role. Auto-detected from issue label if omitted. |
 | `projectGroupId` | string | No | Project group ID. Auto-detected from group context. |
 | `level` | string | No | Worker level (`junior`, `medior`, `senior` for DEV; `reviewer`, `tester` for QA). Auto-detected if omitted. |
 **What it does atomically:**
 1. Resolves project from `projects.json`
 2. Validates no active worker for this role
 3. Fetches issue from tracker, verifies correct label state
 4. Assigns level (LLM-chosen via `level` param → label detection → keyword heuristic fallback)
 5. Resolves level to model ID via config or defaults
 6. Loads prompt instructions from `projects/roles/<project>/<role>.md`
 7. Looks up existing session for assigned level (session-per-level)
 8. Transitions label (e.g. `To Do` → `Doing`)
 9. Creates session via Gateway RPC if new (`sessions.patch`)
 10. Dispatches task to worker session via CLI (`openclaw gateway call agent`)
 11. Updates `projects.json` state (active, issueId, level, session key)
 12. Writes audit log entries (work_start + model_selection)
 13. Sends notification
 14. Returns announcement text
 **Level selection priority:**
 1. `level` parameter (LLM-selected) — highest priority
 2. Issue label (e.g. a label named "junior" or "senior")
 3. Keyword heuristic from `model-selector.ts` — fallback
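The priority order above can be sketched as a single function. This is an illustration only: the keyword lists in `heuristicLevel` are invented placeholders, and the real heuristic in `model-selector.ts` may work differently.

```typescript
type Level = "junior" | "medior" | "senior" | "reviewer" | "tester";

const LEVELS: readonly Level[] = ["junior", "medior", "senior", "reviewer", "tester"];

function isLevel(value: string): value is Level {
  return (LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical stand-in for the keyword heuristic; the keywords are assumptions.
function heuristicLevel(title: string): Level {
  const text = title.toLowerCase();
  if (/\b(architecture|migration|security)\b/.test(text)) return "senior";
  if (/\b(typo|rename|copy)\b/.test(text)) return "junior";
  return "medior";
}

function selectLevel(param: string | undefined, labels: string[], title: string): Level {
  // 1. An explicit, valid `level` parameter wins.
  if (param !== undefined && isLevel(param)) return param;
  // 2. Otherwise use the first issue label that names a level.
  const fromLabel = labels.map((l) => l.toLowerCase()).find(isLevel);
  if (fromLabel !== undefined) return fromLabel;
  // 3. Fall back to the keyword heuristic.
  return heuristicLevel(title);
}
```

In this sketch an unrecognized `level` value is ignored and selection falls through to the next step. Whether the real tool rejects such a value instead is not stated here.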
 **Execution guards:**
 - Rejects if role already has an active worker
 - Respects `roleExecution` (sequential: rejects if other role is active)
 **On failure:** Rolls back label transition. No orphaned state.
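The rollback guarantee follows a common pattern: apply the label change, attempt the dispatch, and reverse the change if the dispatch throws. A simplified sketch, assuming a hypothetical `LabelTracker` interface (the real provider API is not shown in this document, and real calls would be asynchronous):

```typescript
// Hypothetical tracker interface, synchronous for brevity.
interface LabelTracker {
  transition(issueId: number, from: string, to: string): void;
}

function withLabelRollback<T>(
  tracker: LabelTracker,
  issueId: number,
  from: string,
  to: string,
  dispatch: () => T
): T {
  tracker.transition(issueId, from, to);
  try {
    return dispatch();
  } catch (err) {
    // Undo the label change so the issue returns to its previous queue.
    tracker.transition(issueId, to, from);
    throw err;
  }
}
```

The error is rethrown after the rollback, so the caller still sees the original failure.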
 ---
 ### `work_finish`
 Complete a task with a result. Called by workers (DEV/QA sub-agent sessions) directly, or by the orchestrator.
 **Source:** [`lib/tools/work-finish.ts`](../lib/tools/work-finish.ts)
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `role` | `"dev"` \| `"qa"` | Yes | Worker role |
 | `result` | string | Yes | Completion result (see table below) |
 | `projectGroupId` | string | Yes | Project group ID |
 | `summary` | string | No | Brief summary for the announcement |
 | `prUrl` | string | No | PR/MR URL (auto-detected if omitted) |
 **Valid results by role:**
 | Role | Result | Label transition | Side effects |
 |---|---|---|---|
 | DEV | `"done"` | Doing → To Test | git pull, auto-detect PR URL |
 | DEV | `"blocked"` | Doing → To Do | Task returns to queue |
 | QA | `"pass"` | Testing → Done | Issue closed |
 | QA | `"fail"` | Testing → To Improve | Issue reopened |
 | QA | `"refine"` | Testing → Refining | Awaits human decision |
 | QA | `"blocked"` | Testing → To Test | Task returns to QA queue |
 **What it does atomically:**
 1. Validates role:result combination
 2. Resolves project and active worker
 3. Executes completion via pipeline service (label transition + side effects)
 4. Deactivates worker (sessions map preserved for reuse)
 5. Sends notification
 6. Ticks queue to fill free worker slots
 7. Writes audit log
 **Scheduling:** After completion, `work_finish` ticks the queue. The scheduler sees the new label (`To Test` or `To Improve`) and dispatches the next worker if a slot is free.
 ---
 ## Task Management
 ### `task_create`
 Create a new issue in the project's issue tracker.
 **Source:** [`lib/tools/task-create.ts`](../lib/tools/task-create.ts)
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | Yes | Project group ID |
 | `title` | string | Yes | Issue title |
 | `description` | string | No | Full issue body (markdown) |
 | `label` | StateLabel | No | State label. Defaults to `"Planning"`. |
 | `assignees` | string[] | No | GitHub/GitLab usernames to assign |
 | `pickup` | boolean | No | If true, immediately pick up for DEV after creation |
 **Use cases:**
 - Orchestrator creates tasks from chat messages
 - Workers file follow-up bugs discovered during development
 - Breaking down epics into smaller tasks
 **Default behavior:** Creates issues in `"Planning"` state. Only use `"To Do"` when the user explicitly requests immediate work.
 ---
 ### `task_update`
 Change an issue's state label manually without going through the full pickup/complete flow.
 **Source:** [`lib/tools/task-update.ts`](../lib/tools/task-update.ts)
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | Yes | Project group ID |
 | `issueId` | number | Yes | Issue ID to update |
 | `state` | StateLabel | Yes | New state label |
 | `reason` | string | No | Audit log reason for the change |
 **Valid states:** `Planning`, `To Do`, `Doing`, `To Test`, `Testing`, `Done`, `To Improve`, `Refining`
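For callers that build `state` values from user input, the eight states can be expressed as a string-literal union with a runtime guard. This is a sketch; the actual `StateLabel` definition in the codebase may be written differently.

```typescript
// The eight states listed above, in pipeline order.
const STATE_LABELS = [
  "Planning",
  "To Do",
  "Doing",
  "To Test",
  "Testing",
  "Done",
  "To Improve",
  "Refining",
] as const;

type StateLabel = (typeof STATE_LABELS)[number];

// Runtime guard: labels are matched exactly, including case and spaces.
function isStateLabel(value: string): value is StateLabel {
  return (STATE_LABELS as readonly string[]).indexOf(value) !== -1;
}
```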
 **Use cases:**
 - Manual state adjustments (e.g. `Planning → To Do` after approval)
 - Failed auto-transitions that need correction
 - Bulk state changes by orchestrator
 ---
 ### `task_comment`
 Add a comment to an issue for feedback, notes, or discussion.
 **Source:** [`lib/tools/task-comment.ts`](../lib/tools/task-comment.ts)
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | Yes | Project group ID |
 | `issueId` | number | Yes | Issue ID to comment on |
 | `body` | string | Yes | Comment body (markdown) |
 | `authorRole` | `"dev"` \| `"qa"` \| `"orchestrator"` | No | Attribution role prefix |
 **Use cases:**
 - QA adds review feedback before pass/fail decision
 - DEV posts implementation notes or progress updates
 - Orchestrator adds summary comments
 When `authorRole` is provided, the comment is prefixed with a role emoji and attribution label.
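A rough sketch of that prefixing, with the caveat that the emoji and label text below are placeholders rather than the strings DevClaw actually emits:

```typescript
type AuthorRole = "dev" | "qa" | "orchestrator";

// Placeholder prefixes; the real emoji and wording are assumptions.
const ROLE_PREFIX: Record<AuthorRole, string> = {
  dev: "🔧 **DEV**",
  qa: "🔍 **QA**",
  orchestrator: "📋 **Orchestrator**",
};

function formatComment(body: string, authorRole?: AuthorRole): string {
  // Without a role the body is posted unchanged.
  return authorRole ? `${ROLE_PREFIX[authorRole]}: ${body}` : body;
}
```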
 ---
 ## Operations
 ### `status`
 Lightweight queue + worker state dashboard.
 **Source:** [`lib/tools/status.ts`](../lib/tools/status.ts)
 **Context:** Auto-filters to project in group chats. Shows all projects in DMs.
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | No | Filter to specific project. Omit for all. |
 **Returns per project:**
 - Worker state: active/idle, current issue, level, start time
 - Queue counts: To Do, To Test, To Improve
 - Role execution mode
 ---
 ### `health`
 Worker health scan with optional auto-fix.
 **Source:** [`lib/tools/health.ts`](../lib/tools/health.ts)
 **Context:** Auto-filters to project in group chats.
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | No | Filter to specific project. Omit for all. |
 | `fix` | boolean | No | Apply fixes for detected issues. Default: `false` (read-only). |
 | `activeSessions` | string[] | No | Active session IDs for zombie detection. |
 **Health checks:**
 | Issue | Severity | Detection | Auto-fix |
 |---|---|---|---|
 | Active worker with no session key | Critical | `active=true` but no session in map | Deactivate worker |
 | Active worker whose session is dead | Critical | Session key not in active sessions list | Deactivate worker, revert label |
 | Worker active >2 hours | Warning | `startTime` older than 2h | Deactivate worker, revert label to queue |
 | Inactive worker with lingering issue ID | Warning | `active=false` but `issueId` still set | Clear issueId |
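The four checks in the table translate into a small pure function. The sketch below assumes a `WorkerState` shape with a single `sessionKey` field and a numeric `startTime`. Those field names and the finding identifiers are illustrative, and the fix step is omitted.

```typescript
// Assumed worker shape; the real state in projects.json may differ.
interface WorkerState {
  active: boolean;
  issueId: number | null;
  sessionKey: string | null;
  startTime: number | null; // epoch milliseconds
}

interface Finding {
  issue: string;
  severity: "critical" | "warning";
}

const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // 2 hours

function checkWorker(w: WorkerState, now: number, activeSessions?: string[]): Finding[] {
  const findings: Finding[] = [];
  if (w.active) {
    if (w.sessionKey === null) {
      findings.push({ issue: "active_without_session", severity: "critical" });
    } else if (activeSessions !== undefined && activeSessions.indexOf(w.sessionKey) === -1) {
      // Zombie detection only runs when the caller supplies the live session list.
      findings.push({ issue: "dead_session", severity: "critical" });
    }
    if (w.startTime !== null && now - w.startTime > STALE_AFTER_MS) {
      findings.push({ issue: "stale_worker", severity: "warning" });
    }
  } else if (w.issueId !== null) {
    findings.push({ issue: "lingering_issue_id", severity: "warning" });
  }
  return findings;
}
```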
 ---
 ### `work_heartbeat`
 Manual trigger for heartbeat: health fix + queue dispatch. Same logic as the background heartbeat service, but invoked on demand.
 **Source:** [`lib/tools/work-heartbeat.ts`](../lib/tools/work-heartbeat.ts)
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | No | Target single project. Omit for all. |
 | `dryRun` | boolean | No | Report only, don't dispatch. Default: `false`. |
 | `maxPickups` | number | No | Max worker dispatches per tick. |
 | `activeSessions` | string[] | No | Active session IDs for zombie detection. |
 **Two-pass sweep:**
 1. **Health pass** — Runs `checkWorkerHealth` per project per role. Auto-fixes zombies, stale workers, orphaned state.
 2. **Tick pass** — Calls `projectTick` per project. Fills free worker slots by priority (To Improve > To Test > To Do).
 **Execution guards:**
 - `projectExecution: "sequential"` — only one project active at a time
 - `roleExecution: "sequential"` — only one role (DEV or QA) active at a time per project (enforced in `projectTick`)
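Both guards reduce to simple predicates. The following sketch assumes hypothetical function names and input shapes. It shows the decision logic only, not how `projectTick` gathers the inputs.

```typescript
type Role = "dev" | "qa";
type ExecutionMode = "parallel" | "sequential";

// A role can start only if its own slot is free and, in sequential mode,
// the other role is idle as well.
function canStartRole(role: Role, active: Record<Role, boolean>, roleExecution: ExecutionMode): boolean {
  if (active[role]) return false;
  if (roleExecution === "sequential") {
    const other: Role = role === "dev" ? "qa" : "dev";
    if (active[other]) return false;
  }
  return true;
}

// In sequential project mode, a project may dispatch only when no other
// project currently has an active worker.
function canStartProject(projectId: string, activeProjects: string[], projectExecution: ExecutionMode): boolean {
  if (projectExecution === "parallel") return true;
  return activeProjects.every((id) => id === projectId);
}
```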
 ---
 ## Setup
 ### `project_register`
 One-time project setup. Creates state labels, scaffolds prompt files, adds project to state.
 **Source:** [`lib/tools/project-register.ts`](../lib/tools/project-register.ts)
 **Context:** Only works in the Telegram/WhatsApp group being registered.
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `projectGroupId` | string | No | Auto-detected from current group if omitted |
 | `name` | string | Yes | Short project name (e.g. `my-webapp`) |
 | `repo` | string | Yes | Path to git repo (e.g. `~/git/my-project`) |
 | `groupName` | string | No | Display name. Defaults to `Project: {name}`. |
 | `baseBranch` | string | Yes | Base branch for development |
 | `deployBranch` | string | No | Deploy branch. Defaults to baseBranch. |
 | `deployUrl` | string | No | Deployment URL |
 | `roleExecution` | `"parallel"` \| `"sequential"` | No | DEV/QA parallelism. Default: `"parallel"`. |
 **What it does atomically:**
 1. Validates project not already registered
 2. Resolves repo path, auto-detects GitHub/GitLab from git remote
 3. Verifies provider health (CLI installed and authenticated)
 4. Creates all 8 state labels (idempotent — safe to run again)
 5. Adds project entry to `projects.json` with empty worker state
   - DEV sessions: `{ junior: null, medior: null, senior: null }`
   - QA sessions: `{ reviewer: null, tester: null }`
 6. Scaffolds prompt files: `projects/roles/<project>/dev.md` and `qa.md`
 7. Writes audit log
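The empty worker state from step 5 might look like the following. Only the two session maps are taken from this document. The remaining field names (`active`, `issueId`, `level`, `startTime`) are inferred from other sections and should be treated as assumptions about the `projects.json` layout.

```typescript
type DevLevel = "junior" | "medior" | "senior";
type QaLevel = "reviewer" | "tester";

// Assumed per-role worker record: one session slot per level.
interface WorkerState<L extends string> {
  active: boolean;
  issueId: number | null;
  level: L | null;
  startTime: string | null;
  sessions: Record<L, string | null>;
}

interface ProjectWorkers {
  dev: WorkerState<DevLevel>;
  qa: WorkerState<QaLevel>;
}

function emptyWorkers(): ProjectWorkers {
  return {
    dev: {
      active: false,
      issueId: null,
      level: null,
      startTime: null,
      sessions: { junior: null, medior: null, senior: null },
    },
    qa: {
      active: false,
      issueId: null,
      level: null,
      startTime: null,
      sessions: { reviewer: null, tester: null },
    },
  };
}
```

Keeping one session slot per level is what allows a later `work_start` to reuse an existing session for the same level instead of creating a new one.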
 ---
 ### `setup`
 Agent + workspace initialization.
 **Source:** [`lib/tools/setup.ts`](../lib/tools/setup.ts)
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `newAgentName` | string | No | Create a new agent. Omit to configure current workspace. |
 | `channelBinding` | `"telegram"` \| `"whatsapp"` | No | Channel to bind (with `newAgentName` only) |
 | `migrateFrom` | string | No | Agent ID to migrate channel binding from |
 | `models` | object | No | Model overrides per role and level (see [Configuration](CONFIGURATION.md#model-tiers)) |
 | `projectExecution` | `"parallel"` \| `"sequential"` | No | Project execution mode |
 **What it does:**
 1. Creates a new agent or configures existing workspace
 2. Optionally binds messaging channel (Telegram/WhatsApp)
 3. Optionally migrates channel binding from another agent
 4. Writes workspace files: AGENTS.md, HEARTBEAT.md, `projects/projects.json`
 5. Configures model tiers in `openclaw.json`
 ---
 ### `onboard`
 Conversational onboarding guide. Returns step-by-step instructions for the agent to walk the user through setup.
 **Source:** [`lib/tools/onboard.ts`](../lib/tools/onboard.ts)
 **Context:** Works in DMs and via-agent. Blocks group chats (setup should not happen in project groups).
 **Parameters:**
 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `mode` | `"first-run"` \| `"reconfigure"` | No | Auto-detected from current state |
 **Flow:**
 1. Call `onboard`, which returns Q&A-style step-by-step instructions
 2. Agent walks user through: agent selection, channel binding, model tiers
 3. Agent calls `setup` with collected answers
 4. User registers projects via `project_register` in group chats
 ---
 ## Completion Rules Reference
 The pipeline service (`lib/services/pipeline.ts`) defines declarative completion rules:
 ```
 dev:done    → Doing    → To Test     (git pull, detect PR)
 dev:blocked → Doing    → To Do       (return to queue)
 qa:pass     → Testing  → Done        (close issue)
 qa:fail     → Testing  → To Improve  (reopen issue)
 qa:refine   → Testing  → Refining    (await human decision)
 qa:blocked  → Testing  → To Test     (return to QA queue)
 ```
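The same rules can be written as a lookup table keyed by `role:result`, which also makes invalid combinations easy to reject. This is a sketch of the idea; the actual structure in `lib/services/pipeline.ts` is not shown in this document and may differ.

```typescript
type Role = "dev" | "qa";

interface CompletionRule {
  from: string;
  to: string;
  sideEffect: string;
}

// One entry per rule listed above.
const COMPLETION_RULES: Record<string, CompletionRule> = {
  "dev:done": { from: "Doing", to: "To Test", sideEffect: "git pull, detect PR" },
  "dev:blocked": { from: "Doing", to: "To Do", sideEffect: "return to queue" },
  "qa:pass": { from: "Testing", to: "Done", sideEffect: "close issue" },
  "qa:fail": { from: "Testing", to: "To Improve", sideEffect: "reopen issue" },
  "qa:refine": { from: "Testing", to: "Refining", sideEffect: "await human decision" },
  "qa:blocked": { from: "Testing", to: "To Test", sideEffect: "return to QA queue" },
};

function resolveCompletion(role: Role, result: string): CompletionRule {
  const key = `${role}:${result}`;
  if (!Object.prototype.hasOwnProperty.call(COMPLETION_RULES, key)) {
    throw new Error(`Invalid result "${result}" for role "${role}"`);
  }
  return COMPLETION_RULES[key];
}
```

For example, `resolveCompletion("qa", "fail")` returns the `Testing` to `To Improve` rule, while `resolveCompletion("dev", "pass")` throws because `pass` is a QA-only result.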
 ## Issue Priority Order
 When the heartbeat or `work_heartbeat` fills free worker slots, issues are prioritized:
 1. **To Improve** — QA failures get fixed first (highest priority)
 2. **To Test** — Completed DEV work gets reviewed next
 3. **To Do** — Fresh tasks are picked up last
 This ensures work already in the pipeline (QA rework and pending reviews) is cleared before new tasks are started.
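A sketch of that ordering, assuming a flat list of queued issues and a list of roles with a free slot. The mapping of `To Improve` and `To Do` to DEV and `To Test` to QA follows the pipeline flow described in this reference. The function and type names are hypothetical.

```typescript
type Role = "dev" | "qa";

interface QueuedIssue {
  id: number;
  label: string;
}

// Highest priority first; each queue label is served by one role.
const QUEUE_PRIORITY: { label: string; role: Role }[] = [
  { label: "To Improve", role: "dev" },
  { label: "To Test", role: "qa" },
  { label: "To Do", role: "dev" },
];

function nextPickup(
  queue: QueuedIssue[],
  freeRoles: Role[]
): { issue: QueuedIssue; role: Role } | undefined {
  for (const entry of QUEUE_PRIORITY) {
    // Skip queues whose role has no free worker slot.
    if (freeRoles.indexOf(entry.role) === -1) continue;
    const issue = queue.find((i) => i.label === entry.label);
    if (issue !== undefined) return { issue, role: entry.role };
  }
  return undefined;
}
```

When only QA is free, the function skips `To Improve` and `To Do` and returns the first `To Test` issue, so a busy DEV slot never blocks reviews.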
--- a/lib/templates.ts
+++ b/lib/templates.ts
@@ -102,7 +102,7 @@ All orchestration goes through these tools. You do NOT manually manage sessions,
 | \`status\` | Task queue and worker state per project (lightweight dashboard) |
 | \`health\` | Scan worker health: zombies, stale workers, orphaned state. Pass fix=true to auto-fix |
 | \`work_start\` | End-to-end: label transition, level assignment, session create/reuse, dispatch with role instructions |
-| \`work_finish\` | End-to-end: label transition, state update, issue close/reopen. Auto-ticks queue after completion. |
+| \`work_finish\` | End-to-end: label transition, state update, issue close/reopen. Ticks scheduler after completion. |
 ### Pipeline Flow
@@ -135,10 +135,10 @@ Evaluate each task and pass the appropriate developer level to \`work_start\`:
 ### When Work Completes
-Workers call \`work_finish\` themselves — the label transition, state update, and audit log happen atomically. After completion, \`work_finish\` auto-ticks the queue to fill free slots:
+Workers call \`work_finish\` themselves — the label transition, state update, and audit log happen atomically. After completion, \`work_finish\` ticks the scheduler to fill free slots:
- DEV "done" → issue moves to "To Test" → tick dispatches QA
+- DEV "done" → issue moves to "To Test" → scheduler dispatches QA
- QA "fail" → issue moves to "To Improve" → tick dispatches DEV
+- QA "fail" → issue moves to "To Improve" → scheduler dispatches DEV
 - QA "pass" → Done, no further dispatch
 - QA "refine" / blocked → needs human input