refactor: rename QA role to Tester and update related documentation

- Updated role references from "QA" to "Tester" in workflow and code comments.
- Revised documentation to reflect the new role structure, including role instructions and completion rules.
- Enhanced the testing guide with clearer instructions and examples for unit and E2E tests.
- Improved tools reference to align with the new role definitions and completion rules.
- Adjusted the roadmap to highlight recent changes in role configuration and workflow state machine.
This commit is contained in:
Lauren ten Hoor
2026-02-16 13:55:38 +08:00
parent 371e760d94
commit f7aa47102f
8 changed files with 928 additions and 634 deletions

View File

@@ -10,22 +10,22 @@ graph TB
direction TB
A_O["Orchestrator"]
A_GL[GitHub/GitLab Issues]
A_DEV["DEV (worker session)"]
A_QA["QA (worker session)"]
A_DEV["DEVELOPER (worker session)"]
A_TST["TESTER (worker session)"]
A_O -->|work_start| A_GL
A_O -->|dispatches| A_DEV
A_O -->|dispatches| A_QA
A_O -->|dispatches| A_TST
end
subgraph "Group Chat B"
direction TB
B_O["Orchestrator"]
B_GL[GitHub/GitLab Issues]
B_DEV["DEV (worker session)"]
B_QA["QA (worker session)"]
B_DEV["DEVELOPER (worker session)"]
B_TST["TESTER (worker session)"]
B_O -->|work_start| B_GL
B_O -->|dispatches| B_DEV
B_O -->|dispatches| B_QA
B_O -->|dispatches| B_TST
end
AGENT["Single OpenClaw Agent"]
@@ -33,7 +33,7 @@ graph TB
AGENT --- B_O
```
Worker sessions are expensive to start — each new spawn reads the full codebase (~50K tokens). DevClaw maintains **separate sessions per level per role** ([session-per-level design](#session-per-level-design)). When a medior dev finishes task A and picks up task B on the same project, the accumulated context carries over — no re-reading the repo. The plugin handles all session dispatch internally via OpenClaw CLI; the orchestrator agent never calls `sessions_spawn` or `sessions_send`.
Worker sessions are expensive to start — each new spawn reads the full codebase (~50K tokens). DevClaw maintains **separate sessions per level per role** ([session-per-level design](#session-per-level-design)). When a medior developer finishes task A and picks up task B on the same project, the accumulated context carries over — no re-reading the repo. The plugin handles all session dispatch internally via OpenClaw CLI; the orchestrator agent never calls `sessions_spawn` or `sessions_send`.
```mermaid
sequenceDiagram
@@ -42,7 +42,7 @@ sequenceDiagram
participant IT as Issue Tracker
participant S as Worker Session
O->>DC: work_start({ issueId: 42, role: "dev" })
O->>DC: work_start({ issueId: 42, role: "developer" })
DC->>IT: Fetch issue, verify label
DC->>DC: Assign level (junior/medior/senior)
DC->>DC: Check existing session for assigned level
@@ -62,19 +62,20 @@ Understanding the OpenClaw model is key to understanding how DevClaw works:
### Session-per-level design
Each project maintains **separate sessions per developer level per role**. A project's DEV might have a junior session, a medior session, and a senior session — each accumulating its own codebase context over time.
Each project maintains **separate sessions per developer level per role**. A project's DEVELOPER might have a junior session, a medior session, and a senior session — each accumulating its own codebase context over time.
```
Orchestrator Agent (configured in openclaw.json)
└─ Main session (long-lived, handles all projects)
├─ Project A
│ ├─ DEV sessions: { junior: <key>, medior: <key>, senior: null }
QA sessions: { reviewer: <key>, tester: null }
│ ├─ DEVELOPER sessions: { junior: <key>, medior: <key>, senior: null }
TESTER sessions: { junior: null, medior: <key>, senior: null }
│ └─ ARCHITECT sessions: { junior: <key>, senior: null }
└─ Project B
├─ DEV sessions: { junior: null, medior: <key>, senior: null }
└─ QA sessions: { reviewer: <key>, tester: null }
├─ DEVELOPER sessions: { junior: null, medior: <key>, senior: null }
└─ TESTER sessions: { junior: null, medior: <key>, senior: null }
```
Why per-level instead of switching models on one session:
@@ -114,6 +115,18 @@ The agent's only job after `work_start` returns is to post the announcement to T
DevClaw provides equivalent guardrails for everything except auto-reporting, which the heartbeat handles.
## Roles
DevClaw ships with three built-in roles, defined in `lib/roles/registry.ts`. All roles use the same level scheme (junior/medior/senior) — levels describe task complexity, not the role.
| Role | ID | Levels | Default Level | Completion Results |
|---|---|---|---|---|
| Developer | `developer` | junior, medior, senior | medior | done, review, blocked |
| Tester | `tester` | junior, medior, senior | medior | pass, fail, refine, blocked |
| Architect | `architect` | junior, senior | junior | done, blocked |
Roles are extensible — add a new entry to `ROLE_REGISTRY` and corresponding workflow states to get a new role. The `workflow.yaml` config can also override levels, models, and emoji per role, or disable a role entirely (`architect: false`).
## System overview
```mermaid
@@ -127,10 +140,11 @@ graph TB
MS[Main Session<br/>orchestrator agent]
GW[Gateway RPC<br/>sessions.patch / sessions.list]
CLI[openclaw gateway call agent]
DEV_J[DEV session<br/>junior]
DEV_M[DEV session<br/>medior]
DEV_S[DEV session<br/>senior]
QA_R[QA session<br/>reviewer]
DEV_J[DEVELOPER session<br/>junior]
DEV_M[DEVELOPER session<br/>medior]
DEV_S[DEVELOPER session<br/>senior]
TST_M[TESTER session<br/>medior]
ARCH[ARCHITECT session<br/>junior]
end
subgraph "DevClaw Plugin"
@@ -196,12 +210,13 @@ graph TB
CLI -->|sends task| DEV_J
CLI -->|sends task| DEV_M
CLI -->|sends task| DEV_S
CLI -->|sends task| QA_R
CLI -->|sends task| TST_M
CLI -->|sends task| ARCH
DEV_J -->|writes code, creates MRs| REPO
DEV_M -->|writes code, creates MRs| REPO
DEV_S -->|writes code, creates MRs| REPO
QA_R -->|reviews code, tests| REPO
DEV_J -->|writes code, creates PRs| REPO
DEV_M -->|writes code, creates PRs| REPO
DEV_S -->|writes code, creates PRs| REPO
TST_M -->|reviews code, tests| REPO
```
## End-to-end flow: human to sub-agent
@@ -216,7 +231,7 @@ sequenceDiagram
participant DC as DevClaw Plugin
participant GW as Gateway RPC
participant CLI as openclaw gateway call agent
participant DEV as DEV Session<br/>(medior)
participant DEV as DEVELOPER Session<br/>(medior)
participant GL as Issue Tracker
Note over H,GL: Issue exists in queue (To Do)
@@ -225,51 +240,51 @@ sequenceDiagram
TG->>MS: delivers message
MS->>DC: status()
DC->>GL: list issues by label "To Do"
DC-->>MS: { toDo: [#42], dev: idle }
DC-->>MS: { toDo: [#42], developer: idle }
Note over MS: Decides to pick up #42 for DEV as medior
Note over MS: Decides to pick up #42 for DEVELOPER as medior
MS->>DC: work_start({ issueId: 42, role: "dev", level: "medior", ... })
MS->>DC: work_start({ issueId: 42, role: "developer", level: "medior", ... })
DC->>DC: resolve level "medior" → model ID
DC->>DC: lookup dev.sessions.medior → null (first time)
DC->>DC: lookup developer.sessions.medior → null (first time)
DC->>GL: transition label "To Do" → "Doing"
DC->>GW: sessions.patch({ key: new-session-key, model: "anthropic/claude-sonnet-4-5" })
DC->>CLI: openclaw gateway call agent --params { sessionKey, message }
CLI->>DEV: creates session, delivers task
DC->>DC: store session key in projects.json + append audit.log
DC-->>MS: { success: true, announcement: "🔧 Spawning DEV (medior) for #42" }
DC-->>MS: { success: true, announcement: "🔧 Spawning DEVELOPER (medior) for #42" }
MS->>TG: "🔧 Spawning DEV (medior) for #42: Add login page"
MS->>TG: "🔧 Spawning DEVELOPER (medior) for #42: Add login page"
TG->>H: sees announcement
Note over DEV: Works autonomously — reads code, writes code, creates MR
Note over DEV: Works autonomously — reads code, writes code, creates PR
Note over DEV: Calls work_finish when done
DEV->>DC: work_finish({ role: "dev", result: "done", ... })
DEV->>DC: work_finish({ role: "developer", result: "done", ... })
DC->>GL: transition label "Doing" → "To Test"
DC->>DC: deactivate worker (sessions preserved)
DC-->>DEV: { announcement: "✅ DEV DONE #42" }
DC-->>DEV: { announcement: "✅ DEVELOPER DONE #42" }
MS->>TG: "✅ DEV DONE #42 — moved to QA queue"
MS->>TG: "✅ DEVELOPER DONE #42 — moved to TESTER queue"
TG->>H: sees announcement
```
On the **next DEV task** for this project that also assigns medior:
On the **next DEVELOPER task** for this project that also assigns medior:
```mermaid
sequenceDiagram
participant MS as Main Session
participant DC as DevClaw Plugin
participant CLI as openclaw gateway call agent
participant DEV as DEV Session<br/>(medior, existing)
participant DEV as DEVELOPER Session<br/>(medior, existing)
MS->>DC: work_start({ issueId: 57, role: "dev", level: "medior", ... })
MS->>DC: work_start({ issueId: 57, role: "developer", level: "medior", ... })
DC->>DC: resolve level "medior" → model ID
DC->>DC: lookup dev.sessions.medior → existing key!
DC->>DC: lookup developer.sessions.medior → existing key!
Note over DC: No sessions.patch needed — session already exists
DC->>CLI: openclaw gateway call agent --params { sessionKey, message }
CLI->>DEV: delivers task to existing session (has full codebase context)
DC-->>MS: { success: true, announcement: "⚡ Sending DEV (medior) for #57" }
DC-->>MS: { success: true, announcement: "⚡ Sending DEVELOPER (medior) for #57" }
```
Session reuse saves ~50K tokens per task by not re-reading the codebase.
@@ -304,7 +319,7 @@ sequenceDiagram
A->>QS: status({ projectGroupId: "-123" })
QS->>PJ: readProjects()
PJ-->>QS: { dev: idle, qa: idle }
PJ-->>QS: { developer: idle, tester: idle }
QS->>GL: list issues by label "To Do"
GL-->>QS: [{ id: 42, title: "Add login page" }]
QS->>GL: list issues by label "To Test"
@@ -312,12 +327,12 @@ sequenceDiagram
QS->>GL: list issues by label "To Improve"
GL-->>QS: []
QS->>AL: append { event: "status", ... }
QS-->>A: { dev: idle, queue: { toDo: [#42] } }
QS-->>A: { developer: idle, queue: { toDo: [#42] } }
```
**Orchestrator decides:** DEV is idle, issue #42 is in To Do → pick it up. Evaluates complexity → assigns medior level.
**Orchestrator decides:** DEVELOPER is idle, issue #42 is in To Do → pick it up. Evaluates complexity → assigns medior level.
### Phase 3: DEV pickup
### Phase 3: DEVELOPER pickup
The plugin handles everything end-to-end — level resolution, session lookup, label transition, state update, **and** task dispatch to the worker session. The agent's only job after is to post the announcement.
@@ -332,13 +347,13 @@ sequenceDiagram
participant PJ as projects.json
participant AL as audit.log
A->>WS: work_start({ issueId: 42, role: "dev", projectGroupId: "-123", level: "medior" })
A->>WS: work_start({ issueId: 42, role: "developer", projectGroupId: "-123", level: "medior" })
WS->>PJ: readProjects()
WS->>GL: getIssue(42)
GL-->>WS: { title: "Add login page", labels: ["To Do"] }
WS->>WS: Verify label is "To Do"
WS->>TIER: resolve "medior" → "anthropic/claude-sonnet-4-5"
WS->>PJ: lookup dev.sessions.medior
WS->>PJ: lookup developer.sessions.medior
WS->>GL: transitionLabel(42, "To Do", "Doing")
alt New session
WS->>GW: sessions.patch({ key: new-key, model: "anthropic/claude-sonnet-4-5" })
@@ -351,98 +366,116 @@ sequenceDiagram
**Writes:**
- `Issue Tracker`: label "To Do" → "Doing"
- `projects.json`: dev.active=true, dev.issueId="42", dev.level="medior", dev.sessions.medior=key
- `projects.json`: workers.developer.active=true, issueId="42", level="medior", sessions.medior=key
- `audit.log`: 2 entries (work_start, model_selection)
- `Session`: task message delivered to worker session via CLI
### Phase 4: DEV works
### Phase 4: DEVELOPER works
```
DEV sub-agent session → reads codebase, writes code, creates MR
DEV sub-agent session → calls work_finish({ role: "dev", result: "done", ... })
DEVELOPER sub-agent session → reads codebase, writes code, creates PR
DEVELOPER sub-agent session → calls work_finish({ role: "developer", result: "done", ... })
```
This happens inside the OpenClaw session. The worker calls `work_finish` directly for atomic state updates. If the worker discovers unrelated bugs, it calls `task_create` to file them.
### Phase 5: DEV complete (worker self-reports)
### Phase 5: DEVELOPER complete (worker self-reports)
```mermaid
sequenceDiagram
participant DEV as DEV Session
participant DEV as DEVELOPER Session
participant WF as work_finish
participant GL as Issue Tracker
participant PJ as projects.json
participant AL as audit.log
participant REPO as Git Repo
participant QA as QA Session
DEV->>WF: work_finish({ role: "dev", result: "done", projectGroupId: "-123", summary: "Login page with OAuth" })
DEV->>WF: work_finish({ role: "developer", result: "done", projectGroupId: "-123", summary: "Login page with OAuth" })
WF->>PJ: readProjects()
PJ-->>WF: { dev: { active: true, issueId: "42" } }
PJ-->>WF: { developer: { active: true, issueId: "42" } }
WF->>REPO: git pull
WF->>PJ: deactivateWorker(-123, dev)
WF->>PJ: deactivateWorker(-123, developer)
Note over PJ: active→false, issueId→null<br/>sessions map PRESERVED
WF->>GL: transitionLabel "Doing" → "To Test"
WF->>AL: append { event: "work_finish", role: "dev", result: "done" }
WF->>AL: append { event: "work_finish", role: "developer", result: "done" }
WF->>WF: tick queue (fill free slots)
Note over WF: Scheduler sees "To Test" issue, QA slot free → dispatches QA
WF-->>DEV: { announcement: "✅ DEV DONE #42", tickPickups: [...] }
Note over WF: Scheduler sees "To Test" issue, TESTER slot free → dispatches TESTER
WF-->>DEV: { announcement: "✅ DEVELOPER DONE #42", tickPickups: [...] }
```
**Writes:**
- `Git repo`: pulled latest (has DEV's merged code)
- `projects.json`: dev.active=false, dev.issueId=null (sessions map preserved for reuse)
- `Git repo`: pulled latest (has DEVELOPER's merged code)
- `projects.json`: workers.developer.active=false, issueId=null (sessions map preserved for reuse)
- `Issue Tracker`: label "Doing" → "To Test"
- `audit.log`: 1 entry (work_finish) + tick entries if workers dispatched
### Phase 6: QA pickup
### Phase 5b: DEVELOPER requests review (alternative path)
Same as Phase 3, but with `role: "qa"`. Label transitions "To Test" → "Testing". Uses the reviewer level.
### Phase 7: QA result (4 possible outcomes)
#### 7a. QA Pass
Instead of merging the PR themselves, a developer can leave it open for human review:
```mermaid
sequenceDiagram
participant QA as QA Session
participant DEV as DEVELOPER Session
participant WF as work_finish
participant GL as Issue Tracker
participant PJ as projects.json
DEV->>WF: work_finish({ role: "developer", result: "review", ... })
WF->>GL: transitionLabel "Doing" → "In Review"
WF->>PJ: deactivateWorker (sessions preserved)
WF-->>DEV: { announcement: "👀 DEVELOPER REVIEW #42" }
```
The issue sits in "In Review" until the heartbeat's **review pass** detects the PR has been merged, then automatically transitions to "To Test".
### Phase 6: TESTER pickup
Same as Phase 3, but with `role: "tester"`. Label transitions "To Test" → "Testing". Level selection determines which tester session is used.
### Phase 7: TESTER result (4 possible outcomes)
#### 7a. TESTER Pass
```mermaid
sequenceDiagram
participant TST as TESTER Session
participant WF as work_finish
participant GL as Issue Tracker
participant PJ as projects.json
participant AL as audit.log
QA->>WF: work_finish({ role: "qa", result: "pass", projectGroupId: "-123" })
WF->>PJ: deactivateWorker(-123, qa)
TST->>WF: work_finish({ role: "tester", result: "pass", projectGroupId: "-123" })
WF->>PJ: deactivateWorker(-123, tester)
WF->>GL: transitionLabel(42, "Testing", "Done")
WF->>GL: closeIssue(42)
WF->>AL: append { event: "work_finish", role: "qa", result: "pass" }
WF-->>QA: { announcement: "🎉 QA PASS #42. Issue closed." }
WF->>AL: append { event: "work_finish", role: "tester", result: "pass" }
WF-->>TST: { announcement: "🎉 TESTER PASS #42. Issue closed." }
```
**Ticket complete.** Issue closed, label "Done".
#### 7b. QA Fail
#### 7b. TESTER Fail
```mermaid
sequenceDiagram
participant QA as QA Session
participant TST as TESTER Session
participant WF as work_finish
participant GL as Issue Tracker
participant PJ as projects.json
participant AL as audit.log
QA->>WF: work_finish({ role: "qa", result: "fail", projectGroupId: "-123", summary: "OAuth redirect broken" })
WF->>PJ: deactivateWorker(-123, qa)
TST->>WF: work_finish({ role: "tester", result: "fail", projectGroupId: "-123", summary: "OAuth redirect broken" })
WF->>PJ: deactivateWorker(-123, tester)
WF->>GL: transitionLabel(42, "Testing", "To Improve")
WF->>GL: reopenIssue(42)
WF->>AL: append { event: "work_finish", role: "qa", result: "fail" }
WF-->>QA: { announcement: "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." }
WF->>AL: append { event: "work_finish", role: "tester", result: "fail" }
WF-->>TST: { announcement: "❌ TESTER FAIL #42 — OAuth redirect broken. Sent back to DEVELOPER." }
```
**Cycle restarts:** Issue goes to "To Improve". Next heartbeat, DEV picks it up again (Phase 3, but from "To Improve" instead of "To Do").
**Cycle restarts:** Issue goes to "To Improve". Next heartbeat, DEVELOPER picks it up again (Phase 3, but from "To Improve" instead of "To Do").
#### 7c. QA Refine
#### 7c. TESTER Refine
```
Label: "Testing" → "Refining"
@@ -450,14 +483,14 @@ Label: "Testing" → "Refining"
Issue needs human decision. Pipeline pauses until human moves it to "To Do" or closes it.
#### 7d. Blocked (DEV or QA)
#### 7d. Blocked (DEVELOPER or TESTER)
```
DEV Blocked: "Doing" → "To Do"
QA Blocked: "Testing" → "To Test"
DEVELOPER Blocked: "Doing" → "Refining"
TESTER Blocked: "Testing" → "Refining"
```
Worker cannot complete (missing info, environment errors, etc.). Issue returns to queue for retry. The task is available for the next heartbeat pickup.
Worker cannot complete (missing info, environment errors, etc.). Issue enters hold state for human decision. The human can move it back to "To Do" to retry or take other action.
### Completion enforcement
@@ -465,18 +498,19 @@ Three layers guarantee that `work_finish` always runs:
1. **Completion contract** — Every task message sent to a worker session includes a mandatory `## MANDATORY: Task Completion` section listing available results and requiring `work_finish` even on failure. Workers are instructed to use `"blocked"` if stuck.
2. **Blocked result**Both DEV and QA can use `"blocked"` to gracefully return a task to queue without losing work. DEV blocked: `Doing → To Do`. QA blocked: `Testing → To Test`. This gives workers an escape hatch instead of silently dying.
2. **Blocked result**All roles can use `"blocked"` to gracefully hand off to a human. Developer blocked: `Doing → Refining`. Tester blocked: `Testing → Refining`. This gives workers an escape hatch instead of silently dying.
3. **Stale worker watchdog** — The heartbeat's health check detects workers active for >2 hours. With `fix=true`, it deactivates the worker and reverts the label back to queue. This catches sessions that crashed, ran out of context, or otherwise failed without calling `work_finish`. The `health` tool provides the same check for manual invocation.
### Phase 8: Heartbeat (continuous)
The heartbeat runs periodically (via background service or manual `work_heartbeat` trigger). It combines health check + queue scan:
The heartbeat runs periodically (via background service or manual `work_heartbeat` trigger). It combines health check + review polling + queue scan:
```mermaid
sequenceDiagram
participant HB as Heartbeat Service
participant SH as health check
participant RV as review pass
participant TK as projectTick
participant WS as work_start (dispatch)
Note over HB: Tick triggered (every 60s)
@@ -485,6 +519,10 @@ sequenceDiagram
Note over SH: Checks for zombies, stale workers
SH-->>HB: { fixes applied }
HB->>RV: reviewPass per project
Note over RV: Polls PR status for "In Review" issues
RV-->>HB: { transitions made }
HB->>TK: projectTick per project
Note over TK: Scans queue: To Improve > To Test > To Do
TK->>WS: dispatchTask (fill free slots)
@@ -492,6 +530,31 @@ sequenceDiagram
TK-->>HB: { pickups, skipped }
```
## Worker instructions (bootstrap hook)
Role-specific instructions (coding standards, deployment steps, completion rules) are injected into worker sessions via the `agent:bootstrap` hook — not appended to the task message.
```mermaid
sequenceDiagram
participant GW as Gateway
participant BH as Bootstrap Hook
participant FS as Filesystem
Note over GW: Worker session starts
GW->>BH: agent:bootstrap event (sessionKey, bootstrapFiles[])
BH->>BH: Parse session key → { projectName, role }
BH->>FS: Load role instructions (project-specific → default)
FS-->>BH: content + source path
BH->>BH: Push WORKER_INSTRUCTIONS.md into bootstrapFiles
BH-->>GW: bootstrapFiles now includes role instructions
```
**Resolution order:**
1. `devclaw/projects/<project>/prompts/<role>.md` (project-specific)
2. `devclaw/prompts/<role>.md` (workspace default)
The source path is logged for production traceability: `Bootstrap hook: injected developer instructions for project "my-app" from /path/to/prompts/developer.md`.
## Data flow map
Every piece of data and where it lives:
@@ -503,15 +566,16 @@ Every piece of data and where it lives:
│ Issue #42: "Add login page" │
│ Labels: [Planning | To Do | Doing | To Test | Testing | ...] │
│ State: open / closed │
MRs/PRs: linked merge/pull requests
│ PRs: linked pull/merge requests (status polled for In Review)
│ Created by: orchestrator (task_create), workers, or humans │
└─────────────────────────────────────────────────────────────────┘
↕ gh/glab CLI (read/write, auto-detected)
↕ cockatiel resilience: retry + circuit breaker
┌─────────────────────────────────────────────────────────────────┐
│ DevClaw Plugin (orchestration logic) │
│ │
│ setup → agent creation + workspace + model config │
│ work_start → level + label + dispatch + role instr (e2e)
│ work_start → level + label + dispatch (e2e)
│ work_finish → label + state + git pull + tick queue │
│ task_create → create issue in tracker │
│ task_update → manual label state change │
@@ -519,27 +583,38 @@ Every piece of data and where it lives:
│ status → read labels + read state │
│ health → check sessions + fix zombies │
│ project_register → labels + prompts + state init (one-time) │
│ design_task → architect dispatch │
│ │
│ Bootstrap hook → injects role instructions into worker sessions│
│ Review pass → polls PR status, auto-advances In Review │
│ Config loader → three-layer merge + Zod validation │
└─────────────────────────────────────────────────────────────────┘
↕ atomic file I/O ↕ OpenClaw CLI (plugin shells out)
┌────────────────────────────────┐ ┌──────────────────────────────┐
projects/projects.json │ │ OpenClaw Gateway + CLI │
devclaw/projects.json │ │ OpenClaw Gateway + CLI │
│ │ │ (called by plugin, not agent)│
│ Per project: │ │ │
dev: │ │ openclaw gateway call │
active, issueId, level │ │ sessions.patch → create │
sessions: │ │ sessions.list → health │
junior: <key> │ │ sessions.delete → cleanup │
medior: <key> │ │ │
senior: <key> │ │ openclaw gateway call agent │
qa: │ │ --params { sessionKey, │
active, issueId, level │ │ message, agentId } │
sessions: │ │ → dispatches to session │
reviewer: <key> │ │ │
tester: <key> │ │ │
workers: │ │ openclaw gateway call │
developer: │ │ sessions.patch → create │
active, issueId, level │ │ sessions.list → health │
sessions: │ │ sessions.delete → cleanup │
junior: <key> │ │ │
medior: <key> │ │ openclaw gateway call agent │
senior: <key> │ │ --params { sessionKey, │
tester: │ │ message, agentId } │
active, issueId, level │ │ → dispatches to session │
sessions: │ │ │
junior: <key> │ │ │
│ medior: <key> │ │ │
│ senior: <key> │ │ │
│ architect: │ │ │
│ sessions: │ │ │
│ junior: <key> │ │ │
│ senior: <key> │ │ │
└────────────────────────────────┘ └──────────────────────────────┘
↕ append-only
┌─────────────────────────────────────────────────────────────────┐
│ log/audit.log (observability)
devclaw/log/audit.log (observability) │
│ │
│ NDJSON, one line per event: │
│ work_start, work_finish, model_selection, │
@@ -553,21 +628,23 @@ Every piece of data and where it lives:
│ Telegram / WhatsApp (user-facing messages) │
│ │
│ Per group chat: │
│ "🔧 Spawning DEV (medior) for #42: Add login page"
│ "⚡ Sending DEV (medior) for #57: Fix validation"
│ "✅ DEV DONE #42 — Login page with OAuth."
│ "🎉 QA PASS #42. Issue closed."
│ "❌ QA FAIL #42 — OAuth redirect broken."
│ "🚫 DEV BLOCKED #42 — Missing dependencies." │
│ "🚫 QA BLOCKED #42 — Env not available."
│ "🔧 Spawning DEVELOPER (medior) for #42: Add login page" │
│ "⚡ Sending DEVELOPER (medior) for #57: Fix validation" │
│ "✅ DEVELOPER DONE #42 — Login page with OAuth." │
│ "👀 DEVELOPER REVIEW #42 — PR open for review."
│ "🎉 TESTER PASS #42. Issue closed."
│ "❌ TESTER FAIL #42 — OAuth redirect broken." │
│ "🚫 DEVELOPER BLOCKED #42 — Missing dependencies."
│ "🚫 TESTER BLOCKED #42 — Env not available." │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Git Repository (codebase) │
│ │
│ DEV sub-agent sessions: read code, write code, create MRs
QA sub-agent sessions: read code, run tests, review MRs
work_finish (DEV done): git pull to sync latest
│ DEVELOPER sub-agent sessions: read code, write code, create PRs│
TESTER sub-agent sessions: read code, run tests, review PRs │
ARCHITECT sub-agent sessions: research, design, recommend
│ work_finish (developer done): git pull to sync latest │
└─────────────────────────────────────────────────────────────────┘
```
@@ -584,9 +661,12 @@ graph LR
SETUP[Agent + workspace setup]
SD[Session dispatch<br/>create + send via CLI]
AC[Scheduling<br/>tick queue after work_finish]
RI[Role instructions<br/>loaded per project]
RI[Role instructions<br/>injected via bootstrap hook]
RV[Review polling<br/>PR status → auto-advance]
A[Audit logging]
Z[Zombie cleanup]
CFG[Config validation<br/>Zod + integrity checks]
RES[Provider resilience<br/>retry + circuit breaker]
end
subgraph "Orchestrator handles (planning only)"
@@ -600,7 +680,7 @@ graph LR
subgraph "Sub-agent sessions handle"
CR[Code writing]
MR[MR creation/review]
MR[PR creation/review]
WF_W[Task completion<br/>via work_finish]
BUG[Bug filing<br/>via task_create]
end
@@ -611,7 +691,7 @@ graph LR
end
```
**Key boundary:** The orchestrator is a planner and dispatcher — it never writes code. All implementation work (code edits, git operations, tests) must go through sub-agent sessions via the `task_create``work_start` pipeline. This ensures audit trails, tier selection, and QA review for every code change.
**Key boundary:** The orchestrator is a planner and dispatcher — it never writes code. All implementation work (code edits, git operations, tests) must go through sub-agent sessions via the `task_create``work_start` pipeline. This ensures audit trails, level selection, and testing for every code change.
## IssueProvider abstraction
@@ -624,10 +704,13 @@ All issue tracker operations go through the `IssueProvider` interface, defined i
- `transitionLabel` — atomic label state transition (unlabel + label)
- `closeIssue` / `reopenIssue` — issue lifecycle
- `hasStateLabel` / `getCurrentStateLabel` — label inspection
- `getPrStatus` — get PR/MR state (open, merged, approved, none)
- `hasMergedMR` / `getMergedMRUrl` — MR/PR verification
- `addComment` — add comment to issue
- `healthCheck` — verify provider connectivity
**Provider resilience:** All provider calls are wrapped with cockatiel retry (3 attempts, exponential backoff) + circuit breaker (opens after 5 consecutive failures, half-opens after 30s). See `lib/providers/resilience.ts`.
**Current providers:**
- **GitHub** (`lib/providers/github.ts`) — wraps `gh` CLI
- **GitLab** (`lib/providers/gitlab.ts`) — wraps `glab` CLI
@@ -637,19 +720,34 @@ All issue tracker operations go through the `IssueProvider` interface, defined i
Provider selection is handled by `createProvider()` in `lib/providers/index.ts`. Auto-detects GitHub vs GitLab from the git remote URL.
## Configuration system
DevClaw uses a three-layer config system with `workflow.yaml` files:
```
Layer 1: Built-in defaults (ROLE_REGISTRY + DEFAULT_WORKFLOW)
Layer 2: Workspace: <workspace>/devclaw/workflow.yaml
Layer 3: Project: <workspace>/devclaw/projects/<project>/workflow.yaml
```
Each layer can override roles (levels, models, emoji), workflow states/transitions, and timeouts. Config is validated with Zod schemas at load time, with cross-reference integrity checks (transition targets exist, queue states have roles, terminal states have no outgoing transitions).
See [CONFIGURATION.md](CONFIGURATION.md) for the full reference.
## Error recovery
| Failure | Detection | Recovery |
|---|---|---|
| Session dies mid-task | `health` checks via `sessions.list` Gateway RPC | `fix=true`: reverts label, clears active state. Next heartbeat picks up task again (creates fresh session for that level). |
| gh/glab command fails | Plugin tool throws error, returns to agent | Agent retries or reports to Telegram group |
| gh/glab command fails | Cockatiel retry (3 attempts), then circuit breaker | Circuit opens after 5 consecutive failures, prevents hammering. Plugin catches and returns error. |
| `openclaw gateway call agent` fails | Plugin catches error during dispatch | Plugin rolls back: reverts label, clears active state. Returns error. No orphaned state. |
| `sessions.patch` fails | Plugin catches error during session creation | Plugin rolls back label transition. Returns error. |
| projects.json corrupted | Tool can't parse JSON | Manual fix needed. Atomic writes (temp+rename) prevent partial writes. |
| projects.json corrupted | Tool can't parse JSON | Manual fix needed. Atomic writes (temp+rename) prevent partial writes. File locking prevents concurrent races. |
| Label out of sync | `work_start` verifies label before transitioning | Throws error if label doesn't match expected state. |
| Worker already active | `work_start` checks `active` flag | Throws error: "DEV already active on project". Must complete current task first. |
| Worker already active | `work_start` checks `active` flag | Throws error: "DEVELOPER already active on project". Must complete current task first. |
| Stale worker (>2h) | `health` and heartbeat health check | `fix=true`: deactivates worker, reverts label to queue. Task available for next pickup. |
| Worker stuck/blocked | Worker calls `work_finish` with `"blocked"` | Deactivates worker, reverts label to queue. Issue available for retry. |
| Worker stuck/blocked | Worker calls `work_finish` with `"blocked"` | Deactivates worker, transitions to "Refining" (hold state). Requires human decision to proceed. |
| Config invalid | Zod schema validation at load time | Clear error message with field path. Prevents startup with broken config. |
| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. Labels are idempotent, projects.json not written until all labels succeed. |
## File locations
@@ -659,8 +757,11 @@ Provider selection is handled by `createProvider()` in `lib/providers/index.ts`.
| Plugin source | `~/.openclaw/extensions/devclaw/` | Plugin code |
| Plugin manifest | `~/.openclaw/extensions/devclaw/openclaw.plugin.json` | Plugin registration |
| Agent config | `~/.openclaw/openclaw.json` | Agent definition + tool permissions + model config |
| Worker state | `~/.openclaw/workspace-<agent>/projects/projects.json` | Per-project DEV/QA state |
| Role instructions | `~/.openclaw/workspace-<agent>/projects/roles/<project>/` | Per-project `dev.md` and `qa.md` |
| Audit log | `~/.openclaw/workspace-<agent>/log/audit.log` | NDJSON event log |
| Worker state | `<workspace>/devclaw/projects.json` | Per-project worker state |
| Workflow config (workspace) | `<workspace>/devclaw/workflow.yaml` | Workspace-level role/workflow overrides |
| Workflow config (project) | `<workspace>/devclaw/projects/<project>/workflow.yaml` | Project-specific overrides |
| Default role instructions | `<workspace>/devclaw/prompts/<role>.md` | Default `developer.md`, `tester.md`, `architect.md` |
| Project role instructions | `<workspace>/devclaw/projects/<project>/prompts/<role>.md` | Per-project role instruction overrides |
| Audit log | `<workspace>/devclaw/log/audit.log` | NDJSON event log |
| Session transcripts | `~/.openclaw/agents/<agent>/sessions/<uuid>.jsonl` | Conversation history per session |
| Git repos | `~/git/<project>/` | Project source code |