docs: overhaul documentation for consistency with implementation

Complete documentation rewrite to match the current codebase:

- README: add benefits section (process consistency, token savings with estimates, project isolation, continuous planning, feedback loops, role-based prompts, atomic operations, audit trail), task workflow with state diagram, model-to-role mapping tables, installation guide
- New TOOLS.md: complete reference for all 11 tools with parameters, behavior, and execution guards
- New CONFIGURATION.md: full config reference for openclaw.json, projects.json, heartbeat, notifications, workspace layout
- Fix tool names across all docs: task_pickup→work_start, task_complete→work_finish
- Fix tier model: QA has reviewer/tester levels, not flat "qa"
- Fix config schema: nested models.dev.*/models.qa.* structure
- Fix prompt path: projects/roles/ not projects/prompts/
- Fix worker state: uses "level" field not "model"/"tier"
- Fix MANAGEMENT.md: remove incorrect model references
- Fix TESTING.md: update model config example to nested structure
- Remove VERIFICATION.md (one-off checklist, no longer needed)
- Add cross-references between all docs pages

https://claude.ai/code/session_01R3rGevPY748gP4uK2ggYag
@@ -6,59 +6,59 @@ Understanding the OpenClaw model is key to understanding how DevClaw works:

 - **Agent** — A configured entity in `openclaw.json`. Has a workspace, model, identity files (SOUL.md, IDENTITY.md), and tool permissions. Persists across restarts.
 - **Session** — A runtime conversation instance. Each session has its own context window and conversation history, stored as a `.jsonl` transcript file.
-- **Sub-agent session** — A session created under the orchestrator agent for a specific worker role. NOT a separate agent — it's a child session running under the same agent, with its own isolated context. Format: `agent:<parent>:subagent:<uuid>`.
+- **Sub-agent session** — A session created under the orchestrator agent for a specific worker role. NOT a separate agent — it's a child session running under the same agent, with its own isolated context. Format: `agent:<parent>:subagent:<project>-<role>-<level>`.

-### Session-per-tier design
+### Session-per-level design

-Each project maintains **separate sessions per developer tier per role**. A project's DEV might have a junior session, a medior session, and a senior session — each accumulating its own codebase context over time.
+Each project maintains **separate sessions per developer level per role**. A project's DEV might have a junior session, a medior session, and a senior session — each accumulating its own codebase context over time.

 ```
 Orchestrator Agent (configured in openclaw.json)
 └─ Main session (long-lived, handles all projects)
 │
 ├─ Project A
-│ ├─ DEV sessions: { junior: <uuid>, medior: <uuid>, senior: null }
-│ └─ QA sessions: { qa: <uuid> }
+│ ├─ DEV sessions: { junior: <key>, medior: <key>, senior: null }
+│ └─ QA sessions: { reviewer: <key>, tester: null }
 │
 └─ Project B
-├─ DEV sessions: { junior: null, medior: <uuid>, senior: null }
-└─ QA sessions: { qa: <uuid> }
+├─ DEV sessions: { junior: null, medior: <key>, senior: null }
+└─ QA sessions: { reviewer: <key>, tester: null }
 ```

-Why per-tier instead of switching models on one session:
+Why per-level instead of switching models on one session:
 - **No model switching overhead** — each session always uses the same model
 - **Accumulated context** — a junior session that's done 20 typo fixes knows the project well; a medior session that's done 5 features knows it differently
 - **No cross-model confusion** — conversation history stays with the model that generated it
-- **Deterministic reuse** — tier selection directly maps to a session key, no patching needed
+- **Deterministic reuse** — level selection directly maps to a session key, no patching needed
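
The deterministic mapping from level to session key can be sketched as follows. This is a minimal illustration that assumes the key format given above (`agent:<parent>:subagent:<project>-<role>-<level>`); the `buildSessionKey` helper, its parameter names, and the example values are hypothetical and are not taken from DevClaw's code.

```typescript
// Hypothetical helper: derive a sub-agent session key from its coordinates.
// Assumes the documented key format; all names here are illustrative only.
type Role = "dev" | "qa";
type Level = "junior" | "medior" | "senior" | "reviewer" | "tester";

function buildSessionKey(parent: string, project: string, role: Role, level: Level): string {
  return `agent:${parent}:subagent:${project}-${role}-${level}`;
}

// Identical inputs always produce the identical key, which is what makes
// session reuse a plain lookup rather than a search.
const mediorKey = buildSessionKey("orchestrator", "webapp", "dev", "medior");
```

Because the key is a pure function of project, role, and level, two dispatches at the same level resolve to the same session without extra bookkeeping.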

 ### Plugin-controlled session lifecycle

 DevClaw controls the **full** session lifecycle end-to-end. The orchestrator agent never calls `sessions_spawn` or `sessions_send` — the plugin handles session creation and task dispatch internally using the OpenClaw CLI:

 ```
-Plugin dispatch (inside task_pickup):
-1. Assign tier, look up session, decide spawn vs send
+Plugin dispatch (inside work_start):
+1. Assign level, look up session, decide spawn vs send
 2. New session: openclaw gateway call sessions.patch → create entry + set model
-openclaw agent --session-id <key> --message "task..."
-3. Existing: openclaw agent --session-id <key> --message "task..."
+openclaw gateway call agent → dispatch task
+3. Existing: openclaw gateway call agent → dispatch task to existing session
 4. Return result to orchestrator (announcement text, no session instructions)
 ```

-The agent's only job after `task_pickup` returns is to post the announcement to Telegram. Everything else — tier assignment, session creation, task dispatch, state update, audit logging — is deterministic plugin code.
+The agent's only job after `work_start` returns is to post the announcement to Telegram. Everything else — level assignment, session creation, task dispatch, state update, audit logging — is deterministic plugin code.
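
The spawn-versus-send decision in step 1 can be sketched as a lookup in the per-level session map. This is a minimal sketch under stated assumptions: the `SessionMap` shape follows the `{ junior, medior, senior }` maps shown earlier, while `planDispatch`, `DispatchPlan`, and the example keys are hypothetical names. The real plugin also transitions labels, updates state, and writes the audit log.

```typescript
// Per-level session keys; null means no session exists yet for that level.
type SessionMap = Record<string, string | null>;

interface DispatchPlan {
  action: "spawn" | "send"; // spawn = create the session first, send = reuse it
  sessionKey: string;
}

// Hypothetical decision helper: reuse an existing session when one is stored,
// otherwise plan to create one under the supplied new key.
function planDispatch(sessions: SessionMap, level: string, newKey: string): DispatchPlan {
  const existing = sessions[level] ?? null;
  if (existing === null) {
    return { action: "spawn", sessionKey: newKey };
  }
  return { action: "send", sessionKey: existing };
}

const devSessions: SessionMap = { junior: "key-j", medior: null, senior: null };
const firstMedior = planDispatch(devSessions, "medior", "key-m");
const repeatJunior = planDispatch(devSessions, "junior", "unused-key");
```

Keeping the decision in one pure function mirrors the point of this section: the agent never has to choose between spawning and sending.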

 **Why this matters:** Previously the plugin returned instructions like `{ sessionAction: "spawn", model: "sonnet" }` and the agent had to correctly call `sessions_spawn` with the right params. This was the fragile handoff point where agents would forget `cleanup: "keep"`, use wrong models, or corrupt session state. Moving dispatch into the plugin eliminates that entire class of errors.

-**Session persistence:** Sessions created via `sessions.patch` persist indefinitely (no auto-cleanup). The plugin manages lifecycle explicitly through `session_health`.
+**Session persistence:** Sessions created via `sessions.patch` persist indefinitely (no auto-cleanup). The plugin manages lifecycle explicitly through the `health` tool.

 **What we trade off vs. registered sub-agents:**

 | Feature | Sub-agent system | Plugin-controlled | DevClaw equivalent |
 |---|---|---|---|
 | Auto-reporting | Sub-agent reports to parent | No | Heartbeat polls for completion |
-| Concurrency control | `maxConcurrent` | No | `task_pickup` checks `active` flag |
+| Concurrency control | `maxConcurrent` | No | `work_start` checks `active` flag |
 | Lifecycle tracking | Parent-child registry | No | `projects.json` tracks all sessions |
-| Timeout detection | `runTimeoutSeconds` | No | `session_health` flags stale >2h |
-| Cleanup | Auto-archive | No | `session_health` manual cleanup |
+| Timeout detection | `runTimeoutSeconds` | No | `health` flags stale >2h |
+| Cleanup | Auto-archive | No | `health` manual cleanup |

 DevClaw provides equivalent guardrails for everything except auto-reporting, which the heartbeat handles.
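
The `active`-flag concurrency check from the table can be sketched like this. The `WorkerState` fields mirror those this document lists for `projects.json` (`active`, `issueId`, `level`); the `canStart` function and its result shape are assumptions made for illustration, not the plugin's actual guard.

```typescript
// Worker slot state, using the field names this document gives for projects.json.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  level: string | null;
}

type GuardResult = { ok: true } | { ok: false; reason: string };

// Hypothetical guard: refuse to start new work while the role's slot is busy.
function canStart(worker: WorkerState): GuardResult {
  if (worker.active) {
    return { ok: false, reason: `worker already active on #${worker.issueId ?? "unknown"}` };
  }
  return { ok: true };
}

const idleWorker: WorkerState = { active: false, issueId: null, level: null };
const busyWorker: WorkerState = { active: true, issueId: "42", level: "medior" };
```

A check of this kind stands in for `maxConcurrent`: one active task per role per project, enforced before any label changes.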

@@ -74,22 +74,22 @@ graph TB
 subgraph "OpenClaw Runtime"
 MS[Main Session<br/>orchestrator agent]
 GW[Gateway RPC<br/>sessions.patch / sessions.list]
-CLI[openclaw agent CLI]
+CLI[openclaw gateway call agent]
 DEV_J[DEV session<br/>junior]
 DEV_M[DEV session<br/>medior]
 DEV_S[DEV session<br/>senior]
-QA_E[QA session<br/>qa]
+QA_R[QA session<br/>reviewer]
 end

 subgraph "DevClaw Plugin"
-TP[task_pickup]
-TC[task_complete]
+WS[work_start]
+WF[work_finish]
 TCR[task_create]
-QS[queue_status]
-SH[session_health]
+ST[status]
+SH[health]
 PR[project_register]
-DS[devclaw_setup]
-TIER[Tier Resolver]
+DS[setup]
+TIER[Level Resolver]
 PJ[projects.json]
 AL[audit.log]
 end
@@ -103,34 +103,34 @@ graph TB
 TG -->|delivers| MS
 MS -->|announces| TG

-MS -->|calls| TP
-MS -->|calls| TC
+MS -->|calls| WS
+MS -->|calls| WF
 MS -->|calls| TCR
-MS -->|calls| QS
+MS -->|calls| ST
 MS -->|calls| SH
 MS -->|calls| PR
 MS -->|calls| DS

-TP -->|resolves tier| TIER
-TP -->|transitions labels| GL
-TP -->|reads/writes| PJ
-TP -->|appends| AL
-TP -->|creates session| GW
-TP -->|dispatches task| CLI
+WS -->|resolves level| TIER
+WS -->|transitions labels| GL
+WS -->|reads/writes| PJ
+WS -->|appends| AL
+WS -->|creates session| GW
+WS -->|dispatches task| CLI

-TC -->|transitions labels| GL
-TC -->|closes/reopens| GL
-TC -->|reads/writes| PJ
-TC -->|git pull| REPO
-TC -->|auto-chain dispatch| CLI
-TC -->|appends| AL
+WF -->|transitions labels| GL
+WF -->|closes/reopens| GL
+WF -->|reads/writes| PJ
+WF -->|git pull| REPO
+WF -->|auto-chain dispatch| CLI
+WF -->|appends| AL

 TCR -->|creates issue| GL
 TCR -->|appends| AL

-QS -->|lists issues by label| GL
-QS -->|reads| PJ
-QS -->|appends| AL
+ST -->|lists issues by label| GL
+ST -->|reads| PJ
+ST -->|appends| AL

 SH -->|reads/writes| PJ
 SH -->|checks sessions| GW
@@ -144,12 +144,12 @@ graph TB
 CLI -->|sends task| DEV_J
 CLI -->|sends task| DEV_M
 CLI -->|sends task| DEV_S
-CLI -->|sends task| QA_E
+CLI -->|sends task| QA_R

 DEV_J -->|writes code, creates MRs| REPO
 DEV_M -->|writes code, creates MRs| REPO
 DEV_S -->|writes code, creates MRs| REPO
-QA_E -->|reviews code, tests| REPO
+QA_R -->|reviews code, tests| REPO
 ```

 ## End-to-end flow: human to sub-agent
@@ -163,7 +163,7 @@ sequenceDiagram
 participant MS as Main Session<br/>(orchestrator)
 participant DC as DevClaw Plugin
 participant GW as Gateway RPC
-participant CLI as openclaw agent CLI
+participant CLI as openclaw gateway call agent
 participant DEV as DEV Session<br/>(medior)
 participant GL as Issue Tracker

@@ -171,34 +171,34 @@ sequenceDiagram

 H->>TG: "check status" (or heartbeat triggers)
 TG->>MS: delivers message
-MS->>DC: queue_status()
-DC->>GL: glab issue list --label "To Do"
+MS->>DC: status()
+DC->>GL: list issues by label "To Do"
 DC-->>MS: { toDo: [#42], dev: idle }

 Note over MS: Decides to pick up #42 for DEV as medior

-MS->>DC: task_pickup({ issueId: 42, role: "dev", model: "medior", ... })
-DC->>DC: resolve tier "medior" → model ID
+MS->>DC: work_start({ issueId: 42, role: "dev", level: "medior", ... })
+DC->>DC: resolve level "medior" → model ID
 DC->>DC: lookup dev.sessions.medior → null (first time)
-DC->>GL: glab issue update 42 --unlabel "To Do" --label "Doing"
+DC->>GL: transition label "To Do" → "Doing"
 DC->>GW: sessions.patch({ key: new-session-key, model: "anthropic/claude-sonnet-4-5" })
-DC->>CLI: openclaw agent --session-id <key> --message "Build login page for #42..."
+DC->>CLI: openclaw gateway call agent --params { sessionKey, message }
 CLI->>DEV: creates session, delivers task
 DC->>DC: store session key in projects.json + append audit.log
-DC-->>MS: { success: true, announcement: "🔧 DEV (medior) picking up #42" }
+DC-->>MS: { success: true, announcement: "🔧 Spawning DEV (medior) for #42" }

-MS->>TG: "🔧 DEV (medior) picking up #42: Add login page"
+MS->>TG: "🔧 Spawning DEV (medior) for #42: Add login page"
 TG->>H: sees announcement

 Note over DEV: Works autonomously — reads code, writes code, creates MR
-Note over DEV: Calls task_complete when done
+Note over DEV: Calls work_finish when done

-DEV->>DC: task_complete({ role: "dev", result: "done", ... })
-DC->>GL: glab issue update 42 --unlabel "Doing" --label "To Test"
+DEV->>DC: work_finish({ role: "dev", result: "done", ... })
+DC->>GL: transition label "Doing" → "To Test"
 DC->>DC: deactivate worker (sessions preserved)
-DC-->>DEV: { announcement: "✅ DEV done #42" }
+DC-->>DEV: { announcement: "✅ DEV DONE #42" }

-MS->>TG: "✅ DEV done #42 — moved to QA queue"
+MS->>TG: "✅ DEV DONE #42 — moved to QA queue"
 TG->>H: sees announcement
 ```

@@ -208,16 +208,16 @@ On the **next DEV task** for this project that also assigns medior:
 sequenceDiagram
 participant MS as Main Session
 participant DC as DevClaw Plugin
-participant CLI as openclaw agent CLI
+participant CLI as openclaw gateway call agent
 participant DEV as DEV Session<br/>(medior, existing)

-MS->>DC: task_pickup({ issueId: 57, role: "dev", model: "medior", ... })
-DC->>DC: resolve tier "medior" → model ID
+MS->>DC: work_start({ issueId: 57, role: "dev", level: "medior", ... })
+DC->>DC: resolve level "medior" → model ID
 DC->>DC: lookup dev.sessions.medior → existing key!
 Note over DC: No sessions.patch needed — session already exists
-DC->>CLI: openclaw agent --session-id <key> --message "Fix validation for #57..."
+DC->>CLI: openclaw gateway call agent --params { sessionKey, message }
 CLI->>DEV: delivers task to existing session (has full codebase context)
-DC-->>MS: { success: true, announcement: "⚡ DEV (medior) picking up #57" }
+DC-->>MS: { success: true, announcement: "⚡ Sending DEV (medior) for #57" }
 ```

 Session reuse saves ~50K tokens per task by not re-reading the codebase.
@@ -228,118 +228,118 @@ This traces a single issue from creation to completion, showing every component

 ### Phase 1: Issue created

-Issues are created by the orchestrator agent or by sub-agent sessions via `glab`. The orchestrator can create issues based on user requests in Telegram, backlog planning, or QA feedback. Sub-agents can also create issues when they discover bugs or related work during development.
+Issues are created by the orchestrator agent or by sub-agent sessions via `task_create` or directly via `gh`/`glab`. The orchestrator can create issues based on user requests in Telegram, backlog planning, or QA feedback. Sub-agents can also create issues when they discover bugs during development.

 ```
-Orchestrator Agent → Issue Tracker: creates issue #42 with label "To Do"
+Orchestrator Agent → Issue Tracker: creates issue #42 with label "Planning"
 ```

-**State:** Issue tracker has issue #42 labeled "To Do". Nothing in DevClaw yet.
+**State:** Issue tracker has issue #42 labeled "Planning". Nothing in DevClaw yet.

 ### Phase 2: Heartbeat detects work

 ```
-Heartbeat triggers → Orchestrator calls queue_status()
+Heartbeat triggers → Orchestrator calls status()
 ```

 ```mermaid
 sequenceDiagram
 participant A as Orchestrator
-participant QS as queue_status
+participant QS as status
 participant GL as Issue Tracker
 participant PJ as projects.json
 participant AL as audit.log

-A->>QS: queue_status({ projectGroupId: "-123" })
+A->>QS: status({ projectGroupId: "-123" })
 QS->>PJ: readProjects()
 PJ-->>QS: { dev: idle, qa: idle }
-QS->>GL: glab issue list --label "To Do"
+QS->>GL: list issues by label "To Do"
 GL-->>QS: [{ id: 42, title: "Add login page" }]
-QS->>GL: glab issue list --label "To Test"
+QS->>GL: list issues by label "To Test"
 GL-->>QS: []
-QS->>GL: glab issue list --label "To Improve"
+QS->>GL: list issues by label "To Improve"
 GL-->>QS: []
-QS->>AL: append { event: "queue_status", ... }
+QS->>AL: append { event: "status", ... }
 QS-->>A: { dev: idle, queue: { toDo: [#42] } }
 ```

-**Orchestrator decides:** DEV is idle, issue #42 is in To Do → pick it up. Evaluates complexity → assigns medior tier.
+**Orchestrator decides:** DEV is idle, issue #42 is in To Do → pick it up. Evaluates complexity → assigns medior level.

 ### Phase 3: DEV pickup

-The plugin handles everything end-to-end — tier resolution, session lookup, label transition, state update, **and** task dispatch to the worker session. The agent's only job after is to post the announcement.
+The plugin handles everything end-to-end — level resolution, session lookup, label transition, state update, **and** task dispatch to the worker session. The agent's only job after is to post the announcement.

 ```mermaid
 sequenceDiagram
 participant A as Orchestrator
-participant TP as task_pickup
+participant WS as work_start
 participant GL as Issue Tracker
-participant TIER as Tier Resolver
+participant TIER as Level Resolver
 participant GW as Gateway RPC
-participant CLI as openclaw agent CLI
+participant CLI as openclaw gateway call agent
 participant PJ as projects.json
 participant AL as audit.log

-A->>TP: task_pickup({ issueId: 42, role: "dev", projectGroupId: "-123", model: "medior" })
-TP->>PJ: readProjects()
-TP->>GL: glab issue view 42 --output json
-GL-->>TP: { title: "Add login page", labels: ["To Do"] }
-TP->>TP: Verify label is "To Do" ✓
-TP->>TIER: resolve "medior" → "anthropic/claude-sonnet-4-5"
-TP->>PJ: lookup dev.sessions.medior
-TP->>GL: glab issue update 42 --unlabel "To Do" --label "Doing"
+A->>WS: work_start({ issueId: 42, role: "dev", projectGroupId: "-123", level: "medior" })
+WS->>PJ: readProjects()
+WS->>GL: getIssue(42)
+GL-->>WS: { title: "Add login page", labels: ["To Do"] }
+WS->>WS: Verify label is "To Do"
+WS->>TIER: resolve "medior" → "anthropic/claude-sonnet-4-5"
+WS->>PJ: lookup dev.sessions.medior
+WS->>GL: transitionLabel(42, "To Do", "Doing")
 alt New session
-TP->>GW: sessions.patch({ key: new-key, model: "anthropic/claude-sonnet-4-5" })
+WS->>GW: sessions.patch({ key: new-key, model: "anthropic/claude-sonnet-4-5" })
 end
-TP->>CLI: openclaw agent --session-id <key> --message "task..."
-TP->>PJ: activateWorker + store session key
-TP->>AL: append task_pickup + model_selection
-TP-->>A: { success: true, announcement: "🔧 ..." }
+WS->>CLI: openclaw gateway call agent --params { sessionKey, message }
+WS->>PJ: activateWorker + store session key
+WS->>AL: append work_start + model_selection
+WS-->>A: { success: true, announcement: "🔧 ..." }
 ```

 **Writes:**
 - `Issue Tracker`: label "To Do" → "Doing"
-- `projects.json`: dev.active=true, dev.issueId="42", dev.model="medior", dev.sessions.medior=key
-- `audit.log`: 2 entries (task_pickup, model_selection)
+- `projects.json`: dev.active=true, dev.issueId="42", dev.level="medior", dev.sessions.medior=key
+- `audit.log`: 2 entries (work_start, model_selection)
 - `Session`: task message delivered to worker session via CLI
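
The `projects.json` write in the list above can be sketched as a pure state update. Field names follow the list (`active`, `issueId`, `level`, `sessions`); the function signature and the example session key are assumptions, and this sketch leaves out file persistence entirely.

```typescript
// Worker slot as described in the Writes list; sessions is keyed by level.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  level: string | null;
  sessions: Record<string, string | null>;
}

// Hypothetical signature: returns a new state object rather than mutating,
// so a failed write cannot leave a half-updated worker behind.
function activateWorker(
  worker: WorkerState,
  issueId: string,
  level: string,
  sessionKey: string
): WorkerState {
  return {
    active: true,
    issueId,
    level,
    sessions: { ...worker.sessions, [level]: sessionKey },
  };
}

const idleDev: WorkerState = {
  active: false,
  issueId: null,
  level: null,
  sessions: { junior: null, medior: null, senior: null },
};
const activeDev = activateWorker(idleDev, "42", "medior", "key-m");
```

Only the slot for the chosen level changes; keys stored for other levels are carried over untouched.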

 ### Phase 4: DEV works

 ```
 DEV sub-agent session → reads codebase, writes code, creates MR
-DEV sub-agent session → calls task_complete({ role: "dev", result: "done", ... })
+DEV sub-agent session → calls work_finish({ role: "dev", result: "done", ... })
 ```

-This happens inside the OpenClaw session. The worker calls `task_complete` directly for atomic state updates. If the worker discovers unrelated bugs, it calls `task_create` to file them.
+This happens inside the OpenClaw session. The worker calls `work_finish` directly for atomic state updates. If the worker discovers unrelated bugs, it calls `task_create` to file them.

 ### Phase 5: DEV complete (worker self-reports)

 ```mermaid
 sequenceDiagram
 participant DEV as DEV Session
-participant TC as task_complete
+participant WF as work_finish
 participant GL as Issue Tracker
 participant PJ as projects.json
 participant AL as audit.log
 participant REPO as Git Repo
 participant QA as QA Session (auto-chain)

-DEV->>TC: task_complete({ role: "dev", result: "done", projectGroupId: "-123", summary: "Login page with OAuth" })
-TC->>PJ: readProjects()
-PJ-->>TC: { dev: { active: true, issueId: "42" } }
-TC->>REPO: git pull
-TC->>PJ: deactivateWorker(-123, dev)
+DEV->>WF: work_finish({ role: "dev", result: "done", projectGroupId: "-123", summary: "Login page with OAuth" })
+WF->>PJ: readProjects()
+PJ-->>WF: { dev: { active: true, issueId: "42" } }
+WF->>REPO: git pull
+WF->>PJ: deactivateWorker(-123, dev)
 Note over PJ: active→false, issueId→null<br/>sessions map PRESERVED
-TC->>GL: transition label "Doing" → "To Test"
-TC->>AL: append { event: "task_complete", role: "dev", result: "done" }
+WF->>GL: transitionLabel "Doing" → "To Test"
+WF->>AL: append { event: "work_finish", role: "dev", result: "done" }

 alt autoChain enabled
-TC->>GL: transition label "To Test" → "Testing"
-TC->>QA: dispatchTask(role: "qa", tier: "qa")
-TC->>PJ: activateWorker(-123, qa)
-TC-->>DEV: { announcement: "✅ DEV done #42", autoChain: { dispatched: true, role: "qa" } }
+WF->>GL: transitionLabel "To Test" → "Testing"
+WF->>QA: dispatchTask(role: "qa", level: "reviewer")
+WF->>PJ: activateWorker(-123, qa)
+WF-->>DEV: { announcement: "✅ DEV DONE #42", autoChain: { dispatched: true, role: "qa" } }
 else autoChain disabled
-TC-->>DEV: { announcement: "✅ DEV done #42", nextAction: "qa_pickup" }
+WF-->>DEV: { announcement: "✅ DEV DONE #42", nextAction: "qa_pickup" }
 end
 ```

@@ -347,30 +347,30 @@ sequenceDiagram
 - `Git repo`: pulled latest (has DEV's merged code)
 - `projects.json`: dev.active=false, dev.issueId=null (sessions map preserved for reuse)
 - `Issue Tracker`: label "Doing" → "To Test" (+ "To Test" → "Testing" if auto-chain)
-- `audit.log`: 1 entry (task_complete) + optional auto-chain entries
+- `audit.log`: 1 entry (work_finish) + optional auto-chain entries
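
The data flow map in this document describes `audit.log` as NDJSON with one line per event, so an append and a filter can be sketched as below. The `event`, `role`, and `result` fields appear in the diagrams above; the `ts` field, the helper names, and the sample timestamps are assumptions for illustration.

```typescript
interface AuditEntry {
  ts: string; // assumed timestamp field, not confirmed by the source
  event: string;
  role?: string;
  result?: string;
}

// One JSON object per line, newline-terminated, so appends never rewrite old lines.
function toAuditLine(entry: AuditEntry): string {
  return JSON.stringify(entry) + "\n";
}

// In-process equivalent of a jq filter such as select(.event=="work_finish").
function selectEvents(log: string, event: string): AuditEntry[] {
  return log
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditEntry)
    .filter((entry) => entry.event === event);
}

// Placeholder timestamps, for illustration only.
const sampleLog =
  toAuditLine({ ts: "2025-01-01T00:00:00Z", event: "work_start", role: "dev" }) +
  toAuditLine({ ts: "2025-01-01T01:00:00Z", event: "work_finish", role: "dev", result: "done" });
const finishEvents = selectEvents(sampleLog, "work_finish");
```

Because every line is independently parseable, a truncated final line costs at most one event rather than the whole log.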

 ### Phase 6: QA pickup

-Same as Phase 3, but with `role: "qa"`. Label transitions "To Test" → "Testing". Uses the qa tier.
+Same as Phase 3, but with `role: "qa"`. Label transitions "To Test" → "Testing". Uses the reviewer level.

-### Phase 7: QA result (3 possible outcomes)
+### Phase 7: QA result (4 possible outcomes)

 #### 7a. QA Pass

 ```mermaid
 sequenceDiagram
-participant A as Orchestrator
-participant TC as task_complete
+participant QA as QA Session
+participant WF as work_finish
 participant GL as Issue Tracker
 participant PJ as projects.json
 participant AL as audit.log

-A->>TC: task_complete({ role: "qa", result: "pass", projectGroupId: "-123" })
-TC->>PJ: deactivateWorker(-123, qa)
-TC->>GL: glab issue update 42 --unlabel "Testing" --label "Done"
-TC->>GL: glab issue close 42
-TC->>AL: append { event: "task_complete", role: "qa", result: "pass" }
-TC-->>A: { announcement: "🎉 QA PASS #42. Issue closed." }
+QA->>WF: work_finish({ role: "qa", result: "pass", projectGroupId: "-123" })
+WF->>PJ: deactivateWorker(-123, qa)
+WF->>GL: transitionLabel(42, "Testing", "Done")
+WF->>GL: closeIssue(42)
+WF->>AL: append { event: "work_finish", role: "qa", result: "pass" }
+WF-->>QA: { announcement: "🎉 QA PASS #42. Issue closed." }
 ```

 **Ticket complete.** Issue closed, label "Done".
@@ -379,18 +379,18 @@ sequenceDiagram

 ```mermaid
 sequenceDiagram
-participant A as Orchestrator
-participant TC as task_complete
+participant QA as QA Session
+participant WF as work_finish
 participant GL as Issue Tracker
 participant PJ as projects.json
 participant AL as audit.log

-A->>TC: task_complete({ role: "qa", result: "fail", projectGroupId: "-123", summary: "OAuth redirect broken" })
-TC->>PJ: deactivateWorker(-123, qa)
-TC->>GL: glab issue update 42 --unlabel "Testing" --label "To Improve"
-TC->>GL: glab issue reopen 42
-TC->>AL: append { event: "task_complete", role: "qa", result: "fail" }
-TC-->>A: { announcement: "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." }
+QA->>WF: work_finish({ role: "qa", result: "fail", projectGroupId: "-123", summary: "OAuth redirect broken" })
+WF->>PJ: deactivateWorker(-123, qa)
+WF->>GL: transitionLabel(42, "Testing", "To Improve")
+WF->>GL: reopenIssue(42)
+WF->>AL: append { event: "work_finish", role: "qa", result: "fail" }
+WF-->>QA: { announcement: "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." }
 ```

 **Cycle restarts:** Issue goes to "To Improve". Next heartbeat, DEV picks it up again (Phase 3, but from "To Improve" instead of "To Do").
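
The label transitions walked through in these phases can be collected into one lookup table. This sketch covers only the outcomes this document names explicitly (DEV done, QA pass, QA fail, and the two blocked cases described under Completion enforcement); any further QA outcome is left out rather than guessed. The `TRANSITIONS` table and `nextLabel` helper are hypothetical names.

```typescript
type Role = "dev" | "qa";

interface Transition {
  from: string;
  to: string;
  issueAction?: "close" | "reopen"; // extra tracker call, where the document shows one
}

// Keys are "<role>:<result>"; values come from the diagrams and text in this document.
const TRANSITIONS: Record<string, Transition> = {
  "dev:done": { from: "Doing", to: "To Test" },
  "dev:blocked": { from: "Doing", to: "To Do" },
  "qa:pass": { from: "Testing", to: "Done", issueAction: "close" },
  "qa:fail": { from: "Testing", to: "To Improve", issueAction: "reopen" },
  "qa:blocked": { from: "Testing", to: "To Test" },
};

// Hypothetical lookup: undefined means the result is not covered by this sketch.
function nextLabel(role: Role, result: string): Transition | undefined {
  return TRANSITIONS[`${role}:${result}`];
}

const qaFail = nextLabel("qa", "fail");
```

Keeping the transitions in data rather than in branching code makes it easy to compare the table against the state diagram when either changes.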
@@ -414,39 +414,35 @@ Worker cannot complete (missing info, environment errors, etc.). Issue returns t

 ### Completion enforcement

-Three layers guarantee that `task_complete` always runs:
+Three layers guarantee that `work_finish` always runs:

-1. **Completion contract** — Every task message sent to a worker session includes a mandatory `## MANDATORY: Task Completion` section listing available results and requiring `task_complete` even on failure. Workers are instructed to use `"blocked"` if stuck.
+1. **Completion contract** — Every task message sent to a worker session includes a mandatory `## MANDATORY: Task Completion` section listing available results and requiring `work_finish` even on failure. Workers are instructed to use `"blocked"` if stuck.

 2. **Blocked result** — Both DEV and QA can use `"blocked"` to gracefully return a task to queue without losing work. DEV blocked: `Doing → To Do`. QA blocked: `Testing → To Test`. This gives workers an escape hatch instead of silently dying.

-3. **Stale worker watchdog** — The heartbeat's health check detects workers active for >2 hours. With `autoFix=true`, it deactivates the worker and reverts the label back to queue. This catches sessions that crashed, ran out of context, or otherwise failed without calling `task_complete`. The `session_health` tool provides the same check for manual invocation.
+3. **Stale worker watchdog** — The heartbeat's health check detects workers active for >2 hours. With `fix=true`, it deactivates the worker and reverts the label back to queue. This catches sessions that crashed, ran out of context, or otherwise failed without calling `work_finish`. The `health` tool provides the same check for manual invocation.
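
The stale-worker check in layer 3 reduces to a timestamp comparison. A minimal sketch, assuming each active worker records when it started: the `startTime` field, the `isStale` helper, and the sample timestamps are hypothetical, while the 2-hour threshold is the one stated above.

```typescript
const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // the 2-hour threshold from the text

interface ActiveWorker {
  active: boolean;
  startTime: string | null; // ISO timestamp; assumed field, not confirmed by the source
}

// A worker is stale only if it is active and has been running longer than the threshold.
function isStale(worker: ActiveWorker, now: Date): boolean {
  if (!worker.active || worker.startTime === null) {
    return false;
  }
  const elapsedMs = now.getTime() - new Date(worker.startTime).getTime();
  return elapsedMs > STALE_AFTER_MS;
}

// Placeholder timestamps, for illustration only.
const checkTime = new Date("2025-01-01T12:00:00Z");
const recentWorker: ActiveWorker = { active: true, startTime: "2025-01-01T11:00:00Z" };
const stuckWorker: ActiveWorker = { active: true, startTime: "2025-01-01T09:00:00Z" };
const idleSlot: ActiveWorker = { active: false, startTime: "2025-01-01T09:00:00Z" };
```

Passing `now` in as a parameter keeps the check deterministic, so the threshold can be tested without waiting two hours.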

 ### Phase 8: Heartbeat (continuous)

-The heartbeat runs periodically (triggered by the agent or a scheduled message). It combines health check + queue scan:
+The heartbeat runs periodically (via background service or manual `work_heartbeat` trigger). It combines health check + queue scan:

 ```mermaid
 sequenceDiagram
-participant A as Orchestrator
-participant SH as session_health
-participant QS as queue_status
-participant TP as task_pickup
-Note over A: Heartbeat triggered
+participant HB as Heartbeat Service
+participant SH as health check
+participant TK as projectTick
+participant WS as work_start (dispatch)
+Note over HB: Tick triggered (every 60s)

-A->>SH: session_health({ autoFix: true })
-Note over SH: Checks sessions via Gateway RPC (sessions.list)
-SH-->>A: { healthy: true }
+HB->>SH: checkWorkerHealth per project per role
+Note over SH: Checks for zombies, stale workers
+SH-->>HB: { fixes applied }

-A->>QS: queue_status()
-QS-->>A: { projects: [{ dev: idle, queue: { toDo: [#43], toTest: [#44] } }] }
-
-Note over A: DEV idle + To Do #43 → assign medior
-A->>TP: task_pickup({ issueId: 43, role: "dev", model: "medior", ... })
-Note over TP: Plugin handles everything:<br/>tier resolve → session lookup →<br/>label transition → dispatch task →<br/>state update → audit log
-
-Note over A: QA idle + To Test #44 → assign qa
-A->>TP: task_pickup({ issueId: 44, role: "qa", model: "qa", ... })
+HB->>TK: projectTick per project
+Note over TK: Scans queue: To Improve > To Test > To Do
+TK->>WS: dispatchTask (fill free slots)
+WS-->>TK: { dispatched }
+TK-->>HB: { pickups, skipped }
 ```
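
The queue scan order in the diagram (To Improve > To Test > To Do) can be sketched as a priority pick per role. The mapping of labels to roles follows the phases above (DEV takes "To Do" and "To Improve", QA takes "To Test"); the `Queue` shape and the `pickNext` helper are assumptions for illustration.

```typescript
type QueueLabel = "To Improve" | "To Test" | "To Do";
type WorkerRole = "dev" | "qa";

// Highest priority first, as stated in the heartbeat diagram.
const PRIORITY: QueueLabel[] = ["To Improve", "To Test", "To Do"];

// Which role picks up which label, per the phases described in this document.
const ROLE_FOR_LABEL: Record<QueueLabel, WorkerRole> = {
  "To Improve": "dev",
  "To Test": "qa",
  "To Do": "dev",
};

type Queue = Record<QueueLabel, number[]>; // issue IDs per label (assumed shape)

// Hypothetical helper: the first issue under the highest-priority label the role handles.
function pickNext(queue: Queue, role: WorkerRole): { issueId: number; label: QueueLabel } | null {
  for (const label of PRIORITY) {
    if (ROLE_FOR_LABEL[label] !== role) {
      continue;
    }
    const issueId = queue[label][0];
    if (issueId !== undefined) {
      return { issueId, label };
    }
  }
  return null;
}

const sampleQueue: Queue = { "To Improve": [41], "To Test": [44], "To Do": [43] };
const nextDev = pickNext(sampleQueue, "dev");
const nextQa = pickNext(sampleQueue, "qa");
```

Putting "To Improve" first means rework on a failed issue is picked up before new work is started.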

 ## Data flow map
@@ -455,25 +451,27 @@ Every piece of data and where it lives:

 ```
 ┌─────────────────────────────────────────────────────────────────┐
-│ Issue Tracker (source of truth for tasks) │
+│ Issue Tracker (source of truth for tasks) │
 │ │
 │ Issue #42: "Add login page" │
-│ Labels: [To Do | Doing | To Test | Testing | Done | ...] │
+│ Labels: [Planning | To Do | Doing | To Test | Testing | ...] │
 │ State: open / closed │
 │ MRs/PRs: linked merge/pull requests │
 │ Created by: orchestrator (task_create), workers, or humans │
 └─────────────────────────────────────────────────────────────────┘
-↕ glab/gh CLI (read/write, auto-detected)
+↕ gh/glab CLI (read/write, auto-detected)
 ┌─────────────────────────────────────────────────────────────────┐
 │ DevClaw Plugin (orchestration logic) │
 │ │
-│ devclaw_setup → agent creation + workspace + model config │
-│ task_pickup → tier + label + dispatch + role instr (e2e) │
-│ task_complete → label + state + git pull + auto-chain │
-│ task_create → create issue in tracker │
-│ queue_status → read labels + read state │
-│ session_health → check sessions + fix zombies │
-│ project_register → labels + prompts + state init (one-time) │
+│ setup → agent creation + workspace + model config │
+│ work_start → level + label + dispatch + role instr (e2e) │
+│ work_finish → label + state + git pull + auto-chain │
+│ task_create → create issue in tracker │
+│ task_update → manual label state change │
+│ task_comment → add comment to issue │
+│ status → read labels + read state │
+│ health → check sessions + fix zombies │
+│ project_register → labels + prompts + state init (one-time) │
 └─────────────────────────────────────────────────────────────────┘
 ↕ atomic file I/O ↕ OpenClaw CLI (plugin shells out)
 ┌────────────────────────────────┐ ┌──────────────────────────────┐
@@ -481,39 +479,40 @@ Every piece of data and where it lives:
 │ │ │ (called by plugin, not agent)│
 │ Per project: │ │ │
 │ dev: │ │ openclaw gateway call │
-│ active, issueId, model │ │ sessions.patch → create │
+│ active, issueId, level │ │ sessions.patch → create │
 │ sessions: │ │ sessions.list → health │
 │ junior: <key> │ │ sessions.delete → cleanup │
 │ medior: <key> │ │ │
-│ senior: <key> │ │ openclaw agent │
-│ qa: │ │ --session-id <key> │
-│ active, issueId, model │ │ --message "task..." │
+│ senior: <key> │ │ openclaw gateway call agent │
+│ qa: │ │ --params { sessionKey, │
+│ active, issueId, level │ │ message, agentId } │
 │ sessions: │ │ → dispatches to session │
-│ qa: <key> │ │ │
+│ reviewer: <key> │ │ │
+│ tester: <key> │ │ │
 └────────────────────────────────┘ └──────────────────────────────┘
 ↕ append-only
 ┌─────────────────────────────────────────────────────────────────┐
 │ log/audit.log (observability) │
 │ │
 │ NDJSON, one line per event: │
-│ task_pickup, task_complete, model_selection, │
-│ queue_status, health_check, session_spawn, session_reuse, │
-│ project_register, devclaw_setup │
+│ work_start, work_finish, model_selection, │
+│ status, health, task_create, task_update, │
+│ task_comment, project_register, setup, heartbeat_tick │
 │ │
-│ Query with: cat audit.log | jq 'select(.event=="task_pickup")' │
+│ Query: cat audit.log | jq 'select(.event=="work_start")' │
 └─────────────────────────────────────────────────────────────────┘

 ┌─────────────────────────────────────────────────────────────────┐
-│ Telegram (user-facing messages) │
+│ Telegram / WhatsApp (user-facing messages) │
 │ │
 │ Per group chat: │
-│ "🔧 Spawning DEV (medior) for #42: Add login page" │
+│ "🔧 Spawning DEV (medior) for #42: Add login page" │
 │ "⚡ Sending DEV (medior) for #57: Fix validation" │
-│ "✅ DEV done #42 — Login page with OAuth. Moved to QA queue."│
+│ "✅ DEV DONE #42 — Login page with OAuth." │
 │ "🎉 QA PASS #42. Issue closed." │
 │ "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." │
 │ "🚫 DEV BLOCKED #42 — Missing dependencies. Returned to queue."│
|
||||
│ "🚫 QA BLOCKED #42 — Env not available. Returned to QA queue."│
|
||||
│ "❌ QA FAIL #42 — OAuth redirect broken." │
|
||||
│ "🚫 DEV BLOCKED #42 — Missing dependencies." │
|
||||
│ "🚫 QA BLOCKED #42 — Env not available." │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
@@ -521,7 +520,7 @@ Every piece of data and where it lives:
|
||||
│ │
|
||||
│ DEV sub-agent sessions: read code, write code, create MRs │
|
||||
│ QA sub-agent sessions: read code, run tests, review MRs │
|
||||
│ task_complete (DEV done): git pull to sync latest │
|
||||
│ work_finish (DEV done): git pull to sync latest │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
@@ -553,7 +552,7 @@ graph LR
|
||||
subgraph "Sub-agent sessions handle"
|
||||
CR[Code writing]
|
||||
MR[MR creation/review]
|
||||
TC_W[Task completion<br/>via task_complete]
|
||||
WF_W[Task completion<br/>via work_finish]
|
||||
BUG[Bug filing<br/>via task_create]
|
||||
end
|
||||
|
||||
@@ -565,20 +564,22 @@ graph LR
|
||||
|
||||
## IssueProvider abstraction
|
||||
|
||||
All issue tracker operations go through the `IssueProvider` interface, defined in `lib/issue-provider.ts`. This abstraction allows DevClaw to support multiple issue trackers without changing tool logic.
|
||||
All issue tracker operations go through the `IssueProvider` interface, defined in `lib/providers/provider.ts`. This abstraction allows DevClaw to support multiple issue trackers without changing tool logic.
|
||||
|
||||
**Interface methods:**
|
||||
- `ensureLabel` / `ensureAllStateLabels` — idempotent label creation
|
||||
- `createIssue` — create issue with label and assignees
|
||||
- `listIssuesByLabel` / `getIssue` — issue queries
|
||||
- `transitionLabel` — atomic label state transition (unlabel + label)
|
||||
- `closeIssue` / `reopenIssue` — issue lifecycle
|
||||
- `hasStateLabel` / `getCurrentStateLabel` — label inspection
|
||||
- `hasMergedMR` — MR/PR verification
|
||||
- `hasMergedMR` / `getMergedMRUrl` — MR/PR verification
|
||||
- `addComment` — add comment to issue
|
||||
- `healthCheck` — verify provider connectivity
|
||||
|
||||
**Current providers:**
|
||||
- **GitLab** (`lib/providers/gitlab.ts`) — wraps `glab` CLI
|
||||
- **GitHub** (`lib/providers/github.ts`) — wraps `gh` CLI
|
||||
- **GitLab** (`lib/providers/gitlab.ts`) — wraps `glab` CLI
|
||||
|
||||
**Planned providers:**
|
||||
- **Jira** — via REST API
|
||||
@@ -589,16 +590,16 @@ Provider selection is handled by `createProvider()` in `lib/providers/index.ts`.
|
||||
|
||||
| Failure | Detection | Recovery |
|
||||
|---|---|---|
|
||||
| Session dies mid-task | `session_health` checks via `sessions.list` Gateway RPC | `autoFix`: reverts label, clears active state, removes dead session from sessions map. Next heartbeat picks up task again (creates fresh session for that tier). |
|
||||
| glab command fails | Plugin tool throws error, returns to agent | Agent retries or reports to Telegram group |
|
||||
| `openclaw agent` CLI fails | Plugin catches error during dispatch | Plugin rolls back: reverts label, clears active state. Returns error to agent for reporting. |
|
||||
| `sessions.patch` fails | Plugin catches error during session creation | Plugin rolls back label transition. Returns error. No orphaned state. |
|
||||
| Session dies mid-task | `health` checks via `sessions.list` Gateway RPC | `fix=true`: reverts label, clears active state. Next heartbeat picks up task again (creates fresh session for that level). |
|
||||
| gh/glab command fails | Plugin tool throws error, returns to agent | Agent retries or reports to Telegram group |
|
||||
| `openclaw gateway call agent` fails | Plugin catches error during dispatch | Plugin rolls back: reverts label, clears active state. Returns error. No orphaned state. |
|
||||
| `sessions.patch` fails | Plugin catches error during session creation | Plugin rolls back label transition. Returns error. |
|
||||
| projects.json corrupted | Tool can't parse JSON | Manual fix needed. Atomic writes (temp+rename) prevent partial writes. |
|
||||
| Label out of sync | `task_pickup` verifies label before transitioning | Throws error if label doesn't match expected state. Agent reports mismatch. |
|
||||
| Worker already active | `task_pickup` checks `active` flag | Throws error: "DEV worker already active on project". Must complete current task first. |
|
||||
| Stale worker (>2h) | `session_health` and heartbeat health check | `autoFix`: deactivates worker, reverts label to queue (To Do / To Test). Task available for next pickup. |
|
||||
| Worker stuck/blocked | Worker calls `task_complete` with `"blocked"` | Deactivates worker, reverts label to queue. Issue available for retry. |
|
||||
| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. No partial state — labels are idempotent, projects.json not written until all labels succeed. |
|
||||
| Label out of sync | `work_start` verifies label before transitioning | Throws error if label doesn't match expected state. |
|
||||
| Worker already active | `work_start` checks `active` flag | Throws error: "DEV already active on project". Must complete current task first. |
|
||||
| Stale worker (>2h) | `health` and heartbeat health check | `fix=true`: deactivates worker, reverts label to queue. Task available for next pickup. |
|
||||
| Worker stuck/blocked | Worker calls `work_finish` with `"blocked"` | Deactivates worker, reverts label to queue. Issue available for retry. |
|
||||
| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. Labels are idempotent, projects.json not written until all labels succeed. |
|
||||
|
||||
## File locations
|
||||
|
||||
@@ -606,8 +607,9 @@ Provider selection is handled by `createProvider()` in `lib/providers/index.ts`.
|
||||
|---|---|---|
|
||||
| Plugin source | `~/.openclaw/extensions/devclaw/` | Plugin code |
|
||||
| Plugin manifest | `~/.openclaw/extensions/devclaw/openclaw.plugin.json` | Plugin registration |
|
||||
| Agent config | `~/.openclaw/openclaw.json` | Agent definition + tool permissions + tier config |
|
||||
| Agent config | `~/.openclaw/openclaw.json` | Agent definition + tool permissions + model config |
|
||||
| Worker state | `~/.openclaw/workspace-<agent>/projects/projects.json` | Per-project DEV/QA state |
|
||||
| Role instructions | `~/.openclaw/workspace-<agent>/projects/roles/<project>/` | Per-project `dev.md` and `qa.md` |
|
||||
| Audit log | `~/.openclaw/workspace-<agent>/log/audit.log` | NDJSON event log |
|
||||
| Session transcripts | `~/.openclaw/agents/<agent>/sessions/<uuid>.jsonl` | Conversation history per session |
|
||||
| Git repos | `~/git/<project>/` | Project source code |
|
||||
|
||||
354
docs/CONFIGURATION.md
Normal file
@@ -0,0 +1,354 @@
|
||||
# DevClaw — Configuration Reference
|
||||
|
||||
All DevClaw configuration lives in two places: `openclaw.json` (plugin-level settings) and `projects.json` (per-project state).
|
||||
|
||||
## Plugin Configuration (`openclaw.json`)
|
||||
|
||||
DevClaw is configured under `plugins.entries.devclaw.config` in `openclaw.json`.
|
||||
|
||||
### Model Tiers
|
||||
|
||||
Override which model powers each worker level (DEV and QA):
|
||||
|
||||
```json
{
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "models": {
            "dev": {
              "junior": "anthropic/claude-haiku-4-5",
              "medior": "anthropic/claude-sonnet-4-5",
              "senior": "anthropic/claude-opus-4-5"
            },
            "qa": {
              "reviewer": "anthropic/claude-sonnet-4-5",
              "tester": "anthropic/claude-haiku-4-5"
            }
          }
        }
      }
    }
  }
}
```
|
||||
|
||||
**Resolution order** (per `lib/tiers.ts:resolveModel`):
|
||||
|
||||
1. Plugin config `models.<role>.<level>`: explicit override
2. `DEFAULT_MODELS[role][level]`: built-in defaults (table below)
3. Passthrough: treat the level string as a raw model ID
|
||||
|
||||
**Default models:**
|
||||
|
||||
| Role | Level | Default model |
|---|---|---|
| dev | junior | `anthropic/claude-haiku-4-5` |
| dev | medior | `anthropic/claude-sonnet-4-5` |
| dev | senior | `anthropic/claude-opus-4-5` |
| qa | reviewer | `anthropic/claude-sonnet-4-5` |
| qa | tester | `anthropic/claude-haiku-4-5` |
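The three-step resolution order and the default table can be sketched as follows. This is a minimal illustration, not the code in `lib/tiers.ts`: the `Role` and `ModelOverrides` types and the `overrides` parameter are assumptions made for the example, and only the lookup order is taken from the text above.

```typescript
// Hypothetical sketch of the lookup order; the real resolveModel may differ.
type Role = "dev" | "qa";
type ModelOverrides = Partial<Record<Role, Record<string, string>>>;

// Built-in defaults, copied from the default models table.
const DEFAULT_MODELS: Record<Role, Record<string, string>> = {
  dev: {
    junior: "anthropic/claude-haiku-4-5",
    medior: "anthropic/claude-sonnet-4-5",
    senior: "anthropic/claude-opus-4-5",
  },
  qa: {
    reviewer: "anthropic/claude-sonnet-4-5",
    tester: "anthropic/claude-haiku-4-5",
  },
};

function resolveModel(role: Role, level: string, overrides: ModelOverrides = {}): string {
  // 1. Explicit override from plugin config: models.<role>.<level>
  const override = overrides[role]?.[level];
  if (override) return override;
  // 2. Built-in default for this role and level
  const fallback = DEFAULT_MODELS[role][level];
  if (fallback) return fallback;
  // 3. Passthrough: treat the level string as a raw model ID
  return level;
}
```

Because of the passthrough step, an unrecognized level is never an error in this sketch; it is simply handed on as if it were a model ID.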
|
||||
|
||||
### Project Execution Mode
|
||||
|
||||
Controls cross-project parallelism:
|
||||
|
||||
```json
{
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "projectExecution": "parallel"
        }
      }
    }
  }
}
```
|
||||
|
||||
| Value | Behavior |
|---|---|
| `"parallel"` (default) | Multiple projects can have active workers simultaneously |
| `"sequential"` | Only one project's workers active at a time. Useful for single-agent deployments. |
|
||||
|
||||
Enforced in `work_heartbeat` and the heartbeat service before dispatching.
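A rough sketch of what such a pre-dispatch check could look like is shown here. The `ProjectActivity` shape and the `canDispatch` name are hypothetical and exist only for this example; the actual enforcement code is not shown in this document.

```typescript
// Hypothetical shapes for illustration; not the plugin's real types.
type ProjectExecution = "parallel" | "sequential";

interface ProjectActivity {
  name: string;
  hasActiveWorker: boolean; // true when the project's dev or qa worker is active
}

function canDispatch(
  mode: ProjectExecution,
  projects: ProjectActivity[],
  target: string,
): boolean {
  // Parallel: projects never block each other.
  if (mode === "parallel") return true;
  // Sequential: allow only if no *other* project currently has an active worker.
  return projects.every((p) => p.name === target || !p.hasActiveWorker);
}
```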
|
||||
|
||||
### Heartbeat Service
|
||||
|
||||
Token-free interval-based health checks + queue dispatch:
|
||||
|
||||
```json
{
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "work_heartbeat": {
            "enabled": true,
            "intervalSeconds": 60,
            "maxPickupsPerTick": 4
          }
        }
      }
    }
  }
}
```
|
||||
|
||||
| Setting | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Enable the heartbeat service |
| `intervalSeconds` | number | `60` | Seconds between ticks |
| `maxPickupsPerTick` | number | `4` | Maximum worker dispatches per tick (budget control) |
|
||||
|
||||
**Source:** [`lib/services/heartbeat.ts`](../lib/services/heartbeat.ts)
|
||||
|
||||
The heartbeat service runs as a plugin service tied to the gateway lifecycle. Every tick: health pass (auto-fix zombies, stale workers) → tick pass (fill free slots by priority). Zero LLM tokens consumed.
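The two passes of a tick can be sketched like this. Everything here is illustrative: `QueuedTask`, `TickDeps`, and the convention that a larger `priority` number means "more urgent" are assumptions for the example, not the contents of `lib/services/heartbeat.ts`.

```typescript
// Hypothetical types; the real service may be structured differently.
interface QueuedTask {
  project: string;
  issueId: string;
  priority: number; // assumption: higher number = picked first
}

interface TickDeps {
  runHealthPass: () => void;            // auto-fix zombies and stale workers
  listDispatchable: () => QueuedTask[]; // tasks whose worker slot is free
  dispatch: (task: QueuedTask) => void; // start a worker for the task
}

function heartbeatTick(deps: TickDeps, maxPickupsPerTick = 4): QueuedTask[] {
  // Health pass first, so slots freed by the fixes are visible to the tick pass.
  deps.runHealthPass();
  // Tick pass: highest priority first, capped by the per-tick budget.
  const picked = [...deps.listDispatchable()]
    .sort((a, b) => b.priority - a.priority)
    .slice(0, maxPickupsPerTick);
  for (const task of picked) deps.dispatch(task);
  return picked;
}
```

Nothing in this loop calls a model, which is the sense in which a tick consumes no LLM tokens; tokens are only spent once a dispatched worker session starts running.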
|
||||
|
||||
### Notifications
|
||||
|
||||
Control which lifecycle events send notifications:
|
||||
|
||||
```json
{
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "notifications": {
            "heartbeatDm": true,
            "workerStart": true,
            "workerComplete": true
          }
        }
      }
    }
  }
}
```
|
||||
|
||||
| Setting | Default | Description |
|---|---|---|
| `heartbeatDm` | `true` | Send heartbeat summary to orchestrator DM |
| `workerStart` | `true` | Announce when a worker picks up a task |
| `workerComplete` | `true` | Announce when a worker finishes a task |
|
||||
|
||||
### DevClaw Agent IDs
|
||||
|
||||
List which agents are recognized as DevClaw orchestrators (used for context detection):
|
||||
|
||||
```json
{
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "devClawAgentIds": ["my-orchestrator"]
        }
      }
    }
  }
}
```
|
||||
|
||||
### Agent Tool Permissions
|
||||
|
||||
Restrict DevClaw tools to your orchestrator agent:
|
||||
|
||||
```json
{
  "agents": {
    "list": [
      {
        "id": "my-orchestrator",
        "tools": {
          "allow": [
            "work_start",
            "work_finish",
            "task_create",
            "task_update",
            "task_comment",
            "status",
            "health",
            "work_heartbeat",
            "project_register",
            "setup",
            "onboard"
          ]
        }
      }
    ]
  }
}
```
|
||||
|
||||
---
|
||||
|
||||
## Project State (`projects.json`)
|
||||
|
||||
All project state lives in `<workspace>/projects/projects.json`, keyed by group ID.
|
||||
|
||||
**Source:** [`lib/projects.ts`](../lib/projects.ts)
|
||||
|
||||
### Schema
|
||||
|
||||
```json
{
  "projects": {
    "<groupId>": {
      "name": "my-webapp",
      "repo": "~/git/my-webapp",
      "groupName": "Dev - My Webapp",
      "baseBranch": "development",
      "deployBranch": "development",
      "deployUrl": "https://my-webapp.example.com",
      "channel": "telegram",
      "roleExecution": "parallel",
      "dev": {
        "active": false,
        "issueId": null,
        "startTime": null,
        "level": null,
        "sessions": {
          "junior": null,
          "medior": "agent:orchestrator:subagent:my-webapp-dev-medior",
          "senior": null
        }
      },
      "qa": {
        "active": false,
        "issueId": null,
        "startTime": null,
        "level": null,
        "sessions": {
          "reviewer": "agent:orchestrator:subagent:my-webapp-qa-reviewer",
          "tester": null
        }
      }
    }
  }
}
```
|
||||
|
||||
### Project fields
|
||||
|
||||
| Field | Type | Description |
|---|---|---|
| `name` | string | Short project name |
| `repo` | string | Path to git repo (supports `~/` expansion) |
| `groupName` | string | Group display name |
| `baseBranch` | string | Base branch for development |
| `deployBranch` | string | Branch that triggers deployment |
| `deployUrl` | string | Deployment URL |
| `channel` | string | Messaging channel (`"telegram"`, `"whatsapp"`, etc.) |
| `roleExecution` | `"parallel"` \| `"sequential"` | DEV/QA parallelism for this project |
|
||||
|
||||
### Worker state fields
|
||||
|
||||
Each project has `dev` and `qa` worker state objects:
|
||||
|
||||
| Field | Type | Description |
|---|---|---|
| `active` | boolean | Whether this role has an active worker |
| `issueId` | string \| null | Issue being worked on (as string) |
| `startTime` | string \| null | ISO timestamp when worker became active |
| `level` | string \| null | Current level (`junior`, `medior`, `senior`, `reviewer`, `tester`) |
| `sessions` | Record<string, string \| null> | Per-level session keys |
|
||||
|
||||
**DEV session keys:** `junior`, `medior`, `senior`
|
||||
**QA session keys:** `reviewer`, `tester`
|
||||
|
||||
### Key design decisions
|
||||
|
||||
- **Session-per-level**: each level gets its own worker session, accumulating context independently. Level selection maps directly to a session key.
- **Sessions preserved on completion**: when a worker completes a task, the sessions map is preserved (only `active`, `issueId`, and `startTime` are cleared). This enables session reuse.
- **Atomic writes**: all writes go through temp-file-then-rename to prevent corruption.
- **Sessions persist indefinitely**: no auto-cleanup. The `health` tool handles manual cleanup.
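Two of these decisions, preserving sessions on completion and writing via temp-file-then-rename, can be illustrated with a short sketch. The function names `deactivateWorker` and `writeJsonAtomic`, the temp-file naming, and the throwaway directory used in the usage lines are assumptions for the example; the real code in `lib/projects.ts` may look different.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Mirrors the worker state fields documented above.
interface WorkerState {
  active: boolean;
  issueId: string | null;
  startTime: string | null;
  level: string | null;
  sessions: Record<string, string | null>;
}

// Clear only the per-task fields; the sessions map is kept so the
// next task at the same level can reuse its session.
function deactivateWorker(worker: WorkerState): WorkerState {
  return { ...worker, active: false, issueId: null, startTime: null };
}

// Write to a sibling temp file, then rename it over the target. On POSIX
// filesystems a same-directory rename replaces the file in one step, so a
// reader sees either the old content or the new content, never a partial file.
function writeJsonAtomic(path: string, data: unknown): void {
  const tmp = `${path}.${process.pid}.tmp`; // hypothetical naming scheme
  writeFileSync(tmp, JSON.stringify(data, null, 2) + "\n", "utf8");
  renameSync(tmp, path);
}

// Usage against a throwaway directory (not the real workspace).
const dir = mkdtempSync(join(tmpdir(), "devclaw-sketch-"));
const file = join(dir, "projects.json");
writeJsonAtomic(file, { projects: {} });
const roundTrip = JSON.parse(readFileSync(file, "utf8"));
```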
|
||||
|
||||
---
|
||||
|
||||
## Workspace File Layout
|
||||
|
||||
```
<workspace>/
├── projects/
│   ├── projects.json          ← Project state (auto-managed)
│   └── roles/
│       ├── my-webapp/         ← Per-project role instructions (editable)
│       │   ├── dev.md
│       │   └── qa.md
│       ├── another-project/
│       │   ├── dev.md
│       │   └── qa.md
│       └── default/           ← Fallback role instructions
│           ├── dev.md
│           └── qa.md
├── log/
│   └── audit.log              ← NDJSON event log (auto-managed)
├── AGENTS.md                  ← Agent identity documentation
└── HEARTBEAT.md               ← Heartbeat operation guide
```
|
||||
|
||||
### Role instruction files
|
||||
|
||||
`work_start` loads role instructions from `projects/roles/<project>/<role>.md` at dispatch time, falling back to `projects/roles/default/<role>.md`. These files are appended to the task message sent to worker sessions.
|
||||
|
||||
Edit to customize: deployment steps, test commands, acceptance criteria, coding standards.
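The project-then-default lookup described above can be sketched as follows. To keep the example free of file I/O, the file reader is injected; the `ReadFile` type, the `buildTaskMessage` helper, and the choice to return an empty string when neither file exists are assumptions for the example, not the behaviour of `lib/dispatch.ts`.

```typescript
// Hypothetical reader: returns the file content, or null when the file is missing.
type ReadFile = (relativePath: string) => string | null;

function loadRoleInstructions(read: ReadFile, project: string, role: "dev" | "qa"): string {
  // Project-specific file first, then the shared default.
  const candidates = [
    `projects/roles/${project}/${role}.md`,
    `projects/roles/default/${role}.md`,
  ];
  for (const path of candidates) {
    const text = read(path);
    if (text !== null) return text;
  }
  return ""; // assumption: no instructions when neither file exists
}

// Instructions are appended to the task message sent to the worker session.
function buildTaskMessage(task: string, instructions: string): string {
  return instructions ? `${task}\n\n${instructions}` : task;
}
```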
|
||||
|
||||
**Source:** [`lib/dispatch.ts:loadRoleInstructions`](../lib/dispatch.ts)
|
||||
|
||||
---
|
||||
|
||||
## Audit Log
|
||||
|
||||
Append-only NDJSON at `<workspace>/log/audit.log`. Auto-truncated to 250 lines.
|
||||
|
||||
**Source:** [`lib/audit.ts`](../lib/audit.ts)
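As an illustration of the append-and-truncate behaviour, here is a small in-memory sketch. The `ts` timestamp field and both function names are assumptions for the example; only the `event` and `project` fields and the 250-line cap come from this page.

```typescript
// One NDJSON record; extra fields are allowed.
interface AuditEvent {
  event: string;
  project?: string;
  [key: string]: unknown;
}

// Append one JSON line and keep only the newest maxLines lines.
function appendAuditLine(
  existing: string,
  entry: AuditEvent,
  maxLines = 250,
  now: Date = new Date(),
): string {
  const line = JSON.stringify({ ts: now.toISOString(), ...entry }); // `ts` is hypothetical
  const lines = existing.split("\n").filter((l) => l.length > 0);
  lines.push(line);
  return lines.slice(-maxLines).join("\n") + "\n";
}

// Roughly what `jq 'select(.event=="...")'` does, for use from code.
function selectEvents(log: string, event: string): AuditEvent[] {
  return log
    .split("\n")
    .filter((l) => l.length > 0)
    .map((l) => JSON.parse(l) as AuditEvent)
    .filter((e) => e.event === event);
}
```

Because old lines are dropped once the cap is reached, the log is a rolling window of recent activity rather than a complete history.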
|
||||
|
||||
### Event types
|
||||
|
||||
| Event | Trigger |
|---|---|
| `work_start` | Task dispatched to worker |
| `model_selection` | Level resolved to model ID |
| `work_finish` | Task completed |
| `work_heartbeat` | Heartbeat tick completed |
| `task_create` | Issue created |
| `task_update` | Issue state changed |
| `task_comment` | Comment added to issue |
| `status` | Queue status queried |
| `health` | Health scan completed |
| `heartbeat_tick` | Heartbeat service tick (background) |
| `project_register` | Project registered |
|
||||
|
||||
### Querying
|
||||
|
||||
```bash
# All task dispatches
cat audit.log | jq 'select(.event=="work_start")'

# All completions for a project
cat audit.log | jq 'select(.event=="work_finish" and .project=="my-webapp")'

# Model selections
cat audit.log | jq 'select(.event=="model_selection")'
```
|
||||
|
||||
---
|
||||
|
||||
## Issue Provider
|
||||
|
||||
DevClaw uses an `IssueProvider` interface (`lib/providers/provider.ts`) to abstract issue tracker operations. The provider is auto-detected from the git remote URL.
|
||||
|
||||
**Supported providers:**
|
||||
|
||||
| Provider | CLI | Detection |
|---|---|---|
| GitHub | `gh` | Remote contains `github.com` |
| GitLab | `glab` | Remote contains `gitlab` |
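The detection rules in the table amount to a substring check on the remote URL. A minimal sketch, assuming a hypothetical `detectProvider` helper (the real selection happens in `createProvider()` and may handle more cases):

```typescript
type ProviderKind = "github" | "gitlab";

// Returns null when the remote matches neither rule from the table.
function detectProvider(remoteUrl: string): ProviderKind | null {
  const url = remoteUrl.toLowerCase();
  if (url.includes("github.com")) return "github";
  if (url.includes("gitlab")) return "gitlab"; // also matches self-hosted hosts such as gitlab.example.com
  return null;
}
```

The GitLab rule is deliberately looser than the GitHub one, which lets self-hosted instances match as long as the hostname or path contains `gitlab`.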
|
||||
|
||||
**Planned:** Jira (via REST API)
|
||||
|
||||
**Source:** [`lib/providers/index.ts`](../lib/providers/index.ts)
|
||||
@@ -1,6 +1,6 @@
|
||||
# Context-Aware DevClaw
|
||||
# DevClaw — Context Awareness
|
||||
|
||||
DevClaw now adapts its behavior based on how you interact with it.
|
||||
DevClaw adapts its behavior based on how you interact with it.
|
||||
|
||||
## Design Philosophy
|
||||
|
||||
@@ -12,170 +12,122 @@ DevClaw enforces strict boundaries between projects:
|
||||
- Project work happens **inside that project's group**
|
||||
- Setup and configuration happen **outside project groups**
|
||||
|
||||
This design prevents:
|
||||
- ❌ Cross-project contamination (workers picking up wrong project's tasks)
|
||||
- ❌ Confusion about which project you're working on
|
||||
- ❌ Accidental registration of wrong groups
|
||||
- ❌ Setup discussions cluttering project work channels
|
||||
This prevents:
|
||||
- Cross-project contamination (workers picking up wrong project's tasks)
|
||||
- Confusion about which project you're working on
|
||||
- Accidental registration of wrong groups
|
||||
- Setup discussions cluttering project work channels
|
||||
|
||||
This enables:
|
||||
- ✅ Clear mental model: "This group = this project"
|
||||
- ✅ Isolated work streams: Each project progresses independently
|
||||
- ✅ Dedicated teams: Workers focus on one project at a time
|
||||
- ✅ Clean separation: Setup vs. operational work
|
||||
- Clear mental model: "This group = this project"
|
||||
- Isolated work streams: Each project progresses independently
|
||||
- Dedicated teams: Workers focus on one project at a time
|
||||
- Clean separation: Setup vs. operational work
|
||||
|
||||
## Three Interaction Contexts
|
||||
|
||||
### 1. **Via Another Agent** (Setup Mode)
|
||||
When you talk to your main agent (like Henk) about DevClaw:
|
||||
- ✅ Use: `devclaw_onboard`, `devclaw_setup`
|
||||
- ❌ Avoid: `task_pickup`, `queue_status` (operational tools)
|
||||
### 1. Via Another Agent (Setup Mode)
|
||||
|
||||
When you talk to your main agent about DevClaw:
|
||||
- Use: `onboard`, `setup`
|
||||
- Avoid: `work_start`, `status` (operational tools)
|
||||
|
||||
**Example:**
|
||||
```
|
||||
User → Henk: "Can you help me set up DevClaw?"
|
||||
Henk → Calls devclaw_onboard
|
||||
User → Main Agent: "Can you help me set up DevClaw?"
|
||||
Main Agent → Calls onboard
|
||||
```
|
||||
|
||||
### 2. **Direct Message to DevClaw Agent**
|
||||
### 2. Direct Message to DevClaw Agent
|
||||
|
||||
When you DM the DevClaw agent directly on Telegram/WhatsApp:
|
||||
- ✅ Use: `queue_status` (all projects), `session_health` (system overview)
|
||||
- ❌ Avoid: `task_pickup` (project-specific work), setup tools
|
||||
- Use: `status` (all projects), `health` (system overview)
|
||||
- Avoid: `work_start` (project-specific work), setup tools
|
||||
|
||||
**Example:**
|
||||
```
|
||||
User → DevClaw DM: "Show me the status of all projects"
|
||||
DevClaw → Calls queue_status (shows all projects)
|
||||
DevClaw → Calls status (shows all projects)
|
||||
```
|
||||
|
||||
### 3. **Project Group Chat**
|
||||
### 3. Project Group Chat
|
||||
|
||||
When you message in a Telegram/WhatsApp group bound to a project:
|
||||
- ✅ Use: `task_pickup`, `task_complete`, `task_create`, `queue_status` (auto-filtered)
|
||||
- ❌ Avoid: Setup tools, system-wide queries
|
||||
- Use: `work_start`, `work_finish`, `task_create`, `status` (auto-filtered)
|
||||
- Avoid: Setup tools, system-wide queries
|
||||
|
||||
**Example:**
|
||||
```
|
||||
User → OpenClaw Dev Group: "@henk pick up issue #42"
|
||||
DevClaw → Calls task_pickup (only works in groups)
|
||||
User → Project Group: "pick up issue #42"
|
||||
DevClaw → Calls work_start (only works in groups)
|
||||
```
|
||||
|
||||
## How It Works
|
||||
|
||||
### Context Detection
|
||||
|
||||
Each tool automatically detects:
|
||||
- **Agent ID** - Is this the DevClaw agent or another agent?
|
||||
- **Message Channel** - Telegram, WhatsApp, or CLI?
|
||||
- **Session Key** - Is this a group chat or direct message?
|
||||
- **Agent ID** — Is this the DevClaw agent or another agent?
|
||||
- **Message Channel** — Telegram, WhatsApp, or CLI?
|
||||
- **Session Key** — Is this a group chat or direct message?
|
||||
- Format: `agent:{agentId}:{channel}:{type}:{id}`
|
||||
- Telegram group: `agent:devclaw:telegram:group:-5266044536`
|
||||
- WhatsApp group: `agent:devclaw:whatsapp:group:120363123@g.us`
|
||||
- DM: `agent:devclaw:telegram:user:657120585`
|
||||
- **Project Binding** - Which project is this group bound to?
|
||||
- **Project Binding** — Which project is this group bound to?
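The session key format lends itself to a simple parser. The sketch below is an assumption about how such parsing could work, not the code in `lib/context-guard.ts`; the `SessionContext` shape and `parseSessionKey` name are invented for the example.

```typescript
interface SessionContext {
  agentId: string;
  channel: string;
  type: string;
  id: string;
  isGroup: boolean;
}

// Parses keys of the form agent:{agentId}:{channel}:{type}:{id}.
// Returns null for anything else, including shorter sub-agent keys.
function parseSessionKey(key: string): SessionContext | null {
  const parts = key.split(":");
  if (parts.length < 5 || parts[0] !== "agent") return null;
  const [, agentId, channel, type] = parts;
  // Re-join the tail in case the id itself contains a colon.
  const id = parts.slice(4).join(":");
  return { agentId, channel, type, id, isGroup: type === "group" };
}
```

For example, `agent:devclaw:telegram:group:-5266044536` parses to a group context with id `-5266044536`, while `agent:devclaw:telegram:user:657120585` parses to a non-group (DM) context.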
|
||||
|
||||
### Guardrails
|
||||
|
||||
Tools include context-aware guidance in their responses:
|
||||
```json
|
||||
{
|
||||
"contextGuidance": "🛡️ Context: Project Group Chat (telegram)\n
|
||||
You're in a Telegram group for project 'openclaw-core'.\n
|
||||
Use task_pickup, task_complete for project work.",
|
||||
"contextGuidance": "Context: Project Group Chat (telegram)\n You're in a Telegram group for project 'my-webapp'.\n Use work_start, work_finish for project work.",
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
## Integrated Tools
|
||||
## Tool Context Requirements
|
||||
|
||||
### ✅ `devclaw_onboard`
|
||||
- **Works best:** Via another agent or direct DM
|
||||
- **Blocks:** Group chats (setup shouldn't happen in project groups)
|
||||
| Tool | Group chat | Direct DM | Via agent |
|---|---|---|---|
| `onboard` | Blocked | Works | Works |
| `setup` | Works | Works | Works |
| `work_start` | Works | Blocked | Blocked |
| `work_finish` | Works | Works | Works |
| `task_create` | Works | Works | Works |
| `task_update` | Works | Works | Works |
| `task_comment` | Works | Works | Works |
| `status` | Auto-filtered | All projects | Suggests onboard |
| `health` | Auto-filtered | All projects | Works |
| `work_heartbeat` | Single project | All projects | Works |
| `project_register` | Works (required) | Blocked | Blocked |
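The "Blocked" cells of this table can be expressed as a small lookup. This is only a sketch of the idea: the `Context` names, the `BLOCKED` map, and `isToolAllowed` are hypothetical, and the filtering behaviours ("Auto-filtered", "Suggests onboard") are not modelled.

```typescript
type Context = "group" | "direct" | "via-agent";

// Only the "Blocked" cells from the table; every other combination is allowed.
const BLOCKED: Record<string, Context[]> = {
  onboard: ["group"],
  work_start: ["direct", "via-agent"],
  project_register: ["direct", "via-agent"],
};

function isToolAllowed(tool: string, context: Context): boolean {
  return !(BLOCKED[tool] ?? []).includes(context);
}
```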
|
||||
|
||||
### ✅ `queue_status`
|
||||
- **Group context:** Auto-filters to that project
|
||||
- **Direct context:** Shows all projects
|
||||
- **Via-agent context:** Suggests using devclaw_onboard instead
|
||||
|
||||
### ✅ `task_pickup`
|
||||
- **ONLY works:** In project group chats
|
||||
- **Blocks:** Direct DMs and setup conversations
|
||||
|
||||
### ✅ `project_register`
|
||||
- **ONLY works:** In the Telegram/WhatsApp group you're registering
|
||||
- **Blocks:** Direct DMs and via-agent conversations
|
||||
- **Auto-detects:** Group ID from current chat (projectGroupId parameter now optional)
|
||||
|
||||
**Why this matters:**
|
||||
- **Project Isolation**: Each group = one project = one dedicated team
|
||||
- **Clear Boundaries**: Forces deliberate project registration from within the project's space
|
||||
- **Team Clarity**: You're physically in the group when binding it, making the connection explicit
|
||||
- **No Mistakes**: Impossible to accidentally register the wrong group when you're in it
|
||||
- **Natural Workflow**: "This group is for Project X" → register Project X here
|
||||
|
||||
## Testing
|
||||
|
||||
### Debug Tool
|
||||
Use `context_test` to see what context is detected:
|
||||
```
|
||||
# In any context:
|
||||
context_test
|
||||
|
||||
# Returns:
|
||||
{
|
||||
"detectedContext": { "type": "group", "projectName": "openclaw-core" },
|
||||
"guardrails": "🛡️ Context: Project Group Chat..."
|
||||
}
|
||||
```
|
||||
|
||||
### Manual Testing
|
||||
1. **Setup Mode:** Message your main agent → "Help me configure DevClaw"
|
||||
2. **Status Check:** DM DevClaw agent (Telegram/WhatsApp) → "Show me the queue"
|
||||
3. **Project Work:** Post in project group (Telegram/WhatsApp) → "@henk pick up #42"
|
||||
|
||||
Each context should trigger different guardrails.
|
||||
|
||||
## Configuration
|
||||
|
||||
Add to `~/.openclaw/openclaw.json`:
|
||||
```json
|
||||
"plugins": {
|
||||
"entries": {
|
||||
"devclaw": {
|
||||
"config": {
|
||||
"devClawAgentIds": ["henk-development", "devclaw-test"],
|
||||
"models": { ... }
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `devClawAgentIds` array lists which agents are DevClaw orchestrators.
|
||||
|
||||
## Implementation Details
|
||||
|
||||
- **Module:** [lib/context-guard.ts](../lib/context-guard.ts)
|
||||
- **Tests:** [tests/unit/context-guard.test.ts](../tests/unit/context-guard.test.ts) (15 passing)
|
||||
- **Integrated tools:** 4 key tools (`devclaw_onboard`, `queue_status`, `task_pickup`, `project_register`)
|
||||
- **Detection logic:** Checks agentId, messageChannel, sessionKey pattern matching
|
||||
**Why `project_register` requires group context:**
|
||||
- Forces deliberate project registration from within the project's space
|
||||
- You're physically in the group when binding it, making the connection explicit
|
||||
- Impossible to accidentally register the wrong group
|
||||
|
||||
## WhatsApp Support
|
||||
|
||||
DevClaw **fully supports WhatsApp** groups with the same architecture as Telegram:
|
||||
DevClaw fully supports WhatsApp groups with the same architecture as Telegram:
|
||||
|
||||
- ✅ WhatsApp group detection via `sessionKey.includes("@g.us")`
|
||||
- ✅ Projects keyed by WhatsApp group ID (e.g., `"120363123@g.us"`)
|
||||
- ✅ Context-aware tools work identically for both channels
|
||||
- ✅ One project = one group (Telegram OR WhatsApp)
|
||||
- WhatsApp group detection via `sessionKey.includes("@g.us")`
|
||||
- Projects keyed by WhatsApp group ID (e.g., `"120363123@g.us"`)
|
||||
- Context-aware tools work identically for both channels
|
||||
- One project = one group (Telegram OR WhatsApp)
|
||||
|
||||
**To register a WhatsApp project:**
|
||||
1. Go to the WhatsApp group chat
|
||||
2. Call `project_register` from within the group
|
||||
3. Group ID auto-detected from context
|
||||
|
||||
The architecture treats Telegram and WhatsApp identically - the only difference is the group ID format.
|
||||
## Implementation
|
||||
|
||||
## Future Enhancements
|
||||
- **Module:** [`lib/context-guard.ts`](../lib/context-guard.ts)
|
||||
- **Detection logic:** Checks agentId, messageChannel, sessionKey pattern matching
|
||||
- **Configuration:** `devClawAgentIds` in plugin config lists which agents are DevClaw orchestrators
|
||||
|
||||
- [ ] Integrate into remaining tools (`task_complete`, `session_health`, `task_create`, `devclaw_setup`)
|
||||
- [ ] System prompt injection (requires OpenClaw core support)
|
||||
- [ ] Context-based tool filtering (hide irrelevant tools)
|
||||
- [ ] Per-project context overrides
|
||||
## Related
|
||||
|
||||
- [Configuration — devClawAgentIds](CONFIGURATION.md#devclaw-agent-ids)
|
||||
- [Architecture — Scope boundaries](ARCHITECTURE.md#scope-boundaries)
|
||||
|
||||
@@ -12,14 +12,14 @@ DevClaw exists because of a gap that management theorists identified decades ago
|
||||
|
||||
In 1969, Paul Hersey and Ken Blanchard published what would become Situational Leadership Theory. The central idea is deceptively simple: the way you delegate should match the capability and reliability of the person doing the work. You don't hand an intern the system architecture redesign. You don't ask your principal engineer to rename a CSS class.
|
||||
|
||||
DevClaw's model selection does exactly this. When a task comes in, the plugin evaluates complexity from the issue title and description, then routes it to the cheapest model that can handle it:
|
||||
DevClaw's level selection does exactly this. When a task comes in, the plugin routes it to the cheapest model that can handle it:
|
||||
|
||||
| Complexity | Model | Analogy |
|
||||
| -------------------------------- | ------ | --------------------------- |
|
||||
| Simple (typos, renames, copy) | Haiku | Junior dev — just execute |
|
||||
| Standard (features, bug fixes) | Sonnet | Mid-level — think and build |
|
||||
| Complex (architecture, security) | Opus | Senior — design and reason |
|
||||
| Review | Grok | Independent reviewer |
|
||||
| Complexity | Level | Analogy |
|
||||
| -------------------------------- | -------- | --------------------------- |
|
||||
| Simple (typos, renames, copy) | Junior | The intern — just execute |
|
||||
| Standard (features, bug fixes) | Medior | Mid-level — think and build |
|
||||
| Complex (architecture, security) | Senior | The architect — design and reason |
|
||||
| Review | Reviewer | Independent code reviewer |
|
||||
|
||||
This isn't just cost optimization. It mirrors what effective managers do instinctively: match the delegation level to the task, not to a fixed assumption about the delegate.
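The routing in the table above can be sketched as a plain lookup. This is a minimal illustration, not the plugin's actual code: the `Complexity` type, the `LEVEL_FOR_COMPLEXITY` map, and `selectLevel` are hypothetical names, and the optional `requested` argument assumes that an explicitly chosen level takes precedence over the automatic one.

```typescript
// Hypothetical names throughout; only the complexity-to-level pairs come from the table.
type Complexity = "simple" | "standard" | "complex" | "review";
type Level = "junior" | "medior" | "senior" | "reviewer";

// One row of the table per entry.
const LEVEL_FOR_COMPLEXITY: Record<Complexity, Level> = {
  simple: "junior",    // typos, renames, copy
  standard: "medior",  // features, bug fixes
  complex: "senior",   // architecture, security
  review: "reviewer",  // independent code review
};

// An explicitly requested level wins (assumption); otherwise use the table.
function selectLevel(complexity: Complexity, requested?: Level): Level {
  return requested ?? LEVEL_FOR_COMPLEXITY[complexity];
}
```

Keeping the mapping in data rather than in branching logic means a new complexity class only needs a new table entry.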
|
||||
|
||||
@@ -27,11 +27,11 @@ This isn't just cost optimization. It mirrors what effective managers do instinc
|
||||
|
||||
Classical management theory — later formalized by Bernard Bass in his work on Transformational Leadership — introduced a concept called Management by Exception (MBE). The principle: a manager should only be pulled back into a workstream when something deviates from the expected path.
|
||||
|
||||
DevClaw's task lifecycle is built on this. The orchestrator delegates a task via `task_pickup`, then steps away. It only re-engages in three scenarios:
|
||||
DevClaw's task lifecycle is built on this. The orchestrator delegates a task via `work_start`, then steps away. The task then resolves in one of four ways, and the orchestrator re-engages only when the outcome requires it:
|
||||
|
||||
1. **DEV completes work** → The task moves to QA automatically. No orchestrator involvement needed.
|
||||
2. **QA passes** → The issue closes. Pipeline complete.
|
||||
3. **QA fails** → The task cycles back to DEV with a fix request. The orchestrator may need to adjust the model tier.
|
||||
3. **QA fails** → The task cycles back to DEV with a fix request. The orchestrator may need to adjust the model level.
|
||||
4. **QA refines** → The task enters a holding state that _requires human decision_. This is the explicit escalation boundary.
|
||||
|
||||
The "refine" state is the most interesting from a delegation perspective. It's a conscious architectural decision that says: some judgments should not be automated. When the QA agent determines that a task needs rethinking rather than just fixing, it escalates to the only actor who has the full business context — the human.
|
||||
@@ -61,7 +61,7 @@ One of the most common delegation failures is self-review. You don't ask the per
|
||||
DevClaw enforces structural separation between development and review by design:
|
||||
|
||||
- DEV and QA are separate sub-agent sessions with separate state.
|
||||
- QA uses a different model entirely (Grok), introducing genuine independence.
|
||||
- QA uses the reviewer level, which can be a different model entirely, introducing genuine independence.
|
||||
- The review happens after a clean label transition — QA picks up from `To Test`, not from watching DEV work in real time.
|
||||
|
||||
This mirrors a principle from organizational design: effective controls require independence between execution and verification. It's the same reason companies separate their audit function from their operations.
|
||||
@@ -72,7 +72,7 @@ Ronald Coase won a Nobel Prize for explaining why firms exist: transaction costs
|
||||
|
||||
DevClaw applies the same logic to AI sessions. Spawning a new sub-agent session costs approximately 50,000 tokens of context loading — the agent needs to read the full codebase before it can do useful work. That's the onboarding cost.
|
||||
|
||||
The plugin tracks session IDs across task completions. When a DEV finishes task A and task B is ready on the same project, DevClaw detects the existing session and returns `"sessionAction": "send"` instead of `"spawn"`. The orchestrator routes the new task to the running session. No re-onboarding. No context reload.
|
||||
The plugin tracks session keys across task completions. When a DEV finishes task A and task B is ready on the same project, DevClaw detects the existing session and reuses it instead of spawning a new one. No re-onboarding. No context reload.
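The reuse decision can be sketched as a lookup in the per-level session map. This is a hedged illustration: `decideSessionAction` and `SessionDecision` are hypothetical names, the `"spawn"` / `"send"` labels are used only to name the two outcomes, and the sample key simply follows the `agent:<parent>:subagent:<project>-<role>-<level>` format.

```typescript
// Sketch only: helper and type names are illustrative assumptions.
type DevLevel = "junior" | "medior" | "senior";

interface SessionDecision {
  action: "spawn" | "send";
  sessionKey: string | null; // null until a new session has been created
}

// Reuse the stored session for this level if one exists; otherwise spawn.
function decideSessionAction(
  sessions: Record<DevLevel, string | null>,
  level: DevLevel,
): SessionDecision {
  const existing = sessions[level];
  return existing !== null
    ? { action: "send", sessionKey: existing }
    : { action: "spawn", sessionKey: null };
}

// Example state: only the medior session has been created so far.
const sessions: Record<DevLevel, string | null> = {
  junior: null,
  medior: "agent:orchestrator:subagent:my-project-dev-medior",
  senior: null,
};
```

With this state, a medior task is routed to the existing session, while a junior or senior task would pay the onboarding cost once.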
|
||||
|
||||
In management terms: keep your team stable. Reassigning the same person to the next task on their project is almost always cheaper than bringing in someone new — even if the new person is theoretically better qualified.
|
||||
|
||||
@@ -101,11 +101,11 @@ This is the deepest lesson from delegation theory: **good delegation isn't about
|
||||
|
||||
Management research points to a few directions that could extend DevClaw's delegation model:
|
||||
|
||||
**Progressive delegation.** Blanchard's model suggests increasing task complexity for delegates as they prove competent. DevClaw could track QA pass rates per model tier and automatically promote — if Haiku consistently passes QA on borderline tasks, start routing more work to it. This is how good managers develop their people, and it reduces cost over time.
|
||||
**Progressive delegation.** Blanchard's model suggests increasing task complexity for delegates as they prove competent. DevClaw could track QA pass rates per model level and automatically promote — if junior consistently passes QA on borderline tasks, start routing more work to it. This is how good managers develop their people, and it reduces cost over time.
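As a rough sketch of what such tracking might look like, the promotion check reduces to a pass rate plus a minimum sample size. Nothing here exists in DevClaw today: `QaStats`, `passRate`, `shouldPromote`, and both thresholds are hypothetical values chosen only for illustration.

```typescript
// Hypothetical bookkeeping for one level; not part of the current plugin.
interface QaStats {
  passes: number;
  fails: number;
}

// Fraction of reviewed tasks that passed QA (0 when nothing was reviewed).
function passRate(stats: QaStats): number {
  const total = stats.passes + stats.fails;
  return total === 0 ? 0 : stats.passes / total;
}

// Promote only with enough evidence and a high enough pass rate.
// Both default thresholds are placeholder assumptions.
function shouldPromote(
  stats: QaStats,
  minReviews: number = 20,
  minPassRate: number = 0.9,
): boolean {
  const total = stats.passes + stats.fails;
  return total >= minReviews && passRate(stats) >= minPassRate;
}
```

The minimum review count guards against promoting a level on the strength of a handful of lucky passes.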
|
||||
|
||||
**Delegation authority expansion.** The Vroom-Yetton decision model maps when a leader should decide alone versus consulting the team. Currently, sub-agents have narrow authority — they execute tasks but can't restructure the backlog. Selectively expanding this (e.g., allowing a DEV agent to split a task it judges too large) would reduce orchestrator bottlenecks, mirroring how managers gradually give high-performers more autonomy.
|
||||
|
||||
**Outcome-based learning.** Delegation research emphasizes that the _delegator_ learns from outcomes too. Aggregated metrics — QA fail rate by model tier, average cycles to Done, time-in-state distributions — would help both the orchestrator agent and the human calibrate their delegation patterns over time.
|
||||
**Outcome-based learning.** Delegation research emphasizes that the _delegator_ learns from outcomes too. Aggregated metrics — QA fail rate by model level, average cycles to Done, time-in-state distributions — would help both the orchestrator agent and the human calibrate their delegation patterns over time.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -1,18 +1,18 @@
|
||||
# DevClaw — Onboarding Guide
|
||||
|
||||
## What you need before starting
|
||||
Step-by-step setup: install the plugin, configure an agent, register projects, and run your first task.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
| Requirement | Why | How to check |
|
||||
|---|---|---|
|
||||
| [OpenClaw](https://openclaw.ai) installed | DevClaw is an OpenClaw plugin | `openclaw --version` |
|
||||
| Node.js >= 20 | Runtime for plugin | `node --version` |
|
||||
| [`glab`](https://gitlab.com/gitlab-org/cli) or [`gh`](https://cli.github.com) CLI | Issue tracker provider (auto-detected from remote) | `glab --version` or `gh --version` |
|
||||
| CLI authenticated | Plugin calls glab/gh for every label transition | `glab auth status` or `gh auth status` |
|
||||
| A GitLab/GitHub repo with issues | The task backlog lives in the issue tracker | `glab issue list` or `gh issue list` from your repo |
|
||||
| [`gh`](https://cli.github.com) or [`glab`](https://gitlab.com/gitlab-org/cli) CLI | Issue tracker provider (auto-detected from git remote) | `gh --version` or `glab --version` |
|
||||
| CLI authenticated | Plugin calls gh/glab for every label transition | `gh auth status` or `glab auth status` |
|
||||
| A GitHub/GitLab repo with issues | The task backlog lives in the issue tracker | `gh issue list` or `glab issue list` from your repo |
|
||||
|
||||
## Setup
|
||||
|
||||
### 1. Install the plugin
|
||||
## Step 1: Install the plugin
|
||||
|
||||
```bash
|
||||
# Copy to extensions directory (auto-discovered on next restart)
|
||||
@@ -25,21 +25,21 @@ openclaw plugins list
|
||||
# Should show: DevClaw | devclaw | loaded
|
||||
```
|
||||
|
||||
### 2. Run setup
|
||||
## Step 2: Run setup
|
||||
|
||||
There are three ways to set up DevClaw:
|
||||
|
||||
#### Option A: Conversational onboarding (recommended)
|
||||
### Option A: Conversational onboarding (recommended)
|
||||
|
||||
Call the `devclaw_onboard` tool from any agent that has the DevClaw plugin loaded. The agent will walk you through configuration step by step — asking about:
|
||||
Call the `onboard` tool from any agent that has the DevClaw plugin loaded. The agent walks you through configuration step by step — asking about:
|
||||
- Agent selection (current or create new)
|
||||
- Channel binding (telegram/whatsapp/none) — for new agents only
|
||||
- Model tiers (accept defaults or customize)
|
||||
- Model levels (accept defaults or customize)
|
||||
- Optional project registration
|
||||
|
||||
The tool returns instructions that guide the agent through the Q&A-style setup conversation.
|
||||
|
||||
#### Option B: CLI wizard
|
||||
### Option B: CLI wizard
|
||||
|
||||
```bash
|
||||
openclaw devclaw setup
|
||||
@@ -48,12 +48,13 @@ openclaw devclaw setup
|
||||
The setup wizard walks you through:
|
||||
|
||||
1. **Agent** — Create a new orchestrator agent or configure an existing one
|
||||
2. **Developer team** — Choose which LLM model powers each developer tier:
|
||||
- **Junior** (fast, cheap tasks) — default: `anthropic/claude-haiku-4-5`
|
||||
- **Medior** (standard tasks) — default: `anthropic/claude-sonnet-4-5`
|
||||
- **Senior** (complex tasks) — default: `anthropic/claude-opus-4-5`
|
||||
- **QA** (code review) — default: `anthropic/claude-sonnet-4-5`
|
||||
3. **Workspace** — Writes AGENTS.md, HEARTBEAT.md, role templates, and initializes memory
|
||||
2. **Developer team** — Choose which LLM model powers each developer level:
|
||||
- **DEV junior** (fast, cheap tasks) — default: `anthropic/claude-haiku-4-5`
|
||||
- **DEV medior** (standard tasks) — default: `anthropic/claude-sonnet-4-5`
|
||||
- **DEV senior** (complex tasks) — default: `anthropic/claude-opus-4-5`
|
||||
- **QA reviewer** (code review) — default: `anthropic/claude-sonnet-4-5`
|
||||
- **QA tester** (manual testing) — default: `anthropic/claude-haiku-4-5`
|
||||
3. **Workspace** — Writes AGENTS.md, HEARTBEAT.md, role templates, and initializes state
|
||||
|
||||
Non-interactive mode:
|
||||
```bash
|
||||
@@ -66,45 +67,45 @@ openclaw devclaw setup --agent my-orchestrator \
|
||||
--senior "anthropic/claude-opus-4-5"
|
||||
```
|
||||
|
||||
#### Option C: Tool call (agent-driven)
|
||||
### Option C: Tool call (agent-driven)
|
||||
|
||||
**Conversational onboarding via tool:**
|
||||
```json
|
||||
devclaw_onboard({ mode: "first-run" })
|
||||
onboard({ "mode": "first-run" })
|
||||
```
|
||||
|
||||
The tool returns step-by-step instructions that guide the agent through the QA-style setup conversation.
|
||||
The tool returns step-by-step instructions that guide the agent through the setup conversation.
|
||||
|
||||
**Direct setup (skip conversation):**
|
||||
```json
|
||||
{
|
||||
setup({
|
||||
"newAgentName": "My Dev Orchestrator",
|
||||
"channelBinding": "telegram",
|
||||
"models": {
|
||||
"junior": "anthropic/claude-haiku-4-5",
|
||||
"senior": "anthropic/claude-opus-4-5"
|
||||
"dev": {
|
||||
"junior": "anthropic/claude-haiku-4-5",
|
||||
"senior": "anthropic/claude-opus-4-5"
|
||||
},
|
||||
"qa": {
|
||||
"reviewer": "anthropic/claude-sonnet-4-5"
|
||||
}
|
||||
}
|
||||
}
|
||||
})
|
||||
```
|
||||
|
||||
This calls `devclaw_setup` directly without conversational prompts.
|
||||
## Step 3: Channel binding (optional, for new agents)
|
||||
|
||||
### 3. Channel binding (optional, for new agents)
|
||||
|
||||
If you created a new agent during conversational onboarding and selected a channel binding (telegram/whatsapp), the agent is automatically bound and will receive messages from that channel. **Skip to step 4.**
|
||||
If you created a new agent during conversational onboarding and selected a channel binding (telegram/whatsapp), the agent is automatically bound. **Skip to step 4.**
|
||||
|
||||
**Smart Migration**: If an existing agent already has a channel-wide binding (e.g., the old orchestrator receives all telegram messages), the onboarding agent will:
|
||||
1. Call `analyze_channel_bindings` to detect the conflict
|
||||
1. Detect the conflict
|
||||
2. Ask if you want to migrate the binding from the old agent to the new one
|
||||
3. If you confirm, the binding is automatically moved — no manual config edit needed
|
||||
|
||||
This is useful when you're replacing an old orchestrator with a new one.
|
||||
If you didn't bind a channel during setup:
|
||||
|
||||
If you didn't bind a channel during setup, you have two options:
|
||||
**Option A: Manually edit `openclaw.json`**
|
||||
|
||||
**Option A: Manually edit `openclaw.json`** (for existing agents or post-creation binding)
|
||||
|
||||
Add an entry to the `bindings` array:
|
||||
```json
|
||||
{
|
||||
"bindings": [
|
||||
@@ -136,131 +137,115 @@ Restart OpenClaw after editing.
|
||||
|
||||
**Option B: Add bot to Telegram/WhatsApp group**
|
||||
|
||||
If using a channel-wide binding (no peer filter), the agent will receive all messages from that channel. Add your orchestrator bot to the relevant Telegram group for the project.
|
||||
If using a channel-wide binding (no peer filter), the agent receives all messages from that channel. Add your orchestrator bot to the relevant Telegram group.
|
||||
|
||||
### 4. Register your project
|
||||
## Step 4: Register your project
|
||||
|
||||
Tell the orchestrator agent to register a new project:
|
||||
Go to the Telegram/WhatsApp group for the project and tell the orchestrator agent:
|
||||
|
||||
> "Register project my-project at ~/git/my-project for group -1234567890 with base branch development"
|
||||
> "Register project my-project at ~/git/my-project with base branch development"
|
||||
|
||||
The agent calls `project_register`, which atomically:
|
||||
- Validates the repo and auto-detects GitHub/GitLab from remote
|
||||
- Creates all 8 state labels (idempotent)
|
||||
- Scaffolds prompt instruction files (`projects/prompts/<project>/dev.md` and `qa.md`)
|
||||
- Adds the project entry to `projects.json` with `autoChain: false`
|
||||
- Scaffolds role instruction files (`projects/roles/<project>/dev.md` and `qa.md`)
|
||||
- Adds the project entry to `projects.json`
|
||||
- Logs the registration event
|
||||
|
||||
**Initial state in `projects.json`:**
|
||||
|
||||
```json
|
||||
{
|
||||
"projects": {
|
||||
"-1234567890": {
|
||||
"name": "my-project",
|
||||
"repo": "~/git/my-project",
|
||||
"groupName": "Dev - My Project",
|
||||
"deployUrl": "",
|
||||
"groupName": "Project: my-project",
|
||||
"baseBranch": "development",
|
||||
"deployBranch": "development",
|
||||
"autoChain": false,
|
||||
"channel": "telegram",
|
||||
"roleExecution": "parallel",
|
||||
"dev": {
|
||||
"active": false,
|
||||
"issueId": null,
|
||||
"startTime": null,
|
||||
"model": null,
|
||||
"level": null,
|
||||
"sessions": { "junior": null, "medior": null, "senior": null }
|
||||
},
|
||||
"qa": {
|
||||
"active": false,
|
||||
"issueId": null,
|
||||
"startTime": null,
|
||||
"model": null,
|
||||
"sessions": { "qa": null }
|
||||
"level": null,
|
||||
"sessions": { "reviewer": null, "tester": null }
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
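For readers who prefer types to raw JSON, the worker part of that state can be sketched in TypeScript. This is an assumption-laden illustration: the field names come from the example above, but `WorkerState`, `emptyWorkerState`, and the non-null value types (`string | number` for `issueId`, `string` for `startTime`) are guesses, not the plugin's real definitions.

```typescript
// Shape of one worker slot ("dev" or "qa") as it appears in the example.
// The non-null types are assumptions; the example only shows null values.
interface WorkerState<L extends string> {
  active: boolean;
  issueId: string | number | null;
  startTime: string | null;
  level: L | null;
  sessions: Record<L, string | null>; // one session key per level
}

// Build the empty state written at registration time (hypothetical helper).
function emptyWorkerState<L extends string>(levels: readonly L[]): WorkerState<L> {
  const sessions = {} as Record<L, string | null>;
  for (const level of levels) {
    sessions[level] = null;
  }
  return { active: false, issueId: null, startTime: null, level: null, sessions };
}

const dev = emptyWorkerState(["junior", "medior", "senior"] as const);
const qa = emptyWorkerState(["reviewer", "tester"] as const);
```

Parameterizing the state by its level names keeps the DEV and QA slots structurally identical while still catching a misspelled level at compile time.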
|
||||
|
||||
**Manual fallback:** If you prefer CLI control, you can still create labels manually with `glab label create` and edit `projects.json` directly. See the [Architecture docs](ARCHITECTURE.md) for label names and colors.
|
||||
**Finding the Telegram group ID:** The group ID is the numeric ID of your Telegram supergroup (a negative number like `-1234567890`). When you call `project_register` from within the group, the ID is auto-detected from context.
|
||||
|
||||
**Finding the Telegram group ID:** The group ID is the numeric ID of your Telegram supergroup (a negative number like `-1234567890`). You can find it via the Telegram bot API or from message metadata in OpenClaw logs.
|
||||
|
||||
### 5. Create your first issue
|
||||
## Step 5: Create your first issue
|
||||
|
||||
Issues can be created in multiple ways:
|
||||
- **Via the agent** — Ask the orchestrator in the Telegram group: "Create an issue for adding a login page" (uses `task_create`)
|
||||
- **Via workers** — DEV/QA workers can call `task_create` to file follow-up bugs they discover
|
||||
- **Via CLI** — `cd ~/git/my-project && glab issue create --title "My first task" --label "To Do"` (or `gh issue create`)
|
||||
- **Via CLI** — `cd ~/git/my-project && gh issue create --title "My first task" --label "To Do"` (or `glab issue create`)
|
||||
- **Via web UI** — Create an issue and add the "To Do" label
|
||||
|
||||
### 6. Test the pipeline
|
||||
Note: `task_create` defaults to the "Planning" label. Use "To Do" explicitly when the task is ready for immediate work.
|
||||
|
||||
## Step 6: Test the pipeline
|
||||
|
||||
Ask the agent in the Telegram group:
|
||||
|
||||
> "Check the queue status"
|
||||
|
||||
The agent should call `queue_status` and report the "To Do" issue. Then:
|
||||
The agent should call `status` and report the "To Do" issue. Then:
|
||||
|
||||
> "Pick up issue #1 for DEV"
|
||||
|
||||
The agent calls `task_pickup`, which assigns a developer tier, transitions the label to "Doing", creates or reuses a worker session, and dispatches the task — all in one call. The agent just posts the announcement.
|
||||
The agent calls `work_start`, which assigns a developer level, transitions the label to "Doing", creates or reuses a worker session, and dispatches the task — all in one call. The agent posts the announcement.
|
||||
|
||||
## Adding more projects
|
||||
|
||||
Tell the agent to register a new project (step 3) and add the bot to the new Telegram group (step 4). That's it — `project_register` handles labels and state setup.
|
||||
Tell the agent to register a new project (step 4) from within the new project's Telegram group. That's it — `project_register` handles labels and state setup.
|
||||
|
||||
Each project is fully isolated — separate queue, separate workers, separate state.
|
||||
|
||||
## Developer tiers
|
||||
## Developer levels
|
||||
|
||||
DevClaw assigns tasks to developer tiers instead of raw model names. This makes the system intuitive — you're assigning a "junior dev" to fix a typo, not configuring model parameters.
|
||||
DevClaw assigns tasks to developer levels instead of raw model names. This makes the system intuitive — you're assigning a "junior dev" to fix a typo, not configuring model parameters.
|
||||
|
||||
| Tier | Role | Default model | When to assign |
|
||||
|------|------|---------------|----------------|
|
||||
| **junior** | Junior developer | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, CSS changes |
|
||||
| **medior** | Mid-level developer | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
|
||||
| **senior** | Senior developer | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
|
||||
| **qa** | QA engineer | `anthropic/claude-sonnet-4-5` | Code review, test validation |
|
||||
| Role | Level | Default model | When to assign |
|------|-------|---------------|----------------|
| DEV | **junior** | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, CSS changes |
| DEV | **medior** | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
| DEV | **senior** | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
| QA | **reviewer** | `anthropic/claude-sonnet-4-5` | Code review, test validation |
| QA | **tester** | `anthropic/claude-haiku-4-5` | Manual testing, smoke tests |
|
||||
|
||||
Change which model powers each tier in `openclaw.json`:
|
||||
```json
|
||||
{
|
||||
"plugins": {
|
||||
"entries": {
|
||||
"devclaw": {
|
||||
"config": {
|
||||
"models": {
|
||||
"junior": "anthropic/claude-haiku-4-5",
|
||||
"medior": "anthropic/claude-sonnet-4-5",
|
||||
"senior": "anthropic/claude-opus-4-5",
|
||||
"qa": "anthropic/claude-sonnet-4-5"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
Change which model powers each level in `openclaw.json` — see [Configuration](CONFIGURATION.md#model-tiers).
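To make the role/level lookup concrete, here is a minimal sketch of how a model could be resolved from the nested `models.dev.*` / `models.qa.*` structure. The defaults are the ones listed in the table; `resolveModel`, `ModelConfig`, and the rule that a missing override falls back to the default are assumptions for illustration, not the plugin's confirmed behavior.

```typescript
type Role = "dev" | "qa";
type ModelMap = Record<string, string>;
type ModelConfig = Partial<Record<Role, ModelMap>>;

// Defaults as listed in the levels table.
const DEFAULT_MODELS: Record<Role, ModelMap> = {
  dev: {
    junior: "anthropic/claude-haiku-4-5",
    medior: "anthropic/claude-sonnet-4-5",
    senior: "anthropic/claude-opus-4-5",
  },
  qa: {
    reviewer: "anthropic/claude-sonnet-4-5",
    tester: "anthropic/claude-haiku-4-5",
  },
};

// Prefer a configured override, then the default (assumed fallback rule).
// Returns undefined when the level does not exist for that role.
function resolveModel(
  role: Role,
  level: string,
  overrides: ModelConfig = {},
): string | undefined {
  return overrides[role]?.[level] ?? DEFAULT_MODELS[role][level];
}
```

Under this reading, a config that only overrides `qa.reviewer` leaves every other level on its default model.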
|
||||
|
||||
## What the plugin handles vs. what you handle
|
||||
|
||||
| Responsibility | Who | Details |
|
||||
|---|---|---|
|
||||
| Plugin installation | You (once) | `cp -r devclaw ~/.openclaw/extensions/` |
|
||||
| Agent + workspace setup | Plugin (`devclaw_setup`) | Creates agent, configures models, writes workspace files |
|
||||
| Channel binding analysis | Plugin (`analyze_channel_bindings`) | Detects channel conflicts, validates channel configuration |
|
||||
| Channel binding migration | Plugin (`devclaw_setup` with `migrateFrom`) | Automatically moves channel-wide bindings between agents |
|
||||
| Label setup | Plugin (`project_register`) | 8 labels, created idempotently via `IssueProvider` |
|
||||
| Prompt file scaffolding | Plugin (`project_register`) | Creates `projects/prompts/<project>/dev.md` and `qa.md` |
|
||||
| Agent + workspace setup | Plugin (`setup`) | Creates agent, configures models, writes workspace files |
|
||||
| Channel binding migration | Plugin (`setup` with `migrateFrom`) | Automatically moves channel-wide bindings between agents |
|
||||
| Label setup | Plugin (`project_register`) | 8 labels, created idempotently via IssueProvider |
|
||||
| Prompt file scaffolding | Plugin (`project_register`) | Creates `projects/roles/<project>/dev.md` and `qa.md` |
|
||||
| Project registration | Plugin (`project_register`) | Entry in `projects.json` with empty worker state |
|
||||
| Telegram group setup | You (once per project) | Add bot to group |
|
||||
| Issue creation | Plugin (`task_create`) | Orchestrator or workers create issues from chat |
|
||||
| Label transitions | Plugin | Atomic label transitions via issue tracker CLI |
|
||||
| Developer assignment | Plugin | LLM-selected tier by orchestrator, keyword heuristic fallback |
|
||||
| Label transitions | Plugin | Atomic transitions via issue tracker CLI |
|
||||
| Developer assignment | Plugin | LLM-selected level by orchestrator, keyword heuristic fallback |
|
||||
| State management | Plugin | Atomic read/write to `projects.json` |
|
||||
| Session management | Plugin | Creates, reuses, and dispatches to sessions via CLI. Agent never touches session tools. |
|
||||
| Task completion | Plugin (`task_complete`) | Workers self-report. Auto-chains if enabled. |
|
||||
| Prompt instructions | Plugin (`task_pickup`) | Loaded from `projects/prompts/<project>/<role>.md`, appended to task message |
|
||||
| Task completion | Plugin (`work_finish`) | Workers self-report. Auto-chains if enabled. |
|
||||
| Prompt instructions | Plugin (`work_start`) | Loaded from `projects/roles/<project>/<role>.md`, appended to task message |
|
||||
| Audit logging | Plugin | Automatic NDJSON append per tool call |
|
||||
| Zombie detection | Plugin | `session_health` checks active vs alive |
|
||||
| Queue scanning | Plugin | `queue_status` queries issue tracker per project |
|
||||
| Zombie detection | Plugin | `health` checks active vs alive |
|
||||
| Queue scanning | Plugin | `status` queries issue tracker per project |
|
||||
|
||||
@@ -1,8 +1,6 @@
|
||||
# QA Workflow
|
||||
# DevClaw — QA Workflow
|
||||
|
||||
## Overview
|
||||
|
||||
Quality Assurance (QA) in DevClaw follows a structured workflow that ensures every review is documented and traceable.
|
||||
Quality Assurance in DevClaw follows a structured workflow that ensures every review is documented and traceable.
|
||||
|
||||
## Required Steps
|
||||
|
||||
@@ -28,10 +26,10 @@ task_comment({
|
||||
|
||||
### 3. Complete the Task
|
||||
|
||||
After posting your comment, call `task_complete`:
|
||||
After posting your comment, call `work_finish`:
|
||||
|
||||
```javascript
|
||||
task_complete({
|
||||
work_finish({
|
||||
role: "qa",
|
||||
projectGroupId: "<group-id>",
|
||||
result: "pass", // or "fail", "refine", "blocked"
|
||||
@@ -39,15 +37,24 @@ task_complete({
|
||||
})
|
||||
```
|
||||
|
||||
## QA Results
|
||||
|
||||
| Result | Label transition | Meaning |
|---|---|---|
| `"pass"` | Testing → Done | Approved. Issue closed. |
| `"fail"` | Testing → To Improve | Issues found. Issue reopened, sent back to DEV. |
| `"refine"` | Testing → Refining | Needs human decision. Pipeline pauses. |
| `"blocked"` | Testing → To Test | Cannot complete (env issues, etc.). Returns to QA queue. |
|
||||
|
||||
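The QA results table can be read as a small state-transition map. The sketch below encodes exactly those four rows; the names `QA_TRANSITIONS`, `nextLabel`, and `pausesPipeline` are hypothetical and are not taken from the plugin source.

```typescript
type QaResult = "pass" | "fail" | "refine" | "blocked";

interface Transition {
  from: string;
  to: string;
}

// One entry per row of the QA results table.
const QA_TRANSITIONS: Record<QaResult, Transition> = {
  pass: { from: "Testing", to: "Done" },
  fail: { from: "Testing", to: "To Improve" },
  refine: { from: "Testing", to: "Refining" },
  blocked: { from: "Testing", to: "To Test" },
};

// Label the issue carries after QA reports a result.
function nextLabel(result: QaResult): string {
  return QA_TRANSITIONS[result].to;
}

// Only "refine" pauses the pipeline for a human decision.
function pausesPipeline(result: QaResult): boolean {
  return result === "refine";
}
```

Because `QaResult` is a closed union, adding a fifth result without a matching transition would fail to compile.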
## Why Comments Are Required
|
||||
|
||||
1. **Audit Trail**: Every review decision is documented
|
||||
2. **Knowledge Sharing**: Future reviewers understand what was tested
|
||||
3. **Quality Metrics**: Enables tracking of test coverage
|
||||
4. **Debugging**: When issues arise later, we know what was checked
|
||||
5. **Compliance**: Some projects require documented QA evidence
|
||||
1. **Audit Trail** — Every review decision is documented in the issue tracker
|
||||
2. **Knowledge Sharing** — Future reviewers understand what was tested
|
||||
3. **Quality Metrics** — Enables tracking of test coverage
|
||||
4. **Debugging** — When issues arise later, we know what was checked
|
||||
5. **Compliance** — Some projects require documented QA evidence
|
||||
|
||||
## Comment Template
|
||||
## Comment Templates
|
||||
|
||||
### For Passing Reviews
|
||||
|
||||
@@ -61,7 +68,7 @@ task_complete({
|
||||
|
||||
**Results:** All tests passed. No regressions found.
|
||||
|
||||
**Environment:**
|
||||
**Environment:**
|
||||
- Browser/Platform: [details]
|
||||
- Version: [details]
|
||||
- Test data: [if relevant]
|
||||
@@ -72,15 +79,14 @@ task_complete({
|
||||
### For Failing Reviews
|
||||
|
||||
```markdown
|
||||
## QA Review - Issues Found
|
||||
## QA Review — Issues Found
|
||||
|
||||
**Tested:**
|
||||
- [What you tested]
|
||||
|
||||
**Issues Found:**
|
||||
1. [Issue description with steps to reproduce]
|
||||
2. [Issue description with steps to reproduce]
|
||||
3. [Issue description with expected vs actual behavior]
|
||||
2. [Issue description with expected vs actual behavior]
|
||||
|
||||
**Environment:**
|
||||
- [Test environment details]
|
||||
@@ -90,25 +96,25 @@ task_complete({
|
||||
|
||||
## Enforcement
|
||||
|
||||
As of [current date], QA workers are instructed via role templates to:
|
||||
- Always call `task_comment` BEFORE `task_complete`
|
||||
QA workers receive instructions via role templates to:
|
||||
- Always call `task_comment` BEFORE `work_finish`
|
||||
- Include specific details about what was tested
|
||||
- Document results, environment, and any notes
|
||||
|
||||
Prompt templates affected:
|
||||
- `projects/prompts/<project>/qa.md`
|
||||
- `projects/roles/<project>/qa.md`
|
||||
- All project-specific QA templates should follow this pattern
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Be Specific**: Don't just say "tested the feature" - list what you tested
|
||||
2. **Include Environment**: Version numbers, browser, OS can matter
|
||||
3. **Document Edge Cases**: If you tested special scenarios, note them
|
||||
4. **Use Screenshots**: For UI issues, screenshots help (link in comment)
|
||||
5. **Reference Requirements**: Link back to acceptance criteria from the issue
|
||||
1. **Be Specific** — Don't just say "tested the feature" — list what you tested
|
||||
2. **Include Environment** — Version numbers, browser, OS can matter
|
||||
3. **Document Edge Cases** — If you tested special scenarios, note them
|
||||
4. **Reference Requirements** — Link back to acceptance criteria from the issue
|
||||
5. **Use Screenshots** — For UI issues, screenshots help (link in comment)
|
||||
|
||||
## Related
|
||||
|
||||
- Issue #103: Enforce QA comment on every review (pass or fail)
|
||||
- Tool: `task_comment` - Add comments to issues
|
||||
- Tool: `task_complete` - Complete QA tasks
|
||||
- Tool: [`task_comment`](TOOLS.md#task_comment) — Add comments to issues
|
||||
- Tool: [`work_finish`](TOOLS.md#work_finish) — Complete QA tasks
|
||||
- Config: [`projects/roles/<project>/qa.md`](CONFIGURATION.md#role-instruction-files) — QA role instructions
|
||||
|
||||
@@ -15,16 +15,16 @@ This works for the common case but breaks down when you want:
|
||||
|
||||
Roles become a configurable list instead of a hardcoded pair. Each role defines:
|
||||
- **Name** — e.g. `design`, `dev`, `qa`, `devops`
|
||||
- **Tiers** — which developer tiers can be assigned (e.g. design only needs `medior`)
|
||||
- **Levels** — which developer levels can be assigned (e.g. design only needs `medior`)
|
||||
- **Pipeline position** — where it sits in the task lifecycle
|
||||
- **Worker count** — how many concurrent workers (default: 1)
|
||||
|
||||
```json
|
||||
{
|
||||
"roles": {
|
||||
"dev": { "tiers": ["junior", "medior", "senior"], "workers": 1 },
|
||||
"qa": { "tiers": ["qa"], "workers": 1 },
|
||||
"devops": { "tiers": ["medior", "senior"], "workers": 1 }
|
||||
"dev": { "levels": ["junior", "medior", "senior"], "workers": 1 },
|
||||
"qa": { "levels": ["reviewer", "tester"], "workers": 1 },
|
||||
"devops": { "levels": ["medior", "senior"], "workers": 1 }
|
||||
},
|
||||
"pipeline": ["dev", "qa", "devops"]
|
||||
}
|
||||
@@ -35,15 +35,15 @@ The pipeline definition replaces the hardcoded `Doing → To Test → Testing
|
||||
### Open questions
|
||||
|
||||
- How do custom labels map? Generate from role names, or let users define?
|
||||
- Should roles have their own instruction files (`projects/prompts/<project>/<role>.md`) — yes, this already works
|
||||
- Should roles have their own instruction files (`projects/roles/<project>/<role>.md`) — yes, this already works
|
||||
- How to handle parallel roles (e.g. frontend + backend DEV in parallel before QA)?
|
||||
|
||||
---
|
||||
|
||||
## Channel-agnostic groups
|
||||
## Channel-agnostic Groups
|
||||
|
||||
Currently DevClaw maps projects to **Telegram group IDs**. The `projectGroupId` is a Telegram-specific negative number. This means:
|
||||
- WhatsApp groups can't be used as project channels
|
||||
- WhatsApp groups are only partially supported as project channels (via the `channel` field)
|
||||
- Discord, Slack, or other channels are excluded
|
||||
- The naming (`groupId`, `groupName`) is Telegram-specific
|
||||
|
||||
@@ -77,19 +77,20 @@ Key changes:
|
||||
- All tool params, state keys, and docs updated accordingly
|
||||
- Backward compatible: existing Telegram-only keys migrated on read
|
||||
|
||||
This enables any OpenClaw channel (Telegram, WhatsApp, Discord, Slack, etc.) to host a project — each group chat becomes an autonomous dev team regardless of platform.
|
||||
This enables any OpenClaw channel (Telegram, WhatsApp, Discord, Slack, etc.) to host a project.
|
||||
|
||||
### Open questions
|
||||
|
||||
- Should one project be bindable to multiple channels? (e.g. Telegram for devs, WhatsApp for stakeholder updates)
|
||||
- How does the orchestrator agent handle cross-channel context? (OpenClaw bindings already route by channel)
|
||||
- How does the orchestrator agent handle cross-channel context?
|
||||
|
||||
---
|
||||
|
||||
## Other ideas
|
||||
## Other Ideas
|
||||
|
||||
- **Jira provider** — `IssueProvider` interface already abstracts GitHub/GitLab; Jira is the obvious next addition
|
||||
- **Deployment integration** — `task_complete` QA pass could trigger a deploy step via webhook or CLI
|
||||
- **Cost tracking** — log token usage per task/tier, surface in `queue_status`
|
||||
- **Deployment integration** — `work_finish` QA pass could trigger a deploy step via webhook or CLI
|
||||
- **Cost tracking** — log token usage per task/level, surface in `status`
|
||||
- **Priority scoring** — automatic priority assignment based on labels, age, and dependencies
|
||||
- **Session archival** — auto-archive idle sessions after configurable timeout (currently indefinite)
|
||||
- **Progressive delegation** — track QA pass rates per level and auto-promote (see [Management Theory](MANAGEMENT.md))
|
||||
|
||||
@@ -59,10 +59,15 @@ npm run test:ui
"devclaw": {
  "config": {
    "models": {
      "junior": "anthropic/claude-haiku-4-5",
      "medior": "anthropic/claude-sonnet-4-5",
      "senior": "anthropic/claude-opus-4-5",
      "qa": "anthropic/claude-sonnet-4-5"
      "dev": {
        "junior": "anthropic/claude-haiku-4-5",
        "medior": "anthropic/claude-sonnet-4-5",
        "senior": "anthropic/claude-opus-4-5"
      },
      "qa": {
        "reviewer": "anthropic/claude-sonnet-4-5",
        "tester": "anthropic/claude-haiku-4-5"
      }
    }
  }
}

361
docs/TOOLS.md
Normal file
@@ -0,0 +1,361 @@
# DevClaw — Tools Reference

Complete reference for all 11 tools registered by DevClaw. See [`index.ts`](../index.ts) for registration.

## Worker Lifecycle

### `work_start`

Pick up a task from the issue queue. Handles level assignment, label transition, session creation/reuse, task dispatch, and audit logging — all in one call.

**Source:** [`lib/tools/work-start.ts`](../lib/tools/work-start.ts)

**Context:** Only works in project group chats.

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `issueId` | number | No | Issue ID. If omitted, picks next by priority. |
| `role` | `"dev"` \| `"qa"` | No | Worker role. Auto-detected from issue label if omitted. |
| `projectGroupId` | string | No | Project group ID. Auto-detected from group context. |
| `level` | string | No | Worker level (`junior`, `medior`, `senior` for DEV; `reviewer`, `tester` for QA). Auto-detected if omitted. |

**What it does atomically:**

1. Resolves project from `projects.json`
2. Validates no active worker for this role
3. Fetches issue from tracker, verifies correct label state
4. Assigns level (LLM-chosen via `level` param → label detection → keyword heuristic fallback)
5. Resolves level to model ID via config or defaults
6. Loads prompt instructions from `projects/roles/<project>/<role>.md`
7. Looks up existing session for assigned level (session-per-level)
8. Transitions label (e.g. `To Do` → `Doing`)
9. Creates session via Gateway RPC if new (`sessions.patch`)
10. Dispatches task to worker session via CLI (`openclaw gateway call agent`)
11. Updates `projects.json` state (active, issueId, level, session key)
12. Writes audit log entries (work_start + model_selection)
13. Sends notification
14. Returns announcement text

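Step 5 (resolving a level to a model ID) can be sketched as below. This is a minimal illustration, not the plugin's actual code: the function name `resolveModel`, the `ModelConfig` shape, and the default model IDs are assumptions, chosen to mirror the nested per-role config layout used elsewhere in these docs.

```typescript
type Role = "dev" | "qa";

// Hypothetical config shape: one map of level -> model ID per role.
type ModelConfig = Partial<Record<Role, Record<string, string>>>;

// Assumed defaults for illustration only; the real defaults may differ.
const DEFAULT_MODELS: Record<Role, Record<string, string>> = {
  dev: {
    junior: "anthropic/claude-haiku-4-5",
    medior: "anthropic/claude-sonnet-4-5",
    senior: "anthropic/claude-opus-4-5",
  },
  qa: {
    reviewer: "anthropic/claude-sonnet-4-5",
    tester: "anthropic/claude-haiku-4-5",
  },
};

// A configured model wins; otherwise fall back to the default for that level.
function resolveModel(role: Role, level: string, config: ModelConfig = {}): string {
  const model = config[role]?.[level] ?? DEFAULT_MODELS[role][level];
  if (model === undefined) {
    throw new Error(`Unknown level "${level}" for role "${role}"`);
  }
  return model;
}
```

Throwing on an unknown role/level pair (for example `qa` with `senior`) keeps a typo from silently selecting the wrong model.
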

**Level selection priority:**

1. `level` parameter (LLM-selected) — highest priority
2. Issue label (e.g. a label named "junior" or "senior")
3. Keyword heuristic from `model-selector.ts` — fallback

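The priority order above amounts to a simple fallback chain. The sketch below is an assumption-laden illustration: `selectLevel`, `isLevel`, and the `heuristic` callback are hypothetical names, and the real keyword heuristic in `model-selector.ts` is not reproduced here.

```typescript
const LEVELS = ["junior", "medior", "senior", "reviewer", "tester"] as const;
type Level = (typeof LEVELS)[number];
type Source = "param" | "label" | "heuristic";

function isLevel(value: string): value is Level {
  return (LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: returns the chosen level and where it came from.
function selectLevel(
  param: string | undefined,
  labels: string[],
  heuristic: () => Level,
): { level: Level; source: Source } {
  // 1. An explicit, valid `level` parameter has the highest priority.
  if (param !== undefined && isLevel(param)) {
    return { level: param, source: "param" };
  }
  // 2. Otherwise use the first issue label that names a level.
  for (const label of labels) {
    const candidate = label.toLowerCase();
    if (isLevel(candidate)) {
      return { level: candidate, source: "label" };
    }
  }
  // 3. Otherwise fall back to the heuristic (stubbed as a callback here).
  return { level: heuristic(), source: "heuristic" };
}
```

Returning the `source` alongside the level makes it easy to record in an audit entry why a given level was picked.
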

**Execution guards:**

- Rejects if role already has an active worker
- Respects `roleExecution` (sequential: rejects if other role is active)

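As a rough sketch, the two guards can be expressed as one pure check that returns a rejection reason or `null`. The `ProjectState` shape and the name `checkStartGuards` are assumptions for illustration; the actual state stored in `projects.json` carries more fields.

```typescript
type Role = "dev" | "qa";
type RoleExecution = "parallel" | "sequential";

// Hypothetical, trimmed-down view of per-project worker state.
interface ProjectState {
  roleExecution: RoleExecution;
  dev: { active: boolean };
  qa: { active: boolean };
}

// Returns a rejection reason, or null when the role may start work.
function checkStartGuards(project: ProjectState, role: Role): string | null {
  if (project[role].active) {
    return `${role} worker is already active`;
  }
  const other: Role = role === "dev" ? "qa" : "dev";
  if (project.roleExecution === "sequential" && project[other].active) {
    return `${other} worker is active and roleExecution is sequential`;
  }
  return null;
}
```
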

**On failure:** Rolls back label transition. No orphaned state.

---

### `work_finish`

Complete a task with a result. Called by workers (DEV/QA sub-agent sessions) directly, or by the orchestrator.

**Source:** [`lib/tools/work-finish.ts`](../lib/tools/work-finish.ts)

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `role` | `"dev"` \| `"qa"` | Yes | Worker role |
| `result` | string | Yes | Completion result (see table below) |
| `projectGroupId` | string | Yes | Project group ID |
| `summary` | string | No | Brief summary for the announcement |
| `prUrl` | string | No | PR/MR URL (auto-detected if omitted) |

**Valid results by role:**

| Role | Result | Label transition | Side effects |
|---|---|---|---|
| DEV | `"done"` | Doing → To Test | git pull, auto-detect PR URL |
| DEV | `"blocked"` | Doing → To Do | Task returns to queue |
| QA | `"pass"` | Testing → Done | Issue closed |
| QA | `"fail"` | Testing → To Improve | Issue reopened |
| QA | `"refine"` | Testing → Refining | Awaits human decision |
| QA | `"blocked"` | Testing → To Test | Task returns to QA queue |

**What it does atomically:**

1. Validates role:result combination
2. Resolves project and active worker
3. Executes completion via pipeline service (label transition + side effects)
4. Deactivates worker (sessions map preserved for reuse)
5. Sends notification
6. Ticks queue to fill free worker slots
7. Writes audit log

**Auto-chaining** (when enabled on the project): `dev:done` dispatches QA automatically. `qa:fail` re-dispatches DEV using the previous level.

---

## Task Management

### `task_create`

Create a new issue in the project's issue tracker.

**Source:** [`lib/tools/task-create.ts`](../lib/tools/task-create.ts)

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | Yes | Project group ID |
| `title` | string | Yes | Issue title |
| `description` | string | No | Full issue body (markdown) |
| `label` | StateLabel | No | State label. Defaults to `"Planning"`. |
| `assignees` | string[] | No | GitHub/GitLab usernames to assign |
| `pickup` | boolean | No | If true, immediately pick up for DEV after creation |

**Use cases:**

- Orchestrator creates tasks from chat messages
- Workers file follow-up bugs discovered during development
- Breaking down epics into smaller tasks

**Default behavior:** Creates issues in `"Planning"` state. Only use `"To Do"` when the user explicitly requests immediate work.

---

### `task_update`

Change an issue's state label manually without going through the full `work_start`/`work_finish` flow.

**Source:** [`lib/tools/task-update.ts`](../lib/tools/task-update.ts)

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | Yes | Project group ID |
| `issueId` | number | Yes | Issue ID to update |
| `state` | StateLabel | Yes | New state label |
| `reason` | string | No | Audit log reason for the change |

**Valid states:** `Planning`, `To Do`, `Doing`, `To Test`, `Testing`, `Done`, `To Improve`, `Refining`

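The eight valid states lend themselves to a string-literal union with a runtime check. The sketch below is one way to model `StateLabel`; the constant `STATE_LABELS` and the guard `isStateLabel` are illustrative names, not confirmed exports of the plugin.

```typescript
// The eight state labels listed above, kept in one place.
const STATE_LABELS = [
  "Planning",
  "To Do",
  "Doing",
  "To Test",
  "Testing",
  "Done",
  "To Improve",
  "Refining",
] as const;

// Union type derived from the list, so the two cannot drift apart.
type StateLabel = (typeof STATE_LABELS)[number];

// Runtime guard for untrusted input such as a tool parameter.
function isStateLabel(value: string): value is StateLabel {
  return (STATE_LABELS as readonly string[]).indexOf(value) !== -1;
}
```

The check is case-sensitive and exact, so `"To Do"` passes while `"todo"` does not.
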

**Use cases:**

- Manual state adjustments (e.g. `Planning → To Do` after approval)
- Failed auto-transitions that need correction
- Bulk state changes by orchestrator

---

### `task_comment`

Add a comment to an issue for feedback, notes, or discussion.

**Source:** [`lib/tools/task-comment.ts`](../lib/tools/task-comment.ts)

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | Yes | Project group ID |
| `issueId` | number | Yes | Issue ID to comment on |
| `body` | string | Yes | Comment body (markdown) |
| `authorRole` | `"dev"` \| `"qa"` \| `"orchestrator"` | No | Attribution role prefix |

**Use cases:**

- QA adds review feedback before pass/fail decision
- DEV posts implementation notes or progress updates
- Orchestrator adds summary comments

When `authorRole` is provided, the comment is prefixed with a role emoji and attribution label.

---

## Operations

### `status`

Lightweight queue + worker state dashboard.

**Source:** [`lib/tools/status.ts`](../lib/tools/status.ts)

**Context:** Auto-filters to project in group chats. Shows all projects in DMs.

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | No | Filter to specific project. Omit for all. |

**Returns per project:**

- Worker state: active/idle, current issue, level, start time
- Queue counts: To Do, To Test, To Improve
- Role execution mode

---

### `health`

Worker health scan with optional auto-fix.

**Source:** [`lib/tools/health.ts`](../lib/tools/health.ts)

**Context:** Auto-filters to project in group chats.

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | No | Filter to specific project. Omit for all. |
| `fix` | boolean | No | Apply fixes for detected issues. Default: `false` (read-only). |
| `activeSessions` | string[] | No | Active session IDs for zombie detection. |

**Health checks:**

| Issue | Severity | Detection | Auto-fix |
|---|---|---|---|
| Active worker with no session key | Critical | `active=true` but no session in map | Deactivate worker |
| Active worker whose session is dead | Critical | Session key not in active sessions list | Deactivate worker, revert label |
| Worker active >2 hours | Warning | `startTime` older than 2h | Deactivate worker, revert label to queue |
| Inactive worker with lingering issue ID | Warning | `active=false` but `issueId` still set | Clear issueId |

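The detection column of the table can be read as a small pure function over one worker's state. The sketch below is illustrative only: the `WorkerState` fields, the finding codes, and the choice to skip the dead-session check when `activeSessions` is omitted are all assumptions, and the real `checkWorkerHealth` may differ.

```typescript
// Hypothetical worker shape; field names follow the table above.
interface WorkerState {
  active: boolean;
  issueId: number | null;
  sessionKey: string | null;
  startTime: string | null; // ISO 8601 timestamp
}

interface Finding {
  code: "no_session" | "dead_session" | "stale" | "lingering_issue";
  severity: "critical" | "warning";
}

const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // 2 hours

// Detection only; applying fixes (deactivate, revert label) is left out.
function checkWorker(
  worker: WorkerState,
  activeSessions: string[] | undefined,
  now: number,
): Finding[] {
  const findings: Finding[] = [];
  if (worker.active) {
    if (worker.sessionKey === null) {
      findings.push({ code: "no_session", severity: "critical" });
    } else if (
      activeSessions !== undefined &&
      activeSessions.indexOf(worker.sessionKey) === -1
    ) {
      // Assumption: without a session list, zombie detection is skipped.
      findings.push({ code: "dead_session", severity: "critical" });
    }
    if (
      worker.startTime !== null &&
      now - Date.parse(worker.startTime) > STALE_AFTER_MS
    ) {
      findings.push({ code: "stale", severity: "warning" });
    }
  } else if (worker.issueId !== null) {
    findings.push({ code: "lingering_issue", severity: "warning" });
  }
  return findings;
}
```

Passing `now` as a parameter instead of calling `Date.now()` inside keeps the check deterministic and easy to test.
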

---

### `work_heartbeat`

Manual trigger for heartbeat: health fix + queue dispatch. Same logic as the background heartbeat service, but invoked on demand.

**Source:** [`lib/tools/work-heartbeat.ts`](../lib/tools/work-heartbeat.ts)

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | No | Target single project. Omit for all. |
| `dryRun` | boolean | No | Report only, don't dispatch. Default: `false`. |
| `maxPickups` | number | No | Max worker dispatches per tick. |
| `activeSessions` | string[] | No | Active session IDs for zombie detection. |

**Two-pass sweep:**

1. **Health pass** — Runs `checkWorkerHealth` per project per role. Auto-fixes zombies, stale workers, orphaned state.
2. **Tick pass** — Calls `projectTick` per project. Fills free worker slots by priority (To Improve > To Test > To Do).

**Execution guards:**

- `projectExecution: "sequential"` — only one project active at a time
- `roleExecution: "sequential"` — only one role (DEV or QA) active at a time per project (enforced in `projectTick`)

---

## Setup

### `project_register`

One-time project setup. Creates state labels, scaffolds prompt files, adds project to state.

**Source:** [`lib/tools/project-register.ts`](../lib/tools/project-register.ts)

**Context:** Only works in the Telegram/WhatsApp group being registered.

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `projectGroupId` | string | No | Auto-detected from current group if omitted |
| `name` | string | Yes | Short project name (e.g. `my-webapp`) |
| `repo` | string | Yes | Path to git repo (e.g. `~/git/my-project`) |
| `groupName` | string | No | Display name. Defaults to `Project: {name}`. |
| `baseBranch` | string | Yes | Base branch for development |
| `deployBranch` | string | No | Deploy branch. Defaults to baseBranch. |
| `deployUrl` | string | No | Deployment URL |
| `roleExecution` | `"parallel"` \| `"sequential"` | No | DEV/QA parallelism. Default: `"parallel"`. |

**What it does atomically:**

1. Validates project not already registered
2. Resolves repo path, auto-detects GitHub/GitLab from git remote
3. Verifies provider health (CLI installed and authenticated)
4. Creates all 8 state labels (idempotent — safe to run again)
5. Adds project entry to `projects.json` with empty worker state
   - DEV sessions: `{ junior: null, medior: null, senior: null }`
   - QA sessions: `{ reviewer: null, tester: null }`
6. Scaffolds prompt files: `projects/roles/<project>/dev.md` and `qa.md`
7. Writes audit log

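The empty worker state from step 5 could be built as in the sketch below. Only the two `sessions` maps are taken from the list above; the other field names (`active`, `issueId`, `level`) and the helper `emptyWorker` are assumptions about how `projects.json` is laid out.

```typescript
// Hypothetical per-role worker record.
interface WorkerState {
  active: boolean;
  issueId: number | null;
  level: string | null;
  // One session slot per level; null means no session has been created yet.
  sessions: Record<string, string | null>;
}

function emptyWorker(levels: readonly string[]): WorkerState {
  const sessions: Record<string, string | null> = {};
  for (const level of levels) {
    sessions[level] = null;
  }
  return { active: false, issueId: null, level: null, sessions };
}

// Initial state for a freshly registered project.
const initialWorkers = {
  dev: emptyWorker(["junior", "medior", "senior"]),
  qa: emptyWorker(["reviewer", "tester"]),
};
```
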

---

### `setup`

Agent + workspace initialization.

**Source:** [`lib/tools/setup.ts`](../lib/tools/setup.ts)

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `newAgentName` | string | No | Create a new agent. Omit to configure current workspace. |
| `channelBinding` | `"telegram"` \| `"whatsapp"` | No | Channel to bind (with `newAgentName` only) |
| `migrateFrom` | string | No | Agent ID to migrate channel binding from |
| `models` | object | No | Model overrides per role and level (see [Configuration](CONFIGURATION.md#model-tiers)) |
| `projectExecution` | `"parallel"` \| `"sequential"` | No | Project execution mode |

**What it does:**

1. Creates a new agent or configures existing workspace
2. Optionally binds messaging channel (Telegram/WhatsApp)
3. Optionally migrates channel binding from another agent
4. Writes workspace files: AGENTS.md, HEARTBEAT.md, `projects/projects.json`
5. Configures model tiers in `openclaw.json`

---

### `onboard`

Conversational onboarding guide. Returns step-by-step instructions for the agent to walk the user through setup.

**Source:** [`lib/tools/onboard.ts`](../lib/tools/onboard.ts)

**Context:** Works in DMs and via-agent. Blocks group chats (setup should not happen in project groups).

**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `mode` | `"first-run"` \| `"reconfigure"` | No | Auto-detected from current state |

**Flow:**

1. Call `onboard`, which returns Q&A-style step-by-step instructions
2. Agent walks user through: agent selection, channel binding, model tiers
3. Agent calls `setup` with collected answers
4. User registers projects via `project_register` in group chats

---

## Completion Rules Reference

The pipeline service (`lib/services/pipeline.ts`) defines declarative completion rules:

```
dev:done    → Doing   → To Test     (git pull, detect PR)
dev:blocked → Doing   → To Do       (return to queue)
qa:pass     → Testing → Done        (close issue)
qa:fail     → Testing → To Improve  (reopen issue)
qa:refine   → Testing → Refining    (await human decision)
qa:blocked  → Testing → To Test     (return to QA queue)
```
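
One way to picture "declarative" here is a lookup table keyed by `role:result`. The sketch below is not the contents of `lib/services/pipeline.ts`; the names `COMPLETION_RULES`, `CompletionRule`, and `getRule` are hypothetical, and side effects are shown as plain descriptions instead of executable steps.

```typescript
type Role = "dev" | "qa";

interface CompletionRule {
  from: string;   // label the issue must currently carry
  to: string;     // label after completion
  effect: string; // human-readable side effect
}

// Hypothetical rule table mirroring the six rules listed above.
const COMPLETION_RULES: Record<string, CompletionRule> = {
  "dev:done": { from: "Doing", to: "To Test", effect: "git pull, detect PR" },
  "dev:blocked": { from: "Doing", to: "To Do", effect: "return to queue" },
  "qa:pass": { from: "Testing", to: "Done", effect: "close issue" },
  "qa:fail": { from: "Testing", to: "To Improve", effect: "reopen issue" },
  "qa:refine": { from: "Testing", to: "Refining", effect: "await human decision" },
  "qa:blocked": { from: "Testing", to: "To Test", effect: "return to QA queue" },
};

// Validates the role:result combination and returns its rule.
function getRule(role: Role, result: string): CompletionRule {
  const rule = COMPLETION_RULES[`${role}:${result}`];
  if (rule === undefined) {
    throw new Error(`Invalid completion: ${role}:${result}`);
  }
  return rule;
}
```

With this layout, an invalid pair such as `dev:pass` is rejected by the same lookup that drives the label transition.
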

## Issue Priority Order

When the heartbeat or `work_heartbeat` fills free worker slots, issues are prioritized:

1. **To Improve** — QA failures get fixed first (highest priority)
2. **To Test** — Completed DEV work gets reviewed next
3. **To Do** — Fresh tasks are picked up last

This ensures the pipeline clears its backlog before starting new work.
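
The ordering can be sketched as a sort keyed on each label's position in a priority list. This is an illustration under assumptions: `QUEUE_PRIORITY`, `QueuedIssue`, and `sortByPriority` are hypothetical names, and how `projectTick` orders issues within the same label is not specified here.

```typescript
// Queue labels in dispatch order, highest priority first.
const QUEUE_PRIORITY = ["To Improve", "To Test", "To Do"] as const;
type QueueLabel = (typeof QUEUE_PRIORITY)[number];

interface QueuedIssue {
  id: number;
  label: QueueLabel;
}

// Returns a new array; issues with the same label keep their input order
// on engines with a stable sort (ES2019 and later).
function sortByPriority(issues: QueuedIssue[]): QueuedIssue[] {
  return issues
    .slice()
    .sort((a, b) => QUEUE_PRIORITY.indexOf(a.label) - QUEUE_PRIORITY.indexOf(b.label));
}
```
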