Complete documentation rewrite to match the current codebase: - README: add benefits section (process consistency, token savings with estimates, project isolation, continuous planning, feedback loops, role-based prompts, atomic operations, audit trail), task workflow with state diagram, model-to-role mapping tables, installation guide - New TOOLS.md: complete reference for all 11 tools with parameters, behavior, and execution guards - New CONFIGURATION.md: full config reference for openclaw.json, projects.json, heartbeat, notifications, workspace layout - Fix tool names across all docs: task_pickup→work_start, task_complete→work_finish - Fix tier model: QA has reviewer/tester levels, not flat "qa" - Fix config schema: nested models.dev.*/models.qa.* structure - Fix prompt path: projects/roles/ not projects/prompts/ - Fix worker state: uses "level" field not "model"/"tier" - Fix MANAGEMENT.md: remove incorrect model references - Fix TESTING.md: update model config example to nested structure - Remove VERIFICATION.md (one-off checklist, no longer needed) - Add cross-references between all docs pages https://claude.ai/code/session_01R3rGevPY748gP4uK2ggYag
341 lines
14 KiB
Markdown
341 lines
14 KiB
Markdown
<p align="center">
|
|
<img src="assets/DevClaw.png" width="300" alt="DevClaw Logo">
|
|
</p>
|
|
|
|
# DevClaw — Development Plugin for OpenClaw
|
|
|
|
**Every group chat becomes an autonomous development team.**
|
|
|
|
Add an agent to a Telegram/WhatsApp group, point it at a GitHub/GitLab repo — that group now has an **orchestrator** managing the backlog, a **DEV** worker writing code, and a **QA** worker reviewing it. All autonomous. Add another group, get another team. Each project runs in complete isolation with its own task queue, workers, and session state.
|
|
|
|
DevClaw is the [OpenClaw](https://openclaw.ai) plugin that makes this work.
|
|
|
|
## Benefits
|
|
|
|
### Process consistency
|
|
|
|
Every task follows the same fixed pipeline — `Planning → To Do → Doing → To Test → Testing → Done` — across every project. Label transitions, state updates, session dispatch, and audit logging happen atomically inside the plugin. The orchestrator agent **cannot** skip a step, forget a label, or corrupt session state. Hundreds of lines of manual orchestration logic collapse into a single `work_start` call.
|
|
|
|
### Token savings
|
|
|
|
DevClaw reduces token consumption at three levels:
|
|
|
|
| Mechanism | How it works | Estimated savings |
|
|
|---|---|---|
|
|
| **Shared sessions** | Each developer level per role maintains one persistent session per project. When a medior dev finishes task A and picks up task B, the plugin reuses the existing session — no codebase re-reading. | **~40-60%** per task (~50K tokens saved per session reuse) |
|
|
| **Tier selection** | Junior for typos (Haiku), medior for features (Sonnet), senior for architecture (Opus). The right model for the job means you're not burning Opus tokens on a CSS fix. | **~30-50%** on simple tasks vs. always using the largest model |
|
|
| **Token-free heartbeat** | The heartbeat service runs every 60s doing health checks and queue dispatch using pure deterministic code + CLI calls. Zero LLM tokens consumed. Workers only use tokens when they actually process tasks. | **100%** savings on orchestration overhead |
|
|
|
|
### Project isolation and parallelization
|
|
|
|
Each project is fully isolated — separate task queue, separate worker state, separate sessions. No cross-project contamination. Two execution modes control parallelism:
|
|
|
|
- **Project-level**: DEV and QA can work simultaneously on different tasks (parallel, default) or one role at a time (sequential)
|
|
- **Plugin-level**: Multiple projects can have active workers at once (parallel, default) or only one project active at a time (sequential)
|
|
|
|
### Continuous planning
|
|
|
|
The heartbeat service runs a continuous loop: health check → queue scan → dispatch. It detects stale workers (>2 hours), auto-reverts stuck labels, and fills free worker slots — all without human intervention or agent LLM tokens. The orchestrator agent only gets involved when a decision requires judgment.
|
|
|
|
### Feedback loops
|
|
|
|
Three automated feedback loops keep the pipeline self-correcting:
|
|
|
|
1. **Auto-chaining** — DEV "done" automatically dispatches QA. QA "fail" automatically re-dispatches DEV. No orchestrator action needed.
|
|
2. **Stale worker watchdog** — Workers active >2 hours are auto-detected. Labels revert to queue, workers deactivated. Tasks available for retry.
|
|
3. **Completion enforcement** — Every task message includes a mandatory `work_finish` section. Workers use `"blocked"` if stuck. Three-layer guarantee prevents tasks from getting stuck forever.
|
|
|
|
### Role-based instruction prompts
|
|
|
|
Workers receive customizable, project-specific instructions loaded at dispatch time:
|
|
|
|
```
|
|
workspace/projects/roles/
|
|
├── my-webapp/
|
|
│ ├── dev.md ← "Run npm test before committing. Deploy URL: ..."
|
|
│ └── qa.md ← "Check OAuth flow. Verify mobile responsiveness."
|
|
└── default/
|
|
├── dev.md ← Fallback for projects without custom instructions
|
|
└── qa.md
|
|
```
|
|
|
|
Edit these files to inject deployment steps, test commands, acceptance criteria, or coding standards — per project, per role.
|
|
|
|
### Atomic operations with rollback
|
|
|
|
Every tool call wraps multiple operations (label transition + state update + session dispatch + audit log) into a single atomic action. If session dispatch fails, the label transition is rolled back. No orphaned state. No half-completed operations.
|
|
|
|
### Full audit trail
|
|
|
|
Every tool call automatically appends an NDJSON entry to `log/audit.log`. Query with `jq` to trace any task's full history. No manual logging required from the orchestrator.
|
|
|
|
---
|
|
|
|
## The model-to-role mapping
|
|
|
|
DevClaw doesn't expose raw model names. You're assigning a _junior developer_ to fix a typo, not configuring `anthropic/claude-haiku-4-5`. Each developer level maps to a configurable LLM:
|
|
|
|
### DEV levels
|
|
|
|
| Level | Who they are | Default model | Assigns to |
|
|
|---|---|---|---|
|
|
| `junior` | The intern | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, CSS changes |
|
|
| `medior` | The reliable mid-level | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
|
|
| `senior` | The architect | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
|
|
|
|
### QA levels
|
|
|
|
| Level | Who they are | Default model | Assigns to |
|
|
|---|---|---|---|
|
|
| `reviewer` | The code reviewer | `anthropic/claude-sonnet-4-5` | Code review, test validation, PR inspection |
|
|
| `tester` | The QA tester | `anthropic/claude-haiku-4-5` | Manual testing, smoke tests |
|
|
|
|
The orchestrator LLM evaluates each issue and picks the appropriate level. A keyword-based heuristic in `model-selector.ts` serves as fallback when the orchestrator omits the level. Override which model powers each level in [`openclaw.json`](docs/CONFIGURATION.md#model-tiers).
|
|
|
|
---
|
|
|
|
## Task workflow
|
|
|
|
Every task (issue) moves through a fixed pipeline of label states. DevClaw tools handle every transition atomically.
|
|
|
|
```mermaid
|
|
stateDiagram-v2
|
|
[*] --> Planning
|
|
Planning --> ToDo: Ready for development
|
|
|
|
ToDo --> Doing: work_start (DEV) ⇄ blocked
|
|
Doing --> ToTest: work_finish (DEV done)
|
|
|
|
ToTest --> Testing: work_start (QA) / auto-chain ⇄ blocked
|
|
Testing --> Done: work_finish (QA pass)
|
|
Testing --> ToImprove: work_finish (QA fail)
|
|
Testing --> Refining: work_finish (QA refine)
|
|
|
|
ToImprove --> Doing: work_start (DEV fix) or auto-chain
|
|
Refining --> ToDo: Human decision
|
|
|
|
Done --> [*]
|
|
```
|
|
|
|
### The eight state labels
|
|
|
|
| Label | Color | Meaning |
|
|
|---|---|---|
|
|
| **Planning** | Blue-grey | Pre-work review — issue exists but not ready for development |
|
|
| **To Do** | Blue | Ready for DEV pickup |
|
|
| **Doing** | Orange | DEV actively working |
|
|
| **To Test** | Cyan | Ready for QA pickup |
|
|
| **Testing** | Purple | QA actively reviewing |
|
|
| **Done** | Green | Complete — issue closed |
|
|
| **To Improve** | Red | QA failed — back to DEV |
|
|
| **Refining** | Yellow | Awaiting human decision |
|
|
|
|
### Worker self-reporting
|
|
|
|
Workers call `work_finish` directly when they're done — no orchestrator involvement needed for the state transition. Workers can also call `task_create` to file follow-up issues they discover during work.
|
|
|
|
### Auto-chaining
|
|
|
|
When a project has auto-chaining enabled:
|
|
|
|
- **DEV "done"** → QA is dispatched immediately (using the reviewer level)
|
|
- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV level)
|
|
- **QA "pass" / "refine" / "blocked"** → no chaining (pipeline done, needs human input, or returned to queue)
|
|
- **DEV "blocked"** → no chaining (returned to queue for retry)
|
|
|
|
### Completion enforcement
|
|
|
|
Three layers guarantee tasks never get stuck:
|
|
|
|
1. **Completion contract** — Every task message includes a mandatory section requiring `work_finish`, even on failure. Workers use `"blocked"` if stuck.
|
|
2. **Blocked result** — Both DEV and QA can gracefully put a task back in queue (`Doing → To Do`, `Testing → To Test`).
|
|
3. **Stale worker watchdog** — Heartbeat detects workers active >2 hours and auto-reverts labels to queue.
|
|
|
|
---
|
|
|
|
## Installation
|
|
|
|
### Requirements
|
|
|
|
| Requirement | Why | Verify |
|
|
|---|---|---|
|
|
| [OpenClaw](https://openclaw.ai) | DevClaw is an OpenClaw plugin | `openclaw --version` |
|
|
| Node.js >= 20 | Plugin runtime | `node --version` |
|
|
| [`gh`](https://cli.github.com) or [`glab`](https://gitlab.com/gitlab-org/cli) CLI | Issue tracker provider (auto-detected from git remote) | `gh --version` / `glab --version` |
|
|
| CLI authenticated | Plugin calls gh/glab for every label transition | `gh auth status` / `glab auth status` |
|
|
|
|
### Install the plugin
|
|
|
|
```bash
|
|
cp -r devclaw ~/.openclaw/extensions/
|
|
```
|
|
|
|
Verify:
|
|
|
|
```bash
|
|
openclaw plugins list
|
|
# Should show: DevClaw | devclaw | loaded
|
|
```
|
|
|
|
### Run setup
|
|
|
|
Three options — pick one:
|
|
|
|
**Option A: Conversational onboarding (recommended)**
|
|
|
|
Call the `onboard` tool from any agent with DevClaw loaded. It walks through configuration step by step.
|
|
|
|
**Option B: CLI wizard**
|
|
|
|
```bash
|
|
openclaw devclaw setup
|
|
```
|
|
|
|
**Option C: Non-interactive CLI**
|
|
|
|
```bash
|
|
openclaw devclaw setup --new-agent "My Orchestrator"
|
|
```
|
|
|
|
Setup creates an agent, configures model tiers, writes workspace files (AGENTS.md, HEARTBEAT.md, role templates), and optionally binds a messaging channel.
|
|
|
|
### Register a project
|
|
|
|
In the Telegram/WhatsApp group for the project:
|
|
|
|
> "Register project my-app at ~/git/my-app with base branch main"
|
|
|
|
The agent calls `project_register`, which atomically creates all 8 state labels, scaffolds role instruction files, and adds the project to `projects.json`.
|
|
|
|
### Start working
|
|
|
|
```
|
|
"Check the queue" → agent calls status
|
|
"Pick up issue #1 for DEV" → agent calls work_start
|
|
[DEV works autonomously] → calls work_finish when done
|
|
[Heartbeat fills next slot] → QA dispatched automatically
|
|
```
|
|
|
|
See the [Onboarding Guide](docs/ONBOARDING.md) for detailed step-by-step instructions.
|
|
|
|
---
|
|
|
|
## How it works
|
|
|
|
```mermaid
|
|
graph TB
|
|
subgraph "Group Chat A"
|
|
direction TB
|
|
A_O["Orchestrator"]
|
|
A_GL[GitHub/GitLab Issues]
|
|
A_DEV["DEV (worker session)"]
|
|
A_QA["QA (worker session)"]
|
|
A_O -->|work_start| A_GL
|
|
A_O -->|dispatches| A_DEV
|
|
A_O -->|dispatches| A_QA
|
|
end
|
|
|
|
subgraph "Group Chat B"
|
|
direction TB
|
|
B_O["Orchestrator"]
|
|
B_GL[GitHub/GitLab Issues]
|
|
B_DEV["DEV (worker session)"]
|
|
B_QA["QA (worker session)"]
|
|
B_O -->|work_start| B_GL
|
|
B_O -->|dispatches| B_DEV
|
|
B_O -->|dispatches| B_QA
|
|
end
|
|
|
|
AGENT["Single OpenClaw Agent"]
|
|
AGENT --- A_O
|
|
AGENT --- B_O
|
|
```
|
|
|
|
Same agent process — each group chat gives it a different project context. The orchestrator role, the workers, the task queue, and all state are fully isolated per group.
|
|
|
|
---
|
|
|
|
## Session reuse
|
|
|
|
Worker sessions are expensive to start — each new spawn reads the full codebase (~50K tokens). DevClaw maintains **separate sessions per level per role** (session-per-level design). When a medior dev finishes task A and picks up task B on the same project, the plugin detects the existing session and sends the task directly.
|
|
|
|
The plugin handles session dispatch internally via OpenClaw CLI. The orchestrator agent never calls `sessions_spawn` or `sessions_send` — it calls `work_start` and the plugin does the rest.
|
|
|
|
```mermaid
|
|
sequenceDiagram
|
|
participant O as Orchestrator
|
|
participant DC as DevClaw Plugin
|
|
participant IT as Issue Tracker
|
|
participant S as Worker Session
|
|
|
|
O->>DC: work_start({ issueId: 42, role: "dev" })
|
|
DC->>IT: Fetch issue, verify label
|
|
DC->>DC: Assign level (junior/medior/senior)
|
|
DC->>DC: Check existing session for assigned level
|
|
DC->>IT: Transition label (To Do → Doing)
|
|
DC->>S: Dispatch task via CLI (create or reuse session)
|
|
DC->>DC: Update projects.json, write audit log
|
|
DC-->>O: { success: true, announcement: "..." }
|
|
```
|
|
|
|
---
|
|
|
|
## Tools
|
|
|
|
DevClaw registers **11 tools**, grouped by function:
|
|
|
|
### Worker lifecycle
|
|
|
|
| Tool | Description |
|
|
|---|---|
|
|
| [`work_start`](docs/TOOLS.md#work_start) | Pick up a task — handles level assignment, label transition, session dispatch, audit |
|
|
| [`work_finish`](docs/TOOLS.md#work_finish) | Complete a task — handles label transition, state update, auto-chaining, queue tick |
|
|
|
|
### Task management
|
|
|
|
| Tool | Description |
|
|
|---|---|
|
|
| [`task_create`](docs/TOOLS.md#task_create) | Create a new issue in the tracker |
|
|
| [`task_update`](docs/TOOLS.md#task_update) | Change an issue's state label manually |
|
|
| [`task_comment`](docs/TOOLS.md#task_comment) | Add a comment to an issue |
|
|
|
|
### Operations
|
|
|
|
| Tool | Description |
|
|
|---|---|
|
|
| [`status`](docs/TOOLS.md#status) | Queue counts + worker state dashboard |
|
|
| [`health`](docs/TOOLS.md#health) | Worker health checks + zombie detection |
|
|
| [`work_heartbeat`](docs/TOOLS.md#work_heartbeat) | Manual trigger for health + queue dispatch |
|
|
|
|
### Setup
|
|
|
|
| Tool | Description |
|
|
|---|---|
|
|
| [`project_register`](docs/TOOLS.md#project_register) | One-time project setup (labels, prompts, state) |
|
|
| [`setup`](docs/TOOLS.md#setup) | Agent + workspace initialization |
|
|
| [`onboard`](docs/TOOLS.md#onboard) | Conversational onboarding guide |
|
|
|
|
See the [Tools Reference](docs/TOOLS.md) for full parameters and usage.
|
|
|
|
---
|
|
|
|
## Documentation
|
|
|
|
| Document | Description |
|
|
|---|---|
|
|
| [Architecture](docs/ARCHITECTURE.md) | System design, session-per-level model, data flow, component interactions |
|
|
| [Tools Reference](docs/TOOLS.md) | Complete reference for all 11 tools with parameters and examples |
|
|
| [Configuration](docs/CONFIGURATION.md) | Full config reference — `openclaw.json`, `projects.json`, heartbeat, notifications |
|
|
| [Onboarding Guide](docs/ONBOARDING.md) | Step-by-step setup: install, configure, register projects, test the pipeline |
|
|
| [QA Workflow](docs/QA_WORKFLOW.md) | QA process: review documentation, comment templates, enforcement |
|
|
| [Context Awareness](docs/CONTEXT-AWARENESS.md) | How DevClaw adapts behavior based on interaction context |
|
|
| [Testing Guide](docs/TESTING.md) | Automated test suite: scenarios, fixtures, CI/CD integration |
|
|
| [Management Theory](docs/MANAGEMENT.md) | The delegation theory behind DevClaw's design |
|
|
| [Roadmap](docs/ROADMAP.md) | Planned features: configurable roles, channel-agnostic groups, Jira |
|
|
|
|
---
|
|
|
|
## License
|
|
|
|
MIT
|