devclaw-gitea/README.md
Claude 553efcc146 docs: overhaul documentation for consistency with implementation
Complete documentation rewrite to match the current codebase:

- README: add benefits section (process consistency, token savings with
  estimates, project isolation, continuous planning, feedback loops,
  role-based prompts, atomic operations, audit trail), task workflow
  with state diagram, model-to-role mapping tables, installation guide
- New TOOLS.md: complete reference for all 11 tools with parameters,
  behavior, and execution guards
- New CONFIGURATION.md: full config reference for openclaw.json,
  projects.json, heartbeat, notifications, workspace layout
- Fix tool names across all docs: task_pickup→work_start,
  task_complete→work_finish
- Fix tier model: QA has reviewer/tester levels, not flat "qa"
- Fix config schema: nested models.dev.*/models.qa.* structure
- Fix prompt path: projects/roles/ not projects/prompts/
- Fix worker state: uses "level" field not "model"/"tier"
- Fix MANAGEMENT.md: remove incorrect model references
- Fix TESTING.md: update model config example to nested structure
- Remove VERIFICATION.md (one-off checklist, no longer needed)
- Add cross-references between all docs pages

https://claude.ai/code/session_01R3rGevPY748gP4uK2ggYag
2026-02-10 20:13:22 +00:00

<p align="center">
<img src="assets/DevClaw.png" width="300" alt="DevClaw Logo">
</p>

# DevClaw — Development Plugin for OpenClaw

**Every group chat becomes an autonomous development team.**

Add an agent to a Telegram/WhatsApp group and point it at a GitHub/GitLab repo. That group now has an **orchestrator** managing the backlog, a **DEV** worker writing code, and a **QA** worker reviewing it, all running autonomously. Add another group, get another team. Each project runs in complete isolation with its own task queue, workers, and session state.

DevClaw is the [OpenClaw](https://openclaw.ai) plugin that makes this work.
## Benefits
### Process consistency
Every task follows the same fixed pipeline — `Planning → To Do → Doing → To Test → Testing → Done` — across every project. Label transitions, state updates, session dispatch, and audit logging happen atomically inside the plugin. The orchestrator agent **cannot** skip a step, forget a label, or corrupt session state. Hundreds of lines of manual orchestration logic collapse into a single `work_start` call.
### Token savings
DevClaw reduces token consumption at three levels:
| Mechanism | How it works | Estimated savings |
|---|---|---|
| **Shared sessions** | Each developer level per role maintains one persistent session per project. When a medior dev finishes task A and picks up task B, the plugin reuses the existing session — no codebase re-reading. | **~40-60%** per task (~50K tokens saved per session reuse) |
| **Tier selection** | Junior for typos (Haiku), medior for features (Sonnet), senior for architecture (Opus). The right model for the job means you're not burning Opus tokens on a CSS fix. | **~30-50%** on simple tasks vs. always using the largest model |
| **Token-free heartbeat** | The heartbeat service runs every 60s doing health checks and queue dispatch using pure deterministic code + CLI calls. Zero LLM tokens consumed. Workers only use tokens when they actually process tasks. | **100%** savings on orchestration overhead |
### Project isolation and parallelization
Each project is fully isolated — separate task queue, separate worker state, separate sessions. No cross-project contamination. Two execution modes control parallelism:
- **Project-level**: DEV and QA can work simultaneously on different tasks (parallel, default) or one role at a time (sequential)
- **Plugin-level**: Multiple projects can have active workers at once (parallel, default) or only one project active at a time (sequential)
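
The two modes combine into a single "may this role start now?" check. The sketch below is only an illustration of that logic: the `ProjectSlots` shape, the `canDispatch` name, and the one-worker-per-role assumption are hypothetical and not taken from DevClaw's source.

```typescript
// Hypothetical types for illustration; not DevClaw's actual schema.
type Role = "dev" | "qa";
type ExecutionMode = "parallel" | "sequential";

interface ProjectSlots {
  name: string;
  mode: ExecutionMode; // project-level mode
  active: Record<Role, boolean>; // true while that role has a running worker
}

// Decide whether `role` may start a task in `project` under both modes.
function canDispatch(
  role: Role,
  project: ProjectSlots,
  allProjects: ProjectSlots[],
  pluginMode: ExecutionMode,
): boolean {
  // Assumption: one worker slot per role per project.
  if (project.active[role]) return false;

  // Project-level sequential: DEV and QA take turns.
  const otherRole: Role = role === "dev" ? "qa" : "dev";
  if (project.mode === "sequential" && project.active[otherRole]) return false;

  // Plugin-level sequential: only one project may have active workers.
  if (pluginMode === "sequential") {
    const otherProjectBusy = allProjects.some(
      (p) => p.name !== project.name && (p.active.dev || p.active.qa),
    );
    if (otherProjectBusy) return false;
  }
  return true;
}
```

With both modes left at the parallel default, the only thing that blocks a dispatch in this sketch is the role's own slot being occupied.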
### Continuous planning
The heartbeat service runs a continuous loop: health check → queue scan → dispatch. It detects stale workers (>2 hours), auto-reverts stuck labels, and fills free worker slots — all without human intervention or agent LLM tokens. The orchestrator agent only gets involved when a decision requires judgment.
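
A single pass of that loop can be expressed as plain deterministic code, which is why it costs no LLM tokens. This is a minimal sketch under stated assumptions: `ActiveWorker`, `TickPlan`, and `planTick` are hypothetical names, and the real service additionally calls the issue tracker CLI to apply the label changes.

```typescript
// Hypothetical shapes; the real worker state may look different.
interface ActiveWorker {
  issueId: number;
  startedAt: number; // epoch milliseconds
}

interface TickPlan {
  revert: number[]; // issues whose labels go back to the queue
  dispatch: number[]; // queued issues to start now
}

const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // the ">2 hours" threshold

// One deterministic pass: health check, then queue scan, then dispatch.
function planTick(
  active: ActiveWorker[],
  queue: number[],
  totalSlots: number,
  now: number,
): TickPlan {
  const stale = active.filter((w) => now - w.startedAt > STALE_AFTER_MS);
  const healthyCount = active.length - stale.length;
  const freeSlots = Math.max(0, totalSlots - healthyCount);
  return {
    revert: stale.map((w) => w.issueId),
    // Assumption: reverted issues become eligible again on a later tick.
    dispatch: queue.slice(0, freeSlots),
  };
}
```

Returning a plan instead of performing side effects keeps the decision logic easy to test in isolation.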
### Feedback loops
Three automated feedback loops keep the pipeline self-correcting:
1. **Auto-chaining**: When enabled for a project, DEV "done" automatically dispatches QA, and QA "fail" automatically re-dispatches DEV. No orchestrator action is needed.
2. **Stale worker watchdog**: Workers active for more than 2 hours are detected automatically. Their labels revert to the queue, the workers are deactivated, and the tasks become available for retry.
3. **Completion enforcement**: Every task message includes a mandatory `work_finish` section. Workers use `"blocked"` if stuck. A three-layer guarantee keeps tasks from getting stuck indefinitely.
### Role-based instruction prompts
Workers receive customizable, project-specific instructions loaded at dispatch time:
```
workspace/projects/roles/
├── my-webapp/
│   ├── dev.md    ← "Run npm test before committing. Deploy URL: ..."
│   └── qa.md     ← "Check OAuth flow. Verify mobile responsiveness."
└── default/
    ├── dev.md    ← Fallback for projects without custom instructions
    └── qa.md
```
Edit these files to inject deployment steps, test commands, acceptance criteria, or coding standards — per project, per role.
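
The fallback to `default/` amounts to a two-step path lookup. The helper below is a hypothetical sketch of that lookup, not the plugin's loader; `resolveRolePrompt` and the injected `exists` predicate are names invented for the example.

```typescript
type Role = "dev" | "qa";

// Pick the instruction file for a project and role, falling back to default/.
// `exists` is injected so the sketch needs no filesystem access.
function resolveRolePrompt(
  project: string,
  role: Role,
  exists: (path: string) => boolean,
  root: string = "workspace/projects/roles",
): string {
  const specific = `${root}/${project}/${role}.md`;
  return exists(specific) ? specific : `${root}/default/${role}.md`;
}

// Usage with an in-memory list standing in for the directory tree above.
const files = [
  "workspace/projects/roles/my-webapp/dev.md",
  "workspace/projects/roles/default/dev.md",
  "workspace/projects/roles/default/qa.md",
];
const exists = (path: string): boolean => files.indexOf(path) !== -1;

const devPrompt = resolveRolePrompt("my-webapp", "dev", exists);
const qaPrompt = resolveRolePrompt("other-project", "qa", exists);
```

Here `devPrompt` resolves to the project-specific file, while `qaPrompt` falls back to `default/qa.md` because `other-project` has no custom instructions.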
### Atomic operations with rollback
Every tool call wraps multiple operations (label transition + state update + session dispatch + audit log) into a single atomic action. If session dispatch fails, the label transition is rolled back. No orphaned state. No half-completed operations.
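
One common way to get this behavior is to pair every step with an undo action and unwind completed steps in reverse when a later step fails. The sketch below illustrates that general pattern only; `Step`, `runAtomically`, and the demo are hypothetical and do not describe how DevClaw implements rollback internally.

```typescript
interface Step {
  name: string;
  run: () => void;
  undo: () => void;
}

// Run steps in order. If one throws, undo the completed steps in reverse
// order and rethrow, so callers never observe a half-applied operation.
function runAtomically(steps: Step[]): void {
  const completed: Step[] = [];
  try {
    for (const step of steps) {
      step.run();
      completed.push(step);
    }
  } catch (err) {
    for (const step of completed.reverse()) step.undo();
    throw err;
  }
}

// Demo: the label transition succeeds, session dispatch fails, label is restored.
function demoFailedDispatch(): { label: string; error: string } {
  let label = "To Do";
  let error = "";
  try {
    runAtomically([
      { name: "label", run: () => { label = "Doing"; }, undo: () => { label = "To Do"; } },
      { name: "dispatch", run: () => { throw new Error("session dispatch failed"); }, undo: () => {} },
    ]);
  } catch (err) {
    error = err instanceof Error ? err.message : String(err);
  }
  return { label, error };
}
```

In the demo the label ends up back at `To Do`, matching the "no orphaned state" guarantee described above.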
### Full audit trail
Every tool call automatically appends an NDJSON entry to `log/audit.log`. Query with `jq` to trace any task's full history. No manual logging required from the orchestrator.
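
For readers who prefer code over `jq`, tracing an issue comes down to parsing one JSON object per line and filtering. The field names (`ts`, `tool`, `issueId`) and the sample entries below are assumptions made up for this sketch; check a real `log/audit.log` for the actual schema.

```typescript
// Hypothetical entry shape; the actual fields in log/audit.log may differ.
interface AuditEntry {
  ts: string;
  tool: string;
  issueId?: number;
}

// NDJSON is one JSON object per line; blank lines are skipped.
function parseNdjson(text: string): AuditEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditEntry);
}

// Collect the tool calls recorded for one issue, in log order.
function toolsForIssue(text: string, issueId: number): string[] {
  return parseNdjson(text)
    .filter((entry) => entry.issueId === issueId)
    .map((entry) => entry.tool);
}

// Invented sample data, for illustration only.
const sampleLog = [
  '{"ts":"2026-02-10T20:00:00Z","tool":"work_start","issueId":42}',
  '{"ts":"2026-02-10T20:05:00Z","tool":"status"}',
  '{"ts":"2026-02-10T21:00:00Z","tool":"work_finish","issueId":42}',
].join("\n");

const issue42Tools = toolsForIssue(sampleLog, 42);
```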

---
## The model-to-role mapping
DevClaw doesn't expose raw model names. You're assigning a _junior developer_ to fix a typo, not configuring `anthropic/claude-haiku-4-5`. Each developer level maps to a configurable LLM:
### DEV levels
| Level | Who they are | Default model | Assigned to |
|---|---|---|---|
| `junior` | The intern | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, CSS changes |
| `medior` | The reliable mid-level | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
| `senior` | The architect | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
### QA levels
| Level | Who they are | Default model | Assigned to |
|---|---|---|---|
| `reviewer` | The code reviewer | `anthropic/claude-sonnet-4-5` | Code review, test validation, PR inspection |
| `tester` | The QA tester | `anthropic/claude-haiku-4-5` | Manual testing, smoke tests |

The orchestrator LLM evaluates each issue and picks the appropriate level. A keyword-based heuristic in `model-selector.ts` serves as a fallback when the orchestrator omits the level. Override which model powers each level in [`openclaw.json`](docs/CONFIGURATION.md#model-tiers).
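
To make the mapping concrete, the sketch below merges per-level overrides onto the defaults from the two tables and shows what a keyword fallback could look like. The `ModelOverrides` shape only mirrors a nested `models.dev.*` / `models.qa.*` layout, and the keyword lists are invented; neither is copied from `model-selector.ts` or the real config schema.

```typescript
type DevLevel = "junior" | "medior" | "senior";
type QaLevel = "reviewer" | "tester";

interface LevelModels {
  dev: Record<DevLevel, string>;
  qa: Record<QaLevel, string>;
}

// Defaults copied from the two tables above.
const DEFAULT_MODELS: LevelModels = {
  dev: {
    junior: "anthropic/claude-haiku-4-5",
    medior: "anthropic/claude-sonnet-4-5",
    senior: "anthropic/claude-opus-4-5",
  },
  qa: {
    reviewer: "anthropic/claude-sonnet-4-5",
    tester: "anthropic/claude-haiku-4-5",
  },
};

// Hypothetical override shape, mirroring a nested models.dev.* / models.qa.* layout.
interface ModelOverrides {
  dev?: Partial<Record<DevLevel, string>>;
  qa?: Partial<Record<QaLevel, string>>;
}

// Overrides win per level; anything not overridden keeps its default.
function resolveModels(overrides: ModelOverrides = {}): LevelModels {
  return {
    dev: { ...DEFAULT_MODELS.dev, ...overrides.dev },
    qa: { ...DEFAULT_MODELS.qa, ...overrides.qa },
  };
}

// Keyword fallback for DEV tasks. These keyword lists are invented for the
// sketch and are not the ones used in model-selector.ts.
function fallbackDevLevel(issueTitle: string): DevLevel {
  const title = issueTitle.toLowerCase();
  if (/architecture|migration|refactor/.test(title)) return "senior";
  if (/typo|css/.test(title)) return "junior";
  return "medior";
}
```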

---
## Task workflow
Every task (issue) moves through a fixed pipeline of label states. DevClaw tools handle every transition atomically.
```mermaid
stateDiagram-v2
[*] --> Planning
Planning --> ToDo: Ready for development
ToDo --> Doing: work_start (DEV) ⇄ blocked
Doing --> ToTest: work_finish (DEV done)
ToTest --> Testing: work_start (QA) / auto-chain ⇄ blocked
Testing --> Done: work_finish (QA pass)
Testing --> ToImprove: work_finish (QA fail)
Testing --> Refining: work_finish (QA refine)
ToImprove --> Doing: work_start (DEV fix) or auto-chain
Refining --> ToDo: Human decision
Done --> [*]
```
### The eight state labels
| Label | Color | Meaning |
|---|---|---|
| **Planning** | Blue-grey | Pre-work review — issue exists but not ready for development |
| **To Do** | Blue | Ready for DEV pickup |
| **Doing** | Orange | DEV actively working |
| **To Test** | Cyan | Ready for QA pickup |
| **Testing** | Purple | QA actively reviewing |
| **Done** | Green | Complete — issue closed |
| **To Improve** | Red | QA failed — back to DEV |
| **Refining** | Yellow | Awaiting human decision |
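
The diagram and table can be read as a small transition map. The following sketch encodes the transitions shown in the state diagram, plus the two "blocked" reverts (`Doing → To Do`, `Testing → To Test`); the `TRANSITIONS` constant and `canTransition` helper are illustrative names, not part of the plugin's API.

```typescript
type Label =
  | "Planning" | "To Do" | "Doing" | "To Test"
  | "Testing" | "Done" | "To Improve" | "Refining";

// Allowed next labels, read off the state diagram and the "blocked" results.
const TRANSITIONS: Record<Label, Label[]> = {
  "Planning": ["To Do"],
  "To Do": ["Doing"],
  "Doing": ["To Test", "To Do"], // done, blocked
  "To Test": ["Testing"],
  "Testing": ["Done", "To Improve", "Refining", "To Test"], // pass, fail, refine, blocked
  "To Improve": ["Doing"],
  "Refining": ["To Do"],
  "Done": [],
};

// True when the pipeline allows moving an issue from one label to the other.
function canTransition(from: Label, to: Label): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}
```

For example, `canTransition("To Do", "Doing")` holds, while `canTransition("To Do", "Done")` does not, because an issue cannot skip development and QA. Manual label changes through `task_update` are outside the scope of this sketch.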
### Worker self-reporting
Workers call `work_finish` directly when they're done — no orchestrator involvement needed for the state transition. Workers can also call `task_create` to file follow-up issues they discover during work.
### Auto-chaining
When a project has auto-chaining enabled:
- **DEV "done"** → QA is dispatched immediately (using the reviewer level)
- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV level)
- **QA "pass" / "refine" / "blocked"** → no chaining (pipeline done, needs human input, or returned to queue)
- **DEV "blocked"** → no chaining (returned to queue for retry)
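
These four rules reduce to a small decision function. The version below is a sketch of the rules as listed, with hypothetical names (`nextInChain`, `ChainedDispatch`); how the plugin actually stores the previous DEV level is not shown here.

```typescript
type Role = "dev" | "qa";
type WorkResult = "done" | "pass" | "fail" | "refine" | "blocked";

interface ChainedDispatch {
  role: Role;
  level: string;
}

// Returns the follow-up dispatch, or null when the chain stops.
function nextInChain(
  role: Role,
  result: WorkResult,
  autoChain: boolean,
  previousDevLevel: string,
): ChainedDispatch | null {
  if (!autoChain) return null;
  if (role === "dev" && result === "done") return { role: "qa", level: "reviewer" };
  if (role === "qa" && result === "fail") return { role: "dev", level: previousDevLevel };
  return null; // pass, refine, blocked: no chaining
}
```

Only two combinations produce a follow-up dispatch; every other result ends the chain and leaves the next move to the queue or a human.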
### Completion enforcement
Three layers guarantee tasks never get stuck:
1. **Completion contract** — Every task message includes a mandatory section requiring `work_finish`, even on failure. Workers use `"blocked"` if stuck.
2. **Blocked result** — Both DEV and QA can gracefully put a task back in queue (`Doing → To Do`, `Testing → To Test`).
3. **Stale worker watchdog** — Heartbeat detects workers active >2 hours and auto-reverts labels to queue.
---
## Installation
### Requirements
| Requirement | Why | Verify |
|---|---|---|
| [OpenClaw](https://openclaw.ai) | DevClaw is an OpenClaw plugin | `openclaw --version` |
| Node.js >= 20 | Plugin runtime | `node --version` |
| [`gh`](https://cli.github.com) or [`glab`](https://gitlab.com/gitlab-org/cli) CLI | Issue tracker provider (auto-detected from git remote) | `gh --version` / `glab --version` |
| CLI authenticated | Plugin calls gh/glab for every label transition | `gh auth status` / `glab auth status` |
### Install the plugin
```bash
cp -r devclaw ~/.openclaw/extensions/
```
Verify:
```bash
openclaw plugins list
# Should show: DevClaw | devclaw | loaded
```
### Run setup
Three options, pick one:

**Option A: Conversational onboarding (recommended)**

Call the `onboard` tool from any agent with DevClaw loaded. It walks through configuration step by step.

**Option B: CLI wizard**
```bash
openclaw devclaw setup
```
**Option C: Non-interactive CLI**
```bash
openclaw devclaw setup --new-agent "My Orchestrator"
```
Setup creates an agent, configures model tiers, writes workspace files (AGENTS.md, HEARTBEAT.md, role templates), and optionally binds a messaging channel.
### Register a project
In the Telegram/WhatsApp group for the project:

> "Register project my-app at ~/git/my-app with base branch main"

The agent calls `project_register`, which atomically creates all 8 state labels, scaffolds role instruction files, and adds the project to `projects.json`.
### Start working
```
"Check the queue" → agent calls status
"Pick up issue #1 for DEV" → agent calls work_start
[DEV works autonomously] → calls work_finish when done
[Heartbeat fills next slot] → QA dispatched automatically
```
See the [Onboarding Guide](docs/ONBOARDING.md) for detailed step-by-step instructions.

---
## How it works
```mermaid
graph TB
subgraph "Group Chat A"
direction TB
A_O["Orchestrator"]
A_GL[GitHub/GitLab Issues]
A_DEV["DEV (worker session)"]
A_QA["QA (worker session)"]
A_O -->|work_start| A_GL
A_O -->|dispatches| A_DEV
A_O -->|dispatches| A_QA
end
subgraph "Group Chat B"
direction TB
B_O["Orchestrator"]
B_GL[GitHub/GitLab Issues]
B_DEV["DEV (worker session)"]
B_QA["QA (worker session)"]
B_O -->|work_start| B_GL
B_O -->|dispatches| B_DEV
B_O -->|dispatches| B_QA
end
AGENT["Single OpenClaw Agent"]
AGENT --- A_O
AGENT --- B_O
```
The same agent process serves every group; each group chat gives it a different project context. The orchestrator role, the workers, the task queue, and all state are fully isolated per group.

---
## Session reuse
Worker sessions are expensive to start: each new spawn reads the full codebase (~50K tokens). DevClaw maintains **separate sessions per level per role** (session-per-level design). When a medior dev finishes task A and picks up task B on the same project, the plugin detects the existing session and sends the task directly.

The plugin handles session dispatch internally via the OpenClaw CLI. The orchestrator agent never calls `sessions_spawn` or `sessions_send`; it calls `work_start` and the plugin does the rest.
```mermaid
sequenceDiagram
participant O as Orchestrator
participant DC as DevClaw Plugin
participant IT as Issue Tracker
participant S as Worker Session
O->>DC: work_start({ issueId: 42, role: "dev" })
DC->>IT: Fetch issue, verify label
DC->>DC: Assign level (junior/medior/senior)
DC->>DC: Check existing session for assigned level
DC->>IT: Transition label (To Do → Doing)
DC->>S: Dispatch task via CLI (create or reuse session)
DC->>DC: Update projects.json, write audit log
DC-->>O: { success: true, announcement: "..." }
```
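
The "create or reuse session" step in the diagram can be sketched as a keyed lookup. Everything here is an assumption made for illustration: the `project:role:level` key format, the `SessionStore` shape, and the `spawn` callback stand in for whatever the plugin persists and for the real CLI dispatch.

```typescript
// Hypothetical store: maps "project:role:level" to a session id.
type SessionStore = { [key: string]: string };

interface AcquiredSession {
  sessionId: string;
  reused: boolean;
}

// Reuse the session for this project, role, and level if one exists;
// otherwise spawn a new one and remember it for the next task.
function acquireSession(
  store: SessionStore,
  project: string,
  role: string,
  level: string,
  spawn: () => string,
): AcquiredSession {
  const key = [project, role, level].join(":");
  if (Object.prototype.hasOwnProperty.call(store, key)) {
    return { sessionId: store[key], reused: true };
  }
  const sessionId = spawn(); // expensive: a fresh session reads the codebase
  store[key] = sessionId;
  return { sessionId, reused: false };
}
```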
---
## Tools
DevClaw registers **11 tools**, grouped by function:
### Worker lifecycle
| Tool | Description |
|---|---|
| [`work_start`](docs/TOOLS.md#work_start) | Pick up a task — handles level assignment, label transition, session dispatch, audit |
| [`work_finish`](docs/TOOLS.md#work_finish) | Complete a task — handles label transition, state update, auto-chaining, queue tick |
### Task management
| Tool | Description |
|---|---|
| [`task_create`](docs/TOOLS.md#task_create) | Create a new issue in the tracker |
| [`task_update`](docs/TOOLS.md#task_update) | Change an issue's state label manually |
| [`task_comment`](docs/TOOLS.md#task_comment) | Add a comment to an issue |
### Operations
| Tool | Description |
|---|---|
| [`status`](docs/TOOLS.md#status) | Queue counts + worker state dashboard |
| [`health`](docs/TOOLS.md#health) | Worker health checks + zombie detection |
| [`work_heartbeat`](docs/TOOLS.md#work_heartbeat) | Manual trigger for health + queue dispatch |
### Setup
| Tool | Description |
|---|---|
| [`project_register`](docs/TOOLS.md#project_register) | One-time project setup (labels, prompts, state) |
| [`setup`](docs/TOOLS.md#setup) | Agent + workspace initialization |
| [`onboard`](docs/TOOLS.md#onboard) | Conversational onboarding guide |

See the [Tools Reference](docs/TOOLS.md) for full parameters and usage.

---
## Documentation
| Document | Description |
|---|---|
| [Architecture](docs/ARCHITECTURE.md) | System design, session-per-level model, data flow, component interactions |
| [Tools Reference](docs/TOOLS.md) | Complete reference for all 11 tools with parameters and examples |
| [Configuration](docs/CONFIGURATION.md) | Full config reference — `openclaw.json`, `projects.json`, heartbeat, notifications |
| [Onboarding Guide](docs/ONBOARDING.md) | Step-by-step setup: install, configure, register projects, test the pipeline |
| [QA Workflow](docs/QA_WORKFLOW.md) | QA process: review documentation, comment templates, enforcement |
| [Context Awareness](docs/CONTEXT-AWARENESS.md) | How DevClaw adapts behavior based on interaction context |
| [Testing Guide](docs/TESTING.md) | Automated test suite: scenarios, fixtures, CI/CD integration |
| [Management Theory](docs/MANAGEMENT.md) | The delegation theory behind DevClaw's design |
| [Roadmap](docs/ROADMAP.md) | Planned features: configurable roles, channel-agnostic groups, Jira |
---
## License
MIT