refactor: rename QA role to Tester and update related documentation

- Updated role references from "QA" to "Tester" in workflow and code comments.
- Revised documentation to reflect the new role structure, including role instructions and completion rules.
- Enhanced the testing guide with clearer instructions and examples for unit and E2E tests.
- Improved tools reference to align with the new role definitions and completion rules.
- Adjusted the roadmap to highlight recent changes in role configuration and workflow state machine.
@@ -1,53 +1,77 @@
# DevClaw — Roadmap

## Configurable Roles
## Recently Completed

Currently DevClaw has two hardcoded roles: **DEV** and **QA**. Each project gets one worker slot per role. The pipeline is fixed: DEV writes code, QA reviews it.
### Dynamic Roles and Role Registry

This works for the common case but breaks down when you want:
- A **design** role that creates mockups before DEV starts
- A **devops** role that handles deployment after QA passes
- A **PM** role that triages and prioritizes the backlog
- Multiple DEV workers in parallel (e.g. frontend + backend)
- A project with no QA step at all
Roles are no longer hardcoded. The `ROLE_REGISTRY` in `lib/roles/registry.ts` defines three built-in roles — **developer**, **tester**, **architect** — each with configurable levels, models, emoji, and completion results. Adding a new role means adding one entry to the registry; everything else (workers, sessions, labels, prompts) derives from it.
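
To make the "one entry per role" idea concrete, here is a minimal sketch of what a registry entry and its derived lookups might look like. The `RoleConfig` shape, the model ids, the emoji, and the tester and developer result names other than `review` are assumptions for illustration; the actual contents of `lib/roles/registry.ts` may differ.

```typescript
// Hypothetical entry shape; the real ROLE_REGISTRY in lib/roles/registry.ts may differ.
interface RoleConfig {
  levels: readonly string[];
  defaultLevel: string;
  models: Record<string, string>; // level -> model id (placeholder ids below)
  emoji: string;
  completionResults: readonly string[];
}

const ROLE_REGISTRY_SKETCH: Record<string, RoleConfig> = {
  developer: {
    levels: ["junior", "medior", "senior"],
    defaultLevel: "medior",
    models: { junior: "model-small", medior: "model-medium", senior: "model-large" },
    emoji: "🛠️",
    completionResults: ["done", "review", "blocked"], // "done"/"blocked" are assumed
  },
  tester: {
    levels: ["junior", "medior", "senior"],
    defaultLevel: "medior",
    models: { junior: "model-small", medior: "model-medium", senior: "model-large" },
    emoji: "🔍",
    completionResults: ["pass", "fail"], // assumed result names
  },
  architect: {
    levels: ["junior", "senior"],
    defaultLevel: "junior",
    models: { junior: "model-medium", senior: "model-large" },
    emoji: "📐",
    completionResults: ["done", "blocked"],
  },
};

// Derived lookups: callers ask the registry instead of hardcoding role names.
function isValidLevel(role: string, level: string): boolean {
  const config: RoleConfig | undefined = ROLE_REGISTRY_SKETCH[role];
  return config?.levels.includes(level) ?? false;
}

function isValidResult(role: string, result: string): boolean {
  const config: RoleConfig | undefined = ROLE_REGISTRY_SKETCH[role];
  return config?.completionResults.includes(result) ?? false;
}
```

Under this sketch, adding a role is a single new key in the object; the validation helpers pick it up without further changes.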

### Planned: role configuration per project
All roles use a unified junior/medior/senior level scheme (architect uses junior/senior). Per-role model overrides live in `workflow.yaml`.

Roles become a configurable list instead of a hardcoded pair. Each role defines:
- **Name** — e.g. `design`, `dev`, `qa`, `devops`
- **Levels** — which developer levels can be assigned (e.g. design only needs `medior`)
- **Pipeline position** — where it sits in the task lifecycle
- **Worker count** — how many concurrent workers (default: 1)
### Workflow State Machine

```json
{
  "roles": {
    "dev": { "levels": ["junior", "medior", "senior"], "workers": 1 },
    "qa": { "levels": ["reviewer", "tester"], "workers": 1 },
    "devops": { "levels": ["medior", "senior"], "workers": 1 }
  },
  "pipeline": ["dev", "qa", "devops"]
}
```

The issue lifecycle is now a configurable state machine defined in `workflow.yaml`. The default workflow defines 11 states:

```
Planning → To Do → Doing → To Test → Testing → Done
↘ In Review → (PR merged) → To Test
↘ To Improve → Doing
↘ Refining → (human decision)
To Design → Designing → Planning
```

The pipeline definition replaces the hardcoded `Doing → To Test → Testing → Done` flow. Labels and transitions are generated from the pipeline config. The scheduler follows the pipeline order when filling free slots.
States have types (`queue`, `active`, `hold`, `review`, `terminal`), transitions with actions (`gitPull`, `detectPr`, `closeIssue`, `reopenIssue`), and review checks (`prMerged`, `prApproved`).
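
The state types and actions named above can be pictured as a small typed table. This is only a sketch: the `StateConfig` and `Transition` shapes, the event names (`PICKUP`, `PASS`, `FAIL`), and the pairing of actions with particular transitions are assumptions, not the schema DevClaw actually uses in `workflow.yaml`.

```typescript
type StateType = "queue" | "active" | "hold" | "review" | "terminal";
type Action = "gitPull" | "detectPr" | "closeIssue" | "reopenIssue";

// Hypothetical shapes for illustration only.
interface Transition {
  target: string;
  actions?: Action[];
}

interface StateConfig {
  type: StateType;
  role?: string;
  on?: Record<string, Transition>; // event name -> transition
}

// A fragment of a workflow, not the full 11-state default.
const WORKFLOW_SKETCH: Record<string, StateConfig> = {
  "To Test": { type: "queue", role: "tester", on: { PICKUP: { target: "Testing" } } },
  Testing: {
    type: "active",
    role: "tester",
    on: {
      PASS: { target: "Done", actions: ["closeIssue"] },
      FAIL: { target: "To Improve", actions: ["reopenIssue"] },
    },
  },
  "To Improve": { type: "queue", role: "developer" },
  Done: { type: "terminal" },
};

// Look up the transition for an event and report where the issue goes next.
function applyEvent(
  workflow: Record<string, StateConfig>,
  current: string,
  event: string,
): { next: string; actions: Action[] } {
  const transition = workflow[current]?.on?.[event];
  if (!transition) {
    throw new Error(`No transition for "${event}" from "${current}"`);
  }
  return { next: transition.target, actions: transition.actions ?? [] };
}
```

Keeping transitions as data means a project can reroute the lifecycle by editing configuration rather than code, which is the point of moving away from the fixed flow.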

### Open questions
### Three-Layer Configuration

- How do custom labels map? Generate from role names, or let users define?
- Should roles have their own instruction files (`projects/roles/<project>/<role>.md`) — yes, this already works
- How to handle parallel roles (e.g. frontend + backend DEV in parallel before QA)?
Config resolution follows three layers, each partially overriding the one below:

1. **Built-in defaults** — `ROLE_REGISTRY` + `DEFAULT_WORKFLOW`
2. **Workspace** — `<workspace>/devclaw/workflow.yaml`
3. **Project** — `<workspace>/devclaw/projects/<project>/workflow.yaml`

Validated at load time with Zod schemas (`lib/config/schema.ts`). Integrity checks verify transition targets exist, queue states have roles, and terminal states have no outgoing transitions.
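
The layering and the three integrity checks can be sketched without any library. The code below is a dependency-free illustration, not the Zod-based implementation in `lib/config/schema.ts`; `mergeLayers`, `checkIntegrity`, and the `StateDef` shape are hypothetical names.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Later layers win; nested objects merge key by key, everything else is replaced.
function mergeLayers(...layers: Plain[]): Plain {
  const result: Plain = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      const existing = result[key];
      result[key] =
        isPlainObject(existing) && isPlainObject(value)
          ? mergeLayers(existing, value)
          : value;
    }
  }
  return result;
}

// Hypothetical state shape, just enough for the checks below.
interface StateDef {
  type: string;
  role?: string;
  on?: Record<string, { target: string }>;
}

// Returns a list of problems; an empty list means the workflow is consistent.
function checkIntegrity(states: Record<string, StateDef>): string[] {
  const errors: string[] = [];
  for (const [name, state] of Object.entries(states)) {
    const transitions = Object.entries(state.on ?? {});
    for (const [event, transition] of transitions) {
      if (!(transition.target in states)) {
        errors.push(`${name}: event ${event} targets unknown state ${transition.target}`);
      }
    }
    if (state.type === "queue" && !state.role) {
      errors.push(`${name}: queue state has no role`);
    }
    if (state.type === "terminal" && transitions.length > 0) {
      errors.push(`${name}: terminal state has outgoing transitions`);
    }
  }
  return errors;
}
```

A call such as `mergeLayers(builtInDefaults, workspaceConfig, projectConfig)` would then give the project layer the last word while leaving untouched keys at their defaults.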

### Provider Resilience

All issue tracker calls (GitHub via `gh`, GitLab via `glab`) are wrapped with cockatiel retry (3 attempts, exponential backoff) and circuit breaker (opens after 5 consecutive failures, half-opens after 30s). See `lib/providers/resilience.ts`.
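
To show how those numbers interact, here is a hand-rolled sketch of the two policies. It deliberately does not use cockatiel, so it says nothing about that library's API or about the code in `lib/providers/resilience.ts`; the class and function names are hypothetical, the base delay is an assumed value, and "3 attempts" is read as one initial call plus two retries.

```typescript
// Waits between attempts: with 3 attempts there are 2 waits, doubling each time.
function backoffDelays(attempts: number, baseMs: number): number[] {
  return Array.from({ length: Math.max(attempts - 1, 0) }, (_, i) => baseMs * 2 ** i);
}

type BreakerState = "closed" | "open" | "halfOpen";

// Opens after `threshold` consecutive failures, allows a probe after `halfOpenAfterMs`.
class ConsecutiveBreakerSketch {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly halfOpenAfterMs: number;
  private readonly now: () => number;

  constructor(threshold = 5, halfOpenAfterMs = 30_000, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.halfOpenAfterMs = halfOpenAfterMs;
    this.now = now;
  }

  state(): BreakerState {
    if (this.openedAt === null) return "closed";
    return this.now() - this.openedAt >= this.halfOpenAfterMs ? "halfOpen" : "open";
  }

  // A success resets the failure streak and closes the circuit.
  onSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failure extends the streak; at the threshold the circuit (re)opens.
  onFailure(): void {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.openedAt = this.now();
    }
  }
}
```

A caller would skip the provider call while `state()` is `"open"`, let one call through when it is `"halfOpen"`, and report the outcome with `onSuccess()` or `onFailure()`.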

### Bootstrap Hook for Role Instructions

Worker sessions receive role-specific instructions via the `agent:bootstrap` hook at session startup, not appended to the task message. The hook reads from `devclaw/projects/<project>/prompts/<role>.md`, falling back to `devclaw/prompts/<role>.md`. Supports source tracking with `loadRoleInstructions(dir, { withSource: true })`.
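
The project-then-workspace fallback can be sketched as a pure path lookup. This is not the real `loadRoleInstructions`; `resolveRolePrompt`, its parameters, and the `source` values are hypothetical, and the file-existence check is injected so the example runs without touching the filesystem.

```typescript
interface ResolvedPrompt {
  path: string;
  source: "project" | "workspace";
}

// Prefer the project-specific prompt; fall back to the workspace-wide one.
function resolveRolePrompt(
  workspaceDir: string,
  project: string,
  role: string,
  exists: (path: string) => boolean,
): ResolvedPrompt | null {
  const projectPath = `${workspaceDir}/devclaw/projects/${project}/prompts/${role}.md`;
  const workspacePath = `${workspaceDir}/devclaw/prompts/${role}.md`;

  if (exists(projectPath)) return { path: projectPath, source: "project" };
  if (exists(workspacePath)) return { path: workspacePath, source: "workspace" };
  return null; // no instructions found for this role
}
```

Returning the source alongside the path mirrors the idea of source tracking: a caller can tell whether a project override or the workspace default was used.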

### In Review State and PR Polling

DEVELOPER can submit a PR for human review (`result: "review"`), which transitions the issue to `In Review`. The heartbeat's review pass polls PR status via `getPrStatus()` on the provider. When the PR is merged, the issue auto-transitions to `To Test` for TESTER pickup.

### Architect Role

The architect role enables design investigations. `design_task` creates a `To Design` issue and dispatches an architect worker. The architect completes with `done` (→ Planning) or `blocked` (→ Refining).

### Workspace Layout Migration

Data directory moved from `<workspace>/projects/` to `<workspace>/devclaw/`. Automatic migration on first load — see `lib/setup/migrate-layout.ts`.

### E2E Test Infrastructure

Purpose-built test harness (`lib/testing/`) with:
- `TestProvider` — in-memory `IssueProvider` with call tracking
- `createTestHarness()` — scaffolds temp workspace, mock `runCommand`, test provider
- `simulateBootstrap()` — tests the full bootstrap hook chain without a live gateway
- `CommandInterceptor` — captures and filters CLI calls

---

## Channel-agnostic Groups
## Planned

### Channel-agnostic Groups

Currently DevClaw maps projects to **Telegram group IDs**. The `projectGroupId` is a Telegram-specific negative number. This means:
- WhatsApp groups can't be used as project channels (partially supported now via `channel` field)
- Discord, Slack, or other channels are excluded
- The naming (`groupId`, `groupName`) is Telegram-specific

### Planned: abstract channel binding
**Planned: abstract channel binding**

Replace Telegram-specific group IDs with a generic channel identifier that works across any OpenClaw channel.

@@ -57,14 +81,12 @@ Replace Telegram-specific group IDs with a generic channel identifier that works
    "whatsapp:120363140032870788@g.us": {
      "name": "my-project",
      "channel": "whatsapp",
      "peer": "120363140032870788@g.us",
      ...
      "peer": "120363140032870788@g.us"
    },
    "telegram:-1234567890": {
      "name": "other-project",
      "channel": "telegram",
      "peer": "-1234567890",
      ...
      "peer": "-1234567890"
    }
  }
}
@@ -79,7 +101,7 @@ Key changes:
This enables any OpenClaw channel (Telegram, WhatsApp, Discord, Slack, etc.) to host a project.
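
The `channel:peer` keys shown in the JSON example suggest a simple composite identifier. The helpers below are a sketch under that assumption; `ChannelBinding`, `toProjectKey`, and `parseProjectKey` are hypothetical names, not part of DevClaw.

```typescript
interface ChannelBinding {
  channel: string; // e.g. "telegram", "whatsapp"
  peer: string;    // channel-specific group or chat identifier
}

// Build the composite key used to index projects.
function toProjectKey(binding: ChannelBinding): string {
  return `${binding.channel}:${binding.peer}`;
}

// Split on the first colon only, so peers may contain any other characters.
function parseProjectKey(key: string): ChannelBinding {
  const separator = key.indexOf(":");
  if (separator <= 0 || separator === key.length - 1) {
    throw new Error(`Invalid project key: ${key}`);
  }
  return { channel: key.slice(0, separator), peer: key.slice(separator + 1) };
}
```

Splitting on the first colon keeps Telegram's negative ids and WhatsApp's `@g.us` suffixes intact, since neither is interpreted by the parser.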

### Open questions
#### Open questions

- Should one project be bindable to multiple channels? (e.g. Telegram for devs, WhatsApp for stakeholder updates)
- How does the orchestrator agent handle cross-channel context?
@@ -89,8 +111,9 @@ This enables any OpenClaw channel (Telegram, WhatsApp, Discord, Slack, etc.) to
## Other Ideas

- **Jira provider** — `IssueProvider` interface already abstracts GitHub/GitLab; Jira is the obvious next addition
- **Deployment integration** — `work_finish` QA pass could trigger a deploy step via webhook or CLI
- **Deployment integration** — `work_finish` TESTER pass could trigger a deploy step via webhook or CLI
- **Cost tracking** — log token usage per task/level, surface in `status`
- **Priority scoring** — automatic priority assignment based on labels, age, and dependencies
- **Session archival** — auto-archive idle sessions after configurable timeout (currently indefinite)
- **Progressive delegation** — track QA pass rates per level and auto-promote (see [Management Theory](MANAGEMENT.md))
- **Progressive delegation** — track TESTER pass rates per level and auto-promote (see [Management Theory](MANAGEMENT.md))
- **Custom workflow actions** — user-defined actions in `workflow.yaml` (e.g. deploy scripts, notifications)