feat: refactor model selection to use developer tiers

- Replaced raw model aliases with developer tiers (junior, medior, senior, qa) in dispatch and model selection logic.
- Updated `dispatchTask` to resolve models based on tiers and plugin configuration.
- Modified `selectModel` to return tier names instead of model aliases based on task description.
- Implemented migration logic for transitioning from old model aliases to new tier names in worker state.
- Added setup logic for agent creation and model configuration in `setup.ts`.
- Created shared templates for workspace files and instructions for DEV/QA workers.
- Enhanced project registration to scaffold role files based on developer tiers.
- Updated task management tools to reflect changes in model selection and tier assignment.
- Introduced a new `devclaw_setup` tool for agent-driven setup and configuration.
- Updated plugin configuration schema to support model mapping per developer tier.
This commit is contained in:
Lauren ten Hoor
2026-02-09 13:41:22 +08:00
parent 8a79755e4c
commit aa8e8dbd1b
16 changed files with 1162 additions and 257 deletions

114
README.md
View File

@@ -14,9 +14,22 @@ DevClaw fills that gap with guardrails. It gives the orchestrator atomic tools t
## The idea
One orchestrator agent manages all your projects. It reads task backlogs, creates issues, decides priorities, and delegates work. For each task, DevClaw creates (or reuses) a **DEV** worker session to write code or a **QA** worker session to review it. Every Telegram group is a separate project — the orchestrator keeps them completely isolated while managing them all from a single process.
One orchestrator agent manages all your projects. It reads task backlogs, creates issues, decides priorities, and delegates work. For each task, DevClaw assigns a developer from your **team** — a junior, medior, or senior dev writes the code, then a QA engineer reviews it. Every Telegram group is a separate project — the orchestrator keeps them completely isolated while managing them all from a single process.
DevClaw gives the orchestrator six tools that replace hundreds of lines of manual orchestration logic. Instead of following a 10-step checklist per task (fetch issue, check labels, pick model, check for existing session, transition label, dispatch task, update state, log audit event...), it calls `task_pickup` and the plugin handles everything atomically — including session dispatch. Workers call `task_complete` themselves for atomic state updates, and can file follow-up issues via `task_create`.
DevClaw gives the orchestrator seven tools that replace hundreds of lines of manual orchestration logic. Instead of following a 10-step checklist per task (fetch issue, check labels, pick model, check for existing session, transition label, dispatch task, update state, log audit event...), it calls `task_pickup` and the plugin handles everything atomically — including session dispatch. Workers call `task_complete` themselves for atomic state updates, and can file follow-up issues via `task_create`.
## Developer tiers
DevClaw uses a developer seniority model. Each tier maps to a configurable LLM model:
| Tier | Role | Default model | Assigns to |
|------|------|---------------|------------|
| **junior** | Junior developer | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, simple changes |
| **medior** | Mid-level developer | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
| **senior** | Senior developer | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
| **qa** | QA engineer | `anthropic/claude-sonnet-4-5` | Code review, test validation |
Configure which model each tier uses during setup or in `openclaw.json` plugin config.
## How it works
@@ -93,15 +106,15 @@ Workers (DEV/QA sub-agent sessions) call `task_complete` directly when they fini
### Auto-chaining
When a project has `autoChain: true`, `task_complete` automatically dispatches the next step:
- **DEV "done"** → QA is dispatched immediately (default model: grok)
- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV model)
- **DEV "done"** → QA is dispatched immediately (using the qa tier)
- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV tier)
- **QA "pass" / "refine"** → no chaining (pipeline done or needs human input)
When `autoChain` is false, `task_complete` returns a `nextAction` hint for the orchestrator to act on.
## Session reuse
Worker sessions are expensive to start — each new spawn requires the session to read the full codebase (~50K tokens). DevClaw maintains **separate sessions per model per role** (session-per-model design). When a DEV finishes task A and picks up task B on the same project with the same model, the plugin detects the existing session and sends the task directly — no new session needed.
Worker sessions are expensive to start — each new spawn requires the session to read the full codebase (~50K tokens). DevClaw maintains **separate sessions per tier per role** (session-per-tier design). When a medior dev finishes task A and picks up task B on the same project, the plugin detects the existing session and sends the task directly — no new session needed.
The plugin handles session dispatch internally via OpenClaw CLI. The orchestrator agent never calls `sessions_spawn` or `sessions_send` — it just calls `task_pickup` and the plugin does the rest.
@@ -114,26 +127,26 @@ sequenceDiagram
O->>DC: task_pickup({ issueId: 42, role: "dev" })
DC->>GL: Fetch issue, verify label
DC->>DC: Select model (haiku/sonnet/opus)
DC->>DC: Check existing session for selected model
DC->>DC: Assign tier (junior/medior/senior)
DC->>DC: Check existing session for assigned tier
DC->>GL: Transition label (To Do → Doing)
DC->>S: Dispatch task via CLI (create or reuse session)
DC->>DC: Update projects.json, write audit log
DC-->>O: { success: true, announcement: "🔧 DEV (sonnet) picking up #42" }
DC-->>O: { success: true, announcement: "🔧 DEV (medior) picking up #42" }
```
## Model selection
## Developer assignment
The orchestrator LLM analyzes each issue's title, description, and labels to choose the appropriate model tier, then passes it to `task_pickup` via the `model` parameter. This gives the LLM full context for the decision — it can weigh factors like codebase familiarity, task dependencies, and recent failure history that keyword matching would miss.
The orchestrator LLM evaluates each issue's title, description, and labels to assign the appropriate developer tier, then passes it to `task_pickup` via the `model` parameter. This gives the LLM full context for the decision — it can weigh factors like codebase familiarity, task dependencies, and recent failure history that keyword matching would miss.
The keyword heuristic in `model-selector.ts` serves as a **fallback only**, used when the orchestrator omits the `model` parameter.
| Complexity | Model | When |
|------------|-------|------|
| Simple | Haiku | Typos, CSS, renames, copy changes |
| Standard | Sonnet | Features, bug fixes, multi-file changes |
| Complex | Opus | Architecture, migrations, security, system-wide refactoring |
| QA | Grok | All QA tasks (code review, test validation) |
| Tier | Role | When |
|------|------|------|
| junior | Junior developer | Typos, CSS, renames, copy changes |
| medior | Mid-level developer | Features, bug fixes, multi-file changes |
| senior | Senior developer | Architecture, migrations, security, system-wide refactoring |
| qa | QA engineer | All QA tasks (code review, test validation) |
## State management
@@ -151,19 +164,19 @@ All project state lives in a single `memory/projects.json` file in the orchestra
"dev": {
"active": false,
"issueId": null,
"model": "haiku",
"model": "medior",
"sessions": {
"haiku": "agent:orchestrator:subagent:a9e4d078-...",
"sonnet": "agent:orchestrator:subagent:b3f5c912-...",
"opus": null
"junior": "agent:orchestrator:subagent:a9e4d078-...",
"medior": "agent:orchestrator:subagent:b3f5c912-...",
"senior": null
}
},
"qa": {
"active": false,
"issueId": null,
"model": "grok",
"model": "qa",
"sessions": {
"grok": "agent:orchestrator:subagent:18707821-..."
"qa": "agent:orchestrator:subagent:18707821-..."
}
}
}
@@ -172,7 +185,7 @@ All project state lives in a single `memory/projects.json` file in the orchestra
```
Key design decisions:
- **Session-per-model** — each model gets its own worker session, accumulating context independently. Model selection maps directly to a session key.
- **Session-per-tier** — each tier gets its own worker session, accumulating context independently. Tier selection maps directly to a session key.
- **Sessions preserved on completion** — when a worker completes a task, `sessions` map is **preserved** (only `active` and `issueId` are cleared). This enables session reuse on the next pickup.
- **Plugin-controlled dispatch** — the plugin creates and dispatches to sessions via OpenClaw CLI (`sessions.patch` + `openclaw agent`). The orchestrator agent never calls `sessions_spawn` or `sessions_send`.
- **Sessions persist indefinitely** — no auto-cleanup. `session_health` handles manual cleanup when needed.
@@ -181,27 +194,35 @@ All writes go through atomic temp-file-then-rename to prevent corruption.
## Tools
### `task_pickup`
### `devclaw_setup`
Pick up a task from the GitLab queue for a DEV or QA worker.
Set up DevClaw in an agent's workspace. Creates AGENTS.md, HEARTBEAT.md, role templates, and configures models. Can optionally create a new agent.
**Parameters:**
- `issueId` (number, required) — GitLab issue ID
- `newAgentName` (string, optional) — Create a new agent with this name
- `models` (object, optional) — Model overrides per tier: `{ junior, medior, senior, qa }`
### `task_pickup`
Pick up a task from the issue queue for a DEV or QA worker.
**Parameters:**
- `issueId` (number, required) — Issue ID
- `role` ("dev" | "qa", required) — Worker role
- `projectGroupId` (string, required) — Telegram group ID
- `model` (string, optional) — Model alias to use (e.g. haiku, sonnet, opus, grok). The orchestrator should analyze the issue complexity and choose. Falls back to keyword heuristic if omitted.
- `model` (string, optional) — Developer tier (junior, medior, senior, qa). The orchestrator should evaluate the task complexity and choose. Falls back to keyword heuristic if omitted.
**What it does atomically:**
1. Resolves project from `projects.json`
2. Validates no active worker for this role
3. Fetches issue from issue tracker, verifies correct label state
4. Selects model (LLM-chosen via `model` param, keyword heuristic fallback)
4. Assigns tier (LLM-chosen via `model` param, keyword heuristic fallback)
5. Loads role instructions from `roles/<project>/<role>.md` (fallback: `roles/default/<role>.md`)
6. Looks up existing session for selected model (session-per-model)
6. Looks up existing session for assigned tier (session-per-tier)
7. Transitions label (e.g. `To Do``Doing`)
8. Creates session via Gateway RPC if new (`sessions.patch`)
9. Dispatches task to worker session via CLI (`openclaw agent`) with role instructions appended
10. Updates `projects.json` state (active, issueId, model, session key)
10. Updates `projects.json` state (active, issueId, tier, session key)
11. Writes audit log entry
12. Returns announcement text for the orchestrator to post
@@ -216,9 +237,9 @@ Complete a task with one of four results. Called by workers (DEV/QA sub-agent se
- `summary` (string, optional) — For the Telegram announcement
**Results:**
- **DEV "done"** — Pulls latest code, moves label `Doing``To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA (grok).
- **DEV "done"** — Pulls latest code, moves label `Doing``To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA.
- **QA "pass"** — Moves label `Testing``Done`, closes issue, deactivates worker
- **QA "fail"** — Moves label `Testing``To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous model).
- **QA "fail"** — Moves label `Testing``To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous DEV tier).
- **QA "refine"** — Moves label `Testing``Refining`, awaits human decision
### `task_create`
@@ -284,24 +305,29 @@ Register a new project with DevClaw. Creates all required issue tracker labels (
Every tool call automatically appends an NDJSON entry to `memory/audit.log`. No manual logging required from the orchestrator agent.
```jsonl
{"ts":"2026-02-08T10:30:00Z","event":"task_pickup","project":"my-webapp","issue":42,"role":"dev","model":"sonnet","sessionAction":"send"}
{"ts":"2026-02-08T10:30:01Z","event":"model_selection","issue":42,"role":"dev","selected":"sonnet","reason":"Standard dev task"}
{"ts":"2026-02-08T10:30:00Z","event":"task_pickup","project":"my-webapp","issue":42,"role":"dev","tier":"medior","sessionAction":"send"}
{"ts":"2026-02-08T10:30:01Z","event":"model_selection","issue":42,"role":"dev","tier":"medior","reason":"Standard dev task"}
{"ts":"2026-02-08T10:45:00Z","event":"task_complete","project":"my-webapp","issue":42,"role":"dev","result":"done"}
```
## Installation
## Quick start
```bash
# Local (place in extensions directory — auto-discovered)
# 1. Install the plugin
cp -r devclaw ~/.openclaw/extensions/
# From npm (future)
openclaw plugins install @openclaw/devclaw
# 2. Run setup (interactive — creates agent, configures models, writes workspace files)
openclaw devclaw setup
# 3. Add bot to Telegram group, then register a project
# (via the agent in Telegram)
```
See the [Onboarding Guide](docs/ONBOARDING.md) for detailed instructions.
## Configuration
Optional config in `openclaw.json`:
Model tier configuration in `openclaw.json`:
```json
{
@@ -309,7 +335,12 @@ Optional config in `openclaw.json`:
"entries": {
"devclaw": {
"config": {
"glabPath": "/usr/local/bin/glab"
"models": {
"junior": "anthropic/claude-haiku-4-5",
"medior": "anthropic/claude-sonnet-4-5",
"senior": "anthropic/claude-opus-4-5",
"qa": "anthropic/claude-sonnet-4-5"
}
}
}
}
@@ -325,7 +356,7 @@ Restrict tools to your orchestrator agent only:
"list": [{
"id": "my-orchestrator",
"tools": {
"allow": ["task_pickup", "task_complete", "task_create", "queue_status", "session_health", "project_register"]
"allow": ["devclaw_setup", "task_pickup", "task_complete", "task_create", "queue_status", "session_health", "project_register"]
}
}]
}
@@ -359,7 +390,6 @@ workspace/
- [OpenClaw](https://openclaw.ai)
- Node.js >= 20
- [`glab`](https://gitlab.com/gitlab-org/cli) CLI installed and authenticated (GitLab provider), or [`gh`](https://cli.github.com) CLI (GitHub provider)
- A `memory/projects.json` in the orchestrator agent's workspace
## License