feat: refactor model selection to use developer tiers

- Replaced raw model aliases with developer tiers (junior, medior, senior, qa) in dispatch and model selection logic.
- Updated `dispatchTask` to resolve models based on tiers and plugin configuration.
- Modified `selectModel` to return tier names instead of model aliases based on task description.
- Implemented migration logic for transitioning from old model aliases to new tier names in worker state.
- Added setup logic for agent creation and model configuration in `setup.ts`.
- Created shared templates for workspace files and instructions for DEV/QA workers.
- Enhanced project registration to scaffold role files based on developer tiers.
- Updated task management tools to reflect changes in model selection and tier assignment.
- Introduced a new `devclaw_setup` tool for agent-driven setup and configuration.
- Updated plugin configuration schema to support model mapping per developer tier.
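
The migration from old model aliases to tier names mentioned above might look roughly like the sketch below. The commit's actual migration code is not shown in this diff, so every identifier here (`LEGACY_ALIAS_TO_TIER`, `toTier`, `migrateWorkerState`) is hypothetical, and the alias-to-tier mapping is an assumption inferred from the old and new README tables.

```typescript
type Tier = "junior" | "medior" | "senior" | "qa";

// Assumed mapping, inferred from the README tables (Haiku/Sonnet/Opus/Grok rows).
const LEGACY_ALIAS_TO_TIER = new Map<string, Tier>([
  ["haiku", "junior"],
  ["sonnet", "medior"],
  ["opus", "senior"],
  ["grok", "qa"],
]);

// Shape follows the worker entries shown in the projects.json example.
interface WorkerState {
  active: boolean;
  issueId: number | null;
  model: string | null;
  sessions: Record<string, string | null>;
}

// Translate a legacy alias to a tier name; anything else passes through unchanged,
// which keeps the migration idempotent.
function toTier(name: string): string {
  return LEGACY_ALIAS_TO_TIER.get(name) ?? name;
}

function migrateWorkerState(worker: WorkerState): WorkerState {
  const sessions: Record<string, string | null> = {};
  for (const [key, sessionKey] of Object.entries(worker.sessions)) {
    const tier = toTier(key);
    // If both an old alias and its tier are present, keep the first non-null session key.
    sessions[tier] = sessions[tier] ?? sessionKey;
  }
  return {
    ...worker,
    model: worker.model === null ? null : toTier(worker.model),
    sessions,
  };
}
```

Running the function twice yields the same result, so it would be safe to apply on every state load rather than once.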
2026-02-09 13:41:22 +08:00
parent 8a79755e4c
commit aa8e8dbd1b
16 changed files with 1162 additions and 257 deletions
--- a/README.md
+++ b/README.md
@@ -14,9 +14,22 @@ DevClaw fills that gap with guardrails. It gives the orchestrator atomic tools t

 ## The idea

-One orchestrator agent manages all your projects. It reads task backlogs, creates issues, decides priorities, and delegates work. For each task, DevClaw creates (or reuses) a **DEV** worker session to write code or a **QA** worker session to review it. Every Telegram group is a separate project — the orchestrator keeps them completely isolated while managing them all from a single process.
+One orchestrator agent manages all your projects. It reads task backlogs, creates issues, decides priorities, and delegates work. For each task, DevClaw assigns a developer from your **team** — a junior, medior, or senior dev writes the code, then a QA engineer reviews it. Every Telegram group is a separate project — the orchestrator keeps them completely isolated while managing them all from a single process.

-DevClaw gives the orchestrator six tools that replace hundreds of lines of manual orchestration logic. Instead of following a 10-step checklist per task (fetch issue, check labels, pick model, check for existing session, transition label, dispatch task, update state, log audit event...), it calls `task_pickup` and the plugin handles everything atomically — including session dispatch. Workers call `task_complete` themselves for atomic state updates, and can file follow-up issues via `task_create`.
+DevClaw gives the orchestrator seven tools that replace hundreds of lines of manual orchestration logic. Instead of following a 10-step checklist per task (fetch issue, check labels, pick model, check for existing session, transition label, dispatch task, update state, log audit event...), it calls `task_pickup` and the plugin handles everything atomically — including session dispatch. Workers call `task_complete` themselves for atomic state updates, and can file follow-up issues via `task_create`.
+
+## Developer tiers
+
+DevClaw uses a developer seniority model. Each tier maps to a configurable LLM:
+
+| Tier | Role | Default model | Assigns to |
+|------|------|---------------|------------|
+| **junior** | Junior developer | `anthropic/claude-haiku-4-5` | Typos, single-file fixes, simple changes |
+| **medior** | Mid-level developer | `anthropic/claude-sonnet-4-5` | Features, bug fixes, multi-file changes |
+| **senior** | Senior developer | `anthropic/claude-opus-4-5` | Architecture, migrations, system-wide refactoring |
+| **qa** | QA engineer | `anthropic/claude-sonnet-4-5` | Code review, test validation |
+
+Configure which model each tier uses during setup or in `openclaw.json` plugin config.
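
As a minimal sketch of how a tier could be resolved to a model, assuming per-tier overrides are merged over the defaults from the table above: `resolveModel` and `ModelOverrides` are hypothetical names, not the plugin's actual API.

```typescript
type Tier = "junior" | "medior" | "senior" | "qa";

// Defaults copied from the tier table above.
const DEFAULT_MODELS: Record<Tier, string> = {
  junior: "anthropic/claude-haiku-4-5",
  medior: "anthropic/claude-sonnet-4-5",
  senior: "anthropic/claude-opus-4-5",
  qa: "anthropic/claude-sonnet-4-5",
};

// Hypothetical shape of the plugin config's `models` object: any tier may be omitted.
type ModelOverrides = Partial<Record<Tier, string>>;

// A configured override wins; otherwise fall back to the default for that tier.
function resolveModel(tier: Tier, overrides: ModelOverrides = {}): string {
  return overrides[tier] ?? DEFAULT_MODELS[tier];
}
```

For example, `resolveModel("qa", { qa: "some-provider/some-model" })` would return the override, while `resolveModel("junior")` would return the Haiku default.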

 ## How it works

@@ -93,15 +106,15 @@ Workers (DEV/QA sub-agent sessions) call `task_complete` directly when they fini
 ### Auto-chaining

 When a project has `autoChain: true`, `task_complete` automatically dispatches the next step:
-- **DEV "done"** → QA is dispatched immediately (default model: grok)
-- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV model)
+- **DEV "done"** → QA is dispatched immediately (using the qa tier)
+- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV tier)
 - **QA "pass" / "refine"** → no chaining (pipeline done or needs human input)

 When `autoChain` is false, `task_complete` returns a `nextAction` hint for the orchestrator to act on.
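
The chaining rules above can be sketched as a small pure function. This is an illustration under assumptions, not the plugin's code: `chainedStep`, `onTaskComplete`, and the `dispatched` field are hypothetical, and the shape of the `nextAction` hint is guessed.

```typescript
type Role = "dev" | "qa";
type Result = "done" | "pass" | "fail" | "refine";
type DevTier = "junior" | "medior" | "senior";
type Tier = DevTier | "qa";

interface NextStep {
  role: Role;
  tier: Tier;
}

// Follow-up implied by a completed task, or null when the pipeline stops.
function chainedStep(role: Role, result: Result, previousDevTier: DevTier): NextStep | null {
  if (role === "dev" && result === "done") return { role: "qa", tier: "qa" };
  if (role === "qa" && result === "fail") return { role: "dev", tier: previousDevTier };
  return null; // QA "pass" / "refine": pipeline done or waiting on a human
}

interface CompletionOutcome {
  dispatched: NextStep | null; // step the plugin dispatched itself (autoChain on)
  nextAction: NextStep | null; // hint left for the orchestrator (autoChain off)
}

function onTaskComplete(
  autoChain: boolean,
  role: Role,
  result: Result,
  previousDevTier: DevTier,
): CompletionOutcome {
  const step = chainedStep(role, result, previousDevTier);
  return autoChain
    ? { dispatched: step, nextAction: null }
    : { dispatched: null, nextAction: step };
}
```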

 ## Session reuse

-Worker sessions are expensive to start — each new spawn requires the session to read the full codebase (~50K tokens). DevClaw maintains **separate sessions per model per role** (session-per-model design). When a DEV finishes task A and picks up task B on the same project with the same model, the plugin detects the existing session and sends the task directly — no new session needed.
+Worker sessions are expensive to start — each new spawn requires the session to read the full codebase (~50K tokens). DevClaw maintains **separate sessions per tier per role** (session-per-tier design). When a medior dev finishes task A and picks up task B on the same project, the plugin detects the existing session and sends the task directly — no new session needed.

 The plugin handles session dispatch internally via OpenClaw CLI. The orchestrator agent never calls `sessions_spawn` or `sessions_send` — it just calls `task_pickup` and the plugin does the rest.
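
A rough sketch of the reuse decision, assuming the per-role `sessions` map from `projects.json` is keyed by tier: `planDispatch` is a hypothetical name, `"send"` matches the `sessionAction` value seen in the audit log, and `"spawn"` is an assumed label for the create-new case.

```typescript
type Tier = "junior" | "medior" | "senior" | "qa";

type DispatchPlan =
  | { action: "send"; sessionKey: string } // reuse the existing session
  | { action: "spawn" };                   // no session yet for this tier

// Look up the session for the assigned tier; a missing or null entry means a new session is needed.
function planDispatch(sessions: Partial<Record<Tier, string | null>>, tier: Tier): DispatchPlan {
  const sessionKey = sessions[tier];
  return sessionKey ? { action: "send", sessionKey } : { action: "spawn" };
}
```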

@@ -114,26 +127,26 @@ sequenceDiagram

    O->>DC: task_pickup({ issueId: 42, role: "dev" })
    DC->>GL: Fetch issue, verify label
-    DC->>DC: Select model (haiku/sonnet/opus)
-    DC->>DC: Check existing session for selected model
+    DC->>DC: Assign tier (junior/medior/senior)
+    DC->>DC: Check existing session for assigned tier
    DC->>GL: Transition label (To Do → Doing)
    DC->>S: Dispatch task via CLI (create or reuse session)
    DC->>DC: Update projects.json, write audit log
-    DC-->>O: { success: true, announcement: "🔧 DEV (sonnet) picking up #42" }
+    DC-->>O: { success: true, announcement: "🔧 DEV (medior) picking up #42" }
 ```

-## Model selection
+## Developer assignment

-The orchestrator LLM analyzes each issue's title, description, and labels to choose the appropriate model tier, then passes it to `task_pickup` via the `model` parameter. This gives the LLM full context for the decision — it can weigh factors like codebase familiarity, task dependencies, and recent failure history that keyword matching would miss.
+The orchestrator LLM evaluates each issue's title, description, and labels to assign the appropriate developer tier, then passes it to `task_pickup` via the `model` parameter. This gives the LLM full context for the decision — it can weigh factors like codebase familiarity, task dependencies, and recent failure history that keyword matching would miss.

 The keyword heuristic in `model-selector.ts` serves as a **fallback only**, used when the orchestrator omits the `model` parameter.

-| Complexity | Model | When |
-|------------|-------|------|
-| Simple | Haiku | Typos, CSS, renames, copy changes |
-| Standard | Sonnet | Features, bug fixes, multi-file changes |
-| Complex | Opus | Architecture, migrations, security, system-wide refactoring |
-| QA | Grok | All QA tasks (code review, test validation) |
+| Tier | Role | When |
+|------|------|------|
+| junior | Junior developer | Typos, CSS, renames, copy changes |
+| medior | Mid-level developer | Features, bug fixes, multi-file changes |
+| senior | Senior developer | Architecture, migrations, security, system-wide refactoring |
+| qa | QA engineer | All QA tasks (code review, test validation) |
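
The fallback described above could be sketched as follows. The keyword lists are assumptions loosely derived from the table, and `selectTier` / `assignTier` are hypothetical names; the real heuristic in `model-selector.ts` is not shown in this diff.

```typescript
type Role = "dev" | "qa";
type Tier = "junior" | "medior" | "senior" | "qa";

const TIERS: readonly string[] = ["junior", "medior", "senior", "qa"];

// Assumed keyword lists, loosely based on the "When" column of the table.
const JUNIOR_KEYWORDS = ["typo", "css", "rename", "copy"];
const SENIOR_KEYWORDS = ["architecture", "migration", "security", "refactor"];

// Keyword heuristic: QA tasks always get the qa tier; dev tasks default to medior.
function selectTier(description: string, role: Role): Tier {
  if (role === "qa") return "qa";
  const text = description.toLowerCase();
  // Check senior keywords first so a risky task is never downgraded by a junior keyword.
  if (SENIOR_KEYWORDS.some((k) => text.includes(k))) return "senior";
  if (JUNIOR_KEYWORDS.some((k) => text.includes(k))) return "junior";
  return "medior";
}

// The orchestrator's explicit choice wins; the heuristic runs only when it is omitted or invalid.
function assignTier(explicit: string | undefined, description: string, role: Role): Tier {
  if (explicit !== undefined && TIERS.includes(explicit)) return explicit as Tier;
  return selectTier(description, role);
}
```

Substring matching is deliberately crude here; it is a fallback, and the orchestrator's explicit choice is expected to carry most decisions.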

 ## State management

@@ -151,19 +164,19 @@ All project state lives in a single `memory/projects.json` file in the orchestra
      "dev": {
        "active": false,
        "issueId": null,
-        "model": "haiku",
+        "model": "medior",
        "sessions": {
-          "haiku": "agent:orchestrator:subagent:a9e4d078-...",
-          "sonnet": "agent:orchestrator:subagent:b3f5c912-...",
-          "opus": null
+          "junior": "agent:orchestrator:subagent:a9e4d078-...",
+          "medior": "agent:orchestrator:subagent:b3f5c912-...",
+          "senior": null
        }
      },
      "qa": {
        "active": false,
        "issueId": null,
-        "model": "grok",
+        "model": "qa",
        "sessions": {
-          "grok": "agent:orchestrator:subagent:18707821-..."
+          "qa": "agent:orchestrator:subagent:18707821-..."
        }
      }
    }
@@ -172,7 +185,7 @@ All project state lives in a single `memory/projects.json` file in the orchestra
 ```

 Key design decisions:
-- **Session-per-model** — each model gets its own worker session, accumulating context independently. Model selection maps directly to a session key.
+- **Session-per-tier** — each tier gets its own worker session, accumulating context independently. Tier selection maps directly to a session key.
 - **Sessions preserved on completion** — when a worker completes a task, `sessions` map is **preserved** (only `active` and `issueId` are cleared). This enables session reuse on the next pickup.
 - **Plugin-controlled dispatch** — the plugin creates and dispatches to sessions via OpenClaw CLI (`sessions.patch` + `openclaw agent`). The orchestrator agent never calls `sessions_spawn` or `sessions_send`.
 - **Sessions persist indefinitely** — no auto-cleanup. `session_health` handles manual cleanup when needed.
@@ -181,27 +194,35 @@ All writes go through atomic temp-file-then-rename to prevent corruption.

 ## Tools

-### `task_pickup`
+### `devclaw_setup`

-Pick up a task from the GitLab queue for a DEV or QA worker.
+Set up DevClaw in an agent's workspace. Creates AGENTS.md, HEARTBEAT.md, and role templates, and configures models. Can optionally create a new agent.

 **Parameters:**
-- `issueId` (number, required) — GitLab issue ID
+- `newAgentName` (string, optional) — Create a new agent with this name
+- `models` (object, optional) — Model overrides per tier: `{ junior, medior, senior, qa }`
+
+### `task_pickup`
+
+Pick up a task from the issue queue for a DEV or QA worker.
+
+**Parameters:**
+- `issueId` (number, required) — Issue ID
 - `role` ("dev" | "qa", required) — Worker role
 - `projectGroupId` (string, required) — Telegram group ID
-- `model` (string, optional) — Model alias to use (e.g. haiku, sonnet, opus, grok). The orchestrator should analyze the issue complexity and choose. Falls back to keyword heuristic if omitted.
+- `model` (string, optional) — Developer tier (junior, medior, senior, qa). The orchestrator should evaluate the task complexity and choose. Falls back to keyword heuristic if omitted.

 **What it does atomically:**
 1. Resolves project from `projects.json`
 2. Validates no active worker for this role
 3. Fetches issue from issue tracker, verifies correct label state
-4. Selects model (LLM-chosen via `model` param, keyword heuristic fallback)
+4. Assigns tier (LLM-chosen via `model` param, keyword heuristic fallback)
 5. Loads role instructions from `roles/<project>/<role>.md` (fallback: `roles/default/<role>.md`)
-6. Looks up existing session for selected model (session-per-model)
+6. Looks up existing session for assigned tier (session-per-tier)
 7. Transitions label (e.g. `To Do` → `Doing`)
 8. Creates session via Gateway RPC if new (`sessions.patch`)
 9. Dispatches task to worker session via CLI (`openclaw agent`) with role instructions appended
-10. Updates `projects.json` state (active, issueId, model, session key)
+10. Updates `projects.json` state (active, issueId, tier, session key)
 11. Writes audit log entry
 12. Returns announcement text for the orchestrator to post

@@ -216,9 +237,9 @@ Complete a task with one of four results. Called by workers (DEV/QA sub-agent se
 - `summary` (string, optional) — For the Telegram announcement

 **Results:**
-- **DEV "done"** — Pulls latest code, moves label `Doing` → `To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA (grok).
+- **DEV "done"** — Pulls latest code, moves label `Doing` → `To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA.
 - **QA "pass"** — Moves label `Testing` → `Done`, closes issue, deactivates worker
-- **QA "fail"** — Moves label `Testing` → `To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous model).
+- **QA "fail"** — Moves label `Testing` → `To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous DEV tier).
 - **QA "refine"** — Moves label `Testing` → `Refining`, awaits human decision

 ### `task_create`
@@ -284,24 +305,29 @@ Register a new project with DevClaw. Creates all required issue tracker labels (
 Every tool call automatically appends an NDJSON entry to `memory/audit.log`. No manual logging required from the orchestrator agent.

 ```jsonl
-{"ts":"2026-02-08T10:30:00Z","event":"task_pickup","project":"my-webapp","issue":42,"role":"dev","model":"sonnet","sessionAction":"send"}
-{"ts":"2026-02-08T10:30:01Z","event":"model_selection","issue":42,"role":"dev","selected":"sonnet","reason":"Standard dev task"}
+{"ts":"2026-02-08T10:30:00Z","event":"task_pickup","project":"my-webapp","issue":42,"role":"dev","tier":"medior","sessionAction":"send"}
+{"ts":"2026-02-08T10:30:01Z","event":"model_selection","issue":42,"role":"dev","tier":"medior","reason":"Standard dev task"}
 {"ts":"2026-02-08T10:45:00Z","event":"task_complete","project":"my-webapp","issue":42,"role":"dev","result":"done"}
 ```

-## Installation
+## Quick start

 ```bash
-# Local (place in extensions directory — auto-discovered)
+# 1. Install the plugin
 cp -r devclaw ~/.openclaw/extensions/

-# From npm (future)
-openclaw plugins install @openclaw/devclaw
+# 2. Run setup (interactive — creates agent, configures models, writes workspace files)
+openclaw devclaw setup
+
+# 3. Add bot to Telegram group, then register a project
+# (via the agent in Telegram)
 ```

+See the [Onboarding Guide](docs/ONBOARDING.md) for detailed instructions.
+
 ## Configuration

-Optional config in `openclaw.json`:
+Model tier configuration in `openclaw.json`:

 ```json
 {
@@ -309,7 +335,12 @@ Optional config in `openclaw.json`:
    "entries": {
      "devclaw": {
        "config": {
-          "glabPath": "/usr/local/bin/glab"
+          "models": {
+            "junior": "anthropic/claude-haiku-4-5",
+            "medior": "anthropic/claude-sonnet-4-5",
+            "senior": "anthropic/claude-opus-4-5",
+            "qa": "anthropic/claude-sonnet-4-5"
+          }
        }
      }
    }
@@ -325,7 +356,7 @@ Restrict tools to your orchestrator agent only:
    "list": [{
      "id": "my-orchestrator",
      "tools": {
-        "allow": ["task_pickup", "task_complete", "task_create", "queue_status", "session_health", "project_register"]
+        "allow": ["devclaw_setup", "task_pickup", "task_complete", "task_create", "queue_status", "session_health", "project_register"]
      }
    }]
  }
@@ -359,7 +390,6 @@ workspace/
 - [OpenClaw](https://openclaw.ai)
 - Node.js >= 20
 - [`glab`](https://gitlab.com/gitlab-org/cli) CLI installed and authenticated (GitLab provider), or [`gh`](https://cli.github.com) CLI (GitHub provider)
-- A `memory/projects.json` in the orchestrator agent's workspace

 ## License