feat: create TEST.md markdown file (#78)
This commit is contained in:
29
README.md
29
README.md
@@ -89,10 +89,10 @@ stateDiagram-v2
|
|||||||
[*] --> Planning
|
[*] --> Planning
|
||||||
Planning --> ToDo: Ready for development
|
Planning --> ToDo: Ready for development
|
||||||
|
|
||||||
ToDo --> Doing: task_pickup (DEV)
|
ToDo --> Doing: task_pickup (DEV) ⇄ blocked
|
||||||
Doing --> ToTest: task_complete (DEV done)
|
Doing --> ToTest: task_complete (DEV done)
|
||||||
|
|
||||||
ToTest --> Testing: task_pickup (QA) or auto-chain
|
ToTest --> Testing: task_pickup (QA) / auto-chain ⇄ blocked
|
||||||
Testing --> Done: task_complete (QA pass)
|
Testing --> Done: task_complete (QA pass)
|
||||||
Testing --> ToImprove: task_complete (QA fail)
|
Testing --> ToImprove: task_complete (QA fail)
|
||||||
Testing --> Refining: task_complete (QA refine)
|
Testing --> Refining: task_complete (QA refine)
|
||||||
@@ -107,13 +107,22 @@ stateDiagram-v2
|
|||||||
|
|
||||||
Workers (DEV/QA sub-agent sessions) call `task_complete` directly when they finish — no orchestrator involvement needed for the state transition. Workers can also call `task_create` to file follow-up issues they discover during work.
|
Workers (DEV/QA sub-agent sessions) call `task_complete` directly when they finish — no orchestrator involvement needed for the state transition. Workers can also call `task_create` to file follow-up issues they discover during work.
|
||||||
|
|
||||||
|
### Completion enforcement
|
||||||
|
|
||||||
|
Three layers guarantee that `task_complete` always runs, preventing tasks from getting stuck in "Doing" or "Testing" forever:
|
||||||
|
|
||||||
|
1. **Completion contract** — Every task message includes a mandatory section requiring the worker to call `task_complete`, even on failure. Workers use `"blocked"` if stuck.
|
||||||
|
2. **Blocked result** — Both DEV and QA can return `"blocked"` to gracefully put a task back in queue (`Doing → To Do`, `Testing → To Test`) instead of silently dying.
|
||||||
|
3. **Stale worker watchdog** — The heartbeat health check detects workers active >2 hours and auto-reverts labels to queue, catching sessions that crashed or ran out of context.
|
||||||
|
|
||||||
### Auto-chaining
|
### Auto-chaining
|
||||||
|
|
||||||
When a project has `autoChain: true`, `task_complete` automatically dispatches the next step:
|
When a project has `autoChain: true`, `task_complete` automatically dispatches the next step:
|
||||||
|
|
||||||
- **DEV "done"** → QA is dispatched immediately (using the qa tier)
|
- **DEV "done"** → QA is dispatched immediately (using the qa tier)
|
||||||
- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV tier)
|
- **QA "fail"** → DEV fix is dispatched immediately (reuses previous DEV tier)
|
||||||
- **QA "pass" / "refine"** → no chaining (pipeline done or needs human input)
|
- **QA "pass" / "refine" / "blocked"** → no chaining (pipeline done, needs human input, or returned to queue)
|
||||||
|
- **DEV "blocked"** → no chaining (returned to queue for retry)
|
||||||
|
|
||||||
When `autoChain` is false, `task_complete` returns a `nextAction` hint for the orchestrator to act on.
|
When `autoChain` is false, `task_complete` returns a `nextAction` hint for the orchestrator to act on.
|
||||||
|
|
||||||
@@ -237,21 +246,23 @@ Pick up a task from the issue queue for a DEV or QA worker.
|
|||||||
|
|
||||||
### `task_complete`
|
### `task_complete`
|
||||||
|
|
||||||
Complete a task with one of four results. Called by workers (DEV/QA sub-agent sessions) directly, or by the orchestrator.
|
Complete a task with a result. Called by workers (DEV/QA sub-agent sessions) directly, or by the orchestrator.
|
||||||
|
|
||||||
**Parameters:**
|
**Parameters:**
|
||||||
|
|
||||||
- `role` ("dev" | "qa", required)
|
- `role` ("dev" | "qa", required)
|
||||||
- `result` ("done" | "pass" | "fail" | "refine", required)
|
- `result` ("done" | "pass" | "fail" | "refine" | "blocked", required)
|
||||||
- `projectGroupId` (string, required)
|
- `projectGroupId` (string, required)
|
||||||
- `summary` (string, optional) — For the Telegram announcement
|
- `summary` (string, optional) — For the Telegram announcement
|
||||||
|
|
||||||
**Results:**
|
**Results:**
|
||||||
|
|
||||||
- **DEV "done"** — Pulls latest code, moves label `Doing` → `To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA.
|
- **DEV "done"** — Pulls latest code, moves label `Doing` → `To Test`, deactivates worker. If `autoChain` enabled, automatically dispatches QA.
|
||||||
|
- **DEV "blocked"** — Moves label `Doing` → `To Do`, deactivates worker. Task returns to queue for retry.
|
||||||
- **QA "pass"** — Moves label `Testing` → `Done`, closes issue, deactivates worker
|
- **QA "pass"** — Moves label `Testing` → `Done`, closes issue, deactivates worker
|
||||||
- **QA "fail"** — Moves label `Testing` → `To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous DEV tier).
|
- **QA "fail"** — Moves label `Testing` → `To Improve`, reopens issue. If `autoChain` enabled, automatically dispatches DEV fix (reuses previous DEV tier).
|
||||||
- **QA "refine"** — Moves label `Testing` → `Refining`, awaits human decision
|
- **QA "refine"** — Moves label `Testing` → `Refining`, awaits human decision
|
||||||
|
- **QA "blocked"** — Moves label `Testing` → `To Test`, deactivates worker. Task returns to QA queue for retry.
|
||||||
|
|
||||||
### `task_update`
|
### `task_update`
|
||||||
|
|
||||||
@@ -321,10 +332,10 @@ Detects and optionally fixes state inconsistencies.
|
|||||||
|
|
||||||
**Checks:**
|
**Checks:**
|
||||||
|
|
||||||
- Active worker with no session key (critical)
|
- Active worker with no session key (critical, auto-fixable)
|
||||||
- Active worker whose session is dead — zombie (critical)
|
- Active worker whose session is dead — zombie (critical, auto-fixable)
|
||||||
- Worker active for >2 hours (warning)
|
- Worker active for >2 hours — stale watchdog (warning, auto-fixable: reverts label to queue)
|
||||||
- Inactive worker with lingering issue ID (warning)
|
- Inactive worker with lingering issue ID (warning, auto-fixable)
|
||||||
|
|
||||||
### `project_register`
|
### `project_register`
|
||||||
|
|
||||||
|
|||||||
@@ -403,6 +403,25 @@ Label: "Testing" → "Refining"
|
|||||||
|
|
||||||
Issue needs human decision. Pipeline pauses until human moves it to "To Do" or closes it.
|
Issue needs human decision. Pipeline pauses until human moves it to "To Do" or closes it.
|
||||||
|
|
||||||
|
#### 7d. Blocked (DEV or QA)
|
||||||
|
|
||||||
|
```
|
||||||
|
DEV Blocked: "Doing" → "To Do"
|
||||||
|
QA Blocked: "Testing" → "To Test"
|
||||||
|
```
|
||||||
|
|
||||||
|
Worker cannot complete (missing info, environment errors, etc.). Issue returns to queue for retry. No auto-chain — the task is available for the next heartbeat pickup.
|
||||||
|
|
||||||
|
### Completion enforcement
|
||||||
|
|
||||||
|
Three layers guarantee that `task_complete` always runs:
|
||||||
|
|
||||||
|
1. **Completion contract** — Every task message sent to a worker session includes a mandatory `## MANDATORY: Task Completion` section listing available results and requiring `task_complete` even on failure. Workers are instructed to use `"blocked"` if stuck.
|
||||||
|
|
||||||
|
2. **Blocked result** — Both DEV and QA can use `"blocked"` to gracefully return a task to queue without losing work. DEV blocked: `Doing → To Do`. QA blocked: `Testing → To Test`. This gives workers an escape hatch instead of silently dying.
|
||||||
|
|
||||||
|
3. **Stale worker watchdog** — The heartbeat's health check detects workers active for >2 hours. With `autoFix=true`, it deactivates the worker and reverts the label back to queue. This catches sessions that crashed, ran out of context, or otherwise failed without calling `task_complete`. The `session_health` tool provides the same check for manual invocation.
|
||||||
|
|
||||||
### Phase 8: Heartbeat (continuous)
|
### Phase 8: Heartbeat (continuous)
|
||||||
|
|
||||||
The heartbeat runs periodically (triggered by the agent or a scheduled message). It combines health check + queue scan:
|
The heartbeat runs periodically (triggered by the agent or a scheduled message). It combines health check + queue scan:
|
||||||
@@ -493,6 +512,8 @@ Every piece of data and where it lives:
|
|||||||
│ "✅ DEV done #42 — Login page with OAuth. Moved to QA queue."│
|
│ "✅ DEV done #42 — Login page with OAuth. Moved to QA queue."│
|
||||||
│ "🎉 QA PASS #42. Issue closed." │
|
│ "🎉 QA PASS #42. Issue closed." │
|
||||||
│ "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." │
|
│ "❌ QA FAIL #42 — OAuth redirect broken. Sent back to DEV." │
|
||||||
|
│ "🚫 DEV BLOCKED #42 — Missing dependencies. Returned to queue."│
|
||||||
|
│ "🚫 QA BLOCKED #42 — Env not available. Returned to QA queue."│
|
||||||
└─────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────┘
|
||||||
|
|
||||||
┌─────────────────────────────────────────────────────────────────┐
|
┌─────────────────────────────────────────────────────────────────┐
|
||||||
@@ -575,7 +596,8 @@ Provider selection is handled by `createProvider()` in `lib/providers/index.ts`.
|
|||||||
| projects.json corrupted | Tool can't parse JSON | Manual fix needed. Atomic writes (temp+rename) prevent partial writes. |
|
| projects.json corrupted | Tool can't parse JSON | Manual fix needed. Atomic writes (temp+rename) prevent partial writes. |
|
||||||
| Label out of sync | `task_pickup` verifies label before transitioning | Throws error if label doesn't match expected state. Agent reports mismatch. |
|
| Label out of sync | `task_pickup` verifies label before transitioning | Throws error if label doesn't match expected state. Agent reports mismatch. |
|
||||||
| Worker already active | `task_pickup` checks `active` flag | Throws error: "DEV worker already active on project". Must complete current task first. |
|
| Worker already active | `task_pickup` checks `active` flag | Throws error: "DEV worker already active on project". Must complete current task first. |
|
||||||
| Stale worker (>2h) | `session_health` flags as warning | Agent can investigate or `autoFix` can clear. |
|
| Stale worker (>2h) | `session_health` and heartbeat health check | `autoFix`: deactivates worker, reverts label to queue (To Do / To Test). Task available for next pickup. |
|
||||||
|
| Worker stuck/blocked | Worker calls `task_complete` with `"blocked"` | Deactivates worker, reverts label to queue. Issue available for retry. |
|
||||||
| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. No partial state — labels are idempotent, projects.json not written until all labels succeed. |
|
| `project_register` fails | Plugin catches error during label creation or state write | Clean error returned. No partial state — labels are idempotent, projects.json not written until all labels succeed. |
|
||||||
|
|
||||||
## File locations
|
## File locations
|
||||||
|
|||||||
@@ -55,7 +55,7 @@ export type DispatchResult = {
|
|||||||
* Reads role-specific instructions from workspace/roles/<project>/<role>.md
|
* Reads role-specific instructions from workspace/roles/<project>/<role>.md
|
||||||
* with fallback to workspace/roles/default/<role>.md.
|
* with fallback to workspace/roles/default/<role>.md.
|
||||||
*/
|
*/
|
||||||
async function buildTaskMessage(opts: {
|
export async function buildTaskMessage(opts: {
|
||||||
workspaceDir: string;
|
workspaceDir: string;
|
||||||
projectName: string;
|
projectName: string;
|
||||||
role: "dev" | "qa";
|
role: "dev" | "qa";
|
||||||
@@ -104,6 +104,12 @@ async function buildTaskMessage(opts: {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Build available results based on role
|
||||||
|
const availableResults =
|
||||||
|
role === "dev"
|
||||||
|
? '"done" (completed successfully) or "blocked" (cannot complete, need help)'
|
||||||
|
: '"pass" (approved), "fail" (issues found), "refine" (needs human input), or "blocked" (cannot complete)';
|
||||||
|
|
||||||
const parts = [
|
const parts = [
|
||||||
`${role.toUpperCase()} task for project "${projectName}" — Issue #${issueId}`,
|
`${role.toUpperCase()} task for project "${projectName}" — Issue #${issueId}`,
|
||||||
``,
|
``,
|
||||||
@@ -118,6 +124,24 @@ async function buildTaskMessage(opts: {
|
|||||||
parts.push(``, `---`, ``, roleInstructions.trim());
|
parts.push(``, `---`, ``, roleInstructions.trim());
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Mandatory completion contract
|
||||||
|
parts.push(
|
||||||
|
``,
|
||||||
|
`---`,
|
||||||
|
``,
|
||||||
|
`## MANDATORY: Task Completion`,
|
||||||
|
``,
|
||||||
|
`When you finish this task, you MUST call \`task_complete\` with:`,
|
||||||
|
`- \`role\`: "${role}"`,
|
||||||
|
`- \`projectGroupId\`: "${groupId}"`,
|
||||||
|
`- \`result\`: ${availableResults}`,
|
||||||
|
`- \`summary\`: brief description of what you did`,
|
||||||
|
``,
|
||||||
|
`⚠️ You MUST call task_complete even if you encounter errors or cannot finish.`,
|
||||||
|
`Use "blocked" with a summary explaining why you're stuck.`,
|
||||||
|
`Never end your session without calling task_complete.`,
|
||||||
|
);
|
||||||
|
|
||||||
return parts.join("\n");
|
return parts.join("\n");
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -193,8 +217,18 @@ export async function dispatchTask(
|
|||||||
|
|
||||||
await execFileAsync(
|
await execFileAsync(
|
||||||
"openclaw",
|
"openclaw",
|
||||||
["agent", "--session-id", sessionKey!, "--message", taskMessage],
|
[
|
||||||
{ timeout: 60_000 },
|
"gateway",
|
||||||
|
"call",
|
||||||
|
"agent",
|
||||||
|
"--params",
|
||||||
|
JSON.stringify({
|
||||||
|
idempotencyKey: randomUUID(),
|
||||||
|
sessionId: sessionKey!,
|
||||||
|
message: taskMessage,
|
||||||
|
}),
|
||||||
|
],
|
||||||
|
{ timeout: 30_000 },
|
||||||
);
|
);
|
||||||
|
|
||||||
dispatched = true;
|
dispatched = true;
|
||||||
|
|||||||
@@ -42,7 +42,7 @@ export type NotifyEvent =
|
|||||||
groupId: string;
|
groupId: string;
|
||||||
issueId: number;
|
issueId: number;
|
||||||
role: "dev" | "qa";
|
role: "dev" | "qa";
|
||||||
result: "done" | "pass" | "fail" | "refine";
|
result: "done" | "pass" | "fail" | "refine" | "blocked";
|
||||||
summary?: string;
|
summary?: string;
|
||||||
nextState?: string;
|
nextState?: string;
|
||||||
}
|
}
|
||||||
@@ -76,6 +76,7 @@ function buildMessage(event: NotifyEvent): string {
|
|||||||
pass: "🎉",
|
pass: "🎉",
|
||||||
fail: "❌",
|
fail: "❌",
|
||||||
refine: "🤔",
|
refine: "🤔",
|
||||||
|
blocked: "🚫",
|
||||||
};
|
};
|
||||||
const icon = icons[event.result] ?? "📋";
|
const icon = icons[event.result] ?? "📋";
|
||||||
const resultText: Record<string, string> = {
|
const resultText: Record<string, string> = {
|
||||||
@@ -83,6 +84,7 @@ function buildMessage(event: NotifyEvent): string {
|
|||||||
pass: "PASSED",
|
pass: "PASSED",
|
||||||
fail: "FAILED",
|
fail: "FAILED",
|
||||||
refine: "needs refinement",
|
refine: "needs refinement",
|
||||||
|
blocked: "BLOCKED",
|
||||||
};
|
};
|
||||||
const text = resultText[event.result] ?? event.result;
|
const text = resultText[event.result] ?? event.result;
|
||||||
let msg = `${icon} ${event.role.toUpperCase()} ${text} #${event.issueId}`;
|
let msg = `${icon} ${event.role.toUpperCase()} ${text} #${event.issueId}`;
|
||||||
|
|||||||
@@ -214,6 +214,41 @@ async function checkAndFixWorkerHealth(
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Check 4: Active for >2 hours (stale watchdog)
|
||||||
|
// A stale worker likely crashed or ran out of context without calling task_complete.
|
||||||
|
// Auto-fix reverts the label back to queue so the issue can be picked up again.
|
||||||
|
if (worker.active && worker.startTime && currentSessionKey) {
|
||||||
|
const startMs = new Date(worker.startTime).getTime();
|
||||||
|
const nowMs = Date.now();
|
||||||
|
const hoursActive = (nowMs - startMs) / (1000 * 60 * 60);
|
||||||
|
|
||||||
|
if (hoursActive > 2) {
|
||||||
|
if (autoFix) {
|
||||||
|
const revertLabel: StateLabel = role === "dev" ? "To Do" : "To Test";
|
||||||
|
const currentLabel: StateLabel = role === "dev" ? "Doing" : "Testing";
|
||||||
|
try {
|
||||||
|
if (worker.issueId) {
|
||||||
|
const primaryIssueId = Number(worker.issueId.split(",")[0]);
|
||||||
|
await provider.transitionLabel(primaryIssueId, currentLabel, revertLabel);
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
// Best-effort label revert
|
||||||
|
}
|
||||||
|
|
||||||
|
await updateWorker(workspaceDir, groupId, role, {
|
||||||
|
active: false,
|
||||||
|
issueId: null,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
fixes.push({
|
||||||
|
project: project.name,
|
||||||
|
role,
|
||||||
|
type: "stale_worker",
|
||||||
|
fixed: autoFix,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
return fixes;
|
return fixes;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -129,14 +129,15 @@ export function createSessionHealthTool(api: OpenClawPluginApi) {
|
|||||||
issues.push(issue);
|
issues.push(issue);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check 3: Active for >2 hours (stale)
|
// Check 3: Active for >2 hours (stale watchdog)
|
||||||
|
// Worker likely crashed or ran out of context without calling task_complete.
|
||||||
if (worker.active && worker.startTime) {
|
if (worker.active && worker.startTime) {
|
||||||
const startMs = new Date(worker.startTime).getTime();
|
const startMs = new Date(worker.startTime).getTime();
|
||||||
const nowMs = Date.now();
|
const nowMs = Date.now();
|
||||||
const hoursActive = (nowMs - startMs) / (1000 * 60 * 60);
|
const hoursActive = (nowMs - startMs) / (1000 * 60 * 60);
|
||||||
|
|
||||||
if (hoursActive > 2) {
|
if (hoursActive > 2) {
|
||||||
issues.push({
|
const issue: Record<string, unknown> = {
|
||||||
type: "stale_worker",
|
type: "stale_worker",
|
||||||
severity: "warning",
|
severity: "warning",
|
||||||
project: project.name,
|
project: project.name,
|
||||||
@@ -146,7 +147,30 @@ export function createSessionHealthTool(api: OpenClawPluginApi) {
|
|||||||
sessionKey: currentSessionKey,
|
sessionKey: currentSessionKey,
|
||||||
issueId: worker.issueId,
|
issueId: worker.issueId,
|
||||||
message: `${role.toUpperCase()} has been active for ${Math.round(hoursActive * 10) / 10}h — may need attention`,
|
message: `${role.toUpperCase()} has been active for ${Math.round(hoursActive * 10) / 10}h — may need attention`,
|
||||||
|
};
|
||||||
|
|
||||||
|
if (autoFix) {
|
||||||
|
// Revert issue label back to queue
|
||||||
|
const revertLabel: StateLabel = role === "dev" ? "To Do" : "To Test";
|
||||||
|
const currentLabel: StateLabel = role === "dev" ? "Doing" : "Testing";
|
||||||
|
try {
|
||||||
|
if (worker.issueId) {
|
||||||
|
const primaryIssueId = Number(worker.issueId.split(",")[0]);
|
||||||
|
await provider.transitionLabel(primaryIssueId, currentLabel, revertLabel);
|
||||||
|
issue.labelReverted = `${currentLabel} → ${revertLabel}`;
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
issue.labelRevertFailed = true;
|
||||||
|
}
|
||||||
|
|
||||||
|
await updateWorker(workspaceDir, groupId, role, {
|
||||||
|
active: false,
|
||||||
|
issueId: null,
|
||||||
});
|
});
|
||||||
|
issue.fixed = true;
|
||||||
|
fixesApplied++;
|
||||||
|
}
|
||||||
|
issues.push(issue);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -33,7 +33,7 @@ export function createTaskCompleteTool(api: OpenClawPluginApi) {
|
|||||||
return (ctx: ToolContext) => ({
|
return (ctx: ToolContext) => ({
|
||||||
name: "task_complete",
|
name: "task_complete",
|
||||||
label: "Task Complete",
|
label: "Task Complete",
|
||||||
description: `Complete a task: DEV done, QA pass, QA fail, or QA refine. Atomically handles: label transition, projects.json update, issue close/reopen, and audit logging. If the project has autoChain enabled, automatically dispatches the next step (DEV done → QA, QA fail → DEV fix).`,
|
description: `Complete a task: DEV done/blocked, QA pass/fail/refine/blocked. Atomically handles: label transition, projects.json update, issue close/reopen, and audit logging. If the project has autoChain enabled, automatically dispatches the next step (DEV done → QA, QA fail → DEV fix). Use "blocked" when the worker cannot complete the task (errors, missing info, etc.).`,
|
||||||
parameters: {
|
parameters: {
|
||||||
type: "object",
|
type: "object",
|
||||||
required: ["role", "result", "projectGroupId"],
|
required: ["role", "result", "projectGroupId"],
|
||||||
@@ -45,9 +45,9 @@ export function createTaskCompleteTool(api: OpenClawPluginApi) {
|
|||||||
},
|
},
|
||||||
result: {
|
result: {
|
||||||
type: "string",
|
type: "string",
|
||||||
enum: ["done", "pass", "fail", "refine"],
|
enum: ["done", "pass", "fail", "refine", "blocked"],
|
||||||
description:
|
description:
|
||||||
'Completion result: "done" (DEV finished), "pass" (QA approved), "fail" (QA found issues), "refine" (needs human input)',
|
'Completion result: "done" (DEV finished), "pass" (QA approved), "fail" (QA found issues), "refine" (needs human input), "blocked" (cannot complete, needs escalation)',
|
||||||
},
|
},
|
||||||
projectGroupId: {
|
projectGroupId: {
|
||||||
type: "string",
|
type: "string",
|
||||||
@@ -62,7 +62,7 @@ export function createTaskCompleteTool(api: OpenClawPluginApi) {
|
|||||||
|
|
||||||
async execute(_id: string, params: Record<string, unknown>) {
|
async execute(_id: string, params: Record<string, unknown>) {
|
||||||
const role = params.role as "dev" | "qa";
|
const role = params.role as "dev" | "qa";
|
||||||
const result = params.result as "done" | "pass" | "fail" | "refine";
|
const result = params.result as "done" | "pass" | "fail" | "refine" | "blocked";
|
||||||
const groupId = params.projectGroupId as string;
|
const groupId = params.projectGroupId as string;
|
||||||
const summary = params.summary as string | undefined;
|
const summary = params.summary as string | undefined;
|
||||||
const workspaceDir = ctx.workspaceDir;
|
const workspaceDir = ctx.workspaceDir;
|
||||||
@@ -72,14 +72,14 @@ export function createTaskCompleteTool(api: OpenClawPluginApi) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Validate result matches role
|
// Validate result matches role
|
||||||
if (role === "dev" && result !== "done") {
|
if (role === "dev" && result !== "done" && result !== "blocked") {
|
||||||
throw new Error(
|
throw new Error(
|
||||||
`DEV can only complete with result "done", got "${result}"`,
|
`DEV can only complete with "done" or "blocked", got "${result}"`,
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
if (role === "qa" && result === "done") {
|
if (role === "qa" && result === "done") {
|
||||||
throw new Error(
|
throw new Error(
|
||||||
`QA cannot use result "done". Use "pass", "fail", or "refine".`,
|
`QA cannot use result "done". Use "pass", "fail", "refine", or "blocked".`,
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -267,6 +267,24 @@ export function createTaskCompleteTool(api: OpenClawPluginApi) {
|
|||||||
output.announcement = `🤔 QA REFINE #${issueId}${summary ? ` — ${summary}` : ""}. Awaiting human decision.`;
|
output.announcement = `🤔 QA REFINE #${issueId}${summary ? ` — ${summary}` : ""}. Awaiting human decision.`;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// === DEV BLOCKED ===
|
||||||
|
if (role === "dev" && result === "blocked") {
|
||||||
|
await deactivateWorker(workspaceDir, groupId, "dev");
|
||||||
|
await provider.transitionLabel(issueId, "Doing", "To Do");
|
||||||
|
|
||||||
|
output.labelTransition = "Doing → To Do";
|
||||||
|
output.announcement = `🚫 DEV BLOCKED #${issueId}${summary ? ` — ${summary}` : ""}. Returned to queue.`;
|
||||||
|
}
|
||||||
|
|
||||||
|
// === QA BLOCKED ===
|
||||||
|
if (role === "qa" && result === "blocked") {
|
||||||
|
await deactivateWorker(workspaceDir, groupId, "qa");
|
||||||
|
await provider.transitionLabel(issueId, "Testing", "To Test");
|
||||||
|
|
||||||
|
output.labelTransition = "Testing → To Test";
|
||||||
|
output.announcement = `🚫 QA BLOCKED #${issueId}${summary ? ` — ${summary}` : ""}. Returned to QA queue.`;
|
||||||
|
}
|
||||||
|
|
||||||
// Send notification to project group
|
// Send notification to project group
|
||||||
const pluginConfig = api.pluginConfig as Record<string, unknown> | undefined;
|
const pluginConfig = api.pluginConfig as Record<string, unknown> | undefined;
|
||||||
const notifyConfig = getNotificationConfig(pluginConfig);
|
const notifyConfig = getNotificationConfig(pluginConfig);
|
||||||
@@ -275,12 +293,16 @@ export function createTaskCompleteTool(api: OpenClawPluginApi) {
|
|||||||
let nextState: string | undefined;
|
let nextState: string | undefined;
|
||||||
if (role === "dev" && result === "done") {
|
if (role === "dev" && result === "done") {
|
||||||
nextState = "QA queue";
|
nextState = "QA queue";
|
||||||
|
} else if (role === "dev" && result === "blocked") {
|
||||||
|
nextState = "returned to queue";
|
||||||
} else if (role === "qa" && result === "pass") {
|
} else if (role === "qa" && result === "pass") {
|
||||||
nextState = "Done!";
|
nextState = "Done!";
|
||||||
} else if (role === "qa" && result === "fail") {
|
} else if (role === "qa" && result === "fail") {
|
||||||
nextState = "back to DEV";
|
nextState = "back to DEV";
|
||||||
} else if (role === "qa" && result === "refine") {
|
} else if (role === "qa" && result === "refine") {
|
||||||
nextState = "awaiting human decision";
|
nextState = "awaiting human decision";
|
||||||
|
} else if (role === "qa" && result === "blocked") {
|
||||||
|
nextState = "returned to QA queue";
|
||||||
}
|
}
|
||||||
|
|
||||||
await notify(
|
await notify(
|
||||||
|
|||||||
@@ -29,7 +29,7 @@
|
|||||||
"scripts": {
|
"scripts": {
|
||||||
"check": "tsc --noEmit",
|
"check": "tsc --noEmit",
|
||||||
"watch": "tsc --noEmit --watch",
|
"watch": "tsc --noEmit --watch",
|
||||||
"test": "node --test lib/**/*.test.ts"
|
"test": "npx tsx --test test/**/*.test.ts"
|
||||||
},
|
},
|
||||||
"peerDependencies": {
|
"peerDependencies": {
|
||||||
"openclaw": ">=2026.0.0"
|
"openclaw": ">=2026.0.0"
|
||||||
|
|||||||
374
test/helpers.ts
Normal file
374
test/helpers.ts
Normal file
@@ -0,0 +1,374 @@
|
|||||||
|
/**
|
||||||
|
* helpers.ts — Shared utilities for DevClaw integration tests.
|
||||||
|
*
|
||||||
|
* Provides: gateway RPC wrapper, GitHub issue helpers, session verification,
|
||||||
|
* mock context factories, and automatic test resource cleanup.
|
||||||
|
*/
|
||||||
|
import { execFile } from "node:child_process";
|
||||||
|
import fs from "node:fs/promises";
|
||||||
|
import os from "node:os";
|
||||||
|
import path from "node:path";
|
||||||
|
import { promisify } from "node:util";
|
||||||
|
import type { ToolContext } from "../lib/types.js";
|
||||||
|
import { type ProjectsData, writeProjects } from "../lib/projects.js";
|
||||||
|
|
||||||
|
const execFileAsync = promisify(execFile);
|
||||||
|
|
||||||
|
// ── Constants ───────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/** Prefix for all test session keys — used for cleanup sweeps */
|
||||||
|
export const TEST_SESSION_PREFIX = "agent:devclaw:subagent:test-";
|
||||||
|
|
||||||
|
/** Group ID used for the test project in projects.json.
|
||||||
|
* Uses the real DevClaw Telegram group so notifications are visible in the channel. */
|
||||||
|
export const TEST_GROUP_ID = "-5239235162";
|
||||||
|
|
||||||
|
/** Repo path for test issues (devclaw repo) */
|
||||||
|
export const TEST_REPO = "laurentenhoor/devclaw";
|
||||||
|
|
||||||
|
// ── Gateway RPC ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Call an OpenClaw gateway method. Returns parsed JSON response.
|
||||||
|
* Throws on gateway error or timeout.
|
||||||
|
*/
|
||||||
|
export async function gateway(
|
||||||
|
method: string,
|
||||||
|
params: Record<string, unknown> = {},
|
||||||
|
): Promise<Record<string, unknown>> {
|
||||||
|
const { stdout } = await execFileAsync(
|
||||||
|
"openclaw",
|
||||||
|
[
|
||||||
|
"gateway",
|
||||||
|
"call",
|
||||||
|
method,
|
||||||
|
"--params",
|
||||||
|
JSON.stringify(params),
|
||||||
|
"--json",
|
||||||
|
],
|
||||||
|
{ timeout: 30_000 },
|
||||||
|
);
|
||||||
|
|
||||||
|
// openclaw may output plugin registration lines before JSON
|
||||||
|
const jsonStart = stdout.indexOf("{");
|
||||||
|
if (jsonStart === -1) {
|
||||||
|
throw new Error(`No JSON in gateway response for ${method}: ${stdout}`);
|
||||||
|
}
|
||||||
|
return JSON.parse(stdout.slice(jsonStart));
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Session helpers ─────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/** Check if a session exists in the gateway */
|
||||||
|
export async function sessionExists(key: string): Promise<boolean> {
|
||||||
|
try {
|
||||||
|
const result = await gateway("sessions.list", {
|
||||||
|
limit: 200,
|
||||||
|
agentId: "devclaw",
|
||||||
|
});
|
||||||
|
const sessions = result.sessions as Array<{ key: string }>;
|
||||||
|
return sessions.some((s) => s.key === key);
|
||||||
|
} catch {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/** Get token count for a session (0 = never started) */
|
||||||
|
export async function getSessionTokens(
|
||||||
|
key: string,
|
||||||
|
): Promise<number | null> {
|
||||||
|
try {
|
||||||
|
const result = await gateway("sessions.list", {
|
||||||
|
limit: 200,
|
||||||
|
agentId: "devclaw",
|
||||||
|
});
|
||||||
|
const sessions = result.sessions as Array<{
|
||||||
|
key: string;
|
||||||
|
totalTokens?: number;
|
||||||
|
}>;
|
||||||
|
const session = sessions.find((s) => s.key === key);
|
||||||
|
return session ? (session.totalTokens ?? 0) : null;
|
||||||
|
} catch {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── GitHub issue helpers ────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/** Get current labels on a GitHub issue */
|
||||||
|
export async function getIssueLabels(
|
||||||
|
repo: string,
|
||||||
|
issueId: number,
|
||||||
|
): Promise<string[]> {
|
||||||
|
const { stdout } = await execFileAsync(
|
||||||
|
"gh",
|
||||||
|
["issue", "view", String(issueId), "--repo", repo, "--json", "labels"],
|
||||||
|
{ timeout: 15_000 },
|
||||||
|
);
|
||||||
|
const data = JSON.parse(stdout) as { labels: Array<{ name: string }> };
|
||||||
|
return data.labels.map((l) => l.name);
|
||||||
|
}
|
||||||
|
|
||||||
|
/** Get current state of a GitHub issue (OPEN/CLOSED) */
|
||||||
|
export async function getIssueState(
|
||||||
|
repo: string,
|
||||||
|
issueId: number,
|
||||||
|
): Promise<string> {
|
||||||
|
const { stdout } = await execFileAsync(
|
||||||
|
"gh",
|
||||||
|
["issue", "view", String(issueId), "--repo", repo, "--json", "state"],
|
||||||
|
{ timeout: 15_000 },
|
||||||
|
);
|
||||||
|
const data = JSON.parse(stdout) as { state: string };
|
||||||
|
return data.state;
|
||||||
|
}
|
||||||
|
|
||||||
|
/** Close a GitHub issue (best-effort) */
|
||||||
|
export async function closeIssue(
|
||||||
|
repo: string,
|
||||||
|
issueId: number,
|
||||||
|
): Promise<void> {
|
||||||
|
try {
|
||||||
|
await execFileAsync(
|
||||||
|
"gh",
|
||||||
|
["issue", "close", String(issueId), "--repo", repo],
|
||||||
|
{ timeout: 15_000 },
|
||||||
|
);
|
||||||
|
} catch {
|
||||||
|
// best-effort
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Cleanup registry ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Tracks all test resources (sessions + issues) for guaranteed cleanup.
|
||||||
|
*
|
||||||
|
* Usage:
|
||||||
|
* const cleanup = new TestCleanup();
|
||||||
|
* cleanup.trackSession("agent:devclaw:subagent:test-xxx");
|
||||||
|
* cleanup.trackIssue("laurentenhoor/devclaw", 42);
|
||||||
|
* await cleanup.cleanAll(); // in after() hook
|
||||||
|
*/
|
||||||
|
export class TestCleanup {
|
||||||
|
private sessions = new Set<string>();
|
||||||
|
private issues: Array<{ repo: string; id: number }> = [];
|
||||||
|
|
||||||
|
trackSession(key: string): void {
|
||||||
|
this.sessions.add(key);
|
||||||
|
}
|
||||||
|
|
||||||
|
trackIssue(repo: string, id: number): void {
|
||||||
|
this.issues.push({ repo, id });
|
||||||
|
}
|
||||||
|
|
||||||
|
async cleanAll(): Promise<void> {
|
||||||
|
// Delete tracked sessions
|
||||||
|
for (const key of this.sessions) {
|
||||||
|
try {
|
||||||
|
await gateway("sessions.delete", {
|
||||||
|
key,
|
||||||
|
deleteTranscript: true,
|
||||||
|
});
|
||||||
|
} catch {
|
||||||
|
// best-effort
|
||||||
|
}
|
||||||
|
}
|
||||||
|
this.sessions.clear();
|
||||||
|
|
||||||
|
// Close tracked issues
|
||||||
|
for (const { repo, id } of this.issues) {
|
||||||
|
await closeIssue(repo, id);
|
||||||
|
}
|
||||||
|
this.issues.length = 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Safety sweep: find and delete any test sessions from previous failed runs.
|
||||||
|
* Scans sessions.list for keys matching TEST_SESSION_PREFIX.
|
||||||
|
*/
|
||||||
|
export async function sweepTestSessions(): Promise<number> {
|
||||||
|
let cleaned = 0;
|
||||||
|
try {
|
||||||
|
const result = await gateway("sessions.list", {
|
||||||
|
limit: 200,
|
||||||
|
agentId: "devclaw",
|
||||||
|
});
|
||||||
|
const sessions = result.sessions as Array<{ key: string }>;
|
||||||
|
for (const session of sessions) {
|
||||||
|
if (session.key.startsWith(TEST_SESSION_PREFIX)) {
|
||||||
|
try {
|
||||||
|
await gateway("sessions.delete", {
|
||||||
|
key: session.key,
|
||||||
|
deleteTranscript: true,
|
||||||
|
});
|
||||||
|
cleaned++;
|
||||||
|
} catch {
|
||||||
|
// ignore
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
// ignore
|
||||||
|
}
|
||||||
|
return cleaned;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Mock factories ──────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Create a mock ToolContext simulating a group chat for the test project.
|
||||||
|
*/
|
||||||
|
export function makeTestContext(
|
||||||
|
groupId: string,
|
||||||
|
workspaceDir: string,
|
||||||
|
): ToolContext {
|
||||||
|
return {
|
||||||
|
config: {},
|
||||||
|
workspaceDir,
|
||||||
|
agentDir: "/tmp/devclaw-test-agent",
|
||||||
|
agentId: "devclaw",
|
||||||
|
sessionKey: `agent:devclaw:telegram:group:${groupId}`,
|
||||||
|
messageChannel: "telegram",
|
||||||
|
sandboxed: false,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Create a minimal mock OpenClawPluginApi for testing.
|
||||||
|
* Only provides the fields tools actually use: pluginConfig, logger, resolvePath.
|
||||||
|
*/
|
||||||
|
export function makeTestApi(pluginConfig?: Record<string, unknown>): any {
|
||||||
|
return {
|
||||||
|
id: "devclaw",
|
||||||
|
name: "DevClaw",
|
||||||
|
source: "test",
|
||||||
|
config: {},
|
||||||
|
pluginConfig: pluginConfig ?? {
|
||||||
|
devClawAgentIds: ["devclaw"],
|
||||||
|
models: {
|
||||||
|
junior: "anthropic/claude-haiku-4-5",
|
||||||
|
medior: "anthropic/claude-sonnet-4-5",
|
||||||
|
senior: "anthropic/claude-opus-4-5",
|
||||||
|
qa: "anthropic/claude-sonnet-4-5",
|
||||||
|
},
|
||||||
|
projectExecution: "parallel",
|
||||||
|
},
|
||||||
|
logger: {
|
||||||
|
debug: () => {},
|
||||||
|
info: () => {},
|
||||||
|
warn: () => {},
|
||||||
|
error: () => {},
|
||||||
|
},
|
||||||
|
runtime: {},
|
||||||
|
registerTool: () => {},
|
||||||
|
registerHook: () => {},
|
||||||
|
registerHttpHandler: () => {},
|
||||||
|
registerHttpRoute: () => {},
|
||||||
|
registerChannel: () => {},
|
||||||
|
registerGatewayMethod: () => {},
|
||||||
|
registerCli: () => {},
|
||||||
|
registerService: () => {},
|
||||||
|
registerProvider: () => {},
|
||||||
|
registerCommand: () => {},
|
||||||
|
resolvePath: (input: string) => input.replace("~", os.homedir()),
|
||||||
|
on: () => {},
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Workspace helpers ───────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Create a temp workspace directory with initial projects.json and role files.
|
||||||
|
* Returns the workspace path. Caller must clean up via fs.rm().
|
||||||
|
*/
|
||||||
|
export async function createTestWorkspace(opts?: {
|
||||||
|
groupId?: string;
|
||||||
|
autoChain?: boolean;
|
||||||
|
}): Promise<string> {
|
||||||
|
const groupId = opts?.groupId ?? TEST_GROUP_ID;
|
||||||
|
const autoChain = opts?.autoChain ?? false;
|
||||||
|
|
||||||
|
const tempDir = await fs.mkdtemp(
|
||||||
|
path.join(os.tmpdir(), "devclaw-scenario-test-"),
|
||||||
|
);
|
||||||
|
|
||||||
|
// Create required directories
|
||||||
|
await fs.mkdir(path.join(tempDir, "memory"), { recursive: true });
|
||||||
|
await fs.mkdir(path.join(tempDir, "roles", "default"), { recursive: true });
|
||||||
|
await fs.mkdir(path.join(tempDir, "roles", "devclaw"), {
|
||||||
|
recursive: true,
|
||||||
|
});
|
||||||
|
|
||||||
|
// Write initial projects.json
|
||||||
|
const initialData: ProjectsData = {
|
||||||
|
projects: {
|
||||||
|
[groupId]: {
|
||||||
|
name: "devclaw",
|
||||||
|
repo: `~/.openclaw/extensions/devclaw`,
|
||||||
|
groupName: "DevClaw - DevClaw",
|
||||||
|
deployUrl: "",
|
||||||
|
baseBranch: "main",
|
||||||
|
deployBranch: "main",
|
||||||
|
autoChain,
|
||||||
|
channel: "telegram",
|
||||||
|
dev: {
|
||||||
|
active: false,
|
||||||
|
issueId: null,
|
||||||
|
startTime: null,
|
||||||
|
model: null,
|
||||||
|
sessions: { junior: null, medior: null, senior: null },
|
||||||
|
},
|
||||||
|
qa: {
|
||||||
|
active: false,
|
||||||
|
issueId: null,
|
||||||
|
startTime: null,
|
||||||
|
model: null,
|
||||||
|
sessions: { qa: null },
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
};
|
||||||
|
await writeProjects(tempDir, initialData);
|
||||||
|
|
||||||
|
// Write minimal role files
|
||||||
|
await fs.writeFile(
|
||||||
|
path.join(tempDir, "roles", "default", "dev.md"),
|
||||||
|
"# DEV Worker Instructions\n\nThis is a test worker. Just acknowledge the task.\n",
|
||||||
|
);
|
||||||
|
await fs.writeFile(
|
||||||
|
path.join(tempDir, "roles", "default", "qa.md"),
|
||||||
|
"# QA Worker Instructions\n\nThis is a test QA worker. Just acknowledge the task.\n",
|
||||||
|
);
|
||||||
|
|
||||||
|
return tempDir;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Result parser ───────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Parse the result from a tool's execute() call.
|
||||||
|
* Tools return jsonResult() which wraps the payload in AgentToolResult format.
|
||||||
|
*/
|
||||||
|
export function parseToolResult(result: unknown): Record<string, unknown> {
|
||||||
|
// jsonResult returns [{ type: "text", text: JSON.stringify(payload) }]
|
||||||
|
// or { content: [{ type: "text", text: "..." }] }
|
||||||
|
if (Array.isArray(result)) {
|
||||||
|
const first = result[0];
|
||||||
|
if (first && typeof first === "object" && "text" in first) {
|
||||||
|
return JSON.parse(first.text as string);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (
|
||||||
|
result &&
|
||||||
|
typeof result === "object" &&
|
||||||
|
"content" in result
|
||||||
|
) {
|
||||||
|
const content = (result as any).content;
|
||||||
|
if (Array.isArray(content) && content[0]?.text) {
|
||||||
|
return JSON.parse(content[0].text);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
throw new Error(`Cannot parse tool result: ${JSON.stringify(result)}`);
|
||||||
|
}
|
||||||
@@ -13,7 +13,7 @@ import {
|
|||||||
deactivateWorker,
|
deactivateWorker,
|
||||||
readProjects,
|
readProjects,
|
||||||
writeProjects,
|
writeProjects,
|
||||||
} from "./projects.js";
|
} from "../lib/projects.js";
|
||||||
|
|
||||||
describe("Session persistence", () => {
|
describe("Session persistence", () => {
|
||||||
let tempDir: string;
|
let tempDir: string;
|
||||||
740
test/scenarios.test.ts
Normal file
740
test/scenarios.test.ts
Normal file
@@ -0,0 +1,740 @@
|
|||||||
|
/**
|
||||||
|
* scenarios.test.ts — Scenario-based integration tests for DevClaw plugin tools.
|
||||||
|
*
|
||||||
|
* Tests the full tool pipeline in realistic sequences against real gateway + GitHub.
|
||||||
|
* Each scenario exercises multiple tools in order, verifying BOTH return values
|
||||||
|
* AND actual side effects (session existence, issue labels, projects.json state).
|
||||||
|
*
|
||||||
|
* Prerequisites:
|
||||||
|
* - OpenClaw gateway running
|
||||||
|
* - `gh` CLI authenticated with access to laurentenhoor/devclaw
|
||||||
|
* - `openclaw` CLI in PATH
|
||||||
|
*
|
||||||
|
* Run with: npm test
|
||||||
|
*/
|
||||||
|
import { describe, it, before, after } from "node:test";
|
||||||
|
import assert from "node:assert";
|
||||||
|
import fs from "node:fs/promises";
|
||||||
|
|
||||||
|
import { createTaskCreateTool } from "../lib/tools/task-create.js";
|
||||||
|
import { createTaskPickupTool } from "../lib/tools/task-pickup.js";
|
||||||
|
import { createTaskCompleteTool } from "../lib/tools/task-complete.js";
|
||||||
|
import { createQueueStatusTool } from "../lib/tools/queue-status.js";
|
||||||
|
import { createSessionHealthTool } from "../lib/tools/session-health.js";
|
||||||
|
import { readProjects, writeProjects } from "../lib/projects.js";
|
||||||
|
import { resolveModel } from "../lib/tiers.js";
|
||||||
|
import {
|
||||||
|
TestCleanup,
|
||||||
|
TEST_GROUP_ID,
|
||||||
|
TEST_REPO,
|
||||||
|
createTestWorkspace,
|
||||||
|
gateway,
|
||||||
|
getIssueLabels,
|
||||||
|
getIssueState,
|
||||||
|
makeTestApi,
|
||||||
|
makeTestContext,
|
||||||
|
parseToolResult,
|
||||||
|
sessionExists,
|
||||||
|
sweepTestSessions,
|
||||||
|
} from "./helpers.js";
|
||||||
|
|
||||||
|
// ── Suite-level setup ───────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
describe("DevClaw Scenario Tests", { timeout: 240_000 }, () => {
|
||||||
|
before(async () => {
|
||||||
|
// Verify gateway is accessible
|
||||||
|
try {
|
||||||
|
await gateway("sessions.list", { limit: 1 });
|
||||||
|
} catch (err) {
|
||||||
|
throw new Error(
|
||||||
|
`Gateway not accessible — cannot run integration tests: ${(err as Error).message}. ` +
|
||||||
|
`Ensure 'openclaw' is in PATH and gateway is running.`,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Verify gh CLI is authenticated
|
||||||
|
try {
|
||||||
|
const { execFile } = await import("node:child_process");
|
||||||
|
const { promisify } = await import("node:util");
|
||||||
|
const execFileAsync = promisify(execFile);
|
||||||
|
await execFileAsync("gh", ["auth", "status"], { timeout: 10_000 });
|
||||||
|
} catch (err) {
|
||||||
|
throw new Error(
|
||||||
|
`GitHub CLI not authenticated — cannot run integration tests: ${(err as Error).message}`,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Sweep leftover test sessions from previous failed runs
|
||||||
|
await sweepTestSessions();
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Scenario 1: Full DEV lifecycle ──────────────────────────────────────
|
||||||
|
|
||||||
|
describe("Scenario 1: Full DEV lifecycle", () => {
|
||||||
|
const cleanup = new TestCleanup();
|
||||||
|
let workspaceDir: string;
|
||||||
|
let api: ReturnType<typeof makeTestApi>;
|
||||||
|
let ctx: ReturnType<typeof makeTestContext>;
|
||||||
|
|
||||||
|
let createdIssueId: number;
|
||||||
|
let spawnedSessionKey: string;
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
workspaceDir = await createTestWorkspace({ autoChain: false });
|
||||||
|
api = makeTestApi();
|
||||||
|
ctx = makeTestContext(TEST_GROUP_ID, workspaceDir);
|
||||||
|
});
|
||||||
|
|
||||||
|
after(async () => {
|
||||||
|
await cleanup.cleanAll();
|
||||||
|
await fs.rm(workspaceDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 1: task_create creates a test issue with To Do label", async () => {
|
||||||
|
const tool = createTaskCreateTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
title: "[TEST] Scenario 1 — DEV lifecycle test",
|
||||||
|
description:
|
||||||
|
"Automated integration test. This issue will be cleaned up automatically.",
|
||||||
|
label: "To Do",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.success, true, "task_create should succeed");
|
||||||
|
assert.ok(result.issue, "Should return issue object");
|
||||||
|
createdIssueId = (result.issue as any).id as number;
|
||||||
|
assert.ok(createdIssueId > 0, "Issue ID should be positive");
|
||||||
|
cleanup.trackIssue(TEST_REPO, createdIssueId);
|
||||||
|
|
||||||
|
// Verify side effect: issue has "To Do" label in GitHub
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, createdIssueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("To Do"),
|
||||||
|
`Issue should have "To Do" label, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 2: task_pickup spawns a worker session", async () => {
|
||||||
|
const tool = createTaskPickupTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
issueId: createdIssueId,
|
||||||
|
role: "dev",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
model: "junior",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.success, true, `task_pickup should succeed: ${result.error ?? ""}`);
|
||||||
|
assert.strictEqual(result.sessionAction, "spawn", "Should spawn new session");
|
||||||
|
|
||||||
|
// task_pickup stores the session key in projects.json, not in the result
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
assert.ok(project.dev.sessions.junior, "Should have session key in projects.json");
|
||||||
|
spawnedSessionKey = project.dev.sessions.junior as string;
|
||||||
|
|
||||||
|
// Track for cleanup
|
||||||
|
cleanup.trackSession(spawnedSessionKey);
|
||||||
|
|
||||||
|
// Verify side effect: session exists in gateway
|
||||||
|
const exists = await sessionExists(spawnedSessionKey);
|
||||||
|
assert.ok(exists, `Session ${spawnedSessionKey} should exist in gateway`);
|
||||||
|
|
||||||
|
// Verify side effect: issue label transitioned to "Doing"
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, createdIssueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("Doing"),
|
||||||
|
`Issue should have "Doing" label after pickup, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: worker is active
|
||||||
|
assert.strictEqual(project.dev.active, true, "Worker should be active");
|
||||||
|
assert.strictEqual(
|
||||||
|
project.dev.issueId,
|
||||||
|
String(createdIssueId),
|
||||||
|
"Worker should have correct issue ID",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 3: task_complete (dev done) transitions to To Test", async () => {
|
||||||
|
const tool = createTaskCompleteTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
role: "dev",
|
||||||
|
result: "done",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
summary: "Test task completed by integration test",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.success, true, `task_complete should succeed: ${result.error ?? ""}`);
|
||||||
|
assert.ok(
|
||||||
|
(result.labelTransition as string)?.includes("To Test"),
|
||||||
|
`Label should transition to "To Test", got: ${result.labelTransition}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: issue label is now "To Test"
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, createdIssueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("To Test"),
|
||||||
|
`Issue should have "To Test" label, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: worker deactivated but session preserved
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
assert.strictEqual(project.dev.active, false, "Worker should be inactive");
|
||||||
|
assert.strictEqual(project.dev.issueId, null, "Issue ID should be cleared");
|
||||||
|
assert.strictEqual(
|
||||||
|
project.dev.sessions.junior,
|
||||||
|
spawnedSessionKey,
|
||||||
|
"Session should be PRESERVED after completion",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 4: task_pickup (qa) transitions to Testing", async () => {
|
||||||
|
const tool = createTaskPickupTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
issueId: createdIssueId,
|
||||||
|
role: "qa",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
model: "qa",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.success, true, `QA pickup should succeed: ${result.error ?? ""}`);
|
||||||
|
|
||||||
|
// Read QA session key from projects.json
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
assert.ok(project.qa.sessions.qa, "Should have QA session key in projects.json");
|
||||||
|
cleanup.trackSession(project.qa.sessions.qa as string);
|
||||||
|
|
||||||
|
// Verify side effect: issue label transitioned to "Testing"
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, createdIssueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("Testing"),
|
||||||
|
`Issue should have "Testing" label after QA pickup, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: QA worker is active
|
||||||
|
assert.strictEqual(project.qa.active, true, "QA worker should be active");
|
||||||
|
assert.strictEqual(
|
||||||
|
project.qa.issueId,
|
||||||
|
String(createdIssueId),
|
||||||
|
"QA worker should have correct issue ID",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 5: task_complete (qa pass) transitions to Done and closes issue", async () => {
|
||||||
|
const tool = createTaskCompleteTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
role: "qa",
|
||||||
|
result: "pass",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
summary: "QA passed by integration test",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.success, true, `QA complete should succeed: ${result.error ?? ""}`);
|
||||||
|
assert.ok(
|
||||||
|
(result.labelTransition as string)?.includes("Done"),
|
||||||
|
`Label should transition to "Done", got: ${result.labelTransition}`,
|
||||||
|
);
|
||||||
|
assert.strictEqual(result.issueClosed, true, "Issue should be closed");
|
||||||
|
|
||||||
|
// Verify side effect: issue label is now "Done"
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, createdIssueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("Done"),
|
||||||
|
`Issue should have "Done" label, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: issue is closed
|
||||||
|
const state = await getIssueState(TEST_REPO, createdIssueId);
|
||||||
|
assert.strictEqual(state, "CLOSED", "Issue should be closed in GitHub");
|
||||||
|
|
||||||
|
// Verify side effect: QA worker deactivated, sessions preserved
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
assert.strictEqual(project.qa.active, false, "QA worker should be inactive");
|
||||||
|
assert.strictEqual(project.qa.issueId, null, "QA issue ID should be cleared");
|
||||||
|
|
||||||
|
// DEV session should still be preserved from earlier
|
||||||
|
assert.strictEqual(
|
||||||
|
project.dev.sessions.junior,
|
||||||
|
spawnedSessionKey,
|
||||||
|
"DEV session should still be PRESERVED after full lifecycle",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Scenario 2: Queue status accuracy ─────────────────────────────────
|
||||||
|
|
||||||
|
describe("Scenario 2: Queue status accuracy", () => {
|
||||||
|
const cleanup = new TestCleanup();
|
||||||
|
let workspaceDir: string;
|
||||||
|
let api: ReturnType<typeof makeTestApi>;
|
||||||
|
let ctx: ReturnType<typeof makeTestContext>;
|
||||||
|
let issueIds: number[] = [];
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
workspaceDir = await createTestWorkspace({ autoChain: false });
|
||||||
|
api = makeTestApi();
|
||||||
|
ctx = makeTestContext(TEST_GROUP_ID, workspaceDir);
|
||||||
|
});
|
||||||
|
|
||||||
|
after(async () => {
|
||||||
|
await cleanup.cleanAll();
|
||||||
|
await fs.rm(workspaceDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 1: create 3 issues with different labels", async () => {
|
||||||
|
const tool = createTaskCreateTool(api)(ctx);
|
||||||
|
|
||||||
|
const labels = ["To Do", "To Improve", "To Test"];
|
||||||
|
for (const label of labels) {
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
title: `[TEST] Queue test — ${label}`,
|
||||||
|
description: "Automated test issue for queue_status verification.",
|
||||||
|
label,
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
assert.strictEqual(result.success, true);
|
||||||
|
const issueId = (result.issue as any).id as number;
|
||||||
|
issueIds.push(issueId);
|
||||||
|
cleanup.trackIssue(TEST_REPO, issueId);
|
||||||
|
}
|
||||||
|
|
||||||
|
assert.strictEqual(issueIds.length, 3, "Should have created 3 issues");
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 2: queue_status shows all issues in correct buckets", async () => {
|
||||||
|
// Small delay for GitHub API eventual consistency
|
||||||
|
await new Promise((r) => setTimeout(r, 2_000));
|
||||||
|
|
||||||
|
const tool = createQueueStatusTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
// The result should contain projects with queue data
|
||||||
|
const projects = result.projects as Array<{
|
||||||
|
queue: {
|
||||||
|
toImprove: Array<{ id: number }>;
|
||||||
|
toTest: Array<{ id: number }>;
|
||||||
|
toDo: Array<{ id: number }>;
|
||||||
|
};
|
||||||
|
}>;
|
||||||
|
assert.ok(projects && projects.length > 0, "Should have project data");
|
||||||
|
|
||||||
|
const queue = projects[0].queue;
|
||||||
|
assert.ok(queue, "Should have queue data");
|
||||||
|
|
||||||
|
// Verify each bucket has our test issues
|
||||||
|
const toDoIds = queue.toDo.map((i) => i.id);
|
||||||
|
const toImproveIds = queue.toImprove.map((i) => i.id);
|
||||||
|
const toTestIds = queue.toTest.map((i) => i.id);
|
||||||
|
|
||||||
|
assert.ok(
|
||||||
|
toDoIds.includes(issueIds[0]),
|
||||||
|
`"To Do" bucket should contain issue ${issueIds[0]}`,
|
||||||
|
);
|
||||||
|
assert.ok(
|
||||||
|
toImproveIds.includes(issueIds[1]),
|
||||||
|
`"To Improve" bucket should contain issue ${issueIds[1]}`,
|
||||||
|
);
|
||||||
|
assert.ok(
|
||||||
|
toTestIds.includes(issueIds[2]),
|
||||||
|
`"To Test" bucket should contain issue ${issueIds[2]}`,
|
||||||
|
);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Scenario 3: Session health detection ──────────────────────────────
|
||||||
|
|
||||||
|
describe("Scenario 3: Session health detection", () => {
|
||||||
|
let workspaceDir: string;
|
||||||
|
let api: ReturnType<typeof makeTestApi>;
|
||||||
|
let ctx: ReturnType<typeof makeTestContext>;
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
workspaceDir = await createTestWorkspace({ autoChain: false });
|
||||||
|
api = makeTestApi();
|
||||||
|
ctx = makeTestContext(TEST_GROUP_ID, workspaceDir);
|
||||||
|
});
|
||||||
|
|
||||||
|
after(async () => {
|
||||||
|
await fs.rm(workspaceDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 1: inject zombie state into projects.json", async () => {
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
|
||||||
|
// Set active=true with a dead session key (session doesn't exist in gateway)
|
||||||
|
project.dev.active = true;
|
||||||
|
project.dev.issueId = "999";
|
||||||
|
project.dev.model = "medior";
|
||||||
|
project.dev.startTime = new Date(
|
||||||
|
Date.now() - 3 * 60 * 60 * 1000,
|
||||||
|
).toISOString(); // 3 hours ago
|
||||||
|
project.dev.sessions.medior =
|
||||||
|
"agent:devclaw:subagent:dead-zombie-session-000";
|
||||||
|
|
||||||
|
await writeProjects(workspaceDir, data);
|
||||||
|
|
||||||
|
// Verify it was written
|
||||||
|
const readBack = await readProjects(workspaceDir);
|
||||||
|
assert.strictEqual(
|
||||||
|
readBack.projects[TEST_GROUP_ID].dev.active,
|
||||||
|
true,
|
||||||
|
"Zombie state should be written",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 2: session_health detects the zombie (no autoFix)", async () => {
|
||||||
|
const tool = createSessionHealthTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
autoFix: false,
|
||||||
|
activeSessions: [], // empty = zombie detection skipped, but stale_worker (3h) will be caught
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.healthy, false, "Should report unhealthy");
|
||||||
|
assert.ok(
|
||||||
|
(result.issuesFound as number) > 0,
|
||||||
|
"Should find at least one issue",
|
||||||
|
);
|
||||||
|
|
||||||
|
const issues = result.issues as Array<{
|
||||||
|
type: string;
|
||||||
|
project: string;
|
||||||
|
role: string;
|
||||||
|
}>;
|
||||||
|
const zombieIssue = issues.find(
|
||||||
|
(i) =>
|
||||||
|
i.project === "devclaw" &&
|
||||||
|
i.role === "dev",
|
||||||
|
);
|
||||||
|
assert.ok(zombieIssue, "Should detect zombie for dev worker");
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 3: session_health fixes the zombie (autoFix=true)", async () => {
|
||||||
|
const tool = createSessionHealthTool(api)(ctx);
|
||||||
|
// Provide a non-empty activeSessions list that does NOT include the zombie key.
|
||||||
|
// This enables zombie detection (requires activeSessions.length > 0).
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
autoFix: true,
|
||||||
|
activeSessions: ["agent:devclaw:subagent:some-alive-session"],
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.ok(
|
||||||
|
(result.fixesApplied as number) > 0,
|
||||||
|
"Should apply at least one fix",
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: worker is deactivated in projects.json
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
assert.strictEqual(
|
||||||
|
project.dev.active,
|
||||||
|
false,
|
||||||
|
"Worker should be deactivated after auto-fix",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Scenario 4: Auto-chain DEV → QA ──────────────────────────────────
|
||||||
|
|
||||||
|
describe("Scenario 4: Auto-chain DEV → QA", () => {
|
||||||
|
const cleanup = new TestCleanup();
|
||||||
|
let workspaceDir: string;
|
||||||
|
let api: ReturnType<typeof makeTestApi>;
|
||||||
|
let ctx: ReturnType<typeof makeTestContext>;
|
||||||
|
|
||||||
|
let issueId: number;
|
||||||
|
let devSessionKey: string;
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
workspaceDir = await createTestWorkspace({ autoChain: true });
|
||||||
|
api = makeTestApi();
|
||||||
|
ctx = makeTestContext(TEST_GROUP_ID, workspaceDir);
|
||||||
|
});
|
||||||
|
|
||||||
|
after(async () => {
|
||||||
|
await cleanup.cleanAll();
|
||||||
|
await fs.rm(workspaceDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 1: create issue and pick up for DEV", async () => {
|
||||||
|
// Create
|
||||||
|
const createTool = createTaskCreateTool(api)(ctx);
|
||||||
|
const createResult = parseToolResult(
|
||||||
|
await createTool.execute("test", {
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
title: "[TEST] Scenario 4 — Auto-chain DEV→QA",
|
||||||
|
description: "Automated test for auto-chain pipeline.",
|
||||||
|
label: "To Do",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
issueId = (createResult.issue as any).id as number;
|
||||||
|
cleanup.trackIssue(TEST_REPO, issueId);
|
||||||
|
|
||||||
|
// Pickup DEV
|
||||||
|
const pickupTool = createTaskPickupTool(api)(ctx);
|
||||||
|
const pickupResult = parseToolResult(
|
||||||
|
await pickupTool.execute("test", {
|
||||||
|
issueId,
|
||||||
|
role: "dev",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
model: "junior",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
assert.strictEqual(pickupResult.success, true, `Pickup should succeed: ${pickupResult.error ?? ""}`);
|
||||||
|
|
||||||
|
// Read session key from projects.json (not in tool result)
|
||||||
|
const pickupData = await readProjects(workspaceDir);
|
||||||
|
devSessionKey = pickupData.projects[TEST_GROUP_ID].dev.sessions.junior as string;
|
||||||
|
assert.ok(devSessionKey, "Should have session key in projects.json");
|
||||||
|
cleanup.trackSession(devSessionKey);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 2: task_complete (dev done) auto-chains to QA", async () => {
|
||||||
|
const tool = createTaskCompleteTool(api)(ctx);
|
||||||
|
const result = parseToolResult(
|
||||||
|
await tool.execute("test", {
|
||||||
|
role: "dev",
|
||||||
|
result: "done",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
summary: "DEV done, should auto-chain to QA",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(result.success, true, `Complete should succeed: ${result.error ?? ""}`);
|
||||||
|
|
||||||
|
// Check auto-chain result
|
||||||
|
const autoChain = result.autoChain as Record<string, unknown> | undefined;
|
||||||
|
if (autoChain) {
|
||||||
|
assert.strictEqual(
|
||||||
|
autoChain.dispatched,
|
||||||
|
true,
|
||||||
|
"Auto-chain should dispatch QA",
|
||||||
|
);
|
||||||
|
assert.strictEqual(autoChain.role, "qa", "Should chain to QA role");
|
||||||
|
|
||||||
|
// Track QA session for cleanup
|
||||||
|
if (autoChain.sessionKey) {
|
||||||
|
cleanup.trackSession(autoChain.sessionKey as string);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Verify issue label moved to Testing (via auto-chain)
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, issueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("Testing") || labels.includes("To Test"),
|
||||||
|
`Issue should have "Testing" or "To Test" label, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify QA worker is active in projects.json
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const project = data.projects[TEST_GROUP_ID];
|
||||||
|
if (autoChain?.dispatched) {
|
||||||
|
assert.strictEqual(
|
||||||
|
project.qa.active,
|
||||||
|
true,
|
||||||
|
"QA worker should be active after auto-chain",
|
||||||
|
);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Scenario 5: Blocked result escalation ────────────────────────────
|
||||||
|
|
||||||
|
describe("Scenario 5: Blocked result escalation", () => {
|
||||||
|
const cleanup = new TestCleanup();
|
||||||
|
let workspaceDir: string;
|
||||||
|
let api: ReturnType<typeof makeTestApi>;
|
||||||
|
let ctx: ReturnType<typeof makeTestContext>;
|
||||||
|
|
||||||
|
let issueId: number;
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
workspaceDir = await createTestWorkspace({ autoChain: false });
|
||||||
|
api = makeTestApi();
|
||||||
|
ctx = makeTestContext(TEST_GROUP_ID, workspaceDir);
|
||||||
|
});
|
||||||
|
|
||||||
|
after(async () => {
|
||||||
|
await cleanup.cleanAll();
|
||||||
|
await fs.rm(workspaceDir, { recursive: true, force: true });
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 1: create issue, pick up as DEV, then block", async () => {
|
||||||
|
// Create issue
|
||||||
|
const createTool = createTaskCreateTool(api)(ctx);
|
||||||
|
const createResult = parseToolResult(
|
||||||
|
await createTool.execute("test", {
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
title: "[TEST] Scenario 5 — DEV blocked escalation",
|
||||||
|
description: "Test that blocked DEV returns issue to queue.",
|
||||||
|
label: "To Do",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
issueId = (createResult.issue as any).id as number;
|
||||||
|
cleanup.trackIssue(TEST_REPO, issueId);
|
||||||
|
|
||||||
|
// Pickup DEV
|
||||||
|
const pickupTool = createTaskPickupTool(api)(ctx);
|
||||||
|
const pickupResult = parseToolResult(
|
||||||
|
await pickupTool.execute("test", {
|
||||||
|
issueId,
|
||||||
|
role: "dev",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
model: "junior",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
assert.strictEqual(pickupResult.success, true, `Pickup should succeed: ${pickupResult.error ?? ""}`);
|
||||||
|
|
||||||
|
// Track session for cleanup
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const sessionKey = data.projects[TEST_GROUP_ID].dev.sessions.junior as string;
|
||||||
|
if (sessionKey) cleanup.trackSession(sessionKey);
|
||||||
|
|
||||||
|
// Complete with blocked
|
||||||
|
const completeTool = createTaskCompleteTool(api)(ctx);
|
||||||
|
const completeResult = parseToolResult(
|
||||||
|
await completeTool.execute("test", {
|
||||||
|
role: "dev",
|
||||||
|
result: "blocked",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
summary: "Cannot complete — missing dependencies",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(completeResult.success, true, `Blocked should succeed: ${completeResult.error ?? ""}`);
|
||||||
|
assert.strictEqual(completeResult.labelTransition, "Doing → To Do", "Should revert to To Do");
|
||||||
|
|
||||||
|
// Verify side effect: issue label is back to "To Do"
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, issueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("To Do"),
|
||||||
|
`Issue should have "To Do" label after DEV blocked, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: worker deactivated
|
||||||
|
const refreshedData = await readProjects(workspaceDir);
|
||||||
|
const project = refreshedData.projects[TEST_GROUP_ID];
|
||||||
|
assert.strictEqual(project.dev.active, false, "DEV worker should be inactive after blocked");
|
||||||
|
});
|
||||||
|
|
||||||
|
it("step 2: pick up as QA, then block", async () => {
|
||||||
|
// First do a DEV cycle to get to "To Test"
|
||||||
|
const pickupTool = createTaskPickupTool(api)(ctx);
|
||||||
|
await pickupTool.execute("test", {
|
||||||
|
issueId,
|
||||||
|
role: "dev",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
model: "junior",
|
||||||
|
});
|
||||||
|
|
||||||
|
const completeTool = createTaskCompleteTool(api)(ctx);
|
||||||
|
await completeTool.execute("test", {
|
||||||
|
role: "dev",
|
||||||
|
result: "done",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
summary: "DEV done",
|
||||||
|
});
|
||||||
|
|
||||||
|
// Now pick up as QA
|
||||||
|
const qaPickupResult = parseToolResult(
|
||||||
|
await pickupTool.execute("test", {
|
||||||
|
issueId,
|
||||||
|
role: "qa",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
model: "qa",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
assert.strictEqual(qaPickupResult.success, true, `QA pickup should succeed: ${qaPickupResult.error ?? ""}`);
|
||||||
|
|
||||||
|
// Track QA session for cleanup
|
||||||
|
const data = await readProjects(workspaceDir);
|
||||||
|
const qaSessionKey = data.projects[TEST_GROUP_ID].qa.sessions.qa as string;
|
||||||
|
if (qaSessionKey) cleanup.trackSession(qaSessionKey);
|
||||||
|
|
||||||
|
// Complete QA with blocked
|
||||||
|
const qaCompleteResult = parseToolResult(
|
||||||
|
await completeTool.execute("test", {
|
||||||
|
role: "qa",
|
||||||
|
result: "blocked",
|
||||||
|
projectGroupId: TEST_GROUP_ID,
|
||||||
|
summary: "Cannot test — environment not available",
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
assert.strictEqual(qaCompleteResult.success, true, `QA blocked should succeed: ${qaCompleteResult.error ?? ""}`);
|
||||||
|
assert.strictEqual(qaCompleteResult.labelTransition, "Testing → To Test", "Should revert to To Test");
|
||||||
|
|
||||||
|
// Verify side effect: issue label is back to "To Test"
|
||||||
|
const labels = await getIssueLabels(TEST_REPO, issueId);
|
||||||
|
assert.ok(
|
||||||
|
labels.includes("To Test"),
|
||||||
|
`Issue should have "To Test" label after QA blocked, got: ${labels.join(", ")}`,
|
||||||
|
);
|
||||||
|
|
||||||
|
// Verify side effect: QA worker deactivated
|
||||||
|
const refreshedData = await readProjects(workspaceDir);
|
||||||
|
const project = refreshedData.projects[TEST_GROUP_ID];
|
||||||
|
assert.strictEqual(project.qa.active, false, "QA worker should be inactive after blocked");
|
||||||
|
});
|
||||||
|
});
|
||||||
|
|
||||||
|
// ── Scenario 6: Model resolution ──────────────────────────────────────
|
||||||
|
|
||||||
|
describe("Scenario 6: Model resolution", () => {
|
||||||
|
it("resolves tier names to correct model IDs", () => {
|
||||||
|
assert.strictEqual(
|
||||||
|
resolveModel("junior"),
|
||||||
|
"anthropic/claude-haiku-4-5",
|
||||||
|
);
|
||||||
|
assert.strictEqual(
|
||||||
|
resolveModel("medior"),
|
||||||
|
"anthropic/claude-sonnet-4-5",
|
||||||
|
);
|
||||||
|
assert.strictEqual(
|
||||||
|
resolveModel("senior"),
|
||||||
|
"anthropic/claude-opus-4-5",
|
||||||
|
);
|
||||||
|
assert.strictEqual(
|
||||||
|
resolveModel("qa"),
|
||||||
|
"anthropic/claude-sonnet-4-5",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("respects plugin config overrides", () => {
|
||||||
|
assert.strictEqual(
|
||||||
|
resolveModel("junior", { models: { junior: "custom/fast-model" } }),
|
||||||
|
"custom/fast-model",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
|
it("passes through raw model IDs unchanged", () => {
|
||||||
|
assert.strictEqual(
|
||||||
|
resolveModel("openai/gpt-4o"),
|
||||||
|
"openai/gpt-4o",
|
||||||
|
);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
});
|
||||||
Reference in New Issue
Block a user