19 Commits

Author SHA1 Message Date
Lauren ten Hoor
dfeadf742a feat: make workflow states dynamic with XState-style statechart config (#147) (#160)
## Summary

Introduces a configurable workflow state machine that replaces all hardcoded
state labels. The default workflow matches current behavior exactly, ensuring
backward compatibility.

## Architecture

### lib/workflow.ts — Core workflow engine

XState-style statechart configuration:

```typescript
type StateConfig = {
  type: 'queue' | 'active' | 'hold' | 'terminal';
  role?: 'dev' | 'qa';
  label: string;
  color: string;
  priority?: number;
  on?: Record<string, TransitionTarget>;
};
```

All behavior is derived from the config:
- Queue states: `type: 'queue'`, grouped by role, ordered by priority
- Active states: `type: 'active'` — worker occupied
- Transitions: defined with optional actions (gitPull, detectPr, closeIssue, reopenIssue)
- Labels and colors: derived from state.label and state.color

### Derivation functions

- `getStateLabels()` — all labels for issue tracker sync
- `getLabelColors()` — label → color mapping
- `getQueueLabels(role)` — queue labels for a role, ordered by priority
- `getActiveLabel(role)` — the active/in-progress label for a role
- `getRevertLabel(role)` — queue label to revert to on failure
- `detectRoleFromLabel()` — detect role from a queue label
- `getCompletionRule(role, result)` — derive transition rule from config
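
A minimal sketch of how such derivation functions could read everything from the config. Only the `StateConfig` shape comes from this commit. The `TransitionTarget` shape, the `WorkflowConfig` wrapper, the explicit `workflow` parameter, the example states, and the "higher priority first" ordering are assumptions for illustration, not the contents of lib/workflow.ts.

```typescript
// Sketch only: the StateConfig shape is from this commit; everything else is assumed.
type Role = 'dev' | 'qa';
type TransitionTarget = string | { target: string; actions?: string[] }; // assumed shape

type StateConfig = {
  type: 'queue' | 'active' | 'hold' | 'terminal';
  role?: Role;
  label: string;
  color: string;
  priority?: number;
  on?: Record<string, TransitionTarget>;
};

type WorkflowConfig = { states: Record<string, StateConfig> }; // assumed wrapper

// Hypothetical config: state keys, labels, colors, and priorities are illustrative.
const exampleWorkflow: WorkflowConfig = {
  states: {
    todo: { type: 'queue', role: 'dev', label: 'To Do', color: '#428bca', priority: 1 },
    toImprove: { type: 'queue', role: 'dev', label: 'To Improve', color: '#f0ad4e', priority: 2 },
    doing: { type: 'active', role: 'dev', label: 'Doing', color: '#5cb85c' },
    toTest: { type: 'queue', role: 'qa', label: 'To Test', color: '#5bc0de', priority: 1 },
    testing: { type: 'active', role: 'qa', label: 'Testing', color: '#9b59b6' },
    done: { type: 'terminal', label: 'Done', color: '#5cb85c' },
  },
};

// Flatten the state map into a list, preserving declaration order.
function allStates(workflow: WorkflowConfig): StateConfig[] {
  return Object.keys(workflow.states).map((key) => workflow.states[key]);
}

// Queue labels for a role; assumes a larger `priority` value means "pick first".
function getQueueLabels(workflow: WorkflowConfig, role: Role): string[] {
  return allStates(workflow)
    .filter((s) => s.type === 'queue' && s.role === role)
    .sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0))
    .map((s) => s.label);
}

// The in-progress label for a role, or undefined if the config defines none.
function getActiveLabel(workflow: WorkflowConfig, role: Role): string | undefined {
  const match = allStates(workflow).filter((s) => s.type === 'active' && s.role === role)[0];
  return match ? match.label : undefined;
}

// label -> color mapping across all states.
function getLabelColors(workflow: WorkflowConfig): Record<string, string> {
  const colors: Record<string, string> = {};
  for (const s of allStates(workflow)) {
    colors[s.label] = s.color;
  }
  return colors;
}
```

Because every helper is a pure function of the config, swapping in a per-project workflow would only require passing a different object.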

## Files Changed

- **lib/workflow.ts** — NEW: workflow engine and default config
- **lib/providers/provider.ts** — deprecate STATE_LABELS, LABEL_COLORS; derive from workflow
- **lib/providers/github.ts** — use workflow config for label operations
- **lib/providers/gitlab.ts** — use workflow config for label operations
- **lib/services/pipeline.ts** — use getCompletionRule() from workflow
- **lib/services/tick.ts** — use workflow for queue/active labels
- **lib/services/health.ts** — use workflow for active/revert labels
- **lib/tools/work-start.ts** — use workflow for target label

## Backward Compatibility

- DEFAULT_WORKFLOW matches current hardcoded behavior exactly
- Deprecated exports kept for any external consumers
- No breaking changes to tool interfaces or project state

## Future Work

- Load per-project workflow overrides from projects.json
- User-facing config in projects/workflow.json
- Tool schema generation from workflow states
2026-02-13 18:50:09 +08:00
Lauren ten Hoor
24b35b3a3e fix: dispatch timeout causing missed Telegram notifications (#153) (#154)
## Problem

`dispatchTask()` shells out to `openclaw gateway call sessions.patch`, which
times out when the gateway is busy, causing:
1. Notifications never fire (they're at the end of dispatchTask)
2. Worker state may not be recorded
3. Workers run silently

## Solution (3 changes)

### 1. Make `ensureSession` fire-and-forget
The session key is deterministic, so there is no need to wait for confirmation.
The health check catches any orphaned state later.

### 2. Use runtime API for notifications instead of CLI
Pass `runtime` through opts and use direct API calls:
- `runtime.channel.telegram.sendMessageTelegram()`
- `runtime.channel.whatsapp.sendMessageWhatsApp()`
- etc.

### 3. Move notification before session dispatch
Fire workerStart/workerComplete notifications early (after label transition),
before the session calls that can time out.
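
The ordering described in changes 1 and 3 can be sketched roughly as follows. This is an illustration under assumptions, not the code in lib/dispatch.ts: `DispatchOps`, `dispatchSketch`, and the injected `notify`, `ensureSession`, and `log` callbacks are hypothetical stand-ins for the real runtime and gateway calls.

```typescript
// Hypothetical dependencies, injected so the sketch stays self-contained.
type DispatchOps = {
  notify: (message: string) => Promise<void>;           // stand-in for a runtime channel send
  ensureSession: (sessionKey: string) => Promise<void>; // stand-in for the gateway session call
  log: (message: string) => void;
};

// Start the notification first, then start the session call without awaiting it.
// Neither call can block or fail the dispatch; errors are only logged.
function dispatchSketch(ops: DispatchOps, sessionKey: string, issueId: number): void {
  // Early notification: started before the call that may time out.
  void ops.notify(`workerStart: issue #${issueId}`).catch((err: unknown) => {
    ops.log(`notify failed: ${String(err)}`);
  });

  // Fire-and-forget: the session key is deterministic, so no confirmation is awaited.
  void ops.ensureSession(sessionKey).catch((err: unknown) => {
    ops.log(`ensureSession failed: ${String(err)}`);
  });
}
```

Attaching `.catch` to each promise keeps a gateway timeout from surfacing as an unhandled rejection, while the health check remains responsible for cleaning up any orphaned state.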

## Files Changed

- lib/dispatch.ts — fire-and-forget ensureSession, early notification, accept runtime
- lib/notify.ts — use runtime API for direct channel sends
- lib/services/pipeline.ts — early notification, accept runtime
- lib/services/tick.ts — pass runtime through to dispatchTask
- lib/tool-helpers.ts — accept runtime in tickAndNotify
- lib/tools/work-start.ts — pass api.runtime to dispatchTask
- lib/tools/work-finish.ts — pass api.runtime to executeCompletion/tickAndNotify
2026-02-13 17:29:25 +08:00
Lauren ten Hoor
130c38a314 fix: blocked issues now go to Refining to prevent infinite dispatch loop (#142) 2026-02-13 17:12:29 +08:00
Lauren ten Hoor
265f82f3a9 refactor: centralize notifications in core dispatch/completion functions (#150) 2026-02-13 17:00:42 +08:00
Lauren ten Hoor
c0b3e15581 feat: include issue comments in worker task context (#148) 2026-02-13 16:49:15 +08:00
Lauren ten Hoor
825c5e6f50 feat: redesign health check to triangulate projects.json, issue label, and session state (#143) (#145)
## Changes

- Remove `activeSessions` parameter from health check (was never populated)
- Add gateway session lookup via `openclaw gateway call status`
- Add issue label lookup via `provider.getIssue(issueId)`
- Implement detection matrix with 6 issue types:
  - session_dead: active worker but session missing in gateway
  - label_mismatch: active worker but issue not in Doing/Testing
  - stale_worker: active for >2h
  - stuck_label: inactive but issue has Doing/Testing label
  - orphan_issue_id: inactive but issueId set
  - issue_gone: active but issue deleted/closed
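
The detection matrix above could be expressed along these lines. This is a sketch, not the code in lib/services/health.ts: the `WorkerSnapshot` and `Observed` types, the `detectIssues` name, and the order in which checks run are assumptions. Only the six issue type names, the Doing/Testing labels, and the 2h threshold come from the list.

```typescript
type HealthIssueType =
  | 'session_dead'
  | 'label_mismatch'
  | 'stale_worker'
  | 'stuck_label'
  | 'orphan_issue_id'
  | 'issue_gone';

// Source 1: worker state as recorded in projects.json (assumed field names).
type WorkerSnapshot = { active: boolean; issueId: string | null; startTime: string | null };

// Sources 2 and 3: what the gateway and the issue tracker report (assumed shape).
type Observed = { sessionAlive: boolean; issueExists: boolean; issueLabel: string | null };

const ACTIVE_LABELS = ['Doing', 'Testing'];
const STALE_MS = 2 * 60 * 60 * 1000; // 2h

function detectIssues(worker: WorkerSnapshot, observed: Observed, now: number): HealthIssueType[] {
  const found: HealthIssueType[] = [];
  const hasActiveLabel =
    observed.issueLabel !== null && ACTIVE_LABELS.indexOf(observed.issueLabel) !== -1;

  if (worker.active) {
    if (!observed.issueExists) {
      found.push('issue_gone');
    } else if (!hasActiveLabel) {
      found.push('label_mismatch');
    }
    if (!observed.sessionAlive) {
      found.push('session_dead');
    }
    if (worker.startTime !== null && now - Date.parse(worker.startTime) > STALE_MS) {
      found.push('stale_worker');
    }
  } else {
    if (hasActiveLabel) {
      found.push('stuck_label');
    }
    if (worker.issueId !== null) {
      found.push('orphan_issue_id');
    }
  }
  return found;
}
```

Passing `now` in as a parameter keeps the staleness check deterministic in tests.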

## Files

- lib/services/health.ts — complete rewrite with three-source triangulation
- lib/tools/health.ts — remove activeSessions param, fetch sessions from gateway
- lib/services/heartbeat.ts — remove empty activeSessions calls, pass sessions map
2026-02-13 16:20:21 +08:00
Lauren ten Hoor
83f1f1adf0 feat: implement runCommand wrapper and refactor command executions across modules 2026-02-13 10:50:35 +08:00
Lauren ten Hoor
b1961fd362 fix: send Telegram notifications for heartbeat pickups (#139) 2026-02-13 10:36:32 +08:00
Lauren ten Hoor
e4b54646da refactor: remove context awareness documentation and related code; streamline tool registration and context detection 2026-02-12 00:25:34 +08:00
Lauren ten Hoor
81543600fe refactor: remove work_heartbeat tool and related tests; update documentation and notification logic 2026-02-12 00:02:18 +08:00
Lauren ten Hoor
aaf7818c33 docs: enhance heartbeat service descriptions and CLI registration 2026-02-11 23:13:53 +08:00
Lauren ten Hoor
5df4b912c9 refactor: rename 'tier' to 'level' across the codebase
- Updated WorkerState type to use 'level' instead of 'tier'.
- Modified functions related to worker state management, including parseWorkerState, emptyWorkerState, getSessionForLevel, activateWorker, and deactivateWorker to reflect the new terminology.
- Adjusted health check logic to utilize 'level' instead of 'tier'.
- Refactored tick and setup tools to accommodate the change from 'tier' to 'level', including model configuration and workspace scaffolding.
- Updated tests to ensure consistency with the new 'level' terminology.
- Revised documentation and comments to reflect the changes in terminology from 'tier' to 'level'.
2026-02-11 03:04:17 +08:00
Lauren ten Hoor
2450181482 feat: enhance role-tier structure for models and update related configurations 2026-02-11 01:49:14 +08:00
Lauren ten Hoor
f2e71a35d8 feat: implement work heartbeat service for health checks and task dispatching
- Introduced a new heartbeat service that runs at defined intervals to perform health checks on workers and fill available task slots based on priority.
- Added a health tool to scan worker health across projects with optional auto-fix capabilities.
- Updated the status tool to provide a lightweight overview of worker states and queue counts without health checks.
- Enhanced task creation tool descriptions to clarify task state handling.
- Implemented tests for the work heartbeat logic, ensuring proper project resolution, worker state management, and task prioritization.
2026-02-11 01:04:30 +08:00
Lauren ten Hoor
3a58dde3ad fix: clear startTime when deactivating workers to prevent stale timestamps
Problem:
When workers were deactivated (task completed or fixed by health checks),
the startTime field was not being cleared. This caused:
- Inactive workers to retain stale timestamps
- Misleading duration data in projects.json
- Potential confusion in health checks and status displays

Example from projects.json:
{
  "qa": {
    "active": false,
    "issueId": null,
    "startTime": "2026-02-10T08:51:50.725Z",  // Stale!
    "tier": "qa"
  }
}

Root Cause:
The deactivateWorker() function only set active: false and issueId: null,
but did not clear startTime. Similarly, health check auto-fixes that
deactivated workers also failed to clear startTime.

Solution:
Always set startTime: null when deactivating a worker to ensure clean state.
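
A minimal sketch of the fixed behavior, assuming a `WorkerState` shape that mirrors the projects.json example above. The real deactivateWorker() in lib/projects.ts may take different arguments and persist to disk; this version is a pure function for illustration only.

```typescript
// Assumed shape, modeled on the projects.json example in this commit message.
type WorkerState = {
  active: boolean;
  issueId: string | null;
  startTime: string | null;
  tier: string;
};

// Returns a deactivated copy; startTime is cleared together with active and issueId.
function deactivateWorker(worker: WorkerState): WorkerState {
  return { ...worker, active: false, issueId: null, startTime: null };
}
```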

Changes:
1. lib/projects.ts:
   - deactivateWorker() now sets startTime: null
   - Updated function comment to document this behavior

2. lib/services/health.ts:
   - All three auto-fix paths that deactivate workers now clear startTime:
     * active_no_session fix (line 77)
     * zombie_session fix (line 98)
     * stale_worker fix (line 138)

Impact:
- Inactive workers now have clean state (startTime: null)
- Duration calculations only apply to active workers
- Health checks work with accurate data
- No stale timestamps persisting across task completions
- Complements fix from #108 (which ensures startTime is set on activation)

Together with #108:
- #108: Always SET startTime when activating worker
- #113: Always CLEAR startTime when deactivating worker
- Result: startTime accurately reflects current task duration

Addresses issue #113
2026-02-11 00:28:30 +08:00
Lauren ten Hoor
ff83c25e8c feat: implement workerStart notifications for tick pickups and enhance tick handling 2026-02-10 23:14:12 +08:00
Lauren ten Hoor
70af40e986 Refactor setup and tool helpers for improved modularity and clarity
- Moved setup logic into dedicated files: agent.ts, config.ts, index.ts, workspace.ts.
- Introduced tool-helpers.ts for shared functions across tools, reducing boilerplate.
- Updated tools (status, task-comment, task-create, task-update, work-finish, work-start) to utilize new helper functions for workspace resolution and provider creation.
- Enhanced error handling and context detection in tools.
- Improved project resolution logic to streamline tool execution.
- Added new functionality for agent creation and configuration management in setup.
2026-02-10 22:51:35 +08:00
Lauren ten Hoor
55b062ac76 refactor: replace autoChain with projectTick queue scanning
Remove hard-coded auto-chain dispatch (DEV done→QA, QA fail→DEV) and
replace with a general-purpose projectTick service that scans the queue
and fills free worker slots after every state transition.

- Create lib/services/tick.ts: consolidates shared helpers and core
  projectTick() function from duplicated code in work-start/auto-pickup
- work_finish: replaces auto-chain block with projectTick call
- work_start: adds projectTick after dispatch to fill parallel slots
- auto_pickup: delegates per-project loop to projectTick
- Remove autoChain from Project type, migration code, and project-register
- Remove scheduling config dependency from work_finish
- Net -112 lines: simpler, self-healing pipeline
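
The queue-scanning idea can be sketched as follows. This is not the projectTick() in lib/services/tick.ts: the `QueuedIssue` and `Slots` types, the `projectTickSketch` name, one slot per role, and "higher priority first" are all assumptions made for illustration.

```typescript
type Role = 'dev' | 'qa';
type QueuedIssue = { id: number; role: Role; priority: number };      // assumed shape
type Slots = Record<Role, { active: boolean; issueId: number | null }>; // assumed: one slot per role

// Scan the queue once and fill every free slot with the best candidate for its role.
// Mutates `slots` and returns the issues that were picked up.
function projectTickSketch(slots: Slots, queue: QueuedIssue[]): QueuedIssue[] {
  const picked: QueuedIssue[] = [];
  const roles: Role[] = ['dev', 'qa'];

  for (const role of roles) {
    if (slots[role].active) {
      continue; // slot is busy, nothing to fill
    }
    const candidates = queue
      .filter((issue) => issue.role === role)
      .sort((a, b) => b.priority - a.priority);
    if (candidates.length === 0) {
      continue; // nothing queued for this role
    }
    const next = candidates[0];
    slots[role] = { active: true, issueId: next.id };
    picked.push(next);
  }
  return picked;
}
```

Running the same scan after every state transition is what removes the need for hard-coded chains: a finished DEV task simply shows up as a queued QA issue on the next tick.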

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 21:46:11 +08:00
Lauren ten Hoor
d7178bb8e5 refactor: reorganize task management imports and update task handling tools
- Updated import paths for task management providers in task-comment, task-create, and task-update tools.
- Removed deprecated task-complete and task-pickup tools, replacing them with work-finish and work-start tools for improved task handling.
- Enhanced work-finish and work-start tools to streamline task completion and pickup processes, including context-aware detection and auto-scheduling features.
- Updated package.json to include build scripts and main entry point.
- Modified tsconfig.json to enable output directory, declaration files, and source maps for better TypeScript support.
2026-02-10 21:39:41 +08:00