feat: Implement context detection and onboarding tools for DevClaw

- Add context-guard.ts to detect interaction context (via-agent, direct, group) and generate guardrails.
- Introduce onboarding.ts for conversational onboarding context templates and workspace file checks.
- Enhance setup.ts to support new agent creation with channel binding and migration of existing bindings.
- Create analyze-channel-bindings.ts to analyze channel availability and detect binding conflicts.
- Implement context-test.ts for debugging context detection.
- Develop devclaw_onboard.ts for explicit onboarding tool that guides users through setup.
- Update devclaw_setup.ts to include channel binding and migration support in setup process.
- Modify project-register.ts to enforce project registration from group context and auto-populate group ID.
- Enhance queue-status.ts to provide context-aware status checks and recommendations.
- Update task tools (task-complete, task-create, task-pickup) to clarify group ID usage for Telegram/WhatsApp.
2026-02-09 18:34:45 +08:00
parent 32eb079521
commit a9a3fc3f1f
18 changed files with 1532 additions and 44 deletions
--- a/docs/CONTEXT-AWARENESS.md
+++ b/docs/CONTEXT-AWARENESS.md
@@ -0,0 +1,181 @@
+# Context-Aware DevClaw
+
+DevClaw now adapts its behavior based on how you interact with it.
+
+## Design Philosophy
+
+**One Group = One Project = One Team**
+
+DevClaw enforces strict boundaries between projects:
+- Each Telegram/WhatsApp group represents a **single project**
+- Each project has its **own dedicated dev/qa workers**
+- Project work happens **inside that project's group**
+- Setup and configuration happen **outside project groups**
+
+This design prevents:
+- ❌ Cross-project contamination (workers picking up wrong project's tasks)
+- ❌ Confusion about which project you're working on
+- ❌ Accidental registration of wrong groups
+- ❌ Setup discussions cluttering project work channels
+
+This enables:
+- ✅ Clear mental model: "This group = this project"
+- ✅ Isolated work streams: Each project progresses independently
+- ✅ Dedicated teams: Workers focus on one project at a time
+- ✅ Clean separation: Setup vs. operational work
+
+## Three Interaction Contexts
+
+### 1. **Via Another Agent** (Setup Mode)
+When you talk to your main agent (like Henk) about DevClaw:
+- ✅ Use: `devclaw_onboard`, `devclaw_setup`
+- ❌ Avoid: `task_pickup`, `queue_status` (operational tools)
+
+**Example:**
+```
+User → Henk: "Can you help me set up DevClaw?"
+Henk → Calls devclaw_onboard
+```
+
+### 2. **Direct Message to DevClaw Agent**
+When you DM the DevClaw agent directly on Telegram/WhatsApp:
+- ✅ Use: `queue_status` (all projects), `session_health` (system overview)
+- ❌ Avoid: `task_pickup` (project-specific work), setup tools
+
+**Example:**
+```
+User → DevClaw DM: "Show me the status of all projects"
+DevClaw → Calls queue_status (shows all projects)
+```
+
+### 3. **Project Group Chat**
+When you message in a Telegram/WhatsApp group bound to a project:
+- ✅ Use: `task_pickup`, `task_complete`, `task_create`, `queue_status` (auto-filtered)
+- ❌ Avoid: Setup tools, system-wide queries
+
+**Example:**
+```
+User → OpenClaw Dev Group: "@henk pick up issue #42"
+DevClaw → Calls task_pickup (only works in groups)
+```
+
+## How It Works
+
+### Context Detection
+Each tool automatically detects:
+- **Agent ID** - Is this the DevClaw agent or another agent?
+- **Message Channel** - Telegram, WhatsApp, or CLI?
+- **Session Key** - Is this a group chat or direct message?
+  - Format: `agent:{agentId}:{channel}:{type}:{id}`
+  - Telegram group: `agent:devclaw:telegram:group:-5266044536`
+  - WhatsApp group: `agent:devclaw:whatsapp:group:120363123@g.us`
+  - DM: `agent:devclaw:telegram:user:657120585`
+- **Project Binding** - Which project is this group bound to?
+
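The session key format above can be split into its parts with a few lines of code. This is a minimal sketch, not the code in `lib/context-guard.ts`: the `parseSessionKey` function and the `ParsedSessionKey` type are hypothetical names, and the sketch assumes the only chat types are `group` and `user`, as in the examples.

```typescript
// Hypothetical types and names; not taken from lib/context-guard.ts.
type ChatType = "group" | "user";

interface ParsedSessionKey {
  agentId: string;
  channel: string;
  type: ChatType;
  id: string;
}

// Parses keys of the form agent:{agentId}:{channel}:{type}:{id}.
// Returns null when the key does not follow that shape.
function parseSessionKey(sessionKey: string): ParsedSessionKey | null {
  const parts = sessionKey.split(":");
  if (parts.length < 5 || parts[0] !== "agent") return null;

  const [, agentId, channel, kind, ...rest] = parts;
  if (kind !== "group" && kind !== "user") return null;

  // Re-join the tail so an id that itself contains ":" is preserved.
  return { agentId, channel, type: kind, id: rest.join(":") };
}
```

For the Telegram group key shown above, this yields `type: "group"` and `id: "-5266044536"`; a string that does not start with `agent:` yields `null`.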
+### Guardrails
+Tools include context-aware guidance in their responses:
+```json
+{
+  "contextGuidance": "🛡️ Context: Project Group Chat (telegram)\nYou're in a Telegram group for project 'openclaw-core'.\nUse task_pickup, task_complete for project work.",
+  ...
+}
+```
+
+## Integrated Tools
+
+### ✅ `devclaw_onboard`
+- **Works best:** Via another agent or direct DM
+- **Blocks:** Group chats (setup shouldn't happen in project groups)
+
+### ✅ `queue_status`
+- **Group context:** Auto-filters to that project
+- **Direct context:** Shows all projects
+- **Via-agent context:** Suggests using devclaw_onboard instead
+
+### ✅ `task_pickup`
+- **ONLY works:** In project group chats
+- **Blocks:** DMs and via-agent (setup) conversations
+
+### ✅ `project_register`
+- **ONLY works:** In the Telegram/WhatsApp group you're registering
+- **Blocks:** DMs and via-agent conversations
+- **Auto-detects:** Group ID from current chat (projectGroupId parameter now optional)
+
+**Why this matters:**
+- **Project Isolation**: Each group = one project = one dedicated team
+- **Clear Boundaries**: Forces deliberate project registration from within the project's space
+- **Team Clarity**: You're physically in the group when binding it, making the connection explicit
+- **No Mistakes**: Impossible to accidentally register the wrong group when you're in it
+- **Natural Workflow**: "This group is for Project X" → register Project X here
+
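The per-tool rules listed above amount to a small lookup table. The sketch below is one way to express them, under the assumption that a tool is either allowed or blocked in a context; `allowedContexts` and `isToolAllowed` are hypothetical names, not the plugin's API.

```typescript
// Hypothetical sketch of the rules described above; not the plugin's real API.
type InteractionContext = "via-agent" | "direct" | "group";

const allowedContexts: Record<string, InteractionContext[]> = {
  devclaw_onboard: ["via-agent", "direct"], // blocked in project groups
  // Never blocked; in via-agent context the tool suggests devclaw_onboard instead.
  queue_status: ["via-agent", "direct", "group"],
  task_pickup: ["group"], // only works in project group chats
  project_register: ["group"], // only works in the group being registered
};

function isToolAllowed(tool: string, context: InteractionContext): boolean {
  const contexts = allowedContexts[tool];
  // Tools without an entry are not guarded in this sketch.
  return contexts === undefined || contexts.includes(context);
}
```

With this table, `isToolAllowed("task_pickup", "direct")` is `false` and `isToolAllowed("project_register", "group")` is `true`.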
+## Testing
+
+### Debug Tool
+Use `context_test` to see what context is detected:
+```
+# In any context:
+context_test
+
+# Returns:
+{
+  "detectedContext": { "type": "group", "projectName": "openclaw-core" },
+  "guardrails": "🛡️ Context: Project Group Chat..."
+}
+```
+
+### Manual Testing
+1. **Setup Mode:** Message your main agent → "Help me configure DevClaw"
+2. **Status Check:** DM DevClaw agent (Telegram/WhatsApp) → "Show me the queue"
+3. **Project Work:** Post in project group (Telegram/WhatsApp) → "@henk pick up #42"
+
+Each context should trigger different guardrails.
+
+## Configuration
+
+Add to `~/.openclaw/openclaw.json`:
+```json
+"plugins": {
+  "entries": {
+    "devclaw": {
+      "config": {
+        "devClawAgentIds": ["henk-development", "devclaw-test"],
+        "models": { ... }
+      }
+    }
+  }
+}
+```
+
+The `devClawAgentIds` array lists which agents are DevClaw orchestrators.
+
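To illustrate how `devClawAgentIds` could feed context detection, here is a minimal sketch. It assumes that any agent not in the list is treated as a via-agent (setup) conversation and that the chat type is the fourth segment of the session key; `detectContext` is a hypothetical name, and the real logic lives in `lib/context-guard.ts`.

```typescript
// Hypothetical sketch; the real logic lives in lib/context-guard.ts.
type InteractionContext = "via-agent" | "direct" | "group";

function detectContext(
  agentId: string,
  sessionKey: string,
  devClawAgentIds: string[],
): InteractionContext {
  // Agents that are not listed as orchestrators are treated as setup helpers.
  if (!devClawAgentIds.includes(agentId)) return "via-agent";

  // Assumes the session key format agent:{agentId}:{channel}:{type}:{id}.
  const chatType = sessionKey.split(":")[3];
  return chatType === "group" ? "group" : "direct";
}
```

With the configuration above, a message handled by `henk-development` in a Telegram group resolves to `"group"`, while the same message handled by an unlisted agent resolves to `"via-agent"`.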
+## Implementation Details
+
+- **Module:** [lib/context-guard.ts](../lib/context-guard.ts)
+- **Tests:** [tests/unit/context-guard.test.ts](../tests/unit/context-guard.test.ts) (15 passing)
+- **Integrated tools:** 4 key tools (`devclaw_onboard`, `queue_status`, `task_pickup`, `project_register`)
+- **Detection logic:** Checks agentId, messageChannel, sessionKey pattern matching
+
+## WhatsApp Support
+
+DevClaw **fully supports WhatsApp** groups with the same architecture as Telegram:
+
+- ✅ WhatsApp group detection via `sessionKey.includes("@g.us")`
+- ✅ Projects keyed by WhatsApp group ID (e.g., `"120363123@g.us"`)
+- ✅ Context-aware tools work identically for both channels
+- ✅ One project = one group (Telegram OR WhatsApp)
+
+**To register a WhatsApp project:**
+1. Go to the WhatsApp group chat
+2. Call `project_register` from within the group
+3. Group ID auto-detected from context
+
+The architecture treats Telegram and WhatsApp identically - the only difference is the group ID format.
+
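Because projects are keyed by group ID, resolving the project for a chat can be the same lookup for both channels. The sketch below assumes a plain object keyed by group ID and the session key format described earlier; `Project` and `findProjectForSession` are hypothetical names, not the plugin's real types.

```typescript
// Hypothetical shape; the real project entries hold more than a name.
interface Project {
  name: string;
}

function findProjectForSession(
  projects: Record<string, Project>,
  sessionKey: string,
): Project | null {
  // Assumes the key format agent:{agentId}:{channel}:group:{groupId}.
  const marker = ":group:";
  const at = sessionKey.indexOf(marker);
  if (at === -1) return null; // not a group chat

  const groupId = sessionKey.slice(at + marker.length);
  return projects[groupId] ?? null;
}
```

The same function handles a Telegram ID such as `-5266044536` and a WhatsApp ID such as `120363123@g.us`, since only the key format differs.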
+## Future Enhancements
+
+- [ ] Integrate into remaining tools (`task_complete`, `session_health`, `task_create`, `devclaw_setup`)
+- [ ] System prompt injection (requires OpenClaw core support)
+- [ ] Context-based tool filtering (hide irrelevant tools)
+- [ ] Per-project context overrides
--- a/docs/ONBOARDING.md
+++ b/docs/ONBOARDING.md
@@ -27,6 +27,20 @@ openclaw plugins list

 ### 2. Run setup

+There are three ways to set up DevClaw:
+
+#### Option A: Conversational onboarding (recommended)
+
+Call the `devclaw_onboard` tool from any agent that has the DevClaw plugin loaded. The agent will walk you through configuration step by step — asking about:
+- Agent selection (current or create new)
+- Channel binding (telegram/whatsapp/none) — for new agents only
+- Model tiers (accept defaults or customize)
+- Optional project registration
+
+The tool returns instructions that guide the agent through the Q&A-style setup conversation.
+
+#### Option B: CLI wizard
+
 ```bash
 openclaw devclaw setup
 ```
@@ -44,7 +58,7 @@ The setup wizard walks you through:
 Non-interactive mode:
 ```bash
 # Create new agent with default models
-openclaw devclaw setup --new-agent "My Dev Orchestrator" --non-interactive
+openclaw devclaw setup --new-agent "My Dev Orchestrator"

 # Configure existing agent with custom models
 openclaw devclaw setup --agent my-orchestrator \
@@ -52,9 +66,77 @@ openclaw devclaw setup --agent my-orchestrator \
  --senior "anthropic/claude-opus-4-5"
 ```

-### 3. Add the agent to the Telegram group
+#### Option C: Tool call (agent-driven)

-Add your orchestrator bot to the Telegram group for the project. The agent will now receive messages from this group and can operate on the linked project.
+**Conversational onboarding via tool:**
+```typescript
+devclaw_onboard({ mode: "first-run" })
+```
+
+The tool returns step-by-step instructions that guide the agent through the Q&A-style setup conversation.
+
+**Direct setup (skip conversation):**
+```json
+{
+  "newAgentName": "My Dev Orchestrator",
+  "channelBinding": "telegram",
+  "models": {
+    "junior": "anthropic/claude-haiku-4-5",
+    "senior": "anthropic/claude-opus-4-5"
+  }
+}
+```
+
+Pass these parameters to `devclaw_setup` to configure the agent directly, without conversational prompts.
+
+### 3. Channel binding (optional, for new agents)
+
+If you created a new agent during conversational onboarding and selected a channel binding (telegram/whatsapp), the agent is automatically bound and will receive messages from that channel. **Skip to step 4.**
+
+**Smart Migration**: If an existing agent already has a channel-wide binding (e.g., the old orchestrator receives all telegram messages), the onboarding agent will:
+1. Call `analyze_channel_bindings` to detect the conflict
+2. Ask if you want to migrate the binding from the old agent to the new one
+3. If you confirm, the binding is automatically moved — no manual config edit needed
+
+This is useful when you're replacing an old orchestrator with a new one.
+
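The conflict check in step 1 comes down to asking whether another agent already holds a channel-wide binding. The sketch below uses the binding shape from the `openclaw.json` examples in this guide; `findChannelWideOwner` is a hypothetical helper, and the actual output of `analyze_channel_bindings` may look different.

```typescript
// Binding shape as used in the openclaw.json examples in this guide.
interface Binding {
  agentId: string;
  match: { channel: string; peer?: { kind: string; id: string } };
}

// Hypothetical helper: returns the agent that already holds a
// channel-wide binding (no peer filter) for the channel, if any.
function findChannelWideOwner(bindings: Binding[], channel: string): string | null {
  const existing = bindings.find(
    (b) => b.match.channel === channel && b.match.peer === undefined,
  );
  return existing ? existing.agentId : null;
}
```

A non-null result means a new channel-wide binding would conflict, which is when the migration question in step 2 applies. Group-specific bindings are ignored by this check.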
+If you didn't bind a channel during setup, you have two options:
+
+**Option A: Manually edit `openclaw.json`** (for existing agents or post-creation binding)
+
+Add an entry to the `bindings` array:
+```json
+{
+  "bindings": [
+    {
+      "agentId": "my-orchestrator",
+      "match": {
+        "channel": "telegram"
+      }
+    }
+  ]
+}
+```
+
+For group-specific bindings:
+```json
+{
+  "agentId": "my-orchestrator",
+  "match": {
+    "channel": "telegram",
+    "peer": {
+      "kind": "group",
+      "id": "-1234567890"
+    }
+  }
+}
+```
+
+Restart OpenClaw after editing.
+
+**Option B: Add bot to Telegram/WhatsApp group**
+
+If using a channel-wide binding (no peer filter), the agent will receive all messages from that channel. Add your orchestrator bot to the relevant Telegram or WhatsApp group for the project.

 ### 4. Register your project

@@ -165,7 +247,9 @@ Change which model powers each tier in `openclaw.json`:
 | Responsibility | Who | Details |
 |---|---|---|
 | Plugin installation | You (once) | `cp -r devclaw ~/.openclaw/extensions/` |
-| Agent + workspace setup | Plugin (`devclaw setup`) | Creates agent, configures models, writes workspace files |
+| Agent + workspace setup | Plugin (`devclaw_setup`) | Creates agent, configures models, writes workspace files |
+| Channel binding analysis | Plugin (`analyze_channel_bindings`) | Detects channel conflicts, validates channel configuration |
+| Channel binding migration | Plugin (`devclaw_setup` with `migrateFrom`) | Automatically moves channel-wide bindings between agents |
 | Label setup | Plugin (`project_register`) | 8 labels, created idempotently via `IssueProvider` |
 | Role file scaffolding | Plugin (`project_register`) | Creates `roles/<project>/dev.md` and `qa.md` from defaults |
 | Project registration | Plugin (`project_register`) | Entry in `projects.json` with empty worker state |
--- a/docs/TESTING.md
+++ b/docs/TESTING.md
@@ -0,0 +1,334 @@
+# DevClaw Testing Guide
+
+Comprehensive automated testing for DevClaw onboarding and setup.
+
+## Quick Start
+
+```bash
+# Install dependencies
+npm install
+
+# Run all tests
+npm test
+
+# Run with coverage report
+npm run test:coverage
+
+# Run in watch mode (auto-rerun on changes)
+npm run test:watch
+
+# Run with UI (browser-based test explorer)
+npm run test:ui
+```
+
+## Test Coverage
+
+### Scenario 1: New User (No Prior DevClaw Setup)
+**File:** `tests/setup/new-user.test.ts`
+
+**What's tested:**
+- First-time agent creation with default models
+- Channel binding creation (telegram/whatsapp)
+- Workspace file generation (AGENTS.md, HEARTBEAT.md, roles/, memory/)
+- Plugin configuration initialization
+- Error handling: channel not configured
+- Error handling: channel disabled
+
+**Example:**
+```typescript
+// Before: openclaw.json has no DevClaw agents
+{
+  "agents": { "list": [{ "id": "main", ... }] },
+  "bindings": [],
+  "plugins": { "entries": {} }
+}
+
+// After: New orchestrator created
+{
+  "agents": {
+    "list": [
+      { "id": "main", ... },
+      { "id": "my-first-orchestrator", ... }
+    ]
+  },
+  "bindings": [
+    { "agentId": "my-first-orchestrator", "match": { "channel": "telegram" } }
+  ],
+  "plugins": {
+    "entries": {
+      "devclaw": {
+        "config": {
+          "models": {
+            "junior": "anthropic/claude-haiku-4-5",
+            "medior": "anthropic/claude-sonnet-4-5",
+            "senior": "anthropic/claude-opus-4-5",
+            "qa": "anthropic/claude-sonnet-4-5"
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+### Scenario 2: Existing User (Migration)
+**File:** `tests/setup/existing-user.test.ts`
+
+**What's tested:**
+- Channel conflict detection (existing channel-wide binding)
+- Binding migration from old agent to new agent
+- Custom model preservation during migration
+- Old agent preservation (not deleted)
+- Error handling: migration source doesn't exist
+- Error handling: migration source has no binding
+
+**Example:**
+```typescript
+// Before: Old orchestrator has telegram binding
+{
+  "agents": {
+    "list": [
+      { "id": "main", ... },
+      { "id": "old-orchestrator", ... }
+    ]
+  },
+  "bindings": [
+    { "agentId": "old-orchestrator", "match": { "channel": "telegram" } }
+  ]
+}
+
+// After: Binding migrated to new orchestrator
+{
+  "agents": {
+    "list": [
+      { "id": "main", ... },
+      { "id": "old-orchestrator", ... },
+      { "id": "new-orchestrator", ... }
+    ]
+  },
+  "bindings": [
+    { "agentId": "new-orchestrator", "match": { "channel": "telegram" } }
+  ]
+}
+```
+
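The before/after configs above correspond to a small transformation on the `bindings` array. This is a sketch of that transformation under the assumption that only channel-wide bindings are migrated; `migrateChannelBinding` is a hypothetical name and not the code in `setup.ts`.

```typescript
interface Binding {
  agentId: string;
  match: { channel: string; peer?: { kind: string; id: string } };
}

// Hypothetical sketch: moves the channel-wide binding for `channel`
// from agent `from` to agent `to` and returns a new bindings array.
function migrateChannelBinding(
  bindings: Binding[],
  channel: string,
  from: string,
  to: string,
): Binding[] {
  const isSource = (b: Binding): boolean =>
    b.agentId === from && b.match.channel === channel && b.match.peer === undefined;

  // Mirrors the "migration source has no binding" error case.
  if (!bindings.some(isSource)) {
    throw new Error(`Agent "${from}" has no channel-wide ${channel} binding`);
  }

  // Only the agent id of the source binding changes; the old agent stays defined elsewhere.
  return bindings.map((b) => (isSource(b) ? { ...b, agentId: to } : b));
}
```

Returning a new array instead of mutating the input keeps the original config available for comparison in assertions.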
+### Scenario 3: Power User (Multiple Agents)
+**File:** `tests/setup/power-user.test.ts`
+
+**What's tested:**
+- No conflicts with group-specific bindings
+- Channel-wide binding creation alongside group bindings
+- Multiple orchestrators coexisting
+- Routing logic (specific bindings win over channel-wide)
+- WhatsApp support
+- Scale testing (12+ orchestrators)
+
+**Example:**
+```typescript
+// Before: Two project orchestrators with group-specific bindings
+{
+  "agents": {
+    "list": [
+      { "id": "project-a-orchestrator", ... },
+      { "id": "project-b-orchestrator", ... }
+    ]
+  },
+  "bindings": [
+    {
+      "agentId": "project-a-orchestrator",
+      "match": { "channel": "telegram", "peer": { "kind": "group", "id": "-1001234567890" } }
+    },
+    {
+      "agentId": "project-b-orchestrator",
+      "match": { "channel": "telegram", "peer": { "kind": "group", "id": "-1009876543210" } }
+    }
+  ]
+}
+
+// After: Channel-wide orchestrator added (no conflicts)
+{
+  "agents": {
+    "list": [
+      { "id": "project-a-orchestrator", ... },
+      { "id": "project-b-orchestrator", ... },
+      { "id": "global-orchestrator", ... }
+    ]
+  },
+  "bindings": [
+    {
+      "agentId": "project-a-orchestrator",
+      "match": { "channel": "telegram", "peer": { "kind": "group", "id": "-1001234567890" } }
+    },
+    {
+      "agentId": "project-b-orchestrator",
+      "match": { "channel": "telegram", "peer": { "kind": "group", "id": "-1009876543210" } }
+    },
+    {
+      "agentId": "global-orchestrator",
+      "match": { "channel": "telegram" }  // Channel-wide (no peer)
+    }
+  ]
+}
+
+// Routing: Group messages go to specific agents, everything else goes to global
+```
+
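The routing rule in the last comment (specific bindings win over channel-wide) can be sketched as a two-pass lookup. This is an illustration only: `resolveAgent` and `IncomingMessage` are hypothetical names, and OpenClaw's actual routing may consider more fields.

```typescript
interface Binding {
  agentId: string;
  match: { channel: string; peer?: { kind: string; id: string } };
}

// Hypothetical message shape for this sketch.
interface IncomingMessage {
  channel: string;
  peer?: { kind: string; id: string };
}

function resolveAgent(bindings: Binding[], msg: IncomingMessage): string | null {
  const sameChannel = bindings.filter((b) => b.match.channel === msg.channel);

  // Pass 1: a binding whose peer matches the message is the most specific.
  const peer = msg.peer;
  if (peer !== undefined) {
    const specific = sameChannel.find(
      (b) => b.match.peer?.kind === peer.kind && b.match.peer?.id === peer.id,
    );
    if (specific) return specific.agentId;
  }

  // Pass 2: fall back to a channel-wide binding (no peer filter).
  const channelWide = sameChannel.find((b) => b.match.peer === undefined);
  return channelWide ? channelWide.agentId : null;
}
```

With the "after" config above, a message from group `-1001234567890` resolves to `project-a-orchestrator`, and a message from any other Telegram chat resolves to `global-orchestrator`.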
+## Test Architecture
+
+### Mock File System
+The tests use an in-memory mock file system (`MockFileSystem`) that simulates:
+- Reading/writing openclaw.json
+- Creating/reading workspace files
+- Tracking command executions (openclaw agents add)
+
+**Why?** Tests run in isolation without touching the real file system, making them:
+- Fast (no I/O)
+- Reliable (no file conflicts)
+- Repeatable (clean state every test)
+
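To make the isolation idea concrete, here is a minimal in-memory store in the same spirit. It is not the `MockFileSystem` from the test helpers: the class name `InMemoryConfigStore` and the `recordCommand`/`getCommands` methods are assumptions for illustration, and only the `getConfig`/`setConfig` pair mirrors calls used elsewhere in this guide.

```typescript
// Deep-copies plain JSON data so callers cannot mutate stored state by accident.
function clone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Hypothetical stand-in for a mock file system holding one config object.
class InMemoryConfigStore<T> {
  private config: T;
  private readonly commands: string[] = [];

  constructor(initial: T) {
    this.config = clone(initial);
  }

  getConfig(): T {
    return clone(this.config);
  }

  setConfig(next: T): void {
    this.config = clone(next);
  }

  // Records a command line instead of executing it.
  recordCommand(command: string): void {
    this.commands.push(command);
  }

  getCommands(): string[] {
    return [...this.commands];
  }
}
```

Because `getConfig` returns a copy, a test has to call `setConfig` for a change to take effect, which keeps each test's writes explicit.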
+### Fixtures
+Pre-built configurations for different user types:
+- `createNewUserConfig()` - Empty slate
+- `createCommonUserConfig()` - One orchestrator with binding
+- `createPowerUserConfig()` - Multiple orchestrators with group bindings
+- `createNoChannelConfig()` - Channel not configured
+- `createDisabledChannelConfig()` - Channel disabled
+
+### Assertions
+Reusable assertion helpers that make tests readable:
+```typescript
+assertAgentExists(mockFs, "my-agent", "My Agent");
+assertChannelBinding(mockFs, "my-agent", "telegram");
+assertWorkspaceFilesExist(mockFs, "my-agent");
+assertDevClawConfig(mockFs, { junior: "anthropic/claude-haiku-4-5" });
+```
+
+## CI/CD Integration
+
+### GitHub Actions
+```yaml
+name: Test
+on: [push, pull_request]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - uses: actions/setup-node@v3
+        with:
+          node-version: 20
+      - run: npm ci
+      - run: npm test
+      - run: npm run test:coverage
+      - uses: codecov/codecov-action@v3
+        with:
+          files: ./coverage/coverage-final.json
+```
+
+### GitLab CI
+```yaml
+test:
+  image: node:20
+  script:
+    - npm ci
+    - npm test
+    - npm run test:coverage
+  coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
+  artifacts:
+    reports:
+      coverage_report:
+        coverage_format: cobertura
+        path: coverage/cobertura-coverage.xml
+```
+
+## Debugging Tests
+
+### Run specific test
+```bash
+npm test -- new-user              # Run all new-user tests
+npm test -- "should create agent" # Run tests matching pattern
+```
+
+### Debug with Node inspector
+```bash
+node --inspect-brk node_modules/.bin/vitest run
+```
+
+Then open Chrome DevTools at `chrome://inspect`
+
+### View coverage report
+```bash
+npm run test:coverage
+open coverage/index.html
+```
+
+## Adding Tests
+
+### 1. Choose the right test file
+- New feature → `tests/setup/new-user.test.ts`
+- Migration feature → `tests/setup/existing-user.test.ts`
+- Multi-agent feature → `tests/setup/power-user.test.ts`
+
+### 2. Write the test
+```typescript
+import { describe, it, expect, beforeEach } from "vitest";
+import { MockFileSystem } from "../helpers/mock-fs.js";
+import { createNewUserConfig } from "../helpers/fixtures.js";
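+import { assertAgentExists } from "../helpers/assertions.js";
+
+// Local helper: number of agents in the mock config.
+const countAgents = (fs: MockFileSystem): number =>
+  fs.getConfig().agents.list.length;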
+
+describe("My new feature", () => {
+  let mockFs: MockFileSystem;
+
+  beforeEach(() => {
+    mockFs = new MockFileSystem(createNewUserConfig());
+  });
+
+  it("should do something useful", async () => {
+    // GIVEN: initial state (via fixture)
+    const beforeCount = countAgents(mockFs);
+
+    // WHEN: execute the operation
+    const config = mockFs.getConfig();
+    config.agents.list.push({
+      id: "test-agent",
+      name: "Test Agent",
+      workspace: "/home/test/.openclaw/workspace-test-agent",
+      agentDir: "/home/test/.openclaw/agents/test-agent/agent",
+    });
+    mockFs.setConfig(config);
+
+    // THEN: verify the outcome
+    assertAgentExists(mockFs, "test-agent", "Test Agent");
+    expect(countAgents(mockFs)).toBe(beforeCount + 1);
+  });
+});
+```
+
+### 3. Run your test
+```bash
+npm test -- "should do something useful"
+```
+
+## Best Practices
+
+### ✅ DO
+- Test one thing per test
+- Use descriptive test names ("should create agent with telegram binding")
+- Use fixtures for initial state
+- Use assertion helpers for readability
+- Test error cases
+
+### ❌ DON'T
+- Test implementation details (test behavior, not internals)
+- Share state between tests (use beforeEach)
+- Mock everything (only mock file system and commands)
+- Write brittle tests (avoid hard-coded UUIDs, timestamps)
+
+## Test Metrics
+
+Coverage targets:
+- **Lines:** 80%+
+- **Functions:** 90%+
+- **Branches:** 75%+
+
+Run `npm run test:coverage` to see detailed metrics.