Claude 089664a675 docs: rewrite 'what it looks like' to show multi-project auto-scheduling
Show two projects running overnight with heartbeat-driven dispatch,
auto-chaining, QA failures cycling back to DEV, and different developer
levels — all without human involvement. Manual mode shown as secondary.

https://claude.ai/code/session_01R3rGevPY748gP4uK2ggYag
2026-02-11 02:08:41 +00:00



DevClaw

Turn any group chat into a dev team that ships.

DevClaw is a plugin for OpenClaw that turns your orchestrator agent into a development manager. It hires developers, assigns tasks, reviews code, and keeps the pipeline moving — across as many projects as you have group chats.


What it looks like

You have two projects in two Telegram groups. You go to bed. You wake up:

── Group: "Dev - My Webapp" ──────────────────────────────

Agent:  "⚡ Sending DEV (medior) for #42: Add login page"
Agent:  "✅ DEV DONE #42 — Login page with OAuth. Moved to QA."
Agent:  "🔍 Sending QA (reviewer) for #42: Add login page"
Agent:  "🎉 QA PASS #42. Issue closed."
Agent:  "⚡ Sending DEV (junior) for #43: Fix button color on /settings"
Agent:  "✅ DEV DONE #43 — Updated to brand blue. Moved to QA."
Agent:  "🔍 Sending QA (reviewer) for #43: Fix button color on /settings"
Agent:  "❌ QA FAIL #43 — Color doesn't match dark mode. Back to DEV."
Agent:  "⚡ Sending DEV (junior) for #43: Fix button color on /settings"

── Group: "Dev - My API" ─────────────────────────────────

Agent:  "🧠 Spawning DEV (senior) for #18: Migrate auth to OAuth2"
Agent:  "✅ DEV DONE #18 — OAuth2 provider with refresh tokens. Moved to QA."
Agent:  "🔍 Sending QA (reviewer) for #18: Migrate auth to OAuth2"
Agent:  "🎉 QA PASS #18. Issue closed."
Agent:  "⚡ Sending DEV (medior) for #19: Add rate limiting to /api/search"

Two issues shipped (including the second project's OAuth2 migration), one was sent back for a fix and automatically re-dispatched, and a fourth is already in progress, all while you slept. The heartbeat scanned the queues, dispatched workers, chained DEV into QA, and chained QA failures back to DEV. No human in the loop.

You can also drive it manually:

You:    "Check the queue"
Agent:  "2 issues in To Do. DEV is idle. QA is idle."

You:    "Pick up #44 for DEV"
Agent:  "⚡ Sending DEV (medior) for #44: Refactor user profile page"

Same agent, as many groups as you want, fully isolated teams per project.


The problem DevClaw solves

OpenClaw is a great multi-agent runtime. It handles sessions, tools, channels, gateway RPC — everything you need to run AI agents. But it's a general-purpose platform. It has no opinion about how software gets built.

Without DevClaw, your orchestrator agent has to figure out on its own how to:

  • Pick the right model for the task complexity
  • Create or reuse the right worker session
  • Transition issue labels in the right order
  • Track which worker is doing what across projects
  • Chain DEV completion into QA review
  • Detect crashed workers and recover
  • Log everything for auditability

That's a lot of reasoning per task. LLMs do it imperfectly — they forget steps, corrupt state, pick the wrong model, lose session references. You end up babysitting the thing you built to avoid babysitting.

DevClaw moves all of that into deterministic plugin code. The agent says "pick up issue #42." The plugin handles the remaining steps atomically. Every time, the same way, zero reasoning tokens spent on orchestration.


Meet your team

DevClaw doesn't think in model IDs. It thinks in people.

When a task comes in, you don't configure anthropic/claude-sonnet-4-5 — you assign a medior developer. The orchestrator evaluates task complexity and picks the right person for the job:

Developers

| Level | Assigns to | Model |
| --- | --- | --- |
| Junior | Typos, CSS fixes, renames, single-file changes | Haiku |
| Medior | Features, bug fixes, multi-file changes | Sonnet |
| Senior | Architecture, migrations, system-wide refactoring | Opus |

QA

| Level | Assigns to | Model |
| --- | --- | --- |
| Reviewer | Code review, test validation, PR inspection | Sonnet |
| Tester | Manual testing, smoke tests | Haiku |

A CSS typo gets the intern. A database migration gets the architect. You're not burning Opus tokens on a color change, and you're not sending Haiku to redesign your auth system.

Every mapping is configurable — swap in any model you want per level.
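
As a rough illustration, the level-to-model mapping could be represented like this. The type names, the `resolveModel` helper, and the short model names are assumptions made for this sketch, not DevClaw's actual schema; only the default pairings come from the tables above.

```typescript
// Hypothetical types: DevClaw's real configuration schema may differ.
type DevLevel = "junior" | "medior" | "senior";
type QaLevel = "reviewer" | "tester";
type Level = DevLevel | QaLevel;

// Defaults taken from the tables above (short names, not full model IDs).
const DEFAULT_MODELS: Record<Level, string> = {
  junior: "haiku",
  medior: "sonnet",
  senior: "opus",
  reviewer: "sonnet",
  tester: "haiku",
};

// Resolve a level to a model; a per-level override wins over the default.
function resolveModel(
  level: Level,
  overrides: Partial<Record<Level, string>> = {},
): string {
  const override = overrides[level];
  return override !== undefined ? override : DEFAULT_MODELS[level];
}

console.log(resolveModel("junior"));                       // haiku
console.log(resolveModel("senior", { senior: "sonnet" })); // sonnet
```

Keeping the mapping in one table means the orchestrator only ever names a level, and swapping a model is a one-line change.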


How a task moves through the pipeline

Every issue follows the same path, no exceptions. DevClaw enforces it:

Planning → To Do → Doing → To Test → Testing → Done

stateDiagram-v2
    [*] --> Planning
    Planning --> ToDo: Ready for development

    ToDo --> Doing: DEV picks up
    Doing --> ToTest: DEV done

    ToTest --> Testing: QA picks up (or auto-chains)
    Testing --> Done: QA pass (issue closed)
    Testing --> ToImprove: QA fail (back to DEV)
    Testing --> Refining: QA needs human input

    ToImprove --> Doing: DEV fixes (or auto-chains)
    Refining --> ToDo: Human decides

    Done --> [*]

These labels live on your actual GitHub/GitLab issues. Not in some internal database — in the tool you already use. Filter by Doing in GitHub to see what's in progress. Set up a webhook on Done to trigger deploys. The issue tracker is the source of truth.
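
One way to picture the enforcement is a transition table checked before any label change. This is a sketch only: the label names and edges are copied from the state diagram above, but the table-driven representation and the `canTransition` name are assumptions. Reverts outside the diagram, such as a blocked DEV task returning to To Do, are left out.

```typescript
// Label names and edges mirror the state diagram; the data structure is hypothetical.
type Label =
  | "Planning" | "To Do" | "Doing" | "To Test"
  | "Testing" | "Done" | "To Improve" | "Refining";

const TRANSITIONS: Record<Label, Label[]> = {
  "Planning": ["To Do"],
  "To Do": ["Doing"],
  "Doing": ["To Test"],
  "To Test": ["Testing"],
  "Testing": ["Done", "To Improve", "Refining"],
  "To Improve": ["Doing"],
  "Refining": ["To Do"],
  "Done": [],
};

// True only if the diagram contains an edge from `from` to `to`.
function canTransition(from: Label, to: Label): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

console.log(canTransition("To Do", "Doing")); // true
console.log(canTransition("To Do", "Done"));  // false
```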

What "atomic" means here

When you say "pick up #42 for DEV", the plugin does all of this in one operation:

  1. Verifies the issue is in the right state
  2. Picks the developer level (or uses what you specified)
  3. Transitions the label (To Do → Doing)
  4. Creates or reuses the right worker session
  5. Dispatches the task with project-specific instructions
  6. Updates internal state
  7. Logs an audit entry

If step 4 fails, step 3 is rolled back. No half-states, no orphaned labels, no "the issue says Doing but nobody's working on it."
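
The rollback behaviour described above can be sketched as a list of steps, each with an optional compensating action. This is a generic pattern under stated assumptions, not DevClaw's implementation: the `Step` shape and `runAtomically` are hypothetical names, and the steps run synchronously to keep the example short.

```typescript
// A step pairs an action with an optional compensating undo.
interface Step {
  name: string;
  run: () => void;
  undo?: () => void;
}

// Run steps in order; if one throws, undo the completed steps in reverse order.
function runAtomically(steps: Step[]): { ok: boolean; failedAt?: string } {
  const completed: Step[] = [];
  for (const step of steps) {
    try {
      step.run();
      completed.push(step);
    } catch {
      let done = completed.pop();
      while (done) {
        if (done.undo) done.undo();
        done = completed.pop();
      }
      return { ok: false, failedAt: step.name };
    }
  }
  return { ok: true };
}

// Usage: the label transition is rolled back when session creation fails.
let label = "To Do";
const result = runAtomically([
  {
    name: "transition label",
    run: () => { label = "Doing"; },
    undo: () => { label = "To Do"; },
  },
  {
    name: "create session",
    run: () => { throw new Error("session unavailable"); },
  },
]);
console.log(result.ok, result.failedAt, label); // false create session To Do
```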


What happens behind the scenes

Workers report back themselves

When a developer finishes, they call work_finish directly — no orchestrator involved:

  • DEV "done" → label moves to To Test, QA starts automatically
  • DEV "blocked" → label moves back to To Do, task returns to queue
  • QA "pass" → label moves to Done, issue closes
  • QA "fail" → label moves to To Improve, DEV gets re-dispatched

The orchestrator doesn't need to poll, check, or coordinate. Workers are self-reporting.

Sessions accumulate context

Each developer level gets its own persistent session per project. Your medior dev that's done 5 features on my-app already knows the codebase — it doesn't re-read 50K tokens of source code every time it picks up a new task.

That's a ~40-60% token saving per task from session reuse alone.

Combined with tier selection (not using Opus when Haiku will do) and the token-free heartbeat (more on that next), DevClaw significantly reduces your token bill versus running everything through one large model.

Everything is logged

Every tool call writes an NDJSON line to audit.log:

jq 'select(.event=="work_start")' audit.log

Full trace of every task, every level selection, every label transition, every health fix. No manual logging needed.
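
If you would rather process the log in code than with jq, NDJSON is simple to parse: one JSON object per line. In the sketch below only the `event` field and the `work_start` value are taken from the jq example above; the `issue` and `level` fields, the `work_finish` event name, and the sample lines are made up for illustration.

```typescript
// Only `event` is confirmed by the jq example; other fields are hypothetical.
interface AuditEntry {
  event: string;
  [key: string]: unknown;
}

// Parse NDJSON: one JSON object per non-empty line.
function parseAuditLog(text: string): AuditEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditEntry);
}

// Made-up sample data standing in for the contents of audit.log.
const sample = [
  '{"event":"work_start","issue":42,"level":"medior"}',
  '{"event":"work_finish","issue":42}',
  '{"event":"work_start","issue":43,"level":"junior"}',
].join("\n");

const starts = parseAuditLog(sample).filter((e) => e.event === "work_start");
console.log(starts.length); // 2
```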


Automatic scheduling

DevClaw doesn't wait for you to tell it what to do next. A background heartbeat service continuously scans for available work and dispatches workers — zero LLM tokens, pure deterministic code.

The heartbeat

Every tick, the service runs two passes:

  1. Health pass — detects workers stuck for >2 hours, reverts their labels back to queue, deactivates them. Catches crashed sessions, context overflows, or workers that died without reporting back.
  2. Queue pass — scans for available tasks by priority (To Improve > To Test > To Do), fills free worker slots. DEV and QA slots are filled independently.

All CLI calls and JSON reads. Workers only consume tokens when they actually start coding or reviewing.
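
The two passes can be sketched as two pure functions. The 2-hour threshold and the priority order come from the description above; the `Worker` and `Task` shapes and the function names are assumptions. For brevity the sketch treats the queue as a single list and does not model separate DEV and QA slots.

```typescript
// Hypothetical shapes for illustration only.
type QueueLabel = "To Improve" | "To Test" | "To Do";

interface Worker { id: string; startedAt: number } // epoch milliseconds
interface Task { issue: number; label: QueueLabel }

const STALE_MS = 2 * 60 * 60 * 1000; // "stuck for >2 hours"
const PRIORITY: QueueLabel[] = ["To Improve", "To Test", "To Do"];

// Health pass: workers active longer than the threshold count as stuck.
function findStuck(workers: Worker[], now: number): Worker[] {
  return workers.filter((w) => now - w.startedAt > STALE_MS);
}

// Queue pass: order by priority, then take at most `maxPickups` tasks.
function pickTasks(queue: Task[], maxPickups: number): Task[] {
  return queue
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => PRIORITY.indexOf(a.label) - PRIORITY.indexOf(b.label))
    .slice(0, maxPickups);
}

const picked = pickTasks(
  [
    { issue: 1, label: "To Do" },
    { issue: 2, label: "To Test" },
    { issue: 3, label: "To Improve" },
  ],
  2,
);
console.log(picked.map((t) => t.issue).join(",")); // 3,2
```

Neither function needs a model call, which is the point of the heartbeat: scheduling decisions are plain comparisons over state.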

Auto-chaining

When enabled, task completions automatically trigger the next step:

  • DEV "done" → QA reviewer is dispatched immediately
  • QA "fail" → DEV is re-dispatched at the same level that originally worked on it
  • QA "pass" → issue closes, pipeline done
  • "blocked" → task returns to queue for retry, no chaining

No orchestrator involvement. The worker calls work_finish, the plugin handles the rest.
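
The chaining rules above reduce to a small decision function. This is a sketch under assumptions: `nextStep` and its parameters are hypothetical and do not reflect the real `work_finish` signature, and the label used when a QA task is blocked (To Test) is a guess, since the text only says the task returns to the queue.

```typescript
type Role = "dev" | "qa";
type Outcome = "done" | "pass" | "fail" | "blocked";

interface NextStep {
  label: string;                            // label after work_finish
  dispatch?: { role: Role; level: string }; // follow-up worker, if chaining applies
}

// `devLevel` is the developer level that originally worked on the issue.
function nextStep(
  role: Role,
  outcome: Outcome,
  devLevel: string,
  autoChain: boolean,
): NextStep {
  if (outcome === "blocked") {
    // Back to the queue, never chained. The QA queue label is an assumption.
    return { label: role === "dev" ? "To Do" : "To Test" };
  }
  if (role === "dev" && outcome === "done") {
    const step: NextStep = { label: "To Test" };
    if (autoChain) step.dispatch = { role: "qa", level: "reviewer" };
    return step;
  }
  if (role === "qa" && outcome === "fail") {
    const step: NextStep = { label: "To Improve" };
    if (autoChain) step.dispatch = { role: "dev", level: devLevel };
    return step;
  }
  if (role === "qa" && outcome === "pass") return { label: "Done" };
  throw new Error(`"${outcome}" is not a valid outcome for ${role}`);
}

console.log(nextStep("qa", "fail", "junior", true).label); // To Improve
```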

Execution modes

Each project is fully isolated — its own queue, workers, sessions, state. No cross-project contamination. Two levels of parallelism control how work gets scheduled:

  • Project-level (roleExecution) — DEV and QA work simultaneously on different tasks (default: parallel) or take turns (sequential)
  • Plugin-level (projectExecution) — all registered projects dispatch workers independently (default: parallel) or only one project runs at a time (sequential)
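
To make the two settings concrete, here is a sketch of a dispatch check. It assumes one worker slot per role per project, which the text does not state explicitly; `ProjectState` and `canDispatch` are hypothetical names.

```typescript
type Mode = "parallel" | "sequential";
type Role = "dev" | "qa";

interface ProjectState {
  name: string;
  roleExecution: Mode;
  active: Role[]; // roles that currently have a worker running
}

// Can `role` start in `project`, given what is running in all projects?
function canDispatch(
  role: Role,
  project: ProjectState,
  all: ProjectState[],
  projectExecution: Mode,
): boolean {
  // Assumed: one slot per role, so a busy role cannot take another task.
  if (project.active.indexOf(role) !== -1) return false;
  // Sequential roles: DEV and QA take turns within a project.
  if (project.roleExecution === "sequential" && project.active.length > 0) return false;
  // Sequential projects: only one project may have active workers.
  if (projectExecution === "sequential") {
    const otherBusy = all.some((p) => p.name !== project.name && p.active.length > 0);
    if (otherBusy) return false;
  }
  return true;
}

const webapp: ProjectState = { name: "my-webapp", roleExecution: "parallel", active: ["dev"] };
const api: ProjectState = { name: "my-api", roleExecution: "parallel", active: [] };
console.log(canDispatch("qa", webapp, [webapp, api], "parallel"));  // true
console.log(canDispatch("dev", api, [webapp, api], "sequential"));  // false
```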

Configuration

All scheduling behavior is configurable in openclaw.json:

{
  "plugins": {
    "entries": {
      "devclaw": {
        "config": {
          "work_heartbeat": {
            "enabled": true,
            "intervalSeconds": 60,
            "maxPickupsPerTick": 4
          },
          "projectExecution": "parallel"
        }
      }
    }
  }
}

Per-project settings live in projects.json:

{
  "-1234567890": {
    "name": "my-app",
    "autoChain": true,
    "roleExecution": "parallel"
  }
}

| Setting | Where | Default | What it controls |
| --- | --- | --- | --- |
| work_heartbeat.enabled | openclaw.json | true | Turn the heartbeat on/off |
| work_heartbeat.intervalSeconds | openclaw.json | 60 | Seconds between ticks |
| work_heartbeat.maxPickupsPerTick | openclaw.json | 4 | Max workers dispatched per tick |
| projectExecution | openclaw.json | "parallel" | All projects at once, or one at a time |
| autoChain | projects.json | false | Auto-dispatch next step on completion |
| roleExecution | projects.json | "parallel" | DEV+QA at once, or one role at a time |

See the Configuration reference for the full schema.
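
For readers who prefer types to tables, the settings and their defaults can be sketched as follows. The interfaces and the merge helpers are assumptions for illustration; how DevClaw actually loads and validates these files is not described here, so treat the Configuration reference as authoritative.

```typescript
type Mode = "parallel" | "sequential";

interface HeartbeatConfig {
  enabled: boolean;
  intervalSeconds: number;
  maxPickupsPerTick: number;
}

interface PluginConfig { work_heartbeat: HeartbeatConfig; projectExecution: Mode }
interface ProjectConfig { name: string; autoChain: boolean; roleExecution: Mode }

// Default values as listed in the settings table.
const PLUGIN_DEFAULTS: PluginConfig = {
  work_heartbeat: { enabled: true, intervalSeconds: 60, maxPickupsPerTick: 4 },
  projectExecution: "parallel",
};

// Hypothetical helper: fill in any plugin-level setting the user left out.
function withDefaults(
  input: { work_heartbeat?: Partial<HeartbeatConfig>; projectExecution?: Mode } = {},
): PluginConfig {
  return {
    work_heartbeat: { ...PLUGIN_DEFAULTS.work_heartbeat, ...input.work_heartbeat },
    projectExecution: input.projectExecution ?? PLUGIN_DEFAULTS.projectExecution,
  };
}

// Hypothetical helper: same idea for one entry of projects.json.
function projectWithDefaults(
  input: { name: string; autoChain?: boolean; roleExecution?: Mode },
): ProjectConfig {
  return {
    name: input.name,
    autoChain: input.autoChain ?? false,
    roleExecution: input.roleExecution ?? "parallel",
  };
}

console.log(withDefaults({ work_heartbeat: { intervalSeconds: 30 } }).work_heartbeat.intervalSeconds); // 30
console.log(projectWithDefaults({ name: "my-app" }).autoChain); // false
```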


Task management

Your issues stay in your tracker

DevClaw doesn't have its own task database. All task state lives in GitHub Issues or GitLab Issues — auto-detected from your git remote. The eight pipeline labels are created on your repo when you register a project. Your project manager sees progress in GitHub without knowing DevClaw exists. Your CI/CD can trigger on label changes. If you stop using DevClaw, your issues and labels stay exactly where they are.

The provider is pluggable (IssueProvider interface). GitHub and GitLab work today. Jira, Linear, or anything else just needs to implement the same interface.
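
To give a feel for what implementing a provider might involve, here is a deliberately small sketch. The source names the `IssueProvider` interface but not its methods, so every method below is hypothetical. A real provider would most likely be asynchronous, since it wraps CLI or API calls; this in-memory version is synchronous to stay short.

```typescript
interface Issue { id: number; title: string; labels: string[] }

// Hypothetical method set: the real IssueProvider interface may look different.
interface IssueProvider {
  createIssue(title: string, label: string): Issue;
  getIssue(id: number): Issue | undefined;
  transitionLabel(id: number, from: string, to: string): void;
}

// In-memory provider, for example as a test double.
class InMemoryProvider implements IssueProvider {
  private issues: Issue[] = [];

  createIssue(title: string, label: string): Issue {
    const issue: Issue = { id: this.issues.length + 1, title, labels: [label] };
    this.issues.push(issue);
    return issue;
  }

  getIssue(id: number): Issue | undefined {
    return this.issues.filter((i) => i.id === id)[0];
  }

  // Swap one state label for another, refusing if the issue is not in `from`.
  transitionLabel(id: number, from: string, to: string): void {
    const issue = this.getIssue(id);
    if (!issue) throw new Error(`issue #${id} not found`);
    if (issue.labels.indexOf(from) === -1) {
      throw new Error(`issue #${id} is not in "${from}"`);
    }
    issue.labels = issue.labels.filter((l) => l !== from).concat(to);
  }
}

const provider = new InMemoryProvider();
const created = provider.createIssue("Fix the broken OAuth redirect", "Planning");
provider.transitionLabel(created.id, "Planning", "To Do");
console.log(created.labels.join(",")); // To Do
```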

Creating, updating, and commenting

Tasks can come from anywhere — the orchestrator creates them from chat, workers file bugs they discover mid-task, or you create them directly in GitHub/GitLab:

You:    "Create an issue: fix the broken OAuth redirect"
Agent:  creates issue #43 with label "Planning"

You:    "Move #43 to To Do"
Agent:  transitions label Planning → To Do

You:    "Add a comment on #42: needs to handle the edge case for expired tokens"
Agent:  adds comment attributed to "orchestrator"

Workers can also comment during work — QA leaves review feedback, DEV posts implementation notes. Every comment carries role attribution so you know who said what.

Custom instructions per project

Each project gets instruction files that workers receive with every task they pick up:

workspace/projects/roles/
├── my-webapp/
│   ├── dev.md     "Run npm test before committing. Deploy URL: staging.example.com"
│   └── qa.md      "Check OAuth flow. Verify mobile responsiveness."
├── my-api/
│   ├── dev.md     "Run cargo test. Follow REST conventions in CONTRIBUTING.md"
│   └── qa.md      "Verify all endpoints return correct status codes."
└── default/
    ├── dev.md     (fallback for projects without custom instructions)
    └── qa.md

Deployment steps, test commands, coding standards, acceptance criteria — all injected at dispatch time, per project, per role.
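
The fallback to `default/` can be sketched as a small path resolver. The directory layout is taken from the tree above; the `resolveInstructionPath` function and the injected `exists` check are assumptions chosen so the example runs without touching the file system.

```typescript
type Role = "dev" | "qa";

// `exists` is injected so the sketch stays free of file-system calls.
function resolveInstructionPath(
  project: string,
  role: Role,
  exists: (path: string) => boolean,
): string {
  const root = "workspace/projects/roles";
  const specific = `${root}/${project}/${role}.md`;
  // Prefer the project's own file; otherwise use the shared default.
  return exists(specific) ? specific : `${root}/default/${role}.md`;
}

// Pretend only these files exist on disk.
const files = [
  "workspace/projects/roles/my-api/dev.md",
  "workspace/projects/roles/default/dev.md",
  "workspace/projects/roles/default/qa.md",
];
const fileExists = (path: string): boolean => files.indexOf(path) !== -1;

console.log(resolveInstructionPath("my-api", "dev", fileExists)); // workspace/projects/roles/my-api/dev.md
console.log(resolveInstructionPath("my-api", "qa", fileExists));  // workspace/projects/roles/default/qa.md
```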


Getting started

Prerequisites

  • OpenClaw installed (openclaw --version)
  • Node.js >= 20
  • gh CLI (GitHub) or glab CLI (GitLab), authenticated

Install

cp -r devclaw ~/.openclaw/extensions/

Set up through conversation

The easiest way to configure DevClaw is to just talk to your agent:

You:   "Help me set up DevClaw"
Agent: "I'll walk you through it. Should I use this agent as the
        orchestrator, or create a new one?"
You:   "Use this one"

Agent: "Want to bind a messaging channel?"
You:   "Telegram"

Agent: "Here are the default developer assignments:
        Junior → Haiku, Medior → Sonnet, Senior → Opus
        Reviewer → Sonnet, Tester → Haiku
        Keep these or customize?"
You:   "Keep them"

Agent: "Done. Want to register a project?"
You:   "Yes — my-app at ~/git/my-app, main branch"

Agent: "Project registered. 8 labels created on your repo.
        Role instructions scaffolded. Try: 'check the queue'"

You can also use the CLI wizard or non-interactive setup for scripted environments.


The toolbox

DevClaw gives the orchestrator 11 tools. These aren't just convenience wrappers — they're guardrails. Each tool encodes a complex multi-step operation into a single atomic call. The agent provides intent, the plugin handles mechanics. The agent physically cannot skip a label transition, forget to update state, or dispatch to the wrong session — those decisions are made by deterministic code, not LLM reasoning.

| Tool | What it does |
| --- | --- |
| work_start | Pick up a task: resolves level, transitions label, dispatches session, logs audit |
| work_finish | Complete a task: transitions label, updates state, auto-chains next step, ticks queue |
| task_create | Create a new issue (used by workers to file bugs they discover) |
| task_update | Manually change an issue's state label |
| task_comment | Add a comment to an issue (with role attribution) |
| status | Dashboard: queue counts + who's working on what |
| health | Detect zombie workers, stale sessions, state inconsistencies |
| work_heartbeat | Manually trigger a health check + queue dispatch cycle |
| project_register | One-time project setup: creates labels, scaffolds instructions, initializes state |
| setup | Agent + workspace initialization |
| onboard | Conversational setup guide |

Full parameters and usage in the Tools Reference.


Documentation

| Document | Covers |
| --- | --- |
| Architecture | System design, session model, data flow, end-to-end diagrams |
| Tools Reference | Complete reference for all 11 tools |
| Configuration | openclaw.json, projects.json, heartbeat, notifications |
| Onboarding Guide | Full step-by-step setup |
| QA Workflow | QA process and review templates |
| Context Awareness | How tools adapt to group vs. DM vs. agent context |
| Testing | Test suite, fixtures, CI/CD |
| Management Theory | The delegation model behind the design |
| Roadmap | What's coming next |

License

MIT