ParallelClaw MVP v3

Execution layer for OpenClaw: 4 patterns of parallelism,
proven by experiments and science.

4 weeks · Talk While Work ✅ Ready · Multi-Mind · Multi-Shard · Multi-Agent

Four Patterns of Parallelism

Not one "parallelism" — four distinct patterns with different economics. The user thinks in tasks, we choose the pattern.

Pattern 1 ✅ Ready

Talk While Work

Self-delegation: agent decides — direct reply or spawn sub-agent. Routing rules in AGENTS.md. The simplest pattern.

Pattern 2

Multi-Mind

/pc-ask — N models simultaneously, majority voting. Consensus for critical decisions. +17.9% GSM8K.

Pattern 3

Multi-Shard

/pc-scan — Sharded throughput. Stateless HTTP parallel calls. 43× speedup on batch tasks.

Pattern 4

Multi-Agent

/pc-do — Decomposition into stateful sub-agents with tools. Stanford: 7B beats 671B via specialization.

Key point: Talk While Work is a full-fledged pattern, not a "foundation layer." All four patterns are independent — you can use any without the others.

Pattern 1: Talk While Work ✅ Ready

Self-delegation via AGENTS.md — agent decides: fast reply or spawn sub-agent for heavy work.

How it works

Installed as an OpenClaw skill: talk-while-work
install.py adds a routing section to AGENTS.md
System embeds routing rules directly into the agent's system prompt
Agent receives 70+ routing decision examples
Before each response, the agent chooses: direct reply or sessions_spawn

Inside the .skill file

routing-examples.md — 70+ examples (greeting → direct, "build a bot" → spawn)
requirements.md: 3 tiers (Hobby / Pro / Power)
parallelclaw-patterns.md — description of all 4 patterns
Routing works immediately, zero config
True parallelism requires a config check: RPM, maxChildrenPerAgent

Direct reply (main agent)

Greetings, translations, simple questions, concept explanations, single-context debugging. Fast, cheap, no overhead.

Delegate (sub-agent)

Code generation, web research, artifacts (presentations, reports), multi-step business tasks. Heavy work runs in background.

Routing rules (few-shot examples): "hello" → direct · "translate" → direct · "build a bot" → spawn · "compare 5 DBs" → spawn · "why is it slow" → direct · "I need a script" → spawn
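
As a rough illustration of how few-shot routing examples could end up in AGENTS.md, here is a minimal sketch. The section header, the function names, and the exact text layout are assumptions made for this example; the real install.py may work differently.

```python
from pathlib import Path

# Few-shot routing examples taken from the list above: (user message, decision).
ROUTING_EXAMPLES = [
    ("hello", "direct"),
    ("translate", "direct"),
    ("build a bot", "spawn"),
    ("compare 5 DBs", "spawn"),
    ("why is it slow", "direct"),
    ("I need a script", "spawn"),
]

# Hypothetical header, used here only to keep the install idempotent.
SECTION_HEADER = "## Talk While Work routing"


def render_routing_section(examples):
    """Render routing examples as a text section for the agent's instructions."""
    lines = [
        SECTION_HEADER,
        "",
        "Before each response, choose: direct reply or sessions_spawn.",
        "",
    ]
    for message, decision in examples:
        lines.append(f'- "{message}" -> {decision}')
    return "\n".join(lines) + "\n"


def install_routing_section(agents_md: Path, examples=ROUTING_EXAMPLES) -> bool:
    """Append the section to AGENTS.md unless it is already present.

    Returns True if the file was modified.
    """
    existing = agents_md.read_text(encoding="utf-8") if agents_md.exists() else ""
    if SECTION_HEADER in existing:
        return False
    if existing and not existing.endswith("\n"):
        existing += "\n"
    glue = "\n" if existing else ""
    agents_md.write_text(existing + glue + render_routing_section(examples), encoding="utf-8")
    return True
```

Running the install twice leaves a single routing section in the file, which is the property an installer of this kind would likely want.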

Pattern 2: Multi-Mind · /pc-ask

Consensus voting — one prompt to N models simultaneously, majority voting, judge synthesis.

Mechanics

1 prompt → N models in parallel (3-6)
Each model answers independently
Majority voting for closed-form questions
LLM-judge synthesis for open-ended
Cost ×N, time ≈ the slowest of the N models
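
The fan-out and voting steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the /pc-ask implementation: ask_model is a stand-in that returns canned answers instead of calling a provider, and the model names are placeholders.

```python
import asyncio
from collections import Counter


async def ask_model(model: str, prompt: str) -> str:
    """Stand-in for a real provider call; returns a canned answer per model."""
    canned = {"model-a": "42", "model-b": "42", "model-c": "41"}
    await asyncio.sleep(0)  # yield control, as a real network call would
    return canned[model]


def majority_vote(answers):
    """Return (winner, agreement ratio) after normalizing case and whitespace.

    Ties are resolved in favor of the answer seen first.
    """
    counts = Counter(a.strip().lower() for a in answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


async def consensus(prompt, models):
    # All models are queried concurrently, so wall-clock time is roughly that
    # of the slowest model, while cost scales with the number of models.
    answers = await asyncio.gather(*(ask_model(m, prompt) for m in models))
    return majority_vote(answers)


if __name__ == "__main__":
    print(asyncio.run(consensus("What is 6 * 7?", ["model-a", "model-b", "model-c"])))
```

Exact-match voting like this only suits closed-form answers; open-ended answers would go to the judge synthesis step instead.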

Use cases

Architectural decisions (4 opinions → consensus)
Code review (3 models, 2 must agree on issue)
Legal / compliance check
When the cost of error is high

Science: Self-Consistency (Wang et al.) samples multiple reasoning paths and takes a majority vote: +17.9% on GSM8K, +11.0% on SVAMP. Debate vs Voting (arXiv:2508.17536): majority voting explains nearly all quality gains; debate is 3× more expensive and not much better.

Honest disclaimer: Multi-Mind is the most expensive pattern per query. For every one Multi-Mind task, there are 20-50 Multi-Shard tasks in real work. Don't overuse it.

Pattern 3: Multi-Shard · /pc-scan

Sharded throughput — N chunks of work as parallel stateless HTTP calls.

Mechanics

N independent chunks → parallel HTTP direct to API
Stateless (no session, no tools) — maximum speed
N = 50-5000 concurrently
AIMD adaptive concurrency: start 20, 429 → ÷2, no 429 → +5
Cost by volume, time ÷N
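
The AIMD rule above (start at 20, halve on a 429, add 5 otherwise) could look like the sketch below. The class name, the floor of 1, and the ceiling of 5000 are assumptions for illustration; only the start value, the halving, and the +5 step come from the text.

```python
class AIMDLimiter:
    """Additive-increase / multiplicative-decrease concurrency limit."""

    def __init__(self, start=20, step=5, floor=1, ceiling=5000):
        self.limit = start
        self.step = step
        self.floor = floor      # assumed lower bound so the limit never hits 0
        self.ceiling = ceiling  # assumed upper bound

    def on_batch(self, status_codes):
        """Update the limit after one batch of HTTP responses and return it."""
        if any(code == 429 for code in status_codes):
            # Rate limited: back off multiplicatively.
            self.limit = max(self.floor, self.limit // 2)
        else:
            # Clean batch: probe for more capacity additively.
            self.limit = min(self.ceiling, self.limit + self.step)
        return self.limit
```

For example, a clean first batch raises the limit from 20 to 25, and a single 429 in the next batch drops it to 12.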

Use cases

87-page PDF → summarization (8.2s vs 4 min)
500 reviews → sentiment analysis
200 files → code review batch
Mass translation / classification

Economics: 87 pages via Haiku = $0.17, 8.2s vs 4 min sequential. 43× speedup. The most underrated pattern. Batch processing revolution.

Why HTTP direct, not sub-agents: 10 HTTP calls = 26 sec. 10 sub-agents = 2-3 min. Sub-agent overhead (15-18s init) kills throughput. HTTP direct is 7.5× faster.

Pattern 4: Multi-Agent · /pc-do

Decomposition of a complex task into N full stateful sub-agents with tools, context, and memory.

Mechanics

Task → plan → N sub-agents with roles
Stateful: tools, context, filesystem
N = 3-10 (limited by overhead)
Reducer collects and synthesizes results
Expensive, powerful, for deep work

Example: Due Diligence

Agent 1: Finance & metrics
Agent 2: Team & founders
Agent 3: Competitors & market
Agent 4: Technology & IP
Agent 5: Legal risks
Merge → final report
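
The plan → sub-agents → reducer flow for this example might be wired together as follows. This is a sketch only: run_sub_agent is a stub that stands in for a real stateful sub-agent with tools, and all function names are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SubTask:
    role: str
    instruction: str


def plan(company: str):
    """Split the task into one sub-task per role from the example above."""
    roles = [
        "Finance & metrics",
        "Team & founders",
        "Competitors & market",
        "Technology & IP",
        "Legal risks",
    ]
    return [SubTask(role, f"Research {role.lower()} for {company}") for role in roles]


async def run_sub_agent(task: SubTask) -> dict:
    """Placeholder for a stateful sub-agent; returns a stub finding."""
    await asyncio.sleep(0)
    return {"role": task.role, "finding": f"[stub] {task.instruction}"}


def reduce_report(company, results):
    """Reducer: merge per-role findings into one report."""
    sections = [f"{r['role']}: {r['finding']}" for r in results]
    return f"Due diligence: {company}\n" + "\n".join(sections)


async def due_diligence(company):
    tasks = plan(company)
    # return_exceptions=True keeps one failed agent from sinking the whole run.
    outcomes = await asyncio.gather(
        *(run_sub_agent(t) for t in tasks), return_exceptions=True
    )
    results = [o for o in outcomes if not isinstance(o, BaseException)]
    return reduce_report(company, results)


if __name__ == "__main__":
    print(asyncio.run(due_diligence("Acme")))
```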

Science: Stanford Self-Play (arXiv:2604.20209) — 7B model beat 671B DeepSeek-Prover-V2 through 3-agent specialization (Solver + Conjecturer + Guide). Proof: parallel specialization beats monolithic scale.

What We Proved With Experiments

Real numbers from this chat. Measured on HydraGPT + OpenClaw.

Speed

10 sub-agents in parallel in 4.2 sec. Wall-clock speedup 3.7× vs sequential. 8/10 returned results.

Overhead

Fixed overhead per sub-agent: 15-18 sec (initialization + planning). For simple tasks, overhead exceeds useful work.

HTTP Direct vs Sub-agents

10 HTTP calls: 26 sec. 10 sub-agents: 2-3 min. HTTP direct is 7.5× faster → smart routing is mandatory.

Model Override Bug

model="..." in sessions_spawn is ignored. System picks from fallback chain. Workaround: pool-based round-robin.
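
A pool-based round-robin can be as small as the sketch below. It only shows the rotation over configured models; the class name is hypothetical, and how the chosen model is then applied to a call is not specified in this document.

```python
import itertools
import threading


class ModelPool:
    """Thread-safe round-robin over the models available in config."""

    def __init__(self, models):
        models = list(models)
        if not models:
            raise ValueError("model pool must not be empty")
        self._cycle = itertools.cycle(models)
        self._lock = threading.Lock()

    def next_model(self):
        # The lock keeps concurrent callers from racing on the iterator.
        with self._lock:
            return next(self._cycle)
```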

Thread Binding

Persistent sub-agents in Telegram direct chats are impossible. Sub-agents = fire-and-forget background workers, not chat companions.

Tools in Sub-agents

Sub-agents can use web_search, exec, read, write. Verified, contrary to what some GitHub issues report.

Conclusion: Smart routing is mandatory. Simple tasks (<30 sec) → HTTP direct. Complex (research, code gen, file ops) → sub-agents with tools. Model override bug requires pool-based routing.
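
The routing rule in this conclusion can be written down as a small decision function. The Task fields are assumptions: the sketch presumes that something upstream already estimates the duration and knows whether tools are needed.

```python
from dataclasses import dataclass

SIMPLE_TASK_SECONDS = 30  # threshold from the conclusion above


@dataclass
class Task:
    estimated_seconds: float  # assumed to come from an upstream estimate
    needs_tools: bool         # research, code generation, file operations


def choose_executor(task: Task) -> str:
    """Pick the cheapest executor that can handle the task."""
    if task.needs_tools:
        # Stateless HTTP calls have no tools, so these always need a sub-agent.
        return "sub-agent"
    if task.estimated_seconds < SIMPLE_TASK_SECONDS:
        # The 15-18 sec sub-agent overhead would dominate a task this short.
        return "http-direct"
    return "sub-agent"
```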

Scientific Foundation

Not marketing. Concrete numbers from published papers and preprints.

Self-Consistency

Multiple reasoning paths + majority voting. +17.9% GSM8K, +11.0% SVAMP vs single-path. ParallelClaw: /pc-ask prompt --models 5.

Tree-of-Thought

Parallel path exploration beats single-chain for complex tasks. /pc-do task --branches 3 --depth 2.

LLMxMapReduce

Naive chunking → aggregator noise. Good MapReduce requires structured outputs (evidence + confidence) + intelligent reduce.
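
One way to read "structured outputs + intelligent reduce" is sketched below: each map call returns a claim with evidence and a confidence score, and the reducer filters and deduplicates before aggregating. The field names and the 0.5 threshold are assumptions for illustration, not taken from the LLMxMapReduce paper.

```python
from dataclasses import dataclass


@dataclass
class ChunkResult:
    claim: str
    evidence: str      # e.g. a page reference or quoted span
    confidence: float  # 0.0 to 1.0, as reported by the map step


def reduce_results(results, min_confidence=0.5):
    """Drop low-confidence claims, merge duplicates, keep the best evidence."""
    best = {}
    for r in results:
        if r.confidence < min_confidence:
            continue  # noise from weak chunks never reaches the aggregator
        key = r.claim.strip().lower()
        if key not in best or r.confidence > best[key].confidence:
            best[key] = r
    # Highest-confidence claims first.
    return sorted(best.values(), key=lambda r: r.confidence, reverse=True)
```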

Stanford Self-Play

7B beats 671B through 3-agent specialization. Proof: parallel specialization beats monolithic scale.

Debate vs Voting (arXiv:2508.17536): Majority voting explains nearly all quality gains. Debate is 3× more expensive and not much better. Fan-out + voting = default. Debate only for adversarial use cases.

Architecture & Limitations

Honest: what works, what doesn't, and why it doesn't kill the product.

Limitation	Workaround	User-facing wording
Model override bug	Pool-based round-robin	"Any available from config"
Thread binding unavailable	Fire-and-forget subagents	"Background workers"
Sub-agent overhead 15-18s	HTTP direct for simple tasks	"Parallel for 3+ subtasks"
Sequential within session	Multiple sessions	"Multiple sessions for true parallel"

Layer 1: Skills

Markdown instructions + slash commands. Zero infrastructure, open source. /pc-ask, /pc-scan, /pc-do.

Layer 2: MCP-server

Key vault, cost ledger, connection pool (AIMD), query cache. Observability and cost control.

Layer 3: Adaptive Routing

Thompson sampling bandit. Personal benchmark. Warm-start from aggregated priors across all users.
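
A Thompson sampling bandit with warm-start priors could be sketched as below, using a Beta-Bernoulli model where each call counts as a win or a loss. The class name and the (alpha, beta) prior format are assumptions; how a win is judged is left open here.

```python
import random


class ThompsonRouter:
    """Beta-Bernoulli Thompson sampling over candidate models."""

    def __init__(self, models, priors=None, seed=None):
        # priors: optional {model: (alpha, beta)} for a warm start, for
        # example aggregated across users. Default is the uniform Beta(1, 1).
        priors = priors or {}
        self.stats = {m: list(priors.get(m, (1, 1))) for m in models}
        self.rng = random.Random(seed)

    def choose(self):
        """Sample a plausible win rate per model and pick the highest."""
        samples = {
            m: self.rng.betavariate(alpha, beta)
            for m, (alpha, beta) in self.stats.items()
        }
        return max(samples, key=samples.get)

    def update(self, model, success: bool):
        """Record the outcome of one call: alpha counts wins, beta losses."""
        self.stats[model][0 if success else 1] += 1
```

Because the choice is sampled rather than greedy, models with little data still get tried occasionally, which is what lets the router adapt to a user's own workload over time.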

Installing Talk While Work ✅ Ready

A ready-to-use skill, packaged as a .skill file. It works through AGENTS.md.

Install

Install the talk-while-work skill: install.py adds a routing section to AGENTS.md.

Zero config — routing immediately

The system prompt receives 70+ examples of when to reply directly and when to spawn. Works immediately.

Config check for parallelism

True parallelism requires: model RPM limits, maxChildrenPerAgent in gateway config.

3 tiers: Hobby / Pro / Power

Hobby: 1 model, sequential fallback. Pro: multi-model pool. Power: adaptive routing + bandit.

Key difference from other patterns: Talk While Work is a pattern inside the agent itself (self-delegation). Multi-Mind/Shard/Agent are patterns for external orchestration. Different levels, not a hierarchy.

Client Journey: 7 Acts

From landing page to daily use.

Landing Page

GIF: /pc-ask → 4 opinions + voting. Copy install command.

Install

claude plugin install parallelclaw — 10-15s. Wizard: auto-detect API keys.

First wow: /pc-ask

4 models, voting, synthesis. $0.018, 5.2s. Mass intelligence — impossible in Cursor or ChatGPT.

Second wow: /pc-scan

87-page PDF in 8.2s. Batch processing revolution.

Daily use

/pc-ask for important decisions, /pc-scan for batch, /pc-do for deep research.

/pc-stats

Saved time, win-rate, cost breakdown. Visible value.

Adaptive Routing

After 100 calls — offer to enable bandit. "For Go code, you choose Sonnet 73% of the time."

MVP: 4 Weeks

Honest scope. Each week — a deliverable for the user.

Wk 1

Foundation: primitives + infra

Provider detection + wizard + keychain vault. pc.consensus, pc.shard, pc.compose. Skills /pc-ask, /pc-scan, /pc-do. SQLite cost ledger. Landing page with GIF.
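
For the SQLite cost ledger, a minimal version might look like this sketch. The table name, the columns, and the function names are assumptions; the sample figures in the usage lines reuse numbers quoted elsewhere in this document.

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Open (or create) the ledger database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS calls (
               id INTEGER PRIMARY KEY,
               command TEXT NOT NULL,
               model TEXT NOT NULL,
               cost_usd REAL NOT NULL,
               seconds REAL NOT NULL
           )"""
    )
    return conn


def record_call(conn, command, model, cost_usd, seconds):
    """Append one call to the ledger."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO calls (command, model, cost_usd, seconds) VALUES (?, ?, ?, ?)",
            (command, model, cost_usd, seconds),
        )


def cost_by_command(conn):
    """Total spend per slash command, e.g. as input for a stats view."""
    rows = conn.execute(
        "SELECT command, ROUND(SUM(cost_usd), 6) FROM calls "
        "GROUP BY command ORDER BY command"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    ledger = open_ledger()
    record_call(ledger, "/pc-ask", "model-a", 0.018, 5.2)
    record_call(ledger, "/pc-scan", "model-b", 0.17, 8.2)
    print(cost_by_command(ledger))
```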

Wk 2

Business Templates: Due Diligence Light

/pc-dd "Company" — 7 branches, all 3 orchestration patterns. /pc-compete, /pc-research. YAML definitions. Benchmark verification.

Wk 3

Polish: errors, retry, community

Retry with exponential backoff + graceful degradation. CONTRIBUTING.md + architecture.md. GitHub labels. 30-sec demo video. ClawHub publication.
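
Retry with exponential backoff plus graceful degradation could follow the pattern sketched here. The parameter names and default values are assumptions; the fallback value stands in for whatever degraded result the caller can accept, such as a partial answer.

```python
import random
import time


def retry_with_backoff(call, attempts=4, base_delay=0.5, max_delay=8.0,
                       fallback=None, sleep=time.sleep):
    """Call `call()`, retrying on any exception with exponential backoff.

    The wait before retry n is base_delay * 2**n, capped at max_delay and
    scaled by random jitter. After the last failed attempt, `fallback` is
    returned instead of raising (graceful degradation).
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                break
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so parallel workers do not retry in sync.
            sleep(delay * random.uniform(0.5, 1.0))
    return fallback
```

The sleep parameter is injectable so the function can be tested without real waiting.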

Wk 4

Launch: first users

HN / Reddit / Habr launch. Outreach to adjacent tool authors. 5 design partners. Install-to-first-value < 10 min. /pc-stats MVP.

Links & Resources

parallelclaw.ai · GitHub · Skill: talk-while-work · Self-Consistency (Wang et al.) · Stanford Self-Play · Debate vs Voting

ParallelClaw MVP v3 · 4 weeks to first users · parallelclaw.ai