vcro-compile
Compile orchestrator. Parallel extract, resolve, merge into the wiki.
Role
vcro-compile takes a list of PMC IDs (or NCT IDs) and runs the full extract, resolve, merge pipeline. It spawns one Sonnet subagent per paper for extraction — never loops serially in one context.
It runs on Opus. It orchestrates the fan-out but never reads papers itself (except for N=1 inline compiles). Sonnet workers do the extraction.
The load-bearing rule
One Sonnet subagent per paper. Never read multiple papers in one subagent context. Never run extract inside a loop in a single subagent.
The 50-paper wedge proved it: parallel Sonnet extract is 5-10x cheaper and 10x faster than serialized execution. 12 papers in 12 parallel subagents finishes in ~4 minutes; the same work in one serial loop takes 30+ minutes.
Scale-decision table
The right strategy depends on the number of papers. The orchestrator picks one row before starting.
N = 1
Inline in the orchestrator. No subagent. Read the paper, extract, resolve, merge. Subagent overhead is not worth it for one paper.
N = 2-10
One wave of N parallel Sonnet extracts. Default sweet spot. ~3-5 minutes wall time.
N = 11-50
Multiple waves of up to 10 parallel extracts. Read digests of wave K before launching K+1. ~5-15 minutes.
N = 50+
Shard pattern: split into batches of 50, run each as a separate wave-set. Coordinate with operator before starting. Cost will exceed $25.
Workflow steps
- Pre-flight — parse IDs, check which are already converted under
store/raw/, check the extract cache for already-extracted papers. - Extract — spawn one Sonnet subagent per paper. Each reads one
paper.md, picks 5-8 dimensions, writesfragments.json. Parallel waves of up to 10. - Resolve — one Sonnet subagent reads all fragments plus the wiki index. Decides NEW, MERGE_INTO, or AMBIGUOUS for each entity hint.
- Merge — one Sonnet subagent executes the resolution plan. Writes entity files. Hook-gated. Back-references applied.
- Post-flight — rebuild wiki index via
scripts/wiki_index.py. Log telemetry torun.jsonl.
Nesting warning
vcro-compile must run at the top level, not as a child of another agent. Claude Code's subagent system is one level deep — a subagent cannot spawn its own subagents. If vcro-os needs to compile, it reads this agent's instructions and executes the steps itself.
What it reads
- 3-5 sentence digests from extract subagents
- Extract cache at
store/runs/_cache/ - Wiki index for resolve decisions
What it does not do
- Read raw paper.md files (except N=1 inline)
- Run as a nested subagent
- Loop multiple papers in a single Sonnet context
- Spawn other Opus instances
Source: .claude/agents/vcro-compile.md