What viCRO does

Finds specimens that match what you want to do. Tells you whether they'll work for your specific assay at your specific provider. Shows you exactly which claims are verified and which need one phone call. Every number traceable to a source.

You ask

I need stool samples from IBD patients to run shotgun metagenomics, treatment-naive, at least 100 subjects.

Ten seconds later:

Can you get it? Yes, partially: 5 biobanks hold IBD stool at scale.

Best path: SPARC-IBD → Cornell Microbiome Core, ~$25K for 100 samples.

What blocks it: the treatment-naive subset count. One inquiry to CCF unlocks it.

Then a table showing every sourcing path with per-link evidence states. Green = grounded. Open = unknown, here's who to contact. No fake numbers.


Six things that are different

Every number traces to a source

Brokers say "WGBS costs $150-300/sample." We say "IMR charges $875/sample for 22Gb depth [verified: imr.bio/pricing.html]. Cornell charges $250/sample standard depth [verified: epicore.med.cornell.edu]." The buyer clicks the link and sees the same number.

The system goes and looks

Ask about a domain with zero wiki data. The system doesn't say "nothing found." It searches PubMed, ClinicalTrials.gov, the web. Visits provider pages. Extracts stated requirements. Delivers a sourcing chain from what it found. Then compiles into the wiki so the next question gets an instant answer.

The answer is a chain, not a report

Not 25 pages of prose. A chain:

specimen source (PUBLISHED) → specimen fitness (DERIVED) → provider (PUBLISHED) → cost (PUBLISHED) → timeline (open)

The buyer sees which links are solid and which need one phone call.
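
One way to picture the chain is as a list of links, each carrying its own evidence state. The sketch below is illustrative only: the `Evidence`, `Link`, and `needs_follow_up` names and the `source` field are assumptions made for this example, not viCRO's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Evidence(Enum):
    PUBLISHED = "PUBLISHED"  # stated on a source the buyer can open
    DERIVED = "DERIVED"      # inferred from published facts
    OPEN = "open"            # unknown; needs an inquiry


@dataclass
class Link:
    name: str
    state: Evidence
    source: Optional[str] = None  # hypothetical field: URL or contact


def needs_follow_up(chain: List[Link]) -> List[str]:
    """Return the names of links that are still open."""
    return [link.name for link in chain if link.state is Evidence.OPEN]


# The example chain from the text above.
chain = [
    Link("specimen source", Evidence.PUBLISHED),
    Link("specimen fitness", Evidence.DERIVED),
    Link("provider", Evidence.PUBLISHED),
    Link("cost", Evidence.PUBLISHED),
    Link("timeline", Evidence.OPEN),
]

print(needs_follow_up(chain))
```

Because the state lives on each link rather than on the chain as a whole, the open links fall out of a simple filter, and those are the ones that need a phone call.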

Fit-for-purpose is provider-specific

"Will these specimens work for my assay?" depends on who runs it. Psomagen needs >200ng and DIN>7.0. Cornell accepts standard input. EM-seq works with 10ng. The system pairs your specimen source with a specific provider and evaluates fitness against that provider's stated requirements.

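A minimal sketch of provider-specific fitness, assuming a three-way result (fit, unfit, open). The Psomagen and EM-seq thresholds come from the paragraph above; the class names, the `evaluate` function, the strict "greater than" comparison, and the specimen values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirements:
    provider: str
    min_dna_ng: Optional[float] = None  # None = no stated threshold
    min_din: Optional[float] = None


@dataclass
class Specimen:
    dna_ng: Optional[float] = None  # None = not measured
    din: Optional[float] = None


def evaluate(spec: Specimen, req: Requirements) -> str:
    """Return 'fit', 'unfit', or 'open' for one specimen/provider pair."""
    if req.min_dna_ng is None and req.min_din is None:
        return "open"  # nothing stated, so nothing to evaluate against
    result = "fit"
    pairs = ((spec.dna_ng, req.min_dna_ng), (spec.din, req.min_din))
    for measured, threshold in pairs:
        if threshold is None:
            continue  # provider states no requirement on this axis
        if measured is None:
            result = "open"  # requirement exists, value is unknown
        elif measured <= threshold:  # assumption: must strictly exceed
            return "unfit"
    return result


psomagen = Requirements("Psomagen", min_dna_ng=200.0, min_din=7.0)
em_seq = Requirements("EM-seq workflow", min_dna_ng=10.0)

sample = Specimen(dna_ng=50.0, din=7.5)      # hypothetical specimen
unmeasured = Specimen(dna_ng=500.0)          # DIN never measured

print(evaluate(sample, psomagen))      # unfit
print(evaluate(sample, em_seq))        # fit
print(evaluate(unmeasured, psomagen))  # open
```

The same specimen gets a different verdict depending on who runs the assay, and a missing measurement yields "open" rather than a guess.
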
Honest gaps over fake answers

When the system can't ground a claim: [open_question — searched psomagen.com, no DIN threshold found for stool matrix]. A gap with provenance is infinitely more useful than a confident-sounding number from training data.
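
The rule can be sketched as a single function: a value with no source never counts as verified. The `ground` function and the record layout are hypothetical, written only to illustrate the idea. The example values are the ones quoted elsewhere in this text.

```python
from typing import Dict, List, Optional


def ground(field_name: str,
           value: Optional[str],
           source_url: Optional[str],
           searched: List[str]) -> Dict[str, object]:
    """Return a verified record only when both value and source exist."""
    if value is not None and source_url is not None:
        return {"status": "verified", "field": field_name,
                "value": value, "source": source_url}
    # No source means no claim: record where we looked instead.
    return {"status": "open_question", "field": field_name,
            "searched": list(searched)}


priced = ground("WGBS price per sample", "$875",
                "imr.bio/pricing.html", ["imr.bio"])
gap = ground("DIN threshold, stool matrix", None, None, ["psomagen.com"])
unsourced = ground("WGBS price per sample", "$150-300", None, [])

print(priced["status"], gap["status"], unsourced["status"])
```

The third call shows the important case: a plausible number with no source is still recorded as an open question.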

The wiki compounds

First query in a new domain: search, score thin leads, deliver immediately, compile in background. Second query: wiki has entities, discover finds them instantly, full scoring with provenance depth. The product gets faster with every query. But the first buyer never waits.
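
A rough sketch of that flow, assuming the wiki behaves like a lookup keyed by domain. The `answer` function, the dict-based wiki, and the fake search are all stand-ins for illustration. In particular, the compile step here runs inline, whereas the text above describes it as happening in the background.

```python
from typing import Callable, Dict, List, Tuple


def answer(domain: str,
           wiki: Dict[str, List[str]],
           search: Callable[[str], List[str]]) -> Tuple[List[str], str]:
    """Serve from the wiki when possible, otherwise search and store."""
    if domain in wiki:
        return wiki[domain], "wiki"
    leads = search(domain)   # first query: go and look
    wiki[domain] = leads     # stand-in for the background compile
    return leads, "search"


calls: List[str] = []


def fake_search(domain: str) -> List[str]:
    """Hypothetical search backend that records each call."""
    calls.append(domain)
    return [f"lead for {domain}"]


wiki: Dict[str, List[str]] = {}
first = answer("IBD stool", wiki, fake_search)
second = answer("IBD stool", wiki, fake_search)

print(first[1], second[1], len(calls))  # search wiki 1
```

The second query never touches the search backend, which is the compounding effect in miniature.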


What a bounty found that a broker wouldn't tell you

Real query: 500 AD CSF samples for Olink targeted proteomics.

The cost picture. Academic core rate cards show Olink Target 96 at $113/sample, not the $40-60 that appears in training data. For 500 samples: $61K, not $25K. SomaScan 7K: $720/sample. For 500 samples: $365K, not $75K.

The blocking gate. ADNI specimens are "not for use in technology development." A commercial buyer who doesn't know this wastes 6-10 weeks on the RARC application before hitting a policy wall. The chain flags it on link 1.

The provider landscape. 9 academic cores found, 4 with published CSF-validated pricing. IU Indianapolis at $113. UCSD at $165. UH Houston at $100. The buyer picks based on real numbers, not a single vendor quote.

The feasibility evidence. 4 published studies ran Olink PEA on AD CSF at scale (797 samples in Nature Aging 2023, 463 DIAN CSF in Cell 2024). Prior art is the strongest signal that the path works.


Less is more

The aha is not more features. It's fewer lies.

No composite score. Three axes, buyer picks the weight.

No training-data fills. [open_question] over a plausible guess.

No compile gate. Search results are immediately usable.

No prose where a table works. The chain table is the recommendation.

No wiki-link to a nonexistent entity. Source URL or nothing.

Every piece of information the buyer sees is either grounded — they can verify it — or honestly flagged — they know it's unverified. The system never smooth-talks. It shows its work.


How it works

viCRO is an open-source CLI that runs on Claude Code. It builds a knowledge graph of biological specimens — who has what, where, at what quality, under what consent, at what cost.

Four phases in a loop:

Ingest: papers, trials, biobank uploads → immutable markdown

Compile: raw docs → structured wiki entities, cross-linked

Query: read the wiki, score candidates, deliver a recommendation

Lint: scan for gaps, staleness, broken links → feed back to compile

The wiki starts empty and grows with every question. 20 skills. 4 agents. MIT licensed. Your data stays on your machine.

Get started