Provenance and transparency
The system does not take sides between brokers, hospitals, and direct sources. It scores provenance quality regardless of channel.
Provenance over bypass
A broker listing with fully disclosed provenance scores higher than a direct source with an undocumented protocol. A hospital biobank with a detailed collection protocol scores higher than a commercial vendor with shallow chain documentation, even if the vendor is faster and cheaper.
The score axis is provenance depth, not channel type. Channels are surfaced as opportunity_type -- published_cohort, hospital_inventory_signal, surplus_trial_samples, broker_listed_inventory, biobank_self_reported, bounty_bundle -- but they do not weight the score.
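A minimal sketch of what "channels do not weight the score" could look like in code. The Candidate record, the rank function, and the example values are hypothetical and only illustrate the rule; the opportunity_type labels are the ones listed above.

```python
from dataclasses import dataclass

# Channel labels from the text. They are carried for display only.
OPPORTUNITY_TYPES = {
    "published_cohort",
    "hospital_inventory_signal",
    "surplus_trial_samples",
    "broker_listed_inventory",
    "biobank_self_reported",
    "bounty_bundle",
}


@dataclass(frozen=True)
class Candidate:  # hypothetical record shape
    name: str
    opportunity_type: str
    provenance_depth: float  # fraction in [0, 1]


def rank(candidates):
    """Sort by provenance depth only; opportunity_type never enters the key."""
    for c in candidates:
        if c.opportunity_type not in OPPORTUNITY_TYPES:
            raise ValueError(f"unknown opportunity_type: {c.opportunity_type}")
    return sorted(candidates, key=lambda c: c.provenance_depth, reverse=True)


# Illustrative values, not real wiki data.
ranked = rank([
    Candidate("direct-source-A", "biobank_self_reported", 0.30),
    Candidate("broker-B", "broker_listed_inventory", 0.70),
])
```

Here the broker listing ranks first because its depth is higher, not because of its channel.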
Gaps are first-class data
When the wiki cannot trace a sample back to its origin, the entity carries a low provenance_depth and the missing dimensions are listed in the "Open questions" section. The buyer sees the gap explicitly.
The system never invents provenance. It never fills a gap with a plausible guess. An entity with depth 0.3 and honest gaps is more useful than an entity with depth 0.8 and fabricated dimensions. Gaps are data. Lint queues them for follow-up.
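As a sketch of gaps treated as data, the snippet below lists uncovered dimensions instead of filling them in. The shortened checklist, the open_questions helper, and the queue shape are assumptions made for illustration, not the system's actual interface.

```python
# Shortened checklist for illustration; the names come from the 21 dimensions.
CHECKLIST = [
    "Collection protocol detail",
    "Access and consent scope",
    "Sponsor and funding",
    "Negative results",
]


def open_questions(covered, checklist=CHECKLIST):
    """Return uncovered dimensions in checklist order; nothing is guessed."""
    unknown = set(covered) - set(checklist)
    if unknown:
        raise ValueError(f"not on the checklist: {sorted(unknown)}")
    return [d for d in checklist if d not in covered]


covered = {"Collection protocol detail", "Negative results"}
gaps = open_questions(covered)

# Hypothetical follow-up queue: one (entity, gap) pair per missing dimension.
lint_queue = [("example-entity", gap) for gap in gaps]
```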
Negative results count
A cohort that documents a failed analysis is more valuable than a cohort that is silent. Negative results -- a biomarker panel that showed no signal, an assay that failed on degraded specimens, a confounding variable that washed out an effect -- carry the same evidence standard as positive results: a verbatim quote, a source ID, and an implication.
This prevents buyers from re-discovering dead ends. If the wiki records that plasma NfL showed no separation between AD and MCI in a particular cohort, the next buyer asking about NfL sees that finding before investing in the same experiment.
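The evidence standard (verbatim quote, source ID, implication) can be sketched as a record that rejects incomplete findings in either direction. The Finding class and its field names are hypothetical, and the quote and source ID below are placeholders, not real citations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:  # hypothetical record shape
    direction: str    # "positive" or "negative"
    quote: str        # verbatim quote from the source
    source_id: str
    implication: str

    def __post_init__(self):
        if self.direction not in ("positive", "negative"):
            raise ValueError(f"unknown direction: {self.direction}")
        # Same standard for both directions: all three fields are required.
        for field_name in ("quote", "source_id", "implication"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} is required")


example = Finding(
    direction="negative",
    quote="<verbatim quote from the paper>",  # placeholder
    source_id="SRC-PLACEHOLDER",              # placeholder
    implication="Plasma NfL did not separate AD from MCI in this cohort.",
)
```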
Both sides
The same wiki serves demand-side buyers and supply-side institutions. A buyer query reads the wiki and scores candidates. A supply-side onboarding run reads the same wiki and produces a draft catalog listing for the institution to review.
There is no separate "buyer view" and "supplier view" of the data. One wiki. Institutions use cards.onboarding_view to see completeness, gaps, and demand signals. Buyers use cards.buyer_view (or the default card:) to see signal, action, and risk.
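One way to picture "one wiki, two cards" is as two projections of the same entity. This is a sketch under assumptions: the entity fields and both functions are hypothetical, chosen only to mirror the completeness/gaps/demand and signal/action/risk split described above.

```python
# Hypothetical entity with illustrative values; both views read this one record.
ENTITY = {
    "name": "example-cohort",
    "provenance_depth": 0.48,
    "open_questions": ["Sponsor and funding"],
    "demand_signals": 3,
    "signal": "example signal text",
    "action": "example action text",
    "risk": "example risk text",
}


def onboarding_view(entity):
    """Institution-facing projection: completeness, gaps, demand signals."""
    keys = ("provenance_depth", "open_questions", "demand_signals")
    return {k: entity[k] for k in keys}


def buyer_view(entity):
    """Buyer-facing projection: signal, action, risk."""
    keys = ("signal", "action", "risk")
    return {k: entity[k] for k in keys}


supplier_card = onboarding_view(ENTITY)
buyer_card = buyer_view(ENTITY)
```

Neither function copies or rewrites the underlying record, so the two cards cannot drift apart.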
The 21 dimensions
The extract skill checks each paper against 21 intelligence dimensions. Provenance depth is the fraction of those dimensions the entity covers. The dimensions are:
- Real numbers
- Sample usability
- Longitudinal structure
- Confounders and exposures
- Demographic composition
- Co-modalities and multi-omics value
- Effect sizes and model performance
- Replication and validation
- Access and consent scope
- Multi-site recruitment
- Negative results
- Sponsor and funding
- Published analysis code
- Collection protocol detail
- Biospecimen retention and types
- Comparative positioning
- Regulatory and ethical framework
- Data sharing infrastructure
- Contact and collaboration
- Specimen fitness for assay
- Cost and timeline signals
An entity with 10 of 21 dimensions covered has a depth of 0.48 (10/21). An entity with 15 covered has a depth of 0.71 (15/21). The median across the current wiki is around 0.45. Lint flags entities with depth below 0.3 for priority re-compilation.
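The arithmetic can be reproduced with a short sketch. The dimension names and the 0.3 threshold come from the text; the function names are hypothetical, and the strict "below 0.3" comparison is an assumption about how the threshold is applied.

```python
DIMENSIONS = [
    "Real numbers",
    "Sample usability",
    "Longitudinal structure",
    "Confounders and exposures",
    "Demographic composition",
    "Co-modalities and multi-omics value",
    "Effect sizes and model performance",
    "Replication and validation",
    "Access and consent scope",
    "Multi-site recruitment",
    "Negative results",
    "Sponsor and funding",
    "Published analysis code",
    "Collection protocol detail",
    "Biospecimen retention and types",
    "Comparative positioning",
    "Regulatory and ethical framework",
    "Data sharing infrastructure",
    "Contact and collaboration",
    "Specimen fitness for assay",
    "Cost and timeline signals",
]

LINT_THRESHOLD = 0.3  # entities below this are flagged for re-compilation


def provenance_depth(covered):
    """Fraction of the 21 dimensions covered; unknown names are ignored."""
    return len(set(covered) & set(DIMENSIONS)) / len(DIMENSIONS)


def needs_recompilation(depth):
    """Assumes a strict comparison: exactly 0.3 is not flagged."""
    return depth < LINT_THRESHOLD


depth_10 = round(provenance_depth(DIMENSIONS[:10]), 2)  # 10 of 21 covered
depth_15 = round(provenance_depth(DIMENSIONS[:15]), 2)  # 15 of 21 covered
```

With 21 dimensions, six covered (6/21, about 0.29) falls below the threshold and seven covered (7/21, about 0.33) does not.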