Entity Schema
The YAML frontmatter contract. Enforced by hook on every write to store/wiki/.
Required for every entity
entity_id: <slug> # lowercase, hyphens, ascii
type: <enum> # cohort | institution | investigator |
# platform | protocol | data_opportunity | bundle
canonical_name: <string> # non-empty
provenance:
  sources: [<id>, ...] # non-empty list
  last_compiled: <ISO date> # YYYY-MM-DD or full ISO 8601
card:
  primary_signal: <string> # <= 200 chars
  action: <string> # <= 120 chars
  risk: <string> # <= 200 chars
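A minimal sketch of how the required-field and card-length rules could be checked, assuming the frontmatter has already been parsed into a dict. `check_required` and `CARD_LIMITS` are hypothetical names, not the hook's actual code, and the sample values in the usage note are placeholders. The sketch assumes `last_compiled` lives under `provenance` and ignores the `cards:` variant that institutions may use.

```python
from datetime import datetime

# Length limits copied from the card block above.
CARD_LIMITS = {"primary_signal": 200, "action": 120, "risk": 200}


def check_required(entity: dict) -> list:
    """Return a list of problems; an empty list means the entity passes."""
    errors = []

    # Top-level strings must be present and non-empty after trimming.
    for key in ("entity_id", "type", "canonical_name"):
        value = entity.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{key}: missing or empty")

    provenance = entity.get("provenance")
    if not isinstance(provenance, dict):
        provenance = {}
    sources = provenance.get("sources")
    if not isinstance(sources, list) or not sources:
        errors.append("provenance.sources: must be a non-empty list")

    # Assumption: last_compiled is nested under provenance; the top level
    # is tried as a fallback.
    compiled = provenance.get("last_compiled", entity.get("last_compiled"))
    try:
        datetime.fromisoformat(str(compiled))
    except ValueError:
        errors.append("last_compiled: not an ISO date")

    card = entity.get("card")
    if not isinstance(card, dict):
        card = {}
    for key, limit in CARD_LIMITS.items():
        text = card.get(key)
        if not isinstance(text, str) or not text.strip():
            errors.append(f"card.{key}: missing or empty")
        elif len(text) > limit:
            errors.append(f"card.{key}: {len(text)} chars exceeds {limit}")

    return errors
```

Calling `check_required(parsed_frontmatter)` returns an empty list for a valid entity and one message per violated rule otherwise, so a caller can report every problem in a single pass.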
data_opportunity fields
Required for cohorts surfaced as buyer-facing opportunities.
opportunity_type: <enum> # published_cohort | hospital_inventory_signal |
# surplus_trial_samples | broker_listed_inventory |
# biobank_self_reported | bounty_bundle
evidence_type: <enum> # direct | inferred | self_reported | composed
disease_area: [<string>] # non-empty list
modality: [<string>] # non-empty list
scoring:
  scale: {confidence: low|medium|high}
  cost: {confidence: low|medium|high}
  quality: {provenance_depth: <0..1>, confidence: low|medium|high}
Optional: specimens block
For commission-intent scoring. Describes physical material in a freezer, not existing data.
specimens:
  types: ["CSF", "plasma", "FFPE blocks"]
  estimated_available_n: 500
  access_route: "NIA RARC biorepository"
  depletion_risk: "low -- well-aliquoted"
Institution cards
Institutions may use cards: (plural) instead of card: for dual views.
cards:
  buyer_view:
    primary_signal: "..."
    action: "..."
    risk: "..."
  onboarding_view:
    completeness: 0.6
    effort_to_complete: "..."
    demand_signal: "..."
    what_we_know: [...]
    what_is_missing: [...]
    next_step: "..."
Institution services (optional)
services:
  platforms: ["olink-target-96", "olink-explore-ht"]
  rate_card_url: "https://..."
  geography: "Indianapolis, IN, US"
  access: "external academic, industry"
  matrix_validated: ["plasma", "serum"]
Optional block for provider capabilities. Enriched by the bounty agent when a provider is discovered.
Bundle fields
opportunity_type: bounty_bundle
status: draft | ready | confirmed | executed
chain:
  - link: specimen_source # closed enum (see below)
    entity: <slug> # required for specimen_source, assay_execution
    evidence: PUBLISHED # PUBLISHED | DERIVED | open_question
    cost: "<figure or [open_question]>"
    timeline: "<duration>"
    note: "<one sentence>"
  - link: assay_execution
    entity: <slug>
    evidence: DERIVED
    cost: "$120-150/sample"
    timeline: "4-6 weeks"
    note: "Based on pricing-data.md"
total_known:
  low: <number|null>
  high: <number|null>
within_budget: true|false|unknown
Chain has 2-7 links. Link types: specimen_source | specimen_fitness | provider | logistics | assay_execution | prior_art | data_delivery. Evidence tiers: PUBLISHED | DERIVED | open_question. At least one link must have evidence != open_question.
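The chain rules above can be sketched as a single function. This is an illustration under stated assumptions, not the hook's implementation: `check_chain` and the constant names are hypothetical, and the chain is assumed to arrive as a list of dicts parsed from the frontmatter.

```python
LINK_TYPES = {
    "specimen_source", "specimen_fitness", "provider", "logistics",
    "assay_execution", "prior_art", "data_delivery",
}
EVIDENCE_TIERS = {"PUBLISHED", "DERIVED", "open_question"}
# Link types whose `entity` slug is required, per the comment in the schema.
ENTITY_REQUIRED = {"specimen_source", "assay_execution"}


def check_chain(chain: list) -> list:
    """Return a list of problems found in a bundle's chain."""
    errors = []

    if not 2 <= len(chain) <= 7:
        errors.append(f"chain: expected 2-7 links, got {len(chain)}")

    for i, link in enumerate(chain):
        kind = link.get("link")
        evidence = link.get("evidence")
        if kind not in LINK_TYPES:
            errors.append(f"chain[{i}].link: {kind!r} is not a known link type")
        if evidence not in EVIDENCE_TIERS:
            errors.append(f"chain[{i}].evidence: {evidence!r} is not a known tier")
        if kind in ENTITY_REQUIRED and not link.get("entity"):
            errors.append(f"chain[{i}].entity: required for {kind}")

    # At least one link must carry real evidence, not just an open question.
    if not any(link.get("evidence") in {"PUBLISHED", "DERIVED"} for link in chain):
        errors.append("chain: at least one link needs evidence other than open_question")

    return errors
```

Collecting all errors instead of stopping at the first one is a design choice of this sketch; it lets a draft bundle with several weak links be fixed in one edit.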
Closed enums
Values outside these lists are rejected by the hook.
- type: cohort, institution, investigator, platform, protocol, data_opportunity, bundle
- opportunity_type: published_cohort, hospital_inventory_signal, surplus_trial_samples, broker_listed_inventory, biobank_self_reported, bounty_bundle
- evidence_type: direct, inferred, self_reported, composed
- confidence: low, medium, high
- status (bundle): draft, ready, confirmed, executed
- chain[].link (bundle): specimen_source, specimen_fitness, provider, logistics, assay_execution, prior_art, data_delivery
- chain[].evidence (bundle): PUBLISHED, DERIVED, open_question
- back_reference.relation: parent_institution, sponsor, data_provider, collection_site, assay_platform, lead_pi, co_investigator, collection_protocol, related_trial, composed_into
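A minimal sketch of a closed-enum check, assuming the frontmatter is already parsed into a dict. `check_enums` and `CLOSED_ENUMS` are hypothetical names rather than the hook's actual code. Bundle-specific enums (status, chain[].link, chain[].evidence) and back_reference.relation are left out for brevity.

```python
# Allowed values copied from the closed-enum list above.
CLOSED_ENUMS = {
    "type": (
        "cohort", "institution", "investigator", "platform",
        "protocol", "data_opportunity", "bundle",
    ),
    "opportunity_type": (
        "published_cohort", "hospital_inventory_signal",
        "surplus_trial_samples", "broker_listed_inventory",
        "biobank_self_reported", "bounty_bundle",
    ),
    "evidence_type": ("direct", "inferred", "self_reported", "composed"),
}
CONFIDENCE = ("low", "medium", "high")


def check_enums(entity: dict) -> list:
    """Return one message per value that falls outside its closed enum."""
    errors = []

    # Only fields that are present are checked; presence is a separate rule.
    for field, allowed in CLOSED_ENUMS.items():
        if field in entity and entity[field] not in allowed:
            errors.append(f"{field}: {entity[field]!r} is not in the closed enum")

    # Each scoring axis (scale, cost, quality) carries its own confidence.
    scoring = entity.get("scoring")
    if isinstance(scoring, dict):
        for axis, block in scoring.items():
            if isinstance(block, dict) and "confidence" in block:
                value = block["confidence"]
                if value not in CONFIDENCE:
                    errors.append(
                        f"scoring.{axis}.confidence: {value!r} is not in the closed enum"
                    )

    return errors
```

The comparison is exact and case-sensitive, so `Cohort` or `High` would be rejected in this sketch; whether the real hook normalizes case is not stated in the schema.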
Format rules
- entity_id slug regex: ^[a-z0-9]+(-[a-z0-9]+)*$
- provenance.sources: each entry must match PMC\d+, PMID:\d+, DOI:..., NCT\d+, or a full URL
- provenance_depth: float between 0.0 and 1.0
- confidence_score: float in [0.0, 1.0], 0.5 is blocked as a placeholder
- All required string fields must be non-empty after trimming
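The format rules translate into a few small predicates. The sketch below is illustrative only. The slug regex is taken verbatim from the rule above, but the DOI and URL patterns are assumptions (the schema elides the DOI pattern and does not define "full URL"), and the function names are hypothetical.

```python
import re

# Slug regex exactly as given in the format rules.
SLUG_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

# Assumptions: a DOI is "DOI:" followed by non-space characters, and a
# "full URL" means an http(s) URL. The schema does not spell these out.
SOURCE_RE = re.compile(r"PMC\d+|PMID:\d+|DOI:\S+|NCT\d+|https?://\S+")


def is_slug(value: str) -> bool:
    """True if value is lowercase ascii words joined by single hyphens."""
    return bool(SLUG_RE.fullmatch(value))


def is_source_id(value: str) -> bool:
    """True if value looks like one of the accepted source identifiers."""
    return bool(SOURCE_RE.fullmatch(value))


def is_unit_float(value) -> bool:
    """True for a number in [0.0, 1.0]; bools are rejected explicitly."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return 0.0 <= value <= 1.0


def is_confidence_score(value) -> bool:
    """Like is_unit_float, but 0.5 is rejected as a placeholder value."""
    return is_unit_float(value) and value != 0.5
```

`fullmatch` is used instead of `match` so that a trailing newline or extra suffix cannot slip past the `$` anchor.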
What the hook does not validate
- Whether cited sources exist on disk
- Whether implications are sensible
- Whether scoring axes are calibrated
- Cross-entity link consistency (lint's job)
- Markdown body content
Source: .claude/rules/entity-schema.md