Entity Schema

The YAML frontmatter contract for entities. A hook enforces it on every write to store/wiki/.

Required for every entity

entity_id: <slug>            # lowercase, hyphens, ascii
type: <enum>                 # cohort | institution | investigator |
                             # platform | protocol | data_opportunity | bundle
canonical_name: <string>     # non-empty
provenance:
  sources: [<id>, ...]       # non-empty list
  last_compiled: <ISO date>  # YYYY-MM-DD or full ISO 8601
card:
  primary_signal: <string>   # <= 200 chars
  action: <string>           # <= 120 chars
  risk: <string>             # <= 200 chars
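
The required fields above can be sketched as a small check over frontmatter that has already been parsed into a dict. This is a minimal sketch, not the hook itself: the function name `check_required`, the slug regex (including whether digits are allowed), and the error wording are assumptions, and the institution `cards:` variant is not handled here.

```python
import re
from datetime import date, datetime

# Enum values and length limits are copied from the schema above.
ENTITY_TYPES = {
    "cohort", "institution", "investigator", "platform",
    "protocol", "data_opportunity", "bundle",
}
CARD_LIMITS = {"primary_signal": 200, "action": 120, "risk": 200}
# Assumed reading of "lowercase, hyphens, ascii"; allowing digits is a guess.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def _is_iso_date(value):
    """Accept YYYY-MM-DD or a full ISO 8601 timestamp (string or parsed date)."""
    if isinstance(value, (date, datetime)):
        return True  # YAML loaders often parse dates into objects already
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def check_required(meta):
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    entity_id = meta.get("entity_id")
    if not isinstance(entity_id, str) or not SLUG_RE.fullmatch(entity_id):
        errors.append("entity_id: must be a lowercase ASCII slug")
    entity_type = meta.get("type")
    if not isinstance(entity_type, str) or entity_type not in ENTITY_TYPES:
        errors.append("type: not in the allowed enum")
    name = meta.get("canonical_name")
    if not isinstance(name, str) or not name.strip():
        errors.append("canonical_name: must be a non-empty string")
    prov = meta.get("provenance")
    prov = prov if isinstance(prov, dict) else {}
    sources = prov.get("sources")
    if not isinstance(sources, list) or not sources:
        errors.append("provenance.sources: must be a non-empty list")
    if not _is_iso_date(prov.get("last_compiled")):
        errors.append("provenance.last_compiled: must be an ISO date")
    card = meta.get("card")
    card = card if isinstance(card, dict) else {}
    for field, limit in CARD_LIMITS.items():
        text = card.get(field)
        if not isinstance(text, str):
            errors.append(f"card.{field}: missing")
        elif len(text) > limit:
            errors.append(f"card.{field}: longer than {limit} chars")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass; whether the real hook behaves this way is not stated in this document.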

data_opportunity fields

Required for cohorts surfaced as buyer-facing opportunities.

opportunity_type: <enum>     # published_cohort | hospital_inventory_signal |
                             # surplus_trial_samples | broker_listed_inventory |
                             # biobank_self_reported | bounty_bundle
evidence_type: <enum>        # direct | inferred | self_reported | composed
disease_area: [<string>]     # non-empty list
modality: [<string>]         # non-empty list
scoring:
  scale:   {confidence: low|medium|high}
  cost:    {confidence: low|medium|high}
  quality: {provenance_depth: <0..1>, confidence: low|medium|high}
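
A similar check can cover the data_opportunity fields. The sketch below assumes every field in the block above is required, including all three scoring axes, and that provenance_depth is inclusive at both ends; the function name `check_opportunity` and the error wording are hypothetical.

```python
# Enum values are copied from the schema above.
OPPORTUNITY_TYPES = {
    "published_cohort", "hospital_inventory_signal", "surplus_trial_samples",
    "broker_listed_inventory", "biobank_self_reported", "bounty_bundle",
}
EVIDENCE_TYPES = {"direct", "inferred", "self_reported", "composed"}
CONFIDENCE_LEVELS = {"low", "medium", "high"}


def _in_enum(value, allowed):
    """True only for strings that appear in the allowed set."""
    return isinstance(value, str) and value in allowed


def check_opportunity(meta):
    """Return a list of error strings for the data_opportunity fields."""
    errors = []
    if not _in_enum(meta.get("opportunity_type"), OPPORTUNITY_TYPES):
        errors.append("opportunity_type: not in the allowed enum")
    if not _in_enum(meta.get("evidence_type"), EVIDENCE_TYPES):
        errors.append("evidence_type: not in the allowed enum")
    for key in ("disease_area", "modality"):
        value = meta.get(key)
        ok = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, str) for item in value)
        )
        if not ok:
            errors.append(f"{key}: must be a non-empty list of strings")
    scoring = meta.get("scoring")
    scoring = scoring if isinstance(scoring, dict) else {}
    for axis in ("scale", "cost", "quality"):
        block = scoring.get(axis)
        block = block if isinstance(block, dict) else {}
        if not _in_enum(block.get("confidence"), CONFIDENCE_LEVELS):
            errors.append(f"scoring.{axis}.confidence: must be low, medium, or high")
    quality = scoring.get("quality")
    quality = quality if isinstance(quality, dict) else {}
    depth = quality.get("provenance_depth")
    is_number = isinstance(depth, (int, float)) and not isinstance(depth, bool)
    # Assumption: the <0..1> range includes both 0 and 1.
    if not is_number or not 0 <= depth <= 1:
        errors.append("scoring.quality.provenance_depth: must be a number in 0..1")
    return errors
```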

Optional: specimens block

Used for commission-intent scoring. Describes physical material in a freezer rather than data that already exists.

specimens:
  types: ["CSF", "plasma", "FFPE blocks"]
  estimated_available_n: 500
  access_route: "NIA RARC biorepository"
  depletion_risk: "low -- well-aliquoted"

Institution cards

Institutions may use cards: (plural) instead of card: to carry two views, buyer_view and onboarding_view.

cards:
  buyer_view:
    primary_signal: "..."
    action: "..."
    risk: "..."
  onboarding_view:
    completeness: 0.6
    effort_to_complete: "..."
    demand_signal: "..."
    what_we_know: [...]
    what_is_missing: [...]
    next_step: "..."
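
The card / cards choice can be sketched as a check that accepts either shape. Several points here are assumptions rather than documented rules: that exactly one of the two keys must be present, that both views are expected when cards: is used, and that completeness is a fraction between 0 and 1 (suggested only by the 0.6 example). The function name `check_card_block` is hypothetical.

```python
BUYER_FIELDS = ("primary_signal", "action", "risk")


def check_card_block(meta):
    """Return error strings for the card / cards choice on one entity."""
    errors = []
    has_card = "card" in meta
    has_cards = "cards" in meta
    if has_cards and meta.get("type") != "institution":
        errors.append("cards: the plural form is described only for institutions")
    if has_card == has_cards:
        # Assumption: exactly one of the two keys should be present.
        errors.append("card/cards: expected exactly one of the two keys")
        return errors
    if has_card:
        return errors  # the single-card shape is checked elsewhere
    cards = meta["cards"] if isinstance(meta["cards"], dict) else {}
    buyer = cards.get("buyer_view")
    buyer = buyer if isinstance(buyer, dict) else {}
    for field in BUYER_FIELDS:
        if not isinstance(buyer.get(field), str):
            errors.append(f"cards.buyer_view.{field}: missing")
    onboarding = cards.get("onboarding_view")
    onboarding = onboarding if isinstance(onboarding, dict) else {}
    completeness = onboarding.get("completeness")
    is_number = isinstance(completeness, (int, float)) and not isinstance(completeness, bool)
    # Assumption: completeness is a fraction in [0, 1], as the 0.6 example suggests.
    if not is_number or not 0 <= completeness <= 1:
        errors.append("cards.onboarding_view.completeness: expected a number in [0, 1]")
    return errors
```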

Institution services (optional)

Optional block for provider capabilities. Enriched by the bounty agent when a provider is discovered.

services:
  platforms: ["olink-target-96", "olink-explore-ht"]
  rate_card_url: "https://..."
  geography: "Indianapolis, IN, US"
  access: "external academic, industry"
  matrix_validated: ["plasma", "serum"]

Bundle fields

opportunity_type: bounty_bundle
status: draft | ready | confirmed | executed
chain:
  - link: specimen_source       # closed enum (see below)
    entity: <slug>              # required when link is specimen_source or assay_execution
    evidence: PUBLISHED         # PUBLISHED | DERIVED | open_question
    cost: "<figure or [open_question]>"
    timeline: "<duration>"
    note: "<one sentence>"
  - link: assay_execution
    entity: <slug>
    evidence: DERIVED
    cost: "$120-150/sample"
    timeline: "4-6 weeks"
    note: "Based on pricing-data.md"
total_known:
  low: <number|null>
  high: <number|null>
  within_budget: true|false|unknown

A chain must have between 2 and 7 links. Link types: specimen_source | specimen_fitness | provider | logistics | assay_execution | prior_art | data_delivery. Evidence tiers: PUBLISHED | DERIVED | open_question. At least one link must have an evidence value other than open_question.
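
The chain rules can be sketched as one function over the parsed chain list. The enum values come from this section; the function name `check_chain`, the error wording, and the choice to count only PUBLISHED or DERIVED toward the "at least one" rule are assumptions.

```python
# Enum values are copied from the bundle rules above.
LINK_TYPES = {
    "specimen_source", "specimen_fitness", "provider", "logistics",
    "assay_execution", "prior_art", "data_delivery",
}
EVIDENCE_TIERS = {"PUBLISHED", "DERIVED", "open_question"}
ENTITY_REQUIRED = {"specimen_source", "assay_execution"}


def _in_enum(value, allowed):
    """True only for strings that appear in the allowed set."""
    return isinstance(value, str) and value in allowed


def check_chain(chain):
    """Return a list of error strings for a bundle's chain."""
    if not isinstance(chain, list) or not 2 <= len(chain) <= 7:
        return ["chain: must be a list of 2-7 links"]
    errors = []
    grounded = False  # becomes True once a link has PUBLISHED or DERIVED evidence
    for index, link in enumerate(chain):
        if not isinstance(link, dict):
            errors.append(f"chain[{index}]: must be a mapping")
            continue
        kind = link.get("link")
        evidence = link.get("evidence")
        if not _in_enum(kind, LINK_TYPES):
            errors.append(f"chain[{index}].link: not in the allowed enum")
        elif kind in ENTITY_REQUIRED and not link.get("entity"):
            errors.append(f"chain[{index}].entity: required for {kind}")
        if not _in_enum(evidence, EVIDENCE_TIERS):
            errors.append(f"chain[{index}].evidence: not in the allowed enum")
        elif evidence != "open_question":
            grounded = True
    if not grounded:
        errors.append("chain: at least one link needs evidence other than open_question")
    return errors
```

For example, a two-link chain with a specimen_source link marked PUBLISHED and an assay_execution link marked DERIVED, each with an entity slug, returns an empty list from this sketch.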

Closed enums

Values outside these lists are rejected by the hook.

Format rules

What the hook does not validate

Source: .claude/rules/entity-schema.md