---
id: s3-provenance-package-templates
title: Provenance Package Templates
module: GROW-S3
module_slug: grow-s3-data-provenance
cluster: Systems
type: template
version: v0.2.0
status: Draft
tier: membership
contract_role: ""
canonical_url: "https://grow.goodcombinator.ai/library/registry/s3-provenance-package-templates"
download_url: "https://grow.goodcombinator.ai/library/registry/s3-provenance-package-templates.md"
license: CC-BY-4.0 (proposed — owner confirmation required)
source: GROW by Good Combinator
retrieved_at: 2026-05-29
---

# Provenance Package Templates

## Purpose
A **provenance package** is the set of fillable scaffolds a builder ships alongside any GROW system that produces consequential output. Each scaffold below is usable as-is: copy it, fill it in, commit it to the system's repo or governance space, and reference it from the system's evaluation gate submission.

Six scaffolds:
1. Data Lineage Map
2. Provenance Register
3. Transformation Log
4. Reproducibility Checklist
5. Source-Confidence Matrix
6. Decision-Trace Template

---

## 1. Data Lineage Map (scaffold)

```
SYSTEM: <system_id@semver>
LINEAGE MAP VERSION: <semver>
OWNER: <role>
LAST REVIEWED: <YYYY-MM-DD>

SOURCES (from s3-source-inventory-template)
- <source_id_1>: <name> | confidence_band: <high|medium|low|unknown>
- <source_id_2>: <name> | confidence_band: <...>
- <source_id_3>: <name> | confidence_band: <...>

NODES
- Source: <source_id> -> Extraction: <op> -> Transformation: <op> ...
- (use s3-lineage-map-spec node types only: Source, Extraction, Transformation,
   Embedding, Retrieval, Inference, ToolCall, Decision, Output)

EDGES (every edge MUST carry: operation, data_shape, confidence_propagation, is_deterministic)
- <node_a> -> <node_b> | operation: <...> | data_shape: <...> | conf: <preserve|downgrade|upgrade-on-merge> | det: <true|false>

NON-DETERMINISTIC SURFACE
- List every edge where is_deterministic=false. These feed s3-reproducibility-controls.

DECISION NODE(S)
- <Decision id>: decision_origin recorded per C2 enum (agent|human-override|fallback|escalation)

OUTPUT
- <output_id>: content_hash <sha256:...> | downstream consumers <...>
```
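
The edge rule in the scaffold (every edge carries `operation`, `data_shape`, `confidence_propagation`, `is_deterministic`) can be checked mechanically before the map is committed. The following is a minimal sketch, assuming edges are held as plain dictionaries with hypothetical `from` and `to` keys; the function names and the sample node ids are illustrative and not part of `s3-lineage-map-spec`.

```python
# Field names and enum values are copied from the EDGES line of the scaffold.
REQUIRED_FIELDS = ("operation", "data_shape", "confidence_propagation", "is_deterministic")
CONF_VALUES = {"preserve", "downgrade", "upgrade-on-merge"}


def validate_edge(edge: dict) -> list[str]:
    """Return the problems found on one edge; an empty list means it is complete."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if name not in edge]
    conf = edge.get("confidence_propagation")
    if conf is not None and conf not in CONF_VALUES:
        problems.append(f"unknown confidence_propagation: {conf}")
    if "is_deterministic" in edge and not isinstance(edge["is_deterministic"], bool):
        problems.append("is_deterministic must be true or false")
    return problems


def non_deterministic_surface(edges: list[dict]) -> list[str]:
    """List 'from -> to' for every edge explicitly marked is_deterministic=false."""
    return [f"{e['from']} -> {e['to']}" for e in edges if e.get("is_deterministic") is False]


# Hypothetical edges; the node ids are placeholders, not a real system.
edges = [
    {"from": "t1", "to": "t2", "operation": "join", "data_shape": "rows->rows",
     "confidence_propagation": "preserve", "is_deterministic": True},
    {"from": "t2", "to": "i1", "operation": "draft memo", "data_shape": "rows->text",
     "confidence_propagation": "downgrade", "is_deterministic": False},
    {"from": "i1", "to": "out", "operation": "publish"},  # incomplete on purpose
]

for edge in edges:
    print(f"{edge['from']} -> {edge['to']}: {validate_edge(edge) or 'ok'}")
print("non-deterministic surface:", non_deterministic_surface(edges))
```

An edge that omits `is_deterministic` is reported as incomplete rather than silently counted as deterministic, which keeps the non-deterministic surface honest.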

---

## 2. Provenance Register (scaffold)

The register is the running index of every provenance record the system has emitted. One row per `record_id`. This is the table operators query during audits.

```
| record_id | timestamp | system_version | model | source_ids | decision_origin | confidence_band | retention_class | corrections | evidence_pointer |
|-----------|-----------|----------------|-------|------------|-----------------|-----------------|-----------------|-------------|------------------|
| <uuid>    | <iso>     | <semver>       | <id>  | [<...>]    | <enum>          | <enum>          | <class>         | [<id>...]   | <uri>            |
```

Operating rules:
- Append-only. A correction is a new row whose `corrections` column carries the `supersedes_record_id` of the row it replaces; the original row is never edited.
- Indexed by `record_id`, `timestamp`, and each entry of `source_ids` (multi-valued).
- Restricted/regulated rows are surfaced behind access logs per `s3-governance-retention-policy`.
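
The append-only rule can be illustrated with a small in-memory register. This is a sketch under stated assumptions: the class and method names are hypothetical, the `corrections` list is assumed to hold the ids of superseded rows, and a real register would sit on durable, access-logged storage rather than a Python list.

```python
import uuid
from datetime import datetime, timezone


class ProvenanceRegister:
    """In-memory stand-in for the register (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._rows: list[dict] = []  # rows are only ever appended, never edited

    def append(self, **fields) -> str:
        """Add one row and return its record_id."""
        row = {
            "record_id": str(uuid.uuid4()),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "corrections": [],
            **fields,
        }
        self._rows.append(row)
        return row["record_id"]

    def correct(self, supersedes_record_id: str, **fields) -> str:
        """Add a new row that supersedes an earlier one; the old row stays as written."""
        if all(r["record_id"] != supersedes_record_id for r in self._rows):
            raise KeyError(f"no such record: {supersedes_record_id}")
        return self.append(corrections=[supersedes_record_id], **fields)

    def by_source(self, source_id: str) -> list[dict]:
        """Multi-valued lookup: every row that lists source_id in source_ids."""
        return [r for r in self._rows if source_id in r.get("source_ids", [])]

    def current(self) -> list[dict]:
        """Rows that no later row has superseded."""
        superseded = {rid for r in self._rows for rid in r["corrections"]}
        return [r for r in self._rows if r["record_id"] not in superseded]


register = ProvenanceRegister()
first = register.append(source_ids=["fl-dep-lpa0381"], confidence_band="high")
fixed = register.correct(first, source_ids=["fl-dep-lpa0381"], confidence_band="medium")
print(len(register.by_source("fl-dep-lpa0381")), "rows on file;",
      len(register.current()), "current")
```

Both rows remain queryable during an audit; only the `current()` view hides the superseded one.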

---

## 3. Transformation Log (scaffold)

One transformation log per run. Mirrors `transformation_history` in the schema but is the human-readable working copy.

```
RUN
  record_id: <uuid>
  started_at: <iso>
  ended_at: <iso>

STEPS
  1) step_id: t1
     type: Extraction
     operation: pull fl-dep-lpa0381 disbursements as_of=<iso>
     inputs: source_id=fl-dep-lpa0381 snapshot_hash=<sha256>
     outputs: rows=412 hash=<sha256>
     is_deterministic: true
     confidence_propagation: preserve

  2) step_id: t2
     type: Transformation
     operation: join parcels x impact zones on parcel_id
     inputs: <prev outputs>
     outputs: rows=18204 hash=<sha256>
     is_deterministic: true
     confidence_propagation: preserve

  3) step_id: i1
     type: Inference
     operation: draft memo
     model: <id@rev>
     prompt_hash: <sha256>
     temperature: 0.2
     inputs_hash: <sha256>
     outputs_hash: <sha256>
     is_deterministic: false
     confidence_propagation: downgrade

NOTES
  - Any HITL event during this run is recorded as a decision_trace entry; do not log only in prose.
```
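
The `hash=<sha256>` slots in the log only help if the same rows always hash to the same value. One way to get that, sketched here as an assumption rather than a GROW requirement, is to hash a canonical JSON encoding (sorted keys, fixed separators). The helper names and sample rows are hypothetical.

```python
import hashlib
import json


def content_hash(rows: list[dict]) -> str:
    """sha256 over a canonical JSON encoding: sorted keys, no optional whitespace."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def log_step(step_id, step_type, operation, rows, is_deterministic, confidence_propagation) -> dict:
    """Build one STEPS entry in the shape of the scaffold above."""
    return {
        "step_id": step_id,
        "type": step_type,
        "operation": operation,
        "outputs": {"rows": len(rows), "hash": content_hash(rows)},
        "is_deterministic": is_deterministic,
        "confidence_propagation": confidence_propagation,
    }


# Hypothetical extraction output; values are placeholders.
rows = [
    {"parcel_id": "P-001", "amount": 1200},
    {"parcel_id": "P-002", "amount": 800},
]
step = log_step("t1", "Extraction", "pull disbursements", rows, True, "preserve")
print(step["outputs"])
```

Key order inside a row does not affect the hash, but row order does, so a deterministic step should also sort its output before hashing.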

---

## 4. Reproducibility Checklist (scaffold)

Use before declaring a system "ready for the evaluation gate" and before any re-run.

```
PINS
[ ] Source snapshots content-hashed and stored
[ ] Lineage map version pinned in record.version.lineage_map_version
[ ] Code version pinned (git SHA or semver) in record.version.system
[ ] Model id + immutable revision in record.version.model
[ ] Prompt template sha256 in record.version.prompt_hash
[ ] Retriever config hashed and recorded
[ ] Tool registry versions pinned

DETERMINISM
[ ] Deterministic segments produce byte-identical outputs vs. stored hashes
[ ] Non-deterministic segments meet decision-path thresholds (s3-reproducibility-controls)
[ ] Non-deterministic surface inventoried in the lineage map

EVIDENCE
[ ] Every evidence_pointer resolves
[ ] Every consequential claim has a backward trace per s3-decision-traceability
[ ] No silent unsupported claims

GOVERNANCE
[ ] retention_class assigned (not defaulted for civic-7yr / regulated-indef)
[ ] permissions inherited from strictest source
[ ] pii_flags / jurisdiction set if applicable

SIGN-OFF
[ ] Owner: <role> <signature> <iso timestamp>
[ ] Evaluator: <role> <signature> <iso timestamp>
```
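
The first DETERMINISM item (byte-identical outputs vs. stored hashes) reduces to a hash comparison restricted to deterministic steps. A minimal sketch, assuming each logged step is a dictionary with `step_id`, `is_deterministic`, and a flat `outputs_hash` field; that flat shape and the function name are assumptions for illustration.

```python
def determinism_failures(stored_steps: list[dict], rerun_steps: list[dict]) -> list[tuple[str, str]]:
    """Compare a re-run against the stored log, checking deterministic steps only."""
    rerun = {s["step_id"]: s for s in rerun_steps}
    failures = []
    for step in stored_steps:
        if not step["is_deterministic"]:
            continue  # non-deterministic steps are judged by thresholds, not hashes
        again = rerun.get(step["step_id"])
        if again is None:
            failures.append((step["step_id"], "missing from re-run"))
        elif again["outputs_hash"] != step["outputs_hash"]:
            failures.append((step["step_id"], "hash mismatch"))
    return failures


# Hypothetical hashes, shortened for readability.
stored = [
    {"step_id": "t1", "is_deterministic": True, "outputs_hash": "sha256:aaa"},
    {"step_id": "t2", "is_deterministic": True, "outputs_hash": "sha256:bbb"},
    {"step_id": "i1", "is_deterministic": False, "outputs_hash": "sha256:ccc"},
]
rerun = [
    {"step_id": "t1", "is_deterministic": True, "outputs_hash": "sha256:aaa"},
    {"step_id": "t2", "is_deterministic": True, "outputs_hash": "sha256:zzz"},
    {"step_id": "i1", "is_deterministic": False, "outputs_hash": "sha256:ddd"},
]
print(determinism_failures(stored, rerun))
```

The inference step `i1` differs between runs but is not reported, because its acceptance criteria live in `s3-reproducibility-controls` rather than in a hash match.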

---

## 5. Source-Confidence Matrix (scaffold)

A compact view of where confidence comes from and where it goes. Feeds `s2-scoring-system` per C4.

```
| source_id          | authoritative | freshness vs. update_frequency | confidence_band (current) | band_rationale                                                | drives_confidence_for (claim_types) | downgrade_triggers     |
|--------------------|---------------|--------------------------------|---------------------------|---------------------------------------------------------------|-------------------------------------|------------------------|
| fl-dep-lpa0381     | true          | within window                  | high                      | Statutory grant ledger reconciled weekly by treasurer.        | grant_disbursement, encumbrance     | weekly miss            |
| wcpa-parcels       | true          | within window                  | high                      | County system-of-record under FS Ch. 193 with daily refresh.  | parcel_ownership, parcel_geometry   | API outage             |
| ownerrez-bookings  | true          | within window                  | high                      | Booking system-of-record; high for booking events themselves. | booking_event, payout_event         | reconcile delta > $50  |
| aifg-transcripts   | false         | within window                  | medium                    | Machine transcript with measurable NER drift.                 | quote_attribution, episode_topic    | NER error rate         |
| ecoguardian-stream | true          | within window                  | medium                    | Authoritative sensor feed pending calibration verification.   | turbidity_p95, sensor_uptime        | sensor_calibration_due |
| airbnb-csv-dump    | false         | stale                          | low                       | Manual monthly export, overlaps ownerrez; never standalone.   | revenue_cross_check                 | always low standalone  |
```

`band_rationale` is a required per-row field on `s3-source-inventory-template` (contract C4). Carry it through this matrix verbatim — do not paraphrase between the inventory and the matrix.
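
Because the rationale must match verbatim, an exact string comparison between the inventory and the matrix is enough to catch paraphrasing. A sketch, assuming both documents have already been parsed into `source_id -> band_rationale` dictionaries; the parsing step, the function name, and the paraphrased sample value are hypothetical.

```python
def rationale_drift(inventory: dict[str, str], matrix: dict[str, str]) -> dict[str, str]:
    """Return matrix rows whose band_rationale is absent from or differs from the inventory."""
    drift = {}
    for source_id, rationale in matrix.items():
        if source_id not in inventory:
            drift[source_id] = "not in inventory"
        elif inventory[source_id] != rationale:  # exact match only, no normalization
            drift[source_id] = "band_rationale differs from inventory"
    return drift


inventory = {
    "wcpa-parcels": "County system-of-record under FS Ch. 193 with daily refresh.",
    "aifg-transcripts": "Machine transcript with measurable NER drift.",
}
matrix = {
    "wcpa-parcels": "County system-of-record under FS Ch. 193 with daily refresh.",
    "aifg-transcripts": "Auto transcript, some NER drift.",  # paraphrased on purpose
}
print(rationale_drift(inventory, matrix))
```

The comparison deliberately skips whitespace or case normalization, since any tolerance would let small rewordings through.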

---

## 6. Decision-Trace Template (scaffold)

One per consequential output. Maps each claim to evidence and to its `decision_origin`.

```
OUTPUT
  artifact: <path or uri>
  record_id: <uuid>
  shipped_at: <iso>
  audience: <internal | constituent | regulator | guest | publication>

CLAIMS
  1) text: "<claim text>"
     kind: <retrieved data | model inference | user input | tool output>
     source_id(s): [<...>]
     evidence_pointer: <uri or hash#range>
     confidence_band: <enum>
     inline_qualifier: <none | [unsupported] | [low confidence] | [stale source]>
     decision_origin event(s): [<event_id_1>, <event_id_2>]

  2) text: ...

UNSUPPORTED CLAIMS HANDLING
  [ ] Each unsupported claim was removed, rewritten, or qualified
  [ ] Any escalation routed per s1-hitl-review-policy

HITL EVENTS (C2 shape)
  - event_id: <id>
    timestamp: <iso>
    agent_id: <id>
    decision_origin: <agent|human-override|fallback|escalation>
    evidence_pointer: <uri>
    rationale: <text>

SIGN-OFF
  - Author (system or human): <id>
  - Reviewer (if HITL): <role/name>  <iso>
```
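
The "no silent unsupported claims" rule can be screened automatically before sign-off. The sketch below assumes a claim counts as unsupported when its `evidence_pointer` is empty, which is an interpretation and not a definition taken from `s3-decision-traceability`; the function name and sample claims are hypothetical.

```python
# Qualifier values copied from the inline_qualifier line of the scaffold.
QUALIFIERS = {"none", "[unsupported]", "[low confidence]", "[stale source]"}


def unhandled_claims(claims: list[dict]) -> list[str]:
    """Return the text of claims that lack evidence and carry no inline qualifier."""
    flagged = []
    for claim in claims:
        qualifier = claim.get("inline_qualifier", "none")
        if qualifier not in QUALIFIERS:
            raise ValueError(f"unknown inline_qualifier: {qualifier}")
        has_evidence = bool(claim.get("evidence_pointer"))
        if not has_evidence and qualifier == "none":
            flagged.append(claim["text"])
    return flagged


# Hypothetical claims; texts and pointers are placeholders.
claims = [
    {"text": "Claim A", "evidence_pointer": "sha256:abc#10-20", "inline_qualifier": "none"},
    {"text": "Claim B", "evidence_pointer": "", "inline_qualifier": "[unsupported]"},
    {"text": "Claim C", "evidence_pointer": "", "inline_qualifier": "none"},
]
print(unhandled_claims(claims))
```

A non-empty result means at least one claim still needs to be removed, rewritten, or qualified before the checklist item can be ticked.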

---

## How the Six Fit Together
The Lineage Map is the blueprint. The Provenance Register is the ledger. The Transformation Log is the per-run worksheet. The Reproducibility Checklist is the gate. The Source-Confidence Matrix is the dashboard that pushes `source_confidence` to Course 2. The Decision-Trace Template is what travels with the output to its reader.

Ship all six. Anything less is an outline, not provenance.
