--- id: e6-cost-model title: Cost Model module: GROW-S7 module_slug: grow-s7-compute-economics cluster: Execution type: spec version: v0.1.0 status: Gate-reviewed tier: membership contract_role: "" canonical_url: "https://grow.goodcombinator.ai/library/registry/e6-cost-model" download_url: "https://grow.goodcombinator.ai/library/registry/e6-cost-model.md" license: CC-BY-4.0 (proposed — owner confirmation required) source: GROW by Good Combinator retrieved_at: 2026-05-29 --- # Cost Model The Cost Model is the foundational financial inventory for an agent stack. It catalogs every category of spend — compute, model API calls, storage, retrieval, tooling, human review — assigns each a cost-per-unit formula, and establishes hard and soft budget ceilings that downstream routing and unit-economics artifacts must respect. Without a populated cost model, the routing policy has no budget signal, the quality-cost matrix has no denominator, and the unit economics worksheet has no cost rows. Fill this spec before any routing or optimization work begins. --- ## Cost Category Inventory Each category below requires a fillable block entry (see §Fillable Fields). Categories that are not applicable should be marked `N/A` rather than omitted — the blank space would leave downstream consumers unable to confirm coverage. ### Category definitions **Model API cost** — token-based or per-call charges from the model provider. Splits across `premium`, `standard`, `cheap`, and `deterministic` model tiers (see the locked glossary definition of `model tier`). Input and output tokens priced separately; caching discounts tracked separately. **Compute / infrastructure cost** — cloud VM, container, serverless-function, or GPU runtime underlying the orchestration layer. Includes cold-start overhead for latency-sensitive flows. **Storage cost** — vector-store writes/reads, object storage for evidence and audit records, relational tables for state and provenance. Priced per GB-month plus per-operation fees. **Retrieval cost** — embedding generation, vector similarity search, re-ranking, web search, or external index lookups. High-frequency retrieval is often the second-largest cost after model calls. **Tooling / connector cost** — third-party API subscriptions, per-call fees, or metered SaaS charges for tools the agent invokes (GIS lookups, permit APIs, email/calendar connectors, payment processors). **Human-review cost** — time-value of HITL gates and escalations. Expressed as reviewer hourly rate × median review time per event × escalation frequency. This is often invisible in cost models and materially understates total operating cost. **Egress / networking cost** — data transfer out of the cloud region, particularly for large retrieval payloads or media files. **Observability / logging cost** — APM, log storage, eval-run storage. Often a fixed-cost overhead tier. --- ## Fillable Fields ```yaml agent_id: billing_period: cost_categories: model_api: provider: tiers: premium: model_id: input_cost_per_1k_tokens: output_cost_per_1k_tokens: avg_tokens_per_call: calls_per_billing_period: cache_discount_pct: <0–100> standard: model_id: input_cost_per_1k_tokens: output_cost_per_1k_tokens: avg_tokens_per_call: calls_per_billing_period: cache_discount_pct: <0–100> cheap: model_id: input_cost_per_1k_tokens: output_cost_per_1k_tokens: avg_tokens_per_call: calls_per_billing_period: cache_discount_pct: <0–100> deterministic: description: calls_per_billing_period: cost_usd: 0 compute: runtime: unit_cost: estimated_units_per_billing_period: storage: vector_store_gb: vector_store_cost_per_gb_month: object_storage_gb: object_storage_cost_per_gb_month: operations_per_billing_period: cost_per_1k_operations: retrieval: embedding_calls_per_billing_period: embedding_cost_per_1k_calls: search_calls_per_billing_period: search_cost_per_call: rerank_calls_per_billing_period: rerank_cost_per_call: tooling_connectors: - tool_id: vendor: pricing_model: monthly_base_cost_usd: per_call_cost_usd: avg_calls_per_billing_period: human_review: reviewer_role: hourly_rate_usd: median_review_minutes: escalations_per_billing_period: egress: gb_per_billing_period: cost_per_gb_usd: observability: flat_monthly_cost_usd: budget_ceilings: hard_monthly_usd: soft_monthly_usd: hard_per_run_usd: hard_token_per_call: hard_latency_ms: cost_per_outcome: cost_per_task_usd: cost_per_user_usd: cost_per_workflow_usd: ``` --- ## Hard vs. Soft Ceilings A **hard ceiling** is non-negotiable: crossing it blocks the agent mid-run and emits a C8 HITL event to `s1-hitl-review-policy` with `decision_origin: escalation`. The human reviewer decides whether to continue, downgrade the model tier, or abort. A hard ceiling is set at the level where further spend is not sanctioned — either a vendor limit, a per-customer contract commitment, or an internal policy floor. A **soft ceiling** is a cost-efficiency warning: crossing it logs a routing advisory to `s2-audit-trail-schema` via the C7 payload in `e6-routing-policy`, but does not gate the run. Soft ceilings are set roughly 80% of the hard ceiling to give operators runway to intervene before the hard gate fires. The relationship must be: `soft_monthly_usd < hard_monthly_usd`. An inverted ceiling is a validation error the routing policy must reject at startup. --- ## Worked Example: GoodSam Community-Intelligence Agent (illustrative) The following example uses the GoodSam.ai platform context. All numbers are `(illustrative)` — they reflect plausible order-of-magnitude costs for a small production agent, not verified vendor pricing. ```yaml agent_id: goodsam-community-intel billing_period: monthly cost_categories: model_api: provider: "cloud-model-provider-A" # generic tiers: premium: model_id: "frontier-reasoning-model" # (illustrative) input_cost_per_1k_tokens: 0.015 # (illustrative) output_cost_per_1k_tokens: 0.060 # (illustrative) avg_tokens_per_call: 4200 calls_per_billing_period: 800 cache_discount_pct: 15 standard: model_id: "mid-tier-model" # (illustrative) input_cost_per_1k_tokens: 0.003 # (illustrative) output_cost_per_1k_tokens: 0.015 # (illustrative) avg_tokens_per_call: 2800 calls_per_billing_period: 6000 cache_discount_pct: 20 cheap: model_id: "small-fast-model" # (illustrative) input_cost_per_1k_tokens: 0.0004 # (illustrative) output_cost_per_1k_tokens: 0.0012 # (illustrative) avg_tokens_per_call: 900 calls_per_billing_period: 14000 cache_discount_pct: 30 deterministic: description: "intent-classifier regex + rule engine" calls_per_billing_period: 22000 cost_usd: 0 compute: runtime: cloud-function unit_cost: 0.20 # (illustrative) per 1M invocations estimated_units_per_billing_period: 45000 storage: vector_store_gb: 12 vector_store_cost_per_gb_month: 0.30 # (illustrative) object_storage_gb: 40 object_storage_cost_per_gb_month: 0.023 # (illustrative) operations_per_billing_period: 180000 cost_per_1k_operations: 0.005 # (illustrative) retrieval: embedding_calls_per_billing_period: 28000 embedding_cost_per_1k_calls: 0.10 # (illustrative) search_calls_per_billing_period: 18000 search_cost_per_call: 0.004 # (illustrative) rerank_calls_per_billing_period: 5000 rerank_cost_per_call: 0.007 # (illustrative) tooling_connectors: - tool_id: county-gis-lookup vendor: "FL GIS API" # (illustrative) pricing_model: per-call monthly_base_cost_usd: 0 per_call_cost_usd: 0.002 # (illustrative) avg_calls_per_billing_period: 3200 - tool_id: public-records-search vendor: "FS Chapter 119 portal" # (illustrative) pricing_model: subscription monthly_base_cost_usd: 50 # (illustrative) per_call_cost_usd: 0 avg_calls_per_billing_period: 900 human_review: reviewer_role: "District Program Coordinator" hourly_rate_usd: 65 # (illustrative) median_review_minutes: 8 escalations_per_billing_period: 35 egress: gb_per_billing_period: 4 cost_per_gb_usd: 0.09 # (illustrative) observability: flat_monthly_cost_usd: 45 # (illustrative) budget_ceilings: hard_monthly_usd: 900 # (illustrative) soft_monthly_usd: 720 # (illustrative) hard_per_run_usd: 0.85 # (illustrative) hard_token_per_call: 8000 cost_per_outcome: cost_per_task_usd: 0.18 # (illustrative) ~4900 tasks/month cost_per_user_usd: 2.25 # (illustrative) ~400 active users cost_per_workflow_usd: 0.52 # (illustrative) ~1730 workflow runs ``` ### Approximate monthly total (illustrative) | Category | Estimated USD/mo | |---|---| | Model API — premium tier | $63 | | Model API — standard tier | $126 | | Model API — cheap tier | $16 | | Compute (cloud functions) | $9 | | Storage (vector + object) | $5 | | Retrieval (embed + search + rerank) | $82 | | Tooling / connectors | $56 | | Human review (HITL time) | $304 | | Egress + observability | $49 | | **Total** | **~$710** | Human review is the largest single line item at ~43% of total spend — a pattern common to production agents operating in regulated or public-sector contexts. This is why HITL frequency is a first-class optimization target in `e6-waste-reduction-playbook`. --- ## Usage Notes Populate this spec before building `e6-routing-policy`. The routing policy's tier-selection logic must have verified cost-per-call figures to enforce the hard ceiling. Treat the `cost_per_outcome` block as a derived metric — update it each billing period rather than hardcoding a projection. If any tier's actual spend diverges from the estimate by more than 25%, reopen this spec and the quality-cost matrix together. The `hard_monthly_usd` field is the budget signal used by the C8 contract: when the routing policy's running tally crosses this value, it must emit the HITL event described in `e6-routing-policy`.