Revenue and cost trajectory as practices grow. Fixed overhead ($10.01) amortises rapidly — margin improves at scale.
| Practices | Revenue/mo | Infra cost | AI cost | Total cost | Profit | Margin |
|---|---|---|---|---|---|---|
May 2026 actuals · ~1.5 active practices · ap-southeast-2
| Service | May cost | Type | Per practice |
|---|---|---|---|
| Amplify (hosting) | $4.41 | Fixed | — |
| Domain ($48/yr ÷ 12) | $4.00 | Fixed (annual) | — |
| S3 (storage + requests) | $2.47 | Variable | $1.65 |
| Route 53 + ECR | $1.60 | Fixed | — |
| SQS | $1.21 | Variable | $0.81 |
| AppSync | $0.26 | Variable | $0.17 |
| DynamoDB | $0.15 | Variable | $0.10 |
| Lambda | $0.00 | Free tier | — |
| Cognito / CW / SNS / KMS | $0.00 | Free tier | — |
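
The fixed/variable split in the table above can be turned into a small projection. This is a minimal sketch, not the dashboard's actual model: it assumes variable costs scale linearly with practice count and ignores free-tier breakpoints. The function names are illustrative.

```python
# Figures copied from the May 2026 table; fixed vs. variable follows the "Type" column.
FIXED_MONTHLY = {"Amplify": 4.41, "Domain": 4.00, "Route 53 + ECR": 1.60}
VARIABLE_MAY = {"S3": 2.47, "SQS": 1.21, "AppSync": 0.26, "DynamoDB": 0.15}
ACTIVE_PRACTICES_MAY = 1.5


def variable_cost_per_practice() -> float:
    """May variable spend divided by the number of active practices."""
    return sum(VARIABLE_MAY.values()) / ACTIVE_PRACTICES_MAY


def infra_cost(practices: float) -> float:
    """Projected monthly infra cost (assumption: linear scaling of variable cost)."""
    return sum(FIXED_MONTHLY.values()) + variable_cost_per_practice() * practices


def fixed_share_per_practice(practices: float) -> float:
    """Fixed overhead carried by each practice; shrinks as practices grow."""
    return sum(FIXED_MONTHLY.values()) / practices


if __name__ == "__main__":
    for n in (1.5, 10, 50):
        print(f"{n:>5} practices: total ${infra_cost(n):.2f}/mo, "
              f"fixed share ${fixed_share_per_practice(n):.2f}/practice")
```

The fixed share per practice falls in inverse proportion to practice count, which is the amortisation effect the trajectory table describes.
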
Adjust monthly job volumes to model usage scenarios. Token counts are the P90 of 139 production log samples, a deliberately conservative estimate.
| Month | Bedrock | Infra (AU) | Total AU | Note |
|---|---|---|---|---|
| Mar 2026 | $13.61 | $5.59 | $75.16 | Incl. $48 domain + dev costs |
| Apr 2026 | $36.31 | $7.35 | $48.04 | — |
| May 2026 | $29.20 | $10.16 | $43.30 | — |
Lambda, Cognito, CloudWatch, SNS, KMS all billed $0.00 in May 2026. Current usage at ~1.5 practices.
| Service | May usage | Free limit | Breaks free tier at |
|---|---|---|---|
| Lambda | 23,907 GB-Sec (6%) | 400,000 GB-Sec/mo | ~25 practices |
| Cognito | 898 MAU | 50,000 MAU | ~83 practices |
| CloudWatch Logs | 0.57 GB-Mo | 5 GB/month | ~13 practices |
| SNS | 74 notifications | 1,000/month | ~20 practices |
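
The "Breaks free tier at" column can be reproduced by scaling May usage linearly up to each free limit. A minimal sketch, assuming usage grows in proportion to practice count (the helper name is illustrative):

```python
MAY_PRACTICES = 1.5

# (May usage, free-tier limit) copied from the table above.
FREE_TIER = {
    "Lambda (GB-sec)": (23_907, 400_000),
    "Cognito (MAU)": (898, 50_000),
    "CloudWatch Logs (GB)": (0.57, 5),
    "SNS (notifications)": (74, 1_000),
}


def break_even_practices(usage: float, limit: float,
                         practices: float = MAY_PRACTICES) -> float:
    """Practice count at which linearly scaled usage reaches the free limit."""
    return practices * limit / usage


if __name__ == "__main__":
    for service, (usage, limit) in FREE_TIER.items():
        # Truncate rather than round, so the figure is the last fully free count.
        print(f"{service}: ~{int(break_even_practices(usage, limit))} practices")
```

CloudWatch Logs is the first service to leave the free tier under this assumption, well before Lambda.
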
165K input tokens per call with zero caching. System prompt is ~70K constant tokens across all calls.
| Scenario | Cost / call |
|---|---|
| Current (no cache) | — |
| With prompt caching | — |
| Saving per call | — |
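
A rough way to estimate the per-call saving from caching the constant system prompt. This sketch rests on two assumptions that should be checked against current Bedrock pricing: the feature is billed at the $3.30/M Sonnet input rate quoted elsewhere on this page, and cached tokens are read at 10% of the base input rate. It covers input tokens only and ignores output tokens and the one-off cache write premium.

```python
INPUT_PRICE_PER_M = 3.30      # USD per million input tokens (rate quoted on this page)
CACHE_READ_MULTIPLIER = 0.10  # ASSUMPTION: cache reads billed at 10% of base input


def input_cost(total_tokens: int, cached_tokens: int = 0,
               price_per_m: float = INPUT_PRICE_PER_M,
               cache_multiplier: float = CACHE_READ_MULTIPLIER) -> float:
    """Input cost of one call when `cached_tokens` of the prompt hit the cache."""
    uncached = total_tokens - cached_tokens
    return (uncached * price_per_m
            + cached_tokens * price_per_m * cache_multiplier) / 1_000_000


if __name__ == "__main__":
    total, system_prompt = 165_000, 70_000  # token counts from the text above
    no_cache = input_cost(total)
    with_cache = input_cost(total, cached_tokens=system_prompt)
    print(f"no cache:   ${no_cache:.4f}")
    print(f"with cache: ${with_cache:.4f}")
    print(f"saving:     ${no_cache - with_cache:.4f} "
          f"({(no_cache - with_cache) / no_cache:.0%})")
```

Under these assumptions the input cost drops from about $0.54 to about $0.34 per call, provided the call lands within the cache TTL.
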
134K input tokens per call with a large fixed template. Caching could save 40–50% on repeat calls within the TTL window.
| Scenario | Cost / call |
|---|---|
| Current (no cache) | — |
| With prompt caching | — |
| Saving per call | — |
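
The 40–50% range can be sanity-checked by asking how large the fixed template must be to produce it. A minimal sketch, assuming cached tokens are billed at 10% of the base input rate (an assumption to verify against current pricing) and counting input tokens only:

```python
CACHE_READ_MULTIPLIER = 0.10  # ASSUMPTION: cache reads billed at 10% of base input


def template_tokens_for_saving(total_tokens: int, saving_fraction: float,
                               cache_multiplier: float = CACHE_READ_MULTIPLIER) -> float:
    """Cached-template size needed to cut input cost by `saving_fraction`.

    Saving = cached_share * (1 - cache_multiplier), solved for the cached tokens.
    """
    cached_share = saving_fraction / (1 - cache_multiplier)
    return total_tokens * cached_share


if __name__ == "__main__":
    total = 134_000  # input tokens per call, from the text above
    for saving in (0.40, 0.50):
        tokens = template_tokens_for_saving(total, saving)
        print(f"{saving:.0%} saving needs ~{tokens / 1000:.0f}K cached tokens")
```

With these inputs, a template of roughly 60K to 74K tokens would account for the quoted range.
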
Action item extraction runs on Sonnet 4.5 at $3.30/M input tokens. Claude Haiku 3.5 costs $0.80/M input tokens, with the same quality for short extraction tasks.
| Model | Cost / extraction |
|---|---|
| Sonnet 4.5 (current) | — |
| Haiku 3.5 | — |
| Saving per call | — |
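
The model comparison reduces to a ratio of input prices. The sketch below uses the two rates quoted above. The 2,000-token extraction size is a hypothetical placeholder, not a measured value, and output-token pricing is left out.

```python
SONNET_INPUT_PER_M = 3.30  # USD per million input tokens (quoted above)
HAIKU_INPUT_PER_M = 0.80   # USD per million input tokens (quoted above)


def extraction_input_cost(input_tokens: int, price_per_m: float) -> float:
    """Input-side cost of a single extraction call."""
    return input_tokens * price_per_m / 1_000_000


def input_saving_fraction() -> float:
    """Share of input cost removed by switching models; independent of call size."""
    return 1 - HAIKU_INPUT_PER_M / SONNET_INPUT_PER_M


if __name__ == "__main__":
    tokens = 2_000  # HYPOTHETICAL extraction size, for illustration only
    sonnet = extraction_input_cost(tokens, SONNET_INPUT_PER_M)
    haiku = extraction_input_cost(tokens, HAIKU_INPUT_PER_M)
    print(f"Sonnet: ${sonnet:.4f}  Haiku: ${haiku:.4f}  "
          f"saving: {input_saving_fraction():.0%}")
```

Because the saving is a fixed ratio of the two rates, the percentage holds for any extraction size; only the dollar amount changes.
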
45 GB stored in S3 as of May 2026 (~30 GB/practice) with no expiry rule. AI outputs (PDFs, transcripts, presentations) accumulate indefinitely — S3 nearly doubled Mar→May. Recommend: transition to S3-IA after 90 days, Glacier after 365 days. Estimated saving: 40–60% of storage costs at steady state.
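
The recommended tiering could be expressed as an S3 lifecycle configuration. This is a sketch only: the rule ID and the empty prefix (which applies the rule to the whole bucket) are illustrative choices, not existing settings. The resulting dictionary has the shape expected by boto3's `put_bucket_lifecycle_configuration` as its `LifecycleConfiguration` argument.

```python
import json


def build_lifecycle_config(ia_after_days: int = 90,
                           glacier_after_days: int = 365,
                           prefix: str = "") -> dict:
    """Lifecycle rules: Standard -> Standard-IA -> Glacier, by object age in days."""
    return {
        "Rules": [
            {
                "ID": "tier-ai-outputs",       # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # "" matches every object in the bucket
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_config(), indent=2))
```

Scoping the rule with a prefix for AI outputs, rather than the whole bucket, would keep frequently read objects in the standard tier.
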
| Cost item | Source | Confidence | Note |
|---|---|---|---|
| Infrastructure costs | May 2026 AWS invoice | High | Real billing, ap-southeast-2 |
| Bedrock pricing | AWS console, Jun 2026 | High | Confirmed Sonnet 4.5 & 4.6 rates |
| AI token usage | 139 production log samples | High | P90 across 5 features |
| INFRA_VARIABLE ($3.00/mo) | May actuals ÷ 1.5 practices | Medium | 7% buffer above actual $2.79 |
| S3 storage model | 45 GB observed, May 2026 | Medium | Cumulative — no lifecycle policy yet |
| Scale projections | Linear extrapolation | Medium | Lambda free tier is exceeded at ~25 practices |