# Cost Tracking
Running autonomous AI loops consumes tokens, and tokens cost money. Ralph-starter includes a built-in cost tracker that estimates token usage and calculates costs per iteration, giving you visibility into how much each loop run is spending.
## Enabling Cost Tracking

Cost tracking is enabled by default. You can explicitly control it with CLI flags:

```bash
# Enabled by default -- these are equivalent:
ralph-starter run "add user dashboard" --preset feature
ralph-starter run "add user dashboard" --preset feature --track-cost

# Disable cost tracking:
ralph-starter run "add user dashboard" --preset feature --no-track-cost
```
## Token Estimation
Ralph-starter estimates token counts from the raw text of each agent interaction rather than relying on the provider's token counter. The estimation uses different ratios depending on content type:
| Content Type | Characters per Token | Example |
|---|---|---|
| Prose (English text, docs, explanations) | ~4 chars/token | A 2,000-character paragraph is ~500 tokens |
| Code (source files, config, commands) | ~3.5 chars/token | A 2,000-character code block is ~571 tokens |
The tracker detects code content by checking for common patterns such as `` ``` ``, `function`, `const`, `import`, `export`, `class`, `def`, `async`, and `await`. If any of these patterns is found, the lower 3.5 chars/token ratio is applied.
These are approximations. Actual token counts may vary by 10-20% depending on the model's tokenizer, language, and content structure.
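The two-ratio estimate can be sketched roughly as follows. The function and constant names here are illustrative, not ralph-starter's actual source:

```typescript
// Rough sketch of the two-ratio token estimator described above.
// CODE_PATTERNS and estimateTokens are illustrative names.
const CODE_PATTERNS = [
  "```", "function", "const", "import",
  "export", "class", "def", "async", "await",
];

function estimateTokens(text: string): number {
  // Code-like text tokenizes more densely: ~3.5 chars/token vs ~4 for prose.
  const looksLikeCode = CODE_PATTERNS.some((p) => text.includes(p));
  const charsPerToken = looksLikeCode ? 3.5 : 4;
  return Math.round(text.length / charsPerToken);
}
```

A 2,000-character prose string yields 500 tokens; the same length containing `const` yields about 571.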
## Model Pricing
The cost tracker uses the following pricing table to convert token estimates into dollar costs. Prices are per 1 million tokens.
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude 3 Opus | $15.00 | $75.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| Claude 3.5 Haiku | $0.25 | $1.25 |
| GPT-4 | $30.00 | $60.00 |
| GPT-4 Turbo | $10.00 | $30.00 |
| Default (unknown models) | $3.00 | $15.00 |
If the model you are using is not in the table, the tracker falls back to the Default pricing (Claude 3.5 Sonnet-equivalent rates), a reasonable middle-ground estimate.
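The lookup with its fallback can be sketched like this; the table keys and helper names are assumptions, not the actual implementation:

```typescript
// Illustrative pricing table (USD per 1M tokens).
interface ModelPricing { input: number; output: number }

const PRICING: Record<string, ModelPricing> = {
  "claude-3-opus":     { input: 15.0, output: 75.0 },
  "claude-3.5-sonnet": { input: 3.0,  output: 15.0 },
  "claude-3.5-haiku":  { input: 0.25, output: 1.25 },
  "gpt-4":             { input: 30.0, output: 60.0 },
  "gpt-4-turbo":       { input: 10.0, output: 30.0 },
};

// Unknown models fall back to Sonnet-equivalent default rates.
const DEFAULT_PRICING: ModelPricing = { input: 3.0, output: 15.0 };

function getPricing(model: string): ModelPricing {
  return PRICING[model] ?? DEFAULT_PRICING;
}

function costUSD(model: string, inputTokens: number, outputTokens: number): number {
  const p = getPricing(model);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}
```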
You can specify the model for more accurate pricing:

```bash
ralph-starter run "add feature" --preset feature --model claude-3-opus
```
## Per-Iteration Cost Breakdown

When cost tracking is enabled and progress tracking is active, ralph-starter writes a cost summary to the `activity.md` file in your project directory. This summary is updated after each iteration.
The summary looks like this:

```markdown
## Cost Summary

| Metric | Value |
|--------|-------|
| Total Iterations | 12 |
| Total Tokens | 45.2K |
| Input Tokens | 38.1K |
| Output Tokens | 7.1K |
| Total Cost | $0.221 |
| Avg Cost/Iteration | $0.018 |
| Projected Max Cost | $0.552 |
```
### Understanding the Metrics
| Metric | Description |
|---|---|
| Total Iterations | Number of agent loop iterations completed so far |
| Total Tokens | Combined input and output tokens across all iterations |
| Input Tokens | Tokens sent to the model (prompts, context, code) |
| Output Tokens | Tokens generated by the model (responses, code output) |
| Total Cost | Estimated cumulative cost in USD |
| Avg Cost/Iteration | Total cost divided by number of iterations |
| Projected Max Cost | Extrapolated cost if the loop runs to its `maxIterations` limit |
The Projected Max Cost is only shown after 3 or more iterations, since earlier projections would be unreliable. It multiplies the average cost per iteration by the remaining iteration budget and adds the cost already incurred.
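The projection amounts to a small calculation, sketched here with an illustrative function name:

```typescript
// Projected max cost: average cost per iteration extrapolated over the
// remaining iteration budget, plus cost already incurred. Returns null
// below the 3-iteration threshold, where a projection would be unreliable.
function projectedMaxCost(iterationCosts: number[], maxIterations: number): number | null {
  if (iterationCosts.length < 3) return null; // too few samples to extrapolate
  const totalSoFar = iterationCosts.reduce((a, b) => a + b, 0);
  const avg = totalSoFar / iterationCosts.length;
  const remaining = maxIterations - iterationCosts.length;
  return totalSoFar + avg * remaining;
}
```

For example, three iterations at $0.02 each against a 10-iteration budget project to $0.20 total.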
## CLI Display
During loop execution, cost statistics are displayed in the terminal:

```
Tokens: 45.2K (38.1K in / 7.1K out)
Cost: $0.221 ($0.018/iteration avg)
Projected max cost: $0.552
```
Token counts use shorthand notation:

- Values under 1,000 are shown as-is (e.g., `847`)
- Values from 1,000 to 999,999 use the `K` suffix (e.g., `45.2K`)
- Values of 1,000,000 and above use the `M` suffix (e.g., `1.23M`)
Cost values are formatted for readability:

- Costs under $0.01 are shown with a cent symbol (e.g., `0.50¢`)
- Costs from $0.01 to $0.99 show three decimal places (e.g., `$0.221`)
- Costs of $1.00 and above show two decimal places (e.g., `$12.50`)
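Both notations can be reproduced with small formatters like these; they are a sketch of the described behavior, and ralph-starter's exact rounding may differ:

```typescript
// Shorthand token notation: as-is under 1K, then K/M suffixes.
function formatTokens(n: number): string {
  if (n < 1_000) return String(n);
  if (n < 1_000_000) return (n / 1_000).toFixed(1) + "K";
  return (n / 1_000_000).toFixed(2) + "M";
}

// Cost notation: cents under $0.01, three decimals under $1, else two.
function formatCost(usd: number): string {
  if (usd < 0.01) return (usd * 100).toFixed(2) + "¢";
  if (usd < 1) return "$" + usd.toFixed(3);
  return "$" + usd.toFixed(2);
}
```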
## Tips for Controlling Costs
### Choose the Right Preset

Presets with lower `maxIterations` naturally cap spending:

| Preset | Max Iterations | Typical Use |
|---|---|---|
| `review` | 10 | Read-only analysis |
| `incident-response` | 15 | Quick, focused fix |
| `feature-minimal` | 20 | Simple implementation |
| `debug` | 20 | Investigation |
| `feature` | 30 | Standard development |
| `tdd-red-green` | 50 | Full TDD cycle |
### Use `--max-iterations` to Set Hard Limits

Override any preset's iteration cap:

```bash
# Cap at 10 iterations regardless of preset
ralph-starter run "add feature" --preset feature --max-iterations 10
```
### Enable Circuit Breakers

Circuit breakers stop the loop when the agent is stuck, preventing token waste on repeated failures. See the Circuit Breaker documentation for details.

```bash
ralph-starter run "add feature" --circuit-breaker-failures 2 --circuit-breaker-errors 3
```
### Use Cheaper Models

If your task does not require the most capable model, a cheaper model reduces cost per iteration significantly. Claude 3.5 Haiku at $0.25/$1.25 per million tokens is 60x cheaper than Claude 3 Opus on both input and output.
### Use Rate Limiting

Rate limiting caps how fast the loop calls the API, giving you a natural spending brake. See the Rate Limiting documentation.

```bash
ralph-starter run "add feature" --rate-limit 30  # Max 30 calls/hour
```
### Monitor the Projected Cost

Watch the Projected Max Cost in the `activity.md` file. If the projection exceeds your budget, you can stop the loop early (Ctrl+C) and restart with a lower `--max-iterations` or a different approach.
## Cost Tracking Architecture
The cost tracker records each iteration independently:

- Input text (the prompt sent to the agent) is measured and converted to estimated input tokens.
- Output text (the agent's response) is measured and converted to estimated output tokens.
- Both are combined into a `TokenEstimate` (`inputTokens`, `outputTokens`, `totalTokens`).
- The estimate is multiplied by the model's pricing to produce a `CostEstimate` (`inputCost`, `outputCost`, `totalCost`).
- Each iteration's data is stored and used to compute running averages and projections.

The tracker instance is created when the loop starts (if `trackCost` is true) and remains available throughout the loop's lifetime. At the end of the loop, the final statistics are included in the `LoopResult.stats.costStats` object.
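One iteration's record might look like the following sketch. The field names follow the `TokenEstimate` and `CostEstimate` objects described above, but the interface shapes and the `recordIteration` helper are assumptions for illustration:

```typescript
interface TokenEstimate { inputTokens: number; outputTokens: number; totalTokens: number }
interface CostEstimate { inputCost: number; outputCost: number; totalCost: number }

// Combine token estimates with per-1M-token prices for one iteration.
function recordIteration(
  inputTokens: number,
  outputTokens: number,
  inputPricePerM: number,
  outputPricePerM: number,
): { tokens: TokenEstimate; cost: CostEstimate } {
  const tokens: TokenEstimate = {
    inputTokens,
    outputTokens,
    totalTokens: inputTokens + outputTokens,
  };
  const inputCost = (inputTokens * inputPricePerM) / 1_000_000;
  const outputCost = (outputTokens * outputPricePerM) / 1_000_000;
  const cost: CostEstimate = { inputCost, outputCost, totalCost: inputCost + outputCost };
  return { tokens, cost };
}
```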