Cost Tracking

Running autonomous AI loops consumes tokens, and tokens cost money. Ralph-starter includes a built-in cost tracker that estimates token usage and calculates costs per iteration, giving you visibility into how much each loop run is spending.

Enabling Cost Tracking

Cost tracking is enabled by default. You can explicitly control it with CLI flags:

# Enabled by default -- these are equivalent:
ralph-starter run "add user dashboard" --preset feature
ralph-starter run "add user dashboard" --preset feature --track-cost

# Disable cost tracking:
ralph-starter run "add user dashboard" --preset feature --no-track-cost

Token Estimation

Ralph-starter estimates token counts from the raw text of each agent interaction rather than relying on the provider's token counter. The estimation uses different ratios depending on content type:

| Content Type | Characters per Token | Example |
|--------------|----------------------|---------|
| Prose (English text, docs, explanations) | ~4 chars/token | A 2,000-character paragraph is ~500 tokens |
| Code (source files, config, commands) | ~3.5 chars/token | A 2,000-character code block is ~571 tokens |

The tracker detects code content by checking for common patterns like ```, function, const, import, export, class, def, async, and await. If any of these patterns are found, the lower 3.5 chars/token ratio is applied.

These are approximations. Actual token counts may vary by 10-20% depending on the model's tokenizer, language, and content structure.
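
The heuristic can be sketched in a few lines of TypeScript. This is illustrative only — the function and constant names are assumptions, not ralph-starter's actual internals — but it implements the ratios and pattern check described above:

```ts
// Illustrative sketch of the chars-per-token heuristic described above.
// CODE_PATTERNS and estimateTokens are assumed names, not ralph-starter's API.
const CODE_PATTERNS = ["```", "function", "const", "import", "export", "class", "def", "async", "await"];

function estimateTokens(text: string): number {
  const looksLikeCode = CODE_PATTERNS.some((pattern) => text.includes(pattern));
  const charsPerToken = looksLikeCode ? 3.5 : 4; // code packs more tokens per character
  return Math.round(text.length / charsPerToken); // rounding choice is an assumption
}

// 2,000 characters of prose -> Math.round(2000 / 4)   = 500 tokens
// 2,000 characters of code  -> Math.round(2000 / 3.5) = 571 tokens
```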

Model Pricing

The cost tracker uses the following pricing table to convert token estimates into dollar costs. Prices are per 1 million tokens.

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|-------|-----------------------|------------------------|
| Claude 3 Opus | $15.00 | $75.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| Claude 3.5 Haiku | $0.25 | $1.25 |
| GPT-4 | $30.00 | $60.00 |
| GPT-4 Turbo | $10.00 | $30.00 |
| Default (unknown models) | $3.00 | $15.00 |

If the model you are using is not in the table, the tracker falls back to the Default pricing (Claude 3.5 Sonnet-equivalent rates), a middle-of-the-road estimate for current models.
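
A minimal sketch of the lookup, assuming it is a plain table keyed by model name (MODEL_PRICING, getPricing, the field names, and the exact key strings are hypothetical):

```ts
// Hypothetical shape for the pricing table above; names are illustrative.
interface ModelPricing {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

const MODEL_PRICING: Record<string, ModelPricing> = {
  "claude-3-opus":     { inputPerMTok: 15.0,  outputPerMTok: 75.0 },
  "claude-3.5-sonnet": { inputPerMTok: 3.0,   outputPerMTok: 15.0 },
  "claude-3.5-haiku":  { inputPerMTok: 0.25,  outputPerMTok: 1.25 },
  "gpt-4":             { inputPerMTok: 30.0,  outputPerMTok: 60.0 },
  "gpt-4-turbo":       { inputPerMTok: 10.0,  outputPerMTok: 30.0 },
};

// Unknown models fall back to Sonnet-equivalent rates.
const DEFAULT_PRICING: ModelPricing = { inputPerMTok: 3.0, outputPerMTok: 15.0 };

function getPricing(model: string): ModelPricing {
  return MODEL_PRICING[model] ?? DEFAULT_PRICING;
}
```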

You can specify the model for more accurate pricing:

ralph-starter run "add feature" --preset feature --model claude-3-opus

Per-Iteration Cost Breakdown

When cost tracking is enabled and progress tracking is active, ralph-starter writes a cost summary to the activity.md file in your project directory. This summary is updated after each iteration.

The summary looks like this:

## Cost Summary

| Metric | Value |
|--------|-------|
| Total Iterations | 12 |
| Total Tokens | 45.2K |
| Input Tokens | 38.1K |
| Output Tokens | 7.1K |
| Total Cost | $0.221 |
| Avg Cost/Iteration | $0.018 |
| Projected Max Cost | $0.552 |

Understanding the Metrics

| Metric | Description |
|--------|-------------|
| Total Iterations | Number of agent loop iterations completed so far |
| Total Tokens | Combined input and output tokens across all iterations |
| Input Tokens | Tokens sent to the model (prompts, context, code) |
| Output Tokens | Tokens generated by the model (responses, code output) |
| Total Cost | Estimated cumulative cost in USD |
| Avg Cost/Iteration | Total cost divided by the number of iterations |
| Projected Max Cost | Extrapolated cost if the loop runs to its maxIterations limit |

The Projected Max Cost is only shown after 3 or more iterations, since earlier projections would be unreliable. It multiplies the average cost per iteration by the remaining iteration budget and adds the cost already incurred.
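
The projection can be checked against the example summary above. In sketch form (the function is illustrative, matching the formula just described):

```ts
// Illustrative: the projection formula as described above.
function projectedMaxCost(totalCost: number, iterations: number, maxIterations: number): number | undefined {
  if (iterations < 3) return undefined; // too few iterations for a reliable projection
  const avgPerIteration = totalCost / iterations;
  const remainingIterations = maxIterations - iterations;
  return totalCost + avgPerIteration * remainingIterations;
}

// From the summary above: $0.221 after 12 iterations with a 30-iteration cap
// 0.221 + (0.221 / 12) * 18 ≈ $0.552
```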

CLI Display

During loop execution, cost statistics are displayed in the terminal:

Tokens: 45.2K (38.1K in / 7.1K out)
Cost: $0.221 ($0.018/iteration avg)
Projected max cost: $0.552

Token counts use shorthand notation:

  • Values under 1,000 are shown as-is (e.g., 847)
  • Values from 1,000 to 999,999 use the K suffix (e.g., 45.2K)
  • Values of 1,000,000 and above use the M suffix (e.g., 1.23M)

Cost values are formatted for readability:

  • Costs under $0.01 are shown with a cent symbol (e.g., 0.50¢)
  • Costs from $0.01 to $0.99 show three decimal places (e.g., $0.221)
  • Costs $1.00 and above show two decimal places (e.g., $12.50)
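
Both sets of rules are easy to mirror if you post-process logs; here is a sketch with hypothetical helper names:

```ts
// Illustrative helpers mirroring the display rules above; names are assumptions.
function formatTokens(n: number): string {
  if (n < 1_000) return String(n);                        // 847
  if (n < 1_000_000) return (n / 1_000).toFixed(1) + "K"; // 45.2K
  return (n / 1_000_000).toFixed(2) + "M";                // 1.23M
}

function formatCost(usd: number): string {
  if (usd < 0.01) return (usd * 100).toFixed(2) + "¢"; // 0.50¢
  if (usd < 1) return "$" + usd.toFixed(3);            // $0.221
  return "$" + usd.toFixed(2);                         // $12.50
}
```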

Tips for Controlling Costs

Choose the Right Preset

Presets with lower maxIterations naturally cap spending:

| Preset | Max Iterations | Typical Use |
|--------|----------------|-------------|
| review | 10 | Read-only analysis |
| incident-response | 15 | Quick, focused fix |
| feature-minimal | 20 | Simple implementation |
| debug | 20 | Investigation |
| feature | 30 | Standard development |
| tdd-red-green | 50 | Full TDD cycle |
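
As a rough worked example using the $0.018 average cost per iteration from the summary above: the review preset (10 iterations) caps worst-case spend near $0.18, while tdd-red-green (50 iterations) could reach about $0.90.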

Use --max-iterations to Set Hard Limits

Override any preset's iteration cap:

# Cap at 10 iterations regardless of preset
ralph-starter run "add feature" --preset feature --max-iterations 10

Enable Circuit Breakers

Circuit breakers stop the loop when the agent is stuck, preventing token waste on repeated failures. See the Circuit Breaker documentation for details.

ralph-starter run "add feature" --circuit-breaker-failures 2 --circuit-breaker-errors 3

Use Cheaper Models

If your task does not require the most capable model, a cheaper model reduces cost per iteration significantly. Claude 3.5 Haiku, at $0.25/$1.25 per million tokens, is 60x cheaper than Claude 3 Opus on both input and output.

Use Rate Limiting

Rate limiting caps how fast the loop calls the API, giving you a natural spending brake. See the Rate Limiting documentation.

ralph-starter run "add feature" --rate-limit 30  # Max 30 calls/hour

Monitor the Projected Cost

Watch the Projected Max Cost in the activity.md file. If the projection exceeds your budget, you can stop the loop early (Ctrl+C) and restart with a lower --max-iterations or a different approach.

Cost Tracking Architecture

The cost tracker records each iteration independently:

  1. Input text (the prompt sent to the agent) is measured and converted to estimated input tokens.
  2. Output text (the agent's response) is measured and converted to estimated output tokens.
  3. Both are combined into a TokenEstimate (inputTokens, outputTokens, totalTokens).
  4. The estimate is multiplied by the model's pricing to produce a CostEstimate (inputCost, outputCost, totalCost).
  5. Each iteration's data is stored and used to compute running averages and projections.
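
Put together, the flow looks roughly like the sketch below, reusing the estimateTokens and ModelPricing sketches from earlier. The CostTracker class and recordIteration method are assumed names; only TokenEstimate, CostEstimate, and their fields come from the description above:

```ts
// Illustrative sketch of the per-iteration record flow.
interface TokenEstimate { inputTokens: number; outputTokens: number; totalTokens: number; }
interface CostEstimate { inputCost: number; outputCost: number; totalCost: number; }

class CostTracker {
  private perIteration: CostEstimate[] = []; // step 5: one entry per iteration

  constructor(private pricing: ModelPricing) {}

  recordIteration(inputText: string, outputText: string): CostEstimate {
    // Steps 1-3: estimate tokens from the raw prompt and response text.
    const inputTokens = estimateTokens(inputText);
    const outputTokens = estimateTokens(outputText);
    const tokens: TokenEstimate = { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens };

    // Step 4: convert token estimates to dollars via per-1M-token pricing.
    const inputCost = (tokens.inputTokens / 1_000_000) * this.pricing.inputPerMTok;
    const outputCost = (tokens.outputTokens / 1_000_000) * this.pricing.outputPerMTok;
    const cost: CostEstimate = { inputCost, outputCost, totalCost: inputCost + outputCost };

    // Step 5: store for running averages and the max-cost projection.
    this.perIteration.push(cost);
    return cost;
  }

  get avgCostPerIteration(): number {
    if (this.perIteration.length === 0) return 0;
    return this.perIteration.reduce((sum, c) => sum + c.totalCost, 0) / this.perIteration.length;
  }
}
```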

The tracker instance is created when the loop starts (if trackCost is true) and is available throughout the loop's lifetime. At the end of the loop, the final statistics are included in the LoopResult.stats.costStats object.