Why LLM cost tracking gets missed
Traditional software analytics rarely explain why AI spend changed. Token usage grows across prompts, background jobs, retries, and model changes, but most teams still only see a provider bill grouped by account. That makes it hard to answer simple questions: which feature is expensive, which release caused the spike, and whether a model upgrade actually improved anything enough to justify its cost.
The core metrics to track
Cost by feature
Tie each model call to the product feature or workflow that triggered it.
Input and output tokens
Separate prompt growth from completion growth so you know what changed.
Model and provider
Compare cost drift across providers such as OpenAI, Anthropic, and Google Gemini, including any fallback routing between them.
Request volume
Distinguish a healthy traffic increase from a runaway prompt loop.
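The four metrics above can all hang off a single per-call record. A minimal sketch of that record, assuming illustrative field names and made-up per-million-token prices (real rates vary by model and change over time):

```python
from dataclasses import dataclass

# Illustrative (input, output) USD prices per 1M tokens -- NOT real rates.
ASSUMED_PRICES = {
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet": (3.00, 15.00),
}

@dataclass
class LLMCallRecord:
    provider: str        # e.g. "openai", "anthropic"
    model: str           # key into the pricing table
    feature: str         # product feature or workflow that triggered the call
    input_tokens: int
    output_tokens: int
    timestamp: float     # unix epoch seconds

    def cost_usd(self) -> float:
        """Estimate cost from token counts and the assumed price table."""
        in_price, out_price = ASSUMED_PRICES[self.model]
        return (self.input_tokens * in_price
                + self.output_tokens * out_price) / 1_000_000
```

With records shaped like this, every metric in the list is a straightforward aggregation over one table.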
A simple workflow that works
Instrument every model call
Capture provider, model, input tokens, output tokens, feature name, and the event timestamp right after each response.
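A sketch of that instrumentation as a thin wrapper, assuming a hypothetical `client_fn` that returns a dict with a `usage` block and a hypothetical `log_event` sink; adapt the field access to whatever your provider SDK actually returns:

```python
import time

def instrumented_call(client_fn, provider, model, feature, prompt, log_event):
    """Call the model, then log the metrics described above.

    client_fn and log_event are stand-ins for your provider client
    and analytics sink; they are assumptions, not a real API.
    """
    response = client_fn(model=model, prompt=prompt)
    log_event({
        "provider": provider,
        "model": model,
        "feature": feature,
        "input_tokens": response["usage"]["input_tokens"],
        "output_tokens": response["usage"]["output_tokens"],
        "timestamp": time.time(),  # captured right after the response
    })
    return response
```

Keeping the wrapper this thin means retries, background jobs, and fallback calls all flow through the same logging path.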
Group spend by feature
Summaries, chat, search, and background enrichment should each be visible as separate cost centers.
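Given per-call events with a feature name and an estimated cost, the grouping is a few lines of stdlib Python (the event shape here is an assumption, matching the instrumentation step above):

```python
from collections import defaultdict

def spend_by_feature(events):
    """Sum estimated spend per feature, most expensive first.

    Each event is assumed to carry 'feature' and a precomputed
    'cost_usd'; in practice you'd derive cost from model-specific rates.
    """
    totals = defaultdict(float)
    for e in events:
        totals[e["feature"]] += e["cost_usd"]
    # Sort descending so the biggest cost center is listed first.
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))
```

The same aggregation run per day gives you the time series that the alerting step needs.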
Add alerts for sudden changes
Notify the team when daily spend, feature spend, or cost per request jumps past a threshold.
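One way to sketch that threshold check, comparing today's per-feature spend against a baseline; the 1.5x ratio and $5 minimum jump are illustrative defaults, not recommendations:

```python
def spend_alerts(today, baseline, ratio=1.5, min_delta=5.0):
    """Flag features whose daily spend jumped past a threshold.

    Requires BOTH a relative jump (ratio) and an absolute jump
    (min_delta) so tiny cost centers don't page the team on noise.
    today and baseline map feature name -> daily spend in USD.
    """
    alerts = []
    for feature, spend in today.items():
        base = baseline.get(feature, 0.0)
        if spend >= base * ratio and spend - base >= min_delta:
            alerts.append((feature, base, spend))
    return alerts
```

The same shape works for cost per request or total daily spend; only the dictionaries you feed in change.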
What good visibility looks like
A strong LLM cost tracking setup shows which features consume the most spend, how token usage changes over time, which model choices increased cost, and whether an alert came from higher traffic or worse prompt efficiency. The goal is not just reporting. It is making it obvious what to optimize next.