Llm usage records by ajaycj · Pull Request #4 · salesforce-misc/switchplane

ajaycj · 2026-04-27T23:03:09Z

What's in the PR?
A switchplane.usage module will normalize token counts and cost estimates, AgentContext will emit llm.usage, and the devops graph will record the one LLM node plus estimated deterministic savings. I'm going to make those focused edits and add tests for the new usage helper and event emission.

salesforce-cla · 2026-04-27T23:03:14Z

Thanks for the contribution! Unfortunately we can't verify the commit author(s): Ajay Chinthalapalli Jayakumar <a***@s***.com>. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.

demianbrecht

See inline comments. The main concern is that LangChain already surfaces token usage on every AIMessage via usage_metadata — this PR re-implements that extraction manually and adds significant ceremony to task code. A LangGraph callback approach would achieve the same result transparently.

demianbrecht · 2026-04-29T18:48:18Z

+    """Extract provider-reported token counts from common LangChain responses."""
+
+    usage_metadata = getattr(response, "usage_metadata", None)
+    if isinstance(usage_metadata, dict):
+        prompt = _coerce_int(usage_metadata.get("input_tokens") or usage_metadata.get("prompt_tokens"))
+        completion = _coerce_int(
+            usage_metadata.get("output_tokens")
+            or usage_metadata.get("completion_tokens")
+            or usage_metadata.get("generated_tokens")
+        )
+        total = _coerce_int(usage_metadata.get("total_tokens"))
+        if prompt is not None or completion is not None or total is not None:
+            return prompt, completion, total
+
+    response_metadata = getattr(response, "response_metadata", None)
+    if isinstance(response_metadata, dict):
+        token_usage = response_metadata.get("token_usage") or response_metadata.get("usage")
+        if isinstance(token_usage, dict):
+            prompt = _coerce_int(token_usage.get("prompt_tokens") or token_usage.get("input_tokens"))
+            completion = _coerce_int(token_usage.get("completion_tokens") or token_usage.get("output_tokens"))
+            total = _coerce_int(token_usage.get("total_tokens"))
+            if prompt is not None or completion is not None or total is not None:
+                return prompt, completion, total
+
+    return None, None, None


LangChain already surfaces token usage on every AIMessage via usage_metadata (input_tokens, output_tokens, total_tokens). This function is manually re-extracting data that the framework already provides — it doesn't add new information.

A LangGraph callback (on_llm_end) could capture this automatically without any custom extraction logic.
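As a rough illustration of how little extraction is needed, here is a minimal sketch that reads usage_metadata from the object a callback's on_llm_end receives. It is duck-typed so it runs without LangChain installed. The function name usage_from_llm_result is hypothetical, and the sketch assumes the result exposes generations as a list of lists whose items carry a message with a usage_metadata dict, which is how chat-model results are normally shaped.

```python
from types import SimpleNamespace
from typing import Any, Optional


def usage_from_llm_result(result: Any) -> Optional[dict]:
    """Sum usage_metadata over every chat generation in an LLMResult-shaped object.

    Returns None when no generation carries usage_metadata.
    """
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    found = False
    for batch in getattr(result, "generations", []):
        for generation in batch:
            # Chat generations expose the AIMessage as .message; plain
            # text generations do not, so they are skipped here.
            message = getattr(generation, "message", None)
            usage = getattr(message, "usage_metadata", None)
            if not isinstance(usage, dict):
                continue
            found = True
            for key in totals:
                totals[key] += int(usage.get(key) or 0)
    return totals if found else None


if __name__ == "__main__":
    # Stand-in for a real LLMResult, used only to exercise the function.
    fake_message = SimpleNamespace(
        usage_metadata={"input_tokens": 12, "output_tokens": 5, "total_tokens": 17}
    )
    fake_result = SimpleNamespace(generations=[[SimpleNamespace(message=fake_message)]])
    print(usage_from_llm_result(fake_result))
```

The provider-specific key fallbacks in the diff above are deliberately absent: the sketch relies on the normalized keys only, which is the point of the comment.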

demianbrecht · 2026-04-29T18:48:18Z

+    "claude-sonnet-4-20250514": ModelPricing(3.0, 15.0),
+    "claude-sonnet-4-5-20250929": ModelPricing(3.0, 15.0),
+    "claude-sonnet-4-6": ModelPricing(3.0, 15.0),
+    "claude-opus-4-20250514": ModelPricing(15.0, 75.0),
+    "claude-opus-4-6-v1": ModelPricing(15.0, 75.0),
+    "claude-haiku-4-5-20251001": ModelPricing(1.0, 5.0),
+    "gpt-4o": ModelPricing(2.5, 10.0),
+    "gpt-4o-mini": ModelPricing(0.15, 0.60),
+    "gemini-2.0-flash": ModelPricing(0.10, 0.40),
+    "gemini-2.5-flash": ModelPricing(0.30, 2.50),
+    "gemini-2.5-pro": ModelPricing(1.25, 10.0),
+}


This pricing table will go stale immediately. Prices change frequently, model IDs are versioned, and this doesn't account for caching discounts, batch API pricing, prompt caching writebacks, etc.

If cost tracking is needed, it belongs in config or an external source — not a compiled-in dict that requires code changes to update.
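One possible shape for this, sketched under assumptions: pricing lives in a JSON document that is loaded at startup, and the cost helper takes the loaded table as an argument. The JSON schema, load_pricing, and estimate_cost_usd are hypothetical names for illustration; the field names input_per_million and output_per_million and the arithmetic mirror the PR's code, and the example rate is copied from the diff above, not checked against current provider pricing.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelPricing:
    input_per_million: float   # USD per 1M prompt tokens
    output_per_million: float  # USD per 1M completion tokens


def load_pricing(config_text: str) -> Dict[str, ModelPricing]:
    """Parse a JSON object mapping model id -> per-million rates."""
    raw = json.loads(config_text)
    return {
        model: ModelPricing(
            float(rates["input_per_million"]),
            float(rates["output_per_million"]),
        )
        for model, rates in raw.items()
    }


def estimate_cost_usd(
    pricing: Dict[str, ModelPricing],
    model: str,
    prompt_tokens: int,
    completion_tokens: int,
) -> Optional[float]:
    """Return the estimated USD cost, or None when the model has no entry."""
    entry = pricing.get(model)
    if entry is None:
        return None
    cost = (
        prompt_tokens / 1_000_000 * entry.input_per_million
        + completion_tokens / 1_000_000 * entry.output_per_million
    )
    return round(cost, 6)


if __name__ == "__main__":
    # Hypothetical config content; in practice this would come from a file
    # or an external pricing source.
    config = '{"gpt-4o": {"input_per_million": 2.5, "output_per_million": 10.0}}'
    table = load_pricing(config)
    print(estimate_cost_usd(table, "gpt-4o", 1000, 500))
    print(estimate_cost_usd(table, "unknown-model", 1000, 500))
```

Updating a price then becomes a config change rather than a code change. Caching and batch discounts would still need extra fields that this sketch does not model.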

demianbrecht · 2026-04-29T18:48:18Z

+    """Estimate USD cost for a model if pricing is known."""
+
+    pricing = MODEL_PRICING.get(model)
+    if pricing is None:
+        return None
+    cost = (prompt_tokens / 1_000_000 * pricing.input_per_million) + (
+        completion_tokens / 1_000_000 * pricing.output_per_million
+    )
+    return round(cost, 6)


This len(text) / 4 heuristic is very rough, and the results get stored in the same LLMUsageRecord alongside provider-reported actuals. Downstream consumers of these records have no reliable way to distinguish precision levels.

The estimated_tokens_saved metric (raw prompt tokens from this estimate minus actual prompt tokens from the provider) is comparing a pre-call guess against a post-call actual — not a meaningful comparison.
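If heuristic counts must be kept at all, one way to make the precision level visible is to tag every count with its origin. The following is a minimal sketch, not the PR's design: TokenSource, TokenCount, estimate_text_tokens_tagged, and prompt_token_count are hypothetical names, and the estimate uses the same len(text) / 4 heuristic described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TokenSource(str, Enum):
    PROVIDER = "provider"    # reported by the model provider after the call
    ESTIMATED = "estimated"  # derived locally from text length


@dataclass(frozen=True)
class TokenCount:
    value: int
    source: TokenSource


def estimate_text_tokens_tagged(text: str) -> TokenCount:
    """Rough len(text) / 4 heuristic, explicitly tagged as an estimate."""
    return TokenCount(len(text) // 4, TokenSource.ESTIMATED)


def prompt_token_count(provider_value: Optional[int], fallback_text: str) -> TokenCount:
    """Prefer the provider-reported count; fall back to the tagged estimate."""
    if provider_value is not None:
        return TokenCount(provider_value, TokenSource.PROVIDER)
    return estimate_text_tokens_tagged(fallback_text)


if __name__ == "__main__":
    print(prompt_token_count(42, "ignored when the provider reports a value"))
    print(prompt_token_count(None, "abcdefgh"))
```

With a tag like this, a consumer can filter to provider-reported values before aggregating, and a derived metric can refuse to subtract an actual from an estimate.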

demianbrecht · 2026-04-29T18:48:18Z

+        usage = llm_usage_from_response(
+            response,
+            task_id=ctx.task_id,
+            model=model,
+            node_name="summarize",
+            fallback_prompt_text=f"{_SYSTEM_PROMPT}\n\n{prompt}",
+            fallback_completion_text=str(response.content),
+            estimated_raw_prompt_tokens=state["estimated_raw_prompt_tokens"],
+            metadata={
+                "deterministic_nodes": 3,
+                "llm_nodes": 1,
+                "rows_processed": state["rows_processed"],
+                "formatted_prompt_tokens_estimate": estimate_text_tokens(prompt),
+            },
+        )
+        ctx.record_llm_usage(
+            model=usage.model,
+            node_name=usage.node_name,
+            prompt_tokens=usage.prompt_tokens,
+            completion_tokens=usage.completion_tokens,
+            total_tokens=usage.total_tokens,
+            estimated_cost_usd=usage.estimated_cost_usd,
+            estimated_raw_prompt_tokens=usage.estimated_raw_prompt_tokens,
+            estimated_tokens_saved=usage.estimated_tokens_saved,
+            metadata=usage.metadata,
+        )
+        ctx.progress(
+            "LLM usage recorded",
+            prompt_tokens=usage.prompt_tokens,
+            completion_tokens=usage.completion_tokens,
+            total_tokens=usage.total_tokens,
+            estimated_cost_usd=usage.estimated_cost_usd,
+            estimated_tokens_saved=usage.estimated_tokens_saved,
        )


This is the core problem with the approach — the summarize node went from ~5 lines to 30+ lines of usage-tracking ceremony. Every LLM node in every task would need this same boilerplate.

If usage tracking is a framework concern, it should be transparent. A LangGraph callback on on_llm_end could emit the llm.usage event automatically with zero changes to task code.
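A minimal sketch of such a handler follows, under several assumptions. The class name UsageEventHandler and the emit callable are hypothetical stand-ins for whatever AgentContext uses to publish events. The sketch assumes LangGraph places the current node name in the run metadata under langgraph_node, and it falls back to a plain object base class so it runs without langchain-core installed.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict
from uuid import UUID, uuid4

try:  # Use the real base class when langchain-core is installed.
    from langchain_core.callbacks import BaseCallbackHandler
except ImportError:  # Fallback so the sketch still runs without the dependency.
    BaseCallbackHandler = object


class UsageEventHandler(BaseCallbackHandler):
    """Emit one llm.usage event per chat-model call, with no task-code changes."""

    def __init__(self, emit: Callable[[str, Dict[str, Any]], None]) -> None:
        super().__init__()
        self._emit = emit
        self._node_by_run: Dict[UUID, Any] = {}

    def on_chat_model_start(self, serialized, messages, *, run_id, metadata=None, **kwargs):
        # Assumption: LangGraph sets metadata["langgraph_node"] for calls made in a node.
        self._node_by_run[run_id] = (metadata or {}).get("langgraph_node")

    def on_llm_error(self, error, *, run_id, **kwargs):
        # Drop the bookkeeping entry so failed calls do not leak memory.
        self._node_by_run.pop(run_id, None)

    def on_llm_end(self, response, *, run_id, **kwargs):
        node_name = self._node_by_run.pop(run_id, None)
        for batch in response.generations:
            for generation in batch:
                message = getattr(generation, "message", None)
                usage = getattr(message, "usage_metadata", None)
                if usage:
                    self._emit("llm.usage", {"node_name": node_name, **usage})


if __name__ == "__main__":
    # Drive the handler by hand with stand-in objects instead of a real graph.
    handler = UsageEventHandler(lambda name, payload: print(name, payload))
    run_id = uuid4()
    handler.on_chat_model_start({}, [[]], run_id=run_id, metadata={"langgraph_node": "summarize"})
    fake_message = SimpleNamespace(
        usage_metadata={"input_tokens": 3, "output_tokens": 2, "total_tokens": 5}
    )
    handler.on_llm_end(
        SimpleNamespace(generations=[[SimpleNamespace(message=fake_message)]]),
        run_id=run_id,
    )
```

In a real run the handler would be attached once where the graph is invoked, for example through the run config's callbacks list, so the summarize node itself would keep only the model call.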

salesforce-cla Bot added the cla:missing label Apr 27, 2026

ajaycj force-pushed the llm-usage-records branch from 818d0ba to 0c6bd87 Compare April 27, 2026 23:04

ajaycj closed this Apr 27, 2026

ajaycj reopened this Apr 27, 2026

ajaycj force-pushed the llm-usage-records branch from 5a0da28 to 92738e1 Compare April 27, 2026 23:09

switchplane.usage module will normalize token counts/cost estimates

905ea08

ajaycj force-pushed the llm-usage-records branch from 92738e1 to 905ea08 Compare April 27, 2026 23:13

salesforce-cla Bot added cla:signed and removed cla:missing labels Apr 27, 2026

demianbrecht reviewed Apr 29, 2026


demianbrecht requested changes Apr 29, 2026


ajaycj wants to merge 1 commit into salesforce-misc:main from ajaycj:llm-usage-records


2 participants
