Deploying logfire-docs with Cloudflare Pages

| | |
| --- | --- |
| Latest commit: | 02111b9 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://9ad08d96.logfire-docs.pages.dev |
| Branch Preview URL: | https://dmontagu-live-evals-docs.logfire-docs.pages.dev |
- Type column: derive `agent` vs `function` from `gen_ai.agent.name` on the event itself (stamped by the OnlineEvaluation capability via OTel baggage), not from the parent span. The platform query was reworked to drop the parent-span join, so decorator targets always classify as `function` now.
- Recent events table cap: 50, not 20 (matches the detail page).
- Rename "failure count" → "error count" in the detail-page and Evaluator Shapes sections, matching the UI rename from "failures" to "errors" (exception-driven errors read distinctly from boolean fails).
- Attribute list: fix `evaluator_version` → `evaluator.version` (the SDK emits with dots, matching `score.value` / `score.label`). Add `gen_ai.evaluation.evaluator.source` (JSON-serialized EvaluatorSpec) and `gen_ai.agent.name` (what drives the kind classification) to the list. A sketch of the resulting attribute set follows this list.
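For concreteness, here is a minimal sketch of the attribute set an evaluation event might carry after these fixes. The attribute names come from the list above; the example values, and the exact `gen_ai.evaluation.` prefixing of the score/version keys, are assumptions rather than verified SDK output.

```python
# Illustrative only: attribute names come from the bullet list above; the
# values and the `gen_ai.evaluation.` prefixing of score/version keys are
# assumptions, not verified SDK output.
evaluation_event_attributes = {
    "gen_ai.evaluation.score.value": 0.92,        # numeric score
    "gen_ai.evaluation.score.label": "pass",      # categorical label
    "gen_ai.evaluation.evaluator.version": "3",   # dotted key, not `evaluator_version`
    "gen_ai.evaluation.evaluator.source": '{"type": "ExactMatch"}',  # hypothetical JSON-serialized EvaluatorSpec
    "gen_ai.agent.name": "support-agent",         # hypothetical; drives agent-vs-function classification
}
```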
… link wording

- Intro paragraph: "one row per evaluator" is wrong — each row is one target, carrying a sparkline per evaluator. Rewritten to describe target rows with per-evaluator sparklines inside.
- Sidebar label: "Evals: Live" → "Evals: Live Monitoring" to match nav-items.ts. Also mention the chevron that expands a row to show per-evaluator detail in the directory.
- Trace link label: match the UI's actual "Open trace in live view" text rather than paraphrasing.
- Add a short note on evaluator-version badges — a feature added during the review pass that wasn't yet documented.
…-naming drift

- mkdocs nav: rename the old "Evals" entry to "Evals: Datasets & Experiments" and the new "Live Evaluations" entry to "Evals: Live Monitoring" to match the platform sidebar labels.
- Sweep stale UI-label references across evaluate/datasets/*.md and guides/web-ui/evals.md (sidebar click instructions, fictional breadcrumbs, and "Evals tab" phrasing that doesn't exist in the UI).
- live-evals.md: link to the semconv, @evaluate, and evaluator-versioning docs; replace "---" bullet separators with em-dashes; correct the Type bullet to explain that an @evaluate function nested in a Pydantic AI agent run is classified as agent (baggage propagation); rewrite the evaluator.source bullet — the UI groups rows by (target, name), so the source doesn't distinguish rows (see the grouping sketch below).
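To make the grouping point concrete, here is a minimal sketch of keying rows by (target, evaluator name). Only the (target, name) grouping itself comes from the text above; the event shape and the exact attribute key for the evaluator name are assumptions for illustration.

```python
# Sketch of the (target, name) row grouping described above. The event dicts
# and the `...evaluator.name` attribute key are illustrative assumptions.
from collections import defaultdict

events = [
    {"gen_ai.evaluation.target": "summarize", "gen_ai.evaluation.evaluator.name": "accuracy"},
    {"gen_ai.evaluation.target": "summarize", "gen_ai.evaluation.evaluator.name": "accuracy"},
    {"gen_ai.evaluation.target": "classify", "gen_ai.evaluation.evaluator.name": "accuracy"},
]

rows: dict[tuple[str, str], list[dict]] = defaultdict(list)
for event in events:
    key = (event["gen_ai.evaluation.target"], event["gen_ai.evaluation.evaluator.name"])
    rows[key].append(event)

# Two evaluators with the same name but different `evaluator.source` values
# land in the same row, which is why the source doesn't distinguish rows.
```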
Both pages documented the same UI surface with ~30% overlap. Clarify their scope:

- guides/web-ui/evals.md = reference for the Evals: Datasets & Experiments page (what you see, not how to create it). New intro states the scope and cross-links to ui.md / sdk.md / PAI.
- evaluate/datasets/ui.md = dataset/case lifecycle tasks (create, edit, manage cases, export). Drop the "Navigating Datasets" overlap block and the duplicate "Viewing Experiments" block; point readers at evals.md for those. New intro states the task-oriented scope and cross-links to evals.md.
Pair with the platform PR that changes the Live Evaluations classifier to require `gen_ai.agent.name == gen_ai.evaluation.target` (so an @evaluate function called from inside a Pydantic AI agent no longer classifies as agent). Drops the baggage-propagation caveat from the Type bullet and clarifies that agent.name is still propagated for drill-down but is no longer the classifier. A sketch of the new rule follows.
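As an illustrative restatement of that rule (not the platform's actual query code), the classifier can be pictured like this:

```python
# Illustrative restatement of the classifier rule above; this is not the
# platform's actual query code, just the same logic on a plain attribute dict.
def classify_target_kind(event_attrs: dict) -> str:
    """Classify an evaluation target as 'agent' or 'function'."""
    agent_name = event_attrs.get("gen_ai.agent.name")
    target = event_attrs.get("gen_ai.evaluation.target")
    # agent.name is still propagated (useful for drill-down), but an @evaluate
    # function called from inside an agent run no longer classifies as agent:
    # the names must match exactly.
    return "agent" if agent_name is not None and agent_name == target else "function"
```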
Summary
- `docs/guides/web-ui/live-evals.md` — a walkthrough of the Live Evaluations page (directory, target detail, time window, sort, evaluator shapes, trace integration). Mirrors the shape of the existing `evals.md` doc.
- Registered in `mkdocs.yml` under the `Evaluate:` nav section (right after the Evals guide) and in the search-grouping list.
- The Python side is already documented at ai.pydantic.dev/evals/online-evaluation/, so this guide links out there rather than duplicating it.

Test plan

- `uv run mkdocs build` succeeds with no warnings
- The new page renders at `/guides/web-ui/live-evals/`