Add Trajectory.called_before?/4 for relative tool-call ordering assertions #565
Merged
…tions

Adds a predicate for asserting relative tool-call ordering ("tool A was called before tool B"), independent of how many other calls happen in between. This fills the gap between :superset (containment, order-independent) and :strict (whole sequence, exact count), a pattern that shows up constantly in agent trajectory evals.

Semantics:
- Returns true when any A precedes any B, i.e. min(index of A) < max(index of B).
- Operates over the flat, ordered tool_calls list; same-turn calls are ordered by their position in that list.
- Accepts a Trajectory, an LLMChain, or a bare list of tool-call maps.
- By default a missing tool returns false; :require_both raises ArgumentError so a missing tool is detected rather than silently collapsing to false.

Also adds the companion assert_called_before/4 and refute_called_before/4 macros to LangChain.Trajectory.Assertions.

Closes #564
Summary
Adds LangChain.Trajectory.called_before?/4 and the companion assert_called_before/4 / refute_called_before/4 macros for asserting relative tool-call ordering ("tool A was called before tool B"), independent of how many other calls happen in between.
This fills the gap between the existing modes:
- :superset enforces containment but is order-independent.
- :strict requires the whole sequence with exact count.

Relative ordering ("search before answer", "validate before write") is the common middle ground for agent trajectory evals, and previously required hand-rolling index math over tool_calls in every eval.

API
Accepts a Trajectory, an LLMChain, or a bare list of tool-call maps.

Semantics decisions
These resolve the open questions from the issue:
- Ordering uses the "any A precedes any B" reading: min(index of A) < max(index of B). This tolerates interleaving (A…B…A…B) while still catching a pure B-before-A ordering.
- By default a missing tool returns false (so refute_called_before passes vacuously). Passing require_both: true raises an ArgumentError naming the absent tool, so a missing tool is detected rather than silently collapsing to false. This mirrors the module's existing convention of raising ArgumentError on misuse in matches?/3, and applies symmetrically to both assert_ and refute_.
- Same-turn calls are ordered by their position in the flat tool_calls list, consistent with the existing helpers.
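
For readers who want to see the ordering rule in isolation, here is a minimal standalone sketch of the semantics described above. It is not the code merged in this PR: the module name OrderingSketch, the assumption that each tool call is a map with a :name key, and the error message wording are all hypothetical, and the sketch only handles the bare-list input.

```elixir
defmodule OrderingSketch do
  @moduledoc """
  Illustrative only: a standalone re-statement of the
  min(index of A) < max(index of B) rule, not the library implementation.
  """

  # Assumption: each tool call is a map with a :name key.
  def called_before?(tool_calls, tool_a, tool_b, opts \\ []) when is_list(tool_calls) do
    names = Enum.map(tool_calls, & &1.name)

    # Earliest call of A and latest call of B (nil when the tool is absent).
    first_a = Enum.find_index(names, &(&1 == tool_a))
    last_b = last_index(names, tool_b)

    # With require_both: true, an absent tool raises instead of yielding false.
    if Keyword.get(opts, :require_both, false) do
      if is_nil(first_a), do: raise(ArgumentError, "tool #{inspect(tool_a)} was never called")
      if is_nil(last_b), do: raise(ArgumentError, "tool #{inspect(tool_b)} was never called")
    end

    not is_nil(first_a) and not is_nil(last_b) and first_a < last_b
  end

  # Index of the last occurrence of `tool`, or nil if it never appears.
  defp last_index(names, tool) do
    names
    |> Enum.with_index()
    |> Enum.reduce(nil, fn
      {^tool, idx}, _acc -> idx
      _other, acc -> acc
    end)
  end
end

interleaved = [%{name: "search"}, %{name: "answer"}, %{name: "search"}, %{name: "answer"}]
reversed = [%{name: "answer"}, %{name: "search"}]

# Interleaving is tolerated: first "search" is index 0, last "answer" is index 3.
IO.inspect(OrderingSketch.called_before?(interleaved, "search", "answer"))
# => true

# A pure B-before-A ordering is rejected: first "search" is 1, last "answer" is 0.
IO.inspect(OrderingSketch.called_before?(reversed, "search", "answer"))
# => false

# A missing tool collapses to false by default.
IO.inspect(OrderingSketch.called_before?([%{name: "search"}], "search", "answer"))
# => false
```

In this sketch, calling the last example with require_both: true would raise ArgumentError rather than return false, which is the behavior the second bullet describes. Note that under this reading the interleaved list also satisfies "answer" before "search", since the first "answer" (index 1) precedes the last "search" (index 2).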
Tests
25 new tests covering the predicate (ordering, missing tools, interleaving, repeated calls, LLMChain/list inputs, :require_both raising) and both assertion macros. Full mix precommit is green (1799 tests + 31 doctests, 0 failures).
Closes #564