Skip to content

feat(gmail): add include_analysis flag to get_gmail_thread_content#702

Merged
taylorwilsdon merged 4 commits intotaylorwilsdon:mainfrom
asreynolds1000:pr-d-thread-ownership-in-thread-content
Apr 28, 2026
Merged

feat(gmail): add include_analysis flag to get_gmail_thread_content#702
taylorwilsdon merged 4 commits intotaylorwilsdon:mainfrom
asreynolds1000:pr-d-thread-ownership-in-thread-content

Conversation

@asreynolds1000
Copy link
Copy Markdown

@asreynolds1000 asreynolds1000 commented Apr 17, 2026

Description

Supersedes #701 per your feedback. Instead of adding a new @server.tool, this PR bundles the thread-ownership analysis into the existing get_gmail_thread_content as an opt-in include_analysis: bool = False parameter.

Existing behavior preserved. When include_analysis=False (the default), get_gmail_thread_content returns the same formatted string it always has. When include_analysis=True, it returns a dict {"content": <existing string>, "analysis": <ownership metadata>}. No existing caller is affected.

Why the analysis exists. A common MCP-client pattern is synthesizing who-owes-whom from a thread. Clients often infer this from a date-range search on sent/received messages, which is fragile — same-day follow-ups get missed, thread context is lost, and the "last message" determination ends up wrong. The helper reads the actual thread chronology returned by threads().get(format='full') and extracts structured metadata.

Analysis shape

{
    "thread_id": str,
    "thread_subject": str | None,
    "last_sender": str | None,         # raw "Name <addr>"
    "last_timestamp": str | None,      # ISO-8601, UTC-aware
    "ball_in_court_of": "user" | "them" | None,
    "message_count_by_sender": {normalized_email: int},
    "participants": [normalized_email, ...],   # sorted, union of From/To/Cc
    "excluded_drafts": int,
    "message_count": int,
}

ball_in_court_of names whose court the ball is in (who owes the next reply):

  • "user" — authenticated user owes the reply (external party sent last)
  • "them" — other party owes the reply (user sent last)
  • None — unresolvable (all drafts, self-only thread, or malformed headers)

Edge cases handled

  • Plus-addressing. alex+foo@domain.com normalizes to alex@domain.com for identity comparison.
  • Drafts. Messages with the DRAFT label are excluded from the last-message determination and counted separately in excluded_drafts. A newer draft does NOT flip ball-in-court.
  • Malformed Date headers. Fall back to Gmail's internalDate (Unix ms) for chronological ordering.
  • Naive/aware datetime mixing. email.utils.parsedate_to_datetime returns a naive datetime when the Date header has no timezone. The internalDate fallback is aware UTC. Mixing them in > / < comparisons raises TypeError at runtime — the helper coerces naive → UTC at parse time so downstream comparisons are safe. Regression test included.
  • Multi-address parsing. Uses email.utils.getaddresses across From/To/Cc in one pass, not naive split(","). Handles quoted display names with embedded commas like "Doe, John" <john@example.com>, vendor@example.com correctly.
  • Self-only threads (user → user) and unparseable From headers return ball_in_court_of=None rather than silently defaulting.

Type of Change

  • New feature (non-breaking change which adds functionality)

Testing

Added two test files:

  • tests/gmail/test_thread_ownership_helpers.py21 pytest cases exercising the pure helpers (_analyze_thread_ownership_impl, _normalize_email, _parse_date_header) against fabricated threads.get(format='full') responses. No network, no credentials.
  • tests/gmail/test_get_gmail_thread_content_analysis.py5 integration cases covering the wire-up: default returns str, flag=True returns dict with both keys, analysis shape matches the documented contract, ball-in-court verdict is correct, content is byte-identical across both modes.

Full tests/gmail/ suite passes: 95 passed.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas (timezone coercion rationale, ball-in-court edge cases, multi-alias limitation, getaddresses vs split)
  • My changes generate no new warnings
  • Allow edits from maintainers enabled

Known limitation

user_google_email is a single address. A user with multiple aliases (e.g., personal + work) won't be recognized as the same identity across a multi-account thread. Documented in the docstring.

Related

Closes #701 (replaced by this PR per maintainer guidance to bundle rather than add a new tool).

Summary by CodeRabbit

  • New Features

    • Thread retrieval can optionally include structured ownership analysis alongside the thread text (participants, per-sender counts, last sender/timestamp, and "ball in court"); text output remains identical with or without analysis.
    • Better handling of varied date formats and timezone normalization to UTC; consistent normalization of plus-address variants.
  • Tests

    • Added tests validating flag behavior, content parity, ownership-analysis semantics, draft exclusion, complex recipient parsing, date/time fallbacks, and malformed-header robustness.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: be022c46-ea35-4f15-8c07-2ca68e50ff0f

📥 Commits

Reviewing files that changed from the base of the PR and between 3e9011f and 3021e1b.

📒 Files selected for processing (4)
  • gmail/gmail_helpers.py
  • gmail/gmail_tools.py
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py

📝 Walkthrough

Walkthrough

Adds an optional include_analysis flag to get_gmail_thread_content (now returns str or {"content","analysis"}) and introduces helpers to normalize emails, parse timestamps with UTC coercion, and compute thread-ownership metadata (last non-draft sender, participants, per-sender counts, excluded drafts, ball-in-court).

Changes

Cohort / File(s) Summary
Gmail tool entrypoint
gmail/gmail_tools.py
Added include_analysis: bool = False to get_gmail_thread_content, changed return type to `str
Core helpers
gmail/gmail_helpers.py
New module with _normalize_email, _parse_date_header, and _analyze_thread_ownership_impl implementing email normalization (lowercase, strip +tags), RFC822/internalDate parsing with naive→UTC coercion, draft exclusion, participant aggregation, per-sender counts, and ball-in-court determination.
Integration test
tests/gmail/test_get_gmail_thread_content_analysis.py
New async tests for get_gmail_thread_content, validating default string output, dict output with include_analysis=True, exact analysis keys, semantic ownership assertions, and content parity between modes.
Unit tests for helpers
tests/gmail/test_thread_ownership_helpers.py
New unit tests covering email normalization, date parsing/fallback and UTC coercion, draft handling, participant parsing (handles quoted commas), message counting, edge cases (empty/malformed headers, naive vs aware datetimes), and ball-in-court logic.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Tool as get_gmail_thread_content
    participant Gmail as Gmail API
    participant Formatter as Content Formatter
    participant Analyzer as Ownership Analyzer

    Client->>Tool: call(thread_id, include_analysis=True)
    Tool->>Gmail: threads.get(format="full")
    Gmail-->>Tool: raw thread payload
    Tool->>Formatter: format thread -> content (str)
    Formatter-->>Tool: content
    Tool->>Analyzer: _analyze_thread_ownership_impl(raw thread, user_email)
    Analyzer->>Analyzer: normalize emails\nparse Date/internalDate (coerce naive→UTC)\nexclude DRAFTs\ncount messages by sender\ndetermine ball_in_court_of
    Analyzer-->>Tool: analysis dict
    Tool-->>Client: {"content": content, "analysis": analysis}
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • taylorwilsdon

Poem

🐰 I hopped through headers, commas, and dates,
Stripped plus-tags and nudged naive times to UTC fates,
Drafts tucked aside while counts I declare,
Found who last spoke and who holds the air,
A tiny rabbit's audit—swift, neat, and fair.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.10% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main feature: adding an include_analysis flag to get_gmail_thread_content.
Description check ✅ Passed The description includes detailed explanation of changes, analysis shape, edge cases, testing approach, and related PR, but lacks explicit checkbox marks for the required Description and Type of Change sections per template.
Linked Issues check ✅ Passed The PR successfully implements all core objectives from #701: structured ownership analysis, ball-in-court verdict logic, draft exclusion, proper header parsing with timezone handling, robust multi-address parsing, and comprehensive tests (26 test cases).
Out of Scope Changes check ✅ Passed All changes are directly scoped to the feature requirements: new helper functions in gmail_helpers.py, updated get_gmail_thread_content signature with include_analysis parameter, and comprehensive test coverage for both helpers and integration.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
gmail/gmail_tools.py (1)

2442-2502: Optional: hoist email.utils / datetime / collections imports to module level.

_normalize_email, _parse_date_header, and _analyze_thread_ownership_impl each import parseaddr / parsedate_to_datetime / getaddresses / Counter / datetime / timezone inside the function body. The module already imports from email.utils at the top; consolidating these at module scope avoids repeated import lookups on each call and matches the rest of the file's style. Purely stylistic; safe to defer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gmail/gmail_tools.py` around lines 2442 - 2502, The functions
_normalize_email, _parse_date_header, and _analyze_thread_ownership_impl perform
repeated local imports (parseaddr, parsedate_to_datetime, getaddresses, Counter,
datetime, timezone); hoist these imports to the module level instead: add the
required imports (from email.utils import parseaddr, parsedate_to_datetime,
getaddresses; from collections import Counter; from datetime import datetime,
timezone) near the top of the file and remove the in-function import statements
so the functions reference the module-level symbols directly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gmail/gmail_tools.py`:
- Around line 2544-2567: The participant and sender counters are currently
adding malformed non-email tokens because _normalize_email can return strings
without "@"; update both places to only accept normalized values that contain
"@" before adding: in the getaddresses loop (the norm variable) only call
participants.add(norm) when norm and "@" in norm, and for the From header use
the from_norm check to require "@" in from_norm before incrementing
sender_counter[from_norm]; keep using _normalize_email and getaddresses but add
the explicit "@"-presence guard to both the participant loop and the
sender_counter increment to prevent free-text leakage.

---

Nitpick comments:
In `@gmail/gmail_tools.py`:
- Around line 2442-2502: The functions _normalize_email, _parse_date_header, and
_analyze_thread_ownership_impl perform repeated local imports (parseaddr,
parsedate_to_datetime, getaddresses, Counter, datetime, timezone); hoist these
imports to the module level instead: add the required imports (from email.utils
import parseaddr, parsedate_to_datetime, getaddresses; from collections import
Counter; from datetime import datetime, timezone) near the top of the file and
remove the in-function import statements so the functions reference the
module-level symbols directly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 174bc372-158b-47c5-b7fb-34020a11918e

📥 Commits

Reviewing files that changed from the base of the PR and between 3872ae4 and eb8fd67.

📒 Files selected for processing (3)
  • gmail/gmail_tools.py
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py

Comment thread gmail/gmail_tools.py Outdated
@asreynolds1000 asreynolds1000 force-pushed the pr-d-thread-ownership-in-thread-content branch from eb8fd67 to c78b6ac Compare April 17, 2026 16:58
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
gmail/gmail_tools.py (1)

2467-2487: Good UTC coercion; one small thought on observability.

Coercing naive parsedate_to_datetime results to UTC before comparing with internalDate-derived datetimes avoids the TypeError on mixed-aware comparisons — nice catch in the docstring. Consider a logger.debug on the except branches so a thread with an entirely malformed Date header (and no internalDate) leaves a breadcrumb; currently both fallbacks swallow silently, which can make "why is last_timestamp None?" hard to diagnose in production.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gmail/gmail_tools.py` around lines 2467 - 2487, Add debug-level logging
inside both except blocks so failures parsing the Date header or converting
internal_date_ms leave breadcrumbs; specifically, in the block handling
parsedate_to_datetime failures (around the parsedate_to_datetime call and dt
variable) log the caught exception and the raw date_str, and in the block
converting internal_date_ms to a timestamp (around the ms conversion and
datetime.fromtimestamp call) log the exception and the raw internal_date_ms
value so callers of the surrounding function (which computes the email timestamp
used as the internalDate/last_timestamp fallback) can trace why None was
returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gmail/gmail_tools.py`:
- Around line 2551-2567: The participants set is being populated before the
draft check so draft recipients leak into participants and skew
ball_in_court_of; either move the participants.add(norm) block to after the
is_draft guard (so addresses are only added for non-draft messages) or change
the external participant calculation to derive from sender_counter (e.g.
external_participants = set(sender_counter.keys()) - {normalized_user}) which is
already draft-filtered; update occurrences around the participants loop, the
is_draft check, and the external_participants/last_sender_norm logic to use the
chosen approach (refer to participants, is_draft, sender_counter,
external_participants, last_sender_norm, normalized_user).
- Line 2453: Add the missing return type annotation to the _parse_date_header
function: update its signature def _parse_date_header(date_str: str,
internal_date_ms: str | int | None) -> tuple[Optional[str], Optional[datetime]]
and ensure Optional and datetime are imported (from typing import Optional or
use PEP 604 equivalents and from datetime import datetime) so the function
explicitly returns (Optional[str], Optional[datetime]) as documented; keep
parameter names date_str and internal_date_ms unchanged.
- Line 2355: The _parse_date_header helper currently lacks a return type hint;
update its signature to return tuple[str | None, datetime | None] (i.e.,
annotate _parse_date_header as -> tuple[str | None, datetime | None]) so it
satisfies the project's mandatory type hints; ensure the function body still
returns values matching that tuple shape and that datetime is available in the
module namespace.

---

Nitpick comments:
In `@gmail/gmail_tools.py`:
- Around line 2467-2487: Add debug-level logging inside both except blocks so
failures parsing the Date header or converting internal_date_ms leave
breadcrumbs; specifically, in the block handling parsedate_to_datetime failures
(around the parsedate_to_datetime call and dt variable) log the caught exception
and the raw date_str, and in the block converting internal_date_ms to a
timestamp (around the ms conversion and datetime.fromtimestamp call) log the
exception and the raw internal_date_ms value so callers of the surrounding
function (which computes the email timestamp used as the
internalDate/last_timestamp fallback) can trace why None was returned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19ce1b5f-425e-4fb2-b3dd-0260becdd1ec

📥 Commits

Reviewing files that changed from the base of the PR and between eb8fd67 and c78b6ac.

📒 Files selected for processing (3)
  • gmail/gmail_tools.py
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py
✅ Files skipped from review due to trivial changes (2)
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py

Comment thread gmail/gmail_tools.py
Comment thread gmail/gmail_tools.py Outdated
Comment thread gmail/gmail_tools.py Outdated
@asreynolds1000 asreynolds1000 force-pushed the pr-d-thread-ownership-in-thread-content branch from c78b6ac to 24db65e Compare April 17, 2026 17:08
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
gmail/gmail_tools.py (2)

2343-2434: Consider an analysis_only mode to keep payloads concise.

When a "who-owes-whom" agent sets include_analysis=True, the response still embeds the full formatted thread content. For long threads (many messages, HTML-converted bodies), that inflates the LLM context with data the caller may not need. A follow-up could let callers opt out of the content payload (e.g., an additional response_mode: Literal["content", "analysis", "both"] = "content" parameter, or returning just the analysis dict when include_analysis="only").

As per coding guidelines: "Tool response payloads should be concise; redact verbose HTML/email bodies by default and offer links or IDs to fetch full content."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gmail/gmail_tools.py` around lines 2343 - 2434, Add an option to return only
the ownership analysis to avoid sending the full formatted thread: extend
get_gmail_thread_content signature (e.g., add response_mode:
Literal["content","analysis","both"] = "content" or a boolean analysis_only
flag) and update its docstring/return type accordingly; inside
get_gmail_thread_content, call _analyze_thread_ownership_impl(thread_response,
user_google_email) before or instead of building content when
response_mode=="analysis" (or analysis_only True) and skip calling
_format_thread_content/_fetch_raw_message_contents to avoid heavy payloads;
preserve current default behavior ("content"/both) for backwards compatibility
and ensure callers still receive {"content", "analysis"} when
response_mode=="both"; update logging and any callers/tests that expect the old
return shape.

2533-2533: Parameterize Counter for strict typing.

Per coding guidelines requiring mypy-strict type hints, annotate the element type. All keys are normalized email strings, so Counter[str] is precise.

♻️ Proposed diff
-    sender_counter: Counter = Counter()
+    sender_counter: Counter[str] = Counter()

As per coding guidelines: "Use mandatory type hints (mypy strict)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gmail/gmail_tools.py` at line 2533, The variable sender_counter is currently
unparameterized; update its annotation from "sender_counter: Counter =
Counter()" to a generic form like "sender_counter: Counter[str] = Counter()" to
satisfy mypy-strict, and ensure the appropriate generic Counter is available
(either use typing.Counter or collections.Counter with generics depending on
project Python typing conventions) so the key type (normalized email strings) is
enforced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@gmail/gmail_tools.py`:
- Around line 2343-2434: Add an option to return only the ownership analysis to
avoid sending the full formatted thread: extend get_gmail_thread_content
signature (e.g., add response_mode: Literal["content","analysis","both"] =
"content" or a boolean analysis_only flag) and update its docstring/return type
accordingly; inside get_gmail_thread_content, call
_analyze_thread_ownership_impl(thread_response, user_google_email) before or
instead of building content when response_mode=="analysis" (or analysis_only
True) and skip calling _format_thread_content/_fetch_raw_message_contents to
avoid heavy payloads; preserve current default behavior ("content"/both) for
backwards compatibility and ensure callers still receive {"content", "analysis"}
when response_mode=="both"; update logging and any callers/tests that expect the
old return shape.
- Line 2533: The variable sender_counter is currently unparameterized; update
its annotation from "sender_counter: Counter = Counter()" to a generic form like
"sender_counter: Counter[str] = Counter()" to satisfy mypy-strict, and ensure
the appropriate generic Counter is available (either use typing.Counter or
collections.Counter with generics depending on project Python typing
conventions) so the key type (normalized email strings) is enforced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 174b4d83-5f42-4e4e-b974-da99d9c1bdf2

📥 Commits

Reviewing files that changed from the base of the PR and between c78b6ac and 24db65e.

📒 Files selected for processing (3)
  • gmail/gmail_tools.py
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py
✅ Files skipped from review due to trivial changes (2)
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py

@asreynolds1000 asreynolds1000 force-pushed the pr-d-thread-ownership-in-thread-content branch from 24db65e to 75d5c35 Compare April 17, 2026 17:18
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gmail/gmail_tools.py`:
- Around line 2469-2477: The parsed date can be aware-with-offset so the
function must normalize all datetimes to UTC before returning; update the code
that uses parsedate_to_datetime to call astimezone(timezone.utc) for both naive
and aware datetimes (so the returned dt and dt.isoformat() are UTC), ensuring
the value assigned to analysis.last_timestamp (and any callers expecting
"ISO-8601, UTC-aware") always has a UTC tzinfo.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e3ec2613-91c5-441c-8a75-30f428d6dc75

📥 Commits

Reviewing files that changed from the base of the PR and between 24db65e and 75d5c35.

📒 Files selected for processing (3)
  • gmail/gmail_tools.py
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py
✅ Files skipped from review due to trivial changes (1)
  • tests/gmail/test_thread_ownership_helpers.py

Comment thread gmail/gmail_tools.py Outdated
@asreynolds1000 asreynolds1000 force-pushed the pr-d-thread-ownership-in-thread-content branch from 75d5c35 to 26a7ce1 Compare April 17, 2026 17:26
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@gmail/gmail_tools.py`:
- Around line 2610-2650: The payload currently returns last_sender as the raw
From header (last_sender_raw) while callers and other fields use normalized
addresses; change the return to include both the raw header and a normalized
email field (e.g., keep "last_sender" as last_sender_raw or rename to
"last_sender_raw" and add "last_sender_email" set to last_sender_norm/normalized
email) so programmatic consumers can reliably index message_count_by_sender and
ball-in-court logic remains unchanged; update the tool/module docstring to
document the two fields and their formats and ensure the returned dict from the
function that builds the thread analysis includes the new key name you choose
(refer to last_sender_raw, last_sender_norm, normalized_user, and the returned
dict).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dd934f8e-f62b-4494-a667-53a25abb0b2d

📥 Commits

Reviewing files that changed from the base of the PR and between 75d5c35 and 26a7ce1.

📒 Files selected for processing (3)
  • gmail/gmail_tools.py
  • tests/gmail/test_get_gmail_thread_content_analysis.py
  • tests/gmail/test_thread_ownership_helpers.py
✅ Files skipped from review due to trivial changes (1)
  • tests/gmail/test_thread_ownership_helpers.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/gmail/test_get_gmail_thread_content_analysis.py

Comment thread gmail/gmail_tools.py Outdated
…is flag

Bundles thread-ownership analysis (last sender, ball-in-court verdict,
per-sender message counts, participants) into the existing thread-read
tool as an opt-in flag, rather than adding a standalone tool. Per
maintainer guidance on PR taylorwilsdon#701: adding a new @server.tool would push
against the OpenAI native tool-calling payload limit, and the analysis
is only useful when reading a thread anyway, so a separate call would be
wasteful.

Design:

- `get_gmail_thread_content` gains `include_analysis: bool = False`.
- When False (default), returns the same formatted string — behavior
  unchanged for every existing caller.
- When True, returns a dict: {"content": <existing formatted string>,
  "analysis": <ownership metadata dict>}. Structured so downstream
  agents don't have to parse prose.

Analysis helpers (`_analyze_thread_ownership_impl`, `_normalize_email`,
`_parse_date_header`) are pure functions added to `gmail/gmail_tools.py`.
They handle:

- Plus-addressing normalization (alex+foo@domain → alex@domain)
- Drafts (label DRAFT) excluded from last-message determination;
  counted separately in `excluded_drafts`
- Malformed Date headers fall back to Gmail's `internalDate` (Unix ms)
- All timestamps coerced to UTC-aware at parse time so mixed
  naive/aware comparisons don't raise TypeError
- Quoted display names with embedded commas (e.g.,
  `"Doe, John" <john@ex.com>, vendor@ex.com`) parsed via
  `email.utils.getaddresses` — not naive `split(",")`
- Self-only threads return `ball_in_court_of=None` (no external party)
- Unparseable From headers return `ball_in_court_of=None`

`ball_in_court_of` names whose court the ball is in:
  "user"  → authenticated user owes the next reply (external sent last)
  "them"  → other party owes the next reply (user sent last)
  None    → unresolvable

Tests:

- `tests/gmail/test_thread_ownership_helpers.py` — 21 pytest cases
  covering the pure helpers (normalization, date parsing, ball-in-court
  logic, drafts, multi-party threads, empty/malformed thread data,
  quoted-comma recipients, naive/aware datetime regression).
- `tests/gmail/test_get_gmail_thread_content_analysis.py` — 5
  integration cases verifying the flag toggles the return shape
  correctly: default returns str, True returns dict with both keys,
  analysis shape matches helper contract, content parity across modes.

Full gmail suite: 95 passed.

Known limitation: `user_google_email` is a single address. A user with
multiple aliases (e.g., personal + work) won't be recognized as the same
identity across a multi-account thread. Documented in the docstring.

Supersedes closed PR taylorwilsdon#701, which proposed the same functionality as a
standalone `analyze_thread_ownership` tool.
@asreynolds1000 asreynolds1000 force-pushed the pr-d-thread-ownership-in-thread-content branch from 26a7ce1 to 772c6fc Compare April 17, 2026 17:41
@taylorwilsdon taylorwilsdon self-assigned this Apr 21, 2026
@taylorwilsdon taylorwilsdon added the enhancement New feature or request label Apr 21, 2026
@taylorwilsdon
Copy link
Copy Markdown
Owner

Thanks!

@taylorwilsdon
Copy link
Copy Markdown
Owner

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@taylorwilsdon
Copy link
Copy Markdown
Owner

@coderabbitai re-review please

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 28, 2026

@taylorwilsdon Sure! Triggering a full re-review of PR #702 now.


🧠 Learnings used
Learnt from: taylorwilsdon
Repo: taylorwilsdon/google_workspace_mcp PR: 0
File: :0-0
Timestamp: 2026-04-16T19:32:02.540Z
Learning: Repository: taylorwilsdon/google_workspace_mcp. User taylorwilsdon strongly prefers that all review comments be surfaced in a single comprehensive review pass rather than incrementally across multiple rounds. When reviewing PRs in this repository, be exhaustive and identify all issues at once — security, correctness, edge cases, style, test coverage, API consistency, etc. — to avoid generating new comments after the user has already resolved previous ones.
✅ Actions performed

Full review triggered.

@taylorwilsdon taylorwilsdon merged commit 729ba08 into taylorwilsdon:main Apr 28, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants