feat: add Google ADK memory integration example#542
feat: add Google ADK memory integration example#542m1lestones wants to merge 3 commits intoplastic-labs:mainfrom
Conversation
WalkthroughAdds a Google ADK example integrating Honcho persistent memory: documentation, a runnable async chat demo, utilities for client/session management, context retrieval, memory querying, and saving conversation turns, plus project packaging and environment setup. Changes
Sequence DiagramsequenceDiagram
participant User
participant Main as chat()
participant Honcho as Honcho API
participant Agent as LlmAgent
participant LLM as Google LLM
User->>Main: send message
Main->>Honcho: save user message
activate Honcho
Honcho-->>Main: confirm saved
deactivate Honcho
Main->>Honcho: get_context(ctx, tokens=2000)
activate Honcho
Honcho-->>Main: context history
deactivate Honcho
Main->>Main: build instruction (base + history)
Main->>Agent: create LlmAgent (register query_memory tool)
Main->>Agent: runner.run_async(user content, session ids)
Agent->>LLM: prompt with instruction + context
LLM-->>Agent: response (streamed events)
Agent-->>Main: stream events
Main->>Honcho: save assistant response
activate Honcho
Honcho-->>Main: confirm saved
deactivate Honcho
Main-->>User: return assistant response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
examples/google-adk/python/tools/get_context.py (1)
22-30: Add explicit guards around context retrieval.This path is all external I/O but currently has no explicit validation/error mapping. A small guard improves predictability for callers.
Suggested patch
def get_context( @@ - honcho = get_client() - user_peer = honcho.peer(ctx.user_id) - assistant_peer = honcho.peer(ctx.assistant_id) - session = honcho.session(ctx.session_id) - - session.add_peers([user_peer, assistant_peer]) - - context = session.context(tokens=tokens) - return context.to_openai(assistant=ctx.assistant_id) + if tokens <= 0: + raise ValueError("tokens must be greater than 0") + try: + honcho = get_client() + user_peer = honcho.peer(ctx.user_id) + assistant_peer = honcho.peer(ctx.assistant_id) + session = honcho.session(ctx.session_id) + session.add_peers([user_peer, assistant_peer]) + context = session.context(tokens=tokens) + return context.to_openai(assistant=ctx.assistant_id) + except Exception as exc: + raise RuntimeError("Failed to retrieve Honcho context") from excAs per coding guidelines, "Use explicit error handling with appropriate exception types."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/google-adk/python/tools/get_context.py` around lines 22 - 30, The context retrieval block lacks validation and error mapping; wrap the external calls (get_client(), honcho.peer(...), honcho.session(...), session.add_peers(...), session.context(...), and context.to_openai(...)) with explicit guards and exception handling: check that honcho, user_peer, assistant_peer, session, and context are not None before proceeding, catch and map likely I/O/runtime exceptions to a clear, specific exception (e.g., ContextRetrievalError or ValueError) with a descriptive message, and rethrow or return a well-defined error result so callers get predictable failures instead of raw exceptions.examples/google-adk/python/main.py (1)
53-56: Treat injected history as untrusted text.Formatting past messages directly into the system instruction makes it easier for malicious prior content to steer behavior. Add explicit delimiting/guardrails around injected history.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/google-adk/python/main.py` around lines 53 - 56, The current code interpolates prior messages directly into the system prompt via formatted/history/base, which treats injected history as trusted; change the assembly so the history is explicitly labeled and delimited (e.g., add a clear header like "User-provided conversation history (UNTRUSTED):" and wrap the joined messages in distinct delimiters such as "-----BEGIN HISTORY-----" / "-----END HISTORY-----") and also escape or sanitize message content from the history variable before joining to prevent prompt injection; update the string construction that produces formatted and the final return so the assistant sees the history only as separated, untrusted text with explicit guardrails rather than inline system content.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/google-adk/python/main.py`:
- Around line 128-139: The code unconditionally calls save_memory(user_id,
response_text, "assistant", session_id) even when runner.run_async yields no
final response, causing save_memory to raise ValueError; add a guard after the
async loop that only calls save_memory when response_text is non-empty (e.g., if
response_text: ...), referencing the response_text variable and the
runner.run_async / event.is_final_response() logic to locate the loop and
save_memory call.
In `@examples/google-adk/python/tools/query_memory.py`:
- Around line 20-29: The current validation accepts whitespace-only queries and
lets transport/API exceptions from peer.chat bubble up; update the validation to
reject blank/whitespace-only input by using if not query or not query.strip()
(or equivalent) and raise ValueError("query must not be empty or whitespace");
wrap the peer.chat call (honcho = get_client(); peer = honcho.peer(user_id);
peer.chat(query=query)) in a try/except that catches underlying exceptions and
re-raises a clear, explicit exception (e.g., raise RuntimeError("Failed to query
memory") from e) so callers receive a consistent, descriptive error type and
original exception is preserved via exception chaining.
In `@examples/google-adk/python/tools/save_memory.py`:
- Around line 35-36: The code currently maps any unknown role to user_peer,
which can silently misattribute messages; before computing sender, validate the
role variable explicitly (e.g., check role == "assistant" or role == "user") and
raise a ValueError or return an error for any other value; then set sender =
assistant_peer if role == "assistant" elif role == "user" else (error), and
continue to call sender.message(content) and session.add_messages([...]) only
after the role check so typos like "assisstant" don't get persisted as the wrong
author.
---
Nitpick comments:
In `@examples/google-adk/python/main.py`:
- Around line 53-56: The current code interpolates prior messages directly into
the system prompt via formatted/history/base, which treats injected history as
trusted; change the assembly so the history is explicitly labeled and delimited
(e.g., add a clear header like "User-provided conversation history (UNTRUSTED):"
and wrap the joined messages in distinct delimiters such as "-----BEGIN
HISTORY-----" / "-----END HISTORY-----") and also escape or sanitize message
content from the history variable before joining to prevent prompt injection;
update the string construction that produces formatted and the final return so
the assistant sees the history only as separated, untrusted text with explicit
guardrails rather than inline system content.
In `@examples/google-adk/python/tools/get_context.py`:
- Around line 22-30: The context retrieval block lacks validation and error
mapping; wrap the external calls (get_client(), honcho.peer(...),
honcho.session(...), session.add_peers(...), session.context(...), and
context.to_openai(...)) with explicit guards and exception handling: check that
honcho, user_peer, assistant_peer, session, and context are not None before
proceeding, catch and map likely I/O/runtime exceptions to a clear, specific
exception (e.g., ContextRetrievalError or ValueError) with a descriptive
message, and rethrow or return a well-defined error result so callers get
predictable failures instead of raw exceptions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5b14151e-32a4-48df-8f8f-7b42c4ba24b3
📒 Files selected for processing (8)
examples/google-adk/README.mdexamples/google-adk/python/main.pyexamples/google-adk/python/pyproject.tomlexamples/google-adk/python/tools/__init__.pyexamples/google-adk/python/tools/client.pyexamples/google-adk/python/tools/get_context.pyexamples/google-adk/python/tools/query_memory.pyexamples/google-adk/python/tools/save_memory.py
…n, query trim+wrap
…s, update model - Remove add_peers() from get_context and save_memory — called on every turn was redundant; now called once via setup_session() at startup - Replace hardcoded demo-session with uuid.uuid4() per run so history does not accumulate across sessions - Update MODEL_ID to gemini-2.5-flash (2.0-flash no longer available) - Fix pyproject.toml build target for hatchling wheel discovery
|
Tested end-to-end against live Honcho API. Memory persists across sessions — agent correctly recalls user info when asked. Fixed |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
examples/google-adk/python/main.py (2)
161-176: Blockinginput()insideasync def mainand missing docstring.
input()blocks the event loop thread; harmless for this single-user demo but worth calling out since the module is otherwise async. Also, per coding guidelines,main()should carry a Google-style docstring like the other functions in this file.As per coding guidelines, "Use Google style docstrings".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/google-adk/python/main.py` around lines 161 - 176, The async main() currently uses blocking input() and lacks a Google-style docstring; update main to include a Google-style docstring and make the blocking call asynchronous (e.g., run input() in a thread via asyncio.to_thread or use an async console like aioconsole) so the event loop isn't blocked, ensuring the existing setup_session(user_id, session_id) and await chat(user_id, user_input, session_id) calls remain unchanged and integrated; add the docstring at the top of main describing its purpose, args, and return type per Google style.
124-135: RecreatingInMemorySessionServiceandRunnerevery turn drops ADK session state.
chat()constructs a newInMemorySessionServiceandRunneron each invocation, so ADK's own per-turn session (message history, tool-call state) starts empty every time. Honcho still provides the long-term memory viabuild_instruction, so the demo functions, but the ADK session abstraction is effectively unused and ADK-level features that rely on cross-turn state (e.g., intermediate events, tool streaming continuity) won't work as intended. Consider hoistingsession_service, the ADK session, andRunnerintosetup_session()/ module scope and only rebuilding theLlmAgentper turn when the instruction must change — or better, useinstructionas a callable/provider so the agent itself can be constructed once.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/google-adk/python/main.py` around lines 124 - 135, The chat() function currently creates a new InMemorySessionService, adk_session, and Runner each turn which resets ADK per-turn state; move InMemorySessionService, the created adk_session, and Runner out of chat() into setup_session() or module scope so they are instantiated once and reused across turns, and only reconstruct LlmAgent (or provide instruction as a callable) when the instruction changes; update references to InMemorySessionService, adk_session, Runner, chat(), setup_session(), LlmAgent, and build_instruction so session state (message history, tool-call state, streaming) persists between invocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/google-adk/python/main.py`:
- Around line 146-156: The final response handler currently takes only
parts[0].text, dropping subsequent parts; update the logic in the
runner.run_async loop that checks event.is_final_response() to iterate over
event.content.parts (preserving order) and concatenate all non-empty part.text
values into response_text (or otherwise serialize non-text parts if needed)
before the post-run persistence call; ensure response_text is built from all
parts prior to calling save_memory(user_id, response_text, "assistant",
session_id) so nothing is silently discarded.
---
Nitpick comments:
In `@examples/google-adk/python/main.py`:
- Around line 161-176: The async main() currently uses blocking input() and
lacks a Google-style docstring; update main to include a Google-style docstring
and make the blocking call asynchronous (e.g., run input() in a thread via
asyncio.to_thread or use an async console like aioconsole) so the event loop
isn't blocked, ensuring the existing setup_session(user_id, session_id) and
await chat(user_id, user_input, session_id) calls remain unchanged and
integrated; add the docstring at the top of main describing its purpose, args,
and return type per Google style.
- Around line 124-135: The chat() function currently creates a new
InMemorySessionService, adk_session, and Runner each turn which resets ADK
per-turn state; move InMemorySessionService, the created adk_session, and Runner
out of chat() into setup_session() or module scope so they are instantiated once
and reused across turns, and only reconstruct LlmAgent (or provide instruction
as a callable) when the instruction changes; update references to
InMemorySessionService, adk_session, Runner, chat(), setup_session(), LlmAgent,
and build_instruction so session state (message history, tool-call state,
streaming) persists between invocations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 26d9dddd-243f-4648-80fc-2ff337b8f9f5
📒 Files selected for processing (5)
examples/google-adk/python/main.pyexamples/google-adk/python/pyproject.tomlexamples/google-adk/python/tools/get_context.pyexamples/google-adk/python/tools/query_memory.pyexamples/google-adk/python/tools/save_memory.py
✅ Files skipped from review due to trivial changes (1)
- examples/google-adk/python/pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (2)
- examples/google-adk/python/tools/get_context.py
- examples/google-adk/python/tools/save_memory.py
| async for event in runner.run_async( | ||
| user_id=user_id, | ||
| session_id=adk_session.id, | ||
| new_message=user_content, | ||
| ): | ||
| if event.is_final_response() and event.content and event.content.parts: | ||
| response_text = event.content.parts[0].text or "" | ||
|
|
||
| # Persist assistant response after the run — only when the agent produced output | ||
| if response_text: | ||
| save_memory(user_id, response_text, "assistant", session_id) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
google-adk Runner Event final_response content parts multiple text parts
💡 Result:
In Google Agent Development Kit (ADK), a Runner Event (google.adk.events.Event) contains content as google.genai.types.Content with a list of Parts (event.content.parts). Parts support multiple types including text, function_call, function_response, inline_data (audio/images), code_execution_result, and thought (for model reasoning when using planners like BuiltInPlanner with thinking_config.include_thoughts=True). Final responses are identified via event.is_final_response == True, which checks no function calls/responses, not partial, no trailing code result. These can have multiple parts, e.g., separate thought parts (part.thought=True) and text parts. Examples: - Text events: event.content.parts[0].text (common single part access, but list allows multiple). - Multimodal: text + inline_data parts. - Thinking agents: Multiple parts with thought=True (reasoning) and regular text; was a bug in streaming where they merged (fixed post-2025). - Metadata workaround: Multiple parts like text + function_response for metadata. Code to handle multiple parts: for part in event.content.parts: if part.text: print(part.text) if part.thought: print("Thought:", part.text) # etc. Official docs emphasize iterating parts; single part access [0] is for simple text cases. Recent fixes (2026 commits) ensure thought propagation in A2A/multi-agent.
Citations:
- 1: Thinking Parts and Text Parts get merged Into Single Part during Streaming Mode google/adk-python#770
- 2: https://google.github.io/adk-docs/streaming/dev-guide/part3/
- 3: https://adk.dev/streaming/dev-guide/part3/
- 4: https://adk.dev/events/
- 5: Metadata Support for ADK Part Types google/adk-java#855
- 6: https://github.com/google/adk-docs/blob/main/docs/events/index.md
- 7: https://github.com/Kjdragan/google-adk-tutorial/blob/main/08_event.md
- 8: https://github.com/google/adk-python/blob/main/src/google/adk/events/event.py
- 9: https://google.github.io/adk-docs/events/
- 10: A2A client renders thought parts together with final user-facing response google/adk-python#4676
- 11: google/adk-python@e59929e
🏁 Script executed:
# First, let's verify the file exists and check the actual code
fd -type f -name "main.py" | grep google-adkRepository: plastic-labs/honcho
Length of output: 233
🏁 Script executed:
# Check the exact lines mentioned in the review
if [ -f "examples/google-adk/python/main.py" ]; then
sed -n '140,160p' "examples/google-adk/python/main.py" | cat -n
fiRepository: plastic-labs/honcho
Length of output: 837
🏁 Script executed:
# Search for how this codebase or examples handle multi-part content elsewhere
rg -A 5 "event\.content\.parts" --type pyRepository: plastic-labs/honcho
Length of output: 627
🏁 Script executed:
# Check if there are imports from google-adk and how they handle content
rg -B 3 -A 3 "is_final_response|\.parts\[0\]" --type pyRepository: plastic-labs/honcho
Length of output: 668
Multiple text parts in final response are silently dropped — only the first part is persisted and returned.
When event.content.parts contains multiple parts (e.g., text + function_response, or interleaved text chunks), only parts[0].text is captured. This silently discards remaining content. Iterate across all text parts to preserve the complete response:
Proposed diff
- if event.is_final_response() and event.content and event.content.parts:
- response_text = event.content.parts[0].text or ""
+ if event.is_final_response() and event.content and event.content.parts:
+ response_text = "".join(
+ part.text for part in event.content.parts if getattr(part, "text", None)
+ )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/google-adk/python/main.py` around lines 146 - 156, The final
response handler currently takes only parts[0].text, dropping subsequent parts;
update the logic in the runner.run_async loop that checks
event.is_final_response() to iterate over event.content.parts (preserving order)
and concatenate all non-empty part.text values into response_text (or otherwise
serialize non-text parts if needed) before the post-run persistence call; ensure
response_text is built from all parts prior to calling save_memory(user_id,
response_text, "assistant", session_id) so nothing is silently discarded.
|
Closing this as part of a broader prioritization shift and in an effort to minimize maintenance burden. Thanks for putting in the work on this! |
Summary
examples/google-adk/python/— a full Honcho memory integration for Google ADK agentsgemini-2.0-flashwith dynamicinstructioninjection and aFunctionToolfor natural language memory recallexamples/openai-agents/exampleWhat's included
How it works
build_instruction()fetches Honcho session context and injects conversation history into the agent'sinstructionstring before every LLM call.query_memorywrapped inFunctionToollets Gemini query Honcho's Dialectic API for long-term user facts.chat()persists the user message before the agent runs and the assistant response after, keeping Honcho in sync.Test plan
HONCHO_API_KEYandGOOGLE_API_KEYinpython/.envpip install google-adk honcho-ai python-dotenvcd python && python main.py🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation