feat(db): add AWS RDS IAM authentication support by fossouo · Pull Request #563 · plastic-labs/honcho

fossouo · 2026-04-14T13:47:03Z

Summary

Add support for connecting Honcho to AWS RDS PostgreSQL using IAM authentication tokens instead of static database passwords. This enables operators to leverage AWS IAM policies for database access control, eliminating the need for long-lived credentials.

Changes

Configuration: Add AUTH_METHOD, AWS_REGION, RDS_HOSTNAME, RDS_PORT, RDS_USERNAME, AWS_PROFILE, and RDS_SSL_CA_BUNDLE fields to DBSettings with validation
AWS Auth Module: New src/aws_auth.py with generate_rds_auth_token() using boto3, with descriptive error wrapping
Database Engine: SQLAlchemy do_connect event listener for per-connection IAM token injection, SSL enforcement, pool_pre_ping and pool_recycle tuning
Migration Support: Alembic migrations work with IAM auth (fresh token + SSL for each migration run)
MCP Tool: New aws_rds_status tool for diagnosing RDS connectivity from AI workflows
Docker: Dockerfile downloads AWS RDS CA certificate bundle; docker-compose.yml.example includes IAM config section
Tests: Unit tests + property-based tests (hypothesis for Python, fast-check for TypeScript) covering 15 correctness properties

Backward Compatibility

When DB_AUTH_METHOD=password (the default), behavior is identical to the current implementation. No changes required for existing deployments.

Tested

Unit tests and property-based tests all pass
Live tested against an AWS RDS PostgreSQL 15 instance with IAM authentication enabled

Summary by CodeRabbit

New Features
- Added AWS IAM authentication support for RDS PostgreSQL databases as an alternative to password-based authentication.
- Configuration now supports IAM mode with AWS region, hostname, port, and username parameters.
- Automatic RDS SSL CA bundle integration for secure IAM-authenticated connections.
Tests
- Comprehensive test coverage for AWS IAM authentication functionality.
Chores
- Added AWS SDK dependency.
- Updated Docker build to include RDS CA certificate bundle.
- Updated configuration templates with IAM authentication examples.

coderabbitai · 2026-04-14T13:47:22Z

Walkthrough

The pull request introduces AWS RDS IAM authentication support by adding a new authentication mode to the database configuration layer. It includes a boto3-based token generator, extended database settings with IAM-specific fields, IAM-aware connection setup with event listeners, Docker/configuration templates documenting the feature, and comprehensive test coverage. AWS RDS CA bundle is fetched during Docker build.

Changes

Cohort / File(s)	Summary
Configuration & Documentation `.env.template`, `config.toml.example`, `docker-compose.yml.example`, `.gitignore`	Added commented examples and documentation for IAM authentication configuration; added `.kiro/` to gitignore.
Dependencies `pyproject.toml`	Added `boto3>=1.42.5` as a runtime dependency.
Docker `Dockerfile`	Inserted build step to fetch AWS RDS global CA bundle (`global-bundle.pem`) with validation that the downloaded file is non-empty.
AWS Authentication Module `src/aws_auth.py`	New module implementing `generate_rds_auth_token` function to produce short-lived RDS IAM tokens via boto3, with structured exception handling for missing profiles, credentials, client errors, and endpoint issues.
Database Configuration `src/config.py`	Extended `DBSettings` with `AUTH_METHOD` field (defaulting to `"password"`), AWS/RDS parameters (`AWS_REGION`, `RDS_HOSTNAME`, `RDS_USERNAME`, `RDS_PORT`, `AWS_PROFILE`, `RDS_SSL_CA_BUNDLE`), and post-validation hook `_require_iam_fields` enforcing required fields when IAM mode is selected.
Database Connection Logic `src/db.py`	Added IAM-aware URI reconstruction using RDS host/port/username, SSL configuration via `connect_args`, adjusted connection pooling (forced `pool_pre_ping` and limited `pool_recycle` to 900), SQLAlchemy event listener for token regeneration on each connection, and Alembic migration support with token injection and URL encoding.
Test Suite `tests/aws_auth/__init__.py`, `tests/aws_auth/test_aws_rds_auth.py`	Created new test package with comprehensive unit tests validating Pydantic configuration behavior, `generate_rds_auth_token` function with various error scenarios, and database engine setup assertions for IAM vs password modes.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant DB as SQLAlchemy Engine
    participant Listener as do_connect Listener
    participant AWSAuth as generate_rds_auth_token
    participant Boto3 as boto3 Session
    participant RDS as AWS RDS

    App->>DB: Initialize connection pool
    App->>DB: Request connection
    DB->>Listener: Emit do_connect event
    Listener->>AWSAuth: Call generate_rds_auth_token()
    AWSAuth->>Boto3: Create Session (with optional profile)
    Boto3->>RDS: Call generate_db_auth_token()
    RDS-->>Boto3: Return short-lived token
    Boto3-->>AWSAuth: Return token
    AWSAuth-->>Listener: Return token
    Listener->>DB: Inject token into password field
    DB->>RDS: Connect using IAM token + SSL
    RDS-->>DB: Connection established
    DB-->>App: Provide connection

sequenceDiagram
    participant Alembic as Alembic Migration
    participant InitDB as init_db() Function
    participant AWSAuth as generate_rds_auth_token
    participant Boto3 as boto3 Session
    participant RDS as AWS RDS

    InitDB->>InitDB: Check AUTH_METHOD == "iam"
    InitDB->>AWSAuth: Generate migration token
    AWSAuth->>Boto3: Create Session
    Boto3->>RDS: generate_db_auth_token()
    RDS-->>Boto3: Return token
    Boto3-->>AWSAuth: Return token
    AWSAuth-->>InitDB: Return token
    InitDB->>InitDB: URL-encode token
    InitDB->>InitDB: Rebuild sqlalchemy.url with RDS host/port/username/token
    InitDB->>InitDB: Add SSL query parameters (sslmode, sslrootcert)
    InitDB->>Alembic: Provide configured URL
    Alembic->>RDS: Execute migrations over IAM-authenticated connection
    RDS-->>Alembic: Migrations applied
    Alembic-->>InitDB: Complete

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hops through AWS clouds with glee,
IAM tokens dance so free,
No passwords stored, just boto3 cheer,
RDS secured, crystal clear!
CA bundles fetched, connections shine,
Short-lived auth—perfectly fine! 🔐✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 62.79% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'feat(db): add AWS RDS IAM authentication support' clearly and concisely summarizes the main objective of the changeset—adding IAM authentication support for AWS RDS connections.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (5)

mcp/src/tools/aws-status.pbt.test.ts (1)

60-87: Property coverage misses “field omitted” cases.

Line 81–87 always generates a full object, so the Line 153–154 “or null if not provided” path is never exercised. Consider switching to partial-record payload generation (or fc.option for individual keys) so null/omission behavior is actually tested.

Also applies to: 153-155
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcp/src/tools/aws-status.pbt.test.ts` around lines 60 - 87, The
property-based test currently builds a full object with arbHealthResponse (using
arbAuthMethod, arbHostname, arbPort, arbRegion, arbStatus) so the code-path that
expects keys to be omitted or null is never exercised; change arbHealthResponse
to generate partial/optional keys (e.g., use fc.partial({...}) so fields can be
omitted, or wrap individual fields with fc.option(...) to produce
null/undefined) so the test will sometimes omit
rds_hostname/rds_port/aws_region/auth_method and exercise the "or null if not
provided" branches.

tests/aws_auth/test_aws_rds_pbt.py (2)

235-250: Remove unused st.data() parameter.

The data parameter from st.data() is never used in this test. Since the test doesn't need any additional generated data, the decorator can be simplified.

♻️ Proposed fix

     `@settings`(max_examples=100)
-    `@given`(data=st.data())
+    `@given`(st.just(None))  # or remove `@given` entirely if no property input needed
     `@patch`("src.aws_auth.boto3.Session")
-    def test_no_credentials_error_is_descriptive(self, mock_session_cls, data):
+    def test_no_credentials_error_is_descriptive(self, mock_session_cls, _):
         # Feature: aws-mcp-postgres, Property 4: Token generation errors are descriptive

Or simply remove the @given decorator if no property-based input is needed:

     `@settings`(max_examples=100)
-    `@given`(data=st.data())
     `@patch`("src.aws_auth.boto3.Session")
-    def test_no_credentials_error_is_descriptive(self, mock_session_cls, data):
+    def test_no_credentials_error_is_descriptive(self, mock_session_cls):

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/aws_auth/test_aws_rds_pbt.py` around lines 235 - 250, The test
test_no_credentials_error_is_descriptive declares an unused property-based input
by accepting data=st.data() via the `@given` decorator; remove the unused
st.data() parameter and the corresponding "data" argument from the test
signature, and either drop the `@given` decorator entirely or replace it with
`@settings`(max_examples=100) only so the test uses the patched boto3.Session and
MagicMock without expecting Hypothesis input (refer to
test_no_credentials_error_is_descriptive, st.data(), and the `@given` decorator).

614-617: Remove unnecessary f-string prefix.

The string literal has no placeholders, so the f prefix is unnecessary.

♻️ Proposed fix

         else:
             mock_client.generate_db_auth_token.side_effect = EndpointConnectionError(
-                endpoint_url=f"https://rds.us-east-1.amazonaws.com"
+                endpoint_url="https://rds.us-east-1.amazonaws.com"
             )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/aws_auth/test_aws_rds_pbt.py` around lines 614 - 617, The
EndpointConnectionError instantiation uses an unnecessary f-string for a plain
literal; update the test so mock_client.generate_db_auth_token.side_effect =
EndpointConnectionError(endpoint_url="https://rds.us-east-1.amazonaws.com")
(remove the leading f) to avoid the redundant f-string while keeping the same
endpoint_url value referenced in the mock_client.generate_db_auth_token setup.

src/db.py (2)

152-158: Consider extracting database name extraction to a helper function.

The database name extraction logic is duplicated (also at lines 26-27). A small helper function would improve maintainability.

♻️ Proposed helper function

def _extract_db_name(connection_uri: str) -> str:
    """Extract database name from a PostgreSQL connection URI."""
    db_part = connection_uri.rsplit("/", 1)[-1] if "/" in connection_uri else "postgres"
    return db_part.split("?")[0] or "postgres"

Then use _extract_db_name(settings.DB.CONNECTION_URI) in both locations.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/db.py` around lines 152 - 158, Extract the duplicated DB name parsing
into a helper called _extract_db_name(connection_uri: str) that implements the
existing logic (use connection_uri.rsplit("/", 1)[-1] if "/" in connection_uri
else "postgres", then split on "?" and fallback to "postgres"); replace both
places that compute db_name from settings.DB.CONNECTION_URI with a call to
_extract_db_name(settings.DB.CONNECTION_URI) so the parsing logic is centralized
and easy to reuse (update any references to db_name and remove the duplicated
parsing snippets).

139-147: Consider catching specific exception types.

The bare except Exception catches all exceptions including keyboard interrupts. Since generate_rds_auth_token raises RuntimeError for AWS-related failures, catching RuntimeError specifically would be more precise while still logging the context.

♻️ Proposed fix

         try:
             token = generate_rds_auth_token(
                 region=settings.DB.AWS_REGION,
                 hostname=settings.DB.RDS_HOSTNAME,
                 port=settings.DB.RDS_PORT,
                 username=settings.DB.RDS_USERNAME,
                 profile=settings.DB.AWS_PROFILE,
             )
-        except Exception:
+        except RuntimeError:
             logger.error(
                 "Failed to generate IAM auth token for Alembic migration "
                 "(region=%s, hostname=%s, username=%s)",
                 settings.DB.AWS_REGION,
                 settings.DB.RDS_HOSTNAME,
                 settings.DB.RDS_USERNAME,
             )
             raise

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/db.py` around lines 139 - 147, Change the bare except in the Alembic IAM
auth token generation to catch RuntimeError instead of Exception to avoid
swallowing unrelated interrupts; specifically, update the except block around
generate_rds_auth_token (the block that currently calls logger.error with
settings.DB.AWS_REGION, settings.DB.RDS_HOSTNAME, settings.DB.RDS_USERNAME) to
"except RuntimeError:" (log the same contextual message via logger.error and
then re-raise) so AWS-related failures are handled explicitly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker-compose.yml.example`:
- Around line 32-43: The commented AWS RDS IAM env vars create a second
environment key if uncommented; update the docker-compose.yml.example to avoid
duplicate keys by moving those commented DB_* and CACHE_* entries into the
existing service "environment" block (or change the comment to instruct users to
replace the entire environment block rather than uncommenting), and clarify
around the existing env_file/environment keys so users only enable variables
within the single environment section (reference the env_file line and the
existing environment block in the service).

In `@Dockerfile`:
- Around line 41-44: The Dockerfile step that downloads global-bundle.pem
currently fetches it without integrity checks or retries; update the RUN that
creates /usr/local/share/aws and downloads global-bundle.pem to (1) accept or
set a known SHA256 checksum (via a build ARG or hardcoded variable), (2) use a
downloader with retry/timeout flags (curl or wget with --retry/--tries, --fail,
and reasonable timeouts) and (3) immediately verify the file with sha256sum (or
shasum) and fail the build if the checksum does not match, so the image build
aborts on mismatch; reference the existing RUN that writes
/usr/local/share/aws/global-bundle.pem and ensure the command returns non-zero
on download or verification failure.

In `@mcp/src/tools/aws-status.test.ts`:
- Around line 71-102: The test is mocking and asserting fields (auth_method,
rds_hostname, rds_port, aws_region) that the real /health endpoint does not
return; update the test in aws-status.test.ts to reflect the actual /health
contract by changing the mocked healthPayload and assertions to only
include/expect the real fields (e.g., status) or else adjust the
register/getHandler logic to transform /health into the extended shape;
specifically modify the vi.stubGlobal mockResolvedValue in the "returns healthy
response with all required fields" case (and the other affected cases around
lines 139-163 and 165-188) and the subsequent parsed assertions so they either
assert only on status (and connection_healthy/error as produced by your handler)
or mock the handler's upstream response shape that register/getHandler expects.

In `@mcp/src/tools/aws-status.ts`:
- Around line 18-23: The aws_rds_status tool is assuming /health returns
auth_method, rds_hostname, rds_port and aws_region but the backend /health only
returns { "status": "ok" }; update aws_rds_status to either call the correct
diagnostics endpoint that provides DB/IAM info (replace the /health fetch with
that endpoint) or modify aws_rds_status to align with the actual payload by
removing references to auth_method/rds_hostname/rds_port/aws_region, updating
the description text to match the real contract, and defensively handling
missing fields (e.g., return null/unknown or a clear error message when those
properties are absent). Ensure changes target the aws_rds_status function and
the description string so the code and docs remain consistent.

In `@src/aws_auth.py`:
- Around line 54-100: Replace the generic RuntimeError raises in the exception
handlers for NoCredentialsError, ClientError, and EndpointConnectionError with
the project-specific exceptions from src/exceptions.py: raise
AuthenticationException for credential/access failures (NoCredentialsError and
AccessDenied/Forbidden ClientError paths) and raise ValidationException for
misconfiguration/endpoint issues (EndpointConnectionError and any client error
that indicates invalid configuration). Update the three handlers around the
logger.error calls (the except blocks catching NoCredentialsError, ClientError,
and EndpointConnectionError) to import and use AuthenticationException and
ValidationException, preserve the original chained exceptions using "from exc",
and keep the detailed error messages currently passed to RuntimeError.
- Around line 42-47: Wrap the AWS session/client creation in a try/except that
catches botocore.exceptions.ProfileNotFound and logs/raises a clear actionable
message mentioning DB_AWS_PROFILE; specifically, surround the boto3.Session(...)
and session.client("rds", ...) calls (where session.client and
client.generate_db_auth_token are used) with error handling that imports
ProfileNotFound from botocore.exceptions, and on that exception emit a
descriptive error (or raise a custom/ValueError) indicating the provided AWS
profile name is invalid and suggest verifying DB_AWS_PROFILE.

In `@src/config.py`:
- Around line 180-195: The _require_iam_fields validator in DBSettings currently
checks RDS_PORT with "is None", which misses invalid ints (e.g. 0 or >65535);
update the check in _require_iam_fields to validate the numeric range of
RDS_PORT (ensure 1 <= self.RDS_PORT <= 65535) and treat out-of-range values as
missing by appending "DB_RDS_PORT" to missing, so invalid port numbers fail the
same validation path as a missing port.

---

Nitpick comments:
In `@mcp/src/tools/aws-status.pbt.test.ts`:
- Around line 60-87: The property-based test currently builds a full object with
arbHealthResponse (using arbAuthMethod, arbHostname, arbPort, arbRegion,
arbStatus) so the code-path that expects keys to be omitted or null is never
exercised; change arbHealthResponse to generate partial/optional keys (e.g., use
fc.partial({...}) so fields can be omitted, or wrap individual fields with
fc.option(...) to produce null/undefined) so the test will sometimes omit
rds_hostname/rds_port/aws_region/auth_method and exercise the "or null if not
provided" branches.

In `@src/db.py`:
- Around line 152-158: Extract the duplicated DB name parsing into a helper
called _extract_db_name(connection_uri: str) that implements the existing logic
(use connection_uri.rsplit("/", 1)[-1] if "/" in connection_uri else "postgres",
then split on "?" and fallback to "postgres"); replace both places that compute
db_name from settings.DB.CONNECTION_URI with a call to
_extract_db_name(settings.DB.CONNECTION_URI) so the parsing logic is centralized
and easy to reuse (update any references to db_name and remove the duplicated
parsing snippets).
- Around line 139-147: Change the bare except in the Alembic IAM auth token
generation to catch RuntimeError instead of Exception to avoid swallowing
unrelated interrupts; specifically, update the except block around
generate_rds_auth_token (the block that currently calls logger.error with
settings.DB.AWS_REGION, settings.DB.RDS_HOSTNAME, settings.DB.RDS_USERNAME) to
"except RuntimeError:" (log the same contextual message via logger.error and
then re-raise) so AWS-related failures are handled explicitly.

In `@tests/aws_auth/test_aws_rds_pbt.py`:
- Around line 235-250: The test test_no_credentials_error_is_descriptive
declares an unused property-based input by accepting data=st.data() via the
`@given` decorator; remove the unused st.data() parameter and the corresponding
"data" argument from the test signature, and either drop the `@given` decorator
entirely or replace it with `@settings`(max_examples=100) only so the test uses
the patched boto3.Session and MagicMock without expecting Hypothesis input
(refer to test_no_credentials_error_is_descriptive, st.data(), and the `@given`
decorator).
- Around line 614-617: The EndpointConnectionError instantiation uses an
unnecessary f-string for a plain literal; update the test so
mock_client.generate_db_auth_token.side_effect =
EndpointConnectionError(endpoint_url="https://rds.us-east-1.amazonaws.com")
(remove the leading f) to avoid the redundant f-string while keeping the same
endpoint_url value referenced in the mock_client.generate_db_auth_token setup.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4612dbe0-ffa3-4b8f-97a6-6397378e6dea

📥 Commits

Reviewing files that changed from the base of the PR and between 58f9abb and 48afe22.

📒 Files selected for processing (15)

.env.template
Dockerfile
config.toml.example
docker-compose.yml.example
mcp/src/server.ts
mcp/src/tools/aws-status.pbt.test.ts
mcp/src/tools/aws-status.test.ts
mcp/src/tools/aws-status.ts
pyproject.toml
src/aws_auth.py
src/config.py
src/db.py
tests/aws_auth/__init__.py
tests/aws_auth/test_aws_rds_auth.py
tests/aws_auth/test_aws_rds_pbt.py

coderabbitai · 2026-04-14T13:54:26Z

+    # -- AWS RDS IAM authentication (uncomment to enable) --
+    # environment:
+    #   - DB_AUTH_METHOD=iam
+    #   - DB_AWS_REGION=us-east-1
+    #   - DB_RDS_HOSTNAME=your-rds-instance.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com
+    #   - DB_RDS_PORT=5432
+    #   - DB_RDS_USERNAME=iam_db_user
+    #   - DB_AWS_PROFILE=                          # optional: named AWS credentials profile
+    #   - DB_RDS_SSL_CA_BUNDLE=/usr/local/share/aws/global-bundle.pem
+    #   - CACHE_URL=redis://redis:6379/0?suppress=true
+    #   - CACHE_ENABLED=true
    env_file:


⚠️ Potential issue | 🟡 Minor

Avoid documenting a second environment key per service.

Uncommenting these blocks as written can leave duplicate environment keys in a service, which is error-prone. Prefer documenting IAM variables inside the existing environment block (or explicitly instruct to replace the whole block).

Also applies to: 66-76

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docker-compose.yml.example` around lines 32 - 43, The commented AWS RDS IAM env vars create a second environment key if uncommented; update the docker-compose.yml.example to avoid duplicate keys by moving those commented DB_* and CACHE_* entries into the existing service "environment" block (or change the comment to instruct users to replace the entire environment block rather than uncommenting), and clarify around the existing env_file/environment keys so users only enable variables within the single environment section (reference the env_file line and the existing environment block in the service).

coderabbitai · 2026-04-14T13:54:26Z

+# Download AWS RDS CA certificate bundle for SSL connections to RDS
+RUN mkdir -p /usr/local/share/aws && \
+    python -c "import urllib.request; urllib.request.urlretrieve('https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem', '/usr/local/share/aws/global-bundle.pem')"
+


⚠️ Potential issue | 🟠 Major

Harden CA bundle download (integrity + retry).

The current build step fetches trust material without integrity verification, which weakens supply-chain guarantees and reproducibility.

🔒 Suggested hardening

+# Optional: pin expected digest via build arg and rotate intentionally. +ARG RDS_CA_BUNDLE_SHA256 # Download AWS RDS CA certificate bundle for SSL connections to RDS -RUN mkdir -p /usr/local/share/aws && \ - python -c "import urllib.request; urllib.request.urlretrieve('https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem', '/usr/local/share/aws/global-bundle.pem')" +RUN mkdir -p /usr/local/share/aws && \ + python - <<'PY' +import hashlib +import urllib.request +from pathlib import Path + +url = "https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem" +dst = Path("/usr/local/share/aws/global-bundle.pem") +urllib.request.urlretrieve(url, dst) +if not dst.exists() or dst.stat().st_size == 0: + raise RuntimeError("Failed to download AWS RDS CA bundle") +PY

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@Dockerfile` around lines 41 - 44, The Dockerfile step that downloads global-bundle.pem currently fetches it without integrity checks or retries; update the RUN that creates /usr/local/share/aws and downloads global-bundle.pem to (1) accept or set a known SHA256 checksum (via a build ARG or hardcoded variable), (2) use a downloader with retry/timeout flags (curl or wget with --retry/--tries, --fail, and reasonable timeouts) and (3) immediately verify the file with sha256sum (or shasum) and fail the build if the checksum does not match, so the image build aborts on mismatch; reference the existing RUN that writes /usr/local/share/aws/global-bundle.pem and ensure the command returns non-zero on download or verification failure.

coderabbitai · 2026-04-14T13:54:26Z

+  it("returns healthy response with all required fields", async () => {
+    const { server, getHandler } = createMockServer();
+    const ctx = createMockContext({ baseUrl: "https://api.honcho.dev" });
+    register(server as never, ctx);
+
+    const healthPayload = {
+      status: "ok",
+      auth_method: "iam",
+      rds_hostname: "mydb.rds.amazonaws.com",
+      rds_port: 5432,
+      aws_region: "us-east-1",
+    };
+
+    vi.stubGlobal(
+      "fetch",
+      vi.fn().mockResolvedValue({
+        ok: true,
+        json: () => Promise.resolve(healthPayload),
+      }),
+    );
+
+    const result = await getHandler()();
+    expect(result.isError).toBeUndefined();
+
+    const parsed = JSON.parse(result.content[0].text);
+    expect(parsed.auth_method).toBe("iam");
+    expect(parsed.rds_hostname).toBe("mydb.rds.amazonaws.com");
+    expect(parsed.rds_port).toBe(5432);
+    expect(parsed.aws_region).toBe("us-east-1");
+    expect(parsed.connection_healthy).toBe(true);
+    expect(parsed.error).toBeNull();
+  });


⚠️ Potential issue | 🟠 Major

Tests currently validate a non-existent /health response shape.

Line 76 onward mocks auth_method, rds_hostname, rds_port, and aws_region as if /health returns them, then asserts them as expected output. Given current backend /health only returns status, this suite can pass while production behavior is incomplete. Please align mocks/assertions with the real endpoint contract (or update endpoint/tool together).

Also applies to: 139-163, 165-188

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@mcp/src/tools/aws-status.test.ts` around lines 71 - 102, The test is mocking and asserting fields (auth_method, rds_hostname, rds_port, aws_region) that the real /health endpoint does not return; update the test in aws-status.test.ts to reflect the actual /health contract by changing the mocked healthPayload and assertions to only include/expect the real fields (e.g., status) or else adjust the register/getHandler logic to transform /health into the extended shape; specifically modify the vi.stubGlobal mockResolvedValue in the "returns healthy response with all required fields" case (and the other affected cases around lines 139-163 and 165-188) and the subsequent parsed assertions so they either assert only on status (and connection_healthy/error as produced by your handler) or mock the handler's upstream response shape that register/getHandler expects.

coderabbitai · 2026-04-14T13:54:26Z

+    except NoCredentialsError as exc:
+        logger.error(
+            "AWS credentials not found for RDS IAM auth "
+            "(region=%s, hostname=%s, username=%s): %s",
+            region,
+            hostname,
+            username,
+            exc,
+        )
+        raise RuntimeError(
+            "AWS credentials not found. Ensure an IAM role is attached, "
+            "AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) "
+            "are set, or DB_AWS_PROFILE is configured with a valid profile."
+        ) from exc
+    except ClientError as exc:
+        logger.error(
+            "AWS client error during RDS IAM token generation "
+            "(region=%s, hostname=%s, username=%s): %s",
+            region,
+            hostname,
+            username,
+            exc,
+        )
+        error_code = exc.response.get("Error", {}).get("Code", "")
+        if "AccessDenied" in error_code or "Forbidden" in error_code:
+            raise RuntimeError(
+                "Access denied when generating RDS IAM auth token. "
+                "Ensure the IAM policy grants the 'rds-db:connect' action "
+                f"for the database user '{username}' on the target RDS resource."
+            ) from exc
+        raise RuntimeError(
+            f"AWS client error during RDS IAM token generation: {exc}"
+        ) from exc
+    except EndpointConnectionError as exc:
+        logger.error(
+            "Cannot reach AWS endpoint for RDS IAM token generation "
+            "(region=%s, hostname=%s, username=%s): %s",
+            region,
+            hostname,
+            username,
+            exc,
+        )
+        raise RuntimeError(
+            f"Cannot connect to AWS STS/RDS endpoint in region '{region}'. "
+            "Check network connectivity, VPC configuration, and ensure the "
+            "region is correct."
+        ) from exc


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify available project exception types and current usage patterns. fd -i exceptions.py src sed -n '1,240p' src/exceptions.py rg -n "raise .*Exception|class .*Exception" src -g '*.py' rg -n "raise RuntimeError" src/aws_auth.py src/db.py

Repository: plastic-labs/honcho

Length of output: 12017

Use project-specific exceptions instead of generic RuntimeError.

Raise AuthenticationException for credential/access failures and ValidationException for misconfiguration issues to maintain typed semantics and align with project conventions (see src/exceptions.py).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/aws_auth.py` around lines 54 - 100, Replace the generic RuntimeError raises in the exception handlers for NoCredentialsError, ClientError, and EndpointConnectionError with the project-specific exceptions from src/exceptions.py: raise AuthenticationException for credential/access failures (NoCredentialsError and AccessDenied/Forbidden ClientError paths) and raise ValidationException for misconfiguration/endpoint issues (EndpointConnectionError and any client error that indicates invalid configuration). Update the three handlers around the logger.error calls (the except blocks catching NoCredentialsError, ClientError, and EndpointConnectionError) to import and use AuthenticationException and ValidationException, preserve the original chained exceptions using "from exc", and keep the detailed error messages currently passed to RuntimeError.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@mcp/src/tools/aws-status.ts`:
- Around line 29-34: The fetch to healthUrl using ctx.config.baseUrl and
ctx.config.apiKey must be protected with an AbortController and timeout so the
tool cannot hang; update the code around the fetch call to create an
AbortController, pass controller.signal to fetch(healthUrl, { headers:...,
signal }), start a timer (e.g., const timeoutId = setTimeout(() =>
controller.abort(), <timeoutMs>)) before the fetch, clearTimeout(timeoutId)
after the response is received (or in a finally block), and ensure you handle
the abort error path (inspect the thrown error or error.name === 'AbortError')
so the function fails fast on timeout instead of hanging.
- Around line 44-48: The code builds an AwsRdsStatusResult by asserting fields
on an unknown JSON object (variable data) which can produce invalid runtime
values; change each assertion to perform runtime type checks instead (e.g., use
typeof checks or helper predicates isString/isNumber) and only assign the value
when the check passes, otherwise set null; specifically update the mapping for
auth_method, rds_hostname, aws_region to check typeof === 'string' and for
rds_port to check typeof === 'number' (or parse/validate an integer if strings
are allowed), or extract these checks into a small validator function used by
the code that constructs AwsRdsStatusResult so the output conforms to the
expected runtime types.

In `@src/db.py`:
- Around line 24-27: _extract_db_name() and the IAM URL-rebuild logic
reconstruct the connection URI by naive string splits and drops the original
path and query; instead, parse the provided DB_CONNECTION_URI with
urllib.parse.urlsplit/urlunsplit (or sqlalchemy.engine.url.make_url) to extract
components, preserve the original path and query, and only replace/overlay the
username, password (IAM token), and scheme/port if needed for IAM branches;
update the IAM-rewriting code blocks that currently call _extract_db_name() and
build new URIs (the IAM branch around the earlier helper and the later IAM
branch in the same module) to resume the original parsed path/query (including
empty path) and attach the new credentials/token while leaving existing query
params intact.

In `@tests/aws_auth/test_aws_rds_pbt.py`:
- Around line 133-153: The test is re-implementing src.db behavior instead of
exercising it; update the property tests (e.g., the one using
generate_rds_auth_token) to patch src.config.settings values, reload/import
src.db, and assert against src.db's exported artifacts (connection_uri,
connect_args, do_connect behavior and init_db/Alembic URL) rather than calling
generate_rds_auth_token or rebuilding URI logic in the test; specifically,
replace direct calls to generate_rds_auth_token and manual token assembly with
patches to src.config.settings, reload src.db (or importlib.reload), spy/patch
the AWS client used inside src.db (the same client used by
generate_rds_auth_token) to return distinct tokens, then assert
src.db.connection_uri, src.db.connect_args, that src.db.do_connect injects a
fresh token per connection, and that init_db() yields the correct Alembic URL.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a2167c07-5f06-4bbd-80f4-81e38b216a5f

📥 Commits

Reviewing files that changed from the base of the PR and between 48afe22 and b24aca7.

⛔ Files ignored due to path filters (2)

mcp/package-lock.json is excluded by !**/package-lock.json
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

.gitignore
Dockerfile
docker-compose.yml.example
mcp/package.json
mcp/src/tools/aws-status.ts
src/aws_auth.py
src/config.py
src/db.py
tests/aws_auth/test_aws_rds_pbt.py

✅ Files skipped from review due to trivial changes (3)

.gitignore
mcp/package.json
docker-compose.yml.example

🚧 Files skipped from review as they are similar to previous changes (2)

Dockerfile
src/aws_auth.py

coderabbitai · 2026-04-14T14:07:23Z

+    def test_each_connection_gets_unique_token(self, mock_session_cls, num_connections):
+        # Feature: aws-mcp-postgres, Property 2: Fresh token injection on every connection
+        tokens = [f"token-{i}" for i in range(num_connections)]
+        mock_client = MagicMock()
+        mock_client.generate_db_auth_token.side_effect = tokens
+        mock_session_cls.return_value.client.return_value = mock_client
+
+        collected_tokens = []
+        for _ in range(num_connections):
+            t = generate_rds_auth_token(
+                region="us-east-1",
+                hostname="db.rds.amazonaws.com",
+                port=5432,
+                username="user",
+            )
+            collected_tokens.append(t)
+
+        # Each call should have produced a distinct token
+        assert len(collected_tokens) == num_connections
+        assert len(set(collected_tokens)) == num_connections
+        assert mock_client.generate_db_auth_token.call_count == num_connections


⚠️ Potential issue | 🟠 Major

These properties are asserting copied logic, not src.db behavior.

Properties 2/5/10/11/13/14/15 rebuild the src.db conditionals and URI assembly inside the test body, and Property 15 never calls init_db() at all. That means a regression in the real module can still leave this file green. Please exercise src.db directly here—e.g. patch src.config.settings, reload src.db, and assert the actual connection_uri, connect_args, do_connect token injection, and init_db() Alembic URL behavior instead of re-implementing them in the tests.

Also applies to: 287-298, 400-405, 434-440, 516-529, 560-582, 603-627

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/aws_auth/test_aws_rds_pbt.py` around lines 133 - 153, The test is re-implementing src.db behavior instead of exercising it; update the property tests (e.g., the one using generate_rds_auth_token) to patch src.config.settings values, reload/import src.db, and assert against src.db's exported artifacts (connection_uri, connect_args, do_connect behavior and init_db/Alembic URL) rather than calling generate_rds_auth_token or rebuilding URI logic in the test; specifically, replace direct calls to generate_rds_auth_token and manual token assembly with patches to src.config.settings, reload src.db (or importlib.reload), spy/patch the AWS client used inside src.db (the same client used by generate_rds_auth_token) to return distinct tokens, then assert src.db.connection_uri, src.db.connect_args, that src.db.do_connect injects a fresh token per connection, and that init_db() yields the correct Alembic URL.

ajspig · 2026-04-14T21:01:38Z

▎ Hey @fossouo — thanks for tackling this! IAM-based RDS auth is a useful addition for production deployments.

A couple things before this can move forward:

Scope: The core feature here is IAM token-based DB connections, but this PR includes a lot of extras — MCP tools (aws-status.ts), property-based tests, monitoring tooling — that push it to 6,400+ lines. Could you trim this down to just the IAM auth plumbing (src/db.py, src/config.py, .env.template, and focused unit tests)? The MCP tools and status monitoring can be a separate PR if there's interest.
Conflicts: Refactor clients.py to add modern features and more flexible configuration #459 is an in-flight config refactor that touches config.py, .env.template, and pyproject.toml. You'll want to wait for that to land and rebase on top of it before updating this PR.

Happy to review once it's scoped down and rebased!

fossouo · 2026-04-15T08:08:12Z

▎ Hey @fossouo — thanks for tackling this! IAM-based RDS auth is a useful addition for production deployments.

A couple things before this can move forward:

Scope: The core feature here is IAM token-based DB connections, but this PR includes a lot of extras — MCP tools (aws-status.ts), property-based tests, monitoring tooling — that push it to 6,400+ lines. Could you trim this down to just the IAM auth plumbing (src/db.py, src/config.py, .env.template, and focused unit tests)? The MCP tools and status monitoring can be a separate PR if there's interest.

Conflicts: Refactor clients.py to add modern features and more flexible configuration #459 is an in-flight config refactor that touches config.py, .env.template, and pyproject.toml. You'll want to wait for that to land and rebase on top of it before updating this PR.

Happy to review once it's scoped down and rebased!

Thanks for the feedback! I'll trim this down to just the core IAM auth plumbing (config, aws_auth, db engine, Dockerfile, and focused unit tests) and move the MCP tools + property-based tests to a follow-up PR.

I see #459 is still in flight — I'll wait for it to land and rebase on top before updating. Will ping here once it's ready for another look.

Add support for connecting to AWS RDS PostgreSQL using IAM authentication tokens instead of static passwords. This enables operators to leverage AWS IAM policies for database access control. Changes: - Add AUTH_METHOD, AWS_REGION, RDS_HOSTNAME, RDS_PORT, RDS_USERNAME, AWS_PROFILE, and RDS_SSL_CA_BUNDLE settings to DBSettings - Create aws_auth module for IAM token generation via boto3 - Update database engine with do_connect event listener for token injection - Add SSL enforcement and pool tuning for IAM mode - Update Alembic migrations to support IAM authentication - Add aws_rds_status MCP tool for connectivity diagnostics - Update Dockerfile with AWS RDS CA certificate bundle - Add unit tests and property-based tests (hypothesis + fast-check) - Add boto3 dependency

- Add ProfileNotFound exception handling in aws_auth.py - Validate RDS_PORT range (1-65535) via Annotated Field - Extract _extract_db_name helper to deduplicate DB name parsing - Narrow except clause to RuntimeError in init_db - Harden Dockerfile CA bundle download with existence check - Fix docker-compose duplicate environment key issue - Remove unused st.data() and unnecessary f-string in PBT tests - Update MCP tool description for defensive field handling

- Add AbortController timeout (5s) to MCP health check fetch - Use runtime typeof checks instead of TS type assertions for JSON fields - Parse CONNECTION_URI with urlsplit/urlunsplit to preserve path and query params - Overlay only IAM-specific pieces (username, password, host, port) on parsed URI

Remove MCP aws_rds_status tool, property-based tests, and lock file to reduce scope per maintainer feedback. Rebased on merged plastic-labs#459. MCP tools and PBT tests will be a follow-up PR.

coderabbitai

🧹 Nitpick comments (2)

tests/aws_auth/test_aws_rds_auth.py (2)

242-270: Test simulates logic rather than testing actual implementation.

test_iam_mode_engine_setup duplicates the URI construction logic from db.py rather than importing and testing the actual module. While this avoids module re-import complexity, it means the test won't catch regressions if db.py's logic changes.

Consider extracting the URI construction into a helper function in db.py that can be unit-tested directly, or document this as an integration concern.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/aws_auth/test_aws_rds_auth.py` around lines 242 - 270, The test
test_iam_mode_engine_setup duplicates the URI construction logic instead of
exercising the real implementation; refactor by moving the URI construction into
a small, testable helper in db.py (e.g., a function build_rds_uri(db_settings:
DBSettings) or get_iam_rds_uri) and update the test to import and call that
helper (or alternatively change the test to import db.py and call the existing
function that performs the same work). Ensure the helper accepts the
DBSettings-like input and returns the URI string used by the engine, then
replace the duplicated construction in the test with a call to that helper so
regressions in db.py are caught by the unit test.

223-240: Test assertions may not verify intended behavior.

Tests test_password_mode_no_do_connect_listener and test_password_mode_uses_connection_uri guard on os.environ.get("DB_AUTH_METHOD") != "iam" but don't actually verify the absence of the listener or the module's configuration state—they just assert the settings value. The tests pass when the environment condition holds but don't fail meaningfully if the condition is false.

Consider restructuring to use pytest.mark.skipif with a clear skip reason, or mock the settings during test setup to control the test environment deterministically.

♻️ Suggested refactor

+    `@pytest.mark.skipif`(
+        os.environ.get("DB_AUTH_METHOD") == "iam",
+        reason="Test only applicable in password mode"
+    )
     def test_password_mode_no_do_connect_listener(self):
         """In password mode the engine should NOT have a do_connect listener
         that injects IAM tokens."""
-        # The default settings use password mode. Verify that the db module
-        # did not register the IAM token injection path.
-        if os.environ.get("DB_AUTH_METHOD") != "iam":
-            from src import db as db_module
-
-            # In password mode, the module should not have imported aws_auth
-            # at module level (it's only imported inside the iam branch).
-            assert settings.DB.AUTH_METHOD == "password"
+        from src import db as db_module
+        # In password mode, the module should not have imported aws_auth
+        # at module level (it's only imported inside the iam branch).
+        assert settings.DB.AUTH_METHOD == "password"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/aws_auth/test_aws_rds_auth.py` around lines 223 - 240, The two tests
(test_password_mode_no_do_connect_listener,
test_password_mode_uses_connection_uri) currently gate behavior with a runtime
os.environ check and only assert settings.DB.AUTH_METHOD; update them to be
deterministic by either applying
pytest.mark.skipif(os.environ.get("DB_AUTH_METHOD") == "iam", reason="requires
password mode") or by mocking the environment/settings to force password mode
before importing src.db; then assert the actual behaviors: for
test_password_mode_no_do_connect_listener import src.db as db_module and verify
aws_auth was not imported at module level and that no IAM-token do_connect
listener is registered on the engine (check the engine event listeners list or
db_module.engine.dispatch), and for test_password_mode_uses_connection_uri
import connection_uri from src.db and assert connection_uri ==
settings.DB.CONNECTION_URI after forcing password mode.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/aws_auth/test_aws_rds_auth.py`:
- Around line 242-270: The test test_iam_mode_engine_setup duplicates the URI
construction logic instead of exercising the real implementation; refactor by
moving the URI construction into a small, testable helper in db.py (e.g., a
function build_rds_uri(db_settings: DBSettings) or get_iam_rds_uri) and update
the test to import and call that helper (or alternatively change the test to
import db.py and call the existing function that performs the same work). Ensure
the helper accepts the DBSettings-like input and returns the URI string used by
the engine, then replace the duplicated construction in the test with a call to
that helper so regressions in db.py are caught by the unit test.
- Around line 223-240: The two tests (test_password_mode_no_do_connect_listener,
test_password_mode_uses_connection_uri) currently gate behavior with a runtime
os.environ check and only assert settings.DB.AUTH_METHOD; update them to be
deterministic by either applying
pytest.mark.skipif(os.environ.get("DB_AUTH_METHOD") == "iam", reason="requires
password mode") or by mocking the environment/settings to force password mode
before importing src.db; then assert the actual behaviors: for
test_password_mode_no_do_connect_listener import src.db as db_module and verify
aws_auth was not imported at module level and that no IAM-token do_connect
listener is registered on the engine (check the engine event listeners list or
db_module.engine.dispatch), and for test_password_mode_uses_connection_uri
import connection_uri from src.db and assert connection_uri ==
settings.DB.CONNECTION_URI after forcing password mode.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 472c1193-861c-48e2-a31a-1aee9efdc844

📥 Commits

Reviewing files that changed from the base of the PR and between 5a3872d and 37d7d04.

⛔ Files ignored due to path filters (2)

mcp/package-lock.json is excluded by !**/package-lock.json
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (11)

.env.template
.gitignore
Dockerfile
config.toml.example
docker-compose.yml.example
pyproject.toml
src/aws_auth.py
src/config.py
src/db.py
tests/aws_auth/__init__.py
tests/aws_auth/test_aws_rds_auth.py

✅ Files skipped from review due to trivial changes (6)

tests/aws_auth/init.py
pyproject.toml
docker-compose.yml.example
.env.template
.gitignore
config.toml.example

🚧 Files skipped from review as they are similar to previous changes (1)

Dockerfile

coderabbitai Bot reviewed Apr 14, 2026

View reviewed changes

ajspig added the needs-changes label Apr 14, 2026

dfosso added 5 commits April 28, 2026 16:22

chore: exclude .kiro specs and test script from PR

c8d4f92

refactor: scope PR to core IAM auth plumbing, rebase on plastic-labs#459

37d7d04

Remove MCP aws_rds_status tool, property-based tests, and lock file to reduce scope per maintainer feedback. Rebased on merged plastic-labs#459. MCP tools and PBT tests will be a follow-up PR.

fossouo force-pushed the feature/aws-rds-iam-authentication branch from 5a3872d to 37d7d04 Compare April 28, 2026 14:28

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

Conversation

fossouo commented Apr 14, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Backward Compatibility

Tested

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated Code Review Effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

ajspig commented Apr 14, 2026

Uh oh!

fossouo commented Apr 15, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fossouo commented Apr 14, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 14, 2026 •

edited

Loading