source-sage-intacct: capture records with a null WHENMODIFIED & capture deletions only when we have permission#4465

Merged
Alex-Bair merged 3 commits into
mainfrom
bair/source-sage-intacct-more-misc-fixes
May 14, 2026
Conversation

@Alex-Bair
Member

@Alex-Bair Alex-Bair commented May 13, 2026

Description:

This PR's scope includes:

  • handling records that have a null WHENMODIFIED field. A null WHENMODIFIED is evidently possible. I'm not sure why, but whatever the cause, I'm fairly sure the fix is to capture these records based on their WHENCREATED field instead.
  • not attempting to capture deletions if we can't access the AUDITHISTORY object required to figure out what's been deleted

See the commit messages for more details.

Documentation links affected:

The connector's documentation should be updated to mention that the connector needs permission to read the AUDITHISTORY object in order to capture deletions.

Notes for reviewers:

Sanity checks with flowctl raw discover and flowctl preview before and after my changes with our testing account gave the same outputs.

I also tested the equivalent API calls for the new get_creations_since_request and get_creations_at_request with our test account, and they returned 200 OK as expected. They didn't return any records, but that's because every record in our test account has a non-null WHENMODIFIED field.

See this Slack thread for more context on the specific capture task hitting errors that will get unstuck after these changes are merged.

As part of this change, we've added a 2-minute lag to capturing updates. It guards against the connector querying for updated and created records close together, then emitting the create after the update because of network or Sage API races. 2 minutes seems acceptable to me, but I can lower it if we think that's too much lag and the creation/update race isn't a realistic concern.
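To make the lag concrete, here is a minimal sketch of how a lagged query window could be computed. It is an illustration under assumptions, not the connector's code: `modified_window` and its signature are hypothetical, and only the 2-minute `REALTIME_LAG` value comes from this PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# The 2-minute value is from this PR; everything else here is hypothetical.
REALTIME_LAG = timedelta(minutes=2)


def modified_window(
    cursor: datetime, now: datetime
) -> Optional[Tuple[datetime, datetime]]:
    """Return the (start, end] bounds for the next modified-records query.

    The end of the window trails `now` by REALTIME_LAG, so very recent
    modifications are left for a later poll. Returns None when nothing is
    old enough to query yet.
    """
    horizon = now - REALTIME_LAG
    if horizon <= cursor:
        return None
    return cursor, horizon


now = datetime(2026, 5, 14, 12, 0, tzinfo=timezone.utc)
window = modified_window(now - timedelta(minutes=10), now)
too_recent = modified_window(now - timedelta(minutes=1), now)
```

With this shape, a record created and modified within the last two minutes is not visible to the modified query yet, which gives the creation query time to emit it first.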

@Alex-Bair Alex-Bair changed the title Bair/source sage intacct more misc fixes source-sage-intacct: capture records with a null WHENMODIFIED & capture deletions only when we have permission May 13, 2026
@Alex-Bair Alex-Bair force-pushed the bair/source-sage-intacct-more-misc-fixes branch 3 times, most recently from 92b0439 to 6fed094 Compare May 14, 2026 13:45
@Alex-Bair Alex-Bair marked this pull request as ready for review May 14, 2026 14:08
@Alex-Bair Alex-Bair requested a review from a team May 14, 2026 14:08
Comment on lines +211 to +227
# CreationRecord captures records that exist in Sage with a null WHENMODIFIED.
# They cannot be captured by the WHENMODIFIED-keyed incremental query, so a
# parallel sub-task keyed on WHENCREATED picks them up.
class CreationRecord(BaseDocument, extra="allow"):
    RECORDNO: int
    WHENCREATED: AwareDatetime

    def cursor_value(self) -> AwareDatetime:
        return self.WHENCREATED


def parse_backfill_record(raw: dict) -> "IncrementalResource | CreationRecord":
    """Routes to IncrementalResource when WHENMODIFIED is present, else
    CreationRecord."""
    if "WHENMODIFIED" in raw:
        return IncrementalResource.model_validate(raw)
    return CreationRecord.model_validate(raw)
Contributor


CreationRecord captures records that exist in Sage with a null WHENMODIFIED

Shouldn't parse_backfill_record also be checking if WHENMODIFIED is None?

Member Author


We don't necessarily need to. parse_backfill_record is used to parse dumped SageRecords, and when SageRecords are dumped any None values are excluded:

kwargs.setdefault("exclude_none", True)

But it is safer to use raw.get("WHENMODIFIED") instead of "WHENMODIFIED" in raw when deciding whether to validate with IncrementalResource or CreationRecord, in case SageRecord serialization behavior changes later. I've updated this check to also cover the case where WHENMODIFIED is None.
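The difference between the two checks can be shown with plain dicts. This is a sketch only: `pick_model_name` is a hypothetical stand-in that returns the model's name rather than calling `model_validate`, and the `is not None` comparison is one assumed way to write the `raw.get("WHENMODIFIED")` check.

```python
from typing import Any, Dict


def pick_model_name(raw: Dict[str, Any]) -> str:
    """Hypothetical stand-in for the routing decision in parse_backfill_record.

    Treats a missing key and an explicit None the same way.
    """
    if raw.get("WHENMODIFIED") is not None:
        return "IncrementalResource"
    return "CreationRecord"


explicit_null = {"RECORDNO": 1, "WHENCREATED": "2026-05-01T00:00:00Z", "WHENMODIFIED": None}
missing_key = {"RECORDNO": 2, "WHENCREATED": "2026-05-01T00:00:00Z"}
modified = {"RECORDNO": 3, "WHENMODIFIED": "2026-05-02T00:00:00Z"}

# The membership test alone would send an explicit null down the wrong path.
membership_says_present = "WHENMODIFIED" in explicit_null
```

Here `membership_says_present` is `True` even though the value is `None`, which is the case the `.get()` form guards against if `exclude_none` ever stops being applied when records are dumped.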

@Alex-Bair Alex-Bair force-pushed the bair/source-sage-intacct-more-misc-fixes branch from 6fed094 to b4b5173 Compare May 14, 2026 17:11
Contributor

@nicolaslazo nicolaslazo left a comment


Looks good, thanks!

Alex-Bair added 3 commits May 14, 2026 16:34
Backfills have crashed validating records whose `WHENMODIFIED` was `null`.
It turns out that it's possible for `WHENMODIFIED` to be `null`, and
that creates a hole in our current incremental strategy that relies on
the presence of the `WHENMODIFIED` field for incremental replication queries;
these queries would miss records that have a `WHENCREATED` field but no
`WHENMODIFIED` field.

This commit fixes that gap with the following changes:
- Split records into `IncrementalResource` (keyed on `WHENMODIFIED`) and
  new `CreationRecord` (keyed on `WHENCREATED`). Both expose
  `cursor_value()` so the generic fetch helper works against either.
- `fetch_page` routes per-record by WHENMODIFIED presence, which should
  prevent the backfill from crashing when validating records without a
  `WHENMODIFIED` field.
- Add `fetch_creations` + `realtime_creations` / `lookback_creations`
  sub-tasks that query `WHENMODIFIED IS NULL AND WHENCREATED > since`.
- Add a 2-minute REALTIME_LAG horizon on the realtime modified
  sub-task so a near-simultaneous create-then-modify can't let the
  modified emission race ahead of the creation emission and overwrite
  newer state with older.
- Add state for the creation subtasks to existing captures' state.
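The split described in the list above can be illustrated with an in-memory filter. This is a sketch under assumptions, not the connector's query code: the real sub-tasks send the filter to the Sage API, while `select_creations` and `select_modifications` are hypothetical helpers that apply the same predicates to plain dicts.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

Record = Dict[str, Any]


def select_creations(records: List[Record], since: datetime) -> List[Record]:
    """Mirror of `WHENMODIFIED IS NULL AND WHENCREATED > since`."""
    return [
        r for r in records
        if r.get("WHENMODIFIED") is None and r["WHENCREATED"] > since
    ]


def select_modifications(records: List[Record], since: datetime) -> List[Record]:
    """Mirror of the WHENMODIFIED-keyed incremental query."""
    return [
        r for r in records
        if r.get("WHENMODIFIED") is not None and r["WHENMODIFIED"] > since
    ]


def ts(day: int) -> datetime:
    return datetime(2026, 5, day, tzinfo=timezone.utc)


records = [
    {"RECORDNO": 1, "WHENCREATED": ts(1), "WHENMODIFIED": ts(10)},
    {"RECORDNO": 2, "WHENCREATED": ts(9), "WHENMODIFIED": None},
    {"RECORDNO": 3, "WHENCREATED": ts(2), "WHENMODIFIED": None},
]
since = ts(5)
created_ids = [r["RECORDNO"] for r in select_creations(records, since)]
modified_ids = [r["RECORDNO"] for r in select_modifications(records, since)]
```

Record 2 is the kind of record the WHENMODIFIED-keyed query alone would never return, and each record is selected by at most one of the two predicates.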
…eadable

When the configured user lacks `AUDITHISTORY` read permission, the
deletion sub-tasks crash the capture with
`ValidationError: ['You do not have permission to view audit
history ...]`. This denial arrives without an `errorno`, so
`is_permission_error()`'s existing PL04000005 check doesn't recognize
it and the error propagates as a fatal `ValidationError`.

This commit updates the connector to not capture deletions if we cannot
read the `AUDITHISTORY` object. This is done by omitting the deletion
related subtasks from the `fetch_changes` `dict`. Note that the `initial_state`
still contains state for the deletion subtasks - that's done so if/when
permission to read the `AUDITHISTORY` is granted, the capture will automatically
start reading deletions after its next restart.
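A rough sketch of that arrangement follows. All names are hypothetical except `realtime_creations` and `lookback_creations`, which the first commit message mentions; in particular the deletion sub-task names, the `can_read_audit_history` flag, and the shape of the state are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical sub-task names; only the two *_creations names appear in this PR.
ALL_SUBTASKS = (
    "realtime",
    "lookback",
    "realtime_creations",
    "lookback_creations",
    "realtime_deletions",
    "lookback_deletions",
)
DELETION_SUBTASKS = frozenset({"realtime_deletions", "lookback_deletions"})


def initial_state() -> Dict[str, Optional[str]]:
    """State always carries an entry for every sub-task, deletions included."""
    return {name: None for name in ALL_SUBTASKS}


def active_subtasks(can_read_audit_history: bool) -> List[str]:
    """Sub-tasks that would be placed in the fetch_changes dict."""
    return [
        name for name in ALL_SUBTASKS
        if can_read_audit_history or name not in DELETION_SUBTASKS
    ]


without_permission = active_subtasks(can_read_audit_history=False)
with_permission = active_subtasks(can_read_audit_history=True)
```

Because the state keeps the deletion entries either way, flipping the permission only changes which sub-tasks run after a restart, not the state's shape.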
…MODIFIED

In the previous commit where incremental resources were updated to
capture documents without a WHENMODIFIED field, I missed updating the
`Resource.model` to reflect that `WHENMODIFIED` is not required. This
caused schema violation errors; the connector was trying to send
documents without `WHENMODIFIED` into a collection whose write schema
required `WHENMODIFIED`.

This commit fixes this by removing `WHENMODIFIED` from the model
provided during `Resource` instantiation. That will cause the
collection's write schema to drop `WHENMODIFIED` from it too & schema
violation errors should subside.
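The failure mode can be reproduced with a toy required-field check. This is only an illustration: the real write schema is derived from the `Resource` model, and the `OLD_REQUIRED` / `NEW_REQUIRED` lists and `missing_required` helper below are hypothetical.

```python
from typing import Any, Dict, List


def missing_required(doc: Dict[str, Any], required: List[str]) -> List[str]:
    """Return the required fields that are absent from the document."""
    return [field for field in required if field not in doc]


# A document produced by the creation sub-tasks: no WHENMODIFIED at all.
creation_doc = {"RECORDNO": 7, "WHENCREATED": "2026-05-01T00:00:00Z"}

OLD_REQUIRED = ["RECORDNO", "WHENMODIFIED"]  # hypothetical: before this commit
NEW_REQUIRED = ["RECORDNO"]                  # hypothetical: after this commit

old_violations = missing_required(creation_doc, OLD_REQUIRED)
new_violations = missing_required(creation_doc, NEW_REQUIRED)
```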
@Alex-Bair Alex-Bair force-pushed the bair/source-sage-intacct-more-misc-fixes branch from b4b5173 to f726b7e Compare May 14, 2026 20:35
@Alex-Bair Alex-Bair merged commit 8da94b3 into main May 14, 2026
123 of 129 checks passed
@Alex-Bair Alex-Bair deleted the bair/source-sage-intacct-more-misc-fixes branch May 14, 2026 20:41
Alex-Bair added a commit to estuary/flow that referenced this pull request May 15, 2026