source-sage-intacct: capture records with a null WHENMODIFIED & capture deletions only when we have permission#4465
Merged
Conversation
null WHENMODIFIED & capture deletions only when we have permission
92b0439 to
6fed094
Compare
nicolaslazo
reviewed
May 14, 2026
Comment on lines
+211
to
+227
| # CreationRecord captures records that exist in Sage with a null WHENMODIFIED. | ||
| # They cannot be captured by the WHENMODIFIED-keyed incremental query, so a | ||
| # parallel sub-task keyed on WHENCREATED picks them up. | ||
| class CreationRecord(BaseDocument, extra="allow"): | ||
| RECORDNO: int | ||
| WHENCREATED: AwareDatetime | ||
|
|
||
| def cursor_value(self) -> AwareDatetime: | ||
| return self.WHENCREATED | ||
|
|
||
|
|
||
| def parse_backfill_record(raw: dict) -> "IncrementalResource | CreationRecord": | ||
| """Routes to IncrementalResource when WHENMODIFIED is present, else | ||
| CreationRecord.""" | ||
| if "WHENMODIFIED" in raw: | ||
| return IncrementalResource.model_validate(raw) | ||
| return CreationRecord.model_validate(raw) |
Contributor
There was a problem hiding this comment.
CreationRecord captures records that exist in Sage with a null WHENMODIFIED
Shouldn't parse_backfill_record also be checking if WHENMOFIED is None?
Member
Author
There was a problem hiding this comment.
We don't necessarily need to. parse_backfill_record is used to parse dumped SageRecords, and when SageRecords are dumped any None values are excluded:
But it is safer to use raw.get("WHENMODIFIED") instead of "WHENMODIFIED" in raw when determining whether to validate with IncrementalRecord or CreationRecord in case how SageRecord serialization behavior changes later. I've updated this check to cover if WHENMODIFIED is None too.
6fed094 to
b4b5173
Compare
Backfills have crashed validating records whose `WHENMODIFIED` was `null`. It turns out that it's possible for `WHENMODIFIED` to be `null`, and that creates a hole in our current incremental strategy that relies on the presence of the `WHENMODIFIED` field for incremental replication queries; these queries would miss records that have a `WHENCREATED` field but no `WHENMODIFIED` field. This commit fixes that gap with the following changes: - Split records into `IncrementalResource` (keyed on `WHENMODIFIED`) and new `CreationRecord` (keyed on `WHENCREATED`). Both expose `cursor_value()` so the generic fetch helper works against either. - `fetch_page` routes per-record by WHENMODIFIED presence, which should prevent the backfill from crashing when validating records without a `WHENMODIFIED` field. - Add `fetch_creations` + `realtime_creations` / `lookback_creations` sub-tasks that query `WHENMODIFIED IS NULL AND WHENCREATED > since`. - Add a 2-minute REALTIME_LAG horizon on the realtime modified sub-task so a near-simultaneous create-then-modify can't let the modified emission race ahead of the creation emission and overwrite newer state with older. - Add state for the creation subtasks to existing captures' state.
…eadable When the configured user lacks `AUDITHISTORY` read permission, the deletion sub-tasks crash the capture with `ValidationError: ['You do not have permission to view audit history ...]`. This denial arrives without an `errorno`, so `is_permission_error()`'s existing PL04000005 check doesn't recognize it and the error propagates as a fatal `ValidationError`. This commit updates the connector to not capture deletions if we cannot read the `AUDITHISTORY` object. This is done by omitting the deletion related subtasks from the `fetch_changes` `dict`. Note that the `initial_state` still contains state for the deletion subtasks - that's done so if/when permission to read the `AUDITHISTORY` is granted, the capture will automatically start reading deletions after its next restart.
…MODIFIED In the previous commit where incremental resources were updated to capture documents without a WHENMODIFIED field, I missed updating the `Resource.model` to reflect that `WHENMODIFIED` is not required. This caused schema violation errors; the connector was trying to send documents without `WHENMODIFIED` into a collection whose write schema required `WHENMODIFIED`. This commit fixes this by removing `WHENMODIFIED` from the model provided during `Resource` instantiation. That will cause the collection's write schema to drop `WHENMODIFIED` from it too & schema violation errors should subside.
b4b5173 to
f726b7e
Compare
Alex-Bair
added a commit
to estuary/flow
that referenced
this pull request
May 15, 2026
Documentation updates for estuary/connectors#4465.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description:
This PR's scope includes:
nullWHENMODIFIEDfield. AnullWHENMODIFIEDis evidently possible. I'm not exactly sure why it's possible, but regardless of whyWHENMODIFIEDisnull, I'm pretty sure the fix is to capture thesenullWHENMODIFIEDrecords based on theirWHENCREATEDfield instead.AUDITHISTORYobject required to figure out what's been deletedSee the commit messages for more details.
Documentation links affected:
The connector's documentation should be updated to mention that the connector needs permission to read the
AUDITHISTORYobject in order to capture deletions.Notes for reviewers:
Sanity checks with
flowctl raw discoverandflowctl previewbefore and after my changes with our testing account gave the same outputs.I also tested out the equivalent API calls for the new
get_creations_since_requestandget_creations_at_requestwith our test account, and they returned 200 OK as expected. They didn't return any records, but that's because there are no records in our test account without a non-nullWHENMODIFIEDfield.See this Slack thread for more context on the specific capture task hitting errors that will get unstuck after these changes are merged.
As part of this change, we've added a 2 minute lag to capturing updates to guard against the connector querying for updated and created records close together, then emitting the create after the update due to network or Sage API races. 2 minutes seems acceptable to me, but I can adjust it lower if we think 2 minutes is too much lag & the creation / update race doesn't seem valid.