Skip to content

Make soft_error! return Ok in OSS builds and remove .unwrap() at call#1276

Open
dkranchii wants to merge 2 commits into
mainfrom
soft_error_fix
Open

Make soft_error! return Ok in OSS builds and remove .unwrap() at call#1276
dkranchii wants to merge 2 commits into
mainfrom
soft_error_fix

Conversation

@dkranchii
Copy link
Copy Markdown
Contributor

Summary

soft_error! was designed to log non-critical errors and allow execution to continue. However, in open source builds, handle_soft_error unconditionally returned Err(err) instead of Ok(err), turning every soft error into a hard error. Combined with .unwrap() calls at multiple call sites, this caused panics/crashes for OSS users on non-critical failures like sqlite write errors or vacant path lookups.

Changes

app/buck2_env/src/soft_error.rs

  • Removed the is_open_source branch that forced all soft errors to return Err in OSS builds. soft_error! now returns Ok(err) uniformly, matching FB internal behavior. The BUCK2_HARD_ERROR env var still works for users who want to opt into hard errors.

app/buck2_execute_impl/src/materializers/deferred/artifact_tree.rs

  • Replaced .unwrap() with let _unused = on cleanup_finished_vacant soft error.

app/buck2_execute_impl/src/materializers/deferred/command_processor.rs

  • Replaced .unwrap() with let _unused = on 4 soft error call sites: clean_stale_no_config, materializer_materialize_error, has_artifact_update_time, and dynamic error_name in on_materialization.

app/buck2_execute_impl/src/sqlite/incremental_state_db.rs

  • Replaced .unwrap() with let _unused = on insert_to_incremental_db and delete_from_incremental_db soft errors.

Impact

  • OSS users will no longer experience unexpected panics from non-critical internal errors.
  • Errors are still logged/reported via the structured error handler.
  • BUCK2_HARD_ERROR env var continues to work as an opt-in escape hatch.
  • No change to FB internal behavior (was already returning Ok(err)).

Fixes

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 30, 2026
@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync Bot commented Mar 30, 2026

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D98698555. (Because this pull request was imported automatically, there will not be any future comments.)

@scottcao
Copy link
Copy Markdown
Contributor

Thanks for the PR. There's another report of this issue earlier, but I haven't had time to look into it. Regarding the actual soft_error function, it's currently used in buck2 in two ways.
Errors that we eventually want to disallow. Generally the reason we can't disallow it is because there's a lot of legacy tech debt internally that prevents us from just making this into an error today. We’d like to keep these as hard errors in open source since this isn’t an issue for open source.
Logging for errors that shouldn’t break the build.

I think ideally we should just parametrize soft_error function so that we have both types of behavior, and then use the kind that doesn’t cause hard errors for the logging purposes externally as well. Let me know if that sounds good to you.

@dkranchii
Copy link
Copy Markdown
Contributor Author

@scottcao Thanks, that makes sense.

@dkranchii
Copy link
Copy Markdown
Contributor Author

dkranchii commented Apr 16, 2026

@scottcao Thanks for the feedback! I've updated the PR to parameterize the behavior based on the existing deprecation flag in StructuredErrorOptions.

How it works now:

  • soft_error!("category", err, deprecation: true) → Still returns Err in OSS (hard error, no legacy users need the grace period)
  • soft_error!("category", err) (default deprecation: false) → Now returns Ok in OSS (just logging, shouldn't break the build)

@dkranchii
Copy link
Copy Markdown
Contributor Author

@copilot resolve the merge conflicts in this pull request

Only treat deprecation soft errors as hard errors in OSS builds.
Non-deprecation soft errors (used for logging) now return Ok in OSS,
matching FB internal behavior and preventing panics at .unwrap() sites.

Made-with: Cursor
@dkranchii
Copy link
Copy Markdown
Contributor Author

@scottcao can you review this pr. thanks!!

@cehoffman
Copy link
Copy Markdown
Contributor

We hit this on the OSS side and would benefit from this PR shipping.

Our default build path is Remote Execution against a Buildbarn cluster. Local Buck2 on macOS and on our Linux CI runners is a deliberately preserved fallback for when the RE pool, network, or certs are unavailable. The materializer panics in #1274 are what make that fallback fragile. We have not reproduced them under RE.

Today the fallback runs without SQLite materializer state at all because of this issue: we set sqlite_materializer_state = false repo-wide in .buckconfig and only flip it back to true in a gitignored .buckconfig.local for the RE path. That gives us a materially worse local story than RE and forces us to carry two materializer configurations with different behavior. Shipping this would let the fallback run with the same materializer setup as RE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants