fix(slack): drop synthetic thread_ts to avoid invalid_thread_ts#212
fix(slack): drop synthetic thread_ts to avoid invalid_thread_ts#212LIU9293 wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7b7cfa16b1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // a task/cron run, fall back to posting at the top of the channel rather | ||
| // than letting Slack reject the call with `invalid_thread_ts`. | ||
| const threadIsSynthetic = isSyntheticOwner(threadId); | ||
| const threadField = threadIsSynthetic ? {} : { thread_ts: threadId }; |
There was a problem hiding this comment.
Preserve synthetic question routing
When a task/cron run emits an ask_user question, this drops the synthetic thread_ts and posts the Slack question as a new top-level message, but the pending question is still stored under the synthetic thread id in request-run.ts (setPendingQuestion(request.channelId, request.threadId, ...)). Slack button handling later derives the thread from the posted message (body.message?.thread_ts || body.message?.ts in commands.ts), so a click/reply on this top-level question is routed under the real Slack timestamp instead of the synthetic thread and getPendingQuestion will not find it. In that scenario the user's answer starts a fresh turn instead of resuming the blocked task/cron run; the fallback needs to seed/mirror the synthetic session or otherwise map the posted question's ts back before exposing interactive/plain questions top-level.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — flagging this as an explicit follow-up rather than fixing in this PR.
This PR is intentionally narrow: stop posting an invalid thread_ts so Slack stops rejecting cron/task questions with invalid_thread_ts (Sentry ODE-DEAMON-7). Before this change, the question was lost entirely; after it, the question is at least visible at the top of the channel.
You're right that interactive button clicks / threaded replies under the new top-level message route through body.message?.thread_ts || body.message?.ts and so won't match the synthetic-thread pendingQuestion written by request-run.ts:495. That means: with this PR, the user sees the question but pressing a button starts a fresh turn instead of resuming the blocked cron/task run — strictly better than the previous "silently swallowed" failure mode, but still incomplete.
The proper fix needs one of:
- Mirror
pendingQuestionunder both the synthetic threadId and the posted message'stsonce we know it - Or re-key the pending-question entry after
chat.postMessagereturns the realts - Or seed the cron/task session early (right after the first
sendChannelMessage) so the synthetic id is gone by the timepostSlackQuestionruns
That's a session/state-routing change that belongs in its own PR. I'd rather land this fix now to stop the Sentry bleed and tackle the routing in a focused follow-up than block on it here. Will open a tracking issue.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cd11394fa4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const result = await slackApp.client.chat.postMessage({ | ||
| channel: rawChannelId, | ||
| thread_ts: threadId, | ||
| ...(threadIsSynthetic ? {} : { thread_ts: threadId }), |
There was a problem hiding this comment.
Use the channel workspace token for synthetic top-level sends
When threadIsSynthetic is true this call is now a top-level channel post, but the token was still resolved with getSlackBotToken(channelId, threadId) before this branch. In multi-workspace installs where a scheduled task/cron fires before that channel has been seen by the message router, getSlackBotToken can fall back to the first registered token instead of the token for rawChannelId; the existing top-level helper avoids that by consulting getWorkspaceBotTokenForChannel(channelId) first. In that context the new fallback can still fail with channel_not_found/not_in_channel even though dropping thread_ts was supposed to make the message deliverable.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 6da6431. Token resolution now mirrors sendChannelMessage when threadIsSynthetic is true:
getSlackBotTokenForProcessor(processorId)
?? (threadIsSynthetic
? (getWorkspaceBotTokenForChannel(channelId) ?? getSlackBotToken(channelId))
: getSlackBotToken(channelId, threadId));
The real-thread path is unchanged so existing registry bindings keep winning for in-flight replies. Suite still green (408 pass, 1 skip).
Addresses Codex P2 review on PR #212. When threadIsSynthetic is true, sendMessage degenerates to a top-level channel post (no thread_ts), but the token was still being resolved with getSlackBotToken(channelId, threadId) first. In multi-workspace installs where a scheduled task/cron fires before that channel has been seen by the message router, the registry has no entry for the fake thread_ts and getSlackBotToken can fall back to the first registered token instead of the token for rawChannelId — leading to a follow-on channel_not_found / not_in_channel even though dropping thread_ts was supposed to make the message deliverable. Mirror the token-resolution chain that sendChannelMessage already uses for top-level posts: processor token -> workspace-by-channel token -> getSlackBotToken for the synthetic path. The real-thread path is unchanged so registry bindings for in-flight thread replies still win. Tests: full suite still green (408 pass, 1 skip, 0 fail).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6da643122b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const result = await client.chat.postMessage({ | ||
| channel: channelId, | ||
| thread_ts: threadId, | ||
| ...threadField, |
There was a problem hiding this comment.
Resolve the channel token before posting synthetic questions
When this synthetic branch turns a task/cron ask_user question into a top-level Slack post, it still uses the token selected by createSlackAdapter.sendQuestion, which calls getSlackBotToken(channelId, threadId) with the fake task:/cron-job: id rather than doing the workspace-first channel lookup used for top-level sends. In a multi-workspace install where the scheduled channel has not yet been bound in the Slack registry, that can fall back to the first registered bot token, so these newly top-level Block Kit/plain question posts fail with channel_not_found/not_in_channel even though sendMessage was fixed; the fresh evidence is that the question path still passes the old token into this spread.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 1d4e434. The sendQuestion adapter now mirrors the token-resolution logic from sendMessage (client.ts:321): when threadId is a synthetic placeholder, resolve via getWorkspaceBotTokenForChannel(channelId) first, falling back to getSlackBotToken(channelId). The real-thread path is unchanged so existing registry bindings keep winning for in-flight replies.
const threadIsSynthetic = isSyntheticOwner(threadId);
const token = getSlackBotTokenForProcessor(processorId)
?? (threadIsSynthetic
? (getWorkspaceBotTokenForChannel(channelId) ?? getSlackBotToken(channelId))
: getSlackBotToken(channelId, threadId));
Full suite still green (408 pass, 1 skip).
Mirror the token-resolution logic from sendMessage in the sendQuestion adapter. When the thread id is a synthetic placeholder (task: / cron-job: / cron:), postSlackQuestion drops thread_ts and degenerates to a top-level channel post; in that case the registry has no entry for the fake thread_ts, and getSlackBotToken(channelId, fakeTs) can fall back to the first registered workspace token in multi-workspace installs. Use getWorkspaceBotTokenForChannel(channelId) first so the posted Block Kit / plain-text question lands with the token bound to the actual channel. Addresses Codex review comment on PR #212.
When a task or cron job's agent run emits an intermediate message,
the Slack adapter currently forwards the synthetic placeholder thread
id (`task:{id}` / `cron-job:{id}:{run}` / `cron:{id}`) directly as
`thread_ts`. Slack rejects those with `invalid_thread_ts` because
they are not real message timestamps, so the message is lost and the
delivery captures a Sentry error (ODE-DEAMON-7).
Synthetic owners are internal placeholders that exist before the
scheduler has posted a top-level message and learned the real thread
id; the scheduler later seeds the session for the real thread via
`seedCronChannelThreadSession` / `seedChannelThreadSession`. While
the agent is mid-run, the safe fallback is to post the intermediate
output at the top of the channel rather than reject it.
- sendMessage now omits thread_ts when threadId is synthetic and skips
binding the bot token to the fake thread id.
- postSlackQuestion does the same for both the plain-text and Block
Kit branches.
- New api.test.ts mocks the bolt app and asserts thread_ts is dropped
for cron-job:/task: ids and preserved for real Slack timestamps.
The previous mock.module factory only returned getApp + getSlackBotToken, so when later test files in the same Bun process imported other exports from packages/ims/slack/client (e.g. sendChannelMessage in packages/core/tasks/scheduler.ts and packages/core/test/web-routes.test.ts), Bun resolved them against the stub and threw: SyntaxError: Export named 'sendChannelMessage' not found in module '/home/runner/work/ode/ode/packages/ims/slack/client.ts' Locally test ordering hid this; CI's alphabetic ordering surfaced it as 2 failed tests / 2 unhandled errors in fast-checks. Fix: load the real ./client first and spread its exports into the stub before overriding getApp + getSlackBotToken. Downstream importers in the same process now still see every real export.
Addresses Codex P2 review on PR #212. When threadIsSynthetic is true, sendMessage degenerates to a top-level channel post (no thread_ts), but the token was still being resolved with getSlackBotToken(channelId, threadId) first. In multi-workspace installs where a scheduled task/cron fires before that channel has been seen by the message router, the registry has no entry for the fake thread_ts and getSlackBotToken can fall back to the first registered token instead of the token for rawChannelId — leading to a follow-on channel_not_found / not_in_channel even though dropping thread_ts was supposed to make the message deliverable. Mirror the token-resolution chain that sendChannelMessage already uses for top-level posts: processor token -> workspace-by-channel token -> getSlackBotToken for the synthetic path. The real-thread path is unchanged so registry bindings for in-flight thread replies still win. Tests: full suite still green (408 pass, 1 skip, 0 fail).
Mirror the token-resolution logic from sendMessage in the sendQuestion adapter. When the thread id is a synthetic placeholder (task: / cron-job: / cron:), postSlackQuestion drops thread_ts and degenerates to a top-level channel post; in that case the registry has no entry for the fake thread_ts, and getSlackBotToken(channelId, fakeTs) can fall back to the first registered workspace token in multi-workspace installs. Use getWorkspaceBotTokenForChannel(channelId) first so the posted Block Kit / plain-text question lands with the token bound to the actual channel. Addresses Codex review comment on PR #212.
1d4e434 to
a109625
Compare
|
Rebased onto latest
Full |
What
Stop forwarding synthetic placeholder thread ids (
task:{id}/cron-job:{id}:{run}/cron:{id}) as Slackthread_ts. When the value is synthetic, post at the top of the channel instead.Why
Sentry ODE-DEAMON-7 — "IM slack send failed: An API error occurred: invalid_thread_ts". The most recent events (2026-06-02) all have:
channel_id: C0ATGCJ0YK0thread_id: cron-job:a86fbdc5-01df-4caf-9e0c-c0c199f00379:1780441200000op: sendA task / cron run starts with a synthetic placeholder thread id; the scheduler only learns the real Slack
tsafter it posts the top-level result viasendChannelMessage. In between, any intermediate output the agent runtime emits flows throughim.sendMessage(channelId, syntheticThreadId, ...)→chat.postMessage({ thread_ts: "cron-job:..." })→ Slack rejects withinvalid_thread_ts, the message is lost, and Sentry captures a delivery failure.Sibling PR #211 already auto-disables cron jobs that hit a permanent channel error (e.g.
channel_not_found). This PR fixes the other mode of the same Sentry group: messages that would have landed in the channel except for the bogusthread_ts.Design notes
thread_ts). That keeps the message visible rather than silently swallowed.isSyntheticOwneralready exists inpackages/ims/shared/synthetic-owner.tswith full test coverage of the prefix set (task:/cron-job:/cron:); reuse it here so the matcher stays in one place.packages/ims/slack/client.ts:sendMessage(plus the bot-token binding helper, which must not bind a real token to a fake thread id) andpackages/ims/slack/api.tspostSlackQuestion(both the plain-text fallback and the Block Kit branch).thread_tsanalogues are routed through platform-specific resolvers that already special-case synthetic owners.Tests
packages/ims/slack/api.test.tsmock the bolt app and assert:cron-job:...thread id →postMessagecalled withoutthread_ts(Block Kit branch)task:...thread id → same (plain-text fallback)1717000000.000200) →thread_tspreservedbun testfull suite: 408 pass, 1 skip, 0 fail.Out of scope
channel_not_foundmode addressed by fix(cron): auto-disable cron jobs whose target channel is unreachable #211).