feat(workers): implement Reddit OAuth client credentials flow to bypa…#2887
feat(workers): implement Reddit OAuth client credentials flow to bypa…#2887kunal-rathore-111 wants to merge 2 commits into
Conversation
Greptile SummaryThis PR adds Reddit OAuth client-credentials support to the
Confidence Score: 3/5The fallback path is safe, but the OAuth token cache has two bugs that need fixing before enabling credentials in production. The buffer subtraction in apps/workers/metascraper-plugins/metascraper-reddit.ts — specifically the token caching logic around lines 109–152 Important Files Changed
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
apps/workers/metascraper-plugins/metascraper-reddit.ts:144-146
If `expires_in` is less than 300 (a short-lived token or unexpected server response), `(data.expires_in - 300)` is negative, making `redditAccessTokenExpiresAt` a timestamp in the past. Every subsequent call would skip the cache and issue a new token request, likely triggering Reddit's rate limit on the auth endpoint.
```suggestion
redditAccessToken = data.access_token;
// Expire 5 minutes before the actual expiration to be safe
redditAccessTokenExpiresAt = now + Math.max(0, data.expires_in - 300) * 1000;
```
### Issue 2 of 2
apps/workers/metascraper-plugins/metascraper-reddit.ts:112-152
**Concurrent token refresh race condition**
`getRedditAccessToken` has no concurrency guard. When multiple scrape jobs run in parallel and the cached token has just expired, all of them simultaneously pass the `redditAccessTokenExpiresAt > now` check before any one has written the new token. Each will then issue its own token-refresh request to Reddit's auth endpoint, potentially triggering rate limiting.
The existing URL-level cache in `fetchRedditPostData` avoids this correctly by storing the `Promise` before it resolves. The same pattern should be applied here — store a single in-flight `Promise<string | null>` and return it to all concurrent callers until it resolves.
Reviews (1): Last reviewed commit: "feat(workers): implement Reddit OAuth cl..." | Re-trigger Greptile |
Description
Fixes #2885
Reddit has recently updated their crawler policies, causing our unauthenticated
.jsonrequests to frequently get blocked.To resolve this, this PR implements the official Reddit OAuth Client Credentials flow for the
metascraper-redditplugin.Specifically:
REDDIT_CLIENT_IDandREDDIT_CLIENT_SECRETenvironment variables toserverConfig.oauth.reddit.comusing theBearertoken and a customUser-Agent..jsonpolling, ensuring the change is fully backward-compatible.How Has This Been Tested?
REDDIT_CLIENT_IDandREDDIT_CLIENT_SECRETare not present, it correctly falls back to unauthenticated.jsonrequests.oauth.reddit.com, and successfully fetches the Reddit metadata without being blocked.Screenshots (if appropriate)
Checklist:
Please describe to which degree, if any, an LLM was used in creating this pull request.
I collaborated with an AI coding assistant to help design the caching logic and integrate the standard OAuth Client Credentials flow.