feat: add word_confidence_threshold param to whisper #23
Adds the wordConfidenceThreshold option (default 0.3) to the whisper API, mirroring the Python client and docs changes. Words with an OCR confidence below the threshold are excluded from the extracted text. Works with form, high_quality and table modes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| Filename | Overview |
|---|---|
| index.js | Adds wordConfidenceThreshold param (default 0.3) to whisper() and forwards it as word_confidence_threshold; follows existing patterns for parameter handling. |
| test/retry.test.js | Adds two axios-adapter-based tests confirming the new param is forwarded with both a custom value and its default; consistent with the rest of the test file's mocking style. |
| package.json | Version bump 2.5.1 to 2.6.0; correct semver minor bump for a new backward-compatible feature. |
| package-lock.json | Lock file updated to reflect the version bump; no dependency changes. |
Sequence Diagram
sequenceDiagram
participant Caller
participant whisperFn as whisper method
participant AxiosClient as Axios Client
participant API as /whisper API
Caller->>whisperFn: wordConfidenceThreshold: 0.7
whisperFn->>whisperFn: build params with word_confidence_threshold
whisperFn->>AxiosClient: "POST /whisper with word_confidence_threshold=0.7"
AxiosClient->>API: HTTP POST with query params
API-->>AxiosClient: 200/202 response
AxiosClient-->>whisperFn: response object
whisperFn-->>Caller: Promise resolves with response
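The "build params" step in the diagram can be sketched roughly as below. This is an illustration, not the code in index.js: `buildWhisperParams` is a hypothetical helper, and the `wait_timeout` and `add_line_nos` names are assumed by analogy. Only the `wordConfidenceThreshold` to `word_confidence_threshold` mapping and the 0.3 default come from this PR.

```javascript
// Hypothetical helper (not the actual index.js code): maps camelCase client
// options to the snake_case query params sent to /whisper.
function buildWhisperParams({
  waitTimeout = 180,
  addLineNos = false,
  wordConfidenceThreshold = 0.3, // default introduced by this PR
} = {}) {
  return {
    wait_timeout: waitTimeout, // assumed param name
    add_line_nos: addLineNos, // assumed param name
    word_confidence_threshold: wordConfidenceThreshold,
  };
}

// Custom value, as in the diagram, and the default when the option is omitted.
const customParams = buildWhisperParams({ wordConfidenceThreshold: 0.7 });
const defaultParams = buildWhisperParams();
console.log(customParams.word_confidence_threshold);
console.log(defaultParams.word_confidence_threshold);
```

Because the default lives in the destructuring pattern, the param is always present in the outgoing request, which is the behavior the review below discusses.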
Reviews (3): Last reviewed commit: "test: add word_confidence_threshold forw..."
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Rahul Johny <116638720+johnyrahul@users.noreply.github.com>
jaseemjaskp left a comment
Automated review (Claude PR Review Toolkit). The change is small and consistent with the existing param-forwarding pattern. Two non-blocking findings below; details and a test-coverage note are in the PR thread summary.
    waitForCompletion = false,
    waitTimeout = 180,
    addLineNos = false,
    wordConfidenceThreshold = 0.3,
[P2 · behavior] Client now forces word_confidence_threshold on every request.
With wordConfidenceThreshold = 0.3 as a client-side default, the param is sent on every whisper call (line 243) — including output_mode/mode combinations where the JSDoc says it has no effect (only form/high_quality/table). For callers who upgrade without touching their code, this silently pins the threshold to 0.3 instead of letting the server apply its own default.
- If the server default is already 0.3, this is a no-op and fine; please confirm it matches so there's no behavior change on upgrade.
- This follows the existing convention (all params are forwarded unconditionally), so it's consistent; flagging only the divergence risk.
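For illustration only, one way to avoid pinning the value client-side would be to leave the option undefined and forward it only when the caller sets it. This is a sketch of that alternative, not what the PR implements; `buildThresholdParam` is a hypothetical name.

```javascript
// Hypothetical alternative (not adopted in this PR): omit the param unless the
// caller supplies a value, so the server applies its own default.
function buildThresholdParam({ wordConfidenceThreshold } = {}) {
  const params = {};
  // Compare against undefined rather than truthiness so an explicit 0 is kept.
  if (wordConfidenceThreshold !== undefined) {
    params.word_confidence_threshold = wordConfidenceThreshold;
  }
  return params;
}

const omitted = buildThresholdParam();
const explicitZero = buildThresholdParam({ wordConfidenceThreshold: 0 });
console.log(Object.keys(omitted).length, explicitZero.word_confidence_threshold);
```

The trade-off is that this would diverge from the existing convention of forwarding every param unconditionally.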
[P3 · validation] No range check. Confidence is expected in [0, 1]; an out-of-range value (e.g. 30 instead of 0.3) is forwarded as-is. Other numeric params aren't validated either, so this is optional — but a confidence score is more error-prone to mistype than a filter size.
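If a range check were wanted, a minimal guard might look like the following. The function name and error message are hypothetical. The PR itself adds no validation, in line with the other numeric params.

```javascript
// Hypothetical guard (not part of this PR): rejects values outside [0, 1],
// catching mistakes such as passing 30 instead of 0.3.
function checkConfidenceThreshold(value) {
  const isValid =
    typeof value === 'number' && !Number.isNaN(value) && value >= 0 && value <= 1;
  if (!isValid) {
    throw new RangeError(
      `wordConfidenceThreshold must be a number in [0, 1], got ${value}`
    );
  }
  return value;
}

const accepted = checkConfidenceThreshold(0.3);
console.log(accepted);
```

Calling `checkConfidenceThreshold(30)` throws a `RangeError` before any request is made.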
- Fix missing leading ' *' on the blank JSDoc line before @returns.
- Add tests asserting wordConfidenceThreshold is forwarded as word_confidence_threshold (custom value and the 0.3 default).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Thanks for the review. Addressed in 11cc514:
jaseemjaskp left a comment
Re-reviewed via the PR Review Toolkit (code-review, silent-failure, type-design, test-coverage, comment-analysis, simplification). The feature wiring is mechanically correct and idiomatic, and the two added forwarding tests lock in the snake_case mapping + default. The substantive behavioral concern (unconditional 0.3 default) and the range-validation note are already captured in my earlier open thread on index.js:218, which remains the right place to resolve that discussion. Remaining observations (engines/axios bundling, JSDoc wording, test DRY) are intentional or minor. Approving.
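The idea behind the forwarding tests can be sketched without axios by injecting a stub transport that records the params it receives. Everything here (`makeClient`, `send`, the synchronous stub) is a simplified, hypothetical stand-in. The tests in test/retry.test.js use an axios adapter and are promise-based.

```javascript
// Hypothetical stand-in for the client: whisper() hands its request config to
// an injected `send` function, playing the role of the axios adapter.
function makeClient(send) {
  return {
    whisper({ wordConfidenceThreshold = 0.3 } = {}) {
      return send({
        params: { word_confidence_threshold: wordConfidenceThreshold },
      });
    },
  };
}

// Stub transport: records each request's params and returns a canned response.
const captured = [];
const client = makeClient((config) => {
  captured.push(config.params);
  return { status: 200 };
});

client.whisper({ wordConfidenceThreshold: 0.7 }); // custom value
client.whisper(); // default value
console.log(captured);
```

A test would then assert that `captured[0]` carries the custom value and `captured[1]` carries the default.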
Summary
Adds the wordConfidenceThreshold option (default 0.3) to the whisper API, mirroring the corresponding changes in the Python client and docs.
Words whose OCR confidence falls below the configured threshold are excluded from the extracted text. The parameter works only with form, high_quality and table modes.
Changes
- index.js: new wordConfidenceThreshold option (default 0.3), forwarded as word_confidence_threshold in the /whisper request params, plus JSDoc.
- Version bump: 2.5.1 → 2.6.0.
Notes
No unit test added: the JS test suite is integration-only (hits the live API with real files) and has no request-mocking setup equivalent to the Python PR's mocked URL-param assertions.
🤖 Generated with Claude Code