Releases: OpenWhispr/openwhispr
1.7.0
Breaking for Mac users: please delete your old OpenWhispr first. 1.7.0 uses a new app ID — we had to migrate Apple Developer accounts, so you'll be asked to re-grant permissions on first launch. Notes, settings, API keys, and downloaded models carry over automatically.
Summary
- More sign-in options — new: Sign in with Microsoft, Sign in with Apple on macOS. Auth runs on our own infra now (migrated to Better Auth); self-hosters can point at their own server.
- Sessions in your OS keychain — survive crashes and Electron restarts. No more random sign-outs.
- API keys encrypted at rest: all 12 BYOK + enterprise creds moved from .env to the OS keychain. Silent one-time migration on first launch.
- Background meeting recording: navigate to other notes, open Settings, switch tabs; the audio pipeline keeps going. New floating pill shows live mic levels and clicks back to the recording note.
- Per-note diarization preferences persist across stop/resume.
- Cleaner mic capture with a new acoustic gate + better echo cancellation. Music pause/resume on Windows works again.
- Per-scope LLM setup — pick different providers/models for cleanup, agent, formatting, and chat. You can now have text cleanup toggled ON but your dictation agent toggled OFF.
- NVIDIA Parakeet parakeet-unified-en-0.6b: new English-only model, 5.91% avg WER, ~631 MB.
- openwhispr CLI talks directly to the desktop app via a local HTTP bridge (127.0.0.1, bearer-token); falls back to cloud when the app is closed.
- Proxy-aware fetches + OS CA trust: corporate TLS interception works.
- Auto-learn corrections now work for Cyrillic, CJK, Arabic, Devanagari.
- A bunch of other small bug fixes and performance improvements.
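The CLI item above describes a loopback HTTP bridge with bearer-token auth and a cloud fallback. A minimal sketch of that routing decision follows. Everything named here (`BridgeInfo`, `resolveTarget`, the port, the cloud URL) is a hypothetical placeholder for illustration, not OpenWhispr's actual API.

```typescript
// Hypothetical shape of what the desktop app might publish for the CLI.
interface BridgeInfo {
  port: number;  // assumed: chosen by the desktop app at startup
  token: string; // assumed: bearer token the CLI must present
}

interface Target {
  baseUrl: string;
  headers: Record<string, string>;
  via: "local-bridge" | "cloud";
}

// Placeholder URL for illustration only.
const CLOUD_BASE_URL = "https://cloud.example.invalid";

// Prefer the loopback bridge when the app is running; otherwise use the cloud.
function resolveTarget(bridge: BridgeInfo | null, cloudApiKey: string): Target {
  if (bridge !== null) {
    return {
      baseUrl: "http://127.0.0.1:" + bridge.port,
      headers: { Authorization: "Bearer " + bridge.token },
      via: "local-bridge",
    };
  }
  return {
    baseUrl: CLOUD_BASE_URL,
    headers: { Authorization: "Bearer " + cloudApiKey },
    via: "cloud",
  };
}
```

Binding to 127.0.0.1 keeps the bridge unreachable from other machines, and the bearer token guards against other local processes that do not know it.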
Detailed changelog
- build(mac): migrate Apple signing to Gizmo Labs Inc. by @gabrielste1n in #639
- fix(qdrant): set cwd to STORAGE_DIR so packaged app doesn't crash on startup by @gabrielste1n in #640
- feat(sync): cross-device delete propagation by @gabrielste1n in #646
- feat(meeting): interop cloud streaming providers + AEC/VAD improvements by @gabrielste1n in #656
- Re-grant permissions modal for 1.6.11 upgraders by @gabrielste1n in #659
- fix(chat): persist first user message when creating a new conversation by @xAlcahest in #662
- fix(tls): trust OS CA store for Node-side TLS; keep realtime WS on OpenAI only by @gabrielste1n in #657
- Fix adding non-ASCII languages to dictionary by @wake0up0ne0 in #666
- Fix dictionary correction tracking for TextPattern-based controls (alongside existing ValuePattern support) by @wake0up0ne0 in #665
- Bundle ID migration: com.herotools.openwispr → com.gizmolabs.openwhispr by @gabrielste1n in #669
- feat(cli): local HTTP bridge for unified CLI by @gabrielste1n in #676
- fix(notes): update folder selection when moving notes between folders by @gabrielste1n in #678
- refactor: split dictation cleanup from dictation agent + per-scope LLM config by @gabrielste1n in #677
- refactor(auth): switch desktop to Better Auth + add Microsoft sign-in by @gabrielste1n in #686
- fix(network): proxy-aware fetches + actionable connectivity errors by @gabrielste1n in #687
- chore(release): 1.7.0 confidence cleanup by @gabrielste1n in #689
- ci(release): rename VITE_NEON_AUTH_URL → VITE_AUTH_URL by @gabrielste1n in #690
- fix(auth): initiate desktop OAuth from browser, not renderer by @gabrielste1n in #692
- feat(auth): Sign in with Apple on macOS by @gabrielste1n in #691
- fix(onnx): cap embedding segments and isolate inference in utility process by @gabrielste1n in #693
- fix(sidecars): reap children on quit and on next launch (#683) by @gabrielste1n in #694
- docs(changelog): plain-English 1.7.0 + lockfile refresh by @gabrielste1n in #695
- fix(llama): raise Vulkan startup timeout, stop server before re-download by @xAlcahest in #698
- fix(media): fall back to media key when GSMTC fails on Windows by @xAlcahest in #697
- docs(readme): acknowledge Hugging Face as model hub by @gabrielste1n in #703
- feat(transcribe): generate clientTranscriptionId for cloud sync dedup by @gabrielste1n in #702
- feat(security): encrypt API keys at rest via safeStorage by @gabrielste1n in #629
- fix(media): use AsTask bridge for WinRT async calls in GSMTC scripts by @xAlcahest in #706
- fix(media): resume playback on no-audio-detected event by @kdenney in #701
- feat(auth): switch desktop to Authorization: Bearer + safeStorage by @gabrielste1n in #704
- fix(ci): compile macos-media-remote binary in release and build workflows by @kdenney in #700
- feat(reasoning): self-hosted OpenAI-compatible parity + thinking-mode toggle by @gabrielste1n in #708
- feat(parakeet): add parakeet-unified-en-0.6b model by @milanleonard in #713
- feat(meeting): background recording with floating pill and responsive layout by @gabrielste1n in #709
- fix(runtime): unblock model downloads, surface ONNX worker errors, dedupe hotkey re-register by @gabrielste1n in #716
- fix(linux): resolve symlink in launcher wrapper by @xAlcahest in #717
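For the safeStorage change in #629, a one-time plaintext-to-encrypted migration could look roughly like the sketch below. The `Encryptor` interface mirrors two methods that Electron's `safeStorage` module provides (`isEncryptionAvailable` and `encryptString`); the store shapes and the `migrateKeys` helper are assumptions made for illustration, not the project's code.

```typescript
// Mirrors the two Electron safeStorage methods this sketch relies on.
// (Electron returns a Buffer, which is a Uint8Array subclass.)
interface Encryptor {
  isEncryptionAvailable(): boolean;
  encryptString(plainText: string): Uint8Array;
}

// Hypothetical stores: plaintext values parsed from .env, and encrypted blobs.
type PlainStore = Record<string, string>;
type EncryptedStore = Record<string, Uint8Array>;

// Moves every plaintext key into the encrypted store and returns the names moved.
// Running it again is a no-op because the plaintext store is then empty.
function migrateKeys(plain: PlainStore, encrypted: EncryptedStore, enc: Encryptor): string[] {
  if (!enc.isEncryptionAvailable()) return []; // leave plaintext untouched
  const migrated: string[] = [];
  for (const name of Object.keys(plain)) {
    if (!(name in encrypted)) {
      encrypted[name] = enc.encryptString(plain[name]);
      migrated.push(name);
    }
    delete plain[name]; // drop the plaintext copy once an encrypted one exists
  }
  return migrated;
}
```

In an Electron main process, `safeStorage` itself could be passed as the `enc` argument once the app is ready.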
New Contributors
- @wake0up0ne0 made their first contribution in #666
- @kdenney made their first contribution in #701
- @milanleonard made their first contribution in #713
1.6.10
Features
- Speaker diarization controls — per-meeting toggle, "N others in call" stepper, auto-label on 1-on-1s, fallback to "You/Others" when labeling is off
- Integrations hub — new top-level view with MCP connectivity for Claude, ChatGPT, and Cursor; API keys moved here
- Enterprise connections: connect to your organisation's instance via Bedrock
Bug Fixes & Improvements
- Settings reorganization — "AI Models" split into Speech-to-Text and Language Models with clearer sub-tabs
- Agent hotkey moved under Hotkeys
- Meeting transcription tuned to reduce GPT-4o hallucinations during silence
- Model cache opens at the correct folder so downloaded models are actually visible
- Sync — cleared transcripts and deleted folders no longer reappear after restart
- Echo leak detector now runs pre-AEC so it can actually see system audio bleed
1.6.9
New features
Transcript Export — Save transcripts to disk as TXT, SRT, or JSON directly from the meeting view.
Cloud Sync — Notes, folders, conversations, and transcriptions now sync bidirectionally across devices.
API Keys — Create and manage API keys from Settings to integrate OpenWhispr with your workflows programmatically.
Linux Push-to-Talk — Native push-to-talk via evdev with guided permission setup.
Auto-Paste Toggle — New setting to disable automatic pasting after dictation.
Agent Folders — The agent can now list, create, and match folders semantically when organizing notes.
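Of the three export formats listed for Transcript Export, SRT has the strictest layout: a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the text, then a blank line. The sketch below shows that layout; the `Segment` shape and function names are assumptions for illustration and are not taken from the OpenWhispr source.

```typescript
// Hypothetical transcript segment with millisecond offsets.
interface Segment {
  startMs: number;
  endMs: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm.
function srtTime(ms: number): string {
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "," + pad(ms % 1000, 3);
}

// Render segments as numbered SRT cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => (i + 1) + "\n" + srtTime(seg.startMs) + " --> " + srtTime(seg.endMs) + "\n" + seg.text + "\n")
    .join("\n");
}
```

For example, `srtTime(3723004)` yields `01:02:03,004`.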
Improvements
- Upgraded to Electron 41 and Node 24
- Redesigned provider tabs as compact pill buttons
- Local model descriptions replaced with clickable spec links
- llama.cpp detection now probes /v1/models and prefers /chat/completions
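One possible reading of the llama.cpp detection item: check that the server answers on `/v1/models`, then route requests to the chat completions endpoint. The sketch below assumes that reading. The `Probe` type and `detectChatEndpoint` are hypothetical names, and the `/v1/chat/completions` path is the standard OpenAI-compatible route rather than something the release notes spell out.

```typescript
// Minimal fetch-like signature so the sketch can run without a network.
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Returns the chat endpoint if the server answers on /v1/models, else null.
async function detectChatEndpoint(baseUrl: string, probe: Probe): Promise<string | null> {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  try {
    const res = await probe(root + "/v1/models");
    return res.ok ? root + "/v1/chat/completions" : null;
  } catch {
    return null; // connection refused, timeout, and similar failures
  }
}
```

On Node 18 or later, the global `fetch` could be passed as the probe, since its response exposes an `ok` flag.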
Fixes
- Windows hotkey no longer false-triggers after Win+L lock/unlock
- Fixed stuck recording when push-to-talk key-up is missed on Windows
- Relaxed speech gate thresholds with no-audio toast for failed transcriptions
- Retry button now shown on failed transcriptions
- Notes sync when updated externally
- Tuned speaker diarization to prevent excessive speaker creation
- TLS/certificate errors now surfaced in model download UI
- Floating icon position persists across restarts
- Fixed notification window sizing and dismiss circle visibility
1.6.8
New features
- Meeting recording with on-device speaker diarization — capture mic and system audio together, and every segment is automatically labeled with who said it. Diarization runs 100% locally: voices never leave your device. System audio capture now works on Windows and Linux (Chromium loopback and the native audio portal), not just macOS.
- Local live transcription preview — see your words appear in real time as you speak, streamed from a local Whisper or Parakeet model before you release the hotkey. Fully on-device, no cloud round trip.
- Self-hosted support — first-class option to point OpenWhispr at your own LAN server for both transcription and agent inference. One click to swap between OpenAI, Anthropic, Gemini, a local GGUF, or your self-hosted endpoint.
- Speaker to contact linking — voice profiles bind to calendar attendees, so recurring participants get named automatically after the first tag.
- Improved echo cancellation — WebRTC AEC now runs as a native sidecar, paired with a custom cross-correlation detector and text-level dedupe to cleanly handle external speakers, system-audio capture, and low-echo-quality mics.
- Gemma 4 models — Gemma 4 31B and Gemma 4 26B MoE added to the local model registry.
- Smart transcription error handling — when a transcription fails, inline CTAs let you retry, switch provider, or open logs.
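The echo cancellation item mentions a custom cross-correlation detector. The project's detector is not described in these notes, so the following is only a generic normalized cross-correlation sketch of the idea: slide the mic signal against the system-audio reference and look for a strong peak. The function names and the 0.8 threshold are illustrative assumptions.

```typescript
// Normalized cross-correlation of mic against ref at a given lag (in samples).
// Returns a value in [-1, 1]; 0 if either windowed signal is silent.
function correlationAtLag(mic: number[], ref: number[], lag: number): number {
  let dot = 0;
  let micEnergy = 0;
  let refEnergy = 0;
  for (let i = 0; i + lag < mic.length && i < ref.length; i++) {
    const m = mic[i + lag];
    const r = ref[i];
    dot += m * r;
    micEnergy += m * m;
    refEnergy += r * r;
  }
  const denom = Math.sqrt(micEnergy * refEnergy);
  return denom === 0 ? 0 : dot / denom;
}

// Scan lags 0..maxLag and report the best match; a high peak suggests the
// reference audio is leaking into the mic with that delay.
function detectEcho(mic: number[], ref: number[], maxLag: number, threshold: number = 0.8) {
  let bestLag = 0;
  let bestScore = correlationAtLag(mic, ref, 0);
  for (let lag = 1; lag <= maxLag; lag++) {
    const score = correlationAtLag(mic, ref, lag);
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }
  return { lag: bestLag, score: bestScore, echo: bestScore >= threshold };
}
```

Normalizing by signal energy makes the score independent of how loud the leaked audio is, so a quiet echo still produces a peak near 1.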
Linux
- Hyprland global hotkeys via hyprctl, plus correct active-window detection for terminal paste.
- GNOME shortcut conflict detection, active-key display, and proper fallback when the requested shortcut is taken.
Fixes
- Windows: installer DPI scaling corrected, GPU compositing re-enabled.
- Local Whisper: stricter speech gate cuts hallucinated output on silent audio.
- Diarization: session-scoped results, disk spooling for long meetings, reliable spinner, correct speaker label propagation.
- Notes: transcripts scope to their originating note, view auto-switches when a meeting recording starts.
- UI: fixed 420px window width that grows only with text, consistent across phases; preview overlay redesigned.
- Settings: inference mode auto-switches when you select a new provider.
- i18n: speaker labels, cloud settings, integrations, and calendar strings tightened across locales.
- Stability: 22 React Compiler lint issues resolved, tighter effect dependencies, cleaner hook usage across 15 files.
Meeting AEC Helper v1.0.0
Prebuilt WebRTC Audio Processing sidecar for meeting mic echo cancellation.
Binaries built from native/meeting-aec-helper/ against pinned webrtc-apm + abseil-cpp commits.
1.6.7
AI Chat & Semantic Search — Ask questions about your notes with the new embedded chat panel. Conversations sync to the cloud, and a local semantic search engine finds notes by meaning, not just keywords.
Meeting Improvements — Calendar attendees automatically appear on meeting notes, auto-detection works for browser meetings, and echo cancellation cleans up mic input.
Save Notes as Files — Export your notes to local Markdown files that mirror your folder structure.
Responsive Settings — Settings dialog adapts gracefully to smaller windows.
Bug Fixes — Resolved issues with meeting transcription, folder switching, clipboard pasting, and participant saving. Improved Linux Wayland support and Windows build signing.
Huge thanks to @xAlcahest @DamianPala and others for their contributions to platform stability and improvements!
1.6.6
Rich Text Notes
- Notes now use a rich text editor with Obsidian-style live preview — Markdown syntax hides as you type, giving you a clean writing experience
Meeting Transcription Upgrades
- Dual-channel transcription — mic and system audio are captured separately with speaker-labeled chat bubbles so you can see who said what
- Timestamped segments — meeting transcripts now include timestamps in chronological order
- Smarter meeting notes — AI-generated summaries are now speaker-aware for more accurate meeting recaps
New Local Models
- Mistral Nemo 12B and Gemma 3 12B are now available for on-device AI processing
macOS Audio Improvements
- Native system audio capture via CoreAudio Tap — no more "screen recording" permission prompt on macOS 14.2+
- macOS 15+ now shows the correct system audio consent dialog instead of the legacy screen recording one
Linux
- KDE Wayland support — native global shortcuts now work on KDE Plasma via D-Bus, joining GNOME and Hyprland support
- Fixed KDE Plasma overlay window and hotkey behavior
- Fixed clipboard paste reliability on KDE Wayland
Simplified Permissions
- Permission prompts consolidated to a single "Grant Access" button
- Permissions are now re-checked against the OS each time you open settings, so the UI always reflects your actual state
Bug Fixes
- Fixed Gemini agent streaming routing to the wrong endpoint
- Fixed Windows mic volume being permanently altered during dictation
- Fixed mono transcription failure on Linux
- Fixed Bluetooth audio issues during meetings
- Fixed meeting detection notifications firing when you're already in a meeting
- Fixed held modifier keys not releasing before paste on Windows
- Fixed paused media being unpaused during dictation
- Fixed Google OAuth users skipping onboarding
- Fixed AI cleanup prompt refusing to transcribe command-like speech
- Fixed agent hotkey conflicts not showing a warning
1.6.5
- Meeting detection improvements
1.6.4
Meeting Mode
- Meeting mode hotkey — Assign a dedicated hotkey to instantly snap the panel into meeting mode and start a new meeting note
- Smarter meeting detection — Reduced false positives; background apps like FaceTime no longer trigger notifications unless mic activity is detected
- Meeting detection toggle — Disable meeting detection entirely from Settings
Auto-Update
- Update notification — A slide-in notification appears when a new version is available, with an "Update Now" button
Multi-Monitor
- Cursor-aware positioning — The floating icon now appears on whichever monitor your cursor is on
New Models
- GPT-5.4 — Added as the new flagship OpenAI model
- Qwen 3.5 — New local models added; removed sub-1B models
Bug Fixes
- Fixed paused media resuming unexpectedly on Windows when no active playback sessions exist
- Fixed Windows hotkey listener state corruption when switching hotkeys
- Fixed silence detection rejecting valid speech (lowered threshold)
- Fixed Windows paste not working in Windows Terminal (scan codes now included)
- Fixed API keys saved in Settings being overridden by shell environment variables
- Fixed local LLM models being deleted during Windows app updates
- Fixed hotkey tooltip display for multi-key combos on macOS
- Fixed RPM install conflicts with other Electron apps on Linux
- Added Tailscale VPN (CGNAT range) support for self-hosted setups
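The Tailscale item refers to the CGNAT range, which is 100.64.0.0/10 (RFC 6598). A membership check for that range can be sketched as below. How OpenWhispr uses such a check, for example to treat these addresses as private when validating a self-hosted URL, is an assumption here, and `isCgnatAddress` is a hypothetical name.

```typescript
// True if the dotted-quad IPv4 address falls inside 100.64.0.0/10.
function isCgnatAddress(ip: string): boolean {
  const parts = ip.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => isNaN(o) || o > 255)) return false;
  // A /10 prefix fixes the first octet (100) and the top two bits of the
  // second octet (01xxxxxx), which covers 64 through 127.
  return octets[0] === 100 && octets[1] >= 64 && octets[1] <= 127;
}
```

Under this check, `100.64.0.1` and `100.127.255.254` are inside the range, while `100.128.0.1` and `192.168.1.1` are not.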
Other
- Panel start position now persists across app restarts
- Cross-window settings sync (hotkey changes apply instantly)
- Account deletion flow with cloud data cleanup
- Agent mode window renamed to "Agent Chat"
1.6.3
Clearer Permissions
- "Screen Recording" → "System Audio" — all permission prompts now clearly state we capture other participants' audio, not your screen
- Electron 39 — eliminates the purple "screen recording" indicator, the "Your screen is being observed" lock screen message, and the misleading permission prompt on macOS 14.2+
Better Soft Voice Recognition
- Auto Gain Control now enabled for dictation, automatically boosting quiet speech
- Lower VAD sensitivity so soft-spoken audio is no longer missed
- Less clipping at the start of speech — increased padding so quiet beginnings aren't cut off
Linux
- Hyprland Wayland support: native global shortcuts using hyprctl keybindings + D-Bus
Bug Fixes
- Fixed wl-copy failing silently on Wayland due to a too-short timeout
- Fixed media staying paused after recording silence with "Pause media on dictation" enabled