Skip to content

Releases: OpenWhispr/openwhispr

1.7.0

04 May 16:11

Choose a tag to compare

Breaking for Mac users: please delete your old OpenWhispr first. 1.7.0 uses a new app ID — we had to migrate Apple Developer accounts, so you'll be asked to re-grant permissions on first launch. Notes, settings, API keys, and downloaded models carry over automatically.

Summary

  • More sign-in options — new: Sign in with Microsoft, Sign in with Apple on macOS. Auth runs on our own infra now (migrated to Better Auth); self-hosters can point at their own server.
  • Sessions in your OS keychain — survive crashes and Electron restarts. No more random sign-outs.
  • API keys encrypted at rest — all 12 BYOK + enterprise creds moved from .env to the OS keychain. Silent one-time migration on first launch.
  • Background meeting recording — navigate to other notes, open Settings, switch tabs; the audio pipeline keeps going. New floating pill shows live mic levels and clicks back to the recording note.
  • Per-note diarization preferences persist across stop/resume.
  • Cleaner mic capture with a new acoustic gate + better echo cancellation. Music pause/resume on Windows works again.
  • Per-scope LLM setup — pick different providers/models for cleanup, agent, formatting, and chat. You can now have text cleanup toggled ON but your dictation agent toggled OFF.
  • NVIDIA Parakeet parakeet-unified-en-0.6b — new English-only model, 5.91% avg WER, ~631 MB.
  • openwhispr CLI talks directly to the desktop app via a local HTTP bridge (127.0.0.1, bearer-token); falls back to cloud when the app is closed.
  • Proxy-aware fetches + OS CA trust — corporate TLS interception works.
  • Auto-learn corrections now work for Cyrillic, CJK, Arabic, Devanagari.
  • A bunch of other small bug fixes and performance improvements.

Detailed changelog

  • build(mac): migrate Apple signing to Gizmo Labs Inc. by @gabrielste1n in #639
  • fix(qdrant): set cwd to STORAGE_DIR so packaged app doesn't crash on startup by @gabrielste1n in #640
  • feat(sync): cross-device delete propagation by @gabrielste1n in #646
  • feat(meeting): interop cloud streaming providers + AEC/VAD improvements by @gabrielste1n in #656
  • Re-grant permissions modal for 1.6.11 upgraders by @gabrielste1n in #659
  • fix(chat): persist first user message when creating a new conversation by @xAlcahest in #662
  • fix(tls): trust OS CA store for Node-side TLS; keep realtime WS on OpenAI only by @gabrielste1n in #657
  • Fix adding non-ASCII languages to dictionary by @wake0up0ne0 in #666
  • Fix dictionary correction tracking for TextPattern-based controls (alongside existing ValuePattern support) by @wake0up0ne0 in #665
  • Bundle ID migration: com.herotools.openwispr → com.gizmolabs.openwhispr by @gabrielste1n in #669
  • feat(cli): local HTTP bridge for unified CLI by @gabrielste1n in #676
  • fix(notes): update folder selection when moving notes between folders by @gabrielste1n in #678
  • refactor: split dictation cleanup from dictation agent + per-scope LLM config by @gabrielste1n in #677
  • refactor(auth): switch desktop to Better Auth + add Microsoft sign-in by @gabrielste1n in #686
  • fix(network): proxy-aware fetches + actionable connectivity errors by @gabrielste1n in #687
  • chore(release): 1.7.0 confidence cleanup by @gabrielste1n in #689
  • ci(release): rename VITE_NEON_AUTH_URL → VITE_AUTH_URL by @gabrielste1n in #690
  • fix(auth): initiate desktop OAuth from browser, not renderer by @gabrielste1n in #692
  • feat(auth): Sign in with Apple on macOS by @gabrielste1n in #691
  • fix(onnx): cap embedding segments and isolate inference in utility process by @gabrielste1n in #693
  • fix(sidecars): reap children on quit and on next launch (#683) by @gabrielste1n in #694
  • docs(changelog): plain-English 1.7.0 + lockfile refresh by @gabrielste1n in #695
  • fix(llama): raise Vulkan startup timeout, stop server before re-download by @xAlcahest in #698
  • fix(media): fall back to media key when GSMTC fails on Windows by @xAlcahest in #697
  • docs(readme): acknowledge Hugging Face as model hub by @gabrielste1n in #703
  • feat(transcribe): generate clientTranscriptionId for cloud sync dedup by @gabrielste1n in #702
  • feat(security): encrypt API keys at rest via safeStorage by @gabrielste1n in #629
  • fix(media): use AsTask bridge for WinRT async calls in GSMTC scripts by @xAlcahest in #706
  • fix(media): resume playback on no-audio-detected event by @kdenney in #701
  • feat(auth): switch desktop to Authorization: Bearer + safeStorage by @gabrielste1n in #704
  • fix(ci): compile macos-media-remote binary in release and build workflows by @kdenney in #700
  • feat(reasoning): self-hosted OpenAI-compatible parity + thinking-mode toggle by @gabrielste1n in #708
  • feat(parakeet): add parakeet-unified-en-0.6b model by @milanleonard in #713
  • feat(meeting): background recording with floating pill and responsive layout by @gabrielste1n in #709
  • fix(runtime): unblock model downloads, surface ONNX worker errors, dedupe hotkey re-register by @gabrielste1n in #716
  • fix(linux): resolve symlink in launcher wrapper by @xAlcahest in #717

New Contributors

1.6.10

20 Apr 17:34
079ee98

Choose a tag to compare

Features

  • Speaker diarization controls — per-meeting toggle, "N others in call" stepper, auto-label on 1-on-1s, fallback to "You/Others" when labeling is off
  • Integrations hub — new top-level view with MCP connectivity for Claude, ChatGPT, and Cursor; API keys moved here
  • Enterprise connections - connect to your organisations instance via bedrock

Bug Fixes & Improvements

  • Settings reorganization — "AI Models" split into Speech-to-Text and Language Models with clearer sub-tabs
  • Agent hotkey moved under Hotkeys
  • Meeting transcription tuned to reduce GPT-4o hallucinations during silence
  • Model cache opens at the correct folder so downloaded models are actually visible
  • Sync — cleared transcripts and deleted folders no longer reappear after restart
  • Echo leak detector now runs pre-AEC so it can actually see system audio bleed

1.6.9

16 Apr 15:06

Choose a tag to compare

New features

Transcript Export — Save transcripts to disk as TXT, SRT, or JSON directly from the meeting view.

Cloud Sync — Notes, folders, conversations, and transcriptions now sync bidirectionally across devices.

API Keys — Create and manage API keys from Settings to integrate OpenWhispr with your workflows programmatically.

Linux Push-to-Talk — Native push-to-talk via evdev with guided permission setup.

Auto-Paste Toggle — New setting to disable automatic pasting after dictation.

Agent Folders — The agent can now list, create, and match folders semantically when organizing notes.

Improvements

  • Upgraded to Electron 41 and Node 24
  • Redesigned provider tabs as compact pill buttons
  • Local model descriptions replaced with clickable spec links
  • llama.cpp detection now probes /v1/models and prefers /chat/completions

Fixes

  • Windows hotkey no longer false-triggers after Win+L lock/unlock
  • Fixed stuck recording when push-to-talk key-up is missed on Windows
  • Relaxed speech gate thresholds with no-audio toast for failed transcriptions
  • Retry button now shown on failed transcriptions
  • Notes sync when updated externally
  • Tuned speaker diarization to prevent excessive speaker creation
  • TLS/certificate errors now surfaced in model download UI
  • Floating icon position persists across restarts
  • Fixed notification window sizing and dismiss circle visibility

1.6.8

15 Apr 02:17

Choose a tag to compare

New features

  • Meeting recording with on-device speaker diarization — capture mic and system audio together, and every segment is automatically labeled with who said it. Diarization runs 100% locally: voices never leave your device. System audio capture now works on Windows and Linux (Chromium loopback and the native audio portal), not just macOS.
  • Local live transcription preview — see your words appear in real time as you speak, streamed from a local Whisper or Parakeet model before you release the hotkey. Fully on-device, no cloud round trip.
  • Self-hosted support — first-class option to point OpenWhispr at your own LAN server for both transcription and agent inference. One click to swap between OpenAI, Anthropic, Gemini, a local GGUF, or your self-hosted endpoint.
  • Speaker to contact linking — voice profiles bind to calendar attendees, so recurring participants get named automatically after the first tag.
  • Improved echo cancellation — WebRTC AEC now runs as a native sidecar, paired with a custom cross-correlation detector and text-level dedupe to cleanly handle external speakers, system-audio capture, and low-echo-quality mics.
  • Gemma 4 models — Gemma 4 31B and Gemma 4 26B MoE added to the local model registry.
  • Smart transcription error handling — when a transcription fails, inline CTAs let you retry, switch provider, or open logs.

Linux

  • Hyprland global hotkeys via hyprctl, plus correct active-window detection for terminal paste.
  • GNOME shortcut conflict detection, active-key display, and proper fallback when the requested shortcut is taken.

Fixes

  • Windows: installer DPI scaling corrected, GPU compositing re-enabled.
  • Local Whisper: stricter speech gate cuts hallucinated output on silent audio.
  • Diarization: session-scoped results, disk spooling for long meetings, reliable spinner, correct speaker label propagation.
  • Notes: transcripts scope to their originating note, view auto-switches when a meeting recording starts.
  • UI: fixed 420px window width that grows only with text, consistent across phases; preview overlay redesigned.
  • Settings: inference mode auto-switches when you select a new provider.
  • i18n: speaker labels, cloud settings, integrations, and calendar strings tightened across locales.
  • Stability: 22 React Compiler lint issues resolved, tighter effect dependencies, cleaner hook usage across 15 files.

Meeting AEC Helper v1.0.0

14 Apr 23:40
bc4efb0

Choose a tag to compare

Prebuilt WebRTC Audio Processing sidecar for meeting mic echo cancellation.

Binaries built from native/meeting-aec-helper/ against pinned webrtc-apm + abseil-cpp commits.

1.6.7

03 Apr 02:08

Choose a tag to compare

AI Chat & Semantic Search — Ask questions about your notes with the new embedded chat panel. Conversations sync to the cloud, and a local semantic search engine finds notes by meaning, not just keywords.

Meeting Improvements — Calendar attendees automatically appear on meeting notes, auto-detection works for browser meetings, and echo cancellation cleans up mic input.

Save Notes as Files — Export your notes to local Markdown files that mirror your folder structure.

Responsive Settings — Settings dialog adapts gracefully to smaller windows.

Bug Fixes — Resolved issues with meeting transcription, folder switching, clipboard pasting, and participant saving. Improved Linux Wayland support and Windows build signing.

Huge thanks to @xAlcahest @DamianPala and others for their contributions to platform stability and improvements!

1.6.6

23 Mar 22:55

Choose a tag to compare

Rich Text Notes

  • Notes now use a rich text editor with Obsidian-style live preview — Markdown syntax hides as you type, giving you a clean writing experience

Meeting Transcription Upgrades

  • Dual-channel transcription — mic and system audio are captured separately with speaker-labeled chat bubbles so you can see who said what
  • Timestamped segments — meeting transcripts now include timestamps in chronological order
  • Smarter meeting notes — AI-generated summaries are now speaker-aware for more accurate meeting recaps

New Local Models

  • Mistral Nemo 12B and Gemma 3 12B are now available for on-device AI processing

macOS Audio Improvements

  • Native system audio capture via CoreAudio Tap — no more "screen recording" permission prompt on macOS 14.2+
  • macOS 15+ now shows the correct system audio consent dialog instead of the legacy screen recording one

Linux

  • KDE Wayland support — native global shortcuts now work on KDE Plasma via D-Bus, joining GNOME and Hyprland support
  • Fixed KDE Plasma overlay window and hotkey behavior
  • Fixed clipboard paste reliability on KDE Wayland

Simplified Permissions

  • Permission prompts consolidated to a single "Grant Access" button
  • Permissions are now re-checked against the OS each time you open settings, so the UI always reflects your actual state

Bug Fixes

  • Fixed Gemini agent streaming routing to the wrong endpoint
  • Fixed Windows mic volume being permanently altered during dictation
  • Fixed mono transcription failure on Linux
  • Fixed Bluetooth audio issues during meetings
  • Fixed meeting detection notifications firing when you're already in a meeting
  • Fixed held modifier keys not releasing before paste on Windows
  • Fixed paused media being unpaused during dictation
  • Fixed Google OAuth users skipping onboarding
  • Fixed AI cleanup prompt refusing to transcribe command-like speech
  • Fixed agent hotkey conflicts not showing a warning

1.6.5

17 Mar 17:26

Choose a tag to compare

  • Meeting detection improvements

1.6.4

16 Mar 15:09

Choose a tag to compare

Meeting Mode

  • Meeting mode hotkey — Assign a dedicated hotkey to instantly snap the panel into meeting mode and start a new meeting note
  • Smarter meeting detection — Reduced false positives; background apps like FaceTime no longer trigger notifications unless mic activity is detected
  • Meeting detection toggle — Disable meeting detection entirely from Settings

Auto-Update

  • Update notification — A slide-in notification appears when a new version is available, with an "Update Now" button

Multi-Monitor

  • Cursor-aware positioning — The floating icon now appears on whichever monitor your cursor is on

New Models

  • GPT-5.4 — Added as the new flagship OpenAI model
  • Qwen 3.5 — New local models added; removed sub-1B models

Bug Fixes

  • Fixed paused media resuming unexpectedly on Windows when no active playback sessions exist
  • Fixed Windows hotkey listener state corruption when switching hotkeys
  • Fixed silence detection rejecting valid speech (lowered threshold)
  • Fixed Windows paste not working in Windows Terminal (scan codes now included)
  • Fixed API keys saved in Settings being overridden by shell environment variables
  • Fixed local LLM models being deleted during Windows app updates
  • Fixed hotkey tooltip display for multi-key combos on macOS
  • Fixed RPM install conflicts with other Electron apps on Linux
  • Added Tailscale VPN (CGNAT range) support for self-hosted setups

Other

  • Panel start position now persists across app restarts
  • Cross-window settings sync (hotkey changes apply instantly)
  • Account deletion flow with cloud data cleanup
  • Agent mode window renamed to "Agent Chat"

1.6.3

14 Mar 15:16

Choose a tag to compare

Clearer Permissions

  • "Screen Recording" → "System Audio" — all permission prompts now clearly state we capture other participants' audio, not your screen
  • Electron 39 — eliminates the purple "screen recording" indicator, the "Your screen is being observed" lock screen message, and the misleading permission prompt on macOS 14.2+

Better Soft Voice Recognition

  • Auto Gain Control now enabled for dictation, automatically boosting quiet speech
  • Lower VAD sensitivity so soft-spoken audio is no longer missed
  • Less clipping at the start of speech — increased padding so quiet beginnings aren't cut off

Linux

  • Hyprland Wayland support — native global shortcuts using hyprctl keybindings + D-Bus

Bug Fixes

  • Fixed wl-copy failing silently on Wayland due to a too-short timeout
  • Fixed media staying paused after recording silence with "Pause media on dictation" enabled