Getting Handy to work in GNOME Wayland (Ubuntu 25.10+)

Background & what we’re trying to achieve

Handy (CJ Pais's Tauri 2.x speech-to-text desktop app, v0.8.3 Linux AppImage) is the chosen dictation tool. The required workflow:

  • Press a global keyboard shortcut to start recording.
  • Speak.
  • Press the shortcut again (toggle mode) to stop. Handy transcribes locally via Whisper.
  • The transcribed text appears in the focused field of whichever app the cursor was in (terminal, GUI text editor, browser text field, Electron app -- universal).

On Ubuntu 25.10 / GNOME 49 / Wayland, neither the hotkey step nor the text-delivery step works out of the box. The keyboard shortcut only fires when Handy's window itself is focused, and Handy's default text-delivery mechanism fails because Mutter doesn't implement the protocol it depends on. Both problems are solvable from outside Handy with no upstream patch. Here’s how.

The solution overview

The dictation workflow has two distinct failure points on GNOME Wayland, each requiring a "go-around" fix. Neither can be addressed by changing Handy's settings alone. Both fixes work by replacing an in-process mechanism that GNOME Wayland refuses to support with an external mechanism that the system already permits.

| Failure | Why it fails on GNOME Wayland | The go-around |
| --- | --- | --- |
| Global hotkey doesn't fire when Handy is unfocused | Tauri's global-shortcut plugin grabs keys via Wayland's keyboard listener, which by design only delivers to the focused surface. So Tauri's "global" shortcut is in fact window-local. | A GNOME custom keybinding at the WM level. GNOME's keybinding handler runs in gnome-shell and fires regardless of which client has focus. It invokes Handy's CLI which IPCs the running daemon. |
| Transcribed text isn't typed/pasted into the focused window | Handy's typing tool defaults route through wtype, which needs the zwp_virtual_keyboard_v1 Wayland protocol. Mutter (GNOME’s window manager & compositor) intentionally does not implement it (it's a wlroots-only protocol). Every paste method that simulates a keystroke through wtype fails. | An External Script paste method invoking ydotool type, which synthesizes input via /dev/uinput. uinput is kernel-level -- the kernel forges input events that propagate through libinput to whichever app is focused, with no need for any Wayland protocol Mutter is missing. |

See the appendix for why simpler fixes don’t work.
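Both go-arounds target one specific combination: a Wayland session running GNOME. A minimal sketch for confirming that a machine matches, assuming the standard XDG_SESSION_TYPE and XDG_CURRENT_DESKTOP session variables are set (the function name is made up for illustration):

```shell
#!/usr/bin/env bash
# Returns 0 when the session looks like GNOME on Wayland.
# Arguments are optional; they default to the standard XDG session variables.
is_gnome_wayland() {
    local session_type="${1:-${XDG_SESSION_TYPE:-}}"
    local desktop="${2:-${XDG_CURRENT_DESKTOP:-}}"
    [ "$session_type" = "wayland" ] || return 1
    # XDG_CURRENT_DESKTOP is a colon-separated list, e.g. "ubuntu:GNOME".
    case ":$desktop:" in
        *:GNOME:*) return 0 ;;
        *)         return 1 ;;
    esac
}

if is_gnome_wayland; then
    echo "GNOME on Wayland: both go-arounds in this article apply"
else
    echo "Not GNOME on Wayland: this article targets a different setup"
fi
```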

A note on the Tauri Typing Tool dropdown

Handy's UI on this system shows only Auto (Recommended) and wtype in the Typing Tool dropdown. The v0.8.3 source code at src-tauri/src/settings.rs actually defines the full enum TypingTool { Auto, Wtype, Kwtype, Dotool, Ydotool, Xdotool }. Handy's UI hides options whose backing binary isn't on PATH. After apt install ydotool, Ydotool does appear in the dropdown -- but it expects 1.x semantics, so it doesn't actually work with the apt version. That's why we use External Script instead of flipping the dropdown.
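To predict which entries the dropdown will show, it can help to list which candidate binaries are resolvable. The sketch below assumes the binary names are simply the lowercase forms of the enum variants, and that Handy resolves them through the same PATH as the shell running the check; neither is confirmed from the source.

```shell
#!/usr/bin/env bash
# Print the typing-tool binaries that resolve on the current PATH.
# Assumption: binary names are the lowercase TypingTool variant names.
available_typing_tools() {
    local tool found=""
    for tool in wtype kwtype dotool ydotool xdotool; do
        if command -v "$tool" >/dev/null 2>&1; then
            found="${found:+$found }$tool"
        fi
    done
    printf '%s\n' "$found"
}

available_typing_tools
```

Seeing ydotool in this list only means the binary exists; it says nothing about whether its version matches what Handy's built-in Ydotool option expects.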

Implementing the fixes

NB: search for “<your username>” throughout and replace it with your actual username.

Problem 1: global hotkey delivery

Step 1: Bind a GNOME custom keyboard shortcut (with absolute path)

GNOME Settings -> Keyboard -> Custom Shortcuts, or via gsettings:

# Inspect current custom-keybindings list first:
gsettings get org.gnome.settings-daemon.plugins.media-keys custom-keybindings
# Register the entry path. This overwrites the list, so include any paths
# printed by the command above alongside custom1:
gsettings set org.gnome.settings-daemon.plugins.media-keys custom-keybindings \
  "['/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom1/']"

gsettings set org.gnome.settings-daemon.plugins.media-keys.custom-keybinding:/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom1/ name 'Handy Transcription'
gsettings set org.gnome.settings-daemon.plugins.media-keys.custom-keybinding:/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom1/ command '/home/<your username>/.local/bin/handy --toggle-transcription'
gsettings set org.gnome.settings-daemon.plugins.media-keys.custom-keybinding:/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom1/ binding '<Control>space'
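Because gsettings set replaces the whole custom-keybindings list, existing shortcuts are lost unless their paths are carried over. A minimal sketch of a merge helper, assuming the two output shapes gsettings get normally prints for a string array ("@as []" when empty, "['/a/', '/b/']" otherwise); the function name is hypothetical:

```shell
#!/usr/bin/env bash
# Merge one keybinding path into a list as printed by `gsettings get`.
merge_keybinding_path() {
    local current="$1" new="$2"
    # Already registered: return the list unchanged.
    case "$current" in
        *"'$new'"*) printf '%s\n' "$current"; return 0 ;;
    esac
    # An empty string array is printed as "@as []".
    current="${current#@as }"
    if [ "$current" = "[]" ]; then
        printf "['%s']\n" "$new"
    else
        # Drop the closing bracket and append the new entry.
        printf "%s, '%s']\n" "${current%]}" "$new"
    fi
}

NEW='/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom1/'
merge_keybinding_path "@as []" "$NEW"

# To apply for real (assumes gsettings is installed):
#   current="$(gsettings get org.gnome.settings-daemon.plugins.media-keys custom-keybindings)"
#   gsettings set org.gnome.settings-daemon.plugins.media-keys custom-keybindings \
#     "$(merge_keybinding_path "$current" "$NEW")"
```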

Critical detail: the Command field must be an absolute path. GNOME's keybinding handler runs with the session PATH set at graphical login from ~/.config/environment.d/, not from ~/.zshrc. On Ubuntu 25.10 that PATH is /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin -- no ~/.local/bin. A bare handy --toggle-transcription silently fails to resolve. (Long-term cleaner fix: add PATH=...:%h/.local/bin to ~/.config/environment.d/path.conf, which makes the keybinding handler and any future custom shortcuts find user-local binaries automatically. Optional.)
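A quick way to catch the bare-name mistake is to test the command string itself. The helper below is a sketch (the name is made up): it accepts the value as gsettings prints it, with or without surrounding single quotes, and succeeds only when the first word is an absolute path to an executable file.

```shell
#!/usr/bin/env bash
# Succeeds only if the first word of a shortcut command is an absolute path
# to an existing executable.
command_is_absolute() {
    local cmd="$1" prog
    # gsettings prints strings wrapped in single quotes; strip them if present.
    cmd=${cmd#\'}
    cmd=${cmd%\'}
    prog="${cmd%% *}"              # first space-delimited word
    case "$prog" in
        /*) [ -x "$prog" ] ;;      # absolute: must also exist and be executable
        *)  return 1 ;;            # bare name: depends on the session PATH
    esac
}

# Usage:
#   command_is_absolute "$(gsettings get <schema:path> command)" || echo "fix the Command field"
```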

Step 2: Ensure Handy daemon is running

handy --toggle-transcription is a D-Bus IPC call to the running Handy daemon. With no daemon running, the command exits 0 with no effect (a particularly silent failure mode). Handy creates an autostart entry at ~/.config/autostart/Handy.desktop automatically when autostart_enabled: true is set in its settings. Edit that file's Exec= line to add --start-hidden if you don't want Handy's window to pop up on every login:

Exec=/home/<your username>/.local/bin/handy --start-hidden
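If you would rather script that edit, here is a hedged sketch. It appends --start-hidden to the Exec= line only when the flag is missing, so running it twice is harmless. The function name is hypothetical, and whether Handy later regenerates the autostart file (undoing the edit) is not something the source confirms.

```shell
#!/usr/bin/env bash
# Append --start-hidden to the Exec= line of a .desktop file, idempotently.
add_start_hidden() {
    local file="$1" tmp
    tmp="$(mktemp)" || return 1
    # On Exec= lines that lack the flag, append it at end of line.
    sed '/^Exec=/{/--start-hidden/!s/$/ --start-hidden/;}' "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage: pass the autostart file as the first argument, e.g.
#   ./add-start-hidden.sh ~/.config/autostart/Handy.desktop
if [ "$#" -ge 1 ]; then
    add_start_hidden "$1"
fi
```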

To start the daemon for the current session without rebooting:

nohup /home/<your username>/.local/bin/handy --start-hidden >/tmp/handy-startup.log 2>&1 &
disown

Problem 2: text delivery via ydotool

Step 3: Install ydotool and grant uinput access

sudo apt install -y ydotool
echo 'KERNEL=="uinput", GROUP="input", MODE="0660", OPTIONS+="static_node=uinput"' \
  | sudo tee /etc/udev/rules.d/60-uinput.rules
echo uinput | sudo tee /etc/modules-load.d/uinput.conf
sudo modprobe uinput && sudo udevadm control --reload-rules && sudo udevadm trigger

What each piece does:

  • apt install ydotool -- installs ydotool 0.1.8-3build1 (the legacy monolithic version, no separate daemon; talks to /dev/uinput directly).
  • The udev rule -- when the kernel creates the uinput device, set its group to input and mode to 0660 so members of the input group can write to it without sudo. The OPTIONS+="static_node=uinput" clause makes systemd-udev pre-create the node with the right permissions even before any uevent fires (relevant because uinput is module-loaded, not auto-discovered).
  • /etc/modules-load.d/uinput.conf -- causes systemd-modules-load.service to modprobe uinput at boot. Without this, /dev/uinput won't exist after reboot until something else loads the module.
  • The modprobe + udevadm chain -- applies everything immediately without a reboot: load the module now, reload the rules, replay uevents so the new rule's permissions take effect on the just-created node.

Confirm the user is in the input group: the output of groups should list it (this was already true on this system, as an Ubuntu 25.10 default).
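The group check can also be scripted. This sketch looks at the groups of the current process by default, which matters because a freshly added group only shows up after logging out and back in. The function name is an invention for illustration.

```shell
#!/usr/bin/env bash
# Succeeds if the current process (or the named user) belongs to the group.
user_in_group() {
    local group="$1" list
    if [ -n "${2:-}" ]; then
        list="$(id -nG "$2")"     # groups from the user database
    else
        list="$(id -nG)"          # groups of this running session
    fi
    case " $list " in
        *" $group "*) return 0 ;;
        *)            return 1 ;;
    esac
}

if user_in_group input; then
    echo "input group: ok"
else
    echo 'not in input group; try: sudo usermod -aG input "$USER", then log out and back in'
fi
```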

Step 4: Write the External Script

Save as /home/<your username>/.local/bin/handy-paste-wl-copy (name retained for historical reasons -- this script no longer uses wl-copy):

#!/usr/bin/env bash
# Handy External Script: type transcribed text directly into the focused window
# via ydotool's `type` subcommand. At --key-delay 0 this is faster than paste
# for typical dictation lengths and works universally (terminals, GUI editors,
# Notion, etc.) without depending on each app's paste-shortcut convention.

set -u

TEXT="${1:-}"
[ -z "${TEXT}" ] && exit 0

env -i \
    HOME="${HOME}" \
    USER="${USER:-<your username>}" \
    LOGNAME="${LOGNAME:-${USER:-<your username>}}" \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
    LANG="${LANG:-en_US.UTF-8}" \
    /usr/bin/ydotool type --key-delay 0 "${TEXT}" >/dev/null 2>&1

exit 0

Make it executable: chmod +x /home/<your username>/.local/bin/handy-paste-wl-copy.

Key design choices:

  • env -i sanitization. Handy's AppImage runtime prepends its own bundled lib/bin paths to LD_LIBRARY_PATH, PYTHONHOME, PYTHONPATH, GTK_PATH, etc. for the AppImage process and its children. Without scrubbing, the system /usr/bin/ydotool would attempt to dynamic-link against the AppImage's bundled libs, which can fail silently or behave erratically. env -i clears everything and we explicitly pass only what ydotool needs (PATH, HOME, USER, LOGNAME, LANG). Note that WAYLAND_DISPLAY and XDG_RUNTIME_DIR are not needed by ydotool because uinput is kernel-level, not a Wayland client.
  • ydotool type with --key-delay 0. Tested empirically: 225 chars completes in ~280ms, including the default --delay 100 initial wait. The default key-delay (12ms) would stretch this to about 2.8s (225 x 12ms = 2.7s, plus the initial wait) -- noticeable. Zero delay is reliable on this system for typical dictation lengths.
  • No clipboard touch. Unlike a clipboard + paste-shortcut pattern, this leaves your clipboard contents alone. Earlier versions of the script wrote via wl-copy and dispatched Ctrl+Shift+V via ydotool; that approach worked in Ghostty but failed in GNOME Text Editor (which doesn't accept Ctrl+Shift+V as paste). Switching to ydotool type removed the app-by-app paste-shortcut compatibility problem entirely.
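The effect of env -i is easy to see in isolation. The sketch below wraps the same allow-list idea in a hypothetical run_scrubbed function, then simulates an AppImage-style polluted LD_LIBRARY_PATH (the path is invented) and shows that the child process no longer sees it.

```shell
#!/usr/bin/env bash
# Run a command with a scrubbed environment, passing only a short allow-list.
run_scrubbed() {
    env -i \
        HOME="${HOME:-/}" \
        PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
        LANG="${LANG:-en_US.UTF-8}" \
        "$@"
}

# The variable is set only for this one call; env -i drops it before the
# child shell starts, so the fallback text is printed.
LD_LIBRARY_PATH=/tmp/fake-appimage/usr/lib \
    run_scrubbed sh -c 'echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"'
# prints: LD_LIBRARY_PATH=<unset>
```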

Step 5: Configure Handy

In Handy's settings (Advanced tab):

  • Paste Method: External Script
  • External Script Path: /home/<your username>/.local/bin/handy-paste-wl-copy
  • Clipboard Handling: Don't Modify (cosmetic; our script doesn't touch the clipboard, so this preference is honored implicitly).

These map to fields in ~/.local/share/com.pais.handy/settings_store.json:

"paste_method": "external_script",
"external_script_path": "/home/<your username>/.local/bin/handy-paste-wl-copy",
"clipboard_handling": "dont_modify",

Verification

Confirm each link in the chain works.

ydotool can drive /dev/uinput:

ls -la /dev/uinput
# Expected: crw-rw---- root input ... (group input, mode 660)
test -w /dev/uinput && echo "user can write" || echo "DENIED"
ydotool key shift 2>&1
# Should print "ydotoold backend unavailable" (informational, not an error) and exit 0.

ydotool type delivers fast enough:

LONG=$(printf 'the quick brown fox jumps over the lazy dog. %.0s' {1..5})
echo "Length: ${#LONG}"
time ydotool type --key-delay 0 "$LONG"
# Expected: ~250-350ms wall-clock for ~225 chars. Output should land in the focused window with no dropped characters.
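For a feel of how the delay settings scale, here is a rough linear model: characters times per-key delay, plus the initial wait. It is an assumption for illustration, not ydotool's actual timing logic, and it ignores process startup and per-event overhead, which is presumably why the measured zero-delay run lands above the 100ms floor.

```shell
#!/usr/bin/env bash
# Estimate typing time in ms: chars * key_delay + initial_delay.
estimate_type_ms() {
    local chars="$1" key_delay_ms="$2" initial_delay_ms="$3"
    echo $(( chars * key_delay_ms + initial_delay_ms ))
}

estimate_type_ms 225 12 100   # default key delay: prints 2800
estimate_type_ms 225 0 100    # --key-delay 0:     prints 100
```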

Handy delegates to the script:

After triggering a transcription, the Handy log at ~/.local/share/com.pais.handy/logs/handy.log should show:

[handy_app_lib::clipboard][INFO] Using paste method: ExternalScript, delay: 60ms
[handy_app_lib::clipboard][INFO] Pasting via external script: /home/<your username>/.local/bin/handy-paste-wl-copy
[handy_app_lib::actions][DEBUG] Text pasted successfully in Nms

If the "Text pasted successfully" line shows a duration much greater than ~50ms, the script is hanging -- typically because a subprocess is daemonizing without closing inherited file descriptors. (Earlier wl-copy-based versions of the script hit this; ydotool type doesn't, but it's the canary to watch for.)
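Watching for that canary can be automated. The sketch below pulls the duration out of each "Text pasted successfully" line and reports the ones above a threshold. It assumes the N in the log excerpt is a plain integer immediately followed by "ms"; the function name and the 50ms default are illustrative.

```shell
#!/usr/bin/env bash
# Report paste durations above a threshold (default 50 ms) from a Handy log.
slow_pastes() {
    local log="$1" threshold_ms="${2:-50}" ms
    sed -n 's/.*Text pasted successfully in \([0-9][0-9]*\)ms.*/\1/p' "$log" |
        while read -r ms; do
            if [ "$ms" -gt "$threshold_ms" ]; then
                echo "slow paste: ${ms}ms"
            fi
        done
    return 0
}

LOG="$HOME/.local/share/com.pais.handy/logs/handy.log"
if [ -f "$LOG" ]; then
    slow_pastes "$LOG"
fi
```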

Troubleshooting / failure modes

  • Pressing the shortcut does nothing. Check gsettings get .../custom1/ command shows an absolute path. A bare handy --toggle-transcription won't resolve from gnome-shell's PATH.
  • Pressing the shortcut transcribes but no text appears. Tail ~/.local/share/com.pais.handy/logs/handy.log while testing. Look for the "Pasting via external script" line. If absent, Handy isn't invoking the script -- check Paste Method = External Script and the path field. If present but no text appears, run the script manually with a test argument to verify it works in isolation:
/home/<your username>/.local/bin/handy-paste-wl-copy "manual-test"
    Should type "manual-test" into whatever's currently focused.
  • Wrong characters appear (e.g. "2442" instead of text). Means ydotool got the wrong syntax (likely a confusion between 0.1.8 key-name and 1.x keycode:state). Confirm the script uses ydotool type (not ydotool key 29:1 47:1 ...).
  • Garbled characters with non-US keyboard layout. ydotool 0.1.8's type assumes US QWERTY. Non-US layouts mistranslate. Realistic fixes: switch to dotool (layout-aware; build from sr.ht source -- not in apt) or pursue the RemoteDesktop portal path (Handy upstream PR #689).
  • A "Handy is ready" notification appears, but the window doesn't focus when you click Settings in the tray menu. Separate problem -- Handy's Tauri tray icon doesn't expose the standard .Activate / .ContextMenu D-Bus methods, only .Scroll and .SecondaryActivate. This is a Tauri/dbusmenu activation-token gap, not a GNOME limitation. Workarounds: Alt+Tab to Handy, pin to dock, or middle-click the tray icon (which fires .SecondaryActivate). Upstream fix would need to come from Tauri.

Appendix: Why simpler fixes don't work

Here’s what didn’t work, and why.

| Path | Verdict | Reason |
| --- | --- | --- |
| Just use Tauri's in-app shortcut | Half-dead | Only fires while Handy is focused -- defeats the purpose of dictation |
| GNOME custom shortcut handy --toggle-transcription (bare name) | Dead | ~/.local/bin is NOT on gnome-shell's session PATH; the bare command silently fails to resolve. Absolute path required. |
| GNOME custom shortcut without a running Handy daemon | Dead | handy --toggle-transcription is an IPC to the running daemon; without daemon, exits 0 with no effect. Daemon must be autostarted. |
| Paste Method = Direct (Handy default) on GNOME | Dead | Routes through wtype -> fails on Mutter |
| Paste Method = Clipboard (Ctrl+V) | Dead | Writes to clipboard then sends Ctrl+V via wtype -> same Mutter failure. Worse: Handy's error-path tears down its wl-copy child process, so the clipboard ends up empty anyway. |
| QT_QPA_PLATFORM=xcb to force XWayland | Dead | Flatpak/AppImage's bundled Qt runtime lacks libxcb-cursor; DISPLAY isn't forwarded into the sandbox |
| Built-in Typing Tool = Ydotool with apt's ydotool 0.1.8 | Dead | Handy's dispatch expects ydotool 1.x (client + daemon, keycode:state syntax). Apt ships 0.1.8 (monolithic, no daemon, key-name syntax). Version mismatch -- the wrong syntax gets used and "2442" gets typed instead of Ctrl+V. |
| ydotool key 29:1 47:1 47:0 29:0 (1.x syntax) | Dead | 0.1.8 parses each token as an unknown key name, falls back to typing the first digit of each. "2442" is the literal evidence. |
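The "2442" symptom follows directly from the fallback described above. The snippet below only mimics that described behavior (take the first character of each unrecognized token); it is an illustration, not ydotool's actual parser.

```shell
#!/usr/bin/env bash
# Concatenate the first character of each argument, mimicking the fallback
# the article attributes to ydotool 0.1.8 for unknown key names.
first_chars() {
    local token out=""
    for token in "$@"; do
        out="${out}${token:0:1}"
    done
    printf '%s\n' "$out"
}

first_chars 29:1 47:1 47:0 29:0   # prints 2442
```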
