The Features · Vol. II, No. 04

Every feature, unpacked.

Voicy is a single tool, held, spoken, released. But like a good pen, the trick is in the details. Here is the table of contents.

01 · The Gesture

Hold. Speak. Release.

One deliberate gesture, like pressing send. Voicy only listens while you hold a key. Release and the words slip into whatever app you were already in. Nothing to open, nothing to dismiss.

  • Any key on your keyboard can be the trigger: fn, right-option, caps-lock, even a side button on your mouse.
  • Hold to keep listening. Release to drop. Or set a quick-press mode if you prefer to toggle.
  • Push-to-talk learns your most-used key automatically on first run.
fn
hold · 0:03 · 28 words · release
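
The difference between hold mode and quick-press mode can be sketched as a tiny state machine. This is an illustrative Python sketch, not Voicy's implementation: the class name `Trigger`, the mode names "hold" and "toggle", and the key-repeat handling are all assumptions made for the example.

```python
class Trigger:
    """Hypothetical model of a push-to-talk key with two modes."""

    def __init__(self, mode: str = "hold") -> None:
        if mode not in ("hold", "toggle"):
            raise ValueError("mode must be 'hold' or 'toggle'")
        self.mode = mode
        self.pressed = False    # physical key state
        self.listening = False  # whether audio is being captured

    def key_down(self) -> None:
        if self.pressed:
            return  # ignore keyboard auto-repeat while the key stays down
        self.pressed = True
        if self.mode == "hold":
            self.listening = True
        else:
            # quick-press mode: each fresh press flips the state
            self.listening = not self.listening

    def key_up(self) -> None:
        self.pressed = False
        if self.mode == "hold":
            self.listening = False  # release drops the recording


hold = Trigger("hold")
hold.key_down()   # listening while held
hold.key_up()     # stops on release

toggle = Trigger("toggle")
toggle.key_down(); toggle.key_up()   # first press starts listening
toggle.key_down(); toggle.key_up()   # second press stops it
```

In hold mode the release is what ends the recording; in toggle mode the release is ignored and only a fresh press changes state.
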
02 · The Capsule

A capsule that follows.

A discreet pill anchored at the bottom of your screen, so you can see what is being heard. It moves to whichever display your cursor is on, but always stays put along the bottom edge. Within a glance, never in the way.

  • Always lives along the bottom edge, never covering the text you are working in.
  • Multi-display aware: jumps within 80 ms to whichever screen your cursor is on, then settles at the bottom.
  • Foldable to a single dot if you prefer something quieter.
Notes · Meeting prep
Quick agenda for tomorrow's review,
we cover the Q3 narrative, the new onboarding cohort, and the slide-count rule. I'll keep it under twelve
0:04
03 · The Edit

From rambled, to ready.

Cleanup is itself a mode. Set up an editing mode and Voicy clears the scaffolding as it sets: filler words gone, casing right, punctuation placed. Leave it off and you get plain, verbatim dictation.

  • Build it once, then "uh", "you know", and "kind of" disappear, unless you keep them in your style.
  • Sentence case, proper nouns, code identifiers, all spelled correctly.
  • Optional one-shot rewrite: dictate a rough thought, get a tightened paragraph.
What you said

um yeah can you like double-check that um the invoice went out to um Beatrice yesterday?

What Voicy set

Can you double-check the invoice went out to Beatrice yesterday?
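
As a rough illustration of what an editing pass does to the sentence above, here is a minimal Python sketch. It is not Voicy's engine: the filler list and the function name `tidy` are assumptions invented for this example, and a real editing mode would be shaped by the instruction you write rather than a fixed word list.

```python
import re

# Hypothetical filler list, for illustration only.
FILLERS = ["um", "uh", "yeah", "like", "you know", "kind of"]

def tidy(raw: str) -> str:
    """Drop filler words, collapse leftover spaces, capitalise the first letter."""
    pattern = r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b"
    text = re.sub(pattern, "", raw, flags=re.IGNORECASE)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:1].upper() + text[1:]

said = "um yeah can you like double-check that um the invoice went out to um Beatrice yesterday?"
print(tidy(said))
```

A plain word list is blunt: it would also strip "like" from "I like it", and it keeps the "that" which the example above drops. That gap is why cleanup is described here as a mode you shape, not a find-and-replace.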

04 · Modes

Modes you build.

Voicy ships with no ready-made modes. A mode is a short instruction you write once, a tone, a format, a vocabulary, then pin to an app. Start from a type like Email, Developer, or Translate, or a blank prompt, and make it yours.

  • Start from a type (Raw, Email, Developer, Translate) or a fully custom prompt of your own.
  • Pin a mode per app, so the same sentence comes out right in Mail and in your terminal.
  • Shape each one from a single example paragraph, and Voicy learns the rest.
One sentence, four modes you built
Raw · verbatim
send it now, gimme like ten minutes yeah

Email · cordial
I’ll have it across in ten minutes, thanks for the patience.

Developer · technical
shipping in ~10m, follow-up to come.

Custom · your prompt
The update will land within the quarter-hour; further notes to follow.
05 · The Brain

Teaches itself your words.

Names, projects, ticket IDs, internal jargon: Voicy keeps a private dictionary that grows as you use it. The third time you say "Aurelius", it knows.

  • Personal dictionary persists between machines if you sign in.
  • Per-app vocabulary so "VCY-204" only resolves in Linear.
  • Snippets: short triggers like /sig that expand into longer phrases.
Your dictionary · 312 entries · 6 apps
  • VCY-204 · Linear ticket
  • Aurelius · Project name
  • Lena Vukovic · colleague
  • Bismarckstr. · Address
  • /sig · snippet · Best, Adrian
  • Heatmap-Slide · Glossary · keep as-is
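
One way to picture per-app vocabulary and snippets is as three small lookup tables. The Python sketch below is an assumption-laden illustration, not Voicy's data model: the table names, the `resolve` function, and the spoken forms ("vcy 204", "aurelius") are hypothetical; only the written entries come from the dictionary shown above.

```python
import re

# Hypothetical tables: spoken form -> written form.
GLOBAL_TERMS = {"aurelius": "Aurelius"}
APP_TERMS = {"Linear": {"vcy 204": "VCY-204"}}   # only resolves inside Linear
SNIPPETS = {"/sig": "Best, Adrian"}

def resolve(text: str, app: str) -> str:
    """Apply global terms, then the active app's terms, then snippet triggers."""
    terms = {**GLOBAL_TERMS, **APP_TERMS.get(app, {})}
    for spoken, written in terms.items():
        pattern = r"\b" + re.escape(spoken) + r"\b"
        # A function replacement avoids treating the written form as a regex template.
        text = re.sub(pattern, lambda _m, w=written: w, text, flags=re.IGNORECASE)
    for trigger, expansion in SNIPPETS.items():
        text = text.replace(trigger, expansion)
    return text

print(resolve("update vcy 204 for aurelius", "Linear"))
print(resolve("update vcy 204 for aurelius", "Mail"))
```

The same sentence resolves the ticket ID in Linear and leaves it alone in Mail, while the project name resolves everywhere.
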
06 · Languages

12 languages. Auto-detect.

Voicy speaks the languages you do, twelve today. Auto-detect picks the right one in 300 ms, or lock a specific language per app. Pair source and output for live translation.

  • Twelve languages: DE, EN, FR, ES, IT, NL, PT, PL, SV, TR, JA, ZH.
  • Speak in German, write in English. Speak Japanese, paste it as French. Any pair.
  • Auto-detect for code-switchers: "ich brauch noch zehn minutes" gets the right call.
  • Locked-language mode for accents the engine sometimes mishears.
Spoken (DE)

Ich glaube wir sollten den Hotkey lernen lassen, anstatt ihn vorzuschreiben.

Set (EN · Slack)

I think we should let the hotkey learn itself, rather than prescribe it.

🇩🇪 DE · 🇺🇸 EN · 🇫🇷 FR · 🇪🇸 ES · 🇮🇹 IT · 🇯🇵 JA · 🇳🇱 NL · 🇵🇹 PT · 🇸🇪 SV · 🇹🇷 TR · 🇵🇱 PL · 🇨🇳 ZH
07 · Studio

Drop a recording. Get every word.

A separate part of the app for files: drag an interview, a memo, a board call. Voicy runs the same engine and hands back a transcript with speakers, timestamps, and editable segments.

  • Accepts m4a, mp3, wav, mp4, mov, m4v, and anything else QuickTime can play.
  • Speaker diarisation, configurable.
  • Export as .txt, .md, .srt, or copy a clean editorial draft.
Board-call_May-12.m4a
24.7 MB · 12:18 · 3 speakers · DE
● Transcribing · Speakers: ON · Timestamps: ON
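
For readers curious what an .srt export involves, here is a minimal Python sketch that turns timed segments into SubRip text. It is an illustration under assumptions, not Voicy's exporter: the segment tuple layout, the function names, the "Speaker: text" labelling, and the sample segments are all invented for the example.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """segments: iterable of (start_s, end_s, speaker, text) tuples (hypothetical shape)."""
    blocks = []
    for index, (start, end, speaker, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{speaker}: {text}\n"
        )
    return "\n".join(blocks)  # blank line between numbered cues

# Made-up sample segments, purely for demonstration.
sample = [
    (0.0, 4.25, "Speaker 1", "Let's start with the agenda."),
    (4.25, 9.0, "Speaker 2", "Sounds good."),
]
print(to_srt(sample))
```

Each cue is a running number, a start and end time joined by `-->`, and the text, with a blank line between cues.
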
08 · On-Device

Stays on your Mac.

The engine ships with the app, and audio never leaves your machine. There is no transcript server, no required sign-in, and no telemetry by default.

  • Runs on Apple Silicon at 4× realtime.
  • No account required. Buy with a card, get a license file.
  • Open the Network tab and watch: the app makes zero outbound calls in dictation mode.
Offline mode · ACTIVE
Engine v2.4.1 · 612 MB on disk
$ lsof -p $(pgrep Voicy) | grep -i 'tcp'
(no output: zero outbound connections)
your words stay yours. verified · since 2024