Under the hood

AI models in Vowen

Vowen runs best-in-class speech models — on your device for privacy and offline use, or in the cloud for maximum speed. Pick the one that fits the job.

How to read this

Two kinds of models, one app

On-device models run entirely on your computer. Your audio never leaves the machine and you don't need an internet connection — ideal for private, offline, or regulated work. Cloud models send audio to a provider for the fastest possible turnaround or a specific accuracy profile. Vowen supports both, and you choose per situation.

On-device

Private, offline transcription

Model | Provider | Runs | Best for
Parakeet TDT v3 | NVIDIA | On-device | Fastest on-device transcription; great real-time dictation on Apple Silicon.
Parakeet TDT v2 | NVIDIA | On-device | Proven, low-latency model for live voice typing.
Parakeet CTC 0.6B | NVIDIA | On-device | Lightweight rescorer used for custom-vocabulary accuracy.
Whisper Large v3 | OpenAI | On-device | Highest on-device accuracy across 100+ languages.
Whisper Large v3 Turbo | OpenAI | On-device | Near-large accuracy at a fraction of the compute.
Whisper (small / medium) | OpenAI | On-device | Smaller footprints for older or lower-RAM machines.

Cloud

Fastest turnaround, by choice

Model | Provider | Runs | Best for
Whisper Large v3 / Turbo | Groq | Cloud | Extremely fast cloud transcription via Groq's LPU.
Nova-3 / Nova-2 | Deepgram | Cloud | Fast, accurate streaming with strong punctuation.
Scribe | ElevenLabs | Cloud | High-accuracy transcription with robust formatting.
Ink-Whisper | Cartesia | Cloud | Low-latency real-time streaming for live dictation.
Soniox | Soniox | Cloud | Multilingual transcription with speaker context.
Voxtral | Mistral | Cloud | Open-weight model with strong multilingual coverage.
gpt-4o-transcribe | OpenAI | Cloud | Frontier-model accuracy for tough audio.

On-device by default, cloud by choice

Vowen defaults to processing on your machine. If you never want audio to leave your device, stay on an on-device model and you're fully offline: no account, no upload, nothing stored on a server.

Support

Common questions.

What's the difference between Parakeet and Whisper?
Both run on your device. Parakeet (from NVIDIA) is optimized for speed and excels at real-time dictation, especially on Apple Silicon. Whisper (from OpenAI) tends to be the most accurate across a very wide range of languages and accents. Vowen lets you pick based on what you value most.
Do I need an internet connection?
No. With an on-device model (Parakeet or Whisper), Vowen transcribes entirely on your machine — no internet, and your audio never leaves the device. Cloud models are optional, for when you want maximum speed or a specific provider.
Which model is the most accurate?
For most languages, Whisper Large v3 on-device and frontier cloud models like gpt-4o-transcribe are at the top for accuracy. For speed, Parakeet on-device and Groq's Whisper in the cloud are hard to beat. The best choice depends on your hardware and whether you need offline.
Are cloud models private?
If you choose a cloud model, audio is sent to that provider only to produce the transcript. On-device models keep everything on your machine, so for sensitive or regulated work, use an on-device model.
Can I switch models per task?
Yes. You can choose different models for different situations — a fast on-device model for quick dictation, or a higher-accuracy model for important transcripts.
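
For readers who like to see the decision spelled out, here is a minimal sketch of the "pick per situation" logic described in the answers above. It is an illustration only, not Vowen's implementation or API: the `Model` class, the `CATALOG` list, the `pick_model` function, and the one-word `strength` label are all hypothetical names introduced for this example, and the catalog covers only the four models the accuracy answer singles out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    runs: str      # "on-device" or "cloud"
    strength: str  # "speed" or "accuracy" (a simplification for this sketch)


# Hypothetical catalog: a subset of the models listed on this page,
# labeled according to the "most accurate" answer above.
CATALOG = [
    Model("Parakeet TDT v3", "NVIDIA", "on-device", "speed"),
    Model("Whisper Large v3", "OpenAI", "on-device", "accuracy"),
    Model("Whisper Large v3 / Turbo", "Groq", "cloud", "speed"),
    Model("gpt-4o-transcribe", "OpenAI", "cloud", "accuracy"),
]


def pick_model(priority: str, allow_cloud: bool = False) -> Model:
    """Return the first catalog entry matching the priority and location.

    allow_cloud defaults to False, mirroring "on-device by default,
    cloud by choice": audio stays local unless the caller opts in.
    """
    if priority not in ("speed", "accuracy"):
        raise ValueError("priority must be 'speed' or 'accuracy'")
    wanted_runs = "cloud" if allow_cloud else "on-device"
    for model in CATALOG:
        if model.strength == priority and model.runs == wanted_runs:
            return model
    raise LookupError(f"no {wanted_runs} model for priority {priority!r}")


if __name__ == "__main__":
    # Quick dictation, fully offline.
    print(pick_model("speed").name)
    # An important transcript where sending audio to a provider is acceptable.
    print(pick_model("accuracy", allow_cloud=True).name)
```

Keeping `allow_cloud` off unless explicitly set means a caller has to opt in before any cloud model is chosen, which matches the privacy stance described on this page.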

One app, every model.

Run speech recognition your way — on-device or in the cloud. Free tier that doesn't expire.