You're on a video call with a colleague in São Paulo. Your Portuguese is limited to "obrigado." Their English is functional but slow. The conversation is productive but exhausting for both of you.

Google Translate exists. Apple Translate exists. But both require typing, tapping, or holding your phone up to the screen. And both send everything through their servers.

A live translation pipeline running locally on your Mac changes this interaction completely. Speak English. Hear Portuguese. Or the reverse. Real-time, on-device, no cloud.

How the Pipeline Works

Live translation chains three models: the same pipeline as voice chat, but with a translation step in the middle.

Stage 1: Speech-to-Text. Your spoken words are transcribed. Parakeet v3 handles 25 languages — it detects which language you're speaking automatically.

Stage 2: Translation (LLM). The transcribed text is sent to a language model with a translation prompt. Modern LLMs are surprisingly good at translation — they handle idioms, context, and natural phrasing better than traditional machine translation models because they understand meaning, not just word-for-word substitution.

Stage 3: Text-to-Speech. The translated text is spoken aloud in the target language.

The result: you speak in your language, and the translation plays through your speakers (or headphones) in the target language.
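The three stages above can be sketched as a simple function chain. The stage functions below are hypothetical stubs, not ModelPiper's actual API; in a real pipeline each would call a local STT, LLM, and TTS model.

```python
def transcribe(audio: bytes) -> str:
    """Stage 1: speech-to-text (stub standing in for a local STT model)."""
    return audio.decode("utf-8")  # pretend the audio bytes "are" the transcript

def translate(text: str, target: str = "Brazilian Portuguese") -> str:
    """Stage 2: translation (stub; a real pipeline prompts a local LLM)."""
    canned = {"Thank you": "Obrigado"}
    return canned.get(text, f"[{target}] {text}")

def speak(text: str) -> bytes:
    """Stage 3: text-to-speech (stub; a real pipeline synthesizes audio)."""
    return text.encode("utf-8")

def live_translate(audio: bytes) -> bytes:
    """Chain the three stages: STT -> translation -> TTS."""
    return speak(translate(transcribe(audio)))

print(live_translate(b"Thank you"))  # b'Obrigado'
```

The point of the shape is that each stage only sees text (or audio) from the previous one, which is why swapping the translation prompt changes the whole pipeline's behavior without touching the other stages.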

The ModelPiper Workflow

Load the Live Translate template. It's pre-wired: Audio Capture → STT → LLM (translation) → TTS → Response.

The LLM block has a system prompt configured for translation. By default it translates to English, but you can change the target language by editing the prompt — "Translate the following text to Brazilian Portuguese" or whatever you need.
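A translation prompt of this kind usually has a simple chat shape: a system message that pins the task and target language, and a user message carrying the transcript. The exact wording of ModelPiper's template isn't documented here, so this is a sketch of the general pattern, not the template itself.

```python
def build_messages(transcript: str, target: str = "English") -> list[dict]:
    """Build a chat-style translation request: the system prompt sets the
    task and target language, the user turn carries the transcribed speech."""
    system = (
        f"You are a translator. Translate the following text to {target}. "
        "Reply with the translation only, no explanations."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": transcript},
    ]

msgs = build_messages("Bom dia, tudo bem?", target="English")
```

The "translation only, no explanations" instruction matters in a live pipeline: anything extra the model says would be spoken aloud by the TTS stage.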

Practical Use Cases

International business calls. Not everyone on a global team speaks the same language fluently. A local translation pipeline running alongside your video call gives you real-time support without installing third-party plugins or routing audio through cloud services.

Travel preparation. Practice conversations in a language you're learning. Speak English, hear the translation, repeat it back. The STT will transcribe your attempt so you can compare.

Content localization. Have a script or presentation in English that needs to exist in another language? Speak it through the pipeline and get both a written translation and an audio version.

Document translation with voice output. Paste text into the input block instead of using audio capture, and the pipeline translates and speaks it. Useful for reading foreign-language emails or documents aloud in a language you understand.
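The text-input variant is just the same chain minus Stage 1. A minimal sketch, again with hypothetical stubs in place of the real local models:

```python
def translate(text: str, target: str = "English") -> str:
    """Stub translator; a real pipeline would prompt a local LLM."""
    return {"Bonjour": "Hello"}.get(text, text)

def speak(text: str) -> bytes:
    """Stub TTS returning the "audio" as bytes."""
    return text.encode("utf-8")

def translate_and_speak(pasted_text: str) -> bytes:
    """Pasted text skips audio capture and STT entirely:
    translation -> TTS is the whole pipeline."""
    return speak(translate(pasted_text))
```

Because the stages are loosely coupled, dropping the front of the chain is all it takes to turn a live-speech pipeline into a read-aloud document translator.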

Why Local Translation Matters

Translation services see everything. Every sentence you translate through Google Translate is data that Google uses. For casual translations — restaurant menus, road signs — that's fine. For business communications, legal documents, or private conversations, it's not.

Local translation means the content of your conversations stays on your machine. No logs. No data retention. No third-party access.

Language Quality

A fair question: are local models as good at translation as Google Translate or DeepL?

For major language pairs — English to Spanish, French, German, Portuguese, Chinese, Japanese — a 3B-parameter model is genuinely good. It handles conversational language, technical terminology, and idiomatic expressions well. It's not perfect, and it occasionally misses nuance, but for practical communication it's more than adequate.

For less common language pairs, cloud services still have an edge because they're trained on more parallel data. But the gap is closing fast.

Try It

Download ModelPiper, install ToolPiper, and load the Live Translate template. Edit the translation prompt to your target language. Speak something, and hear the translation.

Your words stay on your Mac. The translation happens on your hardware.

This is part of a series on local-first AI workflows on macOS. Next up: Voice Cloning — replicate any voice, entirely on your Mac.