---
title: "Live Translation on Mac: Speak One Language, Hear Another"
description: "Real-time speech translation running entirely on your Mac - speak English, hear Portuguese. No cloud, no Google Translate, no data leaving your machine."
date: 2026-03-10
author: "Ben Racicot"
tags: ["Translation", "Speech to Text", "Text Generation", "Text to Speech", "Privacy", "macOS", "Multilingual"]
type: "article"
canonical: "https://modelpiper.com/blog/live-translation-mac-local/"
---

# Live Translation on Mac: Speak One Language, Hear Another

> Real-time speech translation running entirely on your Mac - speak English, hear Portuguese. No cloud, no Google Translate, no data leaving your machine.

## TL;DR

Speak in one language, hear the translation in another - all running on your Mac. ToolPiper chains speech-to-text, an LLM translator, and text-to-speech into a real-time local translation pipeline. No Google Translate, no cloud, no conversation data leaving your machine.

You're on a video call with a colleague in São Paulo. Your Portuguese is limited to "obrigado." Their English is functional but slow. The conversation is productive but exhausting for both of you.

Google Translate exists. Apple Translate exists. But both require typing, tapping, or holding your phone up to the screen. And both send everything through their servers.

A live translation pipeline running locally on your Mac changes this interaction completely. Speak English. Hear Portuguese. Or the reverse. Real-time, on-device, no cloud.

## How does the live translation pipeline work?

Live translation chains three models. It's the same structure as a voice chat pipeline, but with a translation step in the middle.

**Stage 1: Speech-to-Text.** Your spoken words are transcribed. Parakeet v3 handles 25 European languages - it detects which language you're speaking automatically.

**Stage 2: Translation (LLM).** The transcribed text is sent to a language model with a translation prompt. **Modern LLMs are surprisingly good at translation - they handle idioms, context, and natural phrasing better than traditional machine translation models** because they understand meaning, not just word-for-word substitution.

**Stage 3: Text-to-Speech.** The translated text is spoken aloud in the target language.

The result: you speak in your language, and the translation plays through your speakers (or headphones) in the target language.
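The three stages compose into a single speech-to-speech function. Here's a minimal Python sketch of that data flow; the model calls are hypothetical stubs standing in for the real on-device models that ModelPiper wires together, so the focus is on how the stages hand data to each other:

```python
# Sketch of the three-stage translation pipeline.
# speech_to_text, translate, and text_to_speech are placeholder
# stubs -- in ModelPiper these are real on-device models.

def speech_to_text(audio: bytes) -> tuple[str, str]:
    """Transcribe audio and auto-detect the source language (stub)."""
    return "Hello, how are you?", "en"

def translate(text: str, target_lang: str) -> str:
    """Run the transcript through an LLM with a translation prompt (stub)."""
    prompt = f"Translate the following text to {target_lang}:\n{text}"
    # A real pipeline would feed `prompt` to the loaded language model.
    return "Olá, como vai você?"  # placeholder model output

def text_to_speech(text: str) -> bytes:
    """Synthesize the translated text as audio (stub)."""
    return text.encode("utf-8")  # placeholder waveform

def translate_speech(audio: bytes, target_lang: str) -> bytes:
    """Stage 1 -> Stage 2 -> Stage 3: speech in, translated speech out."""
    text, source_lang = speech_to_text(audio)
    translated = translate(text, target_lang)
    return text_to_speech(translated)
```

Each stage only needs the previous stage's output, which is why the pipeline works as a simple chain: no stage has to know how the others are implemented.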

## How do you set up live translation in ModelPiper?

Load the **Live Translate** template. It's pre-wired: Audio Capture → STT → LLM (translation) → TTS → Response.

The LLM block has a system prompt configured for translation. By default, it translates to English, but you can change the target language by editing the prompt - "Translate the following text to Brazilian Portuguese," or whatever you need.
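A translation prompt can be a single sentence, but constraining the output keeps the pipeline clean - you don't want the LLM's commentary read aloud by the TTS stage. A hypothetical example (not the template's exact wording):

```text
You are a translator. Translate the user's text to Brazilian Portuguese.
Output only the translation - no explanations, no notes, no quotation marks.
```

The "output only the translation" line matters because everything the LLM emits flows straight into text-to-speech.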

## What can you use local live translation for?

**International business calls.** Not everyone on a global team speaks the same language fluently. A local translation pipeline running alongside your video call gives you real-time support without installing third-party plugins or routing audio through cloud services.

**Travel preparation.** Practice conversations in a language you're learning. Speak English, hear the translation, repeat it back. The STT will transcribe your attempt so you can compare.

**Content localization.** Have a script or presentation in English that needs to exist in another language? Speak it through the pipeline and get both a written translation and an audio version.

**Document translation with voice output.** Paste text into the input block instead of using audio capture, and the pipeline translates and speaks it. Useful for reading foreign-language emails or documents aloud in a language you understand.

## Why does privacy matter for translation?

Translation services see everything. Every sentence you translate through Google Translate passes through Google's servers and is subject to its data policies. For casual translations - restaurant menus, road signs - that's fine. For business communications, legal documents, or private conversations, it's not.

**Local translation means the content of your conversations stays on your machine. No logs. No data retention. No third-party access.**

## How does local translation quality compare to Google Translate?

A fair question: are local models as good at translation as Google Translate or DeepL?

For major language pairs - English to Spanish, French, German, Portuguese, Chinese, Japanese - a 3B-parameter model is genuinely good. It handles conversational language, technical terminology, and idiomatic expressions well. It's not perfect, and it occasionally misses nuance, but for practical communication it's more than adequate.

For less common language pairs, cloud services still have an edge because they're trained on more parallel data. But the gap is closing fast.

## Try It

Download [ModelPiper](https://modelpiper.com), install ToolPiper, and load the Live Translate template. Edit the translation prompt to your target language. Speak something, and hear the translation.

Your words stay on your Mac. The translation happens on your hardware.

_This is part of a series on [local-first AI workflows on macOS](/blog/local-first-ai-macos). Next up: [Voice Cloning](/blog/voice-cloning-mac-local) - replicate any voice, entirely on your Mac._

## Steps

### 1. Install ToolPiper and download a model

Install ToolPiper from modelpiper.com/download. The starter model downloads automatically. For better translation quality, download a 3B+ language model from the model browser.

### 2. Open the Live Translate template

Load the Live Translate template in ModelPiper. It pre-wires Audio Capture → STT → LLM (translation) → TTS → Response. The pipeline handles the full speech-to-speech translation loop.

### 3. Set your target language

Edit the LLM block's system prompt to specify your target language - for example, "Translate the following text to Brazilian Portuguese." The STT engine auto-detects the source language from your speech.

### 4. Speak and hear the translation

Hit record and speak in your language. The pipeline transcribes your speech, translates it, and speaks the translation aloud. The full text of both the original and translation appears in the pipeline output.

## FAQ

### How does local translation quality compare to Google Translate?

For major language pairs - English to Spanish, French, German, Portuguese, Chinese, Japanese - a 3B-parameter local model is genuinely good. It handles conversational language, technical terminology, and idiomatic expressions well. For less common language pairs, Google Translate and DeepL still have an edge due to more training data. The gap is closing fast.

### Can I use live translation during a video call?

Yes. Run the Live Translate template alongside your video call. Speak into your Mac's microphone, and the pipeline transcribes, translates, and speaks the translation through your speakers or headphones. For the other party's speech, you'd need to route their audio into the pipeline (requires AudioPiper or a virtual audio device).

### What languages does local live translation support?

The STT engine (Parakeet v3) supports 25 European languages with automatic detection. The LLM can translate between any language pair it was trained on, which includes most major world languages. TTS voice quality varies by language - English is the most polished. Output quality depends on how well the specific LLM handles the target language.

### Can I translate text without voice input?

Yes. Paste text directly into the input block instead of using audio capture, and the pipeline translates and optionally speaks it. This is useful for reading foreign-language emails, documents, or web content aloud in a language you understand.
