Type "best AI chat app for Mac" into a search box and you get a list: BoltAI, Msty, Jan, LM Studio, a few more. Every screenshot looks the same. A sidebar of conversations, a model picker, a text box waiting for a question. And the screenshots are hiding the only distinction that matters.

Two kinds of software live in that lineup, and they photograph identically. One kind is an interface to intelligence that lives somewhere else. The other kind runs the intelligence. A chat window is one feature of an AI stack. The engine is the stack. Asking which chat app is best, before you've decided what runs the models, is choosing curtains before the house.

One distinction sorts the whole category, and once you see it, every comparison list on the internet reads differently - including ours. The product-by-product matrix (five tools, fifteen rows) lives in our platforms roundup. This page is about the two shapes.

What's the difference between an AI chat client and a local AI engine?

A chat client is an interface to models that run somewhere else. You bring API keys for cloud providers, or you point it at a local runner you installed separately. A local AI engine runs the models itself: it downloads the weights, does inference on your Mac's silicon, and serves a local API that other apps, including chat clients, connect to.

BoltAI is the clean example of a client. It's one of the best-built indie apps on the Mac - fully native, fast, refined over years - and it brings API keys to ten-plus cloud providers or points at a local server like Ollama or LM Studio, because it doesn't run models itself. For local AI, BoltAI's own docs walk you through installing a runner first. The client is the product. The models are your problem.

Msty sits a step closer to the metal without crossing the line. It manages engines rather than building one. The default local backend is a bundled, renamed Ollama, and it can also supervise llama.cpp and MLX services it runs in the background. The engine-management UX is genuinely good. It's still management, though. Someone else wrote every engine in the app.

On the engine side of the line, LM Studio bundles two engines, llama.cpp and Apple's MLX, behind the best model-runner GUI in the category. ToolPiper embeds upstream llama-server directly - build b9533, the same engine llama.cpp ships - and serves an OpenAI-compatible API on localhost:9998, with chat, voice, and tool surfaces built around it.

And then there's Jan, which blurs the line and earns credit for it. Jan bundles llama.cpp directly, so it's an engine and a chat app in one binary - the honest counterexample to a clean split. It's also the real open-source option in this lineup: Apache 2.0, free, no paid tier of any kind. Its gap is scope rather than architecture. No voice, MCP client only, no automation. Jan proves the categories can merge. It just hasn't gone far past chat once they do.

Why is "best AI chat app" the wrong first question?

Because every chat client needs something to talk to, and that something - a cloud provider or a local engine - decides your privacy, your monthly cost, and what works offline. Clients are swappable in minutes. The engine choice shapes everything downstream of it.

Look at what each layer actually controls. The client controls conversation folders, prompt libraries, keyboard shortcuts, how the markdown renders. The engine controls whether your prompts leave the machine, whether a heavy month adds another API bill or costs nothing, whether anything answers on a plane, and which models exist for the client to show in its picker. One of those lists is worth agonizing over. It isn't the first one.

Swappability runs the same direction. Every client here speaks OpenAI-compatible APIs, so moving between chat windows is an afternoon of re-pointing an endpoint and exporting some conversations. The engine is the decision with weight. It's where your model files live, what your other tools connect to, and in ToolPiper's case, what your coding agents call for tools. You can change curtains every season. Nobody re-pours a foundation because a nicer window treatment came out.

Which is why best-AI-chat-app lists read so much alike. They rank the window while the part that matters gets outsourced to "whatever you have running."

What still works when you unplug the internet?

Disconnect Wi-Fi, delete your API keys, and watch what survives. That's the cleanest test of which layer an app lives on: an engine keeps chatting, transcribing, and serving its local API. A client with no keys and no local runner is an empty window.

Run the lineup through it. BoltAI without API keys or a separate runner is an empty shell - that's not a knock, it's the design. Msty's free tier keeps chatting, because there's a bundled engine inside, but voice goes dark. Audio transcription in Msty rides your OpenAI key, and no offline speech path is documented. LM Studio keeps chat, document chat, and its local server running - there's no voice to test, since LM Studio Transcribe was still an unreleased waitlist page as of June 2026. Jan keeps working. ToolPiper keeps the most running - chat, transcription on the Neural Engine, visual pipelines, and the MCP server with over 300 tools across 26 macOS domains - because zero outbound calls is the architecture, not a setting.

The table is the category version of the same test.

When is a dedicated chat client worth adding?

Add a client on top of your engine when your daily work is cloud-API chat across several providers, or when a client feature is the best of its kind. BoltAI's AI Command palette and Msty's Split Chats both clear that bar.

This is where the argument has to be generous, because the good clients are genuinely good. AI Command is BoltAI's case in one feature. Select text in any app, hit a hotkey, pick from 38+ commands or type a request, get the result in place. For ad-hoc work on a highlighted paragraph it's the better affordance than anything in our own stack, full stop. The provider matrix is the other case - ten-plus clouds in one window, and it grows fast because aggregating providers is BoltAI's core business. It's $79 to $99 one-time with a year of updates and an optional renewal at 60% of the purchase price after that.

Msty's flagship is Split Chats - the same prompt running against multiple models side by side - and it's best in class. ToolPiper doesn't match it. The free desktop tier is real, needs no account, and the paid Aurum tier is $149 per user per year or $349 lifetime as of June 2026.

Notice what you're buying in both cases, though. A better window. The window assumes the house is already there - BoltAI by pointing at your keys or your runner, Msty by bundling someone else's engines under management. Worth paying for when one of those features is your daily verb. Not the place to start.

What order should you buy in?

Engine first, client second, API keys last. Pick the thing that runs the models, live with its built-in chat for a week, and only then decide whether a dedicated client earns a place on top of it.

Step one, pick the engine. LM Studio if model discovery and tuning is the priority and you want the best cross-platform model browser. It's free for home and work, with the caveats that it's an MCP client only and has no voice. Jan if open source is your filter - you can read the code, and there's nothing to pay, ever. ToolPiper if you want the engine and the platform in one app. The runner is free with no account: embedded upstream llama-server b9533, unlimited GGUF downloads stored as plain files any tool can load, the OpenAI-compatible API on localhost:9998. So are chat, transcription, pipelines, and the MCP server that hands Claude Code or Cursor 300+ tools on your Mac. Pro is $10/month for push-to-talk dictation, text-to-speech, local RAG, and all nine backends. Studio ($29) adds media tools, Max ($49) adds dev tools.

Step two, live with the engine's own chat. A week of real work answers most of the question. You'll either notice nothing missing, which is the common outcome, or you'll notice a specific absence with a name - a palette over selected text, side-by-side models, one particular provider.

Step three, add the client that fixes the named absence. AI Command means BoltAI. Split Chats means Msty. Either one points at the engine you already chose, because everything here speaks the same OpenAI-compatible protocol. That's the right relationship between the layers: the engine is the commitment, the client is the part you can change your mind about over lunch.

Download ToolPiper at modelpiper.com/download - free, no account, a local model answering in about a minute.

See also the BoltAI head-to-head, our explainer on what a local AI engine is, and the full platform roundup.