Set up your providers once. Then say "switch to my local Qwen" or "use OpenAI for this" — Claude Code calls ToolPiper's endpoint_set MCP tool and the next prompt routes through the new backend. No env vars, no shell restart, no editor restart.
In ToolPiper → Endpoints, you should see at least two entries — for example "Apple Intelligence (local)" and "OpenAI gpt-4o-mini (your key)". Both must be enabled. Open Claude Code and type /model to confirm both appear in the picker.
In Claude Code's /model picker, pick whichever you want as the default for this session. From here on, Claude Code routes through that endpoint until you say otherwise.
In the chat itself — not the slash-command picker — say one of these:
"Switch to my local Qwen one."
"Try Apple Intelligence for this."
"Use my OpenAI endpoint."
"What model am I using right now?"

Behind the scenes: Claude Code parses your sentence, calls ToolPiper's <code class="font-mono text-xs text-white/70">endpoint_set</code> MCP tool with <code class="font-mono text-xs text-white/70">scope: "session"</code>, ToolPiper updates the dev-token binding, and the next prompt routes through the new backend. No restart, no env-var change, no settings edit.
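The tool call itself is small. A sketch of the arguments, shown as a Python literal for readability; only <code class="font-mono text-xs text-white/70">endpoint_set</code> and <code class="font-mono text-xs text-white/70">scope: "session"</code> come from this page, and the other field names are assumptions:

```python
# Illustrative MCP tool call for "switch to my local Qwen one".
# Field names other than "scope" are assumptions, not ToolPiper's spec.
tool_call = {
    "name": "endpoint_set",  # ToolPiper's MCP tool
    "arguments": {
        "name": "Apple Intelligence (local)",  # human-readable endpoint name
        "scope": "session",  # binding lasts until the next launch
    },
}
```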
Hit a long-context problem? Say "I need more context for this." Claude Code calls <code class="font-mono text-xs text-white/70">endpoint_recommend</code> — ToolPiper weighs your conversation size, each endpoint's context window, and your recent latency telemetry, then auto-applies the best-fit backend and surfaces a notice you can override.
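The selection logic described above can be sketched in a few lines. This is a toy model under stated assumptions: the endpoint fields, numbers, and the exact scoring are illustrative, not ToolPiper's real algorithm; it also folds in the "local first" tiebreaker mentioned later on this page:

```python
def endpoint_recommend(endpoints, conversation_tokens):
    """Pick a backend: fit the conversation into a context window first,
    then prefer local endpoints, then lower recent latency (toy model)."""
    fits = [e for e in endpoints if e["context_window"] >= conversation_tokens]
    if not fits:
        # Nothing fits: fall back to the largest window available.
        return max(endpoints, key=lambda e: e["context_window"])
    # "Local first" tiebreaker: a fitting local endpoint beats a fitting
    # remote one; latency breaks ties after that.
    return min(fits, key=lambda e: (not e["local"], e["p50_latency_ms"]))

# Illustrative endpoint records (fields and numbers are made up).
endpoints = [
    {"name": "Apple Intelligence (local)", "local": True,
     "context_window": 8_000, "p50_latency_ms": 900},
    {"name": "OpenAI gpt-4o-mini", "local": False,
     "context_window": 128_000, "p50_latency_ms": 600},
]
```

With a small conversation both endpoints fit and the local one wins the tiebreak despite higher latency; a 50k-token conversation only fits the OpenAI endpoint, so it gets picked.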
Users don't learn API surfaces. They say "use my local one" and it works. ToolPiper makes the routing decision; the model surfaces it back in plain English.
Vanilla Anthropic doesn't know about your local Qwen. Vanilla Ollama doesn't know about your OpenAI key. ToolPiper sees both and reasons across them — context fit, capability, latency, free-vs-paid.
Your chat history, system prompt, and tool definitions stay intact across the switch. Only the inference backend changes; the model on the other side picks up where the last one left off.
You give them human-readable names ("Apple Intelligence (local)", "OpenAI gpt-4o-mini"). When you say "use my local one", Claude Code passes the name to <code class="font-mono text-xs text-white/70">endpoint_set</code> and ToolPiper matches it. If your request is ambiguous ("switch the model"), ToolPiper calls <code class="font-mono text-xs text-white/70">endpoint_list</code> and Claude Code asks which one you meant.
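One way to picture the matching, as a hedged sketch; ToolPiper's actual matcher is not public, so this substring heuristic is purely illustrative:

```python
def match_endpoint(phrase, names):
    """Toy matcher: substring-match meaningful words of the user's phrase
    against the human-readable endpoint names. Zero or multiple hits means
    the request is ambiguous and Claude Code should ask which one."""
    words = [w for w in phrase.lower().split() if len(w) > 3]
    return [n for n in names if any(w in n.lower() for w in words)]

names = ["Apple Intelligence (local)", "OpenAI gpt-4o-mini"]
match_endpoint("use my local one", names)   # one hit: unambiguous
match_endpoint("switch the model", names)   # no hit: list endpoints and ask
```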
Two safeguards. (1) ToolPiper's default endpoint is a single dropdown setting; it stays the default until you explicitly switch, so nothing changes behind your back. (2) <code class="font-mono text-xs text-white/70">endpoint_recommend</code> includes a "local first" tiebreaker: if the local model fits, it's suggested even when OpenAI would also work.
Yes. Create a per-token binding in ToolPiper → Docs → Claude Code → Tokens. Bind that token to a specific endpoint, then export it in the shell as <code class="font-mono text-xs text-white/70">ANTHROPIC_API_KEY</code> instead of using the default. Different shells, different backends, no cross-talk.
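The per-shell isolation comes down to each process carrying its own binding in its environment. A minimal sketch; the token values are placeholders, not real ToolPiper dev tokens:

```python
import os

# Each ToolPiper dev token is bound to one endpoint in
# ToolPiper -> Docs -> Claude Code -> Tokens (values below are placeholders).
local_env = {**os.environ, "ANTHROPIC_API_KEY": "tp-dev-token-local"}
openai_env = {**os.environ, "ANTHROPIC_API_KEY": "tp-dev-token-openai"}

# A Claude Code process launched with one env is pinned to that token's
# backend and never sees the other binding, e.g.:
#   subprocess.run(["claude"], env=local_env)
```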
Session-scope switches reset on the next launch (clean slate). Global-scope switches persist: say "switch to my local one for everything" and Claude Code passes <code class="font-mono text-xs text-white/70">scope: "global"</code> to <code class="font-mono text-xs text-white/70">endpoint_set</code>; ToolPiper writes it to the global "Anthropic Proxy Backend" setting, which survives restarts.
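The two scopes behave like an in-memory binding layered over a persisted setting. A toy model under stated assumptions; the class, file layout, and key names are illustrative, not ToolPiper's real internals:

```python
import json

class Binding:
    """Toy model of session vs. global endpoint scope."""

    def __init__(self, settings_path):
        self.settings_path = settings_path  # persisted "Anthropic Proxy Backend"
        self.session_endpoint = None        # in-memory: cleared on every launch

    def endpoint_set(self, endpoint, scope="session"):
        if scope == "global":
            # Global scope writes through to the persisted setting.
            with open(self.settings_path, "w") as f:
                json.dump({"anthropic_proxy_backend": endpoint}, f)
        # Either scope takes effect immediately for this session.
        self.session_endpoint = endpoint

    def active(self):
        if self.session_endpoint is not None:
            return self.session_endpoint
        try:
            with open(self.settings_path) as f:
                return json.load(f)["anthropic_proxy_backend"]
        except FileNotFoundError:
            return "default"
```

Constructing a fresh `Binding` models a relaunch: the session field starts empty, so only a global-scope switch carries over.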
ToolPiper is a free download. Configure once and Claude Code routes through your Mac.