ToolPiper is the local AI server that powers ModelPiper. It runs on your Mac, manages inference engines, and exposes an OpenAI-compatible API at localhost:9998.
ModelPiper is the web app you're looking at right now. It talks to ToolPiper, a native macOS app running in the background. ToolPiper manages the actual AI engines — llama.cpp for LLMs, FluidAudio for speech, CoreML for images, and more.
When you create a Provider in ModelPiper, you're setting up a configuration that pairs an AI model with a specific engine. For example: "Use the Llama 3.2 3B model via llama.cpp" or "Use Parakeet for speech-to-text via FluidAudio." Each provider becomes a usable endpoint on the ToolPiper API.
ModelPiper uses an internal session key to stay connected to ToolPiper — you don't need to think about that. But if you want to build your own app on top of ToolPiper, you'll need a Developer Token.
A dev token lets you use ToolPiper from your own code, just like an OpenAI API key. Create one, drop it into any OpenAI-compatible SDK, and point the base URL at localhost:9998/v1. That's it.
Drop-in replacement for the OpenAI SDK — just change the base URL and API key. Questions? @ModelPiper on X.
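As a sketch of what "just change the base URL and API key" looks like in practice, here is the same chat-completion request built in Python using only the standard library (the token and model name are placeholders; any OpenAI-compatible SDK works the same way):

```python
import json
import urllib.request

BASE_URL = "http://localhost:9998/v1"   # ToolPiper's OpenAI-compatible API
TOKEN = "tp_dev_YOUR_TOKEN"             # placeholder Developer Token

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request against ToolPiper."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("llama-3.2-3b", "Hello!")
# urllib.request.urlopen(req) sends it once ToolPiper is running locally.
print(req.full_url)
```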
curl http://localhost:9998/v1/chat/completions \
-H "Authorization: Bearer tp_dev_YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.2-3b",
"messages": [{"role": "user", "content": "Hello!"}]
}'

ToolPiper is also an MCP server. Install categories individually to control which tools your AI client sees — saves context tokens.
core       11 tools   LLM inference, TTS, STT, embeddings, OCR
analysis    8 tools   Image/text analysis, RAG, upscaling
browser    17 tools   Browser automation, scraping, assertions
testing     6 tools   PiperTest CRUD and execution
motion      5 tools   Pose estimation, stream processing
outreach   12 tools   GitHub, HN, Reddit, X, content queue
system     29 tools   macOS system actions (ActionPiper)
video      12 tools   Video creator pipeline
oauth       4 tools   OAuth connection management

See MCP docs for install commands, profiles, and full tool reference.
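Whichever categories you install, every tool is invoked the same way: MCP clients send JSON-RPC 2.0 messages as defined by the MCP spec. A sketch of a tools/call envelope (the tool name and arguments below are hypothetical placeholders, not actual ToolPiper tool names; see the MCP docs for the real tool reference):

```python
import json

# MCP requests are JSON-RPC 2.0 messages. The "tools/call" method and the
# params shape come from the MCP spec; "example_tool" and its arguments
# are hypothetical placeholders.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "example_tool",
        "arguments": {"text": "Hello!"},
    },
}
print(json.dumps(request))
```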
/v1/models                      List available models
/v1/chat/completions            Create chat completion (streaming supported)
/v1/audio/transcriptions        Transcribe audio (STT)
/v1/audio/speech                Synthesize speech (TTS), supports voice cloning
/v1/images/upscale              Upscale image 2x/4x via CoreML (returns PNG)
/v1/video/upscale               Upscale video 2x via PiperSR (base64 in/out)
/v1/video/upscale/url           Upscale video 2x from URL
/v1/video/upscale/file          Upscale video 2x using file paths (no base64 overhead)
/v1/video/upscale/{id}          Check upscale job status
/v1/video/upscale/{id}/result   Download upscaled video result (streaming)
/v1/benchmark/upscale           Run PiperSR benchmark suite (A–G)
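Video upscaling is asynchronous: submit a job, poll its status via the {id} endpoint, then download the result. A minimal Python polling sketch under stated assumptions (the status field name and its "queued"/"processing" values are assumptions, not confirmed by the API reference, and the token is a placeholder):

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:9998/v1"
TOKEN = "tp_dev_YOUR_TOKEN"  # placeholder Developer Token

def job_status_url(job_id: str) -> str:
    """URL for checking an upscale job's status."""
    return f"{BASE_URL}/video/upscale/{job_id}"

def job_result_url(job_id: str) -> str:
    """URL for downloading the finished upscaled video."""
    return f"{BASE_URL}/video/upscale/{job_id}/result"

def poll_until_done(job_id: str, interval: float = 2.0) -> bytes:
    """Poll the status endpoint, then stream the result.

    Assumes the status payload has a "status" field with in-progress
    values like "queued"/"processing"; adjust to the real schema.
    """
    headers = {"Authorization": f"Bearer {TOKEN}"}
    while True:
        req = urllib.request.Request(job_status_url(job_id), headers=headers)
        with urllib.request.urlopen(req) as resp:
            status = json.load(resp).get("status")
        if status not in ("queued", "processing"):
            break
        time.sleep(interval)
    req = urllib.request.Request(job_result_url(job_id), headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read()

print(job_status_url("abc123"))
```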