ToolPiper API
ToolPiper is the local AI server that powers ModelPiper. It runs on your Mac, manages inference engines, and exposes an OpenAI-compatible API at localhost:9998.
http://localhost:9998/v1
Spec: GET /v1/openapi.json
How It Works
ModelPiper is the web app you're looking at right now. It talks to ToolPiper, a native macOS app running in the background. ToolPiper manages the actual AI engines — llama.cpp for LLMs, FluidAudio for speech, CoreML for images, and more.
When you create a Provider in ModelPiper, you're setting up a configuration that pairs an AI model with a specific engine. For example: "Use the Llama 3.2 3B model via llama.cpp" or "Use Parakeet for speech-to-text via FluidAudio". Each provider becomes a usable endpoint on the ToolPiper API.
ModelPiper uses an internal session key to stay connected to ToolPiper — you don't need to think about that. But if you want to build your own app on top of ToolPiper, you'll need a Developer Token.
Developer Tokens
A dev token lets you use ToolPiper from your own code, just like an OpenAI API key. Create one, drop it into any OpenAI-compatible SDK, and point the base URL at localhost:9998/v1. That's it.
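As a rough illustration of the flow above, here is a minimal Python sketch that builds the same request as the Quick Start curl call using only the standard library. The token `tp_YOUR_TOKEN` and the model name `llama-3.2-3b` are placeholders taken from that curl example; the helper names `build_chat_request` and `send` are hypothetical, and the response shape is assumed to follow the OpenAI chat completions format.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9998/v1"  # base URL stated in this document
DEV_TOKEN = "tp_YOUR_TOKEN"            # placeholder: replace with a real dev token


def build_chat_request(prompt, model="llama-3.2-3b", token=DEV_TOKEN, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,  # placeholder model name from the Quick Start example
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (needs ToolPiper running)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With ToolPiper running, `send(build_chat_request("Hello!"))` should return a dictionary; if the response follows the OpenAI format, the reply text is at `["choices"][0]["message"]["content"]`.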
Claude Code Zero-config
Wire ToolPiper into Claude Code with one click. We generate a dev token, write ~/.claude/settings.json, and register ToolPiper as an MCP server. Claude Code's /model picker shows every endpoint you've configured here.
New to all of this? See the comparison vs Ollama / vLLM / LM Studio.
Anthropic Proxy Backend Phase 1 plumbing
ToolPiper will expose POST /v1/messages as an Anthropic-shape proxy so Claude Code (and any Anthropic-compatible client) can use any provider you've configured. Pick the global backend below, or bind a specific provider per token in the table above.
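For orientation, an Anthropic-shape request might look like the sketch below. It follows the public Anthropic Messages format (`model`, `max_tokens`, `messages`); whether ToolPiper expects a Bearer dev token or an `x-api-key` header on this route, and whether it reads the `anthropic-version` header, are assumptions here. The helper names are hypothetical.

```python
import json
import urllib.request


def build_messages_request(prompt, model, token, max_tokens=256,
                           base_url="http://localhost:9998"):
    """Build (but do not send) an Anthropic Messages-style request."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required field in the Anthropic Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: dev token sent as Bearer
            "anthropic-version": "2023-06-01",   # assumption: proxy accepts this header
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_text(response):
    """Join the text blocks of an Anthropic Messages response dictionary."""
    return "".join(
        block["text"] for block in response.get("content", [])
        if block.get("type") == "text"
    )
```

`extract_text` skips non-text blocks such as `tool_use`, so it only returns what the model said in plain text.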
Quick Start
Drop-in replacement for the OpenAI API: keep using the OpenAI SDK and change only the base URL and API key. Questions? @ModelPiper on X.
curl http://localhost:9998/v1/chat/completions \
  -H "Authorization: Bearer tp_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-3b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
MCP Server 300+ tools
ToolPiper is also an MCP server. Install categories individually to control which tools your AI client sees — saves context tokens.
- core (12): LLM inference, TTS, STT, embeddings, OCR
- analysis (8): Image/text analysis, RAG, upscaling
- browser (19): Browser automation, scraping, assertions
- testing (6): PiperTest CRUD and execution
- motion (5): Pose estimation, stream processing
- outreach (12): GitHub, HN, Reddit, X, content queue
- system (29): macOS system actions
- video (17): Video creator pipeline
- oauth (4): OAuth connection management
- sieve (4): MediaPiper sieve cache inspection
- filesystem (13): File read/write, git operations, shell commands
See MCP docs for install commands, profiles, and full tool reference.
Endpoints
Inference
OpenAI-compatible inference endpoints
- GET /v1/models: List available models
- POST /v1/chat/completions: Create chat completion
- POST /v1/embeddings: Create embedding (OpenAI-compatible)
Audio
Speech-to-text and text-to-speech
- POST /v1/audio/transcriptions: Transcribe audio
- POST /v1/audio/speech: Text-to-speech
- POST /v1/audio/record: Record an audio clip
Image
Image processing (upscale)
- POST /v1/images/upscale: Upscale image
- POST /v1/video/upscale: Upscale video 2x
- POST /v1/video/upscale/file: Upscale video 2x (file paths)
- POST /v1/benchmark/upscale: Run PiperSR benchmark suite
- POST /v1/images/upscale/file: Upscale an image file on disk
RAG
Retrieval-Augmented Generation — collections, ingestion, and semantic search
- GET /v1/rag/collections: List RAG collections
- POST /v1/rag/collections: Create a RAG collection
- GET /v1/rag/collections/{collectionId}: Get a collection
- PUT /v1/rag/collections/{collectionId}: Update a collection
- DELETE /v1/rag/collections/{collectionId}: Delete a collection
- GET /v1/rag/collections/{collectionId}/chunks: List chunks in a collection
- POST /v1/rag/collections/{collectionId}/ingest: Start document ingestion
- GET /v1/rag/collections/{collectionId}/ingest/status: Get ingestion progress
- POST /v1/rag/collections/{collectionId}/ingest/cancel: Cancel active ingestion
- POST /v1/rag/query: Semantic search across collections
- POST /v1/rag/browse-folder: Open native folder picker
Cloud Proxy
Keychain-backed cloud API proxy (Pro feature)
- POST /v1/cloud/proxy: Cloud API proxy (Pro)
- GET /v1/cloud/api-key: Load cloud API key from Keychain (Pro)
- POST /v1/cloud/api-key: Save cloud API key to Keychain (Pro)
- DELETE /v1/cloud/api-key: Delete cloud API key from Keychain (Pro)
Models
Model management, downloads, and HuggingFace integration
- GET /v1/models/installed: List downloaded models
- GET /v1/models/{modelId}: Get model details
- DELETE /v1/models/{modelId}: Delete a model
- GET /v1/models/storage: Disk usage for models
- GET /v1/models/search: Search HuggingFace for models
- POST /v1/models/download: Download a model from HuggingFace
- GET /v1/models/downloads: List active downloads
- DELETE /v1/models/downloads/{downloadId}: Cancel a download
- POST /v1/models/scan: Scan for new models
- GET /v1/models/hf/{owner}/{repo}/files: List files in a HuggingFace repo
Model Configs
Curated model presets with availability status
- GET /v1/model-configs: List model presets
- POST /v1/model-configs/install: Install a model by preset ID
Engine
Inference engine control and model state
- GET /v1/engine/status: Engine and backend status
- POST /v1/engine/load: Load a model into the engine
- POST /v1/engine/unload: Unload a model or stop the engine
- GET /v1/models/state: Per-model runtime states
- POST /v1/models/reload: Reload a llama-server-backed model
Tokens
Developer token management (all tiers; requires the tokensManage scope)
- GET /v1/tokens: List developer tokens
- POST /v1/tokens: Create a developer token (all tiers; requires the tokensManage scope)
- PATCH /v1/tokens/{tokenId}: Update a developer token's metadata
- DELETE /v1/tokens/{tokenId}: Revoke a developer token
Apple
Apple-native framework tools (Vision, NLP) — no model downloads required
- POST /v1/apple/ocr: Recognize text in an image
- GET /v1/apple/ocr/languages: List supported OCR languages
- POST /v1/apple/barcode: Detect barcodes and QR codes
- POST /v1/apple/classify: Classify image content
- POST /v1/apple/face-detect: Detect faces
- POST /v1/apple/saliency: Detect salient regions
- POST /v1/apple/rectangles: Detect rectangles
- POST /v1/apple/feature-print: Generate image feature vector
- POST /v1/apple/body-pose: Detect human body poses
- POST /v1/apple/hand-pose: Detect hand poses
- POST /v1/apple/animals: Detect animals
- POST /v1/apple/horizon: Detect horizon angle
- POST /v1/apple/document: Detect document boundaries
- POST /v1/apple/nlp/language: Detect language
- POST /v1/apple/nlp/sentiment: Analyze sentiment
- POST /v1/apple/nlp/entities: Named entity recognition
- POST /v1/apple/nlp/tokenize: Tokenize text
- POST /v1/apple/nlp/lemmatize: Lemmatize text
- POST /v1/apple/nlp/pos: Part-of-speech tagging
System
Health checks, resource monitoring, events, and licensing
- GET /status: Health check
- GET /session-token: Obtain ambient bearer token
- GET /v1/permissions: macOS permission snapshot
- GET /v1/system/resources: GPU, RAM, and ANE utilization
- GET /v1/events: SSE event stream
- GET /v1/license: Subscription tier and features
- GET /v1/openapi.json: OpenAPI specification
- GET /v1/audit/tools: Tool dispatch audit log
- GET /v1/audit/tools/export: Incremental audit export
- GET /v1/snippets/match/status: Snippet match diagnostics
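The /v1/events endpoint is described as an SSE event stream. As a hedged sketch of how a client could consume such a stream, the parser below implements the standard Server-Sent Events line format (`event:` and `data:` fields, blank line as a separator, `:` for comments). The event names and payloads ToolPiper actually emits are not documented here, so any names used with this helper are hypothetical.

```python
def parse_sse(lines):
    """Yield (event_name, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch only if data was seen.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space after the colon
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

To read a live stream, one option is to open the URL with `urllib.request.urlopen` and pass `(chunk.decode("utf-8") for chunk in response)` to `parse_sse`, since iterating an HTTP response yields one line at a time.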
Configurations
Endpoint configuration management
- GET /v1/configurations: List endpoint configurations
- POST /v1/configurations: Create an endpoint configuration
- PUT /v1/configurations/{configId}: Update an endpoint configuration
- DELETE /v1/configurations/{configId}: Delete an endpoint configuration
Logs
Log ingestion, querying, and real-time streaming
- GET /v1/logs: Query log entries
- POST /v1/logs: Ingest log entries
- GET /v1/logs/stream: Real-time log stream (SSE)
- POST /v1/logs/clear: Clear all log entries
- POST /v1/logs/export: Export logs to file
Templates
Workflow templates
- GET /v1/workflow-templates: List workflow templates
Browser
CDP-based browser automation
- GET /v1/browser/status: Browser connection status
- POST /v1/browser/connect: Connect to a browser via CDP
- POST /v1/browser/disconnect: Disconnect from browser
- GET /v1/browser/pages: List open browser pages
- POST /v1/browser/select-page: Select a browser page
- GET /v1/browser/snapshot: Accessibility tree snapshot
- GET /v1/browser/screenshot: Page screenshot
- GET /v1/browser/console: Console messages
- POST /v1/browser/record/start: Start recording user actions
- POST /v1/browser/record/stop: Stop recording user actions
- GET /v1/browser/record/stream: Recording event stream (SSE)
- POST /v1/browser/network/enable: Enable network logging
- POST /v1/browser/network/disable: Disable network logging
- GET /v1/browser/network: List captured network entries
- DELETE /v1/browser/network: Clear captured network entries
- GET /v1/browser/network/status: Network logging status
- GET /v1/browser/network/{requestId}/body: Get response body for a captured request
- POST /v1/browser/trace/start: Start performance trace
- POST /v1/browser/trace/stop: Stop performance trace
- GET /v1/browser/metrics: Get current performance metrics
- GET /v1/browser/trace/status: Trace status
- POST /v1/browser/intercept/enable: Enable network interception
- POST /v1/browser/intercept/disable: Disable network interception
- GET /v1/browser/intercept/status: Interception status
- GET /v1/browser/mocks: List all mock rules
- POST /v1/browser/mocks: Create a mock rule
- DELETE /v1/browser/mocks: Delete all mock rules
- PUT /v1/browser/mocks/{id}: Update a mock rule
- DELETE /v1/browser/mocks/{id}: Delete a mock rule
- POST /v1/browser/coverage/start: Start code coverage collection
- POST /v1/browser/coverage/stop: Stop code coverage and get report
- GET /v1/browser/coverage/status: Coverage status
- GET /v1/browser/storage: Read browser storage
- DELETE /v1/browser/storage: Clear browser storage
- POST /v1/browser/storage/cookie: Set a cookie
- POST /v1/browser/storage/local: Set a localStorage item
- POST /v1/browser/storage/session: Set a sessionStorage item
- POST /v1/browser/webauthn/enable: Enable virtual authenticator
- POST /v1/browser/webauthn/disable: Disable virtual authenticator
- GET /v1/browser/webauthn/status: WebAuthn status
- GET /v1/browser/webauthn/credentials: List virtual credentials
- DELETE /v1/browser/webauthn/credentials/{credentialId}: Delete a virtual credential
- POST /v1/browser/webauthn/verify: Set user verification state
- POST /v1/browser/autofill/credit-card: Trigger credit card autofill
- POST /v1/browser/autofill/address: Trigger address autofill
- GET /v1/browser/full-status: Extended browser status with pages and channels
- POST /v1/browser/action: Perform a browser action (click, fill, navigate, etc.)
- POST /v1/browser/assert: Perform a browser assertion
- POST /v1/browser/heal: Heal a broken selector
- POST /v1/browser/evaluate: Execute JavaScript in the browser
- POST /v1/browser/resize: Resize browser viewport
- POST /v1/browser/dialog: Handle a JavaScript dialog (alert, confirm, prompt)
- POST /v1/browser/close-tab: Close a browser tab
- POST /v1/browser/new-tab: Open a new browser tab
- GET /v1/browser/channels: List available Chrome channels (dev, canary, stable)
- GET /v1/browser/discover: List API discovery results
- POST /v1/browser/discover: Discover API endpoints from captured network traffic
- GET /v1/browser/discover/{mapId}: Get a single API discovery result
- DELETE /v1/browser/discover/{mapId}: Delete an API discovery result
- POST /v1/browser/replay: Replay a discovered API call
Testing
PiperTest — visual test session management and execution
- GET /v1/test-sessions: List all test sessions
- POST /v1/test-sessions: Create a new test session
- POST /v1/test-sessions/import: Import a test session from exported format
- GET /v1/test-sessions/{sessionId}: Get a test session by ID
- PUT /v1/test-sessions/{sessionId}: Update a test session
- DELETE /v1/test-sessions/{sessionId}: Delete a test session
- POST /v1/test-sessions/{sessionId}/run: Run a saved test session
- POST /v1/test-sessions/{sessionId}/run/cancel: Cancel an in-progress test run
- POST /v1/test-sessions/{sessionId}/export: Export a test session to Playwright or Cypress code
- POST /v1/tests/run: Run inline test steps (no saved session required)
- POST /v1/browser/probe/scan: Scan the current page for interactive elements (PiperProbe)
- GET /v1/test-sessions/{sessionId}/probe: Get probe results for a test session
- POST /v1/test-sessions/{sessionId}/coverage: Compute combined coverage for a test session
Pose
Human pose estimation via Apple Vision / CoreML, including on-demand real-time 60fps skeleton streaming via WebSocket
- POST /v1/pose/detect: Detect human poses in an image
- GET /v1/pose/formats: List available pose output formats
- GET /v1/pose/models: List available pose detection models
- POST /v1/pose/stream/start: Start the pose stream WebSocket server
- POST /v1/pose/stream/stop: Stop the pose stream WebSocket server
- GET /v1/pose/stream/status: Get pose stream WebSocket status
- GET /v1/pose/stream: Real-time pose streaming (WebSocket, on-demand)
Stream
Real-time stream processing
- POST /v1/stream/start: Start a stream processing session
- POST /v1/stream/stop: Stop the active stream processing session
- GET /v1/stream/status: Get stream session status
Scrape
CDP-based web page scraping with framework-aware readiness. Extracts content in up to 7 formats (markdown, text, readability, axTree, html, links, screenshot) from a single page load
- GET /v1/scrape: List recent scrape jobs
- POST /v1/scrape: Start a web page scrape
- GET /v1/scrape/{jobId}: Get scrape job status
Video
Video creator pipeline — settings, screenplays, recording, rendering, and narration
- GET /v1/settings/video: Get video settings
- PUT /v1/settings/video: Update video settings
Voice Chat
Persistent voice conversation with model selection, conversation memory, and sentence-level TTS streaming
- GET /v1/voice-chat/settings: Get voice chat settings
- PUT /v1/voice-chat/settings: Update voice chat settings
- POST /v1/voice-chat/session: Create a voice chat session
- DELETE /v1/voice-chat/session: End the active voice chat session
Pipeline
Workflow pipeline orchestration
- POST /v1/pipeline/run: Execute a workflow pipeline (Pro)
- POST /v1/pipeline/run/cancel: Cancel the active pipeline run
Tool Permissions
MCP tool permission policies
- GET /v1/tool-permissions: List MCP tool permission policies
- PUT /v1/tool-permissions: Set MCP tool permission policies
Tools
Unified tool catalog, retrieval probes, and session-scoped client-tool registration (PiperMatch / ToolGate plumbing)
- GET /v1/tools/catalog: Snapshot of the active ToolGate catalog (DEBUG only)
- POST /v1/tools/probe: Rank a candidate tool against PiperMatch on a set of cases
- POST /v1/tools/register-client-catalog: Register session-scoped client tools into the unified index
- POST /v1/tools/unregister-client-catalog: Unregister a session's client tool catalog
- GET /v1/tools/native: Native tool catalog (unfiltered)
Capture
Screen and color capture utilities
- POST /v1/screenshot: Take a screenshot
- POST /v1/color/pick: Pick a color from the screen
- POST /v1/camera/capture: Capture a webcam still
Files
File-system utilities — archive, PDF extraction, code search
- POST /v1/archive: Create, extract, or list an archive
- POST /v1/pdf/extract: Extract text, pages, metadata, or images from a PDF
- POST /v1/filesystem/edit-file: Exact-string edit of a text file
- POST /v1/filesystem/move-file: Move or rename a file or directory
- POST /v1/search/code: Grep-style code search across a directory
Web
Outbound HTTP and web search
- POST /v1/web/search: Run a web search
- POST /v1/http/request: Perform an outbound HTTP request
Translation
On-device text translation (Apple Translation framework)
- POST /v1/translate: Translate text via the Apple Translation framework
Utilities
General macOS utilities — clipboard history, timers, QR codes, image transforms
- POST /v1/clipboard: Read, write, or manage clipboard history
- POST /v1/timers: Start, list, cancel, or clear timers
- POST /v1/qr/generate: Generate a QR code PNG
- POST /v1/image/transform: Resize, crop, rotate, or convert an image
MCP
Model Context Protocol (Streamable HTTP transport)
- GET /mcp: GET not supported for MCP
- POST /mcp: Handle MCP JSON-RPC request
- DELETE /mcp: Terminate an MCP session
- GET /v1/mcp-servers: List MCP servers
- GET /v1/mcp-servers/health: MCP server health
- POST /v1/mcp-servers/{name}/restart: Restart an MCP server
- POST /v1/mcp-servers/{name}/reload-tools: Reload an MCP server's tools
- PUT /v1/mcp-servers/{name}: Add or update an MCP server
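Most users will connect through an MCP client rather than by hand, but a raw request to POST /mcp can be sketched as below. It builds JSON-RPC 2.0 messages the way the MCP Streamable HTTP transport describes them: an `Accept` header listing both JSON and event streams, and an `Mcp-Session-Id` header on requests after initialization. Several details are assumptions: that /mcp lives on port 9998, that a Bearer dev token is accepted, the protocol version string, and the hypothetical client name `example-client`.

```python
import json
import urllib.request

MCP_URL = "http://localhost:9998/mcp"  # assumption: /mcp shares the REST API port


def build_mcp_request(method, params=None, request_id=1,
                      token="tp_YOUR_TOKEN", session_id=None):
    """Build (but do not send) a JSON-RPC 2.0 request for the /mcp endpoint."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either a JSON body or an SSE stream.
        "Accept": "application/json, text/event-stream",
        "Authorization": f"Bearer {token}",  # assumption: dev token sent as Bearer
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id  # returned by the server on initialize
    return urllib.request.Request(
        MCP_URL, data=json.dumps(payload).encode("utf-8"),
        headers=headers, method="POST",
    )


# First message of a session: the initialize handshake.
initialize = build_mcp_request("initialize", {
    "protocolVersion": "2025-03-26",  # assumption: a version the server accepts
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
})

# Later message: list the tools visible to this session.
list_tools = build_mcp_request(
    "tools/list", request_id=2, session_id="SESSION_ID_FROM_INITIALIZE",
)
```

In the MCP handshake the client also sends a `notifications/initialized` notification after the initialize response and before other requests; that step is omitted here for brevity.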
Anthropic
Anthropic Messages API proxy — drop-in backend for Claude Code and any Anthropic-shape client
- POST /v1/messages: Anthropic Messages (drop-in proxy for Claude Code)
- POST /v1/messages/count_tokens: Estimate input tokens for an Anthropic Messages request
- GET /v1/settings/anthropic-proxy: Get the global Anthropic Proxy backend setting
- PATCH /v1/settings/anthropic-proxy: Update the global Anthropic Proxy backend setting
Claude Code
Zero-config integration with Anthropic's Claude Code CLI
- POST /v1/claude-code/install: Install the claude-tp helper
- POST /v1/claude-code/uninstall: Uninstall the claude-tp helper
- GET /v1/claude-code/helper-status: claude-tp helper status
SERP
Search-engine results capture, rank tracking, and autocomplete keyword expansion (async job + poll)
- POST /v1/serp/autocomplete: Expand a seed phrase via Google Suggest
- POST /v1/serp/search: Start a SERP capture job
- GET /v1/serp/search/{jobId}: Poll a SERP capture job
- POST /v1/serp/rank-check: Start a SERP rank-check job
- GET /v1/serp/rank-check/{jobId}: Poll a SERP rank-check job
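The SERP endpoints follow a start-then-poll pattern: a POST starts a job, and a GET on the `{jobId}` path reports its progress. A minimal polling loop could look like the sketch below. The response field `status` and the terminal values `completed` and `failed` are assumptions, since the job schema is not shown here; the HTTP call is injected as `fetch_status` so the loop itself stays independent of those details.

```python
import time


def poll_job(fetch_status, job_id, interval=1.0, timeout=60.0,
             done_states=("completed", "failed"),
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(job_id) until the job reaches a terminal state.

    fetch_status is any callable returning the job as a dictionary, for
    example a function that performs GET /v1/serp/search/{jobId}.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in done_states:  # assumption: field name and values
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

The same helper should apply to other job-style endpoints in this reference, such as GET /v1/scrape/{jobId}, provided `done_states` is adjusted to whatever values the server actually returns.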
API Connections
Saved cloud-provider API connections (key + base URL pairs) consumed by configurations
- GET /v1/connections: List API connections
- POST /v1/connections: Create an API connection
- PUT /v1/connections/{connectionId}: Update an API connection
- DELETE /v1/connections/{connectionId}: Delete an API connection
YouTube
YouTube transcript extraction
- GET /v1/youtube/transcript: Fetch YouTube video transcript
Connected Apps
Connected OAuth/bearer clients — listing, pending-consent management, blocking, and revocation
- GET /v1/connected-apps: List connected apps
- GET /v1/connected-apps/blocked: List blocked apps
- GET /v1/connected-apps/pending-consents: List pending OAuth consents
- POST /v1/connected-apps/{id}/block: Block a connected app
- POST /v1/connected-apps/{id}/revoke: Revoke a connected app's token
- POST /v1/connected-apps/blocked/{id}/unblock: Unblock a blocked client
Action Piper
macOS UI automation — VLM grounding resolver, circuit breaker, and accessibility telemetry
- POST /v1/action-piper/resolve: Resolve a click target to coordinates
- POST /v1/action-piper/circuit/reset: Reset the Apple FM circuit breaker
- GET /v1/action-piper/telemetry/summary: Accessibility outcome telemetry
Legacy: Ollama-compatible API (New, Legacy dialect)
ToolPiper serves the Ollama wire dialect as a legacy surface — a migration on-ramp, not a destination. Two mounts, one module: the opt-in loopback-only listener on :11434 (Settings → General, off by default, no auth — upstream's own posture), and this documented :9998 mount at /legacy/ollama/api/* behind the normal bearer middleware (set a configurable client's base URL to http://127.0.0.1:9998/legacy/ollama). Every response carries RFC 9745 "Deprecation" and a Link rel="successor-version" header — the surface was born deprecated, and each endpoint below names its /v1/ successor, which is where new integrations should land. Wire shapes are pinned to recorded fixtures from Ollama 0.23.4. Modelfile semantics (create, push, copy, blobs) are permanently rejected with guidance naming the successor route. The layer is deliberately disposable: it exists while it earns its keep, then it gets deleted.
Opt-in listener: http://127.0.0.1:11434 (Settings → General, off by default, loopback-only)
Base URL: http://127.0.0.1:9998/legacy/ollama (always mounted, bearer-authenticated)
- GET /legacy/ollama/api/version: Ollama-dialect version probe (successor: GET /version)
- GET /legacy/ollama/api/tags: Ollama-dialect installed-model list (successor: GET /v1/models/installed)
- GET /legacy/ollama/api/ps: Ollama-dialect loaded-model list (successor: GET /v1/models/state)
- POST /legacy/ollama/api/show: Ollama-dialect per-model detail (successor: GET /v1/models/installed)
- POST /legacy/ollama/api/chat: Ollama-dialect chat, NDJSON stream (successor: POST /v1/chat/completions)
- POST /legacy/ollama/api/generate: Ollama-dialect generate, NDJSON stream (successor: POST /v1/chat/completions)
- POST /legacy/ollama/api/embed: Ollama-dialect embeddings (successor: POST /v1/embeddings)
- POST /legacy/ollama/api/embeddings: Ollama-dialect embeddings, an alias upstream itself deprecates (successor: POST /v1/embeddings)
- POST /legacy/ollama/api/pull: Ollama-dialect model pull, NDJSON progress (successor: POST /v1/models/download)
- DELETE /legacy/ollama/api/delete: Ollama-dialect model delete (successor: DELETE /v1/models/{id})
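The chat and generate routes above stream NDJSON, meaning one JSON object per line. A small sketch of reassembling a chat reply from such a stream follows. It assumes the upstream Ollama chat chunk shape, where each line carries a `message.content` fragment and the final line has `"done": true`; the helper name `collect_chat_stream` is hypothetical.

```python
import json


def collect_chat_stream(lines):
    """Concatenate message fragments from an Ollama-style NDJSON chat stream."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        # Assumption: each chunk looks like {"message": {"content": "..."}, "done": bool}
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)
```

Because new integrations are meant to land on the /v1/ successors, this parser is mainly useful while migrating an existing Ollama client; POST /v1/chat/completions is the documented replacement for this route.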