ToolPiper API

ToolPiper is the local AI server that powers ModelPiper. It runs on your Mac, manages inference engines, and exposes an OpenAI-compatible API at localhost:9998.

By Ben Racicot, Founder & Lead Engineer. Updated 2026-06-11

Base URL: http://localhost:9998/v1
Spec: GET /v1/openapi.json

How It Works

ModelPiper is the web app you're looking at right now. It talks to ToolPiper, a native macOS app running in the background. ToolPiper manages the actual AI engines — llama.cpp for LLMs, FluidAudio for speech, CoreML for images, and more.

When you create a Provider in ModelPiper, you're setting up a configuration that pairs an AI model with a specific engine. For example: "Use the Llama 3.2 3B model via llama.cpp" or "Use Parakeet for speech-to-text via FluidAudio". Each provider becomes a usable endpoint on the ToolPiper API.

ModelPiper uses an internal session key to stay connected to ToolPiper — you don't need to think about that. But if you want to build your own app on top of ToolPiper, you'll need a Developer Token.

Developer Tokens

A dev token lets you use ToolPiper from your own code, just like an OpenAI API key. Create one, drop it into any OpenAI-compatible SDK, and point the base URL at localhost:9998/v1. That's it.
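Many OpenAI-compatible SDKs can pick up their endpoint and key from the environment. A minimal sketch, assuming your SDK honors OPENAI_BASE_URL and OPENAI_API_KEY (the official OpenAI Python and Node SDKs read both; other clients may need the values passed explicitly). tp_YOUR_TOKEN is a placeholder for a real dev token.

```shell
# Placeholder token: replace with a dev token you created for ToolPiper.
export OPENAI_API_KEY="tp_YOUR_TOKEN"
# Send SDK traffic to the local ToolPiper server instead of api.openai.com.
export OPENAI_BASE_URL="http://localhost:9998/v1"
```

Any process started from this shell inherits both variables, so existing scripts built on such an SDK should not need code changes.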

Claude Code (zero-config)

Wire ToolPiper into Claude Code with one click. We generate a dev token, write ~/.claude/settings.json, and register ToolPiper as an MCP server. Claude Code's /model picker shows every endpoint you've configured here.

Recipes:

Apple Intelligence (free, on-device): Neural Engine inference. No API key, no quota.
Local Qwen via llama.cpp (32K, tool use): Best capability/privacy balance.
OpenAI with your key (BYOK): Keychain-locked; never in dotfiles or logs.
Switch providers in chat (conversational): "Use my local one" via the endpoint_set MCP tool.

New to all of this? See the comparison vs Ollama / vLLM / LM Studio.

Anthropic Proxy Backend (Phase 1 plumbing)

ToolPiper will expose POST /v1/messages as an Anthropic-shape proxy so Claude Code (and any Anthropic-compatible client) can use any provider you've configured. Pick the global backend below, or bind a specific provider per token in the table above.

Quick Start

ToolPiper is a drop-in backend for the OpenAI SDK: just change the base URL and API key. Questions? @ModelPiper on X.

curl http://localhost:9998/v1/chat/completions \
  -H "Authorization: Bearer tp_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-3b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

MCP Server (300+ tools)

ToolPiper is also an MCP server. Install categories individually to control which tools your AI client sees, which saves context tokens.

core (12): LLM inference, TTS, STT, embeddings, OCR
analysis (8): Image/text analysis, RAG, upscaling
browser (19): Browser automation, scraping, assertions
testing (6): PiperTest CRUD and execution
motion (5): Pose estimation, stream processing
outreach (12): GitHub, HN, Reddit, X, content queue
system (29): macOS system actions
video (17): Video creator pipeline
oauth (4): OAuth connection management
sieve (4): MediaPiper sieve cache inspection
filesystem (13): File read/write, git operations, shell commands

See MCP docs for install commands, profiles, and full tool reference.

Endpoints

Inference

OpenAI-compatible inference endpoints

GET /v1/models: List available models
POST /v1/chat/completions: Create chat completion
POST /v1/embeddings: Create embedding (OpenAI-compatible)
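To find the model IDs you can pass to chat completions, list the models first. The helper below is a small sketch, not part of ToolPiper: the function name tp_get and the TOOLPIPER_BASE_URL and TOOLPIPER_TOKEN variables are hypothetical, and it assumes GET routes accept the same bearer token as the Quick Start request.

```shell
# Base URL and token; both can be overridden from the environment.
# (Hypothetical variable names, chosen for this example only.)
TOOLPIPER_BASE_URL="${TOOLPIPER_BASE_URL:-http://localhost:9998/v1}"
TOOLPIPER_TOKEN="${TOOLPIPER_TOKEN:-tp_YOUR_TOKEN}"

# tp_get PATH: authenticated GET against a /v1 route, e.g. tp_get /models
tp_get() {
  curl -sS -H "Authorization: Bearer ${TOOLPIPER_TOKEN}" \
    "${TOOLPIPER_BASE_URL}$1"
}
```

With ToolPiper running, `tp_get /models` should return the model list as JSON.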

Audio

Speech-to-text and text-to-speech

POST /v1/audio/transcriptions: Transcribe audio
POST /v1/audio/speech: Text-to-speech
POST /v1/audio/record: Record an audio clip

Image

Image processing (upscale)

POST /v1/images/upscale: Upscale image
POST /v1/video/upscale: Upscale video 2x
POST /v1/video/upscale/file: Upscale video 2x (file paths)
POST /v1/benchmark/upscale: Run PiperSR benchmark suite
POST /v1/images/upscale/file: Upscale an image file on disk

RAG

Retrieval-Augmented Generation — collections, ingestion, and semantic search

GET /v1/rag/collections: List RAG collections
POST /v1/rag/collections: Create a RAG collection
GET /v1/rag/collections/{collectionId}: Get a collection
PUT /v1/rag/collections/{collectionId}: Update a collection
DELETE /v1/rag/collections/{collectionId}: Delete a collection
GET /v1/rag/collections/{collectionId}/chunks: List chunks in a collection
POST /v1/rag/collections/{collectionId}/ingest: Start document ingestion
GET /v1/rag/collections/{collectionId}/ingest/status: Get ingestion progress
POST /v1/rag/collections/{collectionId}/ingest/cancel: Cancel active ingestion
POST /v1/rag/query: Semantic search across collections
POST /v1/rag/browse-folder: Open native folder picker

Cloud Proxy

Keychain-backed cloud API proxy (Pro feature)

POST /v1/cloud/proxy: Cloud API proxy (Pro)
GET /v1/cloud/api-key: Load cloud API key from Keychain (Pro)
POST /v1/cloud/api-key: Save cloud API key to Keychain (Pro)
DELETE /v1/cloud/api-key: Delete cloud API key from Keychain (Pro)

Models

Model management, downloads, and HuggingFace integration

GET /v1/models/installed: List downloaded models
GET /v1/models/{modelId}: Get model details
DELETE /v1/models/{modelId}: Delete a model
GET /v1/models/storage: Disk usage for models
GET /v1/models/search: Search HuggingFace for models
POST /v1/models/download: Download a model from HuggingFace
GET /v1/models/downloads: List active downloads
DELETE /v1/models/downloads/{downloadId}: Cancel a download
POST /v1/models/scan: Scan for new models
GET /v1/models/hf/{owner}/{repo}/files: List files in a HuggingFace repo

Model Configs

Curated model presets with availability status

GET /v1/model-configs: List model presets
POST /v1/model-configs/install: Install a model by preset ID

Engine

Inference engine control and model state

GET /v1/engine/status: Engine and backend status
POST /v1/engine/load: Load a model into the engine
POST /v1/engine/unload: Unload a model or stop the engine
GET /v1/models/state: Per-model runtime states
POST /v1/models/reload: Reload a llama-server-backed model

Tokens

Developer token management (all tiers; requires the tokensManage scope)

GET /v1/tokens: List developer tokens
POST /v1/tokens: Create a developer token (all tiers; requires the tokensManage scope)
PATCH /v1/tokens/{tokenId}: Update a developer token's metadata
DELETE /v1/tokens/{tokenId}: Revoke a developer token

Apple

Apple-native framework tools (Vision, NLP) — no model downloads required

POST /v1/apple/ocr: Recognize text in an image
GET /v1/apple/ocr/languages: List supported OCR languages
POST /v1/apple/barcode: Detect barcodes and QR codes
POST /v1/apple/classify: Classify image content
POST /v1/apple/face-detect: Detect faces
POST /v1/apple/saliency: Detect salient regions
POST /v1/apple/rectangles: Detect rectangles
POST /v1/apple/feature-print: Generate image feature vector
POST /v1/apple/body-pose: Detect human body poses
POST /v1/apple/hand-pose: Detect hand poses
POST /v1/apple/animals: Detect animals
POST /v1/apple/horizon: Detect horizon angle
POST /v1/apple/document: Detect document boundaries
POST /v1/apple/nlp/language: Detect language
POST /v1/apple/nlp/sentiment: Analyze sentiment
POST /v1/apple/nlp/entities: Named entity recognition
POST /v1/apple/nlp/tokenize: Tokenize text
POST /v1/apple/nlp/lemmatize: Lemmatize text
POST /v1/apple/nlp/pos: Part-of-speech tagging

System

Health checks, resource monitoring, events, and licensing

GET /status: Health check
GET /session-token: Obtain ambient bearer token
GET /v1/permissions: macOS permission snapshot
GET /v1/system/resources: GPU, RAM, and ANE utilization
GET /v1/events: SSE event stream
GET /v1/license: Subscription tier and features
GET /v1/openapi.json: OpenAPI specification
GET /v1/audit/tools: Tool dispatch audit log
GET /v1/audit/tools/export: Incremental audit export
GET /v1/snippets/match/status: Snippet match diagnostics
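Because every route depends on ToolPiper actually running, scripts may want to wait for the health check before sending work. A minimal sketch: the function name wait_for_toolpiper is hypothetical, and it assumes GET /status returns a 2xx response without a token once the server is up (the document lists the route but does not describe its auth or response body).

```shell
# wait_for_toolpiper [ATTEMPTS]: poll GET /status once per second until it
# succeeds, or give up after ATTEMPTS tries (default 30).
wait_for_toolpiper() {
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    # -f makes curl exit non-zero on HTTP errors; the body is discarded.
    if curl -fsS -o /dev/null "http://localhost:9998/status" 2>/dev/null; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Pause only if another attempt is coming.
    if [ "$attempts" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}
```

A script could then guard its requests with `wait_for_toolpiper 10 || exit 1`.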

Configurations

Endpoint configuration management

GET /v1/configurations: List endpoint configurations
POST /v1/configurations: Create an endpoint configuration
PUT /v1/configurations/{configId}: Update an endpoint configuration
DELETE /v1/configurations/{configId}: Delete an endpoint configuration

Logs

Log ingestion, querying, and real-time streaming

GET /v1/logs: Query log entries
POST /v1/logs: Ingest log entries
GET /v1/logs/stream: Real-time log stream (SSE)
POST /v1/logs/clear: Clear all log entries
POST /v1/logs/export: Export logs to file

Templates

Workflow templates

GET /v1/workflow-templates: List workflow templates

Browser

CDP-based browser automation

GET /v1/browser/status: Browser connection status
POST /v1/browser/connect: Connect to a browser via CDP
POST /v1/browser/disconnect: Disconnect from browser
GET /v1/browser/pages: List open browser pages
POST /v1/browser/select-page: Select a browser page
GET /v1/browser/snapshot: Accessibility tree snapshot
GET /v1/browser/screenshot: Page screenshot
GET /v1/browser/console: Console messages
POST /v1/browser/record/start: Start recording user actions
POST /v1/browser/record/stop: Stop recording user actions
GET /v1/browser/record/stream: Recording event stream (SSE)
POST /v1/browser/network/enable: Enable network logging
POST /v1/browser/network/disable: Disable network logging
GET /v1/browser/network: List captured network entries
DELETE /v1/browser/network: Clear captured network entries
GET /v1/browser/network/status: Network logging status
GET /v1/browser/network/{requestId}/body: Get response body for a captured request
POST /v1/browser/trace/start: Start performance trace
POST /v1/browser/trace/stop: Stop performance trace
GET /v1/browser/metrics: Get current performance metrics
GET /v1/browser/trace/status: Trace status
POST /v1/browser/intercept/enable: Enable network interception
POST /v1/browser/intercept/disable: Disable network interception
GET /v1/browser/intercept/status: Interception status
GET /v1/browser/mocks: List all mock rules
POST /v1/browser/mocks: Create a mock rule
DELETE /v1/browser/mocks: Delete all mock rules
PUT /v1/browser/mocks/{id}: Update a mock rule
DELETE /v1/browser/mocks/{id}: Delete a mock rule
POST /v1/browser/coverage/start: Start code coverage collection
POST /v1/browser/coverage/stop: Stop code coverage and get report
GET /v1/browser/coverage/status: Coverage status
GET /v1/browser/storage: Read browser storage
DELETE /v1/browser/storage: Clear browser storage
POST /v1/browser/storage/cookie: Set a cookie
POST /v1/browser/storage/local: Set a localStorage item
POST /v1/browser/storage/session: Set a sessionStorage item
POST /v1/browser/webauthn/enable: Enable virtual authenticator
POST /v1/browser/webauthn/disable: Disable virtual authenticator
GET /v1/browser/webauthn/status: WebAuthn status
GET /v1/browser/webauthn/credentials: List virtual credentials
DELETE /v1/browser/webauthn/credentials/{credentialId}: Delete a virtual credential
POST /v1/browser/webauthn/verify: Set user verification state
POST /v1/browser/autofill/credit-card: Trigger credit card autofill
POST /v1/browser/autofill/address: Trigger address autofill
GET /v1/browser/full-status: Extended browser status with pages and channels
POST /v1/browser/action: Perform a browser action (click, fill, navigate, etc.)
POST /v1/browser/assert: Perform a browser assertion
POST /v1/browser/heal: Heal a broken selector
POST /v1/browser/evaluate: Execute JavaScript in the browser
POST /v1/browser/resize: Resize browser viewport
POST /v1/browser/dialog: Handle a JavaScript dialog (alert, confirm, prompt)
POST /v1/browser/close-tab: Close a browser tab
POST /v1/browser/new-tab: Open a new browser tab
GET /v1/browser/channels: List available Chrome channels (dev, canary, stable)
GET /v1/browser/discover: List API discovery results
POST /v1/browser/discover: Discover API endpoints from captured network traffic
GET /v1/browser/discover/{mapId}: Get a single API discovery result
DELETE /v1/browser/discover/{mapId}: Delete an API discovery result
POST /v1/browser/replay: Replay a discovered API call

Testing

PiperTest — visual test session management and execution

GET /v1/test-sessions: List all test sessions
POST /v1/test-sessions: Create a new test session
POST /v1/test-sessions/import: Import a test session from exported format
GET /v1/test-sessions/{sessionId}: Get a test session by ID
PUT /v1/test-sessions/{sessionId}: Update a test session
DELETE /v1/test-sessions/{sessionId}: Delete a test session
POST /v1/test-sessions/{sessionId}/run: Run a saved test session
POST /v1/test-sessions/{sessionId}/run/cancel: Cancel an in-progress test run
POST /v1/test-sessions/{sessionId}/export: Export a test session to Playwright or Cypress code
POST /v1/tests/run: Run inline test steps (no saved session required)
POST /v1/browser/probe/scan: Scan the current page for interactive elements (PiperProbe)
GET /v1/test-sessions/{sessionId}/probe: Get probe results for a test session
POST /v1/test-sessions/{sessionId}/coverage: Compute combined coverage for a test session

Pose

Human pose estimation via Apple Vision / CoreML, including on-demand real-time 60fps skeleton streaming via WebSocket

POST /v1/pose/detect: Detect human poses in an image
GET /v1/pose/formats: List available pose output formats
GET /v1/pose/models: List available pose detection models
POST /v1/pose/stream/start: Start the pose stream WebSocket server
POST /v1/pose/stream/stop: Stop the pose stream WebSocket server
GET /v1/pose/stream/status: Get pose stream WebSocket status
GET /v1/pose/stream: Real-time pose streaming (WebSocket, on-demand)

Stream

Real-time stream processing

POST /v1/stream/start: Start a stream processing session
POST /v1/stream/stop: Stop the active stream processing session
GET /v1/stream/status: Get stream session status

Scrape

CDP-based web page scraping with framework-aware readiness. Extracts content in up to 7 formats (markdown, text, readability, axTree, html, links, screenshot) from a single page load

GET /v1/scrape: List recent scrape jobs
POST /v1/scrape: Start a web page scrape
GET /v1/scrape/{jobId}: Get scrape job status

Video

Video creator pipeline — settings, screenplays, recording, rendering, and narration

GET /v1/settings/video: Get video settings
PUT /v1/settings/video: Update video settings

Voice Chat

Persistent voice conversation with model selection, conversation memory, and sentence-level TTS streaming

GET /v1/voice-chat/settings: Get voice chat settings
PUT /v1/voice-chat/settings: Update voice chat settings
POST /v1/voice-chat/session: Create a voice chat session
DELETE /v1/voice-chat/session: End the active voice chat session

Pipeline

Workflow pipeline orchestration

POST /v1/pipeline/run: Execute a workflow pipeline (Pro)
POST /v1/pipeline/run/cancel: Cancel the active pipeline run

Tool Permissions

MCP tool permission policies

GET /v1/tool-permissions: List MCP tool permission policies
PUT /v1/tool-permissions: Set MCP tool permission policies

Tools

Unified tool catalog, retrieval probes, and session-scoped client-tool registration (PiperMatch / ToolGate plumbing)

GET /v1/tools/catalog: Snapshot of the active ToolGate catalog (DEBUG only)
POST /v1/tools/probe: Rank a candidate tool against PiperMatch on a set of cases
POST /v1/tools/register-client-catalog: Register session-scoped client tools into the unified index
POST /v1/tools/unregister-client-catalog: Unregister a session's client tool catalog
GET /v1/tools/native: Native tool catalog (unfiltered)

Capture

Screen and color capture utilities

POST /v1/screenshot: Take a screenshot
POST /v1/color/pick: Pick a color from the screen
POST /v1/camera/capture: Capture a webcam still

Files

File-system utilities — archive, PDF extraction, code search

POST /v1/archive: Create, extract, or list an archive
POST /v1/pdf/extract: Extract text, pages, metadata, or images from a PDF
POST /v1/filesystem/edit-file: Exact-string edit of a text file
POST /v1/filesystem/move-file: Move or rename a file or directory
POST /v1/search/code: Grep-style code search across a directory

Web

Outbound HTTP and web search

POST /v1/web/search: Run a web search
POST /v1/http/request: Perform an outbound HTTP request

Translation

On-device text translation (Apple Translation framework)

POST /v1/translate: Translate text via the Apple Translation framework

Utilities

General macOS utilities — clipboard history, timers, QR codes, image transforms

POST /v1/clipboard: Read, write, or manage clipboard history
POST /v1/timers: Start, list, cancel, or clear timers
POST /v1/qr/generate: Generate a QR code PNG
POST /v1/image/transform: Resize, crop, rotate, or convert an image

MCP

Model Context Protocol (Streamable HTTP transport)

GET /mcp: GET not supported for MCP
POST /mcp: Handle MCP JSON-RPC request
DELETE /mcp: Terminate an MCP session
GET /v1/mcp-servers: List MCP servers
GET /v1/mcp-servers/health: MCP server health
POST /v1/mcp-servers/{name}/restart: Restart an MCP server
POST /v1/mcp-servers/{name}/reload-tools: Reload an MCP server's tools
PUT /v1/mcp-servers/{name}: Add or update an MCP server
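Most MCP clients handle the handshake for you, but it can help to see what travels over POST /mcp. The sketch below sends the JSON-RPC initialize request defined by the MCP specification's Streamable HTTP transport. The function name, the client name and version, the protocol version string, and the use of a bearer dev token on this route are all assumptions; the document does not state which protocol versions or auth scheme ToolPiper expects here.

```shell
# mcp_initialize: POST a JSON-RPC "initialize" request to /mcp and print the
# response headers along with the body (-i). Per the MCP spec, a server may
# return an Mcp-Session-Id header to use on later requests.
mcp_initialize() {
  curl -sS -i "http://localhost:9998/mcp" \
    -H "Authorization: Bearer ${TOOLPIPER_TOKEN:-tp_YOUR_TOKEN}" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json, text/event-stream" \
    -d '{
      "jsonrpc": "2.0",
      "id": 1,
      "method": "initialize",
      "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"}
      }
    }'
}
```

The Accept header lists both JSON and SSE because the transport lets the server answer with either.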

Anthropic

Anthropic Messages API proxy — drop-in backend for Claude Code and any Anthropic-shape client

POST /v1/messages: Anthropic Messages, drop-in proxy for Claude Code
POST /v1/messages/count_tokens: Estimate input tokens for an Anthropic Messages request
GET /v1/settings/anthropic-proxy: Get the global Anthropic Proxy backend setting
PATCH /v1/settings/anthropic-proxy: Update the global Anthropic Proxy backend setting
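For comparison with the OpenAI-style Quick Start, here is roughly what an Anthropic-shape request to the proxy could look like. This is a sketch under assumptions: the body follows the public Anthropic Messages format (model, max_tokens, messages), while the bearer header, the anthropic-version value being accepted, and the proxy resolving the model name "llama-3.2-3b" to a configured provider are not confirmed by this document.

```shell
# anthropic_message: send one Anthropic Messages request through ToolPiper.
# max_tokens is required by the Anthropic Messages format.
anthropic_message() {
  curl -sS "http://localhost:9998/v1/messages" \
    -H "Authorization: Bearer ${TOOLPIPER_TOKEN:-tp_YOUR_TOKEN}" \
    -H "Content-Type: application/json" \
    -H "anthropic-version: 2023-06-01" \
    -d '{
      "model": "llama-3.2-3b",
      "max_tokens": 256,
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
}
```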

Claude Code

Zero-config integration with Anthropic's Claude Code CLI

POST /v1/claude-code/install: Install the claude-tp helper
POST /v1/claude-code/uninstall: Uninstall the claude-tp helper
GET /v1/claude-code/helper-status: claude-tp helper status

SERP

Search-engine results capture, rank tracking, and autocomplete keyword expansion (async job + poll)

POST /v1/serp/autocomplete: Expand a seed phrase via Google Suggest
POST /v1/serp/search: Start a SERP capture job
GET /v1/serp/search/{jobId}: Poll a SERP capture job
POST /v1/serp/rank-check: Start a SERP rank-check job
GET /v1/serp/rank-check/{jobId}: Poll a SERP rank-check job

API Connections

Saved cloud-provider API connections (key + base URL pairs) consumed by configurations

GET /v1/connections: List API connections
POST /v1/connections: Create an API connection
PUT /v1/connections/{connectionId}: Update an API connection
DELETE /v1/connections/{connectionId}: Delete an API connection

YouTube

YouTube transcript extraction

GET /v1/youtube/transcript: Fetch YouTube video transcript

Connected Apps

Connected OAuth/bearer clients — listing, pending-consent management, blocking, and revocation

GET /v1/connected-apps: List connected apps
GET /v1/connected-apps/blocked: List blocked apps
GET /v1/connected-apps/pending-consents: List pending OAuth consents
POST /v1/connected-apps/{id}/block: Block a connected app
POST /v1/connected-apps/{id}/revoke: Revoke a connected app's token
POST /v1/connected-apps/blocked/{id}/unblock: Unblock a blocked client

Action Piper

macOS UI automation — VLM grounding resolver, circuit breaker, and accessibility telemetry

POST /v1/action-piper/resolve: Resolve a click target to coordinates
POST /v1/action-piper/circuit/reset: Reset the Apple FM circuit breaker
GET /v1/action-piper/telemetry/summary: Accessibility outcome telemetry

Legacy: Ollama-compatible API (new, legacy dialect)

ToolPiper serves the Ollama wire dialect as a legacy surface — a migration on-ramp, not a destination. Two mounts, one module: the opt-in loopback-only listener on :11434 (Settings → General, off by default, no auth — upstream's own posture), and this documented :9998 mount at /legacy/ollama/api/* behind the normal bearer middleware (set a configurable client's base URL to http://127.0.0.1:9998/legacy/ollama). Every response carries RFC 9745 "Deprecation" and a Link rel="successor-version" header — the surface was born deprecated, and each endpoint below names its /v1/ successor, which is where new integrations should land. Wire shapes are pinned to recorded fixtures from Ollama 0.23.4. Modelfile semantics (create, push, copy, blobs) are permanently rejected with guidance naming the successor route. The layer is deliberately disposable: it exists while it earns its keep, then it gets deleted.

Ollama port: http://127.0.0.1:11434 (opt-in listener; Settings → General, off by default, loopback-only)
Base URL: http://127.0.0.1:9998/legacy/ollama (always mounted, bearer-authenticated)

GET /legacy/ollama/api/version: Ollama-dialect version probe; successor: GET /version
GET /legacy/ollama/api/tags: Ollama-dialect installed-model list; successor: GET /v1/models/installed
GET /legacy/ollama/api/ps: Ollama-dialect loaded-model list; successor: GET /v1/models/state
POST /legacy/ollama/api/show: Ollama-dialect per-model detail; successor: GET /v1/models/installed
POST /legacy/ollama/api/chat: Ollama-dialect chat (NDJSON stream); successor: POST /v1/chat/completions
POST /legacy/ollama/api/generate: Ollama-dialect generate (NDJSON stream); successor: POST /v1/chat/completions
POST /legacy/ollama/api/embed: Ollama-dialect embeddings; successor: POST /v1/embeddings
POST /legacy/ollama/api/embeddings: Ollama-dialect embeddings (alias upstream itself deprecates); successor: POST /v1/embeddings
POST /legacy/ollama/api/pull: Ollama-dialect model pull (NDJSON progress); successor: POST /v1/models/download
DELETE /legacy/ollama/api/delete: Ollama-dialect model delete; successor: DELETE /v1/models/{id}
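When migrating off the legacy surface, it can be handy to keep the successor mapping in a script and to confirm the deprecation headers on a live response. The sketch below copies the mapping from the list above; both function names are hypothetical, and the header check assumes a dev token is accepted on the :9998 mount as described.

```shell
# ollama_successor "METHOD /api/route": print the successor named in the
# endpoint list above, or return 1 for routes the list does not cover.
ollama_successor() {
  case "$1" in
    "GET /api/version")     echo "GET /version" ;;
    "GET /api/tags")        echo "GET /v1/models/installed" ;;
    "GET /api/ps")          echo "GET /v1/models/state" ;;
    "POST /api/show")       echo "GET /v1/models/installed" ;;
    "POST /api/chat")       echo "POST /v1/chat/completions" ;;
    "POST /api/generate")   echo "POST /v1/chat/completions" ;;
    "POST /api/embed")      echo "POST /v1/embeddings" ;;
    "POST /api/embeddings") echo "POST /v1/embeddings" ;;
    "POST /api/pull")       echo "POST /v1/models/download" ;;
    "DELETE /api/delete")   echo "DELETE /v1/models/{id}" ;;
    *) return 1 ;;
  esac
}

# ollama_deprecation_headers: print only the response headers (-D -) from the
# bearer-authenticated mount so Deprecation and Link can be inspected.
ollama_deprecation_headers() {
  curl -sS -o /dev/null -D - \
    -H "Authorization: Bearer ${TOOLPIPER_TOKEN:-tp_YOUR_TOKEN}" \
    "http://127.0.0.1:9998/legacy/ollama/api/tags"
}
```

For example, `ollama_successor "POST /api/chat"` prints `POST /v1/chat/completions`, and Modelfile routes such as create fall through to the failure branch.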