ToolPiper API

ToolPiper is the local AI server that powers ModelPiper. It runs on your Mac, manages inference engines, and exposes an OpenAI-compatible API at localhost:9998.

By , Founder & Lead Engineer— Updated
Base URL: http://localhost:9998/v1
Spec: GET /v1/openapi.json

How It Works

ModelPiper is the web app you're looking at right now. It talks to ToolPiper, a native macOS app running in the background. ToolPiper manages the actual AI engines — llama.cpp for LLMs, FluidAudio for speech, CoreML for images, and more.

When you create a Provider in ModelPiper, you're setting up a configuration that pairs an AI model with a specific engine. For example: "Use the Llama 3.2 3B model via llama.cpp" or "Use Parakeet for speech-to-text via FluidAudio." Each provider becomes a usable endpoint on the ToolPiper API.

ModelPiper uses an internal session key to stay connected to ToolPiper — you don't need to think about that. But if you want to build your own app on top of ToolPiper, you'll need a Developer Token.

Developer Tokens

A dev token lets you use ToolPiper from your own code, just like an OpenAI API key. Create one, drop it into any OpenAI-compatible SDK, and point the base URL at localhost:9998/v1. That's it.

Claude Code (Zero-config)

Wire ToolPiper into Claude Code with one click. We generate a dev token, write ~/.claude/settings.json, and register ToolPiper as an MCP server. Claude Code's /model picker shows every endpoint you've configured here.

New to all of this? See the comparison vs Ollama / vLLM / LM Studio.

Anthropic Proxy Backend (Phase 1 plumbing)

ToolPiper will expose POST /v1/messages as an Anthropic-shape proxy so Claude Code (and any Anthropic-compatible client) can use any provider you've configured. Pick a global backend, or bind a specific provider to an individual developer token.

Quick Start

ToolPiper is a drop-in replacement for the OpenAI API: keep your OpenAI SDK and change only the base URL and API key. Questions? @ModelPiper on X.

curl http://localhost:9998/v1/chat/completions \
  -H "Authorization: Bearer tp_YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-3b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
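
The same request can be issued from Python with only the standard library. This is a minimal sketch, not an official client: the helper names (build_chat_request, send_json, first_reply) are hypothetical, the token and model id are the placeholders from the curl example, and the response parsing assumes the standard OpenAI chat completion shape.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9998/v1"  # base URL given on this page


def build_chat_request(token, model, user_text, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/chat/completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_json(request, timeout=60):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def first_reply(completion):
    """Extract the assistant text, assuming the OpenAI chat completion shape."""
    return completion["choices"][0]["message"]["content"]
```

With ToolPiper running and a valid token, first_reply(send_json(build_chat_request("tp_YOUR_TOKEN", "llama-3.2-3b", "Hello!"))) should return the model's reply as a string.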

MCP Server (300+ tools)

ToolPiper is also an MCP server. Install categories individually to control which tools your AI client sees, which saves context tokens.

  • core (12): LLM inference, TTS, STT, embeddings, OCR
  • analysis (8): Image/text analysis, RAG, upscaling
  • browser (19): Browser automation, scraping, assertions
  • testing (6): PiperTest CRUD and execution
  • motion (5): Pose estimation, stream processing
  • outreach (12): GitHub, HN, Reddit, X, content queue
  • system (29): macOS system actions
  • video (17): Video creator pipeline
  • oauth (4): OAuth connection management
  • sieve (4): MediaPiper sieve cache inspection
  • filesystem (13): File read/write, git operations, shell commands

See MCP docs for install commands, profiles, and full tool reference.

Endpoints

Inference

OpenAI-compatible inference endpoints

  • GET /v1/models - List available models
  • POST /v1/chat/completions - Create chat completion
  • POST /v1/embeddings - Create embedding (OpenAI-compatible)
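
As an illustration of the embeddings endpoint, the sketch below builds a request and compares two returned vectors by cosine similarity. It assumes the OpenAI embeddings wire shape (an "input" list in the request, "data[i].embedding" in the response), which this page implies but does not spell out. The helper names are hypothetical, and the model id you pass must be one your ToolPiper instance actually serves.

```python
import json
import math
import urllib.request

BASE_URL = "http://localhost:9998/v1"  # base URL given on this page


def build_embeddings_request(token, model, texts, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/embeddings request."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_vectors(response_body):
    """Return embeddings in input order (assumes the OpenAI response shape)."""
    rows = sorted(response_body["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Sorting by "index" guards against a server returning rows out of order, so vector i always corresponds to input i.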

Audio

Speech-to-text and text-to-speech

  • POST /v1/audio/transcriptions - Transcribe audio
  • POST /v1/audio/speech - Text-to-speech
  • POST /v1/audio/record - Record an audio clip
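
Transcription requests in the OpenAI dialect are multipart/form-data uploads rather than JSON. The sketch below hand-builds such a body with the standard library. It assumes ToolPiper accepts the OpenAI field names "file" and "model", which this page does not confirm. The helper name and the default content type are hypothetical choices for illustration.

```python
import uuid
import urllib.request

BASE_URL = "http://localhost:9998/v1"  # base URL given on this page


def build_transcription_request(token, model, filename, audio_bytes,
                                content_type="audio/wav", base_url=BASE_URL):
    """Build (but do not send) a multipart POST /v1/audio/transcriptions request."""
    boundary = uuid.uuid4().hex  # random delimiter between form parts
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="model"',
        b"",
        model.encode("utf-8"),
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        f"Content-Type: {content_type}".encode(),
        b"",
        audio_bytes,
        f"--{boundary}--".encode(),  # closing delimiter
        b"",
    ])
    return urllib.request.Request(
        f"{base_url}/audio/transcriptions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

In practice audio_bytes would come from reading a local audio file in binary mode, and model would name a speech-to-text provider you have configured.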

Image

Image processing (upscale)

  • POST /v1/images/upscale - Upscale image
  • POST /v1/video/upscale - Upscale video 2x
  • POST /v1/video/upscale/file - Upscale video 2x (file paths)
  • POST /v1/benchmark/upscale - Run PiperSR benchmark suite
  • POST /v1/images/upscale/file - Upscale an image file on disk

RAG

Retrieval-Augmented Generation — collections, ingestion, and semantic search

  • GET /v1/rag/collections - List RAG collections
  • POST /v1/rag/collections - Create a RAG collection
  • GET /v1/rag/collections/{collectionId} - Get a collection
  • PUT /v1/rag/collections/{collectionId} - Update a collection
  • DELETE /v1/rag/collections/{collectionId} - Delete a collection
  • GET /v1/rag/collections/{collectionId}/chunks - List chunks in a collection
  • POST /v1/rag/collections/{collectionId}/ingest - Start document ingestion
  • GET /v1/rag/collections/{collectionId}/ingest/status - Get ingestion progress
  • POST /v1/rag/collections/{collectionId}/ingest/cancel - Cancel active ingestion
  • POST /v1/rag/query - Semantic search across collections
  • POST /v1/rag/browse-folder - Open native folder picker

Cloud Proxy

Keychain-backed cloud API proxy (Pro feature)

  • POST /v1/cloud/proxy - Cloud API proxy (Pro)
  • GET /v1/cloud/api-key - Load cloud API key from Keychain (Pro)
  • POST /v1/cloud/api-key - Save cloud API key to Keychain (Pro)
  • DELETE /v1/cloud/api-key - Delete cloud API key from Keychain (Pro)

Models

Model management, downloads, and HuggingFace integration

  • GET /v1/models/installed - List downloaded models
  • GET /v1/models/{modelId} - Get model details
  • DELETE /v1/models/{modelId} - Delete a model
  • GET /v1/models/storage - Disk usage for models
  • GET /v1/models/search - Search HuggingFace for models
  • POST /v1/models/download - Download a model from HuggingFace
  • GET /v1/models/downloads - List active downloads
  • DELETE /v1/models/downloads/{downloadId} - Cancel a download
  • POST /v1/models/scan - Scan for new models
  • GET /v1/models/hf/{owner}/{repo}/files - List files in a HuggingFace repo

Model Configs

Curated model presets with availability status

  • GET /v1/model-configs - List model presets
  • POST /v1/model-configs/install - Install a model by preset ID

Engine

Inference engine control and model state

  • GET /v1/engine/status - Engine and backend status
  • POST /v1/engine/load - Load a model into the engine
  • POST /v1/engine/unload - Unload a model or stop the engine
  • GET /v1/models/state - Per-model runtime states
  • POST /v1/models/reload - Reload a llama-server-backed model

Tokens

Developer token management (all tiers; requires the tokensManage scope)

  • GET /v1/tokens - List developer tokens
  • POST /v1/tokens - Create a developer token (all tiers; requires the tokensManage scope)
  • PATCH /v1/tokens/{tokenId} - Update a developer token's metadata
  • DELETE /v1/tokens/{tokenId} - Revoke a developer token

Apple

Apple-native framework tools (Vision, NLP) — no model downloads required

  • POST /v1/apple/ocr - Recognize text in an image
  • GET /v1/apple/ocr/languages - List supported OCR languages
  • POST /v1/apple/barcode - Detect barcodes and QR codes
  • POST /v1/apple/classify - Classify image content
  • POST /v1/apple/face-detect - Detect faces
  • POST /v1/apple/saliency - Detect salient regions
  • POST /v1/apple/rectangles - Detect rectangles
  • POST /v1/apple/feature-print - Generate image feature vector
  • POST /v1/apple/body-pose - Detect human body poses
  • POST /v1/apple/hand-pose - Detect hand poses
  • POST /v1/apple/animals - Detect animals
  • POST /v1/apple/horizon - Detect horizon angle
  • POST /v1/apple/document - Detect document boundaries
  • POST /v1/apple/nlp/language - Detect language
  • POST /v1/apple/nlp/sentiment - Analyze sentiment
  • POST /v1/apple/nlp/entities - Named entity recognition
  • POST /v1/apple/nlp/tokenize - Tokenize text
  • POST /v1/apple/nlp/lemmatize - Lemmatize text
  • POST /v1/apple/nlp/pos - Part-of-speech tagging

System

Health checks, resource monitoring, events, and licensing

  • GET /status - Health check
  • GET /session-token - Obtain ambient bearer token
  • GET /v1/permissions - macOS permission snapshot
  • GET /v1/system/resources - GPU, RAM, and ANE utilization
  • GET /v1/events - SSE event stream
  • GET /v1/license - Subscription tier and features
  • GET /v1/openapi.json - OpenAPI specification
  • GET /v1/audit/tools - Tool dispatch audit log
  • GET /v1/audit/tools/export - Incremental audit export
  • GET /v1/snippets/match/status - Snippet match diagnostics
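
Because the server publishes its own OpenAPI document, the endpoint list on this page can also be derived programmatically. The sketch below flattens a spec into (method, path, summary) tuples. It relies only on the standard OpenAPI "paths" layout. Whether the spec route requires a bearer token is not stated here, so the token argument is optional, and the function names are hypothetical.

```python
import json
import urllib.request

SPEC_URL = "http://localhost:9998/v1/openapi.json"  # spec route listed on this page
HTTP_METHODS = ("get", "post", "put", "patch", "delete")


def list_operations(spec):
    """Flatten an OpenAPI document into (METHOD, path, summary) tuples."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method in HTTP_METHODS:
            if method in item:
                summary = item[method].get("summary", "")
                operations.append((method.upper(), path, summary))
    return operations


def fetch_spec(url=SPEC_URL, token=None, timeout=10):
    """Download and decode the spec; sends a bearer token only if one is given."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With ToolPiper running, list_operations(fetch_spec()) gives a listing that stays in sync with the installed version, which is useful when this page and the server disagree.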

Configurations

Endpoint configuration management

  • GET /v1/configurations - List endpoint configurations
  • POST /v1/configurations - Create an endpoint configuration
  • PUT /v1/configurations/{configId} - Update an endpoint configuration
  • DELETE /v1/configurations/{configId} - Delete an endpoint configuration

Logs

Log ingestion, querying, and real-time streaming

  • GET /v1/logs - Query log entries
  • POST /v1/logs - Ingest log entries
  • GET /v1/logs/stream - Real-time log stream (SSE)
  • POST /v1/logs/clear - Clear all log entries
  • POST /v1/logs/export - Export logs to file
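
Several routes on this page are Server-Sent Events streams (GET /v1/logs/stream here, and GET /v1/events under System). A minimal sketch of consuming one follows. The parser implements the generic SSE framing (field lines, blank-line dispatch, ":" comments). The payload format of each event is not documented on this page, so the data is returned as a raw string. The function names and the bearer-token header are assumptions.

```python
import urllib.request


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of decoded SSE lines."""
    event_name, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has any data.
            if data_lines:
                yield event_name, "\n".join(data_lines)
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)


def stream_lines(url, token, timeout=None):
    """Open an SSE route and yield decoded lines as they arrive."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "text/event-stream"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw in response:
            yield raw.decode("utf-8")
```

With ToolPiper running, iterating over iter_sse_events(stream_lines("http://localhost:9998/v1/logs/stream", "tp_YOUR_TOKEN")) yields one pair per event until the connection closes.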

Templates

Workflow templates

  • GET /v1/workflow-templates - List workflow templates

Browser

CDP-based browser automation

  • GET /v1/browser/status - Browser connection status
  • POST /v1/browser/connect - Connect to a browser via CDP
  • POST /v1/browser/disconnect - Disconnect from browser
  • GET /v1/browser/pages - List open browser pages
  • POST /v1/browser/select-page - Select a browser page
  • GET /v1/browser/snapshot - Accessibility tree snapshot
  • GET /v1/browser/screenshot - Page screenshot
  • GET /v1/browser/console - Console messages
  • POST /v1/browser/record/start - Start recording user actions
  • POST /v1/browser/record/stop - Stop recording user actions
  • GET /v1/browser/record/stream - Recording event stream (SSE)
  • POST /v1/browser/network/enable - Enable network logging
  • POST /v1/browser/network/disable - Disable network logging
  • GET /v1/browser/network - List captured network entries
  • DELETE /v1/browser/network - Clear captured network entries
  • GET /v1/browser/network/status - Network logging status
  • GET /v1/browser/network/{requestId}/body - Get response body for a captured request
  • POST /v1/browser/trace/start - Start performance trace
  • POST /v1/browser/trace/stop - Stop performance trace
  • GET /v1/browser/metrics - Get current performance metrics
  • GET /v1/browser/trace/status - Trace status
  • POST /v1/browser/intercept/enable - Enable network interception
  • POST /v1/browser/intercept/disable - Disable network interception
  • GET /v1/browser/intercept/status - Interception status
  • GET /v1/browser/mocks - List all mock rules
  • POST /v1/browser/mocks - Create a mock rule
  • DELETE /v1/browser/mocks - Delete all mock rules
  • PUT /v1/browser/mocks/{id} - Update a mock rule
  • DELETE /v1/browser/mocks/{id} - Delete a mock rule
  • POST /v1/browser/coverage/start - Start code coverage collection
  • POST /v1/browser/coverage/stop - Stop code coverage and get report
  • GET /v1/browser/coverage/status - Coverage status
  • GET /v1/browser/storage - Read browser storage
  • DELETE /v1/browser/storage - Clear browser storage
  • POST /v1/browser/storage/cookie - Set a cookie
  • POST /v1/browser/storage/local - Set a localStorage item
  • POST /v1/browser/storage/session - Set a sessionStorage item
  • POST /v1/browser/webauthn/enable - Enable virtual authenticator
  • POST /v1/browser/webauthn/disable - Disable virtual authenticator
  • GET /v1/browser/webauthn/status - WebAuthn status
  • GET /v1/browser/webauthn/credentials - List virtual credentials
  • DELETE /v1/browser/webauthn/credentials/{credentialId} - Delete a virtual credential
  • POST /v1/browser/webauthn/verify - Set user verification state
  • POST /v1/browser/autofill/credit-card - Trigger credit card autofill
  • POST /v1/browser/autofill/address - Trigger address autofill
  • GET /v1/browser/full-status - Extended browser status with pages and channels
  • POST /v1/browser/action - Perform a browser action (click, fill, navigate, etc.)
  • POST /v1/browser/assert - Perform a browser assertion
  • POST /v1/browser/heal - Heal a broken selector
  • POST /v1/browser/evaluate - Execute JavaScript in the browser
  • POST /v1/browser/resize - Resize browser viewport
  • POST /v1/browser/dialog - Handle a JavaScript dialog (alert, confirm, prompt)
  • POST /v1/browser/close-tab - Close a browser tab
  • POST /v1/browser/new-tab - Open a new browser tab
  • GET /v1/browser/channels - List available Chrome channels (dev, canary, stable)
  • GET /v1/browser/discover - List API discovery results
  • POST /v1/browser/discover - Discover API endpoints from captured network traffic
  • GET /v1/browser/discover/{mapId} - Get a single API discovery result
  • DELETE /v1/browser/discover/{mapId} - Delete an API discovery result
  • POST /v1/browser/replay - Replay a discovered API call

Testing

PiperTest — visual test session management and execution

  • GET /v1/test-sessions - List all test sessions
  • POST /v1/test-sessions - Create a new test session
  • POST /v1/test-sessions/import - Import a test session from exported format
  • GET /v1/test-sessions/{sessionId} - Get a test session by ID
  • PUT /v1/test-sessions/{sessionId} - Update a test session
  • DELETE /v1/test-sessions/{sessionId} - Delete a test session
  • POST /v1/test-sessions/{sessionId}/run - Run a saved test session
  • POST /v1/test-sessions/{sessionId}/run/cancel - Cancel an in-progress test run
  • POST /v1/test-sessions/{sessionId}/export - Export a test session to Playwright or Cypress code
  • POST /v1/tests/run - Run inline test steps (no saved session required)
  • POST /v1/browser/probe/scan - Scan the current page for interactive elements (PiperProbe)
  • GET /v1/test-sessions/{sessionId}/probe - Get probe results for a test session
  • POST /v1/test-sessions/{sessionId}/coverage - Compute combined coverage for a test session

Pose

Human pose estimation via Apple Vision / CoreML, including on-demand real-time 60fps skeleton streaming via WebSocket

  • POST /v1/pose/detect - Detect human poses in an image
  • GET /v1/pose/formats - List available pose output formats
  • GET /v1/pose/models - List available pose detection models
  • POST /v1/pose/stream/start - Start the pose stream WebSocket server
  • POST /v1/pose/stream/stop - Stop the pose stream WebSocket server
  • GET /v1/pose/stream/status - Get pose stream WebSocket status
  • GET /v1/pose/stream - Real-time pose streaming (WebSocket, on-demand)

Stream

Real-time stream processing

  • POST /v1/stream/start - Start a stream processing session
  • POST /v1/stream/stop - Stop the active stream processing session
  • GET /v1/stream/status - Get stream session status

Scrape

CDP-based web page scraping with framework-aware readiness. Extracts content in up to 7 formats (markdown, text, readability, axTree, html, links, screenshot) from a single page load

  • GET /v1/scrape - List recent scrape jobs
  • POST /v1/scrape - Start a web page scrape
  • GET /v1/scrape/{jobId} - Get scrape job status

Video

Video creator pipeline — settings, screenplays, recording, rendering, and narration

  • GET /v1/settings/video - Get video settings
  • PUT /v1/settings/video - Update video settings

Voice Chat

Persistent voice conversation with model selection, conversation memory, and sentence-level TTS streaming

  • GET /v1/voice-chat/settings - Get voice chat settings
  • PUT /v1/voice-chat/settings - Update voice chat settings
  • POST /v1/voice-chat/session - Create a voice chat session
  • DELETE /v1/voice-chat/session - End the active voice chat session

Pipeline

Workflow pipeline orchestration

  • POST /v1/pipeline/run - Execute a workflow pipeline (Pro)
  • POST /v1/pipeline/run/cancel - Cancel the active pipeline run

Tool Permissions

MCP tool permission policies

  • GET /v1/tool-permissions - List MCP tool permission policies
  • PUT /v1/tool-permissions - Set MCP tool permission policies

Tools

Unified tool catalog, retrieval probes, and session-scoped client-tool registration (PiperMatch / ToolGate plumbing)

  • GET /v1/tools/catalog - Snapshot of the active ToolGate catalog (DEBUG only)
  • POST /v1/tools/probe - Rank a candidate tool against PiperMatch on a set of cases
  • POST /v1/tools/register-client-catalog - Register session-scoped client tools into the unified index
  • POST /v1/tools/unregister-client-catalog - Unregister a session's client tool catalog
  • GET /v1/tools/native - Native tool catalog (unfiltered)

Capture

Screen and color capture utilities

  • POST /v1/screenshot - Take a screenshot
  • POST /v1/color/pick - Pick a color from the screen
  • POST /v1/camera/capture - Capture a webcam still

Files

File-system utilities — archive, PDF extraction, code search

  • POST /v1/archive - Create, extract, or list an archive
  • POST /v1/pdf/extract - Extract text, pages, metadata, or images from a PDF
  • POST /v1/filesystem/edit-file - Exact-string edit of a text file
  • POST /v1/filesystem/move-file - Move or rename a file or directory
  • POST /v1/search/code - Grep-style code search across a directory

Web

Outbound HTTP and web search

  • POST /v1/web/search - Run a web search
  • POST /v1/http/request - Perform an outbound HTTP request

Translation

On-device text translation (Apple Translation framework)

  • POST /v1/translate - Translate text via the Apple Translation framework

Utilities

General macOS utilities — clipboard history, timers, QR codes, image transforms

  • POST /v1/clipboard - Read, write, or manage clipboard history
  • POST /v1/timers - Start, list, cancel, or clear timers
  • POST /v1/qr/generate - Generate a QR code PNG
  • POST /v1/image/transform - Resize, crop, rotate, or convert an image

MCP

Model Context Protocol (Streamable HTTP transport)

  • GET /mcp - GET not supported for MCP
  • POST /mcp - Handle MCP JSON-RPC request
  • DELETE /mcp - Terminate an MCP session
  • GET /v1/mcp-servers - List MCP servers
  • GET /v1/mcp-servers/health - MCP server health
  • POST /v1/mcp-servers/{name}/restart - Restart an MCP server
  • POST /v1/mcp-servers/{name}/reload-tools - Reload an MCP server's tools
  • PUT /v1/mcp-servers/{name} - Add or update an MCP server
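
Most MCP clients handle the transport for you, but the POST /mcp route can be exercised directly. The sketch below builds JSON-RPC 2.0 messages for the Streamable HTTP transport: an "initialize" call, then follow-up calls that carry the session id the server returns in the Mcp-Session-Id response header. The protocol version string, the client name, the bearer-token header, and the helper names are all assumptions for illustration and are not confirmed by this page.

```python
import json
import urllib.request

MCP_URL = "http://localhost:9998/mcp"  # MCP route listed above


def build_jsonrpc_request(token, method, params=None, request_id=1,
                          session_id=None, url=MCP_URL):
    """Build (but do not send) one JSON-RPC 2.0 request over Streamable HTTP."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    headers = {
        "Authorization": f"Bearer {token}",  # assumed auth scheme
        "Content-Type": "application/json",
        # The transport expects clients to accept both response styles.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    return urllib.request.Request(
        url, data=json.dumps(message).encode("utf-8"), headers=headers, method="POST"
    )


def build_initialize_request(token, client_name="example-client",
                             protocol_version="2025-03-26"):
    """Build the first message of an MCP session."""
    params = {
        "protocolVersion": protocol_version,  # assumed; use what the server supports
        "capabilities": {},
        "clientInfo": {"name": client_name, "version": "0.1.0"},
    }
    return build_jsonrpc_request(token, "initialize", params)
```

After initializing, a call such as build_jsonrpc_request(token, "tools/list", request_id=2, session_id=...) would list the tools exposed by the categories you installed.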

Anthropic

Anthropic Messages API proxy — drop-in backend for Claude Code and any Anthropic-shape client

  • POST /v1/messages - Anthropic Messages (drop-in proxy for Claude Code)
  • POST /v1/messages/count_tokens - Estimate input tokens for an Anthropic Messages request
  • GET /v1/settings/anthropic-proxy - Get the global Anthropic Proxy backend setting
  • PATCH /v1/settings/anthropic-proxy - Update the global Anthropic Proxy backend setting
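
For clients that do not use an Anthropic SDK, a Messages request can be assembled by hand. This sketch follows the public Anthropic Messages shape (model, max_tokens, messages) and reads text blocks from the response. It assumes ToolPiper accepts its usual bearer token on this route. The anthropic-version header is included only because Anthropic's own API requires it, and ToolPiper may ignore it. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9998/v1"  # base URL given on this page


def build_messages_request(token, model, user_text, max_tokens=256,
                           base_url=BASE_URL):
    """Build (but do not send) a POST /v1/messages request."""
    payload = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Anthropic Messages shape
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{base_url}/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
            "anthropic-version": "2023-06-01",  # assumption, see lead-in
        },
        method="POST",
    )


def reply_text(message):
    """Join the text blocks of an Anthropic-shape response, skipping other block types."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    )
```

The model value would be whichever provider id the proxy backend maps to. The same payload, minus max_tokens semantics, can be sent to /v1/messages/count_tokens to estimate input size first.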

Claude Code

Zero-config integration with Anthropic's Claude Code CLI

  • POST /v1/claude-code/install - Install the claude-tp helper
  • POST /v1/claude-code/uninstall - Uninstall the claude-tp helper
  • GET /v1/claude-code/helper-status - claude-tp helper status

SERP

Search-engine results capture, rank tracking, and autocomplete keyword expansion (async job + poll)

  • POST /v1/serp/autocomplete - Expand a seed phrase via Google Suggest
  • POST /v1/serp/search - Start a SERP capture job
  • GET /v1/serp/search/{jobId} - Poll a SERP capture job
  • POST /v1/serp/rank-check - Start a SERP rank-check job
  • GET /v1/serp/rank-check/{jobId} - Poll a SERP rank-check job
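
The SERP routes follow a start-then-poll pattern, as do the scrape and RAG ingestion routes elsewhere on this page. A minimal polling helper might look like the sketch below. The shape of the job status response is not documented here, so the caller supplies both the fetch function and the "is it finished" predicate. The name poll_until and its parameters are hypothetical.

```python
import time


def poll_until(fetch, is_done, interval=1.0, timeout=60.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_done(result) is true or timeout elapses.

    fetch:   zero-argument callable returning the latest job status
    is_done: callable deciding whether a status means the job has finished
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        # Give up if another full wait would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

A caller would pass a fetch function that performs GET /v1/serp/search/{jobId} and a predicate matching whatever completion field the server returns. Injecting sleep and clock keeps the helper easy to test without real delays.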

API Connections

Saved cloud-provider API connections (key + base URL pairs) consumed by configurations

  • GET /v1/connections - List API connections
  • POST /v1/connections - Create an API connection
  • PUT /v1/connections/{connectionId} - Update an API connection
  • DELETE /v1/connections/{connectionId} - Delete an API connection

YouTube

YouTube transcript extraction

  • GET /v1/youtube/transcript - Fetch YouTube video transcript

Connected Apps

Connected OAuth/bearer clients — listing, pending-consent management, blocking, and revocation

  • GET /v1/connected-apps - List connected apps
  • GET /v1/connected-apps/blocked - List blocked apps
  • GET /v1/connected-apps/pending-consents - List pending OAuth consents
  • POST /v1/connected-apps/{id}/block - Block a connected app
  • POST /v1/connected-apps/{id}/revoke - Revoke a connected app's token
  • POST /v1/connected-apps/blocked/{id}/unblock - Unblock a blocked client

Action Piper

macOS UI automation — VLM grounding resolver, circuit breaker, and accessibility telemetry

  • POST /v1/action-piper/resolve - Resolve a click target to coordinates
  • POST /v1/action-piper/circuit/reset - Reset the Apple FM circuit breaker
  • GET /v1/action-piper/telemetry/summary - Accessibility outcome telemetry

Legacy: Ollama-compatible API (New, Legacy dialect)

ToolPiper serves the Ollama wire dialect as a legacy surface — a migration on-ramp, not a destination. Two mounts, one module: the opt-in loopback-only listener on :11434 (Settings → General, off by default, no auth — upstream's own posture), and this documented :9998 mount at /legacy/ollama/api/* behind the normal bearer middleware (set a configurable client's base URL to http://127.0.0.1:9998/legacy/ollama). Every response carries RFC 9745 "Deprecation" and a Link rel="successor-version" header — the surface was born deprecated, and each endpoint below names its /v1/ successor, which is where new integrations should land. Wire shapes are pinned to recorded fixtures from Ollama 0.23.4. Modelfile semantics (create, push, copy, blobs) are permanently rejected with guidance naming the successor route. The layer is deliberately disposable: it exists while it earns its keep, then it gets deleted.

Ollama port: http://127.0.0.1:11434 (opt-in listener; Settings → General, off by default, loopback-only)
Base URL: http://127.0.0.1:9998/legacy/ollama (always mounted, bearer-authenticated)

  • GET /legacy/ollama/api/version - Ollama-dialect version probe (successor: GET /version)
  • GET /legacy/ollama/api/tags - Ollama-dialect installed-model list (successor: GET /v1/models/installed)
  • GET /legacy/ollama/api/ps - Ollama-dialect loaded-model list (successor: GET /v1/models/state)
  • POST /legacy/ollama/api/show - Ollama-dialect per-model detail (successor: GET /v1/models/installed)
  • POST /legacy/ollama/api/chat - Ollama-dialect chat, NDJSON stream (successor: POST /v1/chat/completions)
  • POST /legacy/ollama/api/generate - Ollama-dialect generate, NDJSON stream (successor: POST /v1/chat/completions)
  • POST /legacy/ollama/api/embed - Ollama-dialect embeddings (successor: POST /v1/embeddings)
  • POST /legacy/ollama/api/embeddings - Ollama-dialect embeddings, an alias upstream itself deprecates (successor: POST /v1/embeddings)
  • POST /legacy/ollama/api/pull - Ollama-dialect model pull, NDJSON progress (successor: POST /v1/models/download)
  • DELETE /legacy/ollama/api/delete - Ollama-dialect model delete (successor: DELETE /v1/models/{id})
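
Clients still on the legacy surface receive chat output as NDJSON: one JSON object per line, each carrying a fragment in message.content, with "done": true on the final object. The sketch below reassembles such a stream and reads the deprecation headers this section describes. It assumes the usual Ollama chat chunk shape. The function names are hypothetical, and the header lookup takes a plain dict for simplicity.

```python
import json


def iter_ndjson(lines):
    """Yield one decoded JSON object per non-empty line (str or bytes)."""
    for raw in lines:
        line = raw.strip()
        if line:
            yield json.loads(line)


def collect_chat_text(chunks):
    """Concatenate message.content fragments up to and including the done chunk."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def successor_hint(headers):
    """Return the (Deprecation, Link) header values, or None where absent."""
    return headers.get("Deprecation"), headers.get("Link")
```

Passing the response object from urllib.request.urlopen to iter_ndjson works directly, since iterating an HTTP response yields one line of bytes at a time. The Link value from successor_hint names the /v1/ route to migrate to.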