Shortcuts.app is the macOS automation layer most people forget exists. It can call HTTP APIs, run shell commands, and trigger from the menu bar, a keyboard shortcut, or Siri. Pair it with ToolPiper and you get on-demand AI behind any keystroke or menu pick. The same goes for hotkeys bound through Raycast, Karabiner-Elements, BetterTouchTool, or ToolPiper's own hotkey system.

What can macOS Shortcuts do with ToolPiper?

Anything you can express as a chain of Shortcuts actions can route through ToolPiper. The Get Contents of URL action POSTs to ToolPiper's local HTTP API, and the response comes back as a dictionary you can pipe into the next step. That is enough to OCR the clipboard, transcribe a Voice Memo, summarize a selected article, or run a screen Q&A from a shortcut on your menu bar.

Here are a few shortcuts that already work end to end. An OCR-clipboard shortcut is three steps: get the clipboard, POST it to /vision/ocr, show the result. A Voice Memo transcriber picks a memo, POSTs it to /audio/transcribe, and saves the transcript as a Note. A summarize-selection shortcut grabs the clipboard, POSTs to /chat with a summarize prompt, and replaces the clipboard with the result. A full screen Q&A chain pipes /vision/screenshot into /chat with the image as context, then speaks the answer through /audio/speak.

How do I build a Shortcut that calls ToolPiper?

Open Shortcuts.app, create a new shortcut, add a Get Contents of URL action. URL is http://127.0.0.1:9998/<endpoint>. Method is POST. Headers include Content-Type: application/json. Request Body is JSON. Pipe the dictionary response into your next step.
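The same request can be reproduced from Terminal, which is a handy way to confirm an endpoint and body before wiring them into a shortcut. The sketch below mirrors the URL, method, and header listed above. The function name tp_post and the TOOLPIPER_URL override are hypothetical names chosen for illustration, not part of ToolPiper.

```shell
#!/bin/bash
# Base URL from the text above. TOOLPIPER_URL is a hypothetical override, not a ToolPiper setting.
TOOLPIPER_URL="${TOOLPIPER_URL:-http://127.0.0.1:9998}"

# tp_post ENDPOINT JSON_BODY
# POSTs a JSON body to the local API and prints the raw response.
tp_post() {
  local endpoint
  local body
  endpoint="${1#/}"   # tolerate a leading slash on the endpoint
  body="$2"
  curl -s -X POST "$TOOLPIPER_URL/$endpoint" \
    -H 'Content-Type: application/json' \
    -d "$body"
}
```

For example, tp_post vision/screenshot '{ "mode": "full" }' sends the same kind of request a Get Contents of URL action would.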

Step by step for a clipboard OCR shortcut:

  1. Open Shortcuts.app. Create a new Shortcut. Name it "OCR Clipboard".
  2. Add: Get Clipboard. First action. Captures whatever's in the clipboard.
  3. Add: Get Contents of URL. URL: http://127.0.0.1:9998/vision/ocr. Method: POST. Headers: Content-Type = application/json. Request Body: JSON. The body is { "image": "<clipboard>" } where <clipboard> is the Get Clipboard variable.
  4. Add: Get Dictionary Value. Get the text key from the response.
  5. Add: Show Result. Display the extracted text.
  6. Run. Copy an image to your clipboard, then run the shortcut. The OCR'd text appears.

Once it works, open the shortcut's Details panel to pin it to the menu bar or add a keyboard shortcut. You now have one-keystroke OCR backed by local AI.
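A finished shortcut can also be started from a script, which lets any hotkey tool trigger it. The sketch below uses the shortcuts command-line tool that ships with macOS 12 and later. The function name run_shortcut is a hypothetical helper, and it assumes the shortcut was named "OCR Clipboard" as in step 1.

```shell
#!/bin/bash
# run_shortcut NAME
# Runs a shortcut by name, failing early if no shortcut with that exact name exists.
run_shortcut() {
  local name
  name="$1"
  # `shortcuts list` prints one name per line; -F matches literally, -x matches the whole line.
  if ! shortcuts list | grep -Fxq -- "$name"; then
    echo "No shortcut named '$name'" >&2
    return 1
  fi
  shortcuts run "$name"
}
```

Calling run_shortcut "OCR Clipboard" from a hotkey-bound script behaves like running the shortcut from the menu bar.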

How do hotkeys call ToolPiper?

Any tool that can run a shell command on a hotkey can call ToolPiper. The shell command is usually a single curl invocation against the HTTP API, optionally with jq for response parsing. That covers Raycast script commands, Karabiner-Elements complex modifications, BetterTouchTool triggered actions, and Hammerspoon bindings.

Sample Raycast script command, saved as screen-ocr.sh in your Raycast scripts folder:

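#!/bin/bash
# Raycast requires schemaVersion, title, and mode for a script command.
# @raycast.schemaVersion 1
# @raycast.title Screen OCR
# @raycast.mode silent
# @raycast.packageName ToolPiper

# Stop on the first failed step instead of copying an empty result.
set -euo pipefail
API=http://127.0.0.1:9998

# Capture the full screen; the response carries the path of the saved image.
SHOT=$(curl -sf -X POST "$API/vision/screenshot" \
  -H 'Content-Type: application/json' \
  -d '{ "mode": "full" }' | jq -r '.path')

# OCR the screenshot. jq builds the body so the path is escaped correctly.
TEXT=$(jq -n --arg path "$SHOT" '{path: $path}' | curl -sf -X POST "$API/vision/ocr" \
  -H 'Content-Type: application/json' \
  -d @- | jq -r '.text')

# printf avoids echo's inconsistent handling of -n and backslashes.
printf '%s' "$TEXT" | pbcopy
osascript -e 'display notification "Screen text copied to clipboard" with title "ToolPiper"'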

Bind a hotkey to that command in Raycast's preferences. Press the hotkey and the visible screen text lands on your clipboard. The whole flow runs locally.

How does push-to-talk dictation use this?

ToolPiper ships with built-in push-to-talk dictation that uses the same HTTP API path internally. Holding the right Option key starts an audio recording. Releasing it stops the recording, transcribes through /audio/transcribe, and pastes the result wherever your cursor is.

The push-to-talk system is configured in ToolPiper's Preferences -> Push to Talk. Both the dictation hotkey (right Option by default) and the command hotkey (right Command by default) live there. You can change the modifiers, the minimum audio threshold, and the post-transcription behavior (paste, copy, or show).

The command hotkey is a more interesting variant. Hold right Command, speak a request, release. ToolPiper transcribes the audio, sends the text to the configured LLM along with the full MCP tool catalog, and the model decides which tools to call. Say "take a screenshot and tell me what's on screen" while holding right Command and a few seconds later you get a notification with the answer. All local, no cloud round-trip.

What if a Shortcut times out or fails?

The default timeout in Shortcuts' Get Contents of URL is generous, but long inference can still exceed it. Calls such as chat against a large local model or video_render can take 30 seconds or more. If a Shortcut fails silently, check whether the underlying call returned an error or simply ran out of time.
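To see how long a call really takes, it can help to time it from Terminal with an explicit ceiling. The sketch below uses curl's --max-time and -w options. The function name tp_timed_post and the 120-second default are illustrative assumptions, not ToolPiper settings.

```shell
#!/bin/bash
# tp_timed_post ENDPOINT JSON_BODY [MAX_SECONDS]
# Prints the response body, then a line with the HTTP status and total elapsed time.
tp_timed_post() {
  local endpoint
  local body
  local max
  endpoint="${1#/}"
  body="$2"
  max="${3:-120}"   # assumed ceiling; raise it for slow models
  curl -s -X POST "http://127.0.0.1:9998/$endpoint" \
    -H 'Content-Type: application/json' \
    -d "$body" \
    --max-time "$max" \
    -w '\nstatus=%{http_code} seconds=%{time_total}\n'
}
```

Running tp_timed_post vision/screenshot '{ "mode": "full" }' 60 shows the pattern; substitute the slow call's endpoint and body to measure it. If curl gives up at the ceiling it exits with status 28, which separates a timeout from an error returned by the tool.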

Two debugging tactics:

  1. Test the curl command directly first. Run the same POST from Terminal with curl. If it works there, the issue is in Shortcuts' handling.
  2. Add error handling in the Shortcut. Use the If action on the response's error field. Show the error message in a notification. Most tool errors are surfaced as readable strings.
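The same check on the error field can be sketched for the Terminal side of the debugging. It assumes, as described above, that failures carry a readable string under an error key and that successful OCR and transcription responses carry a text key. The function name tp_text is hypothetical, and jq must be installed.

```shell
#!/bin/bash
# tp_text
# Reads a JSON response on stdin. Prints .text on success, or reports .error and fails.
tp_text() {
  local response
  local err
  response=$(cat)
  # `// empty` yields nothing when the key is missing or null.
  err=$(printf '%s' "$response" | jq -r '.error // empty') || {
    echo "Response was not valid JSON" >&2
    return 2
  }
  if [ -n "$err" ]; then
    echo "ToolPiper error: $err" >&2
    return 1
  fi
  printf '%s' "$response" | jq -r '.text // empty'
}
```

Piping a curl response into tp_text gives a non-zero exit status on failure, so a calling script can branch on it the way the If action does inside a Shortcut.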

For complex workflows (multi-step chains, retry logic, error fan-out), consider writing a shell script instead and binding it to a hotkey. Shortcuts is excellent for short flows but starts to struggle as branching gets deep.
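As an illustration of the retry logic that is awkward to build in Shortcuts, here is a minimal sketch of a generic retry wrapper with a doubling delay. The function name retry and its argument order are choices made for this example and are not part of ToolPiper.

```shell
#!/bin/bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is used up, doubling the delay after each failure.
retry() {
  local attempts
  local delay
  local n
  attempts="$1"
  delay="$2"
  n=1
  shift 2
  while true; do
    "$@" && return 0
    if [ "$n" -ge "$attempts" ]; then
      echo "Giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "Attempt $n failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    n=$((n + 1))
  done
}
```

A call such as retry 3 2 curl -sf -X POST http://127.0.0.1:9998/vision/screenshot -H 'Content-Type: application/json' -d '{ "mode": "full" }' tries up to three times, waiting 2 and then 4 seconds between attempts. The -f flag matters here: it makes curl exit non-zero on HTTP error statuses, which is what tells the wrapper to retry.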