---
title: "VisionPiper: Screen Capture with Live AI Streaming for Mac"
description: "Capture any screen region, record video, export GIFs, and stream live to vision models. A steerable camera for your screen. Free on the Mac App Store."
type: "product"
canonical: "https://modelpiper.com/visionpiper/"
---

# VisionPiper: Screen Capture with Live AI Streaming for Mac

> Capture any screen region, record video, export GIFs, and stream live to vision models. A steerable camera for your screen. Free on the Mac App Store.

**Free on Mac App Store**

## Screen capture that talks to AI.

VisionPiper is a standalone menu bar app for macOS. It captures screen regions with precision borders, records H.264 video, trims recordings and exports them as GIF, WebP, or MP4, and streams frames over WebSocket to ToolPiper's vision models at 30fps. Think of it as a screen-mounted camera you can point at anything.

- [Download from Mac App Store](https://apps.apple.com/us/app/visionpiper/id6759798951?mt=12)
- [See vision AI workflows](/workflows/vision-ai)

## Precision Screen Capture

#### 8-Window Border

Four edges + four corners, pass-through interior. Click through the capture region to interact with everything inside it normally.
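
As a rough illustration of the layout described above, the eight frames can be derived from the capture region and a border thickness. This is a minimal geometry sketch, not VisionPiper's actual code: the `Rect` type, the `border_rects` helper, the thickness value, and the top-left-origin coordinate system are all assumptions made for the example.

```python
from typing import Dict, NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    w: float
    h: float


def border_rects(region: Rect, t: float) -> Dict[str, Rect]:
    """Return eight rects (four edges, four corners) that frame `region`.

    Hypothetical helper: every rect sits outside the region, so the
    interior stays uncovered and input can pass straight through it.
    """
    x, y, w, h = region
    return {
        "top_left": Rect(x - t, y - t, t, t),
        "top": Rect(x, y - t, w, t),
        "top_right": Rect(x + w, y - t, t, t),
        "left": Rect(x - t, y, t, h),
        "right": Rect(x + w, y, t, h),
        "bottom_left": Rect(x - t, y + h, t, t),
        "bottom": Rect(x, y + h, w, t),
        "bottom_right": Rect(x + w, y + h, t, t),
    }


def contains(r: Rect, px: float, py: float) -> bool:
    """True if the point (px, py) falls inside rect r."""
    return r.x <= px < r.x + r.w and r.y <= py < r.y + r.h


region = Rect(100, 100, 400, 300)
frames = border_rects(region, t=4)
# The centre of the region is not covered by any border frame.
covered = any(contains(r, 300, 250) for r in frames.values())
```

Because no frame overlaps the region itself, anything placed inside it receives clicks, drags, and scrolls as usual.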

#### Movable While Recording

Drag the border to follow content while recording is active. A steerable camera that tracks multi-step workflows across panels.

#### Region Persistence

Last capture region is remembered between sessions. Launch VisionPiper and your previous region is already set.

#### Keyboard Shortcuts

Quick-capture, toggle recording, and adjust the region from the keyboard. Fast enough for rapid bug documentation.

#### Multi-Monitor

Works across all connected displays. Place the capture region on any monitor regardless of resolution or scaling.

#### High DPI

Retina-aware capture at native resolution. Every pixel captured at the display's actual density.
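
To make "native resolution" concrete: a region measured in points maps to more pixels on a high-density display. The sketch below shows that arithmetic under stated assumptions. The `points_to_pixels` helper and the outward-rounding rule are illustrative choices, not a description of how VisionPiper computes its capture rect.

```python
import math


def points_to_pixels(x, y, w, h, scale):
    """Convert a region in points to a pixel rect (x, y, w, h).

    Hypothetical helper: edges are rounded outward so the pixel rect
    always covers the full point-space region.
    """
    x0 = math.floor(x * scale)
    y0 = math.floor(y * scale)
    x1 = math.ceil((x + w) * scale)
    y1 = math.ceil((y + h) * scale)
    return (x0, y0, x1 - x0, y1 - y0)


# A 100x50 point region on a 2x display covers 200x100 pixels.
retina = points_to_pixels(10.5, 20, 100, 50, 2.0)
standard = points_to_pixels(10.5, 20, 100, 50, 1.0)
```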

## Recording & Export

#### H.264 Video

SCStream + AVAssetWriter for efficient, hardware-accelerated recording. Captures the selected region as .mp4 with minimal CPU overhead.

#### GIF Export

Built-in cgif + libimagequant for high-quality, small-file GIFs. Allocates color information to changing pixels, not static backgrounds.
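
The idea of spending data only on what changed can be illustrated with simple frame differencing. This is a generic sketch, not the internals of cgif or libimagequant: frames are modelled as plain lists of pixel values, and `changed_bbox` is a hypothetical helper that finds the smallest rectangle containing every pixel that differs between two frames.

```python
def changed_bbox(prev, curr):
    """Return (x, y, w, h) of the smallest box covering all differing
    pixels between two equally sized frames, or None if they match."""
    xs, ys = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (a, b) in enumerate(zip(row_prev, row_curr)):
            if a != b:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


# A 6x4 frame where only two pixels change between frames.
frame_a = [[0] * 6 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[1][2] = 9
frame_b[2][4] = 9

box = changed_bbox(frame_a, frame_b)
same = changed_bbox(frame_a, frame_a)
```

Only the pixels inside `box` need new data for the second frame; the static background outside it can be carried over from the previous frame.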

#### WebP Export

Modern format for web-ready screen captures. Full color and alpha channel support, smaller files than GIF.

#### Trim Editor

Cut start and end of recordings before export. No external editor needed. Preview, trim, and export in one step.

## Live AI Streaming

#### 30fps WebSocket Stream

Metal-encoded JPEG frames streamed to ToolPiper over WebSocket on port 10000. Hardware-accelerated encoding keeps CPU usage low during continuous streaming.
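
Holding a stream at a fixed rate usually means dropping frames that arrive faster than the target interval. The sketch below shows one way such pacing could work. It is an assumption-laden illustration: the `pace_frames` helper and its timestamp input are hypothetical, and it says nothing about VisionPiper's actual wire protocol or scheduling.

```python
def pace_frames(timestamps, fps=30.0):
    """Return the capture timestamps (in seconds) that would be sent
    when frames must be at least 1/fps apart; the rest are dropped."""
    interval = 1.0 / fps
    eps = 1e-9  # tolerance for floating-point rounding
    kept = []
    next_due = None
    for t in timestamps:
        if next_due is None or t >= next_due - eps:
            kept.append(t)
            next_due = t + interval
    return kept


# One second of capture at 60fps, paced down to a 30fps stream.
captured = [i / 60 for i in range(60)]
sent = pace_frames(captured, fps=30.0)
```

With a 60fps source, every second frame is kept, so the sender never exceeds the 30fps budget.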

#### Screen Q&A

Ask AI questions about what's on your screen in real time. The vision model sees exactly what you see, updated every frame.

#### Image Narration

Vision model describes the screen content, TTS reads it aloud. An accessibility workflow that works with any on-screen content.

#### OCR Pipeline

Extract text from any screen region using Apple Vision OCR. No cloud API, no rate limits. Runs entirely on-device.

## FAQ

### Does it work without ToolPiper?

Yes. Screen capture, video recording, trim editing, and GIF/WebP/MP4 export all work standalone. ToolPiper is only needed for AI streaming — sending frames to vision models for Screen Q&A, image narration, and OCR.

### What macOS version is required?

macOS 26 or later on Apple Silicon (M1 or newer). VisionPiper is built on ScreenCaptureKit and Metal.

### Can I click through the capture region?

Yes. The 8-window architecture uses four edge windows and four corner windows with a pass-through interior. Everything inside the border is fully interactive — clicks, drags, and scrolls pass through to the underlying app.

### Does it record system audio?

No. VisionPiper captures video only. For audio capture, use AudioPiper — a separate free app that records mic, system audio, and per-app audio via Core Audio Taps.
