Screen capture on macOS is solid for the basics - Command+Shift+4 for screenshots, Command+Shift+5 for recording. But once you need a specific region that you can move during recording, or you want to export as GIF, or you want to feed what's on your screen into an AI model - the built-in tools stop and the paid options start.

CleanShot X is $29. Kap is free but limited to basic recording. None of them can stream your screen to an AI vision model in real time.

VisionPiper is a free macOS menu bar app that captures any screen region, records video you can steer, exports to GIF/WebP/MP4, and streams live to AI vision models.

How does it work?

You define a region on screen - any size, any monitor. VisionPiper draws a visible border around it that's fully click-through, so you can interact with everything inside the region normally. Think of it as a viewfinder that doesn't get in the way.

When you record, VisionPiper writes H.264 .mp4 video of whatever is inside that region. The key feature: the region is movable during recording. You can drag it to follow a multi-step workflow, reposition it to capture a different panel, or track an element that moves on screen. We call this the "steerable camera" - instead of recording your whole screen and cropping later, you're directing the camera in real time.
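Conceptually, a steerable camera is simple: each output frame is a fixed-size crop of the full display frame, taken at wherever the region rectangle happens to sit at that instant. Here's a minimal sketch of that idea in Python - the frame representation and function names are illustrative, not VisionPiper's actual API:

```python
# Sketch of a "steerable" crop: each captured display frame is paired
# with the region's position at that moment, and the output is a
# fixed-size crop per frame. Names here are illustrative only.

def crop_region(frame, x, y, w, h):
    """Crop a w-by-h rectangle at (x, y) from a frame given as rows of pixels."""
    return [row[x:x + w] for row in frame[y:y + h]]

def record_steerable(frames, region_positions, w, h):
    """Produce fixed-size output frames while the region moves between frames."""
    return [
        crop_region(frame, x, y, w, h)
        for frame, (x, y) in zip(frames, region_positions)
    ]
```

Dragging the region one pixel to the right between two frames yields two crops of identical size taken from different offsets - which is exactly why the output stays a stable, fixed-resolution video even as the camera moves.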

When you stop recording, the trim editor opens. Cut the start and end, then export as GIF, WebP, MP4 (H.264), or MP4 (HEVC) with configurable scale and frame rate. The GIF encoder produces noticeably smaller and sharper files than generic converters - it allocates color information to the parts of the image that are actually changing between frames, rather than treating every pixel equally.
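The core idea behind that kind of motion-aware palette is easy to sketch: when quantizing down to GIF's limited palette, sample colors more heavily from pixels that changed since the previous frame, since static background tones need fewer palette slots. A toy illustration of the weighting step - not VisionPiper's actual encoder:

```python
from collections import Counter

def weighted_color_counts(frames, motion_weight=4):
    """Count colors across frames (each frame a flat list of color values),
    weighting pixels that changed since the previous frame more heavily.
    A real quantizer would cluster or rank these counts to build a palette."""
    counts = Counter()
    prev = None
    for frame in frames:
        for i, color in enumerate(frame):
            changed = prev is not None and prev[i] != color
            counts[color] += motion_weight if changed else 1
        prev = frame
    return counts

def build_palette(frames, size, motion_weight=4):
    """Pick the `size` most heavily weighted colors as the palette."""
    counts = weighted_color_counts(frames, motion_weight)
    return [color for color, _ in counts.most_common(size)]
```

With a small palette budget, a color that only ever appears in the changing part of the image outranks a rarer static one, so the moving content - usually what the viewer is watching - gets the sharper color reproduction.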

The feature that has no equivalent in other capture tools: live AI streaming. VisionPiper can stream frames at 30fps to ToolPiper (our local AI engine). From there, a vision-capable model like Qwen3 VL sees your screen and can answer questions about it - that's how the Screen Q&A workflow works. The model sees exactly what you see, updated in real time, and everything runs on your Mac.
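Streaming at a steady 30fps amounts to a pacing loop: capture a frame, forward it, then sleep until the next 1/30-second tick so frames arrive at a regular cadence rather than as fast as capture allows. A hedged sketch of such a loop - the `capture` and `send` callables are stand-ins, not ToolPiper's real transport:

```python
import time

FPS = 30
FRAME_INTERVAL = 1.0 / FPS

def stream(capture, send, num_frames, clock=time.monotonic, sleep=time.sleep):
    """Forward frames at a steady cadence. `capture` grabs a frame and
    `send` delivers it downstream; both are illustrative stand-ins."""
    next_tick = clock()
    for _ in range(num_frames):
        send(capture())
        next_tick += FRAME_INTERVAL
        delay = next_tick - clock()
        if delay > 0:
            sleep(delay)
```

Scheduling against an absolute `next_tick` rather than sleeping a fixed interval after each send keeps the cadence from drifting when an individual capture or send takes longer than usual.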

How do you use VisionPiper?

Download VisionPiper from the Mac App Store (free). It lives in your menu bar. macOS will ask for Screen Recording permission the first time - grant it once, and you're set.

To capture a screenshot: Click the menu bar icon, adjust the region by dragging the border if needed, then click Capture. Done.

To record a screencast: Click Record. The selected region records as H.264 .mp4. Move the region around as needed to follow what you're demonstrating. Click Stop. The trim editor opens - cut, preview, and export in your format of choice.

To make a GIF: Record first, then export from the trim editor as GIF. Set the scale (full size or downscaled) and frame rate. For higher quality and smaller files, try WebP - it supports full color and alpha, unlike GIF's 256-color limit.
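Lowering the export frame rate is essentially frame decimation: keeping the subset of recorded frames whose timestamps best match the target rate. A small sketch of that selection, assuming evenly spaced source frames (illustrative, not the app's actual exporter):

```python
def decimate(frames, src_fps, dst_fps):
    """Keep the source frame nearest each output tick. Assumes evenly
    spaced source frames and dst_fps <= src_fps."""
    if dst_fps >= src_fps:
        return list(frames)
    step = src_fps / dst_fps          # source frames per output frame
    count = int(len(frames) / step)   # number of output frames
    return [frames[round(i * step)] for i in range(count)]
```

Dropping a 60fps recording to 15fps this way keeps every fourth frame - a quarter of the data before the encoder even starts, which is most of why a lower frame rate shrinks GIF exports so dramatically.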

To stream your screen to AI: With ToolPiper running, VisionPiper streams frames automatically. Open the Screen Q&A template in ModelPiper and start asking questions about what's on your screen.

VisionPiper works across multiple monitors - place the region on any connected display, regardless of resolution or scaling differences.

What people actually use this for

Bug reports. You found a visual bug and need to show exactly what's happening. Select the region around it, record the reproduction steps, export as GIF. One file in the issue ticket that shows everything. The steerable camera means you can follow a bug through multiple screens or panels without recording the entire desktop.

Asking AI about what's on your screen. You're staring at a log output and can't find the error. Or there's a chart you need help interpreting. Or you want a second opinion on a CSS layout. Stream the region to a vision model and ask. The model sees what you see and responds in context - no screenshots, no copy-paste, no uploading to a cloud service.

Documentation and tutorials. Record a workflow, trim the dead time at the beginning and end, export as GIF for your README or WebP for a blog post. The border shows viewers exactly which part of the screen you're focused on.

Presentation recording. Put the region over the slide area and hit record. If the presenter switches to a demo, drag the region to follow. You get a focused recording of the content, not a recording of your entire desktop with Slack notifications popping up.

Monitoring dashboards. Stream a Grafana dashboard or deployment log to a vision model. VisionPiper detects when the screen content actually changes, so it doesn't waste processing on static frames.
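Skipping static frames can be done with a cheap fingerprint: hash each frame and only pass it downstream when the digest differs from the previous one. A minimal sketch of that idea - illustrative, not VisionPiper's actual detector:

```python
import hashlib

def changed_frames(frames):
    """Yield only frames whose content differs from the previous frame,
    comparing cheap digests instead of full pixel buffers."""
    last_digest = None
    for frame in frames:  # each frame is a bytes buffer of pixel data
        digest = hashlib.sha256(frame).digest()
        if digest != last_digest:
            yield frame
            last_digest = digest
```

On a dashboard that refreshes once a minute, this collapses thousands of identical frames into a handful of distinct ones, so the vision model only does work when something on screen actually changes.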

What VisionPiper doesn't do

VisionPiper captures a region of your screen - it's not a full video editor. There's no annotation, no callouts, no cursor highlighting. If you need those, CleanShot X or a dedicated editor is the better choice. VisionPiper is intentionally focused: capture, record, export, stream.

Screen Recording permission is required, and macOS shows a one-time prompt for it. Some users find this permission invasive - it's the same prompt macOS shows for any screen recording or sharing app.

DRM-protected content (like streaming video in certain browsers) may appear as a black region in the capture. This is an OS-level restriction that affects all screen capture tools.

AI streaming requires ToolPiper running on your Mac. Without it, VisionPiper works as a standalone capture and recording tool - the AI features just won't be available.

Get VisionPiper

VisionPiper is free on the Mac App Store. Requires macOS 26 or later and Apple Silicon (M1+).

For AI vision workflows (like Screen Q&A), you'll also need ToolPiper - our local AI engine that runs vision models on your Mac.

This is part of a series on local-first AI workflows on macOS. See also: Screen Q&A, Image Narration.