Test recorders have existed for over a decade. Selenium IDE launched in 2006. Every major testing framework has one now. Playwright has codegen. Cypress has Studio. BugBug, Testim, and mabl all have point-and-click recorders. The idea isn't new.
What's new is what happens between the click and the test step. Most recorders capture a DOM event and generate a selector. Click a button, get .btn-primary-lg. Fill a field, get #email-input. The selector is a snapshot of the DOM at recording time, and it starts decaying the moment you save the test.
PiperTest's recorder does something different. It captures the DOM event, then asks Chrome's accessibility tree what the element actually is - its role, its accessible name, its position in the semantic page structure. The output isn't a CSS selector. It's a structured step with full AX metadata that tells the test runner not just where to click, but what it's clicking and why it's identifiable.
How does the recording pipeline work?
Recording is a three-stage pipeline. Each stage adds information that the next stage uses.
Stage 1: JavaScript event capture (RecorderScript)
When you start recording, PiperTest injects a JavaScript observer into the page via CDP's Runtime.evaluate. This is the RecorderScript - a self-contained script that listens for user interactions and emits structured action objects.
The script captures clicks, form fills, keyboard presses, scrolls, and navigation events. For each event, it does several things immediately in JavaScript:
Finds the interactive ancestor. When you click a <span> inside a <button>, the click event targets the span. The script walks up the DOM tree to find the nearest interactive element - the button, link, or input that the user actually intended to interact with. It checks against a set of interactive HTML tags (button, a, input, select, textarea) and interactive ARIA roles (button, link, tab, menuitem, combobox, textbox). This prevents the recorder from targeting inner decoration elements that have no stable identity.
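The ancestor walk can be sketched as follows. This is a minimal, DOM-free illustration under assumed names (ElemNode, findInteractiveAncestor) - not PiperTest's actual implementation:

```typescript
// Simplified element shape standing in for a real DOM node.
interface ElemNode {
  tag: string;
  role?: string;
  parent?: ElemNode;
}

const INTERACTIVE_TAGS = new Set(["button", "a", "input", "select", "textarea"]);
const INTERACTIVE_ROLES = new Set([
  "button", "link", "tab", "menuitem", "combobox", "textbox",
]);

function findInteractiveAncestor(el: ElemNode): ElemNode {
  // Walk up from the event target until a tag or ARIA role identifies
  // a real interactive control; fall back to the original target.
  let node: ElemNode | undefined = el;
  while (node) {
    if (INTERACTIVE_TAGS.has(node.tag) || (node.role !== undefined && INTERACTIVE_ROLES.has(node.role))) {
      return node;
    }
    node = node.parent;
  }
  return el;
}
```

A click on a span inside a button resolves to the button, which has a stable identity the span lacks.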
Computes an accessible name. The script checks aria-label, then aria-labelledby (resolving to the referenced element's text), then alt text for images, then placeholder and title for inputs, then text content for elements whose ARIA role derives naming from content (buttons, links, headings, tabs). The name computation follows the W3C Accessible Name and Description Computation spec as closely as possible in a lightweight client-side implementation.
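The priority chain reads naturally as a sequence of early returns. This sketch uses assumed names (NamedElem, computeAccessibleName) and models attributes as a plain record rather than a live DOM:

```typescript
interface NamedElem {
  tag: string;
  role?: string;
  attrs: Record<string, string>;
  text?: string;
  labelledByText?: string; // text of the element referenced by aria-labelledby
}

// Roles whose accessible name derives from their content.
const NAMES_FROM_CONTENT = new Set(["button", "link", "heading", "tab"]);

function computeAccessibleName(el: NamedElem): string {
  if (el.attrs["aria-label"]) return el.attrs["aria-label"];
  if (el.labelledByText) return el.labelledByText;
  if (el.tag === "img" && el.attrs["alt"]) return el.attrs["alt"];
  if (el.attrs["placeholder"]) return el.attrs["placeholder"];
  if (el.attrs["title"]) return el.attrs["title"];
  if (el.role && NAMES_FROM_CONTENT.has(el.role) && el.text) return el.text.trim();
  return "";
}
```

The full W3C spec handles many more cases (hidden subtrees, embedded controls, recursive labelledby resolution); this shows only the precedence order described above.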
Extracts multiple selector strategies. For each element, the script collects: the inferred ARIA role, the computed accessible name, the associated label text (from <label for="..."> or parent <label>), the data-testid attribute, the visible text content, and a minimal CSS selector (ID, class-based, or nth-of-type). These aren't all used - they're collected so the enrichment stage can choose the best one.
Stores the element reference. The script assigns each action an incrementing ID and stores the DOM element reference in window.__piperElements[id]. This reference is used in Stage 2 to query Chrome's accessibility tree for the exact element that was interacted with.
The captured action is serialized to JSON and sent to ToolPiper via CDP's Runtime.addBinding mechanism. This is a synchronous, reliable channel from page JavaScript to the CDP client - no HTTP calls, no WebSocket overhead, no message loss.
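The wiring can be sketched with raw CDP messages. The binding name __piperRecord and the helper names here are illustrative assumptions; Runtime.addBinding and Runtime.bindingCalled are real CDP methods:

```typescript
let nextId = 0;

// Build a CDP command frame for the already-open WebSocket.
function cdpCommand(method: string, params: object = {}): string {
  return JSON.stringify({ id: ++nextId, method, params });
}

// 1. Register the binding before injecting the RecorderScript.
//    This exposes window.__piperRecord(payload) inside the page.
const addBinding = cdpCommand("Runtime.addBinding", { name: "__piperRecord" });

// 2. In-page, the RecorderScript serializes each action and calls:
//    window.__piperRecord(JSON.stringify(action));

// 3. The client receives it as a Runtime.bindingCalled event on the
//    same socket - no HTTP round-trip.
function handleCdpEvent(raw: string): object | null {
  const msg = JSON.parse(raw);
  if (msg.method === "Runtime.bindingCalled" && msg.params.name === "__piperRecord") {
    return JSON.parse(msg.params.payload); // the structured action object
  }
  return null;
}
```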
Stage 2: AX tree enrichment (CDPBrowserService)
When ToolPiper receives a recorded action, it doesn't trust the JavaScript-computed selectors. Instead, it queries Chrome's real accessibility tree for the element that was clicked.
The enrichment process uses enrichForSelector, a single-pass function that:
- Resolves the stored DOM element reference to a CDP backendNodeId
- Calls Accessibility.queryAXTree with that node ID to get the element's AX node
- Extracts the cdpRole (Chrome's computed ARIA role) and cdpName (Chrome's computed accessible name)
- Checks how many other elements on the page match the same role and name combination (matchCount)
- Walks the AX tree ancestry to build the axPath - the full chain from the document root to the target element
- Collects elementMeta: the element's tag, role, name, description, and bounding box
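The core of the enrichment can be sketched against the CDP API. DOM.describeNode and Accessibility.queryAXTree are real CDP methods; the send abstraction and this simplified enrichForSelector body are assumptions inferred from the description above:

```typescript
// Generic sender over the already-open CDP WebSocket.
type Send = (method: string, params: object) => Promise<any>;

async function enrichForSelector(send: Send, objectId: string) {
  // Resolve the stored element reference to a backendNodeId.
  const { node } = await send("DOM.describeNode", { objectId });

  // Ask Chrome's real accessibility tree about this exact node.
  const { nodes } = await send("Accessibility.queryAXTree", {
    backendNodeId: node.backendNodeId,
  });
  const axNode = nodes[0];

  return {
    cdpRole: axNode?.role?.value, // Chrome's computed ARIA role
    cdpName: axNode?.name?.value, // Chrome's computed accessible name
  };
}
```

The real function also computes matchCount, the axPath ancestry, and elementMeta in the same pass.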
This is the critical step. The JavaScript-side name computation is a best-effort approximation. Chrome's AX tree is the authoritative source. When they disagree - and they sometimes do, especially for complex ARIA patterns, shadow DOM content, and framework-generated attributes - the CDP values win.
The matchCount is particularly important. If role:button:Submit matches 3 buttons on the page, the selector is ambiguous. PiperTest detects this and escalates to a hierarchical selector: role:form:Login > role:button:Submit, using the axPath ancestors to scope the selector to a unique context. This happens automatically during recording. The test author doesn't need to think about selector uniqueness.
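The escalation logic can be sketched with simplified AX nodes. Names here (AxNode, uniqueSelector) are illustrative, and the real implementation works over the full AX tree rather than a flat list:

```typescript
interface AxNode {
  role: string;
  name: string;
}

function axSelector(n: AxNode): string {
  return `role:${n.role}:${n.name}`;
}

function uniqueSelector(target: AxNode, axPath: AxNode[], page: AxNode[]): string {
  const base = axSelector(target);
  const matchCount = page.filter((n) => axSelector(n) === base).length;
  if (matchCount <= 1) return base; // already unique - keep it short

  // Ambiguous: scope with the nearest named ancestor from the axPath.
  const ancestor = [...axPath].reverse().find((a) => a.name !== "");
  return ancestor ? `${axSelector(ancestor)} > ${base}` : base;
}
```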
Stage 3: Step construction
The enriched data is assembled into an IPiperTestStep - PiperTest's structured test step format. Each step contains:
- action: type (click, fill, navigate, press, hover, scroll, wait), selector, value, URL
- description: a human-readable summary ("Click the Sign In button," "Fill the Email field")
- axPath: the full accessibility tree path from page root to the target element
- elementMeta: tag, role, name, description, bounding box coordinates
- selector strategies: the preferred AX selector plus fallbacks (label, testid, text, css)
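The fields above suggest a shape roughly like the following. This is a hedged reconstruction from the list, not PiperTest's actual type definition - field names beyond those the article lists are assumptions:

```typescript
type ActionType = "click" | "fill" | "navigate" | "press" | "hover" | "scroll" | "wait";

interface IPiperTestStep {
  action: { type: ActionType; selector?: string; value?: string; url?: string };
  description: string; // human-readable summary
  axPath: Array<{ role: string; name: string }>;
  elementMeta: {
    tag: string;
    role: string;
    name: string;
    description?: string;
    box: { x: number; y: number; width: number; height: number };
  };
  // Preferred AX selector plus fallbacks.
  selectors: { ax: string; label?: string; testid?: string; text?: string; css?: string };
}

const example: IPiperTestStep = {
  action: { type: "click", selector: "role:button:Sign In" },
  description: "Click the Sign In button",
  axPath: [
    { role: "RootWebArea", name: "Login" },
    { role: "form", name: "Login" },
  ],
  elementMeta: {
    tag: "button",
    role: "button",
    name: "Sign In",
    box: { x: 0, y: 0, width: 120, height: 40 },
  },
  selectors: { ax: "role:button:Sign In", css: ".auth-submit" },
};
```

Because the step is data rather than code, every field stays editable and re-enrichable after recording.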
The step is appended to the test session. The UI updates in real time, showing each recorded step as a row in the test tree with its description and selector.
What makes this different from Playwright codegen?
Playwright's codegen records your browser interactions and generates TypeScript (or Python, or Java) test code in real time. It's fast, it's integrated into Playwright's CLI, and it produces runnable tests immediately. For a single recording session, codegen is impressively smooth.
The problems show up downstream.
The output is code, not data. Codegen produces a .spec.ts file with await page.click() calls. This is a text file in a programming language. It can't be visually edited. It can't be reordered with drag-and-drop. It can't be enriched with metadata after recording. It's frozen in the form it was generated. PiperTest produces structured JSON steps that can be edited, reordered, and re-enriched at any time.
Selectors default to CSS. Codegen generates whatever selector it thinks is best. The priority is CSS selectors, then role selectors, then text selectors. As one analysis noted, "raw codegen output tends to include brittle selectors" and the default output "is unlikely to survive even modest changes in your app or CI pipeline." You need to configure a locator policy and often manually clean up the generated selectors. PiperTest's recorder defaults to AX selectors because it queries the real accessibility tree, not the DOM.
No assertions generated. Codegen captures actions - clicks, fills, navigations - but doesn't generate assertions. You get a script that replays your clicks but doesn't verify any outcomes. You add assertions manually after recording. PiperTest's recorder captures actions and lets you add assertions interactively during or after recording, with all 7 assertion types available through the UI.
No self-healing. When a codegen-produced selector breaks, the test fails and you fix it manually. When a PiperTest selector breaks, the runner's fuzzy AX matching finds the renamed element in 5-15ms. The difference is milliseconds of automatic repair vs. 5-15 minutes of manual debugging.
No enrichment metadata. Codegen produces a selector. PiperTest produces a selector plus the axPath, elementMeta, bounding box, match count, and ancestor chain. This metadata powers self-healing (the runner knows the element's structural context, not just its selector string) and makes the test human-readable (the description says "Click the Sign In button" instead of await page.locator('.auth-submit').click()).
What about Cypress Studio?
Cypress Studio (available in Cypress's interactive Test Runner) records interactions and generates Cypress commands. It's the most visually integrated recorder in the framework category - you interact with your app inside the Cypress browser panel, and commands appear in real time.
Cypress Studio has the same fundamental limitation as codegen: the output is code. It generates .click(), .type(), and .should() commands in a .cy.ts file. The selectors are CSS-based (Cypress doesn't have role-based selectors natively, though @testing-library/cypress adds them). The recording can't be visually edited after generation without editing code.
Cypress Studio's strength is its time-travel debugging. You can hover over any recorded command and see the exact DOM state at that moment. This is genuinely useful for understanding what happened during recording. PiperTest doesn't have equivalent playback visualization - its strength is in the enrichment pipeline, not the debugging experience.
Why does direct CDP matter?
PiperTest's recorder runs through a persistent WebSocket connection to Chrome DevTools Protocol. No browser driver binary. No WebDriver protocol translation. No Selenium server. No process spawning per action.
This matters for three reasons:
Speed. Each recorded action is captured, enriched, and stored in under 50ms. There's no binary-to-binary IPC overhead. The WebSocket is already open from the moment you connected to Chrome. Action capture is a JavaScript binding call. Enrichment is one or two CDP calls over the same WebSocket. Step construction is in-memory JSON assembly.
Reliability. Browser driver binaries (chromedriver, geckodriver) are external processes that can crash, hang, or fall out of sync with the browser version. CDP is Chrome's native debugging protocol - it's always compatible with the Chrome version you're connected to because it's part of Chrome itself. No version mismatch issues.
Access to the real AX tree. CDP's Accessibility domain provides direct access to Chrome's computed accessibility tree. WebDriver's accessibility APIs are limited and inconsistent across implementations. The real AX tree includes shadow DOM content, computed roles, and the same data that screen readers consume. This is what makes AX-enriched recording possible.
What about complex interaction types?
The recorder handles the common interaction vocabulary: clicks, fills, keyboard presses, hovers, scrolls, and navigations. For each type, the RecorderScript has specialized capture logic:
Clicks. Click events on inner elements (spans inside buttons, SVGs inside links) are resolved to the nearest interactive ancestor. Double-clicks and right-clicks are captured separately. A pending click timer (100ms) deduplicates cases where a click on a checkbox fires both a click and a change event.
Form fills. Input events are debounced with a timer. As you type, the recorder waits for a pause in typing (or a blur event) before capturing the final value. This produces one fill step with the complete text instead of N keystroke steps. The value captured is the input's current value property, not the individual keystrokes.
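The debounce behavior can be sketched without real timers by separating value capture from the flush that a typing pause or blur would trigger. Class and method names are illustrative:

```typescript
class FillDebouncer {
  private pending: { selector: string; value: string } | null = null;

  constructor(private emit: (step: { selector: string; value: string }) => void) {}

  onInput(selector: string, value: string): void {
    // Capture the input's current value property, replacing any pending
    // fill for the element instead of emitting per-keystroke steps.
    // (A real implementation would also reset a pause timer here.)
    this.pending = { selector, value };
  }

  flush(): void {
    // Called when the pause timer fires, or on blur.
    if (this.pending) {
      this.emit(this.pending);
      this.pending = null;
    }
  }
}
```

Typing "abc" produces three onInput calls but only one recorded fill step, carrying the final value.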
Select elements. When PiperTest replays a fill on a <select> element, the smart fill system auto-detects it and uses programmatic option selection instead of text input. Date inputs, time inputs, range sliders, and color pickers all get native value setters that bypass the browser's widget UI. This is handled at replay time, not recording time - the recorder captures the value, and the runner picks the right interaction strategy based on the element type.
Keyboard events. Special keys (Enter, Tab, Escape, arrow keys) are captured as press actions with the key name. Regular typing is captured as fill actions on the focused element. The recorder distinguishes between typing into an input (fill) and pressing a key for navigation or interaction (press).
Scrolls. Scroll events are debounced to prevent hundreds of scroll steps during a long scroll gesture. One scroll step is captured per scroll pause.
The recorder won't capture drag-and-drop, file uploads, or browser dialog interactions. These are platform-level interactions that can't be reliably observed from injected JavaScript. For these cases, you add steps manually in the test editor after recording the rest of the flow.
What happens when you stop recording?
When recording stops, three things happen:
- The RecorderScript is cleaned up. window.__piperRecorderCleanup() removes all event listeners that were added during recording. The page returns to normal with no residual JavaScript.
- The stored element references are cleared. window.__piperElements is deleted. These references were only needed during enrichment and aren't needed for replay.
- PiperProbe triggers. PiperTest's interaction coverage system scans the current page's AX tree, builds an interaction map of every interactive element, and computes initial coverage based on the recorded steps. This shows you immediately how much of the page your recording covers.
The test session is now a JSON object with structured steps, AX metadata, and coverage data. You can edit it, add assertions, reorder steps, and run it immediately.
How does recorded data power self-healing?
The enrichment metadata isn't just for display. It's the foundation of PiperTest's self-healing system.
When a selector breaks during replay, the heal loop uses the axPath and elementMeta to narrow the search. If role:button:Submit no longer matches, the runner knows from the axPath that the button was inside a form with role "form" and name "Checkout." It searches for buttons inside that form first. If the button was renamed to "Place Order," the Levenshtein distance against the original name is computed. If the structural context matches and the name is close enough, the healed selector is used.
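The name-similarity half of that check can be sketched with a standard Levenshtein distance. The threshold and ratio scoring here are illustrative assumptions, not PiperTest's tuned values:

```typescript
// Classic dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Accept a candidate whose name changed by at most half its characters.
function namesAreClose(original: string, candidate: string, maxRatio = 0.5): boolean {
  const dist = levenshtein(original.toLowerCase(), candidate.toLowerCase());
  return dist / Math.max(original.length, candidate.length) <= maxRatio;
}
```

The structural half - restricting candidates to the axPath ancestor's subtree first - is what keeps this from degenerating into a page-wide fuzzy match.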
Without the recording-time metadata, healing would be a blind search across the entire page. With it, healing is a scoped search in the right context. Recording quality directly determines healing quality. This is why PiperTest invests in deep enrichment at recording time rather than generating the cheapest possible selector.
Can AI generate recordings?
Yes. The test_save MCP tool accepts structured test steps in the same format the recorder produces. An AI agent can take a browser snapshot (via browser_snapshot), analyze the AX tree, generate PiperTest steps, and save them as a test session. The saved steps are indistinguishable from recorded steps - they go through the same enrichment pipeline and benefit from the same self-healing.
The difference: AI-generated steps have whatever metadata the AI includes. Recorded steps have the full enrichment pipeline's output. For self-healing purposes, both work. For human readability, recorded steps tend to have more complete descriptions because the recorder observes the actual interaction context.
A practical workflow combines both: record the main flow manually (you know the happy path best), then use AI to generate additional steps for error handling, edge cases, and alternative paths. The AI reads the existing steps for context and generates complementary coverage.
Try it
Download ToolPiper from the Mac App Store. Open Chrome, navigate to any web application, and click Record. Browse normally. When you're done, click Stop. Your test is ready to run, edit, or export.
Compare the output to Playwright codegen (npx playwright codegen) on the same flow. Look at the selectors. PiperTest gives you role:button:Sign In. Codegen gives you whatever CSS or XPath it computed from the DOM. Ask yourself which selector survives the next CSS refactor.
This is part of a series on AI-powered testing workflows. For self-healing selectors, see Self-Healing Test Selectors. For test export, see Export Tests to Playwright and Cypress. For temporal assertions, see Temporal Assertions.