Test recorders have existed for over a decade. Selenium IDE launched in 2006. Every major testing framework has one now. Playwright has codegen. Cypress has Studio. BugBug, Testim, and mabl all have point-and-click recorders. The idea isn't new.
What's new is what happens between the click and the test step. Most recorders capture a DOM event and generate a selector. Click a button, get .btn-primary-lg. Fill a field, get #email-input. The selector is a snapshot of the DOM at recording time, and it starts decaying the moment you save the test.
PiperTest's recorder does something different. It captures the DOM event, then asks Chrome's accessibility tree what the element actually is - its role, its accessible name, its position in the semantic page structure. The output isn't a CSS selector. It's a structured step with full AX metadata that tells the test runner not just where to click, but what it's clicking and why it's identifiable.
How does the recording pipeline work?
Recording is a three-stage pipeline. Each stage adds information that the next stage uses.
Stage 1: JavaScript event capture (RecorderScript)
When you start recording, PiperTest injects a JavaScript observer into the page via CDP's Runtime.evaluate. This is the RecorderScript - a self-contained script that listens for user interactions and emits structured action objects.
The script captures clicks, form fills, keyboard presses, scrolls, and navigation events. For each event, it does several things immediately in JavaScript:
Finds the interactive ancestor. When you click a <span> inside a <button>, the click event targets the span. The script walks up the DOM tree to find the nearest interactive element - the button, link, or input that the user actually intended to interact with. It checks against a set of interactive HTML tags (button, a, input, select, textarea) and interactive ARIA roles (button, link, tab, menuitem, combobox, textbox). This prevents the recorder from targeting inner decoration elements that have no stable identity.
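The ancestor walk can be sketched as follows. This is a minimal, DOM-free illustration under assumed names (ElemNode, findInteractiveAncestor) - not PiperTest's actual implementation:

```typescript
// Simplified element shape standing in for a real DOM node.
interface ElemNode {
  tag: string;
  role?: string;
  parent?: ElemNode;
}

const INTERACTIVE_TAGS = new Set(["button", "a", "input", "select", "textarea"]);
const INTERACTIVE_ROLES = new Set([
  "button", "link", "tab", "menuitem", "combobox", "textbox",
]);

function findInteractiveAncestor(el: ElemNode): ElemNode {
  // Walk up from the event target until a tag or ARIA role identifies
  // a real interactive control; fall back to the original target.
  let node: ElemNode | undefined = el;
  while (node) {
    if (INTERACTIVE_TAGS.has(node.tag) || (node.role !== undefined && INTERACTIVE_ROLES.has(node.role))) {
      return node;
    }
    node = node.parent;
  }
  return el;
}
```

A click on a span inside a button resolves to the button, which has a stable identity the span lacks.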
Computes an accessible name. The script checks aria-label, then aria-labelledby (resolving to the referenced element's text), then alt text for images, then placeholder and title for inputs, then text content for elements whose ARIA role derives naming from content (buttons, links, headings, tabs). The name computation follows the W3C Accessible Name and Description Computation spec as closely as possible in a lightweight client-side implementation.
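The priority chain reads naturally as a sequence of early returns. This sketch uses assumed names (NamedElem, computeAccessibleName) and models attributes as a plain record rather than a live DOM:

```typescript
interface NamedElem {
  tag: string;
  role?: string;
  attrs: Record<string, string>;
  text?: string;
  labelledByText?: string; // text of the element referenced by aria-labelledby
}

// Roles whose accessible name derives from their content.
const NAMES_FROM_CONTENT = new Set(["button", "link", "heading", "tab"]);

function computeAccessibleName(el: NamedElem): string {
  if (el.attrs["aria-label"]) return el.attrs["aria-label"];
  if (el.labelledByText) return el.labelledByText;
  if (el.tag === "img" && el.attrs["alt"]) return el.attrs["alt"];
  if (el.attrs["placeholder"]) return el.attrs["placeholder"];
  if (el.attrs["title"]) return el.attrs["title"];
  if (el.role && NAMES_FROM_CONTENT.has(el.role) && el.text) return el.text.trim();
  return "";
}
```

The full W3C spec handles many more cases (hidden subtrees, embedded controls, recursive labelledby resolution); this shows only the precedence order described above.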
Extracts multiple selector strategies. For each element, the script collects: the inferred ARIA role, the computed accessible name, the associated label text (from <label for="..."> or parent <label>), the data-testid attribute, the visible text content, and a minimal CSS selector (ID, class-based, or nth-of-type). These aren't all used - they're collected so the enrichment stage can choose the best one.
Stores the element reference. The script assigns each action an incrementing ID and stores the DOM element reference in window.__piperElements[id]. This reference is used in Stage 2 to query Chrome's accessibility tree for the exact element that was interacted with.
The captured action is serialized to JSON and sent to ToolPiper via CDP's Runtime.addBinding mechanism. This is a synchronous, reliable channel from page JavaScript to the CDP client - no HTTP calls, no WebSocket overhead, no message loss.
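The wiring can be sketched with raw CDP messages. The binding name __piperRecord and the helper names here are illustrative assumptions; Runtime.addBinding and Runtime.bindingCalled are real CDP methods:

```typescript
let nextId = 0;

// Build a CDP command frame for the already-open WebSocket.
function cdpCommand(method: string, params: object = {}): string {
  return JSON.stringify({ id: ++nextId, method, params });
}

// 1. Register the binding before injecting the RecorderScript.
//    This exposes window.__piperRecord(payload) inside the page.
const addBinding = cdpCommand("Runtime.addBinding", { name: "__piperRecord" });

// 2. In-page, the RecorderScript serializes each action and calls:
//    window.__piperRecord(JSON.stringify(action));

// 3. The client receives it as a Runtime.bindingCalled event on the
//    same socket - no HTTP round-trip.
function handleCdpEvent(raw: string): object | null {
  const msg = JSON.parse(raw);
  if (msg.method === "Runtime.bindingCalled" && msg.params.name === "__piperRecord") {
    return JSON.parse(msg.params.payload); // the structured action object
  }
  return null;
}
```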
Stage 2: AX tree enrichment (CDPBrowserService)
When ToolPiper receives a recorded action, it doesn't trust the JavaScript-computed selectors. Instead, it queries Chrome's real accessibility tree for the element that was clicked.
The enrichment process uses enrichForSelector, a single-pass function that:
- Resolves the stored DOM element reference to a CDP backendNodeId
- Calls Accessibility.queryAXTree with that node ID to get the element's AX node
- Extracts the cdpRole (Chrome's computed ARIA role) and cdpName (Chrome's computed accessible name)
- Checks how many other elements on the page match the same role and name combination (matchCount)
- Walks the AX tree ancestry to build the axPath - the full chain from the document root to the target element
- Collects elementMeta: the element's tag, role, name, description, and bounding box
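The core of the enrichment can be sketched against the CDP API. DOM.describeNode and Accessibility.queryAXTree are real CDP methods; the send abstraction and this simplified enrichForSelector body are assumptions inferred from the description above:

```typescript
// Generic sender over the already-open CDP WebSocket.
type Send = (method: string, params: object) => Promise<any>;

async function enrichForSelector(send: Send, objectId: string) {
  // Resolve the stored element reference to a backendNodeId.
  const { node } = await send("DOM.describeNode", { objectId });

  // Ask Chrome's real accessibility tree about this exact node.
  const { nodes } = await send("Accessibility.queryAXTree", {
    backendNodeId: node.backendNodeId,
  });
  const axNode = nodes[0];

  return {
    cdpRole: axNode?.role?.value, // Chrome's computed ARIA role
    cdpName: axNode?.name?.value, // Chrome's computed accessible name
  };
}
```

The real function also computes matchCount, the axPath ancestry, and elementMeta in the same pass.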
This is the critical step. The JavaScript-side name computation is a best-effort approximation. Chrome's AX tree is the authoritative source. When they disagree - and they sometimes do, especially for complex ARIA patterns, shadow DOM content, and framework-generated attributes - the CDP values win.
The matchCount is particularly important. If role:button:Submit matches 3 buttons on the page, the selector is ambiguous. PiperTest detects this and escalates to a hierarchical selector: role:form:Login > role:button:Submit, using the axPath ancestors to scope the selector to a unique context. This happens automatically during recording. The test author doesn't need to think about selector uniqueness.
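The escalation logic can be sketched with simplified AX nodes. Names here (AxNode, uniqueSelector) are illustrative, and the real implementation works over the full AX tree rather than a flat list:

```typescript
interface AxNode {
  role: string;
  name: string;
}

function axSelector(n: AxNode): string {
  return `role:${n.role}:${n.name}`;
}

function uniqueSelector(target: AxNode, axPath: AxNode[], page: AxNode[]): string {
  const base = axSelector(target);
  const matchCount = page.filter((n) => axSelector(n) === base).length;
  if (matchCount <= 1) return base; // already unique - keep it short

  // Ambiguous: scope with the nearest named ancestor from the axPath.
  const ancestor = [...axPath].reverse().find((a) => a.name !== "");
  return ancestor ? `${axSelector(ancestor)} > ${base}` : base;
}
```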
Stage 3: Step construction
The enriched data is assembled into an IPiperTestStep - PiperTest's structured test step format. Each step contains:
- action: type (click, fill, navigate, press, hover, scroll, wait), selector, value, URL
- description: a human-readable summary ("Click the Sign In button," "Fill the Email field")
- axPath: the full accessibility tree path from page root to the target element
- elementMeta: tag, role, name, description, bounding box coordinates
- selector strategies: the preferred AX selector plus fallbacks (label, testid, text, css)
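The fields above suggest a shape roughly like the following. This is a hedged reconstruction from the list, not PiperTest's actual type definition - field names beyond those the article lists are assumptions:

```typescript
type ActionType = "click" | "fill" | "navigate" | "press" | "hover" | "scroll" | "wait";

interface IPiperTestStep {
  action: { type: ActionType; selector?: string; value?: string; url?: string };
  description: string; // human-readable summary
  axPath: Array<{ role: string; name: string }>;
  elementMeta: {
    tag: string;
    role: string;
    name: string;
    description?: string;
    box: { x: number; y: number; width: number; height: number };
  };
  // Preferred AX selector plus fallbacks.
  selectors: { ax: string; label?: string; testid?: string; text?: string; css?: string };
}

const example: IPiperTestStep = {
  action: { type: "click", selector: "role:button:Sign In" },
  description: "Click the Sign In button",
  axPath: [
    { role: "RootWebArea", name: "Login" },
    { role: "form", name: "Login" },
  ],
  elementMeta: {
    tag: "button",
    role: "button",
    name: "Sign In",
    box: { x: 0, y: 0, width: 120, height: 40 },
  },
  selectors: { ax: "role:button:Sign In", css: ".auth-submit" },
};
```

Because the step is data rather than code, every field stays editable and re-enrichable after recording.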
The step is appended to the test session. The UI updates in real time, showing each recorded step as a row in the test tree with its description and selector.
What makes this different from Playwright codegen?
Playwright's codegen records your browser interactions and generates TypeScript (or Python, or Java) test code in real time. It's fast, it's integrated into Playwright's CLI, and it produces runnable tests immediately. For a single recording session, codegen is impressively smooth.
The problems show up downstream.
The output is code, not data. Codegen produces a .spec.ts file with await page.click() calls. This is a text file in a programming language. It can't be visually edited. It can't be reordered with drag-and-drop. It can't be enriched with metadata after recording. It's frozen in the form it was generated. PiperTest produces structured JSON steps that can be edited, reordered, and re-enriched at any time.
Selectors default to CSS. Codegen generates whatever selector it thinks is best. The priority is CSS selectors, then role selectors, then text selectors. As one analysis noted, "raw codegen output tends to include brittle selectors" and the default output "is unlikely to survive even modest changes in your app or CI pipeline." You need to configure a locator policy and often manually clean up the generated selectors. PiperTest's recorder defaults to AX selectors because it queries the real accessibility tree, not the DOM.
No assertions generated. Codegen captures actions - clicks, fills, navigations - but doesn't generate assertions. You get a script that replays your clicks but doesn't verify any outcomes. You add assertions manually after recording. PiperTest's recorder captures actions and lets you add assertions interactively during or after recording, with all 7 assertion types available through the UI.
No self-healing. When a codegen-produced selector breaks, the test fails and you fix it manually. When a PiperTest selector breaks, the runner's fuzzy AX matching finds the renamed element in 5-15ms. The difference is milliseconds of automatic repair vs. 5-15 minutes of manual debugging.
No enrichment metadata. Codegen produces a selector. PiperTest produces a selector plus the axPath, elementMeta, bounding box, match count, and ancestor chain. This metadata powers self-healing (the runner knows the element's structural context, not just its selector string) and makes the test human-readable (the description says "Click the Sign In button" instead of await page.locator('.auth-submit').click()).
What about Cypress Studio?
Cypress Studio (available in Cypress's interactive Test Runner) records interactions and generates Cypress commands. It's the most visually integrated recorder in the framework category - you interact with your app inside the Cypress browser panel, and commands appear in real time.
Cypress Studio has the same fundamental limitation as codegen: the output is code. It generates .click(), .type(), and .should() commands in a .cy.ts file. The selectors are CSS-based (Cypress doesn't have role-based selectors natively, though @testing-library/cypress adds them). The recording can't be visually edited after generation without editing code.
Cypress Studio's strength is its time-travel debugging. You can hover over any recorded command and see the exact DOM state at that moment. This is genuinely useful for understanding what happened during recording. PiperTest doesn't have equivalent playback visualization - its strength is in the enrichment pipeline, not the debugging experience.
Why does direct CDP matter?
PiperTest's recorder runs through a persistent WebSocket connection to Chrome DevTools Protocol. No browser driver binary. No WebDriver protocol translation. No Selenium server. No process spawning per action.
This matters for three reasons:
Speed. Each recorded action is captured, enriched, and stored in under 50ms. There's no binary-to-binary IPC overhead. The WebSocket is already open from the moment you connected to Chrome. Action capture is a JavaScript binding call. Enrichment is one or two CDP calls over the same WebSocket. Step construction is in-memory JSON assembly.
Reliability. Browser driver binaries (chromedriver, geckodriver) are external processes that can crash, hang, or fall out of sync with the browser version. CDP is Chrome's native debugging protocol - it's always compatible with the Chrome version you're connected to because it's part of Chrome itself. No version mismatch issues.
Access to the real AX tree. CDP's Accessibility domain provides direct access to Chrome's computed accessibility tree. WebDriver's accessibility APIs are limited and inconsistent across implementations. The real AX tree includes shadow DOM content, computed roles, and the same data that screen readers consume. This is what makes AX-enriched recording possible.
What about complex interaction types?
The recorder handles the common interaction vocabulary: clicks, fills, keyboard presses, hovers, scrolls, and navigations. For each type, the RecorderScript has specialized capture logic:
Clicks. Click events on inner elements (spans inside buttons, SVGs inside links) are resolved to the nearest interactive ancestor. Double-clicks and right-clicks are captured separately. A pending click timer (100ms) deduplicates cases where a click on a checkbox fires both a click and a change event.
Form fills. Input events are debounced with a timer. As you type, the recorder waits for a pause in typing (or a blur event) before capturing the final value. This produces one fill step with the complete text instead of N keystroke steps. The value captured is the input's current value property, not the individual keystrokes.
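The debounce behavior can be sketched without real timers by separating value capture from the flush that a typing pause or blur would trigger. Class and method names are illustrative:

```typescript
class FillDebouncer {
  private pending: { selector: string; value: string } | null = null;

  constructor(private emit: (step: { selector: string; value: string }) => void) {}

  onInput(selector: string, value: string): void {
    // Capture the input's current value property, replacing any pending
    // fill for the element instead of emitting per-keystroke steps.
    // (A real implementation would also reset a pause timer here.)
    this.pending = { selector, value };
  }

  flush(): void {
    // Called when the pause timer fires, or on blur.
    if (this.pending) {
      this.emit(this.pending);
      this.pending = null;
    }
  }
}
```

Typing "abc" produces three onInput calls but only one recorded fill step, carrying the final value.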
Select elements. When PiperTest replays a fill on a <select> element, the smart fill system auto-detects it and uses programmatic option selection instead of text input. Date inputs, time inputs, range sliders, and color pickers all get native value setters that bypass the browser's widget UI. This is handled at replay time, not recording time - the recorder captures the value, and the runner picks the right interaction strategy based on the element type.
Keyboard events. Special keys (Enter, Tab, Escape, arrow keys) are captured as press actions with the key name. Regular typing is captured as fill actions on the focused element. The recorder distinguishes between typing into an input (fill) and pressing a key for navigation or interaction (press).
Scrolls. Scroll events are debounced to prevent hundreds of scroll steps during a long scroll gesture. One scroll step is captured per scroll pause.
The recorder won't capture drag-and-drop, file uploads, or browser dialog interactions. These are platform-level interactions that can't be reliably observed from injected JavaScript. For these cases, you add steps manually in the test editor after recording the rest of the flow.
What happens when you stop recording?
When recording stops, three things happen:
- The RecorderScript is cleaned up. window.__piperRecorderCleanup() removes all event listeners that were added during recording. The page returns to normal with no residual JavaScript.
- The stored element references are cleared. window.__piperElements is deleted. These references were only needed during enrichment and aren't needed for replay.
- PiperProbe triggers. PiperTest's interaction coverage system scans the current page's AX tree, builds an interaction map of every interactive element, and computes initial coverage based on the recorded steps. This shows you immediately how much of the page your recording covers.
The test session is now a JSON object with structured steps, AX metadata, and coverage data. You can edit it, add assertions, reorder steps, and run it immediately.
How does recorded data power self-healing?
The enrichment metadata isn't just for display. It's the foundation of PiperTest's self-healing system.
When a selector breaks during replay, the heal loop uses the axPath and elementMeta to narrow the search. If role:button:Submit no longer matches, the runner knows from the axPath that the button was inside a form with role "form" and name "Checkout." It searches for buttons inside that form first. If the button was renamed to "Place Order," the Levenshtein distance against the original name is computed. If the structural context matches and the name is close enough, the healed selector is used.
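The name-similarity half of that check can be sketched with a standard Levenshtein distance. The threshold and ratio scoring here are illustrative assumptions, not PiperTest's tuned values:

```typescript
// Classic dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Accept a candidate whose name changed by at most half its characters.
function namesAreClose(original: string, candidate: string, maxRatio = 0.5): boolean {
  const dist = levenshtein(original.toLowerCase(), candidate.toLowerCase());
  return dist / Math.max(original.length, candidate.length) <= maxRatio;
}
```

The structural half - restricting candidates to the axPath ancestor's subtree first - is what keeps this from degenerating into a page-wide fuzzy match.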
Without the recording-time metadata, healing would be a blind search across the entire page. With it, healing is a scoped search in the right context. Recording quality directly determines healing quality. This is why PiperTest invests in deep enrichment at recording time rather than generating the cheapest possible selector.
Can AI generate recordings?
Yes. The test_save MCP tool accepts structured test steps in the same format the recorder produces. An AI agent can take a browser snapshot (via browser_snapshot), analyze the AX tree, generate PiperTest steps, and save them as a test session. The saved steps are indistinguishable from recorded steps - they go through the same enrichment pipeline and benefit from the same self-healing.
The difference: AI-generated steps have whatever metadata the AI includes. Recorded steps have the full enrichment pipeline's output. For self-healing purposes, both work. For human readability, recorded steps tend to have more complete descriptions because the recorder observes the actual interaction context.
A practical workflow combines both: record the main flow manually (you know the happy path best), then use AI to generate additional steps for error handling, edge cases, and alternative paths. The AI reads the existing steps for context and generates complementary coverage.
Try it
Download ToolPiper from the Mac App Store. Open Chrome, navigate to any web application, and click Record. Browse normally. When you're done, click Stop. Your test is ready to run, edit, or export.
Compare the output to Playwright codegen (npx playwright codegen) on the same flow. Look at the selectors. PiperTest gives you role:button:Sign In. Codegen gives you whatever CSS or XPath it computed from the DOM. Ask yourself which selector survives the next CSS refactor.
This is part of a series on AI-powered testing workflows. For self-healing selectors, see Self-Healing Test Selectors. For test export, see Export Tests to Playwright and Cypress. For temporal assertions, see Temporal Assertions.