Every testing framework has a way to wait for things. Playwright has waitFor() and auto-retrying assertions. Cypress has built-in retry-ability with configurable timeouts. Selenium has explicit and implicit waits. The pattern is always the same: pause until a condition is true, then move on.

This works for one question: "Is this thing true right now?" It doesn't work for three questions that matter just as much:

  • "Does this remain true for the entire test?" (An error banner shouldn't appear. A loading spinner shouldn't flash. A modal shouldn't open unexpectedly.)
  • "Does this become true within a specific window?" (The search results should appear within 2 seconds. The notification should dismiss within 5 seconds.)
  • "Is this true at the very next state change?" (After clicking Submit, the form should be hidden immediately. Not eventually. Now.)

These are temporal properties. They describe behavior over time, not at a single point. Every other testing tool collapses them into point-in-time checks with arbitrary timeouts. PiperTest treats them as first-class assertions with their own execution model.

What's wrong with waitFor and retry-ability?

Playwright's expect(locator).toBeVisible() retries until the element is visible or the timeout expires. This is good engineering. It handles the common case where the UI hasn't finished rendering when the assertion runs. But it only answers one question: "Is this element visible at some point within the timeout (5 seconds by default)?"

It doesn't tell you whether the element was briefly visible, then hidden, then visible again. It doesn't tell you whether the element appeared in 200ms or 4,900ms. It doesn't tell you whether the element stayed visible for the rest of the test. It's a gate: pass or fail, then move on.

Cypress's retry-ability is similar. .should('be.visible') retries until the assertion passes or the default timeout (4 seconds) expires. It's clever in how it chains - Cypress re-queries the DOM from the root of the command chain on each retry, so a flaky DOM doesn't fool it. But it's still a point-in-time gate. Pass, then forget.

The problem shows up in real testing scenarios:

A loading spinner that flashes. You want to assert that the loading spinner doesn't appear during a fast navigation. With waitFor(), you can check that it's hidden at one point. But it might flash for 100ms between your action and your assertion. Your test passes. Your user saw a janky flash. The bug ships.
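
To make the gap concrete, here is a small TypeScript simulation. It isn't PiperTest or Playwright code: the Snapshot type and the timeline values are invented for illustration, and no browser is involved. It compares a single check at assertion time with a check over every observed state.

```typescript
// Hypothetical timeline of spinner visibility during a fast navigation.
// All values are invented for illustration.
type Snapshot = { atMs: number; spinnerVisible: boolean };

const timeline: Snapshot[] = [
  { atMs: 0, spinnerVisible: false },   // the click happens here
  { atMs: 40, spinnerVisible: true },   // the spinner flashes
  { atMs: 140, spinnerVisible: false }, // the spinner is gone again
  { atMs: 300, spinnerVisible: false }, // the assertion runs here
];

// Point-in-time gate: only the state at assertion time matters.
function gateHidden(samples: Snapshot[]): boolean {
  return !samples[samples.length - 1].spinnerVisible;
}

// Invariant: the spinner must be hidden in every observed state.
// Returns the first violating snapshot, or null if the invariant held.
function firstViolation(samples: Snapshot[]): Snapshot | null {
  return samples.find((s) => s.spinnerVisible) ?? null;
}

console.log(gateHidden(timeline));     // true: the gate passes
console.log(firstViolation(timeline)); // the 40ms snapshot: the invariant fails
```

The invariant check only sees the states it samples, which is why the evaluation schedule matters.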

An error banner that should never appear. You want to verify that no error messages show up during a multi-step checkout flow. With traditional assertions, you'd need to add an explicit check after every single step. If you miss one step, the error banner appears and disappears before your next assertion, and you never know.

A notification that should auto-dismiss. You want to verify that a success toast appears and then disappears within 5 seconds. With waitFor(), you can check that it appeared. Then you add a waitFor({ state: 'hidden' }) with a timeout. But if the timeout is too short, the test is flaky. If it's too long, the test is slow. And you still don't know whether the notification stayed for exactly the right duration.

These aren't edge cases. They're the gaps where bugs hide in production because the testing model can't express the property you want to verify.

What are temporal assertions?

Temporal assertions are a verification model borrowed from formal methods and adapted for browser testing. In formal verification, temporal logic (specifically Linear Temporal Logic, or LTL) describes properties that hold over sequences of states: "this is always true," "this is eventually true," "this is true in the next state."

PiperTest implements three temporal modes that map to the most useful LTL operators for UI testing:

Always: the invariant

"This condition must hold at every subsequent step for the rest of the test."

When you create an always assertion, you're declaring an invariant. The TemporalRunner evaluates the condition before every subsequent step in the test. If it fails at any point, the residual is immediately marked as failed, and the violation is recorded with the step ID where it broke.

Use cases:

  • "The error banner must remain hidden throughout the checkout flow"
  • "The user's name must remain visible in the header during all navigation"
  • "The cart total must never show $0.00 while items are in the cart"

An always assertion that survives the entire test is marked as passed at the end of the run. It's the testing equivalent of "prove this never goes wrong."

Eventually: the liveness property

"This condition must become true within a specified deadline."

An eventually assertion is created with a withinMs deadline. The TemporalRunner evaluates the condition after each step and at 100ms polling intervals. If the condition becomes true at any point before the deadline, the residual resolves as passed. If the deadline expires without the condition becoming true, it fails.

Use cases:

  • "The search results must appear within 2,000ms"
  • "The loading indicator must disappear within 5,000ms"
  • "The WebSocket connection status must show 'connected' within 3,000ms"

Unlike a waitFor() with a timeout, an eventually assertion doesn't block the test. Other steps continue executing while the residual is alive. The test keeps running, and the temporal assertion resolves in the background based on the state it observes at each evaluation point.

Next: the immediate check

"This condition must hold at the very next evaluation."

A next assertion is the simplest temporal mode. It's evaluated once - at the next step - and immediately resolved as passed or failed. After that single evaluation, it's done.

Use cases:

  • "After clicking Submit, the form must be hidden on the next step"
  • "After toggling the switch, the settings panel must be visible immediately"
  • "After deleting an item, the count must decrease by exactly one"

next is a stronger statement than a regular assertion. A regular assertion checks the condition at the current step. A next assertion explicitly checks at the following step, verifying that the state transition happened correctly.

How does the residual evaluation model work?

The term "residual" comes from runtime verification research. A temporal assertion that hasn't resolved yet is a residual obligation - something the system still needs to prove or disprove.

PiperTest's TemporalRunner maintains a list of active residuals. Here's the lifecycle:

Registration. When a test step has a temporal assertion, the runner calls register() with the step ID, assertion definition, and temporal config (mode + optional deadline). The TemporalRunner creates a Residual struct with the creation timestamp (from ContinuousClock), the mode, and the deadline in milliseconds.

Validation. Invalid configurations fail immediately. An eventually without withinMs is resolved as failed at registration time with the error "eventually mode requires withinMs." Unknown modes fail with a descriptive error. This prevents silent misconfiguration from producing confusing results later.
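
Registration and validation can be sketched together. This is a minimal sketch under stated assumptions, not PiperTest's implementation: the TemporalConfig and Residual shapes are guesses, the clock value is passed in as a plain number, and the unknown-mode message is invented. Only the "eventually mode requires withinMs" text comes from the description above.

```typescript
// Hypothetical shapes. The real PiperTest types are not shown in the article.
type TemporalConfig = { mode: string; withinMs?: number };

type Residual =
  | { status: "active"; stepId: string; mode: "always" | "eventually" | "next";
      createdAtMs: number; withinMs?: number }
  | { status: "failed"; stepId: string; error: string };

// nowMs is injected so the sketch stays deterministic; a monotonic clock
// would supply it in practice.
function register(stepId: string, config: TemporalConfig, nowMs: number): Residual {
  switch (config.mode) {
    case "always":
      return { status: "active", stepId, mode: "always", createdAtMs: nowMs };
    case "next":
      return { status: "active", stepId, mode: "next", createdAtMs: nowMs };
    case "eventually":
      if (config.withinMs === undefined) {
        // Error text as quoted in the article.
        return { status: "failed", stepId, error: "eventually mode requires withinMs" };
      }
      return {
        status: "active", stepId, mode: "eventually",
        createdAtMs: nowMs, withinMs: config.withinMs,
      };
    default:
      // The article only promises "a descriptive error"; this wording is invented.
      return { status: "failed", stepId, error: `unknown temporal mode "${config.mode}"` };
  }
}

// A misconfigured eventually fails at registration, before any step runs.
console.log(register("step-6", { mode: "eventually" }, 0));
```

Returning a failed residual instead of throwing keeps the misconfiguration in the test results alongside every other outcome.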

Evaluation. Before each step's own action, the TemporalRunner evaluates all active residuals. For each one:

  • always: Check the assertion. If it fails, mark the residual as failed with the current step ID. If it passes, leave it active.
  • eventually: Check the assertion. If it passes, mark it as resolved-passed with the resolution time. If it fails, check the deadline. If the deadline expired, mark it as failed. If not, leave it active.
  • next: Check the assertion. Mark it as passed or failed. Remove it from active residuals regardless of outcome.
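
The three rules above fit in one function. The sketch below is an illustration, not the TemporalRunner itself: the Residual fields are assumptions, holds stands in for the result of the real assertion check, and treating the deadline as expired when the elapsed time is greater than or equal to withinMs is a choice made for the example.

```typescript
type Mode = "always" | "eventually" | "next";
type Status = "active" | "passed" | "failed";

// Hypothetical residual record; field names are assumptions.
interface Residual {
  mode: Mode;
  createdAtMs: number;
  withinMs?: number;       // only meaningful for "eventually"
  status: Status;
  resolvedAtStep?: string; // step ID where the residual passed or failed
}

// One evaluation pass, run before a step's own action.
// holds is the assertion result; nowMs is an injected clock reading.
function evaluate(r: Residual, holds: boolean, stepId: string, nowMs: number): void {
  if (r.status !== "active") return; // resolved residuals are never re-evaluated
  const resolve = (status: Status): void => {
    r.status = status;
    r.resolvedAtStep = stepId;
  };

  switch (r.mode) {
    case "always":
      if (!holds) resolve("failed"); // stays active while the invariant holds
      break;
    case "eventually":
      if (holds) resolve("passed");
      else if (r.withinMs !== undefined && nowMs - r.createdAtMs >= r.withinMs) resolve("failed");
      break;
    case "next":
      resolve(holds ? "passed" : "failed"); // single evaluation, then done
      break;
  }
}

const banner: Residual = { mode: "always", createdAtMs: 0, status: "active" };
const heading: Residual = { mode: "eventually", createdAtMs: 0, withinMs: 3000, status: "active" };
evaluate(banner, true, "step-3", 500);
evaluate(heading, false, "step-3", 500);
evaluate(heading, true, "step-4", 1200);
console.log(banner.status, heading.status, heading.resolvedAtStep); // active passed step-4
```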

Single-poll checks. Each residual evaluation uses a 100ms timeout for the actual CDP assertion check. This bounds the per-residual cost. With the 50-residual cap, the maximum per-step overhead is 5 seconds (50 residuals times 100ms). In practice, most evaluations complete in 5-20ms because the assertion either clearly passes or clearly fails.
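
One way to bound a single check is to race it against a timer. The sketch below is an assumption about how such a bound could be built, not PiperTest's code. checkWithTimeout is a made-up name, and treating a timed-out or throwing check as "did not hold" is a choice made for the example.

```typescript
// Resolves to the check's result, or to false if the check takes longer
// than timeoutMs or rejects. The 100ms default mirrors the figure above.
function checkWithTimeout(
  check: () => Promise<boolean>,
  timeoutMs: number = 100,
): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    // If the timer fires first, the later resolve calls are no-ops.
    const timer = setTimeout(() => resolve(false), timeoutMs);
    check().then(
      (ok) => { clearTimeout(timer); resolve(ok); },
      () => { clearTimeout(timer); resolve(false); },
    );
  });
}
```

Because each call settles within timeoutMs, the cost of evaluating N residuals one after another is bounded by N times timeoutMs.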

End-of-run finalization. After the last step executes, remaining active residuals are finalized. always residuals that never failed are marked as passed - they held the invariant. eventually residuals that never passed are marked as failed - the condition was never observed, so the obligation went unmet. This ensures every temporal assertion has a definitive outcome.
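
Finalization is a single pass over whatever is still active. This is a minimal sketch with assumed types, not the actual implementation. The article doesn't say what happens to a next residual registered on the final step, so failing it here is an assumption.

```typescript
type Mode = "always" | "eventually" | "next";
type Status = "active" | "passed" | "failed";

// Hypothetical residual record; field names are assumptions.
interface Residual {
  stepId: string;
  mode: Mode;
  status: Status;
}

// Resolve every residual that is still active once the last step has run.
function finalize(residuals: Residual[]): void {
  for (const r of residuals) {
    if (r.status !== "active") continue; // already resolved during the run
    // always: never violated, so it held.
    // eventually: never observed true, so it fails.
    // next: never got its one evaluation; failing it is an assumption.
    r.status = r.mode === "always" ? "passed" : "failed";
  }
}
```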

Why a 50-residual cap?

The cap exists to prevent pathological test designs from making execution impractically slow. If a test registered 500 always residuals, each step would need to evaluate all 500. At up to 100ms per check, that is a worst case of 50 seconds of overhead before every step.

Fifty residuals, each bounded by a 100ms check timeout, means at most 5 seconds of per-step overhead. In practice, a well-designed test uses 3-10 temporal assertions. The cap is a safety valve, not a target.

When the cap is reached, additional temporal assertion registrations fail immediately with the error "Temporal residual cap reached (50)." The step is marked as failed, making the cap violation visible in the test results rather than silently dropping assertions.
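
The arithmetic and the guard are both small. In this sketch the two constants come from the figures above, while tryRegister and its string-array bookkeeping are hypothetical stand-ins for the real registration path.

```typescript
const MAX_RESIDUALS = 50;     // cap stated in the article
const CHECK_TIMEOUT_MS = 100; // per-residual check timeout stated in the article

// Worst-case time spent on residual checks before a single step.
function worstCaseOverheadMs(activeResiduals: number): number {
  return activeResiduals * CHECK_TIMEOUT_MS;
}

// Hypothetical registration guard. Returns an error message when the cap is
// hit, or null when the residual was accepted.
function tryRegister(active: string[], stepId: string): string | null {
  if (active.length >= MAX_RESIDUALS) {
    return `Temporal residual cap reached (${MAX_RESIDUALS})`;
  }
  active.push(stepId);
  return null;
}

console.log(worstCaseOverheadMs(MAX_RESIDUALS)); // 5000
console.log(worstCaseOverheadMs(500));           // 50000
```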

How is this different from everything else?

The short answer: nobody else does this. The longer answer requires looking at what each tool offers and where the gaps are.

Playwright's approach: expect(locator).toBeVisible({ timeout: 5000 }) retries for up to 5 seconds. This is a point-in-time check with a retry window. It doesn't evaluate across steps. It doesn't track whether the condition held continuously. It blocks the test until it resolves. Playwright also has expect.poll() which polls a custom function at intervals, but it's still a single-condition gate, not a multi-step invariant.

Cypress's approach: .should('be.visible') retries using Cypress's built-in retry-ability, re-querying the DOM from the command chain root on each retry. The default timeout is 4 seconds. Like Playwright, this is a gate: the assertion passes when the condition is true, then execution moves on. Cypress doesn't have a concept of persistent conditions across steps.

TestCafe's approach: TestCafe has a Smart Assertion Query Mechanism that re-evaluates an assertion until it passes or a timeout expires. Functionally equivalent to Playwright's retry. No persistent residuals.

WebdriverIO's approach: waitUntil() polls a condition at intervals. Single-condition, single-resolution. No multi-step evaluation.

The common thread: every framework treats assertions as gates. Check the condition, wait if needed, pass or fail, move on. None of them can express "this must stay true" or "this must happen within a window while other things are also happening."

PiperTest's temporal assertions exist because teams asked for them. During testing of complex applications with dynamic UI - dashboards with live data, multi-step wizards with progressive disclosure, real-time collaborative editors - the question kept coming up: "How do I assert that the sidebar stays visible while I'm interacting with the main content?" The answer was always "add an assertion after every step." Temporal assertions replace that manual repetition with a single declaration.

What does a real test with temporal assertions look like?

Here's a multi-step form wizard with three temporal properties:

navigate    https://app.example.com/onboarding
assert      always   role:banner:visible        // header stays visible
fill        label:First Name     Jane
fill        label:Last Name      Doe
click       role:button:Next
assert      eventually(3000)  role:heading = "Step 2"  // next page loads within 3s
assert      next    role:form:Step 1 = hidden           // old form hides immediately
fill        label:Company        Acme Corp
click       role:button:Next
assert      text    role:heading = "Step 3"

Three temporal assertions in a 10-step test:

  1. The always assertion on the banner is registered at step 2 and evaluated before steps 3-10. If the banner disappears at any point during the flow, the test fails with the exact step where it happened.
  2. The eventually(3000) assertion on the heading is registered at step 6 (right after the click on Next at step 5) and resolved when the "Step 2" heading appears, or failed if 3 seconds pass without it appearing. Other steps continue executing while this residual is alive.
  3. The next assertion on the old form is registered at step 7 and evaluated at step 8 (the fill after clicking Next). If the Step 1 form is still visible at that point, the assertion fails.

Without temporal assertions, the same test would need explicit assertions peppered throughout:

navigate    https://app.example.com/onboarding
assert      visible  role:banner                       // check 1
fill        label:First Name     Jane
assert      visible  role:banner                       // check 2
fill        label:Last Name      Doe
assert      visible  role:banner                       // check 3
click       role:button:Next
// now wait for step 2...
assert      visible  role:heading = "Step 2"            // with timeout?
assert      hidden   role:form:Step 1                  // and this too?
assert      visible  role:banner                       // check 4
fill        label:Company        Acme Corp
assert      visible  role:banner                       // check 5
click       role:button:Next
assert      visible  role:banner                       // check 6
assert      text     role:heading = "Step 3"

Six manual banner checks instead of one always declaration. An ambiguous timeout on the Step 2 heading instead of a clear eventually(3000). And a separate assertion for the form hiding instead of a precise next check. The temporal version is shorter, clearer, and more precise about what it's verifying.

How do temporal assertions interact with self-healing?

Temporal residuals use the same assertion engine as regular assertions. CDPBrowserService.checkAssertion() handles the actual AX tree query. This means temporal assertions benefit from the same AX-native selector stability that regular assertions get.

If a temporal assertion's selector needs healing (the element was renamed), the same fuzzy AX matching applies. The residual keeps evaluating against the healed selector. Self-healing and temporal evaluation are orthogonal - they compose naturally because they operate on different axes (selector resolution vs. time).

What about export?

Temporal assertions export to Playwright and Cypress as one-time checks with // TEMPORAL: comments:

// TEMPORAL: always(visible) - evaluated across steps in PiperTest; one-time check below
await expect(page.getByRole('banner')).toBeVisible();

The comment preserves the temporal intent. The code performs a point-in-time check. Neither Playwright nor Cypress has native equivalents for always or eventually semantics, so the export is honest about the downgrade. A developer can add explicit multi-point assertions in the exported code if continuous verification matters for CI.
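
If continuous verification matters in the exported suite, one option is a small helper that re-checks an invariant around every step. This is a sketch of that idea, not something the exporter generates. runWithInvariant and the Step type are made-up names, and the helper takes plain async functions so it has no Playwright dependency of its own.

```typescript
// A step is a named async action; the invariant is an async check that
// throws when the condition doesn't hold.
type Step = { name: string; run: () => Promise<void> };

// Re-checks the invariant before each step and once after the last one.
// The thrown error names the step where the invariant broke.
async function runWithInvariant(
  steps: Step[],
  invariant: () => Promise<void>,
): Promise<void> {
  for (const step of steps) {
    try {
      await invariant();
    } catch (err) {
      throw new Error(`invariant violated before step "${step.name}": ${String(err)}`);
    }
    await step.run();
  }
  await invariant(); // final check after the last step
}
```

In an exported Playwright test, the invariant could be () => expect(page.getByRole('banner')).toBeVisible() and each step a wrapped action such as () => page.getByLabel('First Name').fill('Jane'). This still samples the invariant only between steps, so it approximates always rather than reproducing it.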

Why does this matter for async-heavy applications?

Modern web applications are asynchronous by default. API calls return at unpredictable times. WebSocket messages arrive on their own schedule. Framework rendering cycles (React reconciliation, Angular change detection, Vue reactivity batching) create timing gaps between state changes and DOM updates.

A 2024 study on handling time in tests noted that the standard approaches - fake timers, explicit waits, polling loops - address symptoms rather than root causes. Fake timers require control over the timer implementation, which breaks when third-party libraries use their own timing. Explicit waits guess at durations, creating flaky tests when the guess is wrong. Polling loops add complexity and still don't express the actual property you care about.

Temporal assertions express the property directly. "This must eventually be true within 3 seconds" is a clear contract. "This must always be true" is an invariant. "This must be true next" is a state transition check. The assertion language matches how you think about async behavior, not how the framework implements waiting.

For teams building real-time applications, this is transformative. A dashboard with live data feeds needs to verify that UI components remain consistent while data updates. A collaborative editor needs to verify that presence indicators appear within a time window. A notification system needs to verify that toasts auto-dismiss. Temporal assertions express these properties in one line each.

Try it

Download ToolPiper from the Mac App Store. Record a test, add temporal assertions through the step editor, and run it. The test results show each residual's lifecycle: when it was created, when it was evaluated, and when it resolved.

Start with one always assertion on something that should never change during your flow - the header, the navigation, the user's avatar. Run the test. If it catches something you didn't expect, you've already found a bug that traditional assertions would miss.

This is part of a series on AI-powered testing workflows. For background health monitoring, see Browser Health Monitoring. For the export story, see Export Tests to Playwright and Cypress. For the visual recorder, see Test Recorder for Browser on Mac.