Performance & Token Efficiency

Specwright's pipeline has two token-heavy phases: Phase 4 (browser exploration) and Phase 7 (BDD generation). Both are optimised by default, and there are additional steps you can take to reduce token usage further.


Built-in Exploration Optimisations (Phase 4)

Agent memory as selector cache

On every exploration run, the planner agent reads .claude/agent-memory/playwright-test-planner/MEMORY.md before opening a browser. If the memory contains selectors for the target URL, the agent enters verification mode instead of full exploration:

| Mode | Browser calls | When |
| --- | --- | --- |
| Full exploration | ≤ 20 calls | Memory empty or URL not seen before |
| Verification mode | ≤ 5 calls | Memory has selectors for this URL |

Verification mode navigates, takes one snapshot, and confirms selectors are still valid — skipping all discovery interactions.
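The mode decision can be pictured as a small function. This is only a sketch: `chooseExplorationMode` is a hypothetical name, and the substring check is an assumed stand-in for however the planner actually matches a URL against its memory. The call limits are the ones stated for each mode.

```javascript
// Hypothetical helper: picks verification mode or full exploration.
// Assumption: memory "has selectors for this URL" when its text mentions the URL.
function chooseExplorationMode(memoryText, pageURL) {
  const seen = typeof memoryText === 'string' && memoryText.includes(pageURL);
  return seen
    ? { mode: 'verification', maxBrowserCalls: 5 }
    : { mode: 'full', maxBrowserCalls: 20 };
}
```

An empty or missing memory file always yields full exploration in this sketch.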

Instruction-count-driven budget

The budget scales with your instructions[] array: each instruction is allotted three browser calls (one navigate, one snapshot, one interaction). A config with 4 instructions therefore uses at most 4 × 3 = 12 browser calls. The agent stops exploring as soon as all instructions have confirmed selectors.
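The arithmetic behind the budget is easy to check by hand. The sketch below assumes the ≤ 20 limit quoted for full exploration also caps the per-instruction budget; the source does not say how the two limits interact, and `explorationBudget` is a hypothetical name.

```javascript
// Calls charged per instruction: 1 navigate + 1 snapshot + 1 interaction.
const CALLS_PER_INSTRUCTION = 3;

// Assumption: the full-exploration limit (≤ 20 calls) also caps this budget.
const FULL_EXPLORATION_CAP = 20;

function explorationBudget(instructions) {
  return Math.min(instructions.length * CALLS_PER_INSTRUCTION, FULL_EXPLORATION_CAP);
}
```

With four instructions this gives 12 calls, matching the figure in the text.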

Targeted snapshots

After the initial full-page snapshot, the planner uses the ref parameter to snapshot only the relevant element region — not the full page. This reduces snapshot token size by 60–80% for content-heavy pages.

Source-code pre-discovery (localhost only)

For pageURL values pointing to localhost, the planner greps src/ for data-testid attributes before opening a browser. Any testids found in source code are added as selector hints, reducing the number of browser interactions needed to confirm them.
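As a rough illustration of what such a pre-discovery pass could collect, the sketch below pulls static data-testid values out of a source string with a regular expression. The function name and the regex are assumptions for illustration, not the planner's actual implementation.

```javascript
// Collects unique data-testid values from a source string (JSX or HTML).
// Only static quoted values match; dynamic ones like data-testid={id} are skipped.
function extractTestIds(source) {
  const pattern = /data-testid\s*=\s*(?:\{\s*)?(["'])(.*?)\1/g;
  const ids = new Set();
  for (const match of source.matchAll(pattern)) {
    if (match[2]) ids.add(match[2]);
  }
  return [...ids];
}
```

Dynamic test IDs are left out in this sketch and would presumably still need browser interaction to confirm.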


Built-in Generation Optimisations (Phase 7)

generate-context.md knowledge document

On repeat runs, Phase 7 reads e2e-tests/.knowledge/generate-context.md (~2.5 KB) instead of the full stepHelpers.js source (~17 KB). The knowledge document contains the same FIELD_TYPES constants, faker patterns, and API call signatures in a pre-extracted table format.

Generate or regenerate this file after updating stepHelpers.js:

node e2e-tests/scripts/extract-generate-context.js

| Source | Size | Used when |
| --- | --- | --- |
| generate-context.md | ~2.5 KB | File exists (repeat runs) |
| stepHelpers.js | ~17 KB | File missing (first run / fallback) |
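The fallback rule amounts to a single existence check. In the sketch below, `pickContextSource` is a hypothetical name, and both paths are passed in as parameters because the source does not state where stepHelpers.js lives.

```javascript
// `exists` is injected so the sketch runs without touching the disk;
// in a real script it could be fs.existsSync.
function pickContextSource(exists, knowledgeDocPath, helpersPath) {
  // Prefer the small pre-extracted document; fall back to the full source.
  return exists(knowledgeDocPath) ? knowledgeDocPath : helpersPath;
}
```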

Planner memory selector table

When the planner memory has a ## Key Selectors section for the target module, Phase 7 reads that table (~1–3 KB) instead of seed.spec.js (~6.6 KB raw JS). The structured table is faster for the code-generator to parse.
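To show why a dedicated section is cheap to read, here is a minimal sketch that lifts one "## " section out of a Markdown string. The helper name is hypothetical and the layout of MEMORY.md is assumed; per-module matching is not handled.

```javascript
// Returns the text under a "## <heading>" section, up to the next
// level-1 or level-2 heading. Returns null when the section is absent.
function extractSection(markdown, heading) {
  const lines = markdown.split(/\r?\n/);
  const start = lines.findIndex((line) => line.trim() === `## ${heading}`);
  if (start === -1) return null;
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((line) => /^#{1,2} /.test(line));
  const body = end === -1 ? rest : rest.slice(0, end);
  return body.join('\n').trim();
}
```

A null result would correspond to the case where Phase 7 reads seed.spec.js instead.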

No shared/ directory scan

The plan file produced by Phase 4 includes a "Shared steps to reuse" section listing exactly which shared step files apply. Phase 7 reads only those named files — never the entire shared/ directory.


Tips to Reduce Token Consumption

Write precise instructions[]

Each instruction maps to one browser interaction. Vague entries like "Test the form" cause the planner to guess what to interact with, leading to unnecessary exploration calls. Precise entries like "Click 'Create List', enter a name with <gen_test_data>, click Save" give the planner a clear budget.

Use explore: false when you don't need live browser discovery

Set explore: false when you're confident the instructions are precise enough to write test cases without live exploration. This skips the browser session entirely and generates BDD files directly from the parsed plan.

Good candidates for explore: false:

  • Localhost projects with a rich source base — the planner can grep src/ for data-testid attributes and read component files to infer selectors without opening a browser
  • Re-runs after a successful exploration — agent memory already has validated selectors; skip re-exploration and go straight to generation
  • Well-documented external apps — when your instructions[] are detailed enough that selector discovery isn't needed (e.g. you already know the testids or ARIA roles)
{
  moduleName: '@CheckoutFlow',
  pageURL: 'http://localhost:3000/checkout',
  explore: false,   // uses src/ grep + agent memory instead of browser
  instructions: [
    'Verify the order summary shows the correct item count and total price.',
    'Click "Place Order" with valid card details — verify confirmation screen.',
  ],
}

When to keep explore: true. Keep it whenever you want the agent to discover selectors via live browser interaction. This is the default and is always valid; explore: false is purely an optimisation for cases where you're confident live discovery isn't needed.

Skip execution during iteration

While refining your instructions[], set both execution flags to false so that only exploration and generation run; seed validation (Phase 5) and test execution (Phase 8) are skipped:

{
  runExploredCases: false,   // skip Phase 5 seed validation
  runGeneratedCases: false,  // skip Phase 8 test execution
}

Re-enable them only when you're satisfied with the generated BDD files.
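One way to avoid editing the config back and forth is to switch both flags from a single toggle. The sketch below is an assumption about how you might wire this yourself; `withIterationFlags` is a hypothetical helper, not part of Specwright.

```javascript
// Hypothetical helper: returns a copy of the config with both execution
// flags off while `iterating` is true, and the config unchanged otherwise.
function withIterationFlags(config, iterating) {
  if (!iterating) return config;
  return { ...config, runExploredCases: false, runGeneratedCases: false };
}
```

The toggle could come from anything you control, for example an environment variable of your own choosing.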

Limit subModuleName[] to actual phases

Each submodule name triggers a separate exploration pass. Only list the phases that exist in your workflow:

// ❌ Wastes tokens exploring non-existent phases
subModuleName: ['@0-Setup', '@1-Action', '@2-Verify', '@3-Cleanup'],

// ✅ Only what's needed
subModuleName: ['@0-Setup', '@1-Action', '@2-Verify'],

Keep generate-context.md up to date

Run the extraction script whenever you add new FIELD_TYPES or faker patterns to stepHelpers.js. Stale knowledge docs fall back to the full 17 KB source read.
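A simple way to notice when a re-run is due is to compare modification times. The sketch below assumes mtime is a good enough signal; `isKnowledgeDocStale` is a hypothetical helper, and the timestamps are passed in so the logic stays independent of where the files live.

```javascript
// True when the knowledge document should be regenerated.
// docMtimeMs is null when generate-context.md does not exist yet.
function isKnowledgeDocStale(helpersMtimeMs, docMtimeMs) {
  if (docMtimeMs === null) return true;
  return helpersMtimeMs > docMtimeMs;
}
```

In a Node script the timestamps could come from fs.statSync(path).mtimeMs, and a true result would be the cue to run the extraction script again.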