Playwright has matured into the default choice for end-to-end testing in modern web applications, and for good reason: auto-waiting, multi-browser support, native network interception, and a test runner that understands parallelism out of the box. But having a powerful tool does not automatically produce maintainable tests. After running Playwright at scale across multiple teams — and teaching it in my automation courses at UPC — I have compiled the patterns that separate test suites that scale from those that collapse under their own weight.
Project Structure That Scales
The first decision that determines long-term maintainability is how you organize your files. I recommend this structure for any team working with more than 20 test files:
```
tests/
  e2e/
    auth/
      login.spec.ts
      signup.spec.ts
    payments/
      checkout.spec.ts
      refund.spec.ts
  pages/
    LoginPage.ts
    CheckoutPage.ts
    BasePage.ts
  fixtures/
    auth.fixture.ts
    data.fixture.ts
  utils/
    api-helpers.ts
    test-data-factory.ts
playwright.config.ts
```
The key principles: test specs live under tests/e2e/ organized by feature domain, Page Objects live under pages/, custom fixtures under fixtures/, and shared utilities under utils/. This separation makes it immediately clear where new code belongs and prevents the "everything in one folder" entropy that plagues growing test suites.
Page Object Model Done Right
The Page Object Model (POM) is the most widely recommended pattern for E2E test organization, but it is also the most commonly misimplemented. The typical mistake is creating "god objects" — a single DashboardPage class with 50 methods that covers every possible interaction on the dashboard. These become impossible to maintain because every dashboard change touches the same file.
Instead, design Page Objects around user intents, not page URLs. A checkout flow might involve a CartPage, a ShippingFormPage, and a PaymentPage — even if they all render within the same single-page application route.
```typescript
// pages/CheckoutPage.ts
import { type Page, type Locator } from '@playwright/test';

export class CheckoutPage {
  readonly page: Page;
  readonly shippingAddress: Locator;
  readonly paymentMethod: Locator;
  readonly placeOrderButton: Locator;
  readonly orderConfirmation: Locator;

  constructor(page: Page) {
    this.page = page;
    this.shippingAddress = page.getByLabel('Shipping address');
    this.paymentMethod = page.getByRole('combobox', { name: 'Payment method' });
    this.placeOrderButton = page.getByRole('button', { name: 'Place order' });
    this.orderConfirmation = page.getByTestId('order-confirmation');
  }

  async fillShipping(address: string) {
    await this.shippingAddress.fill(address);
  }

  async selectPayment(method: string) {
    await this.paymentMethod.selectOption(method);
  }

  async placeOrder() {
    await this.placeOrderButton.click();
    await this.orderConfirmation.waitFor({ state: 'visible' });
  }
}
```
Notice that locators are defined in the constructor using Playwright's semantic selectors (getByLabel, getByRole, getByTestId), not raw CSS selectors. This makes the Page Object resilient to HTML structure changes while remaining readable.
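A spec that consumes this Page Object stays short and focused on intent. Here is a minimal sketch of what such a spec could look like; the route, shipping address, payment option, and confirmation copy are all illustrative assumptions, not values from the original:

```typescript
// tests/e2e/payments/checkout.spec.ts (sketch; URL, data, and copy are illustrative)
import { test, expect } from '@playwright/test';
import { CheckoutPage } from '../../pages/CheckoutPage';

test('user can place an order', async ({ page }) => {
  await page.goto('/checkout');
  const checkout = new CheckoutPage(page);

  // Each step reads as a user action, not a sequence of selectors
  await checkout.fillShipping('123 Main St, Barcelona');
  await checkout.selectPayment('credit-card');
  await checkout.placeOrder();

  // Assert on what the user sees, not on internal state
  await expect(checkout.orderConfirmation).toContainText('Thank you');
});
```

Because the spec never touches a selector directly, a markup change in the checkout form means editing one Page Object, not every test that exercises it.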
Fixture Patterns with test.extend
Playwright's test.extend is one of its most powerful features, yet many teams underuse it. Custom fixtures let you encapsulate setup and teardown logic — authenticated sessions, test data creation, API state — so that test specs remain focused on behavior verification.
```typescript
// fixtures/auth.fixture.ts
import { test as base, expect } from '@playwright/test';
import { CheckoutPage } from '../pages/CheckoutPage';

type AuthFixtures = {
  authenticatedPage: CheckoutPage;
};

export const test = base.extend<AuthFixtures>({
  authenticatedPage: async ({ page }, use) => {
    // Setup: authenticate via API to skip UI login
    const response = await page.request.post('/api/auth/login', {
      data: { email: 'test@example.com', password: 'secure-password' }
    });
    const { token } = await response.json();
    await page.context().addCookies([{
      name: 'session',
      value: token,
      domain: 'localhost',
      path: '/'
    }]);
    await page.goto('/checkout');
    const checkoutPage = new CheckoutPage(page);
    await use(checkoutPage);
    // Teardown: clean up test data
    await page.request.delete('/api/test/cleanup');
  }
});

export { expect };
```
Now every test that needs an authenticated checkout page simply declares it as a fixture parameter — no repeated login logic, no shared state between tests, and automatic cleanup on teardown.
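Concretely, a spec imports the extended test from the fixture file instead of the stock one from @playwright/test. A hedged sketch, assuming the fixture above:

```typescript
// tests/e2e/payments/refund.spec.ts (sketch built on the auth fixture above)
import { test, expect } from '../../fixtures/auth.fixture';

test('authenticated user sees the checkout form', async ({ authenticatedPage }) => {
  // authenticatedPage is already logged in and sitting on /checkout;
  // the fixture runs setup before this block and teardown after it.
  await expect(authenticatedPage.placeOrderButton).toBeVisible();
});
```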
Reliable Selectors: The Foundation
Flaky tests often trace back to brittle selectors. My selector priority, which I enforce through code review and linting rules, is:
1. `getByRole` — Reflects accessibility semantics. If your button is not findable by role, it has an accessibility problem too.
2. `getByLabel` / `getByPlaceholder` / `getByText` — User-visible text selectors. Resilient to structural changes.
3. `getByTestId` — Dedicated `data-testid` attributes. Use when semantic selectors are ambiguous.
4. CSS selectors — Last resort. Avoid class names that are generated by CSS-in-JS tooling.
I explicitly prohibit XPath in code reviews. It is fragile, hard to read, and nearly always replaceable with one of the above strategies.
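To make the priority concrete, here is the same button targeted four ways, from most to least preferred. The selector values are illustrative:

```typescript
import { type Page } from '@playwright/test';

// Same element, four strategies, ordered by preference
// (all selector values here are illustrative):
function placeOrderLocators(page: Page) {
  return {
    byRole: page.getByRole('button', { name: 'Place order' }), // 1. accessibility role
    byText: page.getByText('Place order'),                     // 2. user-visible text
    byTestId: page.getByTestId('place-order'),                 // 3. dedicated test id
    byCss: page.locator('button.btn-primary.css-1x2y3z'),      // 4. brittle CSS, last resort
  };
}
```

The first three survive a redesign of the markup; the last breaks the moment a class name changes.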
Handling Flaky Tests
Flakiness is the number one credibility killer for a test suite. If developers cannot trust the results, they stop looking at them. Here are the patterns I use to eliminate and manage flakiness:
Auto-retry with trace on failure. Playwright's built-in retry mechanism combined with trace recording gives you a full diagnostic package when a test fails intermittently:
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  use: {
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure'
  },
  reporter: [
    ['html', { open: 'never' }],
    ['junit', { outputFile: 'results/junit.xml' }]
  ]
});
```
Network mocking for external dependencies. Tests that depend on third-party APIs — payment gateways, geolocation services, email providers — should mock those boundaries. Playwright's page.route() makes this straightforward:
```typescript
// Await route.fulfill so a failure inside the handler is not silently dropped
await page.route('**/api/payment/process', async route => {
  await route.fulfill({
    status: 200,
    contentType: 'application/json',
    body: JSON.stringify({
      transactionId: 'mock-txn-001',
      status: 'approved'
    })
  });
});
```
Isolate test data. Tests should never share database state. Each test creates its own data through API fixtures and cleans up in teardown. If two tests depend on the same user account, they will eventually collide when running in parallel.
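A small factory is usually enough to guarantee this isolation. A minimal sketch of what tests/utils/test-data-factory.ts could contain; the field names and email domain are illustrative assumptions:

```typescript
// utils/test-data-factory.ts (sketch; field names and domain are illustrative)
import { randomUUID } from 'node:crypto';

export interface TestUser {
  email: string;
  password: string;
  name: string;
}

// Every call produces a brand-new user, so parallel tests
// never collide on the same account.
export function createTestUser(overrides: Partial<TestUser> = {}): TestUser {
  const id = randomUUID();
  return {
    email: `user-${id}@example.test`,
    password: `pw-${id}`,
    name: `Test User ${id.slice(0, 8)}`,
    ...overrides,
  };
}
```

A fixture can then call this factory in setup, POST the result to your API, and delete it in teardown, giving each test its own sandboxed account.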
Parallel Execution and Sharding
Playwright runs test files in parallel by default, which is the correct behavior for CI. However, you need to design for it:
- No shared state between spec files. Each `.spec.ts` file runs in its own worker. If two files need the same setup, use fixtures — not global state.
- Use sharding for large suites. When your suite exceeds 15 minutes on a single machine, shard across multiple CI runners. Playwright supports this natively:
```yaml
# .github/workflows/e2e.yml
name: E2E Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: npx playwright test --shard=${{ matrix.shard }}
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: traces-${{ matrix.shard }}
          path: test-results/
```
This configuration splits the test suite across four parallel runners and uploads traces as artifacts whenever a shard fails. A suite that takes 20 minutes on one machine completes in roughly 5 minutes across four shards.
CI Integration Principles
Beyond sharding, these CI practices have proven essential in my experience:
- Install only the browsers you need. `npx playwright install chromium` is faster than installing all three browser engines. Run cross-browser tests nightly, not on every PR.
- Cache Playwright browsers. Browser binaries are large. Cache them between CI runs to avoid re-downloading on every build.
- Fail fast, debug later. Configure CI to stop on the first failure during PR checks (quick feedback), but run the full suite nightly (complete coverage). Two different configurations for two different purposes.
- Report to the PR. Use Playwright's HTML reporter and publish results as CI artifacts or comments on the PR. If developers have to dig through CI logs to find failures, adoption suffers.
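For sharded runs, Playwright ships a blob reporter designed for exactly this: each shard writes a blob report, and `merge-reports` combines them into a single HTML report you can publish. A sketch of the merging step, assuming the shards ran with `reporter: 'blob'` and their outputs were downloaded into a local directory (the path name here is illustrative):

```shell
# Collect every shard's blob report into one directory first
# (e.g. via actions/download-artifact), then merge into one HTML report.
npx playwright merge-reports --reporter html ./all-blob-reports
```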
Patterns I Have Learned to Avoid
After maintaining Playwright suites with hundreds of tests, these are the anti-patterns I catch in code review:
Hard-coded waits. await page.waitForTimeout(3000) is almost always wrong. Playwright's auto-waiting handles the vast majority of timing issues. If you need an explicit wait, wait for a specific condition: await page.waitForResponse(), await locator.waitFor(), or await expect(locator).toBeVisible().
Testing implementation details. Your E2E test should validate what the user sees and does, not how the frontend renders it internally. Asserting on CSS classes, internal component state, or DOM structure couples your tests to implementation decisions that will change.
Overusing E2E for what unit tests should cover. If you are writing an E2E test to verify that a utility function formats a date correctly, you are using the wrong testing level. E2E tests should cover user journeys through the application. Logic validation belongs in unit tests.
Ignoring the test pyramid. I still see teams with 200 E2E tests and 10 unit tests. This inverted pyramid means slow feedback loops, high maintenance costs, and fragile test results. A healthy ratio for most web applications is roughly 70% unit, 20% integration, 10% E2E.
Playwright gives you the tools to build a reliable E2E test suite. But tools do not enforce discipline — engineering practices do. Structure your project for growth, design Page Objects around user intents, leverage fixtures for isolation, choose resilient selectors, and integrate thoughtfully into CI. The result is a test suite that developers trust and actively maintain, rather than one they learn to ignore.