
Automated Visual Regression Testing With Playwright


Comparing visual artifacts can be a powerful, if fickle, approach to automated testing. Playwright makes this seem simple for websites, but the details might take a little finessing.

Recent downtime prompted me to scratch an itch that had been plaguing me for a while: The style sheet of a website I maintain has grown a little unwieldy as we've been adding code while exploring new features. Now that we have a better idea of the requirements, it's time for internal CSS refactoring to pay down some of our technical debt, taking advantage of modern CSS features (like using CSS nesting for more obvious structure). More importantly, a cleaner foundation should make it easier to introduce that dark mode feature we're sorely missing, so we can finally respect users' preferred color scheme.

However, being of the apprehensive persuasion, I was reluctant to make large changes for fear of unwittingly introducing bugs. I needed something to guard against visual regressions while refactoring; except that means snapshot testing, which is notoriously slow and brittle.

In this context, snapshot testing means taking screenshots to establish a reliable baseline against which we can compare future results. As we'll see, those artifacts are influenced by a multitude of factors that might not always be fully controllable (e.g. timing, variable hardware resources, or randomized content). We also have to maintain state between test runs, i.e. save those screenshots, which complicates the setup and means our test code alone doesn't fully describe expectations.

Having procrastinated without a more agreeable solution revealing itself, I finally set out to create what I assumed would be a quick spike. After all, this wouldn't be part of the regular test suite; just a one-off utility for this particular refactoring task.

Fortunately, I had vague recollections of past research and quickly rediscovered Playwright's built-in visual comparison feature. Because I try to choose dependencies carefully, I was glad to see that Playwright appears not to rely on many external packages.

Setup

The recommended setup with npm init playwright@latest does a decent job, but my minimalist taste had me set everything up from scratch instead. This do-it-yourself approach also helped me understand how the different pieces fit together.

Given that I expect snapshot testing to only be used on rare occasions, I wanted to isolate everything in a dedicated subdirectory, called test/visual; that will be our working directory from here on out. We'll start with package.json to declare our dependencies, adding a few helper scripts (spoiler!) while we're at it:

{
  "scripts": {
    "test": "playwright test",
    "report": "playwright show-report",
    "update": "playwright test --update-snapshots"
  },
  "devDependencies": {
    "@playwright/test": "^1.49.1"
  }
}

If you don't want node_modules hidden in some subdirectory but also don't want to burden the root project with this rarely-used dependency, you might resort to manually invoking npm install --no-save @playwright/test in the root directory when needed.

With that in place, npm install downloads Playwright. Afterwards, npx playwright install downloads a range of headless browsers. (We'll use npm here, but you might prefer a different package manager and task runner.)

We define our test environment via playwright.config.js with a dozen or so basic Playwright settings:

import { defineConfig, devices } from "@playwright/test";

let BROWSERS = ["Desktop Firefox", "Desktop Chrome", "Desktop Safari"];
let BASE_URL = "http://localhost:8000";
let SERVER = "cd ../../dist && python3 -m http.server";

let IS_CI = !!process.env.CI;

export default defineConfig({
  testDir: "./",
  fullyParallel: true,
  forbidOnly: IS_CI,
  retries: 2,
  workers: IS_CI ? 1 : undefined,
  reporter: "html",
  webServer: {
    command: SERVER,
    url: BASE_URL,
    reuseExistingServer: !IS_CI
  },
  use: {
    baseURL: BASE_URL,
    trace: "on-first-retry"
  },
  projects: BROWSERS.map(ua => ({
    name: ua.toLowerCase().replaceAll(" ", "-"),
    use: { ...devices[ua] }
  }))
});

Here we expect our static website to already reside within the root directory's dist folder and to be served at localhost:8000 (see SERVER; I chose Python there because it's widely available). I've included several browsers for illustration purposes. Still, we might reduce that number to speed things up (hence our simple BROWSERS list, which we then map to Playwright's more elaborate projects data structure). Similarly, continuous integration is YAGNI for my particular scenario, so that whole IS_CI dance could be discarded.

Capture and compare

Let's turn to the actual tests, starting with a minimal sample.test.js file:

import { test, expect } from "@playwright/test";

test("home page", async ({ page }) => {
  await page.goto("/");
  await expect(page).toHaveScreenshot();
});

npm test executes this little test suite (based on file-name conventions). The initial run always fails because it first needs to create baseline snapshots against which subsequent runs compare their results. Invoking npm test once more should report a passing test.

Changing our site, e.g. by recklessly messing with build artifacts in dist, should make the test fail again. Such failures will offer various options to compare expected and actual visuals:

Failing test with slightly different screenshots side by side

We can also inspect those baseline snapshots directly: Playwright creates a folder for screenshots named after the test file (sample.test.js-snapshots in this case), with file names derived from the respective test's title (e.g. home-page-desktop-firefox.png).
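
If those derived names ever become unwieldy, toHaveScreenshot also accepts an explicit file name as its first argument; a small aside rather than anything this setup needs, and "home-page.png" below is just a made-up example:

// optional: pin the snapshot's base name instead of deriving it
// from the test title
await expect(page).toHaveScreenshot("home-page.png");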

Generating tests

Getting back to our original motivation, what we want is a test for every page. Instead of arduously writing and maintaining repetitive tests, we'll create a simple web crawler for our website and have tests generated automatically; one for each URL we've identified.

Playwright's global setup allows us to perform preparatory work before test discovery begins: determine those URLs and write them to a file. Afterward, we can dynamically generate our tests at runtime.

While there are other ways to pass data between the setup and test-discovery phases, having a file on disk makes it easy to modify the list of URLs before test runs (e.g. temporarily ignoring irrelevant pages).
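
For illustration, the file we'll end up writing (sitemap.json, as we'll see below) is nothing more than an array of local paths, so hand-editing it is trivial; the entries here are made up:

[
    "/",
    "/topics",
    "/articles/example-article"
]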

Site map

The first step is to extend playwright.config.js by inserting globalSetup and exporting two of our configuration values:

export let BROWSERS = ["Desktop Firefox", "Desktop Chrome", "Desktop Safari"];
export let BASE_URL = "http://localhost:8000";

// etc.

export default defineConfig({
  // etc.
  globalSetup: require.resolve("./setup.js")
});

Although we're using ES modules here, we can still rely on CommonJS-specific APIs like require.resolve and __dirname. It appears there's some Babel transpilation happening in the background, so what's actually being executed is probably CommonJS? Such nuances sometimes confuse me because it isn't always obvious what's being executed where.
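
For what it's worth, in a plain Node ESM module (outside Playwright's transpilation pipeline) the equivalent would have to be derived from import.meta.url; a sketch of that standard pattern, not something this setup relies on:

// ESM-native stand-in for __dirname (not needed here, since Playwright
// tolerates the CommonJS globals)
import { dirname } from "node:path";
import { fileURLToPath } from "node:url";

let DIR = dirname(fileURLToPath(import.meta.url));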

We can now reuse those exported values within a newly created setup.js, which spins up a headless browser to crawl our site (simply because that's easier here than using a separate HTML parser):

import { BASE_URL, BROWSERS } from "./playwright.config.js";
import { createSiteMap, readSiteMap } from "./sitemap.js";
import playwright from "@playwright/test";

export default async function globalSetup(config) {
  // only create the site map if it doesn't already exist
  try {
    readSiteMap();
    return;
  } catch(err) {}

  // launch browser and initiate crawler
  let browser = playwright.devices[BROWSERS[0]].defaultBrowserType;
  browser = await playwright[browser].launch();
  let page = await browser.newPage();
  await createSiteMap(BASE_URL, page);
  await browser.close();
}

This is fairly boring glue code; the actual crawling happens within sitemap.js:

  • createSiteMap determines URLs and writes them to disk.
  • readSiteMap merely reads any previously created site map from disk. This will be our foundation for dynamically generating tests. (We'll see later why this needs to be synchronous.)

Fortunately, the website in question provides a comprehensive index of all pages, so my crawler only needs to collect unique local URLs from that index page:

function extractLocalLinks(baseURL) {
  let urls = new Set();
  let offset = baseURL.length;
  for(let { href } of document.links) {
    if(href.startsWith(baseURL)) {
      let path = href.slice(offset);
      urls.add(path);
    }
  }
  return Array.from(urls);
}

Wrapping that in more boring glue code gives us our sitemap.js:

import { readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

let ENTRY_POINT = "/topics";
let SITEMAP = join(__dirname, "./sitemap.json");

export async function createSiteMap(baseURL, page) {
  await page.goto(baseURL + ENTRY_POINT);
  let urls = await page.evaluate(extractLocalLinks, baseURL);
  let data = JSON.stringify(urls, null, 4);
  writeFileSync(SITEMAP, data, { encoding: "utf-8" });
}

export function readSiteMap() {
  try {
    var data = readFileSync(SITEMAP, { encoding: "utf-8" });
  } catch(err) {
    if(err.code === "ENOENT") {
      throw new Error("missing site map");
    }
    throw err;
  }
  return JSON.parse(data);
}

function extractLocalLinks(baseURL) {
  // etc.
}

The interesting bit here is that extractLocalLinks is evaluated within the browser context, so we can rely on DOM APIs, notably document.links, while the rest is executed within the Playwright environment (i.e. Node).
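
For a minimal illustration of that boundary outside our crawler (imagine it inside an async function such as globalSetup): the callback runs in the browser, and only serializable values make it back to the Node side.

// runs inside the browser context; the numeric result is
// serialized back to Node
let linkCount = await page.evaluate(() => document.links.length);
console.log(`found ${linkCount} links`);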

Tests

Now that we have our list of URLs, we basically just need a test file with a simple loop to dynamically generate corresponding tests:

for(let url of readSiteMap()) {
  test(`page at ${url}`, async ({ page }) => {
    await page.goto(url);
    await expect(page).toHaveScreenshot();
  });
}

This is why readSiteMap had to be synchronous above: Playwright doesn't currently support top-level await within test files.

In practice, we'll want better error reporting for when the site map doesn't exist yet. Let's call our actual test file viz.test.js:

import { readSiteMap } from "./sitemap.js";
import { test, expect } from "@playwright/test";

let sitemap = [];
try {
  sitemap = readSiteMap();
} catch(err) {
  test("site map", ({ page }) => {
    throw new Error("missing site map");
  });
}

for(let url of sitemap) {
  test(`page at ${url}`, async ({ page }) => {
    await page.goto(url);
    await expect(page).toHaveScreenshot();
  });
}

Getting here was a bit of a journey, but we're pretty much done… unless we have to deal with reality, which typically takes a bit more tweaking.

Exceptions

Because visual testing is inherently flaky, we sometimes need to compensate via special casing. Playwright lets us inject custom CSS, which is often the simplest and most effective approach. Tweaking viz.test.js…

// etc.
import { join } from "node:path";

let OPTIONS = {
  stylePath: join(__dirname, "./viz.tweaks.css")
};

// etc.
  await expect(page).toHaveScreenshot(OPTIONS);
// etc.

… allows us to define exceptions in viz.tweaks.css:

/* suppress state */
main a:visited {
  color: var(--color-link);
}

/* suppress randomness */
iframe[src$="/articles/signals-reactivity/demo.html"] {
  visibility: hidden;
}

/* suppress flakiness */
body:has(h1 a[href="/wip/unicode-symbols/"]) {
  main tbody > tr:last-child > td:first-child {
    font-size: 0;
    visibility: hidden;
  }
}

:has() strikes again!

Page vs. viewport

At this point, everything seemed hunky-dory to me, until I realized that my tests didn't actually fail after I had changed some styling. That's not good! What I hadn't taken into account is that .toHaveScreenshot only captures the viewport rather than the entire page. We can rectify that by further extending playwright.config.js.

export let WIDTH = 800;
export let HEIGHT = WIDTH;

// etc.

  projects: BROWSERS.map(ua => ({
    name: ua.toLowerCase().replaceAll(" ", "-"),
    use: {
      ...devices[ua],
      viewport: {
        width: WIDTH,
        height: HEIGHT
      }
    }
  }))

…and then by adjusting viz.test.js's test-generating loop:

import { WIDTH, HEIGHT } from "./playwright.config.js";

// etc.

for(let url of sitemap) {
  test(`page at ${url}`, async ({ page }) => {
    await checkSnapshot(url, page);
  });
}

async function checkSnapshot(url, page) {
  // determine page height with default viewport
  await page.setViewportSize({
    width: WIDTH,
    height: HEIGHT
  });
  await page.goto(url);
  await page.waitForLoadState("networkidle");
  let height = await page.evaluate(getFullHeight);

  // resize viewport to full page height before snapshotting
  await page.setViewportSize({
    width: WIDTH,
    height: Math.ceil(height)
  });
  await page.waitForLoadState("networkidle");
  await expect(page).toHaveScreenshot(OPTIONS);
}

function getFullHeight() {
  return document.documentElement.getBoundingClientRect().height;
}

Note that we've also introduced a waiting condition, holding until there's no network traffic for a while in a crude attempt to account for things like lazy-loading images.
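
If networkidle alone proves insufficient, one equally crude safeguard could be to scroll through the page right after page.goto, nudging lazy-loaded assets into action before the height measurement and screenshot; a sketch only, with triggerLazyLoading being a hypothetical helper rather than part of the setup above:

// hypothetical helper: scroll to the bottom and back up to trigger
// lazy-loaded content
async function triggerLazyLoading(page) {
  await page.evaluate(async () => {
    window.scrollTo(0, document.body.scrollHeight);
    await new Promise(resolve => setTimeout(resolve, 500));
    window.scrollTo(0, 0);
  });
}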

Be aware that capturing the entire page is more resource-intensive and doesn't always work reliably: You might have to deal with layout shifts or run into timeouts for long or asset-heavy pages. In other words: This risks exacerbating flakiness.
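
If the manual viewport dance turns out to be too fragile for a given site, it might also be worth comparing it against the screenshot assertion's built-in full-page capture and seeing which of the two behaves more reliably; a possible variant of the assertion, not what this setup settles on:

// possible alternative to manual viewport resizing
await expect(page).toHaveScreenshot({ ...OPTIONS, fullPage: true });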

Conclusion

So much for that quick spike. While it took more effort than expected (I believe that's called "software development"), this might actually solve my original problem now (not a common feature of software these days). Of course, shaving this yak still leaves me itchy, as I've yet to do the actual work of scratching that CSS itch without breaking anything. Then comes the real challenge: Retrofitting dark mode onto an existing website. I just might need more downtime.
