Noah here. There are a bunch of different ways to test software. Two of the most common methods are tests at the code level, where developers write additional code to help ensure the new behavior works as expected (and continues to do so), and human quality assurance (QA) testing, where individuals click around an application to check that everything works right. The latter is generally needed because the former isn’t foolproof.
While it’s possible to write bug-free software, with any program of reasonable complexity the likelihood gets pretty low. Even with lots of code-level tests and typing (which helps ensure that the inputs to a function match what it expects to receive), once a codebase reaches a certain size it’s effectively impossible to guarantee there aren’t any unintended behaviors lurking.
One reason it’s so hard to write completely bug-free software was proven by Alan Turing back in 1936. He set out to answer whether you could write a general-purpose program that could look at any other program and determine whether it would halt (finish) or run into some infinite loop. (One quick note: halting is good in computing. It means the program has finished what it was meant to do and the code completes. The alternative is something running endlessly.)
Turing imagined that there was a special machine that could solve the Halting Problem. Then he showed how we could have this machine analyze itself, in such a way that it has to halt if it runs forever, and run forever if it halts. Like a hound that finally catches its tail and devours itself, the mythical machine vanishes in a fury of contradiction. (That’s the sort of thing you don’t say in a research paper.)
Basically, he showed that if you took the output of this magical program that could check for halting and fed it into another machine programmed to do the opposite (run forever if the checker says it halts, and halt if the checker says it runs forever), then asking the checker about that contrary machine gives a wrong answer either way. The only way out of the contradiction is that the magical checker can’t exist. (There’s even a fun Seuss-style poem about the Halting Problem if you want more.)
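For the code-inclined, here is a minimal sketch of that argument in TypeScript. It is a toy model rather than Turing’s actual construction: “running” a program is reduced to reporting which behavior it would have, so the example itself always finishes. All the names (`HaltingChecker`, `buildContrarian`, `optimist`, `pessimist`) are made up for illustration.

```typescript
// Toy model: a "program" just reports the behavior it would have if run.
type Behavior = "halts" | "runs forever";
type Program = () => Behavior;
type HaltingChecker = (program: Program) => Behavior;

// Build the troublemaker: a program that asks the checker about itself
// and then does the opposite of whatever the checker predicted.
function buildContrarian(checker: HaltingChecker): Program {
  const contrarian: Program = () =>
    checker(contrarian) === "halts" ? "runs forever" : "halts";
  return contrarian;
}

// True when the checker's prediction about the contrarian is wrong.
function isFooledByContrarian(checker: HaltingChecker): boolean {
  const contrarian = buildContrarian(checker);
  return checker(contrarian) !== contrarian();
}

// Two hypothetical checkers. Any checker that returns a consistent
// answer gets fooled the same way in this model.
const optimist: HaltingChecker = () => "halts";
const pessimist: HaltingChecker = () => "runs forever";

console.log(isFooledByContrarian(optimist)); // true
console.log(isFooledByContrarian(pessimist)); // true
```

Whatever the checker predicts, the contrarian does the other thing, which is the whole trick.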
Why is this interesting?
While it might be impossible to be 100% certain your software is bug-free, that’s the goal of every software development organization. And while human QA testing is great and necessary, finding ways to augment those humans is important. A few weeks ago I discovered Playwright, which is a tool for writing frontend or end-to-end tests that essentially reproduce the work of a human tester by clicking around an application using code instead of muscles. Since then I’ve been having a lot of fun writing an ever-expanding set of tests.
Playwright isn’t the first tool of its kind. In fact, the folks behind it previously developed a tool called Puppeteer that does something quite similar. But Playwright seems to be quickly picking up steam as the tool of choice for managing these types of tests. The whole thing is made possible largely because Chrome, the browser you’re likely reading this on right now, has a “headless” version that allows developers to operate it with code instead of clicks. With that as the engine, you simply write code telling the browser where you want to click, describing elements on the page using their CSS selectors, their text, or even their general location.
Just clicking through an application wouldn’t be enough for testing though. Most code-based tests are assertion-based, meaning you tell the code what to expect and fail the test if it doesn’t deliver. If you’re sorting a list, for instance, you might grab the first entry in the pre-sorted A-Z list and then check it against the first entry in the Z-A sort. If the two match, the list failed to sort and the test fails. Playwright makes this easy. You write some code to grab the first value, hit the sort button, and then assert (expect in Playwright parlance) that the new first value doesn’t match the one you grabbed pre-sort.
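Stripped of the browser, the shape of that sort check looks roughly like the TypeScript below. This is a sketch, not Playwright code: `createSortableList` is a hypothetical in-memory stand-in for the application, and `expectNotEqual` is a homemade assertion helper.

```typescript
// Hypothetical stand-in for the app under test: a list plus a sort toggle.
function createSortableList(items: string[]) {
  let ascending = true;
  return {
    // What a tester would see at the top of the list right now.
    firstEntry(): string {
      const sorted = [...items].sort((a, b) => a.localeCompare(b));
      return ascending ? sorted[0] : sorted[sorted.length - 1];
    },
    // Equivalent of clicking the sort button: flip A-Z to Z-A and back.
    clickSort(): void {
      ascending = !ascending;
    },
  };
}

// Tiny assertion helper: fail loudly when the value did not change.
function expectNotEqual(actual: string, unexpected: string): void {
  if (actual === unexpected) {
    throw new Error(`Expected the first entry to change, but it is still "${actual}"`);
  }
}

const list = createSortableList(["banana", "apple", "cherry"]);
const before = list.firstEntry(); // "apple"
list.clickSort();
const after = list.firstEntry(); // "cherry"
expectNotEqual(after, before); // passes; would throw if the sort did nothing
```

In a real Playwright test the same three steps would go through the browser: read the value with something like `page.locator(...).first().innerText()`, click the button with `.click()`, and finish with `await expect(locator).not.toHaveText(before)`. The selectors depend entirely on the page being tested.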
What makes Playwright nice, and a lot easier to use than its predecessor, is how many annoying things they’ve ironed out. Webpages and applications load at different speeds and often push in data in a specific order. Playwright has some patience built in, which helps ensure tests aren’t flaky (passing one moment, failing the next for no good reason).
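To get a feel for what that patience buys you, here is a rough TypeScript sketch of a check that retries instead of giving up on the first look. It is an illustration only: Playwright’s real waiting is time-based and lives inside the library, while this version simply retries a bounded number of times. `expectEventually` and `createSlowHeading` are made-up names.

```typescript
// Retry a read until it shows the expected value or attempts run out.
function expectEventually(
  read: () => string,
  expected: string,
  maxAttempts: number,
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (read() === expected) {
      return attempt; // how many reads it took to see the expected value
    }
  }
  throw new Error(`Still not "${expected}" after ${maxAttempts} attempts`);
}

// Hypothetical slow page: shows a placeholder for the first two reads.
function createSlowHeading(): () => string {
  let reads = 0;
  return () => (++reads <= 2 ? "Loading..." : "Dashboard");
}

const attemptsNeeded = expectEventually(createSlowHeading(), "Dashboard", 5);
console.log(attemptsNeeded); // 3
```

A check that only looks once (`maxAttempts` of 1) would fail on this slow page even though nothing is actually broken, which is exactly the kind of flakiness the built-in waiting is meant to smooth over.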
Playwright, like any other testing tool, won’t solve the halting problem. But it’s a helpful tool in delivering software with fewer bugs. (NRB)
Quick Links:
Simon Willison, a developer who does a lot of open source work, has also been digging into Playwright lately and released a simple tool to do screenshots from the command line using Playwright. (NRB)
--
WITI x McKinsey:
An ongoing partnership where we highlight interesting McKinsey research, writing, and data.
Want to keep your employees? Redesign the office. As many contemplate a return to physical offices, get perspective on people-centered design and what comes next in a new interview with Diane Hoskins, co-CEO of Gensler, a global design and architecture firm. Check it out.
--
Thanks for reading,
Noah (NRB) & Colin (CJN)
—
Why is this interesting? is a daily email from Noah Brier & Colin Nagy (and friends!) about interesting things.