Coding is Writing

by Rebecca Sutton Koeser

Rebecca Sutton Koeser is the Lead Developer at Princeton’s Center for Digital Humanities, designing and building customized software in the service of scholarly projects such as the Princeton Prosody Archive and the Shakespeare and Company Project. Trained in both English literature and computer science, she is well positioned to explore the nexus of coding and writing.

I read an article years ago comparing writing software to creative writing. The author’s takeaway was that, as with writing, some people will be more gifted than others. As someone interested in both literature and computers, the idea of coding as writing stuck with me – but I would take this one step further. Just as anyone can learn to write, or get better at writing, anyone can learn to code or get better at coding. Like any skill, it may come more easily to some than others, and some people may enjoy it more than others, but that doesn’t mean you can’t do it.

Because I want to continue to improve my own writing skills, I took an advanced writing class this summer, and the instructor, Princeton Writes’ director, John Weeren, reminded us to frame our writing in terms of audience and goals. It turns out those are also useful ways to think about writing code.

Software is very goal-oriented: code is written to accomplish some purpose, whether to calculate trajectories for a rocket, display an image, encrypt a message, or simply print out a text like “Hello, world!” (the classic first step in learning a new programming language, invented by Brian Kernighan, Princeton Professor of Computer Science and a member of the Center for Digital Humanities’ Executive Committee).

It’s more interesting, though, to think of software in terms of audience. We can regard the machine that will run the software as the primary audience. This is why programming languages are so particular: we’re communicating with a very literal-minded entity with a circumscribed vocabulary. Unlike a human reader, who will make leaps of understanding and connect things you don’t explicitly mention, or see mistakes and understand what you meant, the computer actually refuses to read what you write if you get the syntax wrong. In fact, it’s a little more complicated than that – code written in a high-level programming language has to be translated into machine code before it can run, so when we write code, we’re really communicating with a digital interpreter that translates what we write into the limited, structured vocabulary of the machine.
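To make the machine's literal-mindedness concrete, here is a minimal sketch in Python (my choice of language for illustration; the function and variable names are hypothetical). It asks Python's own parser to read two nearly identical lines, one of which is missing a closing parenthesis. A human reader would shrug off the slip; the parser will not.

```python
import ast


def machine_can_read(source: str) -> bool:
    """Return True if Python can parse the text, False if the syntax is wrong."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Two versions of the same one-line program; the second drops a parenthesis.
well_formed = 'print("Hello, world!")'
missing_paren = 'print("Hello, world!"'

print(machine_can_read(well_formed))
print(machine_can_read(missing_paren))
```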

There’s another important audience for code, and that’s other people – usually developers or aspiring developers, and sometimes even your future self! The machine doesn’t care how the code is organized or if it’s documented – comments in the code are thrown away when the interpreter translates the instructions. But that organization and documentation matters a great deal to other humans when they look at the code, either to read it and understand what it’s doing, or to maintain and expand it. A large codebase is like a book with sections and chapters; the code needs to be structured and organized so that human minds can work with it. We can think of it as an edited volume with chapters by multiple authors and editors that gets expanded and revised over time.
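One way to see that comments are written only for human readers is to ask Python for the structure it actually keeps after reading a program. The sketch below is illustrative rather than definitive, and its variable names are my own: it parses the same two-line program with and without comments and compares the results.

```python
import ast

# The same short program, once with notes for human readers and once without.
with_comments = '''
# Greet the reader; this note is for humans only.
message = "Hello, world!"  # the classic first program
print(message)
'''

without_comments = '''
message = "Hello, world!"
print(message)
'''

# ast.parse builds the tree of instructions Python will work from;
# ast.dump turns that tree into text so the two versions can be compared.
tree_with = ast.dump(ast.parse(with_comments))
tree_without = ast.dump(ast.parse(without_comments))

same_instructions = tree_with == tree_without
print(same_instructions)
```

The comparison ignores line positions, so what remains is only what the machine will act on, and the comments are nowhere in it.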

A lot of the phrases we use to describe writing code are similar to those we use to describe writing. We talk about code being “readable,” and some programming languages are known for being more readable than others. We “refactor” code, which is a kind of revision – restructuring or simplifying the code, usually with the goal of making it more efficient or readable while preserving the original logic. There’s a practice called “code review,” which is akin to editing – reading through to make sure the logic is correct and that the intent is clear. We talk about whether a piece of code is “elegant,” and, as with writing, shorter code is often better, but also harder to write. Like writing, code can be brief like a short story, or long and complicated like a novel.

We can also think of code as a choose-your-own-adventure text, since it isn’t always read in order. Nor is it necessarily only read once. One of the powerful things about code is that some sections can be read over and over, with some configurable variation. In literary writing, we may find references and allusions that bring to mind another story or image – similarly, software has actual callouts to other code; it tells the machine to go read and process some other piece of code and then come back. Like a reference or allusion, that other piece of code may be from a different “text” entirely – I can tell the machine to read something from a software library or package written by someone else so that I can accomplish my own goals more quickly.
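These three ideas (a passage read repeatedly with variation, a callout to other code, and a borrowing from someone else's "text") fit in a few lines. What follows is a small sketch with hypothetical names, assuming Python and its standard textwrap module as the borrowed library.

```python
import textwrap  # a different "text": a standard-library module written by others


def greet(name: str) -> str:
    """A passage the machine can be sent to read, then come back from."""
    return f"Hello, {name}!"


readers = ["machine", "colleague", "future self"]

# The loop rereads the same lines once per reader, varying only the name.
greetings = []
for reader in readers:
    greetings.append(greet(reader))  # callout: go read greet(), then return here

# Borrowed code does the line-wrapping so we don't have to write it ourselves.
wrapped = textwrap.fill(" ".join(greetings), width=40)
print(wrapped)
```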

You may have heard that coding is all about logic, but thinking through the logical flow of a piece of code is a bit like making an argument in prose. You assemble your evidence carefully and sequence it properly to make the strongest case you can. However, when writing software, you have to decide how (or if!) you want to handle every possible error or exception (think counter-argument). With prose, you can’t tell if your argument will be persuasive; writing code is sometimes more satisfying in this regard, since you can test and verify that your logic works.
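As a rough illustration of anticipating the counter-argument and then checking the logic, consider the sketch below. The function, its name, and its error message are invented for this example; the point is only that the exceptional case is handled on purpose and that both paths can be verified.

```python
def words_per_line(word_count: int, line_count: int) -> float:
    """Average number of words per line in a text."""
    if line_count == 0:
        # The counter-argument: an empty text. We choose to object loudly
        # rather than let the division fail in a more confusing way.
        raise ValueError("a text with no lines has no average")
    return word_count / line_count


def check_logic() -> bool:
    """Verify the ordinary case and the exceptional one."""
    assert words_per_line(12, 4) == 3.0
    try:
        words_per_line(12, 0)
    except ValueError:
        return True  # the error we planned for was raised
    return False


print(check_logic())
```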

As a software developer, I sometimes liken the large codebases I’ve worked on to a corpus of literary texts. My work is probably larger in volume than the corpus of a poet, but no one will ever inadvertently memorize my lines of code. In fact, the better I do my job, the less likely it is that anyone will read my code, since we usually only go back to old code if we have to find and fix problems. However, the tools of collaborative writing and editing are still powerful ones for thinking about and writing code, which is in turn expanding and transforming the ways we write and communicate with each other.