Code as memory?

It is poetic to think of code as containing the memory of its maintainers. I don’t entirely disagree with the idea, but the reality of maintaining systems that have become too unwieldy is anything but poetic.

I am currently polishing off a small demo script that I wrote a year ago. The code takes a Whatsapp export and reformats it to HTML that can be skinned. For fun I’ve used a NES stylesheet which has neat dialogs to represent a conversation à la Whatsapp where the speech bubbles on the left are your friends or colleagues and those on the right are you.

The script is largely whimsical, but it has given me plenty to think about if I want to turn this into a production project in the future.

But what does “production” mean here?

All of the code I write professionally these days is heavily tested. The code is organized well, adopts the single responsibility principle, and uses a range of linting tools to make sure the code layout is idiomatic and gotchas are addressed early, e.g. linting will often highlight insecure or fragile code.

And I follow the same principles for the most part for the code that I write for myself.

Except for when I am writing fast for fun, as in the case of Whatsmapper. Take this innocuous example:

https://github.com/ross-spencer/whatsmapper/blob/e438dcd9913f617a3ef7cb94a79874d8121e943d/src/whatsmapper/whatsmap.py#L257C1-L266

try:
    if line.startswith("["):
        individual = line.split("] ")[1].split(": ", 1)[0]
        chat_text = "".join(line.split(individual)[1:])
        if individual not in individuals:
            individuals.append(individual)
    else:
        chat_text = line
except IndexError:
    chat_text = line

In this example I make three different decisions about what the “chat_text” variable needs to be set to. But what are these decisions?

I can tell you, 48 hours separated from the code, that each new message in a Whatsapp export begins with [DATE TIMESTAMP] NAME: MESSAGE, e.g. [9/14/24, 13:06:47] ~ Alfred Stewart: g'day!

We need to tokenize these values, and so we try and do that in the code above. If a line doesn’t begin with a square bracket it is assumed to be part of a multi-line message and it will be added to the chat text variable.

The code is wrapped in a try/except in case something goes wrong while slicing the string, but do we need it?

Honestly I already can’t recall and I am loath to get rid of it just yet. And I only changed this code two days ago.
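One way to answer that question without relying on memory is to probe the snippet directly. The sketch below wraps the logic above, unchanged, in a function so it can be called with sample lines. The name `parse_line` and the sample inputs are hypothetical, made up for illustration and not taken from the repository.

```python
def parse_line(line: str, individuals: list) -> str:
    """Hypothetical wrapper: the same logic as the snippet above, made callable."""
    try:
        if line.startswith("["):
            individual = line.split("] ")[1].split(": ", 1)[0]
            chat_text = "".join(line.split(individual)[1:])
            if individual not in individuals:
                individuals.append(individual)
        else:
            chat_text = line
    except IndexError:
        chat_text = line
    return chat_text


seen = []

# A well-formed first line: the sender is recorded and the rest is returned.
first = parse_line("[9/14/24, 13:06:47] ~ Alfred Stewart: g'day!", seen)

# A continuation line that starts with "[" but contains no "] ":
# split("] ")[1] raises IndexError, so the except branch is what handles it.
continuation = parse_line("[citation needed", seen)

print(repr(first), repr(continuation), seen)
```

With these inputs the except branch does get exercised, which suggests it is worth keeping until a test says otherwise. The same kind of probe surfaces behaviour that may not be intended: the returned text keeps the leading ": ", and a message that repeats the sender's name loses it, because the line is split on the name.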

My mistakes are that I don’t have all the sample data that I used to start the script a year ago and I don’t have tests that show me example inputs and outputs as I would like them to be. The code isn’t separated from the primary loop into a single responsibility function where that responsibility is nicely described by the function name and its documentation (docstring) e.g.

def tokenize_message(msg: str) -> list:
    """Split the first line of a Whatsapp message into date, time, individual,
    and message string.
    """
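A minimal sketch of how that function might be filled in, together with the kind of example-driven tests that were missing. It assumes the bracketed layout described earlier, [DATE, TIME] NAME: MESSAGE. The regular expression, the `FIRST_LINE` name, and the choice to raise `ValueError` for anything else are assumptions made for illustration, not what Whatsmapper does today.

```python
import re

# Assumed layout: "[DATE, TIME] NAME: MESSAGE"
FIRST_LINE = re.compile(
    r"^\[(?P<date>[^,\]]+), (?P<time>[^\]]+)\] (?P<individual>.+?): (?P<message>.*)$"
)


def tokenize_message(msg: str) -> list:
    """Split the first line of a Whatsapp message into date, time, individual,
    and message string.
    """
    match = FIRST_LINE.match(msg)
    if match is None:
        # Continuation lines are the caller's problem in this sketch.
        raise ValueError(f"not the first line of a message: {msg!r}")
    return [match["date"], match["time"], match["individual"], match["message"]]


def test_tokenize_first_line():
    line = "[9/14/24, 13:06:47] ~ Alfred Stewart: g'day!"
    expected = ["9/14/24", "13:06:47", "~ Alfred Stewart", "g'day!"]
    assert tokenize_message(line) == expected


def test_continuation_line_is_rejected():
    try:
        tokenize_message("and another thing...")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


test_tokenize_first_line()
test_continuation_line_is_rejected()
```

The tests are written as plain functions so they run without extra dependencies; a test runner such as pytest would also collect them by their `test_` prefix.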

For the simplest of decisions, I already have a maintenance problem. I have to make a decision at some point: When do I take this code from fun proof of concept to something serious? When do I start adding tests?

What I want to say to those developing their own tooling in digital preservation is that you need to start writing the tests from the very beginning.

If I write tests it frees me and others from a number of coding prisons:

having to remember what I wrote five years ago.
having to guess whether what I wrote still works when I change something.
code that is complex and hard to follow, which makes it harder for others to jump in.
having to guess whether the work someone else does works with the legacy code.

We work on a lot of formats in digital preservation and we know from experience or anecdotally that a lot of these formats have “quirks”, e.g. written across multiple specifications or interpretations of complex specifications, or having no specification at all. We need to encode these quirks in our code and in our tests so that we can continue to maintain the incredibly useful scripts that we are writing today.

Note: formats are easy to pick on because they represent the fundamental aspects of programming, but this applies to any and all code we write, e.g. writing manifests for packages, restriction mechanisms for access servers, repository software, and so on.

As this script evolves, and I create different branches to ensure that different variations of Whatsapp export are parsed correctly (earlier versions, or those with different export formats, for example), the level of complexity in the code increases exponentially and reasoning about the decisions made in the code becomes more and more difficult over time.
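One way to keep those branches reasonable is to give each variant a name and at least one example line that pins it down. The sketch below is illustrative only: the "bracketed" pattern follows the layout described earlier, while the "dashed" layout is an assumed second variant that has not been checked against real exports, and `VARIANTS`, `match_variant`, and `EXAMPLES` are hypothetical names.

```python
import re

# Each export variant gets a name and one pattern (both hypothetical here).
VARIANTS = {
    "bracketed": re.compile(
        r"^\[(?P<stamp>[^\]]+)\] (?P<individual>.+?): (?P<message>.*)$"
    ),
    # Assumed layout for illustration: "DATE, TIME - NAME: MESSAGE"
    "dashed": re.compile(
        r"^(?P<stamp>\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}) - "
        r"(?P<individual>.+?): (?P<message>.*)$"
    ),
}


def match_variant(line: str):
    """Return (variant name, fields) for the first pattern that matches, else None."""
    for name, pattern in VARIANTS.items():
        match = pattern.match(line)
        if match:
            return name, match.groupdict()
    return None


# One example per variant: adding a new variant means adding a row here too.
EXAMPLES = [
    ("[9/14/24, 13:06:47] ~ Alfred Stewart: g'day!", "bracketed"),
    ("9/14/24, 13:06 - Alfred Stewart: g'day!", "dashed"),
]

for line, expected in EXAMPLES:
    name, fields = match_variant(line)
    assert name == expected
    assert fields["message"] == "g'day!"
```

Keeping the variants as data rather than nested conditionals means the decision about each quirk sits next to the example that justifies it.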

As scripts become software becomes systems this gets even more important. Good coverage protects us from the risks and costs of different types of error (and from the regression of fixes one day in the future). The fewer errors systems have, the greater the reputation of those systems. And the easier systems are to maintain, the cheaper they become to maintain, meaning that focus can be shifted to new features, initiatives, or tooling.

It is a hope of mine that one day we can audit our different systems for different metrics such as test coverage or cyclomatic or cognitive complexity among other measures (such as ability to perform localization) to provide an indication of the quality of the foundations of the software that we use to maintain our digital legacy.
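To give a feel for what such a metric looks like, here is a rough sketch that approximates cyclomatic complexity by counting decision points in Python source with the standard library's `ast` module. It is a simplification made for illustration (`rough_complexity` and `DECISION_NODES` are hypothetical names, and dedicated tools count more carefully), applied to the ten-line snippet from earlier.

```python
import ast

# Node types treated as decision points in this rough approximation.
DECISION_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.BoolOp
)


def rough_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + the number of decision points."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))


SNIPPET = '''
try:
    if line.startswith("["):
        individual = line.split("] ")[1].split(": ", 1)[0]
        chat_text = "".join(line.split(individual)[1:])
        if individual not in individuals:
            individuals.append(individual)
    else:
        chat_text = line
except IndexError:
    chat_text = line
'''

# Two if statements and one except handler, plus one for the straight path.
assert rough_complexity(SNIPPET) == 4
```

Even ten lines carry four paths to reason about, which is roughly the number of tests needed to cover each of them once.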

Code doesn’t represent the memory of its maintainers; it’s just the scratch pad. Memory requires reinforcement and the confidence that it’s accurate. Tests are the reinforcement and the confidence; tests are the memory of the maintainers. Without tests, code is just ephemeral.

