Winding down at Orcfax: a retrospective
ross spencer :: exponentialdecay.digipres :: blog
Digital preservation analyst, researcher, and software developer

Posts on the topic of digital preservation, digital preservation tooling, and the digital preservation community. Digipres is an often used shorthand tag for “digital preservation” on social media.
![]()
The Serpentine is one of the world’s most renowned art galleries. Their exhibitions have featured artists as varied as Gerhard Richter, Damien Hirst, and Marina Abramović. They don’t hold a permanent collection; instead, they provide a space for temporary collections and an annual pavilion, designed by luminaries such as Zaha Hadid, Frank Gehry, and Ai Weiwei.
Given a recent job posting, it looks like they want to maintain their memory better and are branching out into digital preservation.
Here’s the kicker: its salary band is GBP 35,000 to GBP 38,000. So it must be an entry-level position, especially in London, right?
Well, let’s see what they want you to do for that price tag…
![]()
Contributing back to the commons in digital preservation hasn’t been for everyone.
We know the famous XKCD that touches on the underappreciated work of maintainers in obscurity. When you, or your institutions, or services are using free and open source software, or other information and data in the commons, and you’re not contributing back, you’re perpetuating this, and what’s more, there’s a virtuous cycle that we’re missing out on.
I read something the other day and it felt like a red flag.
![]()
I introduced bsdiff in a blog post in 2014. bsdiff compares two files, e.g. broken_file_a and corrected_file_b, and creates a patch that can be applied to broken_file_a to generate a byte-for-byte match for corrected_file_b.
On the face of it, in an archive, we probably only care about corrected_file_b, so why would we care about a technology that patches a broken file?
In all of the use-cases we can imagine, the primary reasons are cost savings and removing redundancy in the storage or transmission of digital information. In one very special case, we can record the difference between broken_file_a and corrected_file_b and give users a totally objective method of recreating corrected_file_b from broken_file_a, providing 100% verifiable proof of the migration pathway taken between the two files.
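To make the patch idea concrete, here is a minimal sketch in Python. It does not use bsdiff or its patch format; it swaps in the standard library's difflib.SequenceMatcher as a much simpler stand-in, and the function names and sample bytes are my own assumptions for illustration. What it shows is the principle: a patch plus broken_file_a is enough to rebuild corrected_file_b, and a checksum can confirm the result is byte-for-byte identical.

```python
import hashlib
from difflib import SequenceMatcher


def make_patch(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus literal bytes (not bsdiff's format)."""
    patch = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            patch.append(("copy", i1, i2))      # reuse bytes already in old
        elif tag in ("replace", "insert"):
            patch.append(("data", new[j1:j2]))  # bytes that only exist in new
        # "delete" needs no entry: those bytes are simply never copied
    return patch


def apply_patch(old: bytes, patch: list) -> bytes:
    """Rebuild the new file from the old file and the patch."""
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


# Hypothetical stand-ins for the two files discussed above.
broken_file_a = b"header bytes ... a brokn trailer"
corrected_file_b = b"header bytes ... a corrected trailer"

patch = make_patch(broken_file_a, corrected_file_b)
rebuilt = apply_patch(broken_file_a, patch)

# Compare checksums to confirm the rebuilt file matches exactly.
same = hashlib.sha256(rebuilt).hexdigest() == hashlib.sha256(corrected_file_b).hexdigest()
print("byte-for-byte match:", same)
```

The patch itself becomes the record of the migration: anyone holding broken_file_a and the patch can repeat the step and check the checksum for themselves.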
![]()
We might not have a second life, but what if I told you there was a second internet? Not the deep web, but another web that we engage with nearly every day?
Think about it, that QR code you scanned for more information? That payment link you followed on your electricity bill? The website you’re told to visit at the end of a television ad?
The antipodes of the internet are these terminal endpoints: objects, material or otherwise, that represent the end of the freely navigable web. The QR code on a concert poster is the web printed onto the physical world. There is every chance it will be scanned and followed by someone on a mobile device, but it’s a transient object, something that will exist for a short amount of time and then fade into the palimpsest of the poster board or wall it was pasted on until it eventually disappears.
This is part of the materiality of the internet that has long fascinated me. Perhaps it comes from being a student of material culture, but if we look around, we see the Internet everywhere!
![]()
tldr: https://emoji.exponentialdecay.co.uk
File Formats As Emoji (0xFFAE or 0xffae) might be my most random file format hack yet. Indeed, it is a random page generator! But it generates random pages of file formats represented as Emoji.
The idea came in 2016 with radare releasing a new version that supported an emoji hexdump. I wondered whether I could do something fun combining file formats and the radare output to create a web-page.
Along came a spare moment one weekend, some pyscript, and a bit of sqlite, et voilà. File Formats as Emoji (0xFFAE) was made a reality.
![]()
A primitive in software development can be described as:
a fundamental data type or code that can be used to build more complex software programs or interfaces.
– via https://www.capterra.com/glossary/primitive/ (also Wiki: language primitives)
Like bricks and mortar in the building industry, or oil and acrylic for a painter, a primitive helps a software developer to create increasingly more complex software, from your shell scripts, to entire digital preservation systems.
Primitives also help us to create file formats. As we’ve seen with the Eyeglass example I have presented previously, a file format is, at its most fundamental level, a representation of a data structure as a binary stream: one that can be written out of the data structure onto disk and, likewise, read from disk back into a data structure in code.
For the file format developer we have at our disposal all of the primitives that the software developer has, and like them, we also have “file formats” (as we tend to understand them in digital preservation terms) that serve as our primitives as well.
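As a rough illustration of a data structure becoming a binary stream and back again, here is a small Python sketch using the standard library's struct module. The layout, the DEMO magic number, and the Header fields are all hypothetical and made up for this example; this is not the Eyeglass format.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 4-byte magic, unsigned 16-bit version,
# unsigned 32-bit width and height, all little-endian with no padding.
LAYOUT = struct.Struct("<4sHII")
MAGIC = b"DEMO"


@dataclass
class Header:
    version: int
    width: int
    height: int

    def to_bytes(self) -> bytes:
        """Serialise the data structure into its on-disk byte layout."""
        return LAYOUT.pack(MAGIC, self.version, self.width, self.height)

    @classmethod
    def from_bytes(cls, raw: bytes) -> "Header":
        """Read the byte layout back into a data structure."""
        magic, version, width, height = LAYOUT.unpack(raw[:LAYOUT.size])
        if magic != MAGIC:
            raise ValueError("not a DEMO file")
        return cls(version, width, height)


header = Header(version=1, width=640, height=480)
raw = header.to_bytes()

print(raw.hex(" "))            # the bytes as they would sit on disk
print(Header.from_bytes(raw))  # and back into a structure in code
```

The integers and the byte string are the primitives here; the header built from them could in turn become a building block inside a larger format.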
![]()
In December I asked “What will you bitflip today?” Not long after, Johan’s (@bitsgalore) Digital Dark Age Crew released its long lost hidden single Y2K. Well, I couldn’t resist corrupting it.
Fixity is an interesting property enabled by digital technologies. Checksums allow us to demonstrate mathematically that a file has not been changed. An often cited definition of fixity is:
Fixity, in the preservation sense, means the assurance that a digital file has remained unchanged, i.e. fixed — Bailey (2014)
It’s very much linked to the concept of integrity, which a UNESCO definition describes as:
The state of being whole, uncorrupted and free of unauthorized and undocumented changes.
Integrity is massively important at this time in history. It gives us the guarantees we need that digital objects we work with aren’t harboring their own sinister secrets in the form of malware and other potentially damaging payloads.
These values are contingent on bit-level preservation, and the field of digital preservation largely assumes that we will be able to look after our content without losing information. As feasible as this may be these days, what happens if we do lose some information? Where does authenticity come into play?
Through corrupting Y2K, I took time to reflect on integrity versus authenticity, as well as create some interesting glitched outputs. I also uncovered what may be the first audio that reveals what the Millennium Bug itself may have sounded like! Keen to hear it? Read on to find out more.
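For anyone who wants to try a bitflip at home, here is a minimal sketch of the fixity idea in Python. The sample bytes are a stand-in rather than the actual Y2K audio, the helper names are my own, and SHA-256 is simply one reasonable choice of checksum. It records a checksum, flips a single bit, and shows that the corrupted copy no longer passes the check.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest used as the fixity value."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# Stand-in content; in practice this would be read from a file.
original = b"stand-in bytes for an audio file"
recorded = checksum(original)  # the value stored when the file was received

corrupted = flip_bit(original, 42)

print("original passes fixity: ", checksum(original) == recorded)
print("corrupted passes fixity:", checksum(corrupted) == recorded)
```

The checksum tells us that something changed, but not what changed or whether the result is still meaningful, which is where the integrity versus authenticity question starts.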
![]()
Tyler recently posted this quote:
“History unprocessed is opportunity unrealized”
It reminds me of an unrealized article I wasn’t able to get written and into the wild, but it’s an important thought I would like to share nonetheless.
Proposed for James Lowry’s ACARM Symposium in 2015, I wanted to discuss what happens when government is unable to adequately fund day-to-day effort, and research and development, in the archive sector, leading to inefficient and potentially ineffective processing pipelines for records of archival value accessioned from government agencies and commissions.
It was just an abstract, but maybe folks have thoughts about this? Have we moved on since the early to mid 2010s? What modern metrics do we have available to us today to see the progress? What do the advent of the new US administration and increasing worldwide authoritarianism mean for issues like this?
![]()
Sometime in 2024, I received zero feedback for a job interview—one of at least five interviews without any feedback in the last eight years.
The thing is, digital preservation is very niche. Those five roles probably represent a good number of institutions actually hiring specialists and likely represent some of the best chances for jobs in the future.
Not getting a role is part and parcel of interviewing, but in not providing feedback, a didactic moment was lost—a moment of community connection and outreach—and simply an act of care.
Furthermore, loops are not closed, processes feel incomplete, and, of course, you will likely know the person who gets the role ahead of you. Measuring yourself against that individual will likely be in the back of your mind when you next meet or work with them, because the recruiter has left you with unanswered questions.
And before it is suggested that this is just a ‘you’ thing, let’s say, conservatively, five people interviewed for each of the five positions I applied for. Assuming one person was hired per role and everyone else was treated equally, that’s four unsuccessful candidates per role, or 20 people missing out on something critical to improving their skill set, interview technique, or helping them find more suitable jobs in the future. I guarantee, you ALL deserve feedback. It is also 20 people with whom recruiters have missed an active opportunity to build a stronger bond, people who would sing the praises of the process and the organization; this is important.