Animated GIF of various corrupted frames from Johan's Y2K cover.

The sensitivity index: Corrupting Y2K

In December I asked “What will you bitflip today?” Not long after, Johan’s (@bitsgalore) Digital Dark Age Crew released its long-lost hidden single Y2K. Well, I couldn’t resist corrupting it.

Image showing a hugely glitched file in Audacity. The waveforms should largely be the same in both stereo channels but they are not.

Fixity is an interesting property enabled by digital technologies. Checksums allow us to demonstrate mathematically that a file has not been changed. An often cited definition of fixity is:

Fixity, in the preservation sense, means the assurance that a digital file has remained unchanged, i.e. fixed — Bailey (2014)
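As a minimal sketch of what a fixity check can look like in practice, the Python below computes a SHA-256 checksum for a file and compares it with a previously recorded value. The function names (`sha256_of`, `is_unchanged`) and the choice of SHA-256 are assumptions made for illustration; any checksum algorithm your repository already records would work the same way.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path.

    The file is read in chunks so large objects do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Compare the file's current checksum with one recorded earlier."""
    return sha256_of(path) == recorded_checksum.lower()
```

A single flipped bit anywhere in the file changes the digest, which is why a matching checksum is treated as evidence that the file has remained fixed.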

Fixity is very much linked to the concept of integrity, which UNESCO defines as:

The state of being whole, uncorrupted and free of unauthorized and undocumented changes.

Integrity is massively important at this time in history. It gives us the guarantees we need that digital objects we work with aren’t harboring their own sinister secrets in the form of malware and other potentially damaging payloads.

These values are contingent on bit-level preservation. The field of digital preservation largely assumes that we will be able to look after our content without losing information. As feasible as this may be these days, what happens if we do lose some information? Where does authenticity come into play?

Through corrupting Y2K, I took time to reflect on integrity versus authenticity, as well as create some interesting glitched outputs. I also uncovered what may be the first audio that reveals what the Millennium Bug itself may have sounded like! Keen to hear it? Read on to find out more.

"Bei der Buche", a landscape architectural installation by landscape architect and photographer Karina Raeck. Created in 1993 in the Wartberg area north-east of Stuttgart.

wikidata + mediawiki = wikidata + provenance == wikiprov

Today I want to showcase a Wikidata proof of concept that I developed as part of my work integrating Siegfried and Wikidata.

That work is wikiprov, a utility that augments Wikidata results in JSON with the Wikidata revision history.

For siegfried it means that we can showcase the source of the results being returned by an identification without having to go directly back to Wikidata. This might mean more exposure for individuals contributing to Wikidata. We also provide access to a standard permalink where records contributing to a format identification are fixed at their last edit. Because Wikidata is more mutable than a resource like PRONOM, this gives us the best chance of understanding differences in results if we are comparing siegfried+Wikidata results side-by-side.
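To give a rough idea of the two pieces involved, here is a small Python sketch that builds a MediaWiki API URL asking for an item's recent revisions, and a permalink that pins an item to one revision. This is not wikiprov's own code. The helper names and the example identifiers are assumptions for illustration, and the sketch only constructs URLs rather than calling the network.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
WIKIDATA_INDEX = "https://www.wikidata.org/w/index.php"


def revisions_query_url(qid, limit=5):
    """Build a MediaWiki API URL requesting the latest revisions of an item.

    qid is a Wikidata identifier such as "Q42" (used here only as an example).
    """
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": qid,
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": limit,
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def permalink(qid, revision_id):
    """Build a permalink that fixes an item at a specific revision."""
    return f"{WIKIDATA_INDEX}?{urlencode({'title': qid, 'oldid': revision_id})}"
```

Recording the revision identifier alongside an identification result is what makes it possible to return later and see the record exactly as it was when the result was produced.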

I am interested to hear your thoughts on the results of the work. Let’s go into more detail below.

René Magritte's The Lovers, Paris 1928 (Photographed at MoMA, NYC in 2017)

Unrealized ideas: Unintentional Secrecy in the Era of Openness

Tyler recently posted this quote:

“History unprocessed is opportunity unrealized”

It reminds me of an unrealized article I wasn’t able to get written and into the wild, but it’s an important thought I would like to share nonetheless.

Proposed for James Lowry’s ACARM Symposium in 2015, the article would have discussed what happens when government is unable to adequately fund day-to-day effort, and research and development, in the archive sector. The result is inefficient and potentially ineffective processing pipelines for records of archival value accessioned from government agencies and commissions.

It was just an abstract, but maybe folks have thoughts about this? Have we moved on since the early to mid 2010s? What modern metrics do we have available to us today to see the progress? What do the advent of the new US administration and increasing worldwide authoritarianism mean for issues like this?

Cat's Meow from the Offner Dynograph EEG

What will you bitflip today?

I want to let you into a secret: I enjoy corruption. Corrupting digital objects leads to undefined behavior (C++’s definition is fun). And flipping bits in objects can tell us something about both the fragility and the robustness of our digital files and the applications that work with them.

I had a pull-request for bitflip accepted the other day. Bitflip is by Antoine Grondin and is a simple utility for flipping bits in digital files. I wrote in my COPTR entry for it that it reminds me of shotGun by Manfred Thaller. The utility is exceptionally easy to use (and of course, being written in Golang, to update and maintain) and has some nice features for flipping individual bits or a uniform percentage of bits across a digital file.
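For anyone curious what those two modes amount to, here is a minimal Python sketch of the idea. It is not Bitflip's implementation or its command-line interface. The function names, the most-significant-bit-first numbering, and the seeded random selection are all assumptions made for illustration.

```python
import random


def flip_bit(data, bit_index):
    """Return a copy of data with one bit inverted.

    Bits are numbered from 0, most significant bit of the first byte first
    (an assumption of this sketch).
    """
    byte_index, offset = divmod(bit_index, 8)
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 0x80 >> offset
    return bytes(corrupted)


def flip_percentage(data, percent, seed=None):
    """Return a copy of data with roughly percent% of its bits inverted.

    Distinct bit positions are chosen uniformly at random, so no bit is
    flipped twice. Passing a seed makes the corruption repeatable.
    """
    rng = random.Random(seed)
    total_bits = len(data) * 8
    count = round(total_bits * percent / 100)
    corrupted = bytearray(data)
    for bit_index in rng.sample(range(total_bits), count):
        byte_index, offset = divmod(bit_index, 8)
        corrupted[byte_index] ^= 0x80 >> offset
    return bytes(corrupted)
```

Flipping the same bit twice restores the original bytes, which makes single-bit corruption a convenient, reversible way to test how a tool or a checksum workflow reacts.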

My pull-request was a simple one updating Goreleaser and its GitHub workflow to provide binaries for Windows and FreeBSD. I only needed to use Windows for a short amount of time thankfully, but it’s an environment I believe is prevalent for a lot of digital preservationists in corporate IT environments.

Bitflip is a useful utility to improve your testing of digital preservation systems, or simply for outreach, but let’s have a quick look at it in action.

Shattering the eyeglass: Using Kaitai Structs to dissect the eyeglass’ contents

In my post from 2012: Genesis of a File Format, I created a new file format – the Eyeglass file format. The format provides a mechanism to persist information about a patient’s eye health following a checkup at an opticians. Today in 2023 we can use the format to understand how to make use of Kaitai Structs for understanding file formats.

Given the disclaimer that I am not actually an optician and that the format is purely illustrative, let’s look at the eyeglass again below.

Stop, Look, Listen, retro game style advertising for safety at a Houston Bus Stop

Linting as understanding

I have been working on a Python template repository as part of my day-job at Orcfax.

It is based on the popular pypa sample project and adds important tooling that supports the quality assurance of projects that many developers are expected to engage with.

In my template repository I add editor defaults, linting, and prepare the repository for unit tests, and then deployment.

I have migrated a copy of the template I created for Orcfax to a new file format organisation I have created to capture work I am doing around tools such as ffdev.info (the PRONOM signature development utility).

The new template repository can be found here: ffdev-info/template.py.

I want to talk about how this tooling can be used as a way of understanding legacy or new code that you are going to be looking at, and how linting can be useful for learning and understanding.
