A screenshot of a file format (fmt/983) in 0xffae. The title 0xffae sits over the top of the original image.

File formats as Emoji: 0xffae

tldr: https://emoji.exponentialdecay.co.uk

File Formats As Emoji (0xFFAE or 0xffae) might be my most random file format hack yet. Indeed, it is a random page generator! But it generates random pages of file formats represented as Emoji.

The idea came in 2016 with radare releasing a new version that supported an emoji hexdump. I wondered whether I could do something fun combining file formats and the radare output to create a web-page.

Along came a spare moment one weekend, some pyscript, and a bit of sqlite, et voilà. File Formats as Emoji (0xFFAE) was made a reality.
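To give a feel for the idea, here is a minimal sketch of an emoji hexdump in Python. It is not radare's output format, nor the code behind 0xFFAE: the `EMOJI_BASE` mapping (byte value added to an arbitrary run of 256 Unicode code points) and the function names are assumptions made purely for illustration.

```python
# Hypothetical mapping: byte N is shown as the code point EMOJI_BASE + N.
# The base is an arbitrary choice for this sketch, not radare's table.
EMOJI_BASE = 0x1F300


def byte_to_emoji(value: int) -> str:
    """Return the single glyph standing in for one byte value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte (0-255)")
    return chr(EMOJI_BASE + value)


def emoji_hexdump(data: bytes, width: int = 16) -> str:
    """Render data as rows of `width` glyphs, each prefixed by a hex offset."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        glyphs = " ".join(byte_to_emoji(b) for b in chunk)
        lines.append(f"{offset:08x}  {glyphs}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The first bytes of a PDF header, as an example byte sequence.
    print(emoji_hexdump(b"%PDF-1.7\n"))
```

How the glyphs render depends on the fonts available; a handful of code points in this range are modifiers rather than pictures, so a real mapping would probably be curated by hand.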

Image of the foundations of a new building being erected in Wellington New Zealand, circa 2017.

File format building blocks: primitives in digital preservation

A primitive in software development can be described as:

a fundamental data type or code that can be used to build more complex software programs or interfaces.

– via https://www.capterra.com/glossary/primitive/ (also Wiki: language primitives)

Like bricks and mortar in the building industry, or oil and acrylic for a painter, a primitive helps a software developer to create increasingly more complex software, from your shell scripts, to entire digital preservation systems.

Primitives also help us to create file formats. As we’ve seen with the Eyeglass example I have presented previously, the file format is, at its most fundamental level, a representation of a data structure as a binary stream that can be written from the data structure onto disk, and likewise read from disk back into a data structure in code.

For the file format developer we have at our disposal all of the primitives that the software developer has, and like them, we also have “file formats” (as we tend to understand them in digital preservation terms) that serve as our primitives as well. 
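The round trip between a data structure and a binary stream can be sketched in a few lines of Python. This is not the Eyeglass format: the `TOY1` magic number, the field layout, and the `ToyHeader` name are all hypothetical, chosen only to show primitives (a four-byte string and three little-endian integers) being packed to bytes and read back.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 4-byte magic, uint16 version, uint32 width, uint32 height.
# "<" selects little-endian with no padding, so the header is always 14 bytes.
HEADER = struct.Struct("<4sHII")
MAGIC = b"TOY1"


@dataclass
class ToyHeader:
    version: int
    width: int
    height: int

    def to_bytes(self) -> bytes:
        """Serialise the structure to the byte stream that would go on disk."""
        return HEADER.pack(MAGIC, self.version, self.width, self.height)

    @classmethod
    def from_bytes(cls, raw: bytes) -> "ToyHeader":
        """Rebuild the structure from the first 14 bytes of a stream."""
        magic, version, width, height = HEADER.unpack(raw[:HEADER.size])
        if magic != MAGIC:
            raise ValueError("not a TOY1 stream")
        return cls(version, width, height)


if __name__ == "__main__":
    header = ToyHeader(version=1, width=320, height=240)
    raw = header.to_bytes()
    print(raw.hex(" "))
    print(ToyHeader.from_bytes(raw))
```

The magic number doubles as the kind of byte signature that identification tools look for, which is one reason so many formats begin with one.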

Animated GIF of various corrupted frames from Johan's Y2K cover.

The sensitivity index: Corrupting Y2K

In December I asked “What will you bitflip today?” Not long after, Johan’s (@bitsgalore) Digital Dark Age Crew released its long-lost hidden single Y2K. Well, I couldn’t resist corrupting it.

Image showing a hugely glitched file in Audacity. The waveforms should largely be the same in both stereo channels but they are not.

Fixity is an interesting property enabled by digital technologies. Checksums allow us to demonstrate mathematically that a file has not been changed. An often cited definition of fixity is:

Fixity, in the preservation sense, means the assurance that a digital file has remained unchanged, i.e. fixed — Bailey (2014)

It’s very much linked to the concept of integrity, which UNESCO defines as:

The state of being whole, uncorrupted and free of unauthorized and undocumented changes.

Integrity is massively important at this time in history. It gives us the guarantees we need that digital objects we work with aren’t harboring their own sinister secrets in the form of malware and other potentially damaging payloads.

These values are contingent on bit-level preservation, and the field of digital preservation largely assumes this: that we will be able to look after our content without losing information. As feasible as this may be these days, what happens if we lose some information? Where does authenticity come into play?

Through corrupting Y2K, I took time to reflect on integrity versus authenticity, as well as create some interesting glitched outputs. I also uncovered what may be the first audio that reveals what the Millennium Bug itself may have sounded like! Keen to hear it? Read on to find out more.
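As a small illustration of the fixity idea above, the following Python sketch records a SHA-256 digest for some bytes and later checks the bytes against it. The function names `sha256_hex` and `is_fixed` are hypothetical, and the three-byte payload stands in for a real file; it is a sketch of the general technique, not of any particular preservation system.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_fixed(data: bytes, recorded_digest: str) -> bool:
    """True if data still produces the digest recorded earlier."""
    return sha256_hex(data) == recorded_digest


if __name__ == "__main__":
    original = b"Y2K"                  # stand-in for a file's contents
    recorded = sha256_hex(original)    # digest stored at ingest time

    # Flip the lowest bit of the first byte to simulate corruption.
    corrupted = bytes([original[0] ^ 0x01]) + original[1:]

    print("original fixed: ", is_fixed(original, recorded))
    print("corrupted fixed:", is_fixed(corrupted, recorded))
```

A check like this says that something changed, but nothing about how much or where, which is part of why integrity and authenticity are worth considering separately.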

A poem by Kay Ryan - An Elephant in the Room The room is almost all elephant. Almost none of it isn't. Pretty much solid elephant. So there's no room to talk about it.

Interviewing in digital preservation: a duty of care and community

Sometime in 2024, I received zero feedback for a job interview—one of at least five interviews without any feedback in the last eight years.

The thing is, digital preservation is very niche. Those five roles probably represent a good number of institutions actually hiring specialists and likely represent some of the best chances for jobs in the future.

Not getting a role is part and parcel of interviewing, but in not providing feedback, a didactic moment was lost—a moment of community connection and outreach—and simply an act of care.

Furthermore, loops are not closed, processes feel incomplete, and of course, you will likely know the person who gets the role ahead of you. Trying to measure yourself against that individual will likely be in the back of your mind when you next meet or work with them, because the recruiter has left you with unanswered questions.

And before it is suggested that this is just a ‘you’ thing—let’s say conservatively, five people interviewed for each of the five positions I applied for. Assuming everyone is treated equally, that’s 20 people missing out on something critical to improving their skill set, interview technique, or helping them find more suitable jobs in the future. I guarantee, you ALL deserve feedback. It is also 20 people that each recruiter has missed an active opportunity to build a stronger bond with, who will sing the praises of the process and the organization; this is important.

Cat's Meow from the Offner Dynograph EEG

What will you bitflip today?

I want to let you into a secret: I enjoy corruption. Corrupting digital objects leads to undefined behavior (C++’s definition is fun). And flipping bits in objects can tell us something both about the fragility, and robustness of our digital files and the applications that work with them.

I had a pull-request for bitflip accepted the other day. Bitflip is by Antoine Grondin and is a simple utility for flipping bits in digital files. I wrote in my COPTR entry for it that it reminds me of shotGun by Manfred Thaller. The utility is exceptionally easy to use (and, being written in Golang, easy to update and maintain) and has some nice features for flipping individual bits or a uniform percentage of bits across a digital file.

My pull-request was a simple one updating Goreleaser and its GitHub workflow to provide binaries for Windows and FreeBSD. I only needed to use Windows for a short amount of time thankfully, but it’s an environment I believe is prevalent for a lot of digital preservationists in corporate IT environments.

Bitflip is a useful utility to improve your testing of digital preservation systems, or simply for outreach, but let’s have a quick look at it in action.
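For readers who want to see the underlying operation, here is a minimal Python sketch of flipping a single bit and of flipping a fraction of bits chosen uniformly at random. It is not bitflip's implementation or its command-line interface; the function names, the bit-numbering convention (bit 0 is the most significant bit of the first byte), and the `seed` parameter are assumptions made for this example.

```python
import random
from typing import Optional


def _toggle(buffer: bytearray, bit_index: int) -> None:
    """Invert one bit in place; bit 0 is the MSB of the first byte."""
    byte_index, bit_in_byte = divmod(bit_index, 8)
    buffer[byte_index] ^= 0x80 >> bit_in_byte


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    if not 0 <= bit_index < len(data) * 8:
        raise IndexError("bit index out of range")
    out = bytearray(data)
    _toggle(out, bit_index)
    return bytes(out)


def flip_fraction(data: bytes, fraction: float,
                  seed: Optional[int] = None) -> bytes:
    """Return a copy with round(total_bits * fraction) distinct bits inverted."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    total_bits = len(data) * 8
    count = round(total_bits * fraction)
    rng = random.Random(seed)  # a seed makes the corruption repeatable
    out = bytearray(data)
    for bit_index in rng.sample(range(total_bits), count):
        _toggle(out, bit_index)
    return bytes(out)


if __name__ == "__main__":
    print(flip_bit(b"\x00\x00", 0).hex())
    print(flip_fraction(bytes(8), 0.25, seed=1).hex())
```

Sampling without replacement guarantees that no bit is flipped twice and silently restored, so the requested fraction is the fraction that actually differs.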

Tyler's Halloween Matryoshka Dolls represent the internal complexities of container file formats. The dolls here have formats attached to them representing different ways they might be nested, with ZIP and OLE2 being the primary containers that can be handled in DROID and Siegfried at present.

A year in file formats 2024

A great write up from Francesca at TNA about the past year for PRONOM via Georgia at the OPF.

It’s great to see the continuing work including vital translation of guides into other languages. Francesca includes a couple of shout outs to some pieces I have contributed in my spare time this year; including a collaborative workshop with Francesca, David, and Tyler at iPRES2024.

What we could do with a second life…

Cleaning up some posts today, for clarity or to improve their appearance in ActivityPub instances, I didn’t want to lose this quote, introduced to us at Archives New Zealand during a visit from Verne Harris back in 2017. It represents the need for a second life to apply all of the lessons learned in this one – in the GLAM sector, everything we learn getting up to speed, to learn how to work within our institutional boundaries, to align with corporate strategy, or just to hustle to have our work recognized and valued.

My colleague Andrea references the quote a lot and I am often reaching to recall it.

We work in the dark – we do what we can – we give what we have. Our doubt is our passion, and our passion is our task. The rest is the madness of art. – Henry James, The Middle Years

C3PO narrates the story of Star Wars to the Ewoks in Return of the Jedi

PRONOM’s dustiest records

Tyler’s recent blog post for the PRONOM Hack-a-thon Week 2024 (my previous for this week) brought up an interesting point about two of PRONOM’s oldest outline records, Real Video Clip (fmt/204) and Real Video (x-fmt/277). How did they end up in PRONOM?

NB. because of the complexity of this post, it may be easier to read in original blog form, than on Mastodon here: https://exponentialdecay.co.uk/blog/pronoms-dustiest-records/

Tyler suggests:

I assume PRONOM originally added these based on MIME types available.

I thought I knew the answer, but it prompted a forensic look at the records to see if what I thought I knew aligned with reality!
