ross spencer :: exponentialdecay.digipres :: blog - Digital preservation analyst, researcher, and software developer

Demystify-lite is upgraded and demystify 2.0.0 is finally here!

Demystify-lite and Demystify 2.0.0 have been released featuring the denylist for the first time, plus some bonus features! With thanks…

PRONOM release statistics

My contribution to PRONOM research week 2023 (held in November 2023) is a PRONOM summary website and Application Programming Interface…

Photograph of a caravan in Gohlis Leipzig (CC-BY)

Moving to Leipzig and the year ahead

Late 2023 ended with a move to Leipzig. The move was driven by a switch back to full-time contracting in…

A rough guide to digital preservation metadata

A rough guide to metadata I published on Twitter once upon a time and again in Content-disposition: Archival – Repatriating Dates…

Shattering the eyeglass: Using Kaitai Structs to dissect the eyeglass’ contents

In my 2012 post, Genesis of a File Format, I created a new file format – the Eyeglass file format. The format provides a mechanism to persist information about a patient’s eye health following a checkup at an optician’s. Today, in 2023, we can use it to learn how Kaitai Structs can help in understanding file formats.

Given the disclaimer that I am not actually an optician and that the format is purely illustrative, let’s look at the eyeglass again below.

Stop, Look, Listen, retro game style advertising for safety at a Houston Bus Stop

Linting as understanding

I have been working on a Python template repository as part of my day-job at Orcfax.

It is based on the popular pypa sample project and adds important tooling that supports the quality assurance of projects that many developers are expected to engage with.

In my template repository I add editor defaults and linting, and prepare the repository for unit tests and then deployment.

I have migrated a copy of the template I created for Orcfax to a new file format organisation I have created to capture work I am doing around tools such as ffdev.info (the PRONOM signature development utility).

The new template repository can be found here: ffdev-info/template.py.

I want to talk about how this tooling can help you understand legacy or new code that you are going to be working with, and how linting can be useful for learning and understanding.

Moonshine: a small part of the file format analyst’s toolkit

Today I released Moonshine 2.0.0. Moonshine is a file format discovery tool I developed a few years ago. A…

Using a custom Wikibase with Siegfried

In March I was invited by the LD4 Wikidata Affinity Group to talk about my experiences using Wikibase with Siegfried, the file format identification tool. I don’t think I’ve talked about that work on here before but you can find links to my iPRES talk on my ORCID page.

Let’s look at the abstract and the content of the talk below.

What is the checksum of a directory? Introducing sumfolder1

In Fractal in detail: What information is in a file-format identification report? I describe the different ways of dissecting the information in a file-format identification report.

A file-format identification report is a data-rich artifact created during the processing of digital collections.

I had the idea of using this type of report to attach a checksum to an archival collection (files and directories) as a whole. This is done using methods akin to a Merkle tree, similar to those in source control systems such as Git, and Web3 blockchain projects like Bitcoin.

This project is called sumfolder1.
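To make the Merkle tree idea concrete, here is a minimal sketch in Python. It is not sumfolder1's actual algorithm: the function names, the choice of SHA-256, and the decision to leave file and directory names out of the hash are all assumptions made for illustration. Each file is hashed from its bytes, and each directory is hashed from the sorted digests of its children, so a change anywhere in the tree changes the digest at the root.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes (hypothetical helper)."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def directory_digest(path: Path) -> str:
    """Return a digest for a directory built from its children's digests.

    Assumption: names are ignored, so the result depends only on file
    contents and on how those files are grouped into directories.
    """
    child_digests = []
    for child in path.iterdir():
        if child.is_dir():
            child_digests.append(directory_digest(child))
        elif child.is_file():
            child_digests.append(file_digest(child))
    hasher = hashlib.sha256()
    # Sort the child digests so the listing order of the file system
    # does not affect the result.
    for digest in sorted(child_digests):
        hasher.update(digest.encode("ascii"))
    return hasher.hexdigest()
```

Calling `directory_digest(Path("my-collection"))` on a hypothetical collection folder returns one hex string for the whole tree. Because names are left out in this sketch, renaming a file does not change the result; a real tool may well choose differently.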

Published: PREMIS Events Through an Event-sourced Lens

Not long after my first Code4Lib article I had another idea to run by the team there, and elected to see if my paper looking at events in the PREMIS metadata standard would be of interest to them and the readership.

My paper PREMIS Events Through an Event-sourced Lens was published in April this year.

I take a look at the content of this paper below and plug a few gaps that I have been thinking about since its publication.
