Demystify-lite is upgraded and demystify 2.0.0 is finally here!
ross spencer :: exponentialdecay.digipres :: blog
Digital preservation analyst, researcher, and software developer

In my 2012 post, Genesis of a File Format, I created a new file format – the Eyeglass file format. The format provides a mechanism to persist information about a patient’s eye health following a checkup at an optician’s. Today, in 2023, we can use it to learn how Kaitai Struct can help us understand file formats.
Given the disclaimer that I am not actually an optician and that the format is purely illustrative, let’s look at the Eyeglass format again below.
In March I was invited by the LD4 Wikidata Affinity Group to talk about my experiences using Wikibase with Siegfried, the file format identification tool. I don’t think I’ve talked about that work on here before but you can find links to my iPRES talk on my ORCID page.
Let’s look at the abstract and the content of the talk below.
In “Fractal in Detail: What Information Is in a File Format Identification Report?” I describe the different ways of dissecting the information in a file-format identification report.
A file-format identification report is a data-rich artifact created during the processing of digital collections.
I had the idea of using this type of report to attach a checksum to an archival collection (files and directories) as a whole. This is done using methods akin to a Merkle tree, similar to those in source control systems such as Git and blockchain projects like Bitcoin.
This project is called sumfolder1.
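To make the Merkle-tree idea concrete, here is a minimal sketch in Python. It is not sumfolder1's actual algorithm: it walks the file system directly rather than reading a file-format identification report, and the function names (`file_digest`, `folder_digest`), the choice of SHA-256, and the decision to sort child digests are all my assumptions for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folder_digest(path: Path) -> str:
    """Return a digest for a directory, derived from its children's digests.

    Assumption for this sketch: child digests are sorted before hashing,
    so the result does not depend on listing order or on file names.
    """
    child_digests = []
    for child in path.iterdir():
        if child.is_dir():
            # A directory's digest is computed recursively from its contents.
            child_digests.append(folder_digest(child))
        elif child.is_file():
            child_digests.append(file_digest(child))
    digest = hashlib.sha256()
    for value in sorted(child_digests):
        digest.update(value.encode("ascii"))
    return digest.hexdigest()
```

Calling `folder_digest(Path("my-collection"))` (a hypothetical directory) returns a single hex string for the whole tree. Changing any file's content changes the digest of every directory above it, which is the property that makes a collection-level checksum useful.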
Not long after my first Code4Lib article I had another idea to run by the team there, and elected to see if my paper looking at events in the PREMIS metadata standard would be of interest to them and the readership.
My paper, PREMIS Events Through an Event-sourced Lens, was published in April this year.
I take a look at the content of this paper below and plug a few gaps that I have been thinking about since its publication.
In early 2022, I was finally able to get around to writing a paper that I had been thinking about for the better part of a decade. The paper, “Fractal in Detail: What Information Is in a File Format Identification Report?”, was published in Issue 53 of the Code4Lib Journal.
The paper takes a deep dive into the fractal contents of file format identification reports exported from tools like Siegfried and DROID.
Let’s take a brief look at the article and its contents below.
It was back in May, yes, way back when, that Jordan Hale of the Information Maintainers group put the following to me:
I write today to ask if you’d be interested in being our special guest on the next Information Maintainers call … we thought your perspective on working within and maintaining decentralized, small-group systems and development infrastructures would be really rad to hear about. What do you think?
I am a big fan of the Information Maintainers and so I was pretty stoked to be asked. Of course, I jumped at the chance and wrote about “Something something twenty years open source…”