Tyler's Halloween Matryoshka Dolls represent the internal complexities of container file formats. The dolls here have formats attached to them representing different ways they might be nested, with ZIP and OLE2 being the primary containers that can be handled in DROID and Siegfried at present.

A year in file formats 2024

A great write-up from Francesca at TNA about the past year for PRONOM, via Georgia at the OPF.

It’s great to see the continuing work, including the vital translation of guides into other languages. Francesca includes a couple of shout-outs to some pieces I have contributed in my spare time this year, including a collaborative workshop with Francesca, David, and Tyler at iPRES2024.

C3PO narrates the story of Star Wars to the Ewoks in Return of the Jedi

PRONOM’s dustiest records

Tyler’s recent blog post for the PRONOM Hack-a-thon Week 2024 (my previous post for this week) brought up an interesting point about two of PRONOM’s oldest outline records, Real Video Clip (fmt/204) and Real Video (x-fmt/277). How did they end up in PRONOM?

NB. Because of the complexity of this post, it may be easier to read in its original blog form than on Mastodon: https://exponentialdecay.co.uk/blog/pronoms-dustiest-records/

Tyler suggests:

I assume PRONOM originally added these based on MIME types available.

I thought I knew the answer, but it prompted a forensic look at the records to see if what I thought I knew aligned with reality!

Logo for wddroidy

Making DROID work with Wikidata

Wikidata is a good service; Wikibase (on which Wikidata is built) is a better platform.

I have spoken before about its potential to be added into the file-format registry ecosystem in a federated model.

If we are to use it as a registry that can perhaps complement the pipelines going into PRONOM, e.g. in vendors’ digital preservation platforms such as the Rosetta Format Library, Wikidata should be able to output different serializations of signature files for tools such as Siegfried, DROID or FIDO.

And what about DROID?

Client-side file format identification and reporting pipeline with Siegfried and Demystify Lite

Thanks to the sponsorship of Archives New Zealand, and to Richard Lehane for his great coding expertise and collaboration, Demystify Lite has a new feature: Siegfried!

Richard recently posted about this work on LinkedIn, but let’s look at this effort in more detail below.

Using a custom Wikibase with Siegfried

In March I was invited by the LD4 Wikidata Affinity Group to talk about my experiences using Wikibase with Siegfried, the file format identification tool. I don’t think I’ve talked about that work on here before but you can find links to my iPRES talk on my ORCID page.

Let’s look at the abstract and the content of the talk below.

What is the checksum of a directory? Introducing sumfolder1

In “Fractal in detail: What information is in a file-format identification report?” I describe the different ways of dissecting the information in a file-format identification report.

A file-format identification report is a data-rich artifact created during the processing of digital collections.

I had the idea of using this type of report to attach a checksum to an archival collection (files and directories) as a whole. This is done using methods akin to a Merkle Tree, similar to those in source control systems such as Git, and Web3 Blockchain projects like Bitcoin.

This project is called sumfolder1.
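
To give a feel for the Merkle Tree idea, here is a minimal Python sketch. It is not sumfolder1's actual algorithm: it walks the file system directly rather than reading a file-format identification report, and the choice of SHA-256, the function names file_digest and dir_digest, and the way each child's name is combined with its digest are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def dir_digest(path: Path) -> str:
    """Hash a directory from the digests of its children (Merkle-style).

    Hypothetical scheme for illustration only: each child contributes
    its name and its own digest, in name order, to the parent's hash.
    """
    h = hashlib.sha256()
    for child in sorted(path.iterdir(), key=lambda p: p.name):
        digest = dir_digest(child) if child.is_dir() else file_digest(child)
        h.update(child.name.encode("utf-8") + b"\0" + digest.encode("ascii") + b"\n")
    return h.hexdigest()
```

Calling dir_digest(Path("my-collection")) on a hypothetical collection folder returns a single hex string. Because each directory's digest depends on the digests of everything beneath it, changing any one file changes the checksum of every folder above it, which is the property that makes a single value usable for a collection as a whole.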
