Software Development Archives - ross spencer :: exponentialdecay.digipres

Portrait of me in my Orcfax tee, featuring our company mascot, Echo

Winding down at Orcfax: a retrospective

With the recent announcement that Orcfax is heading into operational mode, it’s a bittersweet moment that means our adventure in…

A slide excerpt from my presentation Declarative Programming for Digital Preservationists showing how network effects can be embraced and side-effects reduced in the declarative paradigm.

Declarative programming for Digital Preservationists @ NTTW8

Just released on the No Time to Wait (NTTW) YouTube channel is my presentation from NTTW8 in Karlsruhe, Germany. (Slides also available here).

The presentation follows up on my proposal for iPRES 2024 and allowed me to present parts of what was, in the end, a pretty significant paper (in terms of word count).

Some of my reflections on the presentation are below.

A tree at sunset photographed from the train on the Bodensee in Southern Germany

Versioning as memory?

So, it turns out my theme of the moment is code hygiene (or maybe memory?).

Today I am thinking about versioning, especially in relation to its impact on digital preservation: both software preservation and the impact of versions on long-term preservation efforts in other contexts.

Silver birch tree in Ravensburg featuring its characteristic eye-like bark.

Code as memory?

It is very poetic to think about code as containing the memory of its maintainers. I don’t entirely disagree with the idea, but it’s overly poetic, and the reality of maintaining systems that have become too unwieldy is anything but poetic.

Making DROID work with Wikidata

Wikidata is a good service, Wikibase (on which Wikidata is built) is a better platform.

I have spoken before about its potential to be added into the file-format registry ecosystem in a federated model.

If we are to use it as a registry that can perhaps complement the pipelines going into PRONOM, e.g. in vendors’ digital preservation platforms such as the Rosetta Format Library, Wikidata should be able to output different serializations of signature files for tools such as Siegfried, DROID or FIDO.

Siegfried ✅: https://github.com/richardlehane/siegfried/wiki/Wikidata-identifier
Fido ❌: I’ll need to revisit this!

And what about DROID?

Client-side file format identification and reporting pipeline with Siegfried and Demystify Lite

Thanks to the sponsorship of Archives New Zealand, and to Richard Lehane for his great coding expertise and collaboration, Demystify Lite has a new feature: Siegfried!!

Richard recently posted about this work on LinkedIn but let’s look at this effort in more detail below.

iPRES2024 header for DESIGN PATTERNS IN DIGITAL PRESERVATION: DECLARATIVE SOFTWARE FOR DIGITAL PRESERVATIONISTS

Not your first paper from iPRES2024: Design patterns in Digital Preservation: Declarative software for digital preservationists

Well folks, my paper for iPRES2024 was rejected, but the good news is that you get to read it here…

Demystify-lite is upgraded and demystify 2.0.0 is finally here!

Demystify-lite and Demystify 2.0.0 have been released featuring the denylist for the first time, plus some bonus features! With thanks…

Stop, Look, Listen, retro game style advertising for safety at a Houston Bus Stop

Linting as understanding

I have been working on a Python template repository as part of my day-job at Orcfax.

It is based on the popular pypa sample project and adds important tooling that supports the quality assurance of projects that many developers are expected to engage with.

In my template repository I add editor defaults, linting, and prepare the repository for unit tests, and then deployment.

I have migrated a copy of the template I created for Orcfax to a new file format organisation I have created to capture work I am doing around tools such as ffdev.info (the PRONOM signature development utility).

The new template repository can be found here: ffdev-info/template.py.

I want to talk about how this tooling can be used as a way of understanding legacy code, or new code that you are going to be looking at, and how linting can be useful for learning and understanding.

What is the checksum of a directory? Introducing sumfolder1

In Fractal in detail: What information is in a file-format identification report? I describe the different ways of dissecting the information in a file-format identification report.

A file-format identification report is a data-rich artifact created during the processing of digital collections.

I had the idea of using this type of report to attach a checksum to an archival collection (files, and directories) as a whole. This is done using methods akin to a Merkle Tree, similar to those in source control systems such as Git, and Web3 Blockchain projects like Bitcoin.

This project is called sumfolder1.
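
To make the Merkle Tree idea a little more concrete, here is a minimal sketch in Python. It is not sumfolder1's actual algorithm: it walks the filesystem directly rather than reading a file-format identification report, and the function names (file_digest, directory_digest) and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha.update(chunk)
    return sha.hexdigest()


def directory_digest(path: Path) -> str:
    """Return a Merkle-style digest for a directory (hypothetical helper).

    Each child contributes its type, name and digest; a sub-directory's
    digest is computed recursively, so a change anywhere below this
    directory changes the value returned here.

    Assumption: the tree contains no symlink loops.
    """
    sha = hashlib.sha256()
    # Sort by name so the result does not depend on listing order.
    for child in sorted(path.iterdir(), key=lambda p: p.name):
        if child.is_dir():
            kind, digest = "dir", directory_digest(child)
        else:
            kind, digest = "file", file_digest(child)
        sha.update(f"{kind}:{child.name}:{digest}\n".encode("utf-8"))
    return sha.hexdigest()
```

Calling directory_digest(Path("my-collection")) on two copies of the same collection should give the same value, and changing a single byte in any file, or renaming any file or folder, should give a different one. Because file and folder names are part of each node in this sketch, it fingerprints structure as well as content.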

Tag: Software Development
