5 thoughts on “Revisiting bsdiff as a tool for digital preservation”

    1. Hi Coucou, yes, that one looks equivalent. The original is by Colin Percival; see https://www.daemonology.net/bsdiff/ with more info here. bsdiff is packaged with FreeBSD. I’m not sure where the sources are for the version I am using on Ubuntu, but going by the man page, it looks like it was packaged circa 2003. I hope that helps!

  1. Interesting process. Is this similar to how Archives NZ handles “fixes” in its repository? Also, what would be best practice for the extension on the patch file? Retaining the source extension seems problematic, as the bsdiff file uses its own file format.

    1. Hi Tyler, thanks for reading! And good questions. About Archives NZ: I didn’t mean to imply that it works this way, but at the time I first wrote about bsdiff, I was wrestling with the mechanics of versioning in their digital preservation system. There was a lot to think about there, including duplication of storage and the eventual representation of versions in the METS (I ended up writing about it here, and we ended up re-digitizing and re-ingesting). In DP systems in general, I think data models, and the way workflows are implemented and executed, would need additions to incorporate bsdiff/delta versioning effectively. Perhaps it’s more suited to a greenfield project? I first implemented it with colleagues at TNA for a digitization project (one we had lots of control over). I couldn’t speak to how that ended up landing in the preservation system there or whether it persisted. WRT naming, I hear you. I expect .patch would be a good baseline, but we could add more meaning in the extension if desired, e.g., something to indicate its use for preservation/repair. Interesting to think about!
