{"id":2138,"date":"2025-09-28T21:00:30","date_gmt":"2025-09-28T21:00:30","guid":{"rendered":"https:\/\/exponentialdecay.co.uk\/blog\/?p=2138"},"modified":"2025-12-01T16:59:33","modified_gmt":"2025-12-01T16:59:33","slug":"bsdiff-as-a-tool-for-digital-preservation","status":"publish","type":"post","link":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/","title":{"rendered":"Revisiting bsdiff as a tool for digital preservation"},"content":{"rendered":"<p>I introduced bsdiff in <a href=\"https:\/\/openpreservation.org\/blogs\/bsdiff-technological-solutions-reversible-pre-conditioning-complex-binary-objects\/\" target=\"_blank\" rel=\"noopener\">a blog post in 2014<\/a>. bsdiff compares two files, e.g. <code>broken_file_a<\/code> and <code>corrected_file_b<\/code>, and creates a <code>patch<\/code> that can be applied to <code>broken_file_a<\/code> to generate a byte-for-byte match for <code>corrected_file_b<\/code>.<\/p>\n<p>On the face of it, in an archive, we probably only care about <code>corrected_file_b<\/code>, so why would we care about a technology that patches a broken file?<\/p>\n<p>In all of the use-cases we can imagine, the primary reasons are cost savings and removing redundancy in file storage or transmission of digital information. In one very special case, we can record the difference between <code>broken_file_a<\/code> and <code>corrected_file_b<\/code> and give users a totally objective method of recreating <code>corrected_file_b<\/code> from <code>broken_file_a<\/code>, providing 100% verifiable proof of the migration pathway taken between the two files.<\/p>\n<p><!--more--><\/p>\n<p>On space saving, we have very concrete examples of digital preservation systems or digital preservation policies that do not allow for the replacement of an original digital file, e.g. as it was first received from an agency or donor or via some other workflow. 
If we later find an issue with the file, and find that we can correct the errors, we may end up ingesting a second &#8220;copy&#8221; of the object with the fixes applied.<\/p>\n<p>As file sizes increase, so does the cost of this duplication and redundancy.<\/p>\n<p>Marion Jaks and J\u00e9r\u00f4me Martinez described this exact scenario for multi-gigabyte video files at No Time to Wait 8.<\/p>\n<p><iframe loading=\"lazy\" title=\"No Time To Wait - S08E21 - MediaInfo\u2019s checker at Austrian Mediathek - Marion Jaks, J\u00e9r\u00f4me Martinez\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/qCofw7d-K74?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<p>In their scenario they discovered that non-conformant WAV files can be corrected using an automated fix. After applying the fix, 99% of the file remains unchanged.<\/p>\n<p>The (current) preservation policy at the <a href=\"https:\/\/www.mediathek.at\/\" target=\"_blank\" rel=\"noopener\">\u00d6sterreichische Mediathek<\/a> is such that both the original and the corrected file are kept, which means a lot of duplicate information is stored.<\/p>\n<p>Marion notes that policies may change in future, and J\u00e9r\u00f4me too describes the potential to use documentation to avoid this redundancy.<\/p>\n<p>I put the idea of bsdiff to them. In this scenario, a patch file may provide enough documentation and potential for automation (to create a corrected file) to make important storage (and cost) savings.<\/p>\n<p>That being said, it has been a long time since I looked at the tooling, so in this blog I take a look at the state of bsdiff in 2025. 
Let&#8217;s see if we can use it to fix the corrupted object we created when we corrupted <a href=\"https:\/\/exponentialdecay.co.uk\/blog\/the-sensitivity-index-corrupting-y2k\/\" target=\"_blank\" rel=\"noopener\">The Digital Dark Age Crew&#8217;s Y2K<\/a>!<\/p>\n<h2>Corrupting Y2K<\/h2>\n<p>When we <a href=\"https:\/\/exponentialdecay.co.uk\/blog\/the-sensitivity-index-corrupting-y2k\/\" target=\"_blank\" rel=\"noopener\">corrupted Y2K<\/a> we created a corrupt file from a high-quality original.<\/p>\n<p>I tried to corrupt (glitch) about seven different file formats and present various different corrupted versions that demonstrate how the quality of the audio changes over various different corruptions.<\/p>\n<p>Take one example in isolation.<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/w.soundcloud.com\/player\/?url=https%3A\/\/api.soundcloud.com\/tracks\/soundcloud%253Atracks%253A2176598586&amp;color=%23ff0d0d&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;show_teaser=true&amp;visual=true\" width=\"100%\" height=\"300\" frameborder=\"no\" scrolling=\"no\"><\/iframe><\/p>\n<div style=\"font-size: 10px; color: #cccccc; line-break: anywhere; word-break: normal; overflow: hidden; white-space: nowrap; text-overflow: ellipsis; font-family: Interstate,Lucida Grande,Lucida Sans Unicode,Lucida Sans,Garuda,Verdana,Tahoma,sans-serif; font-weight: 100;\" data-darkreader-inline-color=\"\"><a style=\"color: #cccccc; text-decoration: none;\" title=\"exponential-decay\" href=\"https:\/\/soundcloud.com\/exponential-decay\" target=\"_blank\" rel=\"noopener\" data-darkreader-inline-color=\"\">exponential-decay<\/a> \u00b7 <a style=\"color: #cccccc; text-decoration: none;\" title=\"00055-ddac-y2k-snippet-ac3\" href=\"https:\/\/soundcloud.com\/exponential-decay\/00055-ddac-y2k-snippet-ac3\" target=\"_blank\" rel=\"noopener\" 
data-darkreader-inline-color=\"\">00055-ddac-y2k-snippet-ac3<\/a><\/div>\n<p>&nbsp;<\/p>\n<p>This track represents the 55th glitch of the Digital Dark Age Crew&#8217;s Y2K. Our broken file has the MD5: <code>dec84cb48e1d44d07166ea6bb3a8fd6c<\/code><\/p>\n<p>In this instance we also have access to the original file. Hypothetically, this file can stand in for either the original, e.g. in the case of a network transfer error, or a modified version, in the case we were able to correct some aspect of the file.<\/p>\n<p>The corrected file has the MD5: <code>ab5afe4018907f5cae3905234f539c8d<\/code><\/p>\n<p>Both files are 196 KiB.<\/p>\n<p>You can see, side by side, some examples of where the file was corrupted:<\/p>\n<figure id=\"attachment_2687\" aria-describedby=\"caption-attachment-2687\" style=\"width: 1600px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-2687 size-full\" src=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff.png\" alt=\"Image shows one corrupted file side-by-side with its non-corrupted partner through the lens of a diff tool. 
The differences are highlighted on the command line in red and green.\" width=\"1600\" height=\"750\" srcset=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff.png 1600w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff-500x234.png 500w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff-768x360.png 768w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff-1024x480.png 1024w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/ac3diff-1536x720.png 1536w\" sizes=\"auto, (max-width: 1600px) 100vw, 1600px\" \/><\/a><figcaption id=\"caption-attachment-2687\" class=\"wp-caption-text\">Figure 1: Visual diff of a small section of two files showing the differences between the two.<\/figcaption><\/figure>\n<p>If we remind ourselves of the original <code>bitflip<\/code> command, the differences represent 55 x 0.001% of the file: <code>bitflip spray percent:0.001 {}<\/code>.<\/p>\n<p>We will call our broken file <code>source<\/code> and our corrected file <code>target<\/code>. We want to create a patch file, <code>patch<\/code>, such that when it is applied to <code>source<\/code> we generate <code>target<\/code> (although we will need to call that <code>result<\/code> and ensure that it is the same as <code>target<\/code>).<\/p>\n<p>To create the patch we do the following:<\/p>\n<pre>bsdiff source.ac3 target.ac3 patch.ac3\r\n<\/pre>\n<p><em><strong>NB.<\/strong> using .ac3 as an extension for the patch file is an arbitrary choice to simplify this illustration. Other schemes might be adopted in other scenarios.<\/em><\/p>\n<p>The patch file is 4.0 KiB, which is 2% of the size of the broken or target file. 
Its contents fits easily on this page <em>(a copy of the repaired file we will generate from this patch would take up 98 x the space)<\/em>:<\/p>\n<pre>00000000: 42 53 44 49 46 46 34 30 30 00 00 00 00 00 00 00  BSDIFF400.......\r\n00000010: 86 07 00 00 00 00 00 00 00 0c 03 00 00 00 00 00  ................\r\n00000020: 42 5a 68 39 31 41 59 26 53 59 48 28 c8 1d 00 00  BZh91AY&amp;SYH(....\r\n00000030: 04 40 c0 58 24 00 00 c0 00 20 00 21 a6 8c d4 21  .@.X$.... .!...!\r\n00000040: 80 bd ab 06 34 1e 2e e4 8a 70 a1 20 90 51 90 3a  ....4....p. .Q.:\r\n00000050: 42 5a 68 39 31 41 59 26 53 59 ca 98 69 a8 00 00  BZh91AY&amp;SY..i...\r\n00000060: ab 7f ff ff ff ff ff ff ff ff ff ff ff df ff ff  ................\r\n00000070: fb 5f ff fd ff fe ff bb ff ff ff ef ff ff 7c 7f  ._............|.\r\n00000080: ff fe dd d0 05 b8 00 00 07 9c e9 00 0a 22 52 9b  .............\"R.\r\n00000090: 20 d0 43 11 81 a4 c8 c1 32 64 da 46 d1 34 c8 64   .C.....2d.F.4.d\r\n000000a0: da 98 34 98 98 98 9a 69 ea 69 a3 13 4c 9b 21 90  ..4....i.i..L.!.\r\n000000b0: d0 26 98 00 4c 02 30 4c d3 53 46 99 1a 60 34 9e  .&amp;..L.0L.SF..`4.\r\n000000c0: a6 8c d1 34 1e a6 99 1a 64 69 8d 1a 69 a2 36 93  ...4....di..i.6.\r\n000000d0: 21 55 35 3d 1a 9b 21 1b 40 68 02 6d 47 a6 81 a6  !U5=..!.@h.mG...\r\n000000e0: 9a 02 62 62 7a 4f 41 30 7a a6 69 a0 d1 a3 40 00  ..bbzOA0z.i...@.\r\n000000f0: 4c 99 06 4d 06 46 10 60 9a 30 04 c0 19 0d 46 00  L..M.F.`.0....F.\r\n00000100: 0d 34 00 d0 d2 64 64 c0 21 82 1a 6d 42 29 fa 09  .4...dd.!..mB)..\r\n00000110: a0 09 54 00 86 13 23 09 a6 11 a6 11 93 02 60 86  ..T...#.......`.\r\n00000120: 98 09 89 a3 04 69 89 84 c4 c0 4c 00 00 04 00 69  .....i....L....i\r\n00000130: a3 13 08 30 09 88 30 26 99 03 46 98 00 98 00 00  ...0..0&amp;..F.....\r\n00000140: 80 00 00 00 00 00 00 0d 00 34 00 00 00 00 00 0d  .........4......\r\n00000150: 00 34 34 68 00 34 00 00 00 00 06 40 00 00 00 34  .44h.4.....@...4\r\n00000160: 00 01 a3 26 80 65 29 26 02 30 99 31 0d 34 d3 4d  
...&amp;.e)&amp;.0.1.4.M\r\n00000170: 03 23 4c 23 43 09 a1 a3 26 86 26 8c 08 34 60 08  .#L#C...&amp;.&amp;..4`.\r\n00000180: 64 c8 31 32 01 89 a1 82 31 18 9a 32 34 c8 30 4d  d.12....1..24.0M\r\n00000190: 06 08 03 26 13 08 30 8c 9a 06 23 26 83 1e ef b8  ...&amp;..0...#&amp;....\r\n000001a0: 87 fc 7d e6 2f 88 50 7f 7a 1e ea e8 de 44 d1 f6  ..}.\/.P.z....D..\r\n000001b0: 73 2b 8d 04 b9 4c 04 0a 6e 42 00 8e 4e 99 3c ea  s+...L..nB..N.&lt;.\r\n000001c0: 69 eb 99 22 14 2b 33 82 94 ea 76 46 34 0e 79 d0  i..\".+3...vF4.y.\r\n000001d0: ee 20 e7 50 51 6a c9 98 b2 51 92 82 35 ad ce 64  . .PQj...Q..5..d\r\n000001e0: ac a9 28 bb 13 8c a8 0b 26 10 8a 4c e6 6d 5c c1  ..(.....&amp;..L.m\\.\r\n000001f0: 93 8a 09 8a 82 30 93 3c dc d2 78 92 95 c5 5e e9  .....0.&lt;..x...^.\r\n00000200: 19 5c a0 a6 20 d8 42 a8 46 9c 14 96 09 9b 03 bd  .\\.. .B.F.......\r\n00000210: 1c a2 ea 1e 4c 6d 24 84 9d b0 eb 04 7c 45 49 d4  ....Lm$.....|EI.\r\n00000220: e5 31 58 29 24 0d 90 8c 8f 06 02 aa 1b 0c 2b 5d  .1X)$.........+]\r\n00000230: 11 14 20 04 af 6b b3 8c 9d 49 24 b9 12 0c 1b bd  .. 
..k...I$.....\r\n00000240: c3 44 15 f0 f5 94 5d 45 a2 ec 66 0a d2 8a 33 a2  .D....]E..f...3.\r\n00000250: 9b c6 00 81 42 8d c8 da c9 2a 47 2b 2b 26 74 5c  ....B....*G++&amp;t\\\r\n00000260: dd 89 8c 8d c4 aa 79 5c 46 91 39 e7 02 8d 25 39  ......y\\F.9...%9\r\n00000270: 84 3b c0 dd 92 2f 0c 4d 62 57 b1 90 b4 8e 2b 9a  .;...\/.MbW....+.\r\n00000280: 41 0a 36 6a 28 a8 44 76 20 74 79 04 23 4c 40 d0  A.6j(.Dv ty.#L@.\r\n00000290: 29 4c 14 9d 1a b5 81 4c 94 ca c0 21 95 af 1b d2  )L.....L...!....\r\n000002a0: f0 ca 40 ef 6b 0a 5a ef 93 23 26 55 48 4d d9 da  ..@.k.Z..#&amp;UHM..\r\n000002b0: e8 01 c6 a9 91 bc d0 39 ca 88 cb 91 35 15 54 8e  .......9....5.T.\r\n000002c0: 47 00 57 90 8c c9 2b 09 3a 82 c1 21 ca 54 48 31  G.W...+.:..!.TH1\r\n000002d0: 91 18 27 33 c8 0c 08 08 8b e0 ea 05 ae 70 98 a5  ..'3.........p..\r\n000002e0: 05 22 81 6f 7b da 70 27 50 46 29 74 ca 98 23 67  .\".o{.p'PF)t..#g\r\n000002f0: 40 86 b5 9d 6b 57 a0 24 ba 50 ab 58 14 f0 c6 19  @...kW.$.P.X....\r\n00000300: eb 7b 45 12 29 85 32 84 44 90 a2 2b 26 0c 8a c0  .{E.).2.D..+&amp;...\r\n00000310: aa 21 44 42 bd 67 55 9a 45 28 86 f3 75 5a 0b 02  .!DB.gU.E(..uZ..\r\n00000320: 73 34 43 25 2a 59 54 8a 8e 78 0b 88 92 58 32 a5  s4C%*YT..x...X2.\r\n00000330: f0 31 02 4c 1b b3 21 a1 a0 c8 af 44 c3 2b 5a 36  .1.L..!....D.+Z6\r\n00000340: 42 21 53 c4 46 80 84 74 a9 a1 15 08 65 71 82 6a  B!S.F..t....eq.j\r\n00000350: 26 4b 0b a6 53 10 14 89 b8 b0 47 06 4f 13 2c 67  &amp;K..S.....G.O.,g\r\n00000360: 14 7b 2a 28 22 71 9b d5 26 42 f2 3c 8a ea 61 a7  .{*(\"q..&amp;B.&lt;..a.\r\n00000370: 5c da a3 24 05 22 19 34 67 76 69 02 c0 10 3c 6b  \\..$.\".4gvi...<\/pre>\n<p>Now we want to apply patch to the broken file: source to generate the corrected file: <code>result.ac3<\/code>.<\/p>\n<p>We do this using the <code>bsdiff<\/code> companion tool <code>bspatch<\/code> as follows:<\/p>\n<pre>bspatch source.ac3 result.ac3 patch.ac3\r\n<\/pre>\n<p>We compare this to <code>target.ac3<\/code>\u00a0and see that the result is a 
byte-for-byte match. We can list the directory contents:<\/p>\n<pre>$ md5sum *\r\ndec84cb48e1d44d07166ea6bb3a8fd6c source.ac3\r\nba4c8e99b0f69fab8a8330c69a4084a6 patch.ac3\r\nab5afe4018907f5cae3905234f539c8d target.ac3\r\nab5afe4018907f5cae3905234f539c8d result.ac3 &lt;-- checksum matches target.ac3\r\n<\/pre>\n<p>And we can listen to the results:<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/w.soundcloud.com\/player\/?url=https%3A\/\/api.soundcloud.com\/tracks\/soundcloud%253Atracks%253A2176750962&amp;color=%23ff0d0d&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;show_teaser=true&amp;visual=true\" width=\"100%\" height=\"300\" frameborder=\"no\" scrolling=\"no\"><\/iframe><\/p>\n<div style=\"font-size: 10px; color: #cccccc; line-break: anywhere; word-break: normal; overflow: hidden; white-space: nowrap; text-overflow: ellipsis; font-family: Interstate,Lucida Grande,Lucida Sans Unicode,Lucida Sans,Garuda,Verdana,Tahoma,sans-serif; font-weight: 100;\" data-darkreader-inline-color=\"\"><a style=\"color: #cccccc; text-decoration: none;\" title=\"exponential-decay\" href=\"https:\/\/soundcloud.com\/exponential-decay\" target=\"_blank\" rel=\"noopener\" data-darkreader-inline-color=\"\">exponential-decay<\/a> \u00b7 <a style=\"color: #cccccc; text-decoration: none;\" title=\"DDAC Y2K Snippet\" href=\"https:\/\/soundcloud.com\/exponential-decay\/ddac-y2k-snippet\" target=\"_blank\" rel=\"noopener\" data-darkreader-inline-color=\"\">DDAC Y2K Snippet<\/a><\/div>\n<p>&nbsp;<\/p>\n<p>We can see our commands and the results on the command line via <a href=\"https:\/\/asciinema.org\/a\/7kqp0cmZ0pYbvIUpEfjBZHtRs\" target=\"_blank\" rel=\"noopener\">Asciinema<\/a>.<\/p>\n<p><script src=\"https:\/\/asciinema.org\/a\/7kqp0cmZ0pYbvIUpEfjBZHtRs.js\" id=\"asciicast-7kqp0cmZ0pYbvIUpEfjBZHtRs\" async=\"true\"><\/script><\/p>\n<h2>Repeating the process with a video<\/h2>\n<p>We can demonstrate the same process with a 
video. Take the source below, corrupted again using <code>bitflip<\/code>.<\/p>\n<p><iframe loading=\"lazy\" title=\"Y2K corrupted\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/lI3-NUyKsXc?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<p>The <code>source<\/code> file has a checksum of <code>a4964b40c87c8dee73fe338a70eb2ed9<\/code> and is 3.3 MiB.<\/p>\n<p>We are fortunate to have the <code>target<\/code> file. Again, imagine we&#8217;ve corrected its container object, or added some metadata. The <code>target<\/code> has the checksum: <code>0da8c1dcd3b4dd7f8eff629d6f876bcc<\/code><\/p>\n<pre>bsdiff source.mp4 target.mp4 patch.mp4\r\n<\/pre>\n<p>Like the audio file, the resulting <code>patch<\/code> file is 4.0 KiB <em>(in this instance this is just 0.12% the size of the original)<\/em>:<\/p>\n<pre>00000000: 42 53 44 49 46 46 34 30 37 00 00 00 00 00 00 00  BSDIFF407.......\r\n00000010: 89 04 00 00 00 00 00 00 49 7b 34 00 00 00 00 00  ........I{4.....\r\n00000020: 42 5a 68 39 31 41 59 26 53 59 9c 1f a2 3e 00 00  BZh91AY&amp;SY...&gt;..\r\n00000030: 06 5c c0 60 20 20 00 04 00 00 20 00 08 40 40 20  .\\.`  .... ..@@ \r\n00000040: 00 21 a0 1b 50 83 26 22 f2 92 43 06 78 bb 92 29  .!..P.&amp;\"..C.x..)\r\n00000050: c2 84 84 e0 fd 11 f0 42 5a 68 39 31 41 59 26 53  .......BZh91AY&amp;S\r\n00000060: 59 53 b6 29 dd 00 0f 4c ff ff f6 d5 fb 3f fc fb  YS.)...L.....?..\r\n00000070: fb 7f 5f a4 7f 36 8e 4f f7 25 7b 9d 7d a6 ef b7  .._..6.O.%{.}...\r\n00000080: ed c5 ee af f4 cd d7 3f e6 7d d0 05 5e 00 00 00  .......?.}..^...\r\n00000090: 00 20 08 54 89 9a 83 68 cd 44 f5 0c 98 18 89 90  . 
.T...h.D......\r\n000000a0: f5 1a 69 b4 4c 4c 3d 4d 09 e2 9a 63 53 4f 51 a6  ..i.LL=M...cSOQ.\r\n000000b0: d4 7a 80 d3 0d 0d 26 4f 4c 4d 4c 34 0c 10 9e 8d  .z....&amp;OLML4....\r\n000000c0: 4d a6 a1 ea 7a 4d a9 fa 9b 4d 0c 93 d4 41 93 11  M...zM...M...A..\r\n000000d0: 93 4c 9a 0d 0c 98 86 40 1a 68 0c 23 04 69 a6 08  .L.....@.h.#.i..\r\n000000e0: 30 98 86 23 46 80 34 03 08 c0 9a 06 9a 68 32 00  0..#F.4......h2.\r\n000000f0: c8 69 93 40 0d 32 10 64 c4 64 d3 26 83 43 26 21  .i.@.2.d.d.&amp;.C&amp;!\r\n00000100: 90 06 9a 03 08 c1 1a 69 82 0c 26 21 88 d1 a0 0d  .......i..&amp;!....\r\n00000110: 00 c2 30 26 81 a6 9a 0c 80 32 1a 64 d0 03 4c 84  ..0&amp;.....2.d..L.\r\n00000120: 19 31 19 34 c9 a0 d0 c9 88 64 01 a6 80 c2 30 46  .1.4.....d....0F\r\n00000130: 9a 60 83 09 88 62 34 68 03 40 30 8c 09 a0 69 a6  .`...b4h.@0...i.\r\n00000140: 83 20 0c 86 99 34 00 d3 20 42 a8 46 8c 86 a5 40  . ...4.. B.F...@\r\n00000150: 06 40 00 00 0d 1a 00 00 00 00 00 00 00 00 00 00  .@..............\r\n00000160: 00 00 00 00 00 d3 fe 0b f5 2e d7 28 1f da 05 dd  ...........(....\r\n00000170: 2a 9b e0 7d 4c 26 50 aa 61 21 9c a9 d8 48 0f 08  *..}L&amp;P.a!...H..\r\n00000180: 11 32 8f fd 09 d8 40 a9 84 02 06 70 a0 e1 29 d9  .2....@....p..).\r\n00000190: 21 5c 60 d6 40 07 68 78 42 89 ba 04 0c 24 0d d2  !\\`.@.hxB....$..\r\n000001a0: 01 ca 03 8c 20 26 90 a8 7e e8 55 e3 0a 14 a0 f1  .... &amp;..~.U.....\r\n000001b0: 90 13 fe c0 aa 50 34 27 09 14 73 85 28 10 e5 28  .....P4'..s.(..(\r\n000001c0: b8 c8 a6 f9 13 84 8a 1a c2 a1 84 2b 94 1c 24 73  ...........+..$s\r\n000001d0: 25 34 94 34 95 71 90 13 7d 48 22 6f d3 04 14 d6  %4.4.q..}H\"o....\r\n000001e0: 10 43 69 40 77 ca a9 9d e2 20 03 38 50 ca 04 46  .Ci@w.... 
.8P..F\r\n000001f0: 95 42 80 0c 65 17 18 40 35 80 14 ea 85 79 40 23  .B..e..@5....y@#\r\n00000200: 48 19 42 02 e7 02 ae 10 8f 18 03 49 4e 52 a9 c6  H.B........INR..\r\n00000210: 10 c6 01 4c 20 4c e0 40 c6 47 58 0e 32 a1 84 a2  ...L L.@.GX.2...\r\n00000220: 65 22 74 85 50 d2 01 e7 20 1a c2 07 1b 69 10 e9  e\"t.P... ....i..\r\n00000230: 0a f1 94 4e 70 0b 42 8b ca 04 5c e1 36 91 10 e3  ...Np.B...\\.6...\r\n00000240: 2a 25 20 65 0b ac 0a e7 2a af 48 4c e0 77 4a 29  *% e....*.HL.wJ)\r\n00000250: be 53 59 47 94 28 a7 28 74 95 1a 03 19 55 71 cb  .SYG.(.(t....Uq.\r\n00000260: 02 95 03 84 82 65 00 07 28 00 5d a1 13 09 57 48  .....e..(.]...WH\r\n00000270: 4d a0 3a 46 52 a6 f9 11 e9 0d 07 28 4d d0 8e d2  M.:FR......(M...\r\n00000280: 8b b4 08 99 40 18 cb ce 30 94 1d d2 29 c6 7a 4a  ....@...0...).zJ\r\n00000290: 3c 65 43 69 53 94 28 36 18 02 ed 0a bb e5 14 df  &lt;eCiS.(6........\r\n000002a0: 00 39 4f c8 80 17 39 1e 72 28 75 48 1b e0 3a a4  .9O...9.r(uH..:.\r\n000002b0: 29 0d 24 01 3a a2 93 9c 83 b4 b9 ca 06 72 a2 ee  ).$.:........r..\r\n000002c0: 90 75 90 46 80 3a 4a 06 b2 a3 94 a2 f5 4a 39 c8  .u.F.:J......J9.\r\n000002d0: ae 52 86 30 85 00 bc 61 43 76 fc 15 1d a1 35 85  .R.0...aCv....5.\r\n000002e0: 13 aa 04 e9 21 ba 44 e1 2e e8 57 48 54 35 80 df  ....!.D...WHT5..\r\n000002f0: 28 19 c2 07 39 41 c2 45 35 94 ca 55 1d 20 55 35  (...9A.E5..U. U5\r\n00000300: b2 85 13 9c 26 e8 34 84 07 75 ca 01 10 ca c6 00  ....&amp;.4..u......\r\n00000310: 53 18 45 38 c0 38 ca 8e 50 81 be 45 d2 05 39 c8  S.E8.8..P..E..9.\r\n00000320: 3a 59 41 84 0b c2 54 e5 03 c2 51 c6 15 ea 80 c6  :YA...T...Q.....\r\n00000330: 44 e9 21 94 0a f1 84 1c e1 0d c4 82 6d d3 04 51  D.!.........m..Q\r\n00000340: e9 2b c2 40 d2 00 31 85 1e 52 22 9c 67 84 26 d2  .+.@..1..R\".g.&amp;.\r\n00000350: 18 4a 38 c2 ed 02 f4 8c a0 13 84 08 f4 8e 12 f2  .J8.............\r\n00000360: 80 5c 65 13 74 8a 3b 42 8e d2 1c 21 03 48 40 dd  .\\e.t.;B...!.H@.\r\n00000370: 02 85 28 27 18 57 a4 00 63 00 e3 20 08 6d 22 71  ..('.W..c.. 
.m\"q\r\n00000380: 81 43 48 57 74 8e 30 23 9c 6b 00 e7 00 63 02 86  .CHWt.0#.k...c..\r\n00000390: f9 03 48 14 30 80 17 38 40 e1 00 0f 38 47 29 10  ..H.0..8@...8G).\r\n000003a0: 35 d7 00 03 28 50 c6 5e 5a 60 8a 38 c0 27 62 01  5...(P.^Z`.8.'b.\r\n000003b0: 14 11 1a 44 50 44 7b 99 9e 2a 1d eb 64 28 08 98  ...DPD{..*..d(..\r\n000003c0: 4b 47 bd 3a f6 33 93 eb 15 b3 6d 6d 67 ec b3 3f  KG.:.3....mmg..?\r\n000003d0: 61 9a 42 5f f7 ae 60 7e 1b fb 4a a4 42 94 46 d5  a.B_..`~..J.B.F.\r\n000003e0: cf bf 85 5a 76 e8 fc cd 63 90 4a ab 89 59 29 04  ...Zv...c.J..Y).\r\n000003f0: 01 d4 3d a2 64 30 cc 2e 2d 59 af 34 51 08 8a f6  ..=.d0..-Y.4Q...\r\n00000400: d5 92 60 d1 f9 f1 fe eb 43 fa 73 0c 52 dc 22 c4  ..`.....C.s.R.\".\r\n00000410: 04 82 22 00 52 d8 51 9a b1 73 ee 58 cc a2 a8 13  ..\".R.Q..s.X....\r\n00000420: 11 48 8c 04 44 75 f5 35 fb f6 17 9a 47 ec af fc  .H..Du.5....G...\r\n00000430: cb 3e ed 77 d8 e2 1d c2 80 da be 43 8c f2 d2 34  .&gt;.w.......C...4\r\n00000440: 5b 1e 2a 87 93 65 3a 3e 22 81 11 00 0d 8c 1f 51  [.*..e:&gt;\"......Q\r\n00000450: af 9d 3e bf 55 61 ab 8b fa c0 b9 57 0e 4e e1 a9  ..&gt;.Ua.....W.N..\r\n00000460: fa a0 20 00 ef 45 33 5e 44 48 4c a5 9c 53 fe 7f  .. ..E3^DHL..S..\r\n00000470: 68 6b 45 28 c4 40 39 65 cd 44 c4 8d 06 35 3d 3d  hkE(.@9e.D...5==\r\n00000480: af 17 ca 29 32 4c 31 bb 98 f9 df be 39 69 44 59  ...)2L1.....9iDY\r\n00000490: 58 c7 d2 b8 a1 3e b7 39 fb a6 4b e3 19 cb ff 7f  X....&gt;.9..K.....\r\n000004a0: c8 5b 92 f2 fa c2 20 20 20 00 02 03 2e 8f 9a 54  .[....   ......T\r\n000004b0: 85 b8 c7 76 7a 52 2d cd c6 ea 77 25 1d c4 73 44  ...vzR-...w%..sD\r\n000004c0: 1d 1b 22 90 e5 2a 99 e0 21 74 c3 eb af 76 f7 5e  ..\"..*..!t...v.^\r\n000004d0: fc 72 39 9c 1f ce 2e e4 8a 70 a1 20 a7 6c 53 ba  .r9......p. .lS.\r\n000004e0: 42 5a 68 39 17 72 45 38 50 90 00 00 00 00        BZh9.rE8P.....\r\n<\/pre>\n<p>And we can apply it to the source to generate the <code>result<\/code>. We can once again compare <code>result<\/code> to <code>target<\/code> by looking at the checksums. 
Both match, as we expected:<\/p>\n<pre>$ md5sum *\r\na4964b40c87c8dee73fe338a70eb2ed9  source.mp4\r\n458a98361f5e47ac515e3cbc0a41d66e  patch.mp4\r\n0da8c1dcd3b4dd7f8eff629d6f876bcc  target.mp4\r\n0da8c1dcd3b4dd7f8eff629d6f876bcc  result.mp4 &lt;-- checksum matches target.mp4\r\n<\/pre>\n<p>We can see the appearance of the corrected file is <em>vastly<\/em> improved!<\/p>\n<p><iframe loading=\"lazy\" title=\"Y2K Uncorrupted\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/y1OSaLosg1Q?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<h2>Analyzing the patch file<\/h2>\n<p>If, like me, you&#8217;re asking how a patch file might be 4.0 KiB for both a 196 KiB audio file <em>and<\/em> a 3.3 MiB video file, then you will find the answer lies in compression.<\/p>\n<p>The utility makes use of <a href=\"https:\/\/www.daemonology.net\/bsdiff\/\" target=\"_blank\" rel=\"noopener\">local bzip2 compression<\/a> and attaches its own header (and possibly footer) to the compressed output to help make it easier to identify.<\/p>\n<p>Because a bsdiff patch is actually a &#8220;mask&#8221;, and the mask only contains differences between two files, the mask will contain a lot of zeros where no data is going to be replaced; blocks of contiguous identical content are very easy to compress.<\/p>\n<p>We can see how sparse the differences are between the two audio files used in this blog in Figure 1. 
There are only a few bytes dotted about that make the difference between the corrupted file and the corrected one, so we anticipate much of the patch file to be zeros.<\/p>\n<p>How can we demonstrate this?<\/p>\n<h3 id=\"diff-the-diffs\">Diffing the diffs<\/h3>\n<p>If we take an empty file and apply a patch to that file it is reasonable to assume we will end up with a file\u00a0<em><strong>patch-bytes<\/strong><\/em><em>\u00a0<\/em>in size, containing only null and non-null bytes, i.e. the mask. But do we?<\/p>\n<p>We can try it!<\/p>\n<ol>\n<li>Given a patch file: <code>patch.ac3<\/code><\/li>\n<li>Create a new empty file: <code>touch empty.ac3<\/code><\/li>\n<li>Run bspatch against the empty file and create <code>mask.ac3<\/code><\/li>\n<\/ol>\n<pre>bspatch empty.ac3 mask.ac3 patch.ac3\r\n<\/pre>\n<p>Then inspect the mask file, e.g. <code>xxd -g1 -a mask.ac3 | less<\/code> (<a href=\"https:\/\/explainshell.com\/explain?cmd=xxd+-g1+-a+mask.ac3+%7C+less\" target=\"_blank\" rel=\"noopener\">explainshell<\/a>)<\/p>\n<pre>00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n*\r\n00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08  ................\r\n00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n*\r\n000000f0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00  . 
..............\r\n00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n*\r\n00000290: 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00  ................\r\n000002a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n*\r\n000004d0: 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00  .........@......\r\n000004e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n*\r\n000005d0: 00 00 00 00 00 00 00 00 fe 00 00 00 00 00 00 00  ................\r\n000005e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n*\r\n00000650: 00 00 00 00 00 00 00 00 c0 00 00 00 00 00 00 00  ................\r\n00000660: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................\r\n<\/pre>\n<p>You will observe that most of the content (in this example at least) are null bytes, and dotted throughout bytes where bsdiff would apply a transformation to create the byte or bytes from the original target file.<\/p>\n<p>You might be able to get a better idea via <code>xxd<\/code> on its own.<\/p>\n<p><a href=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/xxd-diff.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2693\" src=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/xxd-diff.png\" alt=\"Image shows a hexdump with non-null bytes colorized making it easier to see differences, and ultimately how sparse the data is in the file.\" width=\"797\" height=\"754\" srcset=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/xxd-diff.png 797w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/xxd-diff-500x473.png 500w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/xxd-diff-768x727.png 768w\" sizes=\"auto, (max-width: 797px) 100vw, 797px\" \/><\/a><\/p>\n<h3>Shell script<\/h3>\n<p>You can use the following shell script to do the same:<\/p>\n<pre>#! 
\/usr\/bin\/bash\r\n\r\necho \"hex diff two files using bsdiff\"\r\nif [ $# -lt 2 ]\r\nthen\r\n echo \"please supply a source and target file\"\r\n exit 1\r\nfi\r\necho \"bad_file_a: $1\"\r\necho \"corrected_file_b: $2\"\r\n# quote the arguments so paths containing spaces still work.\r\nbsdiff \"$1\" \"$2\" patch.file\r\n# truncate rather than touch so a leftover empty.file is really empty.\r\n: &gt; empty.file\r\nbspatch empty.file mask.file patch.file\r\nread -p \"\u26a0\ufe0f   display diff (Y\/N): \" confirm &amp;&amp; [[ $confirm == [yY] || $confirm == [yY][eE][sS] ]] || exit 1\r\nxxd -g1 -a mask.file\r\n<\/pre>\n<h3>If nothing else, a visualization<\/h3>\n<p>The mask we create is most useful as a visualization technique: the result is a mixture of bytes inserted <em>or<\/em> transformed via bsdiff&#8217;s algorithm, so some bytes are bit-for-bit matches for the target file and some are only the result of a transformation (and so not an exact bit match).<\/p>\n<p>That being said, if you are looking to get an idea of the areas in a file that have been corrupted between <code>broken_file_a<\/code> and <code>corrected_file_b<\/code>, you can get a pretty effective first opinion either using hex tools like above or using tools like <a href=\"https:\/\/binvis.io\/\" target=\"_blank\" rel=\"noopener\">binvis.io<\/a>:<\/p>\n<p><a href=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2695\" src=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216.png\" alt=\"Image shows a visualization of a mask file using binvis.io. 
The colors are mostly black showing null bytes but droplets of color show where differences appear and these will represent areas of corruption between a broken_file_a and corrected_file_b.\" width=\"1079\" height=\"1079\" srcset=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216.png 1079w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216-500x500.png 500w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216-1024x1024.png 1024w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216-300x300.png 300w, https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/mask-1376-101216-768x768.png 768w\" sizes=\"auto, (max-width: 1079px) 100vw, 1079px\" \/><\/a><\/p>\n<h3>Another quick demo<\/h3>\n<p>Another quick demo using a shell script:<\/p>\n<pre>#! \/usr\/bin\/bash\r\n\r\n# create a target file.\r\necho \"There are differences in this file.\" &gt; target.txt\r\n\r\n# insert some differences to create source \r\n# (6 differences below in total).\r\necho \"there  re dinnerences in this zile!\" &gt; source.txt\r\n\r\n# create our patch and diff files to see the mask.\r\nbsdiff source.txt target.txt patch.txt\r\ntouch empty.txt\r\nbspatch empty.txt mask.txt patch.txt\r\n\r\n# output our comparison.\r\necho \"===========================================\"\r\necho \"bsdiff demo: showing mask compared to diffs\"\r\necho \"===========================================\"\r\necho \"\"\r\necho \"mask:\"\r\necho \"\"\r\nxxd -g1 mask.txt\r\necho \"----\"\r\necho \"source:\"\r\necho \"\"\r\nxxd -g1 source.txt\r\necho \"----\"\r\necho \"target:\"\r\necho \"\"\r\nxxd -g1 target.txt\r\n<\/pre>\n<p>This will output the following:<\/p>\n<pre>===========================================\r\nbsdiff demo: showing mask compared to diffs\r\n===========================================\r\n\r\nmask:\r\n\r\n00000000: <span style=\"text-decoration: 
underline;\"><strong>e0<\/strong><\/span> 00 00 00 00 00 <span style=\"text-decoration: underline;\"><strong>41<\/strong><\/span> 00 00 00 00 00 <span style=\"text-decoration: underline;\"><strong>f8 f8<\/strong><\/span> 00 00  ......A.........\r\n00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <span style=\"text-decoration: underline;\"><strong>ec<\/strong><\/span> 00  ................\r\n00000020: 00 00 <span style=\"text-decoration: underline;\"><strong>2e<\/strong><\/span> 0a                                      ....\r\n----\r\nsource:\r\n\r\n00000000: <span style=\"text-decoration: underline;\"><strong>74<\/strong><\/span> 68 65 72 65 20 <span style=\"text-decoration: underline;\"><strong>20<\/strong><\/span> 72 65 20 64 69 <span style=\"text-decoration: underline;\"><strong>6e 6e<\/strong><\/span> 65 72  there  re dinner\r\n00000010: 65 6e 63 65 73 20 69 6e 20 74 68 69 73 20 <span style=\"text-decoration: underline;\"><strong>7a<\/strong><\/span> 69  ences in this zi\r\n00000020: 6c 65 <span style=\"text-decoration: underline;\"><strong>21<\/strong><\/span> 0a                                      le!.\r\n----\r\ntarget:\r\n\r\n00000000: <span style=\"text-decoration: underline;\"><strong>54<\/strong><\/span> 68 65 72 65 20 <span style=\"text-decoration: underline;\"><strong>61<\/strong><\/span> 72 65 20 64 69 <span style=\"text-decoration: underline;\"><strong>66 66<\/strong><\/span> 65 72  There are differ\r\n00000010: 65 6e 63 65 73 20 69 6e 20 74 68 69 73 20 <span style=\"text-decoration: underline;\"><strong>66<\/strong><\/span> 69  ences in this fi\r\n00000020: 6c 65 <span style=\"text-decoration: underline;\"><strong>2e<\/strong><\/span> 0a  \r\n<\/pre>\n<p>Observing the mask file you can see most bytes are null bytes. 
In contrast, the six bytes that differ between the source file and target file stand out because they are non-null.<\/p>\n<p><em>Try some different content and give it a whirl to get an idea of how this technology might work for you!<\/em><\/p>\n<h2>Evaluating the workflow<\/h2>\n<p>In my original post about bsdiff I described the use of a (possible) simple provenance note that might be part of a digital original&#8217;s ingest metadata:<\/p>\n<blockquote><p><em>Programmers Notepad 2.2.2300-rc used to convert plain-text file to UTF-8. UTF-8 byte-order-mark (0xEFBBBF) added to beginning of file \u2013 file size +3 bytes. Em-dash (0x97 ANSI) at position d1256 replaced by UTF-8 representation 0xE28094 at position d1256+3 bytes (d1259-d1261) \u2013 file size +2 bytes.<\/em><\/p><\/blockquote>\n<p>There are limitations to narrative, especially when it comes to technology. Cognitively, it takes time to parse; technically, not everyone can easily understand the language.<\/p>\n<p>The patch file created through bsdiff provides a true <em>mechanism<\/em> that can be adopted to demonstrate a verifiable link between an original source file and a modified, e.g. corrected, version of that file. In some cases a fix may enable rendering (display) to occur\u00a0<em>at all<\/em> and so it can be hugely consequential.<\/p>\n<p>Instead of just providing written documentation of something happening, e.g. 
somewhere in the PREMIS event detail in digital preservation systems still making use of that standard, we can provide a patch file, a note, instructions, and even tooling that links <code>broken_file_a<\/code> to <code>corrected_file_b<\/code> via <code>patch_file_c<\/code>.<\/p>\n<p>In times of wavering trust, being able to demonstrate an empirical link between two objects to the public makes a lot of sense.<\/p>\n<p>bsdiff also offers the opportunity to cut down on storage use on your servers; again, the patch file becomes the mechanism by which a broken file can be turned into a corrected file, and there can be ways to deliver a corrected high-fidelity copy of that file via bsdiff and bspatch to your users.<\/p>\n<h3>It won&#8217;t always be feasible<\/h3>\n<p>We are able to take a corrupt object and correct the errors, but this is only possible within a small number of digital preservation scenarios, e.g.<\/p>\n<ol>\n<li>we have the luxury of a high-quality original, e.g. from a digitization process.<\/li>\n<li>we have enough knowledge about an original file format and are able to correct any issues ourselves.<\/li>\n<\/ol>\n<p>Maybe there are other scenarios folks reading this post will intuit. Certainly, the use-case from the \u00d6sterreichische Mediathek could potentially be handled this way, and given the medium, there are likely to be significant savings in their preservation storage. How much could be saved would need to be measured and evaluated.<\/p>\n<h2>Conclusion<\/h2>\n<p>There are occasions where we have the ability to correct damage in a digital object, and on those occasions we may also need to decide: how much storage do we need to use in aid of the fix? How do we transmit information about what fix took place?<\/p>\n<p>bsdiff offers a way of preserving space, but even if we don&#8217;t preserve space, bsdiff provides us with a method of recording machine-readable provenance information about a correction. 
Such a note could exist between <code>broken_file_a<\/code> and <code>corrected_file_b<\/code>, creating a package of the following:<\/p>\n<ul>\n<li><code>broken_file_a<\/code><\/li>\n<li><code>corrected_file_b<\/code><\/li>\n<li><code>patch_file_c<\/code><\/li>\n<\/ul>\n<p>Users are then able to observe both the broken and corrected files, and to manually apply the patch to the broken file. This gives them measurable (cryptographically strong) evidence that the preservation action applied to a file is what it purported to be: the checksum of the file produced by applying <code>patch_file_c<\/code> to <code>broken_file_a<\/code> equals that of <code>corrected_file_b<\/code>.<\/p>\n<p>We have options, and as digital literacy and capability in the field grow, they are options we are able to pursue with confidence.<\/p>\n<p><a href=\"https:\/\/www.youtube.com\/watch?v=DATj6_jTZ2A\" target=\"_blank\" rel=\"noopener\">Give it a whirl<\/a>, and let me know how it goes if you decide to adopt it in your workflows.<\/p>\n<hr \/>\n<h2>Acknowledgements<\/h2>\n<p>Thanks to <a href=\"https:\/\/mastodon.social\/@p3ter\" target=\"_blank\" rel=\"noopener\">Peter B<\/a>., <a href=\"https:\/\/digipres.club\/@kieranjol\" target=\"_blank\" rel=\"noopener\">Kieran<\/a>, and <a href=\"https:\/\/digipres.club\/@joshuatj\" target=\"_blank\" rel=\"noopener\">JoshuaTJ<\/a> for <a href=\"https:\/\/digipres.club\/@joshuatj\/114231926600812394\" target=\"_blank\" rel=\"noopener\">this thread<\/a> on Mastodon inspiring parts of this revisit (especially around <a href=\"#diff-the-diffs\">diffing the diffs<\/a>).<\/p>\n<p>And thanks once again to <a class=\"u-url mention\" href=\"https:\/\/digipres.club\/@bitsgalore\" rel=\"mention\">@bitsgalore<\/a> and the DDAC for their content and triggering these in-depth analyses.<\/p>\n<div class=\"pvc_clear\"><\/div>\n<p id=\"pvc_stats_2138\" class=\"pvc_stats total_only  \" data-element-id=\"2138\" style=\"\"><i class=\"pvc-stats-icon small\" aria-hidden=\"true\"><svg 
aria-hidden=\"true\" focusable=\"false\" data-prefix=\"far\" data-icon=\"chart-bar\" role=\"img\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\" class=\"svg-inline--fa fa-chart-bar fa-w-16 fa-2x\"><path fill=\"currentColor\" d=\"M396.8 352h22.4c6.4 0 12.8-6.4 12.8-12.8V108.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v230.4c0 6.4 6.4 12.8 12.8 12.8zm-192 0h22.4c6.4 0 12.8-6.4 12.8-12.8V140.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v198.4c0 6.4 6.4 12.8 12.8 12.8zm96 0h22.4c6.4 0 12.8-6.4 12.8-12.8V204.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v134.4c0 6.4 6.4 12.8 12.8 12.8zM496 400H48V80c0-8.84-7.16-16-16-16H16C7.16 64 0 71.16 0 80v336c0 17.67 14.33 32 32 32h464c8.84 0 16-7.16 16-16v-16c0-8.84-7.16-16-16-16zm-387.2-48h22.4c6.4 0 12.8-6.4 12.8-12.8v-70.4c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v70.4c0 6.4 6.4 12.8 12.8 12.8z\" class=\"\"><\/path><\/svg><\/i> <img loading=\"lazy\" decoding=\"async\" width=\"16\" height=\"16\" alt=\"Loading\" src=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/plugins\/page-views-count\/ajax-loader-2x.gif\" border=0 \/><\/p>\n<div class=\"pvc_clear\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>I introduced bsdiff in <a href=\"https:\/\/openpreservation.org\/blogs\/bsdiff-technological-solutions-reversible-pre-conditioning-complex-binary-objects\/\" target=\"_blank\" rel=\"noopener\">a blog in 2014<\/a>. bsdiff compares the differences between two files, e.g. 
<code>broken_file_a<\/code> and <code>corrected_file_b<\/code> and creates a <code>patch<\/code> that can be applied to <code>broken_file_a<\/code> to generate a byte-for-byte match for <code>corrected_file_b<\/code>.<\/p>\n<p>On the face of it, in an archive, we probably only care about <code>corrected_file_2<\/code> and so why would we care about a technology that patches a broken file?<\/p>\n<p>In all of the use-cases we can imagine the primary reasons are cost savings and removing redundancy in file storage or transmission of digital information. In one very special case we can record the difference between <code>broken_file_a<\/code> and <code>corrected_file_b<\/code> and give users a totally objective method of recreating <code>corrected_file_b<\/code> from <code>broken_file_a<\/code> providing 100% verifiable proof of the migration pathway taken between the two files.<\/p>\n<div class=\"link-more\"><a href=\"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &ldquo;Revisiting bsdiff as a tool for digital preservation&rdquo;<\/span>&hellip;<\/a><\/div>\n<div class=\"pvc_clear\"><\/div>\n<p id=\"pvc_stats_2138\" class=\"pvc_stats total_only  \" data-element-id=\"2138\" style=\"\"><i class=\"pvc-stats-icon small\" aria-hidden=\"true\"><svg aria-hidden=\"true\" focusable=\"false\" data-prefix=\"far\" data-icon=\"chart-bar\" role=\"img\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\" class=\"svg-inline--fa fa-chart-bar fa-w-16 fa-2x\"><path fill=\"currentColor\" d=\"M396.8 352h22.4c6.4 0 12.8-6.4 12.8-12.8V108.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v230.4c0 6.4 6.4 12.8 12.8 12.8zm-192 0h22.4c6.4 0 12.8-6.4 12.8-12.8V140.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v198.4c0 6.4 6.4 12.8 12.8 12.8zm96 0h22.4c6.4 0 12.8-6.4 12.8-12.8V204.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v134.4c0 6.4 6.4 
12.8 12.8 12.8zM496 400H48V80c0-8.84-7.16-16-16-16H16C7.16 64 0 71.16 0 80v336c0 17.67 14.33 32 32 32h464c8.84 0 16-7.16 16-16v-16c0-8.84-7.16-16-16-16zm-387.2-48h22.4c6.4 0 12.8-6.4 12.8-12.8v-70.4c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v70.4c0 6.4 6.4 12.8 12.8 12.8z\" class=\"\"><\/path><\/svg><\/i> <img loading=\"lazy\" decoding=\"async\" width=\"16\" height=\"16\" alt=\"Loading\" src=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/plugins\/page-views-count\/ajax-loader-2x.gif\" border=0 \/><\/p>\n<div class=\"pvc_clear\"><\/div>\n","protected":false},"author":1,"featured_media":2688,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":3,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"federated","footnotes":""},"categories":[86,114,3],"tags":[308,59,310,312,392,301,311,298,391,300,183,303,304,147,103,396,115,71,394,302,17,313,397,272,299,395,195,285,305,393],"class_list":["post-2138","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-archives","category-digital-literacy","category-digital-preservation","tag-ac3","tag-archives","tag-audio","tag-audiovisual","tag-audit","tag-authenticity","tag-av","tag-bash","tag-bsdiff","tag-checksums","tag-code4lib","tag-corruption","tag-corruption-index","tag-digipres","tag-digital-archiving","tag-digital-forensics","tag-digital-literacy","tag-digital-preservation","tag-digital-storage","tag-diplomatics","tag-file-formats","tag-glitch","tag-glitch-audio","tag-glitchart","tag-integrity","tag-preservation-analysis","tag-preservation-metadata","tag-provenance","tag-sensitivity-index","tag-storage","entry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Revisiting bsdiff as a tool for digital 
preservation - ross spencer :: exponentialdecay.digipres :: blog<\/title>\n<meta name=\"description\" content=\"I first blogged about bsdiff in 2014. How does it stand up in terms of its potential and ease of use in 2025? Let&#039;s take a look.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Revisiting bsdiff as a tool for digital preservation - ross spencer :: exponentialdecay.digipres :: blog\" \/>\n<meta property=\"og:description\" content=\"I first blogged about bsdiff in 2014. How does it stand up in terms of its potential and ease of use in 2025? Let&#039;s take a look.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/\" \/>\n<meta property=\"og:site_name\" content=\"ross spencer :: exponentialdecay.digipres :: blog\" \/>\n<meta property=\"article:published_time\" content=\"2025-09-28T21:00:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-12-01T16:59:33+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/00055-ddac-y2k-snippet.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1280\" \/>\n\t<meta property=\"og:image:height\" content=\"640\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Ross Spencer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@beet_keeper\" \/>\n<meta name=\"twitter:site\" content=\"@beet_keeper\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ross Spencer\" 
\/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/\"},\"author\":{\"name\":\"Ross Spencer\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/#\\\/schema\\\/person\\\/4cae0a954400f42b9c1b70c699837716\"},\"headline\":\"Revisiting bsdiff as a tool for digital preservation\",\"datePublished\":\"2025-09-28T21:00:30+00:00\",\"dateModified\":\"2025-12-01T16:59:33+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/\"},\"wordCount\":1987,\"commentCount\":13,\"publisher\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/#\\\/schema\\\/person\\\/4cae0a954400f42b9c1b70c699837716\"},\"image\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/00055-ddac-y2k-snippet.png\",\"keywords\":[\"ac3\",\"Archives\",\"audio\",\"audiovisual\",\"Audit\",\"authenticity\",\"av\",\"Bash\",\"bsdiff\",\"checksums\",\"Code4Lib\",\"corruption\",\"corruption index\",\"digipres\",\"Digital Archiving\",\"Digital Forensics\",\"digital literacy\",\"Digital Preservation\",\"Digital Storage\",\"diplomatics\",\"File Formats\",\"glitch\",\"glitch audio\",\"GlitchArt\",\"integrity\",\"Preservation Analysis\",\"Preservation Metadata\",\"provenance\",\"sensitivity index\",\"Storage\"],\"articleSection\":[\"Archives\",\"Digital Literacy\",\"Digital 
Preservation\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/\",\"url\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/\",\"name\":\"Revisiting bsdiff as a tool for digital preservation - ross spencer :: exponentialdecay.digipres :: blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/00055-ddac-y2k-snippet.png\",\"datePublished\":\"2025-09-28T21:00:30+00:00\",\"dateModified\":\"2025-12-01T16:59:33+00:00\",\"description\":\"I first blogged about bsdiff in 2014. How does it stand up in terms of its potential and ease of use in 2025? 
Let's take a look.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#primaryimage\",\"url\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/00055-ddac-y2k-snippet.png\",\"contentUrl\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/00055-ddac-y2k-snippet.png\",\"width\":1280,\"height\":640,\"caption\":\"Image shows two layered waveforms, one a corrupt waveform and the other a good original. The corrupt form is in red and the uncorrupt one is green.\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/bsdiff-as-a-tool-for-digital-preservation\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Revisiting bsdiff as a tool for digital preservation\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/\",\"name\":\"ross spencer :: exponentialdecay.digipres :: blog\",\"description\":\"Digital preservation analyst, researcher, and software 
developer\",\"publisher\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/#\\\/schema\\\/person\\\/4cae0a954400f42b9c1b70c699837716\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/#\\\/schema\\\/person\\\/4cae0a954400f42b9c1b70c699837716\",\"name\":\"Ross Spencer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/06\\\/avatar-scaled.png\",\"url\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/06\\\/avatar-scaled.png\",\"contentUrl\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/06\\\/avatar-scaled.png\",\"width\":2560,\"height\":2560,\"caption\":\"Ross Spencer\"},\"logo\":{\"@id\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/06\\\/avatar-scaled.png\"},\"description\":\"Digital preservation domain expert and full-stack software developer.\",\"sameAs\":[\"http:\\\/\\\/www.exponentialdecay.co.uk\\\/blog\",\"https:\\\/\\\/www.instagram.com\\\/b33tk33p3r\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/in\\\/ross-spencer-b6b9b758\\\/\",\"https:\\\/\\\/x.com\\\/beet_keeper\"],\"url\":\"https:\\\/\\\/exponentialdecay.co.uk\\\/blog\\\/author\\\/exponentialdecay\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Revisiting bsdiff as a tool for digital preservation - ross spencer :: exponentialdecay.digipres :: blog","description":"I first blogged about bsdiff in 2014. How does it stand up in terms of its potential and ease of use in 2025? 
Let's take a look.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/","og_locale":"en_US","og_type":"article","og_title":"Revisiting bsdiff as a tool for digital preservation - ross spencer :: exponentialdecay.digipres :: blog","og_description":"I first blogged about bsdiff in 2014. How does it stand up in terms of its potential and ease of use in 2025? Let's take a look.","og_url":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/","og_site_name":"ross spencer :: exponentialdecay.digipres :: blog","article_published_time":"2025-09-28T21:00:30+00:00","article_modified_time":"2025-12-01T16:59:33+00:00","og_image":[{"width":1280,"height":640,"url":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/00055-ddac-y2k-snippet.png","type":"image\/png"}],"author":"Ross Spencer","twitter_card":"summary_large_image","twitter_creator":"@beet_keeper","twitter_site":"@beet_keeper","twitter_misc":{"Written by":"Ross Spencer","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#article","isPartOf":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/"},"author":{"name":"Ross Spencer","@id":"https:\/\/exponentialdecay.co.uk\/blog\/#\/schema\/person\/4cae0a954400f42b9c1b70c699837716"},"headline":"Revisiting bsdiff as a tool for digital preservation","datePublished":"2025-09-28T21:00:30+00:00","dateModified":"2025-12-01T16:59:33+00:00","mainEntityOfPage":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/"},"wordCount":1987,"commentCount":13,"publisher":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/#\/schema\/person\/4cae0a954400f42b9c1b70c699837716"},"image":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#primaryimage"},"thumbnailUrl":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/00055-ddac-y2k-snippet.png","keywords":["ac3","Archives","audio","audiovisual","Audit","authenticity","av","Bash","bsdiff","checksums","Code4Lib","corruption","corruption index","digipres","Digital Archiving","Digital Forensics","digital literacy","Digital Preservation","Digital Storage","diplomatics","File Formats","glitch","glitch audio","GlitchArt","integrity","Preservation Analysis","Preservation Metadata","provenance","sensitivity index","Storage"],"articleSection":["Archives","Digital Literacy","Digital Preservation"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/","url":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/","name":"Revisiting 
bsdiff as a tool for digital preservation - ross spencer :: exponentialdecay.digipres :: blog","isPartOf":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#primaryimage"},"image":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#primaryimage"},"thumbnailUrl":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/00055-ddac-y2k-snippet.png","datePublished":"2025-09-28T21:00:30+00:00","dateModified":"2025-12-01T16:59:33+00:00","description":"I first blogged about bsdiff in 2014. How does it stand up in terms of its potential and ease of use in 2025? Let's take a look.","breadcrumb":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#primaryimage","url":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/00055-ddac-y2k-snippet.png","contentUrl":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/09\/00055-ddac-y2k-snippet.png","width":1280,"height":640,"caption":"Image shows two layered waveforms, one a corrupt waveform and the other a good original. 
The corrupt form is in red and the uncorrupt one is green."},{"@type":"BreadcrumbList","@id":"https:\/\/exponentialdecay.co.uk\/blog\/bsdiff-as-a-tool-for-digital-preservation\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/exponentialdecay.co.uk\/blog\/"},{"@type":"ListItem","position":2,"name":"Revisiting bsdiff as a tool for digital preservation"}]},{"@type":"WebSite","@id":"https:\/\/exponentialdecay.co.uk\/blog\/#website","url":"https:\/\/exponentialdecay.co.uk\/blog\/","name":"ross spencer :: exponentialdecay.digipres :: blog","description":"Digital preservation analyst, researcher, and software developer","publisher":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/#\/schema\/person\/4cae0a954400f42b9c1b70c699837716"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/exponentialdecay.co.uk\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/exponentialdecay.co.uk\/blog\/#\/schema\/person\/4cae0a954400f42b9c1b70c699837716","name":"Ross Spencer","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/06\/avatar-scaled.png","url":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/06\/avatar-scaled.png","contentUrl":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/06\/avatar-scaled.png","width":2560,"height":2560,"caption":"Ross Spencer"},"logo":{"@id":"https:\/\/exponentialdecay.co.uk\/blog\/wp-content\/uploads\/2025\/06\/avatar-scaled.png"},"description":"Digital preservation domain expert and full-stack software 
developer.","sameAs":["http:\/\/www.exponentialdecay.co.uk\/blog","https:\/\/www.instagram.com\/b33tk33p3r\/","https:\/\/www.linkedin.com\/in\/ross-spencer-b6b9b758\/","https:\/\/x.com\/beet_keeper"],"url":"https:\/\/exponentialdecay.co.uk\/blog\/author\/exponentialdecay\/"}]}},"views":1907,"_links":{"self":[{"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/posts\/2138","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/comments?post=2138"}],"version-history":[{"count":23,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/posts\/2138\/revisions"}],"predecessor-version":[{"id":2709,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/posts\/2138\/revisions\/2709"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/media\/2688"}],"wp:attachment":[{"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/media?parent=2138"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/categories?post=2138"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/exponentialdecay.co.uk\/blog\/wp-json\/wp\/v2\/tags?post=2138"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}