Image of a children's learn-to-code textbook by Usborne Books. The page shows a snippet from a computer game in BASIC called "Escape". The illustration is of three menacing-looking cyborgs.

Informed consent: considering steganographic techniques to fingerprint Generative AI output

Artificial intelligence (AI) is a polarizing topic. For every reasoned assessment of the technology and its potential to make some of our smaller, more onerous, or more repetitive tasks easier, there are probably 100 reactive pieces predicting some radical overhaul of societal norms. These range from the service industry receiving new intakes of out-of-work software developers to laypeople taking on roles traditionally occupied by those with a college education, if they just start asking their AI the right questions ¯\_(ツ)_/¯

The amount of AI propaganda is draining, and the reaction is spread across the board too: some cheerleading, some decrying, and plenty taking the time to offer skilled and nuanced rebuttals or suggestions for improvement.

I find myself largely trying to stay out of the conversations. Much like the blockchain conversations of 8 years ago, it will take another half decade for the hype cycle to plateau before we can see where AI can truly complement our work.

One part of the conversation that is increasingly hard to ignore is being informed about when AI has been used to generate text or images. Knowing, or having the tools to know, is what I feel is most important.

How can we be better informed about when AI is used, so that we are better prepared, as consumers, to receive and understand content?

In this blog I want to explore the potential for steganographic techniques to be used in the output of AI to fingerprint content and provide a way for front-end mechanisms to identify it, much as we identify file formats using magic numbers. Users can then be given the chance of informed consent: the opportunity to opt in to, or out of, engaging with AI content.
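To make the magic-number analogy concrete, here is a minimal sketch of one possible approach: hiding a short marker at the start of generated text using zero-width Unicode characters, then checking for it the way a format identification tool checks leading bytes. The marker value `AI`, the choice of characters, and the function names `embed` and `detect` are all assumptions made for illustration; this is not an existing standard, and such a marker is trivially stripped.

```python
# Sketch only: a hypothetical zero-width "magic number" for text.
ZW0 = "\u200b"  # zero-width space, used here to encode bit 0
ZW1 = "\u200c"  # zero-width non-joiner, used here to encode bit 1
MAGIC = b"AI"   # hypothetical marker, analogous to a file signature


def embed(text: str, marker: bytes = MAGIC) -> str:
    """Prefix text with the marker encoded as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in marker)
    hidden = "".join(ZW1 if bit == "1" else ZW0 for bit in bits)
    return hidden + text


def detect(text: str, marker: bytes = MAGIC) -> bool:
    """Return True if text starts with the encoded marker."""
    n = len(marker) * 8
    prefix = text[:n]
    if len(prefix) < n or any(ch not in (ZW0, ZW1) for ch in prefix):
        return False
    bits = "".join("1" if ch == ZW1 else "0" for ch in prefix)
    decoded = bytes(int(bits[i:i + 8], 2) for i in range(0, n, 8))
    return decoded == marker


if __name__ == "__main__":
    stamped = embed("A generated paragraph.")
    print(detect(stamped))                    # marker present
    print(detect("A hand-written paragraph."))  # marker absent
```

A front-end could run a check like `detect` before rendering and offer the reader a choice. A robust scheme would need to survive copy-and-paste, editing, and deliberate removal, which this sketch does not attempt.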


"Bei der Buche", a landscape architectural installation by landscape architect and photographer Karina Raeck. Created in 1993 in the Wartberg area north-east of Stuttgart.

wikidata + mediawiki = wikidata + provenance == wikiprov

Today I want to showcase a Wikidata proof of concept that I developed as part of my work integrating Siegfried and Wikidata.

That work is wikiprov, a utility to augment Wikidata results in JSON with the Wikidata revision history.

For siegfried it means that we can showcase the source of the results being returned by an identification without having to go directly back to Wikidata. This might mean more exposure for individuals contributing to Wikidata. We also provide access to a standard permalink where records contributing to a format identification are fixed at their last edit. Because Wikidata is more mutable than a resource like PRONOM, this gives us the best chance of understanding differences in results if we are comparing siegfried+Wikidata results side-by-side.
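As a rough illustration of what a permalink fixed at a given edit looks like, the sketch below builds one using MediaWiki's standard `index.php?title=...&oldid=...` URL form. The function name `wikidata_permalink` and the revision ID in the usage example are hypothetical, and this is not wikiprov's own code or output format.

```python
from urllib.parse import urlencode


def wikidata_permalink(qid: str, revision_id: int) -> str:
    """Build a URL that pins a Wikidata entity to one specific revision."""
    # MediaWiki serves a fixed revision of a page when `oldid` is supplied.
    query = urlencode({"title": qid, "oldid": revision_id})
    return f"https://www.wikidata.org/w/index.php?{query}"


if __name__ == "__main__":
    # The revision ID here is a placeholder, not a real edit.
    print(wikidata_permalink("Q42", 123))
```

Recording a link like this next to each identification result means two runs can be compared against exactly the record revisions they were built from.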

I am interested to hear your thoughts on the results of the work. Let's go into more detail below.
