Hidden Figures: disguising images from machines

ruthmoog - Apr 26 - Dev Community

Artists who share their work online are frustrated by the theft of their style and content by machine learning (ML) tools. Artworks can be used to train generative artificial intelligence (AI) without permission and, in the absence of guidance or accountability, tech giants have even investigated buying publishing houses to gain access to non-public content.

There is potential in these fashionable technologies as a force for good, but we don't have established ethical policies in place to make them fair - so artists and developers are finding ways to disrupt malpractice and artwork theft.

Poison attacks

Enter Glaze and Nightshade, tools which can manipulate content so it will disrupt or poison ML training data.

Nightshade can help deter model trainers who disregard copyrights, opt-out lists, and do-not-scrape/robots.txt directives
- What is Nightshade? The Glaze Project Team
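
The robots.txt directives mentioned in that quote are voluntary: a crawler has to choose to read and obey them. As a rough sketch of what a well-behaved crawler does before downloading an image, the snippet below uses Python's standard `urllib.robotparser`. The robots.txt content, the crawler name `ExampleAIBot`, and the example.com URL are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an artist's site.
# "ExampleAIBot" is an invented crawler name, not a real bot.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/gallery/sunflower.png"
    print(may_fetch("ExampleAIBot", url))  # the opted-out crawler
    print(may_fetch("SomeBrowser", url))   # any other user agent
```

Nothing in this check is enforced by the server. A scraper that never calls it, or ignores the answer, still gets the file, which is the gap the poisoning tools try to fill.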

Glaze works by making changes in the digital image file, which are not obvious to the human eye, but are significant enough to distort the input to an AI training model. That can result in a picture in one style being interpreted as a different style, e.g. you see a painting in the style of Van Gogh, but the AI 'sees' a picture in the style of Man Ray.
This works (for now!) on artificial neural networks which use a variational autoencoder (VAE) architecture, but it doesn't work on networks that don't rely on VAE.
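
To give a feel for what "not obvious to the human eye" means in pixel terms, here is a toy sketch that shifts every pixel of a greyscale image by at most a couple of brightness levels out of 255. This is not how Glaze computes its changes: the real tool chooses them deliberately to mislead a model, whereas the random noise here is only a stand-in to show how small the budget can be. The function names, the `epsilon` budget, and the synthetic test image are all assumptions made for the example.

```python
import random


def perturb(image, epsilon=2, seed=0):
    """Return a copy of image with each pixel shifted by at most epsilon levels.

    image is a list of rows of 8-bit greyscale values (0-255).
    Random noise is a placeholder; it would not fool a real model.
    """
    rng = random.Random(seed)
    out = []
    for row in image:
        new_row = []
        for value in row:
            shifted = value + rng.randint(-epsilon, epsilon)
            # Clamp back into the valid 8-bit range.
            new_row.append(min(255, max(0, shifted)))
        out.append(new_row)
    return out


def max_abs_diff(a, b):
    """Largest per-pixel difference between two same-sized images."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Synthetic 32x32 gradient standing in for an artwork.
image = [[(x * 8 + y * 4) % 256 for x in range(32)] for y in range(32)]
cloaked = perturb(image, epsilon=2)

# The change never exceeds the budget, so the two images look alike to a person.
print(max_abs_diff(image, cloaked) <= 2)
```

The point of the sketch is only the constraint: every pixel stays within a tiny distance of the original, yet the file a model ingests is no longer the same data.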

Nightshade works in a similar way to Glaze - they're from the same team - but it essentially gaslights the AI model into learning misinformation, and this is a deterrent for training on private content.
However, it does affect the lightness of the original image to the human eye. For example, you see a 'shaded' photograph of a sunflower, but the AI could 'see' a photograph of a sunflower alarm clock and, with enough data, could learn that sunflowers have clock hands and bells.

Nightshade's goal is not to break models, but to increase the cost of training on unlicensed data, such that licensing images from their creators becomes a viable alternative.
- What is Nightshade? The Glaze Project Team

Is it worth it?

Keep in mind you're fighting fire with fire in terms of compute:

Glaze uses around 5 Gb of memory and a substantial amount of CPU computing. We are working on reducing these numbers.
- User guide The Glaze Project Team

Ultimately it depends on how ignoble and determined the party trying to crawl the artwork is. Nightshade will be most effective against large volumes, and Glaze is a temporary measure which could become ineffective long-term. There's nothing to stop users applying both measures, but there is little else to do short of keeping content off the web (and good luck with that). If the content is there, it cannot also not be there.
