Copyright Storm — Authorship in the age of AI
PS: English is not my native language, so some of my phrasing may sound a bit odd.
#thisisnotastorm
This is not a storm photo. It is also not a storm painting. It was generated with an artificial intelligence technique known as a GAN (Generative Adversarial Network). A very short, simplified explanation: GANs are statistical computer vision models capable of creating images similar to those in a dataset (for the case debated in this article, let's just say a dataset is a collection of images). The more the images in the dataset resemble each other, the more likely a GAN is to create output that looks like them. The best results usually need several thousand images, but as the technology improves, fewer and fewer images are necessary for GAN outputs to look just like the originals.
To illustrate why photographers and artists in general need to pay attention to the advances in artificial intelligence, I decided to make a provocation: I downloaded around 6,500 images from photographers specializing in storm chasing. To do so, I just typed “storm chasing photography” into Google, found an article that highlighted 5 of these artists, went to their websites and downloaded all the images I could. The websites were very clear that all images are copyrighted and available for license purchase. I got around 1,500 high quality photos using a simple browser extension. Not satisfied with the quantity, I checked the “stormchasing” hashtag on Instagram and used it to find accounts that specialize in the topic. Then I downloaded all the images from these accounts and removed the few that were not related to storms from my dataset. The hardest part was getting help to install an Instagram image scraper, which allowed me to download all images from one account or hashtag with a single command.
282 images from @cataclysms_of_nature, 682 from @willeadesphotography, 457 from @world_of_thunders, 413 from @adamkylejackson and more. It took me a couple of hours to gather the images, then I sent the dataset to Runway, a very user friendly platform that runs AI algorithms in the cloud and has been making machine learning much more accessible to people. In 8 hours, the GAN was trained on the storms dataset, while I was happily sleeping. It cost around 30 dollars to generate these images on Runway, as the computing power necessary to run many AI models is still very expensive. There are free options such as Google Colab, but they are not as user friendly. If I had trained on 50k photos of storms, the output would probably look just like a photo, as in these projects by Nvidia.
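For anyone who wants to check the numbers, here is a quick tally of the dataset as described above. It is only a rough sketch: I am assuming the total is about 6,500 images and the website photos about 1,500, and the variable names are just for illustration. Whatever is left over belongs to the accounts covered by "and more".

```python
# Figures taken from the text; both totals below are approximate ("around").
approx_total = 6500      # assumed overall dataset size
website_photos = 1500    # photos from the five photographers' websites

# Instagram accounts named explicitly in the text
instagram_named = {
    "@cataclysms_of_nature": 282,
    "@willeadesphotography": 682,
    "@world_of_thunders": 457,
    "@adamkylejackson": 413,
}

named_instagram_total = sum(instagram_named.values())
accounted_for = website_photos + named_instagram_total
other_accounts = approx_total - accounted_for  # the "and more" part

print(f"Named Instagram accounts: {named_instagram_total}")
print(f"Websites + named accounts: {accounted_for}")
print(f"Roughly left for other accounts: {other_accounts}")
```

So the four named accounts and the websites explain only about half of the dataset; the rest came from the other storm chasing accounts found through the hashtag.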
As an artist myself, the idea that I could do this and sell the resulting images as my own work is quite shocking. I don’t think it makes sense for me to sign these images as mine, as I was not the one who learned how to create these compositions, nor was I the one who took the photos. I can barely claim to have selected these images, as the hashtags and the websites listing the photographers made it all too easy to find some of the best storm chasers. There are many tools that can find images based on similarity, so dataset creation can also be automated with well-designed code (I highly recommend the same.energy website!). It makes much more sense for me to call this a collaborative work, a crowdsourced artwork, as the quality of the final images depends on the quality of the dataset.
AI is more than a tool. It’s so powerful in its own creative skills that one could call it a collaborator. But giving so much credit to AI can also be tricky, as I am the one responsible for feeding the GAN with the work of other artists, who did not consent to this. They would most likely feel harmed by a generative system based on their work that can offer the same kind of product they love and sell: images of storms. However, as knowledge about this kind of technique is not widespread and data is so easily available on the internet, there are not enough incentives for transparency, and even fewer to pay any sort of royalties for data. Acknowledging the artists that are part of a dataset should be an obvious courtesy, but if most people don’t even know what a dataset is, how can we enforce this?
Yet the issues around AI and copyright are far more complicated than artists' ethics and their transparency about dataset practices. More recent models released by OpenAI and other research groups are trained on millions of images collected from the internet. With the latest combinations of CLIP text2image models, people can just give a text prompt, such as ‘a landscape in the style of MC Escher’, and the AI is capable of making something original. The results are fascinating. The models are already delivering images with interesting compositions and complex backgrounds, placing hats and adornments in the right places, and interpreting abstract phrases. I say interpret because the quality and originality of the images go far beyond the sort of pixel statistics delivered by the previous GAN models. These newer models are still experimental and haven't been released for commercial applications, but independent researchers and artists have managed to use what was published by OpenAI and are experimenting with it. Below are some of the images that resulted from prompts I came up with.
Our notions of authorship and copyright make little sense with such techniques. It seems absurd that one could be sued for using a single image commercially without a proper license, while one could train a model using the body of work of any living artist (hence hundreds or thousands of images) as a dataset. Can we really say it is fair use of intellectual property if we know this will have an economic impact on all sorts of creators? The generated images have compositions and colors similar to the originals, but they are different (ranging from very similar to an abstract resemblance), so it’s quite possible that even the original artist wouldn’t notice the connection with their own work, especially if they are not acquainted with the particular aesthetics of AI models. It’s very hard to enforce copyright laws over such creations. Copyright laws as they are today are often not beneficial for artists either, but completely abolishing them would probably be even worse. Regulation is too slow to catch up with tech progress and most likely it’s going to be ineffective anyway. Or it will stifle innovation for some companies while someone else will surely continue the research (just check below what is going on in China). So artists need to try to understand what AI is capable of doing, because very soon they will start to feel its impact in their own market.
When photography first became accessible, it radically changed the field of image creation and even the way painters thought about images. Painting didn’t die because of photography, but it surely changed. As AI deep fakes become widespread, it will be essential to have some kind of digital certificate to guarantee that a photo or video was really shot with a camera at a specific location and time. Making money out of licensing stock photos will become obsolete. As reproducing any image style will be a simple task, artists might focus more on process, materials or performance, or they will do more complex simulations in mixed reality environments. Personally, I see a lot of potential for independent animation and AI. People find their ways of adapting and society changes, though for many professionals (accountants, lawyers, artists, engineers and almost everyone else as well), AI will probably bring a crisis to their job market (Yuval Harari has been pretty vocal about it).
All of us are contributing to AI development nowadays. Everything we do on the internet can be used as data for training some kind of AI model. We also use online platforms for our own needs, but there’s a clear imbalance between the people who give their data and the ones, mostly big tech companies, who not only give data but can also harvest massive amounts of it and profit from it. When artists upload content to Instagram they pretty much give Instagram the right to do whatever it wants with it (also, artists have already played with fair use of Instagram images and been successful). The crypto world promises to protect content creators by allowing them to own whatever they share, so that looks like an important path to follow, though it’s all very experimental so far and NFT platforms are already full of copy minters and other copyright mazes.
More than any specific technological change, people’s perception of how content is created and spread is key to ensuring a bright future for creators. I deeply support Gene Kogan’s approach to AI as collective intelligence, and how he highlights the potential of NFTs for massive collaborations: from the dataset level, which includes not only social media but also public domain archives (the best sources to create amazing datasets! we have to deeply thank their staff!), to coders, marketers and whoever else is involved in a project. It doesn’t mean artists can’t produce individually anymore, nor that anyone working with AI necessarily needs to perceive and sell their work as a collective creation. But if we highlight collaboration instead of competition, we might go much further and live better :) . With DAOs (Decentralized Autonomous Organizations) we could come up with different models of organizing ourselves, more autonomous and horizontal.
I do think it’s dangerous to treat AI generated images the same way we treat the output of any other image creation tool, including collage. I’d love to write another long article about the history of authorship and the concept of genius, and how it changed from the time when artists worked in cooperative guilds (and wouldn’t even sign the work) to the stereotype of the eccentric, troubled creator who is a great genius isolated from everyone else. A balance between transparency and privacy, plus education and decentralization, is key to preventing the worst outcomes of technological development. We can’t afford to treat technologies as black boxes. I think the best thing artists can do to protect themselves and thrive is to really pay attention to what these technologies are and learn about them, from their scary impacts to their beautiful potential.
If you want to learn more about AI and art creation I highly recommend the following sites:
ml4a.net
Lex Fridman podcast: not so much for art, but for a lot of great conversations with scientists and philosophers (and some guests I don't particularly appreciate, but it's still worth a look!)
And follow artists and researchers working with AI on Twitter, which is by far the most up-to-date platform for this kind of discussion.