Extended Creativity and the Data, Users, Tasks of AI-Generated Art
Problem
As soon as computers gained graphical capabilities, people began exploring how to generate and display images algorithmically. Yet from procedural generation and the early days of AARON until now, we have never seen such an astonishing development in the "artistic" capabilities of computers as in the past year. Multi-modal neural networks, mostly represented by CLIP, have ushered in a new era of text-to-image technology, starting with DALL-E and followed by Disco Diffusion, Midjourney, Stable Diffusion, and many others. The ability to generate good-quality images quickly and purely from language is not to be taken lightly, from both a technological and an artistic point of view.
These applications all run models that essentially distill a huge amount of image and text data into a latent space. This latent space contains enough knowledge about our vision, language, concepts (both textual and symbolic), art, and composition to be useful in a new array of tasks, which we have just begun to explore. We can already speculate with some certainty that artists will have to adapt and become either users or competitors of these technologies. At the same time, these technologies offer clear benefits for the democratization of art and for what we could call extended creativity: the potential for machines to collaborate in human expression.
Image prompt: "Kanye West by Jan Toorop and Viktor Vasnetsov", 50 steps @ stable-diffusion-v1-4
Image prompt: "Eulenspiegel by Robert Crumb", 50 steps @ stable-diffusion-v1-4
Aim
There are currently exciting opportunities to explore topics surrounding text-to-image technologies, multi-modal neural networks and their contents, AI art, and so on. From the point of view of Visualization, Visual Analytics, and HCI, we can organize this research along three main dimensions:
- Data: what is inside these models? What is their relation to the original data? How can we visualize this type of data (large collections of images, latent spaces)?
- Users: who will be the users of these new technologies, and how can we adapt the technologies to their needs, epistemologies, etc.?
- Tasks: what new types of tasks, artistic or otherwise, do these technologies afford?
Other information
This paper discusses the larger context of these ideas and contains a Miro board with extensive content.