Examining the Text-to-Image Community of Practice: Why and How do People Prompt Generative AIs? | Proceedings of the 15th Conference on Creativity and Cognition

research-article

Author: Téo Sanchez

Published: 19 June 2023

Metrics

  • Total Citations: 11
  • Total Downloads: 670
  • Downloads (Last 12 months): 670
  • Downloads (Last 6 weeks): 58

    Abstract

    Image generation has gained popularity with machine learning (ML) models that generate images from text, fuelling new online communities of practice. This work explores the sociology, motivations, and usages of AI art hobbyists. We analyzed an online questionnaire answered by 64 practitioners and a dataset of user prompts sent to the Stable Diffusion generative model. Our findings suggest that text-to-image (TTI) generation is a recreational activity mainly conducted by narrow socio-demographic groups who use auxiliary techniques across platforms and beyond request-response interactions. Inherent model limitations and finding suitable prompt formulations are the main obstacles practitioners face. To conduct the user prompt analysis, we created a taxonomy and a corresponding ML model capable of recognizing the semantic content of unseen prompts. The prompt analysis revealed that artist names are the main specifier used besides the main subject, often in sequences. Finally, we discuss the design and socio-technical implications of our work for creativity support.

    Index Terms

    1. Examining the Text-to-Image Community of Practice: Why and How do People Prompt Generative AIs?

      1. Computing methodologies

        1. Artificial intelligence

          1. Natural language processing

            1. Information extraction

        2. Human-centered computing

          1. Collaborative and social computing

            1. Empirical studies in collaborative and social computing

        Published In

        C&C '23: Proceedings of the 15th Conference on Creativity and Cognition

        June 2023

        564 pages

        ISBN:9798400701801

        DOI:10.1145/3591196

        Copyright © 2023 ACM.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [emailprotected].

        Sponsors

        • SIGCHI: ACM Special Interest Group on Computer-Human Interaction

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 19 June 2023

        Author Tags

        1. community of practice
        2. text-to-image generation

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Funding Sources

        • Banque Publique d'Investissement France

        Conference

        C&C '23

        Sponsor:

        • SIGCHI

        C&C '23: Creativity and Cognition

        June 19 - 21, 2023

        Virtual Event, USA

        Acceptance Rates

        Overall Acceptance Rate 108 of 371 submissions, 29%

        Cited By

        • Peng X, Koch J, Mackay W (2024). DesignPrompt: Using Multimodal Interaction for Design Exploration with Generative AI. Designing Interactive Systems Conference, pp. 804-818. DOI: 10.1145/3643834.3661588. Online publication date: 1-Jul-2024.

          https://dl.acm.org/doi/10.1145/3643834.3661588

        • Shelby R, Srinivasan R, Burgdorf K, Lena J, Rostamzadeh N (2024). Creative ML Assemblages: The Interactive Politics of People, Processes, and Products. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1, pp. 1-30. DOI: 10.1145/3637315. Online publication date: 26-Apr-2024.

          https://dl.acm.org/doi/10.1145/3637315

        • Palani S, Ramos G (2024). Evolving Roles and Workflows of Creative Practitioners in the Age of Generative AI. Proceedings of the 16th Conference on Creativity & Cognition, pp. 170-184. DOI: 10.1145/3635636.3656190. Online publication date: 23-Jun-2024.

          https://dl.acm.org/doi/10.1145/3635636.3656190

        • Domínguez Hernández A, Krishna S, Perini A, Katell M, Bennett S, Borda A, Hashem Y, Hadjiloizou S, Mahomed S, Jayadeva S, Aitken M, Leslie D (2024). Mapping the individual, social and biospheric impacts of Foundation Models. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pp. 776-796. DOI: 10.1145/3630106.3658939. Online publication date: 3-Jun-2024.

          https://dl.acm.org/doi/10.1145/3630106.3658939

        • Torricelli M, Martino M, Baronchelli A, Aiello L (2024). The Role of Interface Design on Prompt-mediated Creativity in Generative AI. Proceedings of the 16th ACM Web Science Conference, pp. 235-240. DOI: 10.1145/3614419.3644000. Online publication date: 21-May-2024.

          https://dl.acm.org/doi/10.1145/3614419.3644000

        • Mahdavi Goloujeh A, Sullivan A, Magerko B (2024). The Social Construction of Generative AI Prompts. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-7. DOI: 10.1145/3613905.3650947. Online publication date: 11-May-2024.

          https://dl.acm.org/doi/10.1145/3613905.3650947

        • Guo Q, Yuan K, He C, Peng Z, Ma X (2024). Exploring the Evolvement of Artwork Descriptions in Online Creative Community under the Surge of Generative AI: A Case Study of DeviantArt. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-7. DOI: 10.1145/3613905.3650851. Online publication date: 11-May-2024.

          https://dl.acm.org/doi/10.1145/3613905.3650851

        • Mahdavi Goloujeh A, Sullivan A, Magerko B (2024). Is It AI or Is It Me? Understanding Users’ Prompt Journey with Text-to-Image Generative AI Tools. Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 1-13. DOI: 10.1145/3613904.3642861. Online publication date: 11-May-2024.

          https://dl.acm.org/doi/10.1145/3613904.3642861

        • Rajcic N, Llano Rodriguez M, McCormack J (2024). Towards a Diffractive Analysis of Prompt-Based Generative AI. Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 1-15. DOI: 10.1145/3613904.3641971. Online publication date: 11-May-2024.

          https://dl.acm.org/doi/10.1145/3613904.3641971

        • McCormack J, Llano M, Krol S, Rajcic N (2024). No Longer Trending on Artstation: Prompt Analysis of Generative AI Art. Artificial Intelligence in Music, Sound, Art and Design, pp. 279-295. DOI: 10.1007/978-3-031-56992-0_18. Online publication date: 3-Apr-2024.

          https://dl.acm.org/doi/10.1007/978-3-031-56992-0_18
