Spending holidays with our family of origin can invite both delight and friction: old patterns meeting new ones, different lived experiences, and perspectives from conflicting personalities. This holiday season was no different as I introduced some of my family to ChatGPT and MidJourney.
My mom generated an abstract painting of blood vessels (above); my sister generated a curriculum for elementary school music; my brother generated a recipe for a salmon dish. As academics, many of them were horrified by the usual concerns: cheating, stealing, and diminished learning.
While I believe it’s futile to debate whether these technologies should exist - they already do - I admit I’ve been hesitant to show enthusiasm for this “game changer” and equally prickly when identifying why I’m so bothered by the growing ubiquity of tools that will certainly explode in 2023.
I was first introduced to generative AI during the shutdown, when my OnBoardXR showcase presented Isak Keller’s play written by GPT-3. After the production, Isak collaborated with me on my own play generated from prompts, which I ultimately never finished because I felt a bit robbed of my preferred creative writing process.
I only pinpointed my displeasure by oversimplifying the technology and searching for “better” language to describe how I was using it. Similarly, while oversimplifying the backend of OpenAI’s tools for my family, it occurred to me that my quarrel is not with the messenger, but the message…or rather, the language.
Artificial intelligence and/or machine learning is merely a tool to quickly collect and curate broad data sets and reassemble them based on queries that users input, known as prompts. The output can be anything from patterns of pixels identified as the subject or style of a painting to a collection of text reassembled into a new composition. This has endless applications and is extremely powerful, but since the debut of these tools, the underlying concern has been: Where is it getting the data?
Is this copyrighted material scraped without the original creator’s consent? What is the line between copy-cat and inspired-by ‘intelligence’? Can people opt-out of their data being used to train or feed these machines? You may remember when I caught a film finance AI using the subtitles of my movies (and others) to train its algorithm for screenplay format and performance metrics.
GPT stands for Generative Pre-Trained Transformer, and herein lies the flaw in this language: To generate is defined as to produce, cause, or form. And to transform is defined as to change, alter, or make a thorough or dramatic change in form, appearance, or character. But that’s not quite what is occurring here.
The artificial intelligence is not creating a wholly original work - it is scraping information with incredible efficiency, then sorting and reassembling that data into an output it represents as an original creation. The words ‘generative’ and ‘transform’ are flawed pain points that may be inhibiting the enthusiastic adoption of this incredible technology.
Already ChatGPT has been called “the Google killer” as the next evolution of Search. But again, language matters here. Say what you will about Google (and there is a lot to say); the aggregation and prioritization of real-time search results still ultimately drive users to content destinations on the internet, controlled by the original creators, which is a form of attribution.
Sure, anyone could search a topic and “scrape” information from various sites; but to present those results as one’s own material requires a purposeful (and arguably malicious) decision to omit the source(s). None of these AI platforms yet empower the user to know where (or who) their data comes from.
This feeds into my concerns with content creation in general. Our social platforms prioritize an increasing volume of consistent posting, beyond what is sustainable for average people and businesses. This has resulted in normalizing the “reposting” of other people’s content, thus diminishing the metrics, attribution and success of the original post and creator. This is an unsustainable practice and only becomes worse through “generation” of content trained on other people’s creations.
Now instead imagine artificial intelligence actually delivered on the promise of its own buzzwords: either by truly generating wholly original content trained ethically on non-proprietary material… or, much more simply, by ‘generating’ output with attributions and citations to the very sources it scraped (similar to a Wikipedia article).
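To make that second idea concrete, here is a minimal toy sketch (not any real platform’s API - the corpus, sources, and function are all invented for illustration) of what “generation with attribution” could look like: every snippet that contributes to the output carries its source along with it, Wikipedia-style.

```python
# Toy illustration of attributed output: the sources below are
# hypothetical placeholders, not real sites.
corpus = [
    {"source": "example-blog.com/salmon",
     "text": "Roast salmon at 400F with lemon and dill."},
    {"source": "example-edu.org/music",
     "text": "Elementary music curricula begin with rhythm and pitch."},
    {"source": "example-art.net/vessels",
     "text": "Abstract paintings of blood vessels use branching red forms."},
]

def answer_with_citations(prompt):
    """Assemble output from matching snippets AND return where they came from."""
    words = set(prompt.lower().split())
    hits = [doc for doc in corpus if words & set(doc["text"].lower().split())]
    answer = " ".join(doc["text"] for doc in hits)
    citations = [doc["source"] for doc in hits]
    return answer, citations

text, sources = answer_with_citations("a recipe for salmon")
print(sources)  # the reader can follow the output back to its origins
```

The point isn’t the crude keyword matching - real systems would be far more sophisticated - but that the citation list is a first-class part of the output rather than something discarded along the way.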
Using machine learning as a discovery platform for quality content across the internet would be unparalleled, and the benefits of adoption might be more palatable to all content creators. Artists could gain recognition and business from acknowledgement of their work as reference art. Educators could pivot from writing skills to critical reasoning skills for curating large sets of information. Journalists could trust and/or fact-check information pulled from outside their immediate rolodex. We could all break out of our bubbles and feeds.
Machine learning as a discovery tool for human collaboration preserves the creative economies that make the internet such a rich and diverse source of information and entertainment. It’s also a sustainable path to prevent devaluing human contribution and spreading dangerous misinformation.
This holiday post I’m sure offers both delight and friction to my community and I sincerely welcome your perspective and insights. None of us know where this will go, but I suspect we all want to have our rights and livelihoods intact on the other side.