AI Outperforms Humans on Standardized Tests of Creative Potential

Score another one for artificial intelligence. In a recent study, 151 human participants were pitted against ChatGPT-4 in three tests designed to measure divergent thinking, which is considered an indicator of creative thought.

Divergent thinking is characterized by the ability to generate a unique solution to a question that does not have one expected answer, such as “What is the best way to avoid talking about politics with my parents?” In the study, GPT-4 provided more original and elaborate answers than the human participants.

The study, “The Current State of Artificial Intelligence Generative Language Models Is More Creative Than Humans on Divergent Thinking Tasks,” was published in Scientific Reports and written by U of A psychological sciences Ph.D. students Kent F. Hubert and Kim N. Awa, along with Darya L. Zabelina, assistant professor of psychological sciences at the U of A and director of the Mechanisms of Creative Cognition and Attention Laboratory.

The three tests used were the Alternative Use Task, which asks participants to come up with creative uses for everyday objects such as a rope or a fork; the Consequences Task, which invites participants to imagine possible outcomes of hypothetical situations, such as “what if humans no longer needed sleep?”; and the Divergent Associations Task, which asks participants to generate 10 nouns that are as semantically distant as possible. For example, there is little semantic distance between “dog” and “cat,” but a great deal between “cat” and “ontology.”

Responses were evaluated on the number of responses, the length of responses, and the semantic distance between words. Ultimately, the authors found that “overall, GPT-4 was more original and elaborate than humans on each of the divergent thinking tasks, even when controlling for response fluency. In other words, GPT-4 demonstrated greater creative potential across an entire battery of divergent thinking tasks.”
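To make the semantic-distance idea concrete, here is a minimal illustrative sketch of how such a score can be computed as the average pairwise cosine distance between word embeddings. This is not the study’s actual scoring pipeline: the three-dimensional vectors below are made-up placeholders, and a real scorer would use pretrained embeddings with hundreds of dimensions.

    # Illustrative sketch only: placeholder vectors stand in for real pretrained embeddings.
    import numpy as np
    from itertools import combinations

    embeddings = {
        "dog":      np.array([0.90, 0.10, 0.00]),
        "cat":      np.array([0.85, 0.15, 0.05]),
        "ontology": np.array([0.05, 0.20, 0.95]),
    }

    def cosine_distance(u, v):
        # 1 - cosine similarity: near 0 for related words, larger for unrelated ones
        return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    def mean_pairwise_distance(words):
        # Average distance over every pair of submitted words (higher = more divergent)
        pairs = list(combinations(words, 2))
        return sum(cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs) / len(pairs)

    print(mean_pairwise_distance(["dog", "cat"]))        # small: closely related concepts
    print(mean_pairwise_distance(["cat", "ontology"]))   # large: semantically distant concepts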

This finding comes with some caveats. The authors state: “It is important to note that the measures used in this study are all measures of creative potential, but participation in creative activities or achievements is another aspect of measuring a person’s creativity.” The purpose of the study was to examine creative potential at a human level, not necessarily that of people with established creative credentials.

Hubert and Awa further note that “AI, unlike humans, has no agency” and “relies on the assistance of a human user. Therefore, the creative potential of AI is in a constant state of stagnation unless prompted.”

Additionally, the researchers did not evaluate the appropriateness of GPT-4’s responses. So while the AI may have provided more responses, and more original ones, the human participants may have felt constrained by the need for their answers to be grounded in the real world.

Awa also acknowledged that the human motivation to write elaborate answers may not have been high, and said there are additional questions about “how is creativity operationalized? Can we really say that using these tests for humans is generalizable to different people? Is it assessing a broad range of creative thinking?” She added, “So I think it leads us to critically examine what the most popular measures of divergent thinking are.”

Ultimately, the question is not whether these tests are perfect measures of human creative potential. The point is that large language models are progressing rapidly and outperforming humans in ways they never have before. Whether they pose a threat of replacing human creativity remains to be seen. For now, the authors conclude that “looking ahead, the future possibilities for AI to act as a tool for inspiration, as an aid in a person’s creative process, or to overcome fixation are promising.”