A groundbreaking investigation, led by Professor Karim Jerbi of the Université de Montréal’s Department of Psychology and featuring contributions from distinguished AI pioneer Yoshua Bengio, has undertaken the most extensive direct comparison to date between the creative output of artificial intelligence systems and human ingenuity, pitting sophisticated generative AI models against the results of more than 100,000 individuals. The study, published in the journal Scientific Reports, marks a significant inflection point: contemporary generative AI, particularly large language models, has ascended to a level where it can demonstrably surpass the typical human capacity in specific dimensions of creative ideation. At the same time, the research underscores a persistent and substantial advantage held by the most creatively gifted humans over even the most advanced AI systems currently available.
The research meticulously evaluated several prominent large language models, including widely recognized platforms such as ChatGPT, Claude, and Gemini, alongside others, contrasting their performance metrics with the extensive data gathered from a diverse cohort exceeding 100,000 human participants. The findings illuminate a pivotal juncture in the evolution of AI, indicating that some AI systems, notably GPT-4, have achieved scores that exceed those of the average human on tasks specifically engineered to quantify divergent linguistic creativity. Professor Karim Jerbi commented on the implications, stating, "Our investigation confirms that certain AI architectures, built upon large language models, are now capable of outperforming average human creativity when assessed through well-defined tasks. While this outcome might elicit surprise, perhaps even unease, our study also highlights an equally crucial observation: even the most sophisticated AI systems still fall short of the creative heights attained by exceptionally imaginative individuals."
Further in-depth analysis, conducted by the study’s co-first authors, postdoctoral researcher Antoine Bellemare-Pépin from the Université de Montréal and PhD candidate François Lespinasse from Concordia University, unveiled a compelling and consistent pattern. While some AI models now outperform the general population’s creative abilities, the highest levels of human imagination remain firmly out of the machines’ reach. Indeed, when the researchers focused their examination on the most creatively adept half of the human participants, their average scores consistently surpassed those generated by every AI model put to the test. This gap widened even further when considering the top 10 percent of individuals exhibiting the highest levels of creativity. Professor Jerbi elaborated on the methodological rigor, noting, "We developed a robust analytical framework that enabled us to compare human and AI creativity using identical evaluative instruments, drawing upon data from over 100,000 participants, in collaboration with Jay Olson from the University of Toronto."
To ensure a fair and equitable evaluation of creativity across both human and artificial intelligence, the research team employed a multifaceted approach. The cornerstone of their assessment was the Divergent Association Task (DAT), a widely accepted psychological instrument designed to measure divergent creativity, which is defined as the capacity to generate a broad spectrum of novel and original ideas stemming from a singular stimulus. The DAT, conceived by study co-author Jay Olson, requires participants, whether human or AI, to enumerate ten words that are as semantically disparate as possible. An exemplary display of high creativity on this task might involve a list of words such as "galaxy, fork, freedom, algae, harmonica, quantum, nostalgia, velvet, hurricane, photosynthesis." The efficacy of performance on this task has been strongly correlated with outcomes on other established measures of creativity, encompassing written expression, idea generation, and the resolution of complex problems. Although the DAT is fundamentally a language-based assessment, its scope extends beyond mere lexical proficiency, engaging more profound cognitive processes integral to creative thought across a multitude of disciplines. The DAT also offers practical advantages, notably its brevity, typically requiring only two to four minutes to complete, and its accessibility to the general public via online platforms.
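The DAT’s scoring logic is straightforward to sketch: each of the ten words is mapped to a semantic embedding, and the score is the average pairwise distance between those embeddings, with larger distances indicating more divergent word choices. The snippet below is a minimal illustration of this idea, not the study’s actual scoring code; the real task uses high-dimensional pretrained word embeddings, so the toy three-dimensional vectors and the 0–100 scaling here are assumptions made purely for demonstration.

```python
from itertools import combinations
import math

# Toy 3-d vectors standing in for real word embeddings (e.g. pretrained
# high-dimensional vectors); invented for illustration only.
EMBEDDINGS = {
    "galaxy":  [0.90, 0.10, 0.00],
    "fork":    [0.10, 0.80, 0.20],
    "freedom": [0.20, 0.10, 0.90],
    "spoon":   [0.15, 0.75, 0.25],
    "knife":   [0.12, 0.82, 0.18],
}

def cosine_distance(u, v):
    """1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def dat_score(words):
    """Mean pairwise semantic distance of the word list, scaled to 0-100."""
    vectors = [EMBEDDINGS[w] for w in words]
    distances = [cosine_distance(u, v) for u, v in combinations(vectors, 2)]
    return 100.0 * sum(distances) / len(distances)
```

On lists drawn from this toy vocabulary, semantically scattered words ("galaxy", "fork", "freedom") score markedly higher than near-synonyms ("fork", "spoon", "knife"), mirroring how the DAT rewards disparate associations.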
Following the assessment of word association, the researchers sought to ascertain whether AI’s proficiency in this simplified task could be extrapolated to more intricate and authentic creative endeavors. To this end, they subjected both AI systems and human participants to creative writing challenges. These tasks included the composition of haiku, a traditional three-line Japanese poetic form, the generation of concise movie plot summaries, and the production of brief fictional narratives. The outcomes of these more complex exercises mirrored the established pattern observed in the DAT. While AI systems occasionally outperformed average human performance, the most adept human creators consistently produced work that was both more sophisticated and demonstrably more original.
This investigation also delved into a critical question regarding the malleability of AI-generated creativity: is it an immutable quality, or can it be modulated? The study demonstrated that AI creativity can indeed be influenced by adjustments to technical parameters, most notably the "temperature" setting. This parameter serves to govern the degree of predictability versus adventurousness in the AI’s generated responses. At lower temperature settings, AI tends to produce outputs that are more conservative and conventional. Conversely, increasing the temperature leads to responses that are more varied, less predictable, and exhibit a greater degree of exploratory behavior, enabling the system to venture beyond established or familiar concepts. Furthermore, the researchers observed that the nature of the prompts provided significantly impacts creative output. For instance, instructions that encourage AI models to consider word etymology and structural origins can yield more unexpected associations and elevated creativity scores. These findings collectively underscore the profound dependence of AI creativity on human direction, positioning interaction and the art of prompt engineering as central components of the creative workflow.
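Mechanically, temperature divides the model’s raw logits before the softmax that turns them into a sampling distribution: values below 1 sharpen the distribution toward the most likely next token, while values above 1 flatten it toward uniform. The sketch below illustrates that scaling in isolation; the three logit values are invented for the example, and production language models apply the same idea over vocabularies of tens of thousands of tokens.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw logits into sampling probabilities at a given temperature.

    temperature < 1 sharpens the distribution (conservative, predictable);
    temperature > 1 flattens it (more varied, exploratory)."""
    scaled = [l / temperature for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if r <= cumulative:
            return index
    return len(probs) - 1  # guard against floating-point round-off
```

At temperature 0.1 the highest-logit token dominates almost completely, while at temperature 100 the options become nearly equiprobable, which is precisely the conservative-versus-exploratory trade-off described above.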
The study offers a nuanced perspective on widespread apprehensions that artificial intelligence might supplant human creative professionals. Although AI systems have now demonstrated the capability to match or even exceed average human creativity in specific contexts, they still exhibit discernible limitations and remain fundamentally reliant on human guidance. Professor Karim Jerbi articulated this perspective, stating, "Even as AI now achieves human-level creativity on certain assessments, it is imperative that we move beyond a potentially misleading narrative of competition. Generative AI has, above all, evolved into an exceptionally potent instrument in service of human creativity. It will not displace creators but will instead profoundly reshape the very methods by which they conceive, explore, and bring their ideas to fruition, at least for those who embrace its potential." Rather than heralding an obsolescence of creative professions, the research findings suggest a future where AI functions as a sophisticated creative collaborator. By expanding the horizons of possible ideas and opening novel avenues for exploration, AI has the potential to amplify human imagination rather than diminish it. Professor Jerbi concluded, "By directly confronting and comparing the capabilities of humans and machines, studies such as ours propel us to re-examine and redefine our understanding of creativity itself."

The research paper, titled "Divergent creativity in humans and large language models," was officially published in Scientific Reports on January 21, 2026, and involved a multidisciplinary collaboration among researchers from the Université de Montréal, Concordia University, the University of Toronto Mississauga, Mila (Quebec AI Institute), and Google DeepMind. Professor Karim Jerbi led the investigation, with Antoine Bellemare-Pépin and François Lespinasse serving as co-first authors.
The research team’s distinguished members also included Yoshua Bengio, the visionary founder of Mila and LoiZéro, recognized globally as a seminal figure in the advancement of deep learning, the foundational technology underpinning contemporary AI systems like ChatGPT.
