
GPT Image 2 Ascends to Godhood in Text-to-Image Generation, Surpassing Google's Nano Banana 2 to Become World No. 1

OpenAI's newly launched GPT Image 2 model has posted standout results in text-to-image benchmark evaluations, surpassing Google's Nano Banana 2 to take the global top spot. Officially launched on April 21st, the model is the product of months of iterative upgrades, with significant improvements in image quality, prompt comprehension, and detail rendering.

SuperCLUE evaluation data shows that GPT Image 2 ranked first in core dimensions including Chinese character generation, realistic reproduction, and image quality.

The model scored 93.07 points on Chinese character generation, with full marks for text accuracy. Whether the text is seal script on blue-and-white porcelain or golden lettering on acrylic, it blends into the surface naturally rather than appearing to float above it, resolving the garbled Chinese characters that have long affected overseas models.

Chinese Character Generation – Image-Text Coherence

In scene reproduction, the model accurately renders complex settings such as old-fashioned bakeries and iron flower performances, an intangible cultural heritage tradition, with realistic and natural detail. It also handles long prompts and requests that require logical reasoning well, generating professional images such as scientific diagrams and poster designs while following instructions very closely.

Realistic Scene Replication

Compared with its predecessor, GPT Image 2 improves significantly in image-text consistency and Chinese character generation, and outperforms it across the board. Against competitors such as Google and Baidu, it also leads in multiple dimensions, with especially clear advantages in creative reasoning and realistic reproduction.

The model still has room for improvement in areas such as spatial relationship understanding and knowledge reasoning, but overall it has reached the top tier of the industry, marking a new stage for text-to-image technology.

Comparison of Scores for Leading Domestic and International Models Across First-Level Dimensions