Synopsis
OpenAI has unveiled ChatGPT Images 2.0, a powerful AI image generator boasting enhanced accuracy and detail. This new version excels at rendering text, icons, and complex layouts across multiple languages, offering greater flexibility with aspect ratios. Advanced "thinking" capabilities are also introduced, promising more sophisticated image creation for all users.
Taking to X, OpenAI said, “A state-of-the-art image model that can take on complex visual tasks and produce precise, immediately usable visuals, with sharper editing, richer layouts, and thinking-level intelligence.”
According to the company, the updated model significantly improves how it follows instructions and manages intricate elements within images. It is also better at rendering small text, icons and user interface components — areas where earlier models often struggled.
Greater precision and control
AI image generators have traditionally struggled with fine details, especially accurate text. OpenAI claims the new model addresses this and handles spelling and small visual elements more reliably.
To demonstrate this, OpenAI asked users to “zoom in on the rice” in a sample image, highlighting the level of detail the model can achieve.
Stronger across languages
OpenAI says the model delivers better performance across multiple languages. It can generate text within images in various languages, including Japanese, Korean, Chinese, Hindi and Bengali, with improved accuracy.
This goes beyond simple translation. The model can create visually consistent designs where language is part of the overall composition, including posters, diagrams and comics.
As a result, users can produce non-English visuals with clearer structure and improved readability.
Flexible aspect ratios
ChatGPT Images 2.0 supports aspect ratios as wide as 3:1 and as tall as 1:3, the company said. This flexibility makes it suitable for various formats, including social media posts, presentations and banners.
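For readers preparing canvas sizes, the quoted limits reduce to a simple range check: the width-to-height ratio must fall between 1:3 and 3:1. The Python sketch below is illustrative only. The function names are hypothetical, and it assumes the limits are inclusive. It says nothing about which pixel dimensions the model or API actually accepts.

```python
from fractions import Fraction

# Limits quoted in the article: as wide as 3:1, as tall as 1:3.
# Treating both bounds as inclusive is an assumption.
MAX_RATIO = Fraction(3, 1)
MIN_RATIO = Fraction(1, 3)


def is_supported_aspect_ratio(width: int, height: int) -> bool:
    """Return True if width:height lies between 1:3 and 3:1 inclusive."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return MIN_RATIO <= Fraction(width, height) <= MAX_RATIO


def clamp_to_supported(width: int, height: int) -> tuple[int, int]:
    """Keep the longer side and grow the shorter one until the ratio fits."""
    if is_supported_aspect_ratio(width, height):
        return width, height
    if width > height:
        # Too wide: raise the height to ceil(width / 3).
        return width, -(-width // 3)
    # Too tall: raise the width to ceil(height / 3).
    return -(-height // 3), height
```

A 1920 by 1080 slide passes the check unchanged, while a 4000 by 1000 banner (4:1) would be adjusted to 4000 by 1334.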
Stylistic sophistication and photo realism
“ChatGPT Images 2.0 is better able to capture the defining characteristics of photos, as well as cinematic stills, pixel art, manga, and other distinctive visual languages, with greater consistency in texture, lighting, composition, and fine detail,” OpenAI said.
These improvements make it more useful for tasks such as game prototyping, storyboarding, marketing content and creating assets in specific styles or genres, the company added.
The model also introduces “thinking” capabilities, enabling it to handle more complex tasks. In advanced modes, it can access real-time information, generate multiple images from a single prompt and evaluate its own outputs.
Availability
The new model is now available to all users via ChatGPT, Codex and the API. Advanced features with “thinking” capabilities are accessible to ChatGPT Plus, Pro and Business users.
In the API, the model is offered as gpt-image-2, with pricing depending on image quality and resolution.
Mobile users will need to update to the latest version of the app to access these features.