OpenAI has built image generation from GPT-4o, its multimodal model, directly into ChatGPT. It replaces the previous DALL-E 3 integration as the default and improves text rendering and the handling of combined text and image inputs. As reported by TechCrunch, the upgrade allows ChatGPT to create more detailed and accurate images. The new system is rolling out across several user tiers, with API access for developers to follow.
The rollout of GPT-4o's image generation began on March 25, 2025, as an upgrade to ChatGPT's image creation abilities [1][2]. The feature is available to Plus, Pro, Team, and Free tier users, with Enterprise and Education users gaining access soon [3]. Free users are limited to 3 images per day, while Plus and higher-tier subscribers can create unlimited images [4]. The integration aims to provide more consistent results and fewer content restrictions than the previous DALL-E 3 system. API access for GPT-4o image generation is expected in the coming weeks, which would let developers use it in their own platforms and applications [4][3].
GPT-4o adds several capabilities that DALL-E 3 lacked. The model can handle up to 20 different objects in one image while maintaining the correct relationships between them, which suits it to complex scenes [1]. It draws on the images and text already in the chat, so results stay consistent across iterations. It also supports in-context learning: users can upload images for the model to analyze and carry details from them into new generations [2]. Because the images are more complex and detailed, rendering may take up to one minute, but the results are often crisper and more visually striking than those of previous models [3][4].
The new image generation system in ChatGPT offers a streamlined user experience. Users can ask the model to create an image with specific details or select the "Create image" option in the composer. Requests can state precise requirements, including aspect ratio, exact colors as hex codes, and transparent backgrounds [1][2]. Because generation happens inside the conversation, users can refine an image through follow-up messages while keeping a consistent style [3]. GPT-4o also responds well to long, detailed prompts and renders intricate elements such as text, hands, and faces more accurately [4][5].
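The customization options above are expressed in ordinary language, so a request can be assembled like any other piece of text. The following is a minimal sketch of that idea, not an OpenAI API: `build_image_prompt` and its parameters are hypothetical names chosen for illustration, and the wording of the generated sentences is an assumption about what a clear request might look like. It only builds a string that could be pasted into the ChatGPT composer.

```python
import re

# Six-digit hex color such as "#FF6600" (an illustrative validation rule).
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def build_image_prompt(subject, aspect_ratio=None, hex_colors=(),
                       transparent_background=False):
    """Compose a plain-text image request.

    Hypothetical helper for illustration only; it calls no external service.
    """
    # Reject anything that is not a six-digit hex code before building text.
    for color in hex_colors:
        if not HEX_COLOR.fullmatch(color):
            raise ValueError(f"not a 6-digit hex color: {color!r}")

    parts = [f"Create an image of {subject}."]
    if aspect_ratio:
        parts.append(f"Use a {aspect_ratio} aspect ratio.")
    if hex_colors:
        parts.append("Use exactly these colors: " + ", ".join(hex_colors) + ".")
    if transparent_background:
        parts.append("Make the background transparent.")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_image_prompt(
        "a minimalist fox logo",
        aspect_ratio="16:9",
        hex_colors=["#FF6600", "#1A1A1A"],
        transparent_background=True,
    ))
```

Stating each requirement as its own short sentence keeps the request easy to adjust in later turns, for example by changing only the aspect ratio while leaving the colors untouched.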
Although GPT-4o is now the primary image generation model in ChatGPT, OpenAI has kept DALL-E as a separate option. DALL-E will be accessible through a dedicated GPT, so users can switch between the two models as needed [1][2]. Users can therefore still draw on DALL-E's strengths, such as stylized or artistic images, alongside GPT-4o's newer features. Offering both models gives users more choice across a wider range of creative and practical image generation tasks [3][4].