OpenAI today unveiled DALL-E 3, the third and latest version of its AI image generation model. The new model delivers significant improvements in image quality, fidelity to prompts, and the ability to render complex scenes and relationships, the company said on its website.
Alongside these product enhancements, one of the most notable updates in DALL-E 3 is its native integration with ChatGPT, which will allow users to generate images directly from the chatbot. The integration will also enable ChatGPT to write prompts, addressing a common challenge users face with AI image generators.
Image generators generally perform better with longer prompts containing many details. By building DALL-E 3 natively within ChatGPT, OpenAI has effectively made it simple for users to ask ChatGPT to generate those lengthy prompt paragraphs for the image model from just a few words describing their idea.
DALL-E 3 also understands prompt details and specifics far better than prior models, which enables it to generate images that accurately match even intricate descriptions. Even with the same prompt, the latest model delivers significant improvements over DALL-E 2, the AI firm claimed.
“Modern text-to-image systems have a tendency to ignore words or descriptions, forcing users to learn prompt engineering. DALL-E 3 represents a leap forward in our ability to generate images that exactly adhere to the text you provide,” states DALL-E 3’s web page on OpenAI’s website.
As with prior models, the AI firm has taken several steps to strengthen safety measures in DALL-E 3. The new system aims to limit violent, explicit, or hateful content and to decline requests that ask for public figures by name.
The improved mitigations were informed by domain experts who stress-tested the model in risk areas such as representation bias. DALL-E 3 is also designed to decline requests for images in the style of living artists unless permission has been granted.
Additionally, OpenAI is researching provenance classification and opt-out mechanisms to provide transparency into whether an image originated from the system and give creators control over how their work is utilized in future training. These precautions appear aimed at balancing the model’s capabilities with safety as OpenAI strives to understand potential applications and misuses of AI-generated imagery.
The image generator is currently in research preview. It will first be available to ChatGPT Plus and Enterprise customers in October, and through the API and Labs later this year. The company hasn’t shared any details on when the latest version of DALL-E will be available to free users of ChatGPT.
OpenAI has reportedly been testing the new model for at least a few months. MattVidPro, a tech YouTuber, received some images generated by the model from a member of his community and walked through them in considerable detail in one of his videos.
“Today we are truly glimpsing into the future of AI image generation. I promise you, you have not seen any AI image generations that are this good. Midjourney cannot compete at this level. I don’t think even Midjourney version 6 would be able to compete at this level,” he stated, sharing the results.
The impending launch of DALL-E 3 promises to further intensify competition in the emerging AI image generation sector. Midjourney, which canceled its free plan after its user volumes ballooned, will be closely monitoring DALL-E 3’s debut. Midjourney now operates solely on a paid model, with monthly subscriptions ranging from $10 to $120.
OpenAI’s strategy of providing access to DALL-E 3 through the $20 ChatGPT Plus plan could undercut competitors on price. However, the full breadth of DALL-E 3’s abilities and limitations remains to be seen. Direct assessments of its actual results will be possible only after the public release.