Exclusive: Stability AI brings advanced 3D and image fine-tuning to Stable Diffusion

Stability AI today announced several enhancements to its Stable Diffusion platform. The updates offer new capabilities for text-to-image generation and also venture into 3D content creation.

The most notable enhancement is the brand new Stable 3D model. Until now, Stable Diffusion has primarily worked on two-dimensional (2D) image generation. The Stable 3D model will change that, providing functionality that could help with any type of 3D content creation, including graphic design and even video game development.

Alongside its foray into 3D content generation, Stability AI has introduced the Sky Replacer tool, which is designed to do exactly what the name implies: replace the sky in 2D images.

The Stable Diffusion platform also now offers Stable Fine-Tuning, designed to help enterprises expedite the image fine-tuning process for specific use cases.


Additionally, the company will integrate an invisible watermark for content authentication in images generated by the Stability AI API. The new updates are all about helping enterprises with creative development pipelines as generative AI increasingly becomes part of common workflows.

“It’s about bringing creative storytellers the tools they need to have that level of extra control over the images,” Emad Mostaque, CEO of Stability AI, told VentureBeat in an exclusive interview.  

Stable Diffusion adds features in an increasingly competitive GenAI landscape

The advancements from Stability AI come at a time when the text-to-image generation market is becoming highly competitive.

Adobe has taken aim at the market with its Firefly tools that are tightly integrated with the company’s design software. Midjourney has been increasingly adding new features to its technology to help designers generate images. Not to be left out, OpenAI recently released its DALL-E 3 models with improved capabilities for generating text inside of images.

Mostaque is well aware of his competition and is aiming to differentiate Stability AI in several ways. In particular, he emphasized that his company is now moving away from being just about models to being about enabling a creative pipeline. He noted that the new Sky Replacer and Stable Fine-Tuning features are both additional steps that go beyond what a core base model for generating images provides.

Sky Replacer isn’t just a feature; it’s a focus for a business use case

The concept of replacing a background in an image is not a new one. In non-generative AI applications, backgrounds are commonly replaced by techniques such as green screens and chroma keys.

Mostaque said that Stability AI is building on those classic techniques and automating the workflow to make the process fast and efficient for business users. Changing the sky in an image isn’t just about adding creative flair, either; it’s a capability with a very specific and practical use case.

“Sky Replacer is great for real estate, for example,” Mostaque said.

Mostaque noted that users want to be able to have different backgrounds with different lighting effects. Fundamentally, he emphasized, it’s all about offering control, since organizations have their own workflows for generating images and content. Stability AI is building optimized workflows to enable the control that different use cases require.

“Sky Replacer is the first in a series of these that we’ll be bringing out that are very industry and enterprise specific, building on the experiences we’ve had over the last six to 12 months,” he said.

Stable 3D extends Stable Diffusion for new use cases

The new Stable 3D model works by extending the diffusion model used in Stable Diffusion to include additional 3D datasets and vectorization. 

“I’m incredibly excited about the ability to create whole worlds in 3D,” Mostaque said.

Mostaque explained that Stable 3D was built from both Stable Diffusion and Stability AI’s work on Objaverse-XL, which is one of the world’s largest open 3D datasets. Building and rendering 3D images has long been a resource-intensive process, but Mostaque is optimistic that Stable 3D will be more efficient than traditional approaches to 3D image generation. He emphasized that it’s still early days for Stable 3D but said he is optimistic the technology will steadily evolve and expand over time. Stable 3D is initially being made available as a private preview.

“This is incredibly efficient compared to the classical kind of 3D model creation,” he said. “Things that classically took a long time to build now are quick to get the first cut.” 

Watermarks and the Biden EO on AI

One component of the Executive Order (EO) on AI that the Biden Administration issued this week is a direction to integrate watermarks into generated content.

Stability AI is now integrating invisible watermarks and Content Credentials into its API. Content Credentials is a multi-vendor industry effort, in which Adobe and others are participating, to help provide authorship information about content. Mostaque said that adding the invisible watermarks and Content Credentials is the responsible thing to do. It’s also part of a broader effort at Stability AI to bring authenticity to generated content.

“We are really pioneering a number of initiatives and some additional ones that we’re announcing around this, as well as additional research, because we want to know what’s real and what’s fake,” Mostaque said. “It also helps with some of the attribution and other mechanisms that we’re building in for future releases.”
