Stability AI gives videos a brand new dimension with Stable Video 3D

Stability AI today expanded its generative AI model portfolio with the release of Stable Video 3D (SV3D).

As the name suggests, the new model is a gen AI video tool for rendering 3D videos. Stability AI has been developing video capabilities with its Stable Video technology, which lets users create short videos from an image or text prompt. SV3D builds on Stability AI's previous Stable Video Diffusion model and adapts it for the task of novel view synthesis and 3D generation.

With SV3D, Stability AI adds new depth to its video generation model with the ability to create and transform multi-view 3D meshes from a single input image.

SV3D is now available for commercial use with a Stability AI Professional Membership ($20 per month for creators and developers with annual revenue of less than $1 million). For non-commercial use, users can download the model weights from Hugging Face.

Here's an example video I quickly created. As you'll see, despite some slight distortion, the shapes of all objects in the video remain coherent and solid, even as the camera rotates around them.

Game development and e-commerce are mentioned as target use cases

“By adapting our Stable Video Diffusion image-to-video diffusion model with the addition of camera path conditioning, Stable Video 3D is capable of generating multi-view videos of an object,” the company wrote in a blog post detailing the new model.

“Stable Video 3D is a useful tool for generating 3D assets, especially within the gaming space,” Varun Jampani, principal researcher at Stability AI, told VentureBeat. “In addition, it enables the production of 360-degree orbital videos, useful in e-commerce and providing a more immersive and interactive shopping experience.”

From Stable Zero123 to SV3D

Stability AI is probably best known for its Stable Diffusion text-to-image gen AI models, which include SDXL and Stable Diffusion 3.0, the latter of which is still in early research preview. Stable Diffusion 1.5 is an open source image generation model that forms the basis for many other AI image and video generation products, including Runway and Leonardo AI.

The Stable Zero123 model was released back in December 2023, offering new possibilities for creating 3D images. At the time, Emad Mostaque, founder and CEO of Stability AI, told VentureBeat that Stable Zero123 would be the first in a series of 3D models.

SV3D takes a different approach to 3D generation than Stable Zero123.

“Stable Video 3D can be seen as a successor and improvement to our previous offering, Stable Zero123,” said Jampani. “Stable Video 3D is a novel view synthesis network that takes a single image as input and outputs novel view images.”

Jampani explained that Stable Zero123 is based on Stable Diffusion and outputs one image at a time. Stable Video 3D is based on Stable Video Diffusion models and outputs multiple novel views simultaneously. Stable Video 3D produces novel views of significantly higher quality and can therefore help generate better 3D meshes from a single image.

Coherent views from any angle

In a research paper, Stability AI researchers explain some of the techniques used to enable 3D generation from a single image using latent video diffusion.

“Recent work in 3D generation proposes techniques for adapting 2D generative models for novel view synthesis (NVS) and 3D optimization,” the paper states. “However, these methods have several drawbacks due to either limited views or inconsistent NVS, thereby affecting the performance of 3D object generation.”

One of SV3D's key strengths is its ability to produce consistent, novel, multi-view images of an object. According to Stability AI, SV3D delivers coherent views from any angle.

The research paper on SV3D highlights this advancement and states: “…unlike previous approaches that usually struggle with limited perspectives and inconsistencies in output, Stable Video 3D is capable of delivering coherent views from any given angle with competent generalization.”

In addition to its novel view synthesis capabilities, SV3D also aims to optimize 3D meshes. By leveraging its multi-view consistency, SV3D can generate high-quality 3D meshes directly from the novel views it produces.

“Stable Video 3D leverages its multi-view consistency to optimize 3D Neural Radiance Fields (NeRF) and mesh representations, improving the quality of 3D meshes generated directly from novel views,” Stability AI wrote in its announcement post.

Two powerful variants: SV3D_u and SV3D_p

SV3D comes in two variants, each designed for specific use cases.

SV3D_u generates orbital videos based on single-image inputs without the need for camera conditioning. Camera conditioning in generative AI is a method in which additional input, often in the form of an image or a set of parameters related to camera angles or positions, is used to control the generation process of new images or content.

On the other hand, SV3D_p extends this capability by supporting both single images and orbital views, allowing users to create 3D videos along specific camera paths.
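
To make the idea of a camera path more concrete, here is a minimal Python sketch of what an orbital path could look like as data: one azimuth and elevation angle per frame, plus the resulting camera position on a sphere around the object. This is an illustration only and not Stability AI's code or input format. The function name `orbit_camera_path`, its parameters, the default values and the z-up coordinate convention are all assumptions made for this example.

```python
import math


def orbit_camera_path(num_frames=12, elevation_deg=10.0, radius=2.0):
    """Build a simple circular orbit around an object at the origin.

    Hypothetical helper for illustration; not part of any Stability AI API.
    Returns one (azimuth_deg, elevation_deg, (x, y, z)) tuple per frame,
    using a z-up coordinate convention (an assumption of this sketch).
    """
    poses = []
    elevation = math.radians(elevation_deg)
    for frame in range(num_frames):
        # Spread the frames evenly over a full 360-degree turn.
        azimuth_deg = 360.0 * frame / num_frames
        azimuth = math.radians(azimuth_deg)
        # Convert spherical angles to a camera position on the orbit.
        x = radius * math.cos(elevation) * math.cos(azimuth)
        y = radius * math.cos(elevation) * math.sin(azimuth)
        z = radius * math.sin(elevation)
        poses.append((azimuth_deg, elevation_deg, (x, y, z)))
    return poses


if __name__ == "__main__":
    for azimuth_deg, elevation_deg, position in orbit_camera_path(num_frames=4):
        print(azimuth_deg, elevation_deg, position)
```

In this sketch a fixed elevation gives a flat orbit. Varying the elevation per frame would describe a more general path of the kind the article says SV3D_p can follow, though how the real model encodes its camera input is not covered here.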
