The race for high-quality, AI-generated videos is becoming increasingly exciting.
On Monday, Runway, a company that develops generative AI tools for film and image content creation, unveiled Gen-3 Alpha, its latest AI model for generating video clips from text descriptions and still images. Runway says Gen-3 offers a “significant” improvement in generation speed and fidelity over Runway's previous flagship video model, Gen-2, as well as finer control over the structure, style and motion of the videos it creates.
Gen-3 will be available to Runway subscribers in the coming days, including enterprise customers and companies in Runway's Creative Partner program.
“Gen-3 Alpha excels at generating expressive human characters with a wide range of actions, gestures, and emotions,” Runway writes in a post on its blog. “It is designed to interpret a wide range of styles and cinematic terminology, and to enable imaginative transitions and precise keyframing of elements within the scene.”
Gen-3 Alpha has its limitations, the most obvious of which is probably that the maximum length of its footage is 10 seconds. But Runway co-founder Anastasis Germanidis promises that Gen-3 is just the first – and smallest – of several video-generating models in a next-gen family that will be trained on improved infrastructure.
“The model can struggle with complex interactions between characters and objects, and the generations don't always follow the laws of physics exactly,” Germanidis said in an interview with TechCrunch this morning. “This first launch will support high-resolution generations of 5 and 10 seconds, with significantly faster generation times than Gen-2. It takes 45 seconds to generate a 5-second clip, and 90 seconds to generate a 10-second clip.”
Gen-3 Alpha, like all video generation models, was trained on a large number of video examples – and images – so that it could “learn” the patterns in those examples and generate new clips. Where did the training data come from? Runway declined to say. Few generative AI vendors today voluntarily disclose such information, partly because they view training data as a competitive advantage and therefore keep it, and the details related to it, to themselves.
“We have an internal research team that oversees all of our training, and we use curated, internal datasets to train our models,” Germanidis said – and left it at that.
Training data details can also give rise to intellectual property litigation if the provider trained on public data from the web, including copyrighted data. That is another incentive not to reveal too much. Several cases making their way through the courts reject the fair use defense for training data, arguing that generative AI tools replicate artists' styles without their permission and let users create new works that resemble the artists' originals without the artists receiving any payment.
In the blog post announcing Gen-3 Alpha, Runway elaborates a bit on the copyright issue, saying it consulted artists when developing the model. (Which artists? Unclear.) That echoes what Germanidis told me during a fireside chat at TechCrunch's Disrupt conference in 2023:
“We're working closely with artists to figure out what the best approaches are to address this problem,” he said. “We're exploring different data partnerships to continue to grow … and develop the next generation of models.”
In the blog post, Runway also says it plans to release Gen-3 with a number of new safeguards, including a moderation system to block attempts to create videos from copyrighted images and content that doesn't comply with Runway's terms of service. Also in the works is a provenance system — compatible with the C2PA standard backed by Microsoft, Adobe, OpenAI and others — to identify that videos originated from Gen-3.
“Our new and improved internal image and text moderation system uses automated monitoring to filter out inappropriate or harmful content,” said Germanidis. “C2PA authentication verifies the origin and authenticity of media created with all Gen-3 models. As model capabilities and the ability to generate high-fidelity content increase, we'll continue to invest significantly in our alignment and safety efforts.”
In its post today, Runway also announced that it has worked with “leading entertainment and media companies” to create custom versions of Gen-3 that allow for more “stylistically controlled” and consistent characters, targeting “specific artistic and narrative needs.” The company adds, “This means the generated characters, backgrounds, and assets can maintain a consistent look and feel across different scenes.”
A big unsolved problem with video generation models is control—getting a model to generate consistent video that matches the creator's artistic intentions. As my colleague Devin Coldewey recently wrote, simple things in traditional filmmaking, like choosing a color for a character's clothing, require workarounds with generative models because each shot is created independently of the others. Sometimes even workarounds don't work—leaving editors with extensive manual work.
Runway has raised over $236.5 million from investors including Google (with which it has cloud compute credits) and Nvidia, as well as venture capital firms including Amplify Partners, Felicis and Coatue, and has aligned itself closely with the creative industry as its investments in generative AI technology grow. Runway operates Runway Studios, an entertainment division that acts as a production partner for enterprise clients, and hosts the AI Film Festival, one of the first events to showcase films produced entirely or partially with AI.
But the competition is getting tougher.
Generative AI startup Luma last week announced Dream Machine, a video generator that went viral for its ability to animate memes. And just a few months ago, Adobe announced that it was developing its own video generation model, trained on content from its Adobe Stock media library.
Elsewhere, there are established players like OpenAI's Sora, which remains tightly guarded but has the backing of marketing agencies and independent and Hollywood film directors. (OpenAI CTO Mira Murati was present at the 2024 Cannes Film Festival.) This year's Tribeca Festival – which also has a partnership with Runway to curate films made with AI tools – showed short films produced with Sora by directors who got early access.
Google has also put its video generation model Veo in the hands of select creators, including Donald Glover (aka Childish Gambino) and his creative agency Gilga, as it works to integrate Veo into products like YouTube Shorts.
Whatever the outcome of these various collaborations, one thing is clear: Generative AI video tools threaten to upend the film and TV industry as we know it.
Filmmaker Tyler Perry said recently that he put a planned $800 million expansion of his production studio on hold after seeing what Sora could do. Joe Russo, the director of Marvel blockbusters such as “Avengers: Endgame,” predicts that AI will be able to create a full-fledged film within a year.
A 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, found that 75% of film production companies that adopted AI cut, consolidated or eliminated jobs after adopting the technology. The study also estimates that more than 100,000 jobs in the U.S. entertainment industry will be lost to generative AI by 2026.
Truly strong safeguards will be needed to ensure that video generation tools don't follow in the footsteps of other generative AI technologies and lead to steep declines in the demand for creative work.