The popular AI image generation service Midjourney has implemented one of its most requested features: the ability to consistently recreate characters across new images.
This has naturally been a significant hurdle for AI image generators until now.
That's because most AI image generators rely on “diffusion models,” tools similar to or based on Stability AI's open-source image generation algorithm Stable Diffusion. They work, roughly, by taking text entered by a user and attempting to piece together, pixel by pixel, an image matching that description, based on similar imagery and text tags learned from their massive (and controversial) training data set of hundreds of thousands of human-created images.
Why consistent characters are so powerful – and elusive – for generative AI images
But as with text-based large language models (LLMs) like OpenAI's ChatGPT or Cohere's new Command-R, the problem with all generative AI applications is the inconsistency of their responses: the AI generates something new for every single prompt entered, even when the same prompt is repeated or some of the same keywords are used.
This is great for generating entirely new content – in Midjourney's case, images. But what if you're creating a storyboard for a film, a novel, a graphic novel or comic book, or some other visual medium where one or more characters move through it, appearing in different scenes and settings with different facial expressions and props?
This very scenario, which is typically necessary for narrative continuity, has so far been very difficult to achieve with generative AI. But Midjourney is now taking a crack at it, introducing a new tag, “--cref” (short for “character reference”), that users can add to the end of their text prompts in the Midjourney Discord. It attempts to match the character's facial features, body type, and even clothing from a URL the user pastes in after the tag.
As the feature continues to be developed and refined, it could take Midjourney from a cool toy or source of ideas to a more professional tool.
How to use the new Midjourney consistent characters feature
The tag works best with previously generated Midjourney images. For example, a user's workflow would be to first generate or retrieve the URL of a previously generated character.
Let's start from scratch and say we're generating a new character with this prompt: “a muscular, bald man with a beard and an eye patch.”
We upscale the image we like the most, then ctrl-click it in the Midjourney Discord server to find the “Copy Link” option.
Then, we can type a new prompt: “wearing a white tuxedo and standing in a mansion --cref (URL)” and paste in the URL of the image we just generated, and Midjourney will attempt to generate that same character from before in our newly typed setting.
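For anyone assembling prompt text before pasting it into Discord, here is a minimal sketch of the format described above. Midjourney has no public API, so this is pure string assembly; the `with_character_ref` helper and the URL are hypothetical placeholders, not anything Midjourney provides.

```python
# Minimal sketch: Midjourney has no public API, so prompts are pasted into
# Discord by hand. This hypothetical helper only assembles the prompt text
# in the "--cref" format described above.

def with_character_ref(prompt: str, ref_url: str) -> str:
    """Append a --cref tag pointing at a reference image URL."""
    return f"{prompt} --cref {ref_url}"

# Placeholder URL standing in for the "Copy Link" result from Discord.
ref = "https://cdn.midjourney.com/your-image-id/0_0.png"
print(with_character_ref("wearing a white tuxedo and standing in a mansion", ref))
# -> wearing a white tuxedo and standing in a mansion --cref https://cdn.midjourney.com/...
```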
As you can see, the results are far from an exact match to the original character (or even our original prompt), but they're definitely encouraging.
Additionally, the user can control to some extent the “weight” of how closely the new image reproduces the original character by typing “--cw” followed by a number at the end of the new prompt (after the “--cref (URL)” string), so something like this: “--cref (URL) --cw 100”. The lower the “cw” number, the more variance the resulting image will have; the higher the “cw” number, the more closely the resulting new image follows the original reference.
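Extending the same string-assembly sketch, the weight could be tacked on as an optional parameter. The 0-to-100 range enforced below is an assumption inferred from the values used in the examples here, not documented behavior.

```python
# Same string-assembly sketch, extended with the --cw (character weight)
# parameter. The 0-100 range check is an assumption based on the values
# discussed above; the URL is a placeholder.

def with_character_ref(prompt: str, ref_url: str, weight: int | None = None) -> str:
    """Append --cref, and optionally --cw, to a Midjourney prompt string."""
    tagged = f"{prompt} --cref {ref_url}"
    if weight is not None:
        if not 0 <= weight <= 100:
            raise ValueError("expected a cw value between 0 and 100")
        tagged += f" --cw {weight}"
    return tagged

ref = "https://cdn.midjourney.com/your-image-id/0_0.png"  # placeholder
print(with_character_ref("wearing a white tuxedo", ref, weight=8))    # low: more variance
print(with_character_ref("wearing a white tuxedo", ref, weight=100))  # high: closest match
```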
As you can see in our example, entering a very low value, “--cw 8”, actually gives us what we wanted: the white tuxedo. However, our character's signature eye patch has now been removed.
Well, there's nothing a little “Vary Region” can't fix, right?
Ok, so the eye patch is on the wrong eye… but we're getting there!
You can also blend multiple characters into one by using two “--cref” tags alongside their respective URLs.
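Following the same pattern, blending two characters would simply chain two tags, one per reference URL. The URLs and the example prompt below are placeholders.

```python
# Sketch of chaining two --cref tags, one per reference URL, as described
# above. URLs and the example prompt are placeholders.

def blend_characters(prompt: str, ref_urls: list[str]) -> str:
    """Append one --cref tag per reference image URL."""
    tags = " ".join(f"--cref {url}" for url in ref_urls)
    return f"{prompt} {tags}"

print(blend_characters(
    "two old friends playing chess in a mansion",
    ["https://cdn.midjourney.com/character-a/0_0.png",
     "https://cdn.midjourney.com/character-b/0_0.png"],
))
```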
The feature only went live this evening, but artists and YouTubers are already testing it. Try it yourself if you have Midjourney. And read founder David Holz's full note about it below: