The Latest AI Image Game Changer

First it was AI-generated text that seemed to have all the answers (or at least did a great job of making things up if it didn't), then there were AI-generated images that were winning local and international art and photo contests. But even these tools, as magical as they were, had their weaknesses, their Achilles' heel. For images it's been things like drawing hands (which, to be honest, I still have one heck of a time drawing...so I may indeed be AI...who knows) or rendering text that doesn't come out as gibberish at best or a mess of cryptic, letter-looking random shapes at worst. Over the past year and a half AI text-to-image generators have made great progress in fixing those weaknesses, but one still remained...consistency, character consistency to be specific. But with the release of the latest update of Midjourney 6, everything may have changed with two simple new tags, --cref and --sref, which stand for "character reference" and "style reference" respectively. These are gigantic game changers for illustrated books, comics and illustrated novels. Let's take a look at these two features and run them through some tests to see how powerful they really are.

Character Reference

First, we'll take a look at the character reference tag, --cref. Like other Midjourney tags, it is usually placed at the end of a text prompt. In this case the tag accepts a web address (URL) as an argument. All you need to do is find the URL of an image of the character, or person, that you want to use and add that URL immediately after the --cref tag. You can even generate a character image in Midjourney and use that as the reference...as I did below with the prompt: "A full body character design, of a boy, age 10, with curly dark brown hair, light brown skin and dark brown eyes, in the style of traditional Disney animation. He's wearing an orange T-shirt, blue shorts and sneakers. Standing, facing the camera. Cute character." I figured that listing some specifics, such as the color and style of his hair, his eye color, and the orange shirt and blue shorts he has on, would help make the character more easily identifiable.
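If you plan to generate a lot of scenes with the same character, it can help to assemble the prompt text with a small script rather than typing the tag by hand every time. Here's a minimal sketch in Python. To be clear about what's assumed: the function name and the example.com URL are placeholders I made up for illustration, and the script only builds the text you would paste into Midjourney. It doesn't talk to Midjourney in any way.

```python
def with_character_reference(prompt: str, image_url: str) -> str:
    """Append a --cref tag, followed by the reference image URL, to a prompt.

    Hypothetical helper: it only formats text, mirroring the
    "prompt first, tag and URL at the end" layout described above.
    """
    return f"{prompt.strip()} --cref {image_url}"


# Placeholder URL (an assumption for illustration, not a real image).
DIEGO_URL = "https://example.com/diego.png"

print(with_character_reference("A boy playing soccer in a park", DIEGO_URL))
```

The scene description changes from image to image while the reference URL stays the same, which is the whole idea behind the tag.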

In this case I was targeting a character for an illustrated children's book. After a few tries the prompt above got me the character below.

Our base character, Diego.

Now that we've got our base character design, let's put this character reference thing to the test and put him in a variety of locations and situations. We'll have him playing soccer, exploring a dark cave, and cautiously wandering around a haunted house at Halloween. I'm going to test out controlling different expressions, and we'll have some fun with cheap store-bought costumes at Halloween.

Diego...doing Diego stuff.

All in all, Midjourney did a pretty decent, even impressive, job of staying true to our base character even when he was put in different environments with different lighting and camera angles. One thing to note is that in order to put Diego in different outfits we had to modify the weight of our character reference with the --cw tag. By default the character weight is set to 100, which means Midjourney will attempt to stay consistent with ALL aspects of your character: the face, the hair, and the outfit they're wearing. Setting the character weight to 0 tells Midjourney to concern itself ONLY with the consistency of the character's face...pretty much. So we set --cw to 0 for our Halloween images.
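To make the weight part concrete, here's a rough sketch of how a prompt with both tags might be put together in Python. The function name and the example.com URL are hypothetical, and the 0 to 100 range check is my assumption based on the two values discussed above (100 as the default, 0 for face only). As before, this only builds the prompt text.

```python
def with_character_weight(prompt: str, image_url: str, weight: int = 100) -> str:
    """Build a prompt ending in --cref <url> --cw <weight>.

    Hypothetical helper. The 0..100 bounds are an assumption taken from
    the two values mentioned in the text: 100 (default, keep face, hair
    and outfit) and 0 (keep the face only).
    """
    if not 0 <= weight <= 100:
        raise ValueError("character weight is assumed to be between 0 and 100")
    return f"{prompt.strip()} --cref {image_url} --cw {weight}"


# Placeholder URL (an assumption for illustration, not a real image).
DIEGO_URL = "https://example.com/diego.png"

# Halloween scene: weight 0 so the costume can replace the usual outfit.
print(with_character_weight("A boy in a cheap store-bought vampire costume", DIEGO_URL, 0))
```

Leaving the weight at its default keeps the orange shirt and blue shorts, while dropping it to 0 frees up the outfit for costume scenes.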

These turned out pretty consistent from a character standpoint but, as you can probably tell, the style of each image, while remaining "hand illustrated", still varies quite a bit. This is where the other part of the equation comes in: style reference, or the --sref tag.

Style Reference

To be able to create images for a book we need not only consistency in the character, but also in the art style itself. Let's create some new images, based around the same situations as above, but we'll specify an artistic style for each.

90's Disney Movie

First let's try the look of a hand-animated Disney movie from the late 80's and early 90's, circa "The Little Mermaid" or "Beauty and the Beast".

90's Disney Animated Style

Keep in mind this is the Halloween storyline so the fact that he's not wearing his iconic orange shirt and blue shorts is okay. But I really love the fact that it kept the color scheme even with the new outfits.

Pixar Movie

Let's step it up a few decades and see how Diego might look as a character in a more modern 3d rendered Pixar film. For this one let's revisit our mysterious cave environment.

Modern Day Pixar

Not too bad. Midjourney took our 2d, traditional character and made him 3d while doing a pretty good job of sticking true to our base character design. The biggest variations, that would have to be fixed, are the type of shirt he's wearing and subtle differences in the flashlight design. Other than that, I think Diego looks pretty good in 3d.

Pushing the Limits

So how far can we take this style reference thing? How far can we stray from our original hand-drawn 2d style? Let's see, and in typical Disney fashion...we'll take an existing animated image and remake it as a live-action movie...because...why not. For the record, I'm not a big fan of this trend, but I also can't think of a better way to test the boundaries of this style reference tag, so...here's my first attempt at a live-action version of Diego, made by feeding the style reference a screenshot from a live-action movie.

Not quite "live-action" yet...

Okay, well, that turned out...interesting. The background, while admittedly out of focus, does have a more photographic vibe to it, but poor Diego seems to be a sort of Pixar/live-action mashup. There's no way this passes as live-action. Let's try again, but this time we'll bring the full might of the style reference tag into play. Did I forget to mention that both the character reference and the style reference tags accept MULTIPLE URLs? That's correct, you can provide more than one reference image to your character or style reference simply by separating the URLs with spaces (example: --sref [url 1] [url 2] [url 3]).
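The space-separated format is easy to get wrong when you're juggling several long URLs, so here's a small Python sketch that joins them for you. The function name and the example.com URLs are made up for illustration, and the script only produces the tag text to append to a prompt.

```python
def reference_tag(tag: str, urls: list[str]) -> str:
    """Format a reference tag followed by one or more space-separated URLs.

    Hypothetical helper: `tag` is the tag name without the dashes,
    e.g. "sref" or "cref".
    """
    if not urls:
        raise ValueError("at least one reference URL is needed")
    return f"--{tag} " + " ".join(urls)


# Placeholder URLs (assumptions for illustration, not real images).
style_urls = [
    "https://example.com/movie-still.jpg",
    "https://example.com/actor-photo.jpg",
]

print("A boy exploring a dark cave " + reference_tag("sref", style_urls))
```

The same helper works for the character reference by passing "cref" as the tag name.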

In addition to our live-action Marvel movie reference, I added a photo of a young Ralph Macchio (I always loved the original "Karate Kid"), and with our new and improved prompt Midjourney created the following image:

Live-Action Diego

This worked out incredibly well as my goal was to get a human live-action look, not to actually have a young Ralph Macchio play Diego. I think Midjourney nailed it. In order to continue forward visualizing scenes from a live-action Diego movie I used this image as a single character reference image and was able to generate the following stills.

Other than the fact that soccer Diego's eyes are clearly locked on an opponent and not the ball, I'm pretty happy with these results.

This is Just the Beginning

As you can see, Midjourney's new character reference tag combined with the style reference tag makes it an amazingly powerful visual tool for storytelling. It doesn't work perfectly every time (I went through more than a few generations that I didn't show above), but it just takes a little bit of patience and trial and error to get the results you want. I expect a surge of comics, illustrated story books and novels to flood Amazon's self-publishing tool in the near future.

AI-generated video is also making huge strides toward being a useful tool for creatives. Combine that progress with the consistency shown here, and I fully expect things to become very interesting in video production within the next year. Big changes are coming. This is only the beginning.

the Creative Sight
