
The Latest AI Image Game Changer

First it was AI-generated text that seemed to have all the answers (or at least did a great job of making things up if it didn't), then there were AI-generated images that were winning local and international art and photo contests. But even these tools, as magical as they were, had their weaknesses, their Achilles' heel. For images it's been things like drawing hands, which, to be honest, I still have one heck of a time with myself (I may indeed be AI...who knows), or rendering text that isn't gibberish at best or a mess of cryptic, letter-looking random shapes at worst. Over the past year and a half, AI text-to-image generators have made great progress in fixing those weaknesses, but one still remained...consistency, character consistency to be specific. But with the release of the latest update of Midjourney 6, everything may have changed with two simple new tags, --cref and --sref, which stand for "character reference" and "style reference" respectively. These are gigantic game changers for illustrated books, comics and illustrated novels. Let's take a look at these two features and run them through some tests to see how powerful they really are.

Character Reference

First, we'll take a look at the character reference tag, --cref. Like other Midjourney tags, it is usually placed at the end of a text prompt. In this case the tag accepts a web address (URL) as an argument. All you need to do is find the URL of an image of the character, or person, that you want to use and append that URL immediately after the --cref tag. You can even generate a character image in Midjourney and use that as the reference, as I did below with the prompt: "A full body character design, of a boy, age 10, with curly dark brown hair, light brown skin and dark brown eyes, in the style of traditional Disney animation. He's wearing an orange T-shirt, blue shorts and sneakers. Standing, facing the camera. Cute character." To help make the character more easily identifiable, I listed some specifics, such as the color and style of his hair, his eye color, and the orange shirt and blue shorts he has on.

In this case I was targeting a character for an illustrated children's book. After a few tries the prompt above got me the character below.

Our base character, Diego.

Now that we've got our base character design, let's put this character reference thing to the test and put him in a variety of locations and situations. We'll have him playing soccer, exploring a dark cave, and cautiously wandering around a haunted house at Halloween. I'm going to test out controlling different expressions, and we'll have some fun with cheap store-bought costumes at Halloween.
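If you'd rather assemble these prompts in a script than type them by hand, here is a minimal sketch of the idea. It only builds the text you would paste into Midjourney; it doesn't talk to Midjourney at all. The URL is a made-up placeholder, and the scene descriptions are illustrative stand-ins, not the exact prompts I used.

```python
# Hypothetical placeholder; substitute the real URL of your character image.
DIEGO_URL = "https://example.com/diego.png"


def with_character_reference(prompt: str, image_url: str) -> str:
    """Append the --cref tag and its image URL to the end of a text prompt."""
    return f"{prompt.strip()} --cref {image_url}"


# Illustrative scene descriptions (assumptions, not the prompts from this post).
scenes = [
    "a boy playing soccer on a sunny field",
    "a boy exploring a dark cave with a flashlight",
    "a boy cautiously wandering through a haunted house at Halloween",
]

for scene in scenes:
    # Each scene gets the same character reference so the character stays consistent.
    print(with_character_reference(scene, DIEGO_URL))
```

The only thing that changes from scene to scene is the description; the tag and the reference URL stay the same.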

Diego...doing Diego stuff.

All in all, Midjourney did a pretty decent, even impressive, job of staying true to our base character even when he was put in different environments with different lighting and camera angles. One thing to note is that in order to put Diego in different outfits we had to modify the weight of our character reference with the --cw tag. By default the character weight is set to 100, which means Midjourney will attempt to stay consistent with ALL aspects of your character: the face, the hair, and the outfit they're wearing. Changing the character weight to 0 means that Midjourney is ONLY concerned with consistency of the character's face...pretty much. So we had the --cw tag set to 0 on our Halloween images.
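Here is a small sketch of how the weight fits into the prompt text. I'm assuming the weight is a whole number from 0 to 100, since those are the two endpoints discussed above. The function name, the placeholder URL and the costume description are all my own inventions for illustration.

```python
def with_character_weight(prompt: str, image_url: str, weight: int = 100) -> str:
    """Append --cref plus a --cw weight to a text prompt.

    Assumption: weights run from 0 (face only, pretty much) to 100
    (face, hair and outfit), the two endpoints described in the post.
    """
    if not 0 <= weight <= 100:
        raise ValueError(f"character weight must be between 0 and 100, got {weight}")
    return f"{prompt.strip()} --cref {image_url} --cw {weight}"


# Hypothetical placeholder URL and an illustrative Halloween prompt.
halloween = with_character_weight(
    "a boy in a cheap store-bought vampire costume, trick-or-treating",
    "https://example.com/diego.png",
    weight=0,  # keep the face, let the outfit change
)
print(halloween)
```

Dropping the weight to 0 is what frees the outfit to change while the face stays recognizable.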

These turned out pretty consistent from a character standpoint, but as you can probably tell, the styles of these images, while remaining "hand illustrated", still vary quite a bit. This is where the other part of the equation comes in: style reference, or the --sref tag.

Style Reference

To be able to create images for a book we need not only consistency in the character, but also in the art style itself. Let's create some new images, based around the same situations as above, but we'll specify an artistic style for each. 

90's Disney Movie

First, let's try the look of a hand-animated Disney movie from the late '80s and early '90s, circa "The Little Mermaid" or "Beauty and the Beast".

90's Disney Animated Style

Keep in mind this is the Halloween storyline so the fact that he's not wearing his iconic orange shirt and blue shorts is okay. But I really love the fact that it kept the color scheme even with the new outfits.

Pixar Movie

Let's step it up a few decades and see how Diego might look as a character in a more modern 3d rendered Pixar film. For this one let's revisit our mysterious cave environment.

Modern Day Pixar

Not too bad. Midjourney took our 2d, traditional character and made him 3d while doing a pretty good job of sticking true to our base character design. The biggest variations, that would have to be fixed, are the type of shirt he's wearing and subtle differences in the flashlight design. Other than that, I think Diego looks pretty good in 3d.

Pushing the Limits 

So how far can we take this style reference thing? How far can we stray from our original hand-drawn 2d style? Let's see, and in typical Disney fashion...we'll take an existing animated image and remake it as a live-action movie....because....why not. For the record, I'm not a big fan of this trend, but I also can't think of a better way to test the boundaries of this style reference tag. Here's my first attempt at a live-action version of Diego, made by feeding the style reference a screenshot from a live-action movie.

Not quite "live-action" yet...

Okay, well, that turned out...interesting. The background, while admittedly out of focus, does have a more photographic vibe to it, but poor Diego seems to be a sort of Pixar/live-action mashup. There's no way this passes as live-action. Let's try again, but this time we'll bring the full might of the style reference tag into play. Did I forget to mention that both the character reference and the style reference tags accept MULTIPLE URLs? That's correct, you can provide more than one reference image to your character or style reference simply by separating them with spaces (example: --sref [url 1] [url 2] [url 3]).
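As a rough sketch, here is one way to put several reference URLs behind a single tag when building the prompt text. The helper names and every URL are hypothetical placeholders; the only thing taken from Midjourney itself is the space-separated format shown above.

```python
from typing import Sequence


def reference_tag(tag: str, urls: Sequence[str]) -> str:
    """Return a tag followed by one or more space-separated image URLs."""
    if not urls:
        raise ValueError(f"{tag} needs at least one image URL")
    return f"{tag} {' '.join(urls)}"


def build_prompt(
    text: str,
    character_urls: Sequence[str] = (),
    style_urls: Sequence[str] = (),
) -> str:
    """Join the prompt text with optional --cref and --sref tags."""
    parts = [text.strip()]
    if character_urls:
        parts.append(reference_tag("--cref", character_urls))
    if style_urls:
        parts.append(reference_tag("--sref", style_urls))
    return " ".join(parts)


# Hypothetical placeholder URLs: one character image and two style images.
print(
    build_prompt(
        "a boy exploring a dark cave with a flashlight",
        character_urls=["https://example.com/diego.png"],
        style_urls=[
            "https://example.com/movie-still.jpg",
            "https://example.com/portrait.jpg",
        ],
    )
)
```

Keeping the URLs in lists makes it easy to add or drop a reference image between attempts without retyping the whole prompt.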

In addition to our live-action Marvel movie reference, I added a photo of a young Ralph Macchio (I always loved the original "Karate Kid"), and with our new and improved prompt Midjourney created the following image:

Live-Action Diego

This worked out incredibly well as my goal was to get a human live-action look, not to actually have a young Ralph Macchio play Diego. I think Midjourney nailed it. In order to continue forward visualizing scenes from a live-action Diego movie I used this image as a single character reference image and was able to generate the following stills.

Other than the fact that soccer Diego's eyes are clearly locked on an opponent and not the ball, I'm pretty happy with these results.

This is Just the Beginning

As you can see, Midjourney's new character reference tag combined with the style reference tag makes it an amazingly powerful visual tool for storytelling. It doesn't work perfectly every time (I went through more than a few generations that I didn't show above), but it just takes a little patience and trial and error to get the results you want. I expect a surge of comics, illustrated story books and novels to flood Amazon's self-publishing tool in the near future.

AI-generated video is also making huge strides toward being a useful tool for creatives. Combine that progress with the consistency shown here and I fully expect things to become very interesting in video production within the next year. Big changes are coming. This is only the beginning.


