Exploring the Creative Potential of AI-Generated Art
Posted on November 3, 2023 by Fusion Connect
Transcript for this Episode:
INTRODUCTION VOICEOVER: This is Tech UNMUTED. The podcast of modern collaboration – where we tell the stories of how collaboration tools enable businesses to be more efficient and connected. With your hosts, George Schoenstein and Santi Cuellar. Welcome to Tech UNMUTED.
SANTI: Welcome everybody to another episode of Tech UNMUTED. Today, I am here with a very dear friend and colleague from across the pond, Sam Husbands. Sam, welcome to Tech UNMUTED. Again, this is your second time, second appearance on the show.
SAM: Yes. Thank you so much for inviting me, Santi. It's a pleasure to be here. Got some really interesting stuff to go through today.
SANTI: We sure do. I can't wait for this particular podcast because we will be talking and looking at two different AI image-generating platforms, and we're going to be matching them up against each other and see what outcomes we get. We're talking about Midjourney, which is one that you're very familiar with. You use it quite a bit.
SANTI: Then DALL-E, which is a competing platform, which by the way, anybody can have access to DALL-E because they've embedded it inside of Microsoft Bing. If you go to your Microsoft Edge browser and you click on Bing, you'll see the Copilot window, and guess what? Those images are being generated by DALL-E. This will be fun today. This will be really good.
SAM: What's really interesting is I've been using Midjourney for nearly a year now. We have a paid subscription, but obviously, the Bing is free, open to everybody. It's a good test today to see if it's something that I could switch to, to save both time and budget.
SANTI: That's a good point. I will say that it looks like it's still in preview, and I think they give you 30 prompts that you're allowed to do per day or something of that degree. I'm sure there's probably some advanced version out there. I agree. I think let's just use the tool as we have it. There are some pros and cons to both of these, like anything else. You get different results depending on which one you use. DALL-E seems to be what they consider best when you're trying to create something that's a consistent artwork to your exact requirements.
They feel that DALL-E is a little bit more consistent. I don't know, we're about to find out. When it comes to the cons, DALL-E will refuse to generate images of copyrighted characters, which I found very interesting. Not only that, it won't let you ask for the art style of any living artist. I found that fascinating that they have these restrictions built into DALL-E. I'm not so sure that these things exist in Midjourney. Do they?
SAM: No, Midjourney seems a bit more relaxed when it comes to requesting styles and inspiration. We'll put DALL-E to the test today, we'll see if it rejects some of our requests.
SANTI: My understanding is that Midjourney, as a pro, seems to be the best for expressiveness and that artistic flair. Again, we'll take a look at it and see if that's the case. Doing some research, apparently, it falls short when it comes to generating hyper-realistic images. I think if we're going to create something to really challenge these two things, we should probably be creating something that's hyper-realistic just to put it to the test.
SAM: Yes, we'll give it a try. Let me ask you one thing before we start. Now, as I said, I've been using this for a year and we have been creating some very abstract, artistic images, my background behind me is one of them. I'm torn on whether AI can be creative. Does it understand creativity, or is it just constructing based on those databases? What do you think?
SANTI: Just based on my understanding of the evolution of AI over the last 10 years and all these different models, I'm going to say that AI has come to the point where there is a certain level of artificial comprehension.
SANTI: It is scary. It is scary. In other words, when machine learning first came out, it was pretty straightforward. It was a machine that would continue to fail and fail and fail until it finally gets a pass. When it passes, it would remember what steps it needed to take to pass to get the results that you wanted. It did that at a much quicker speed than any human could possibly do. Machine learning has been around for a while but it wasn't necessarily comprehending what was actually happening, it was just learning.
These new large language models that use natural language processing, that use visual computing, are actually to some degree comprehending what the prompt is, what they're actually reading. For example, if you give an AI an image, an AI that has a visual component to it, it will look at the image and comprehend, "Hey, this is an image of a dog. Oh, by the way, it's a bulldog." It will know what it's looking at.
I do think there's a level of comprehension, and I may even push the envelope here and say that maybe there's a slight feel or a level of some sort of artificial consciousness, honestly, because of the amount of comprehension and input that it's able to process. It's almost scary. I think it is comprehending.
SAM: That is very apt to one of the prompts that we're going to use today.
SANTI: Oh, that's good.
SAM: Do you think that crosses over into the AI understanding language? We're talking about art here, but if we think about it from a code creation or solving IT problems point of view, do you think it now comprehends the question and the solution and it can write code?
SANTI: For sure. Code is nothing more than a language, it is. When you're talking about programming and coding, all you're really doing is you're writing something in a specific language that your system can understand and comprehend. These AI models are absolutely learning the language. In fact, it will generate a unique set of code based on parameters you give it. Maybe it's an output or an executable that nobody's ever thought about before, but it can do that because it absolutely understands the language.
This is why I'm saying that there is a certain level of artificial comprehension here because it is able to write something based on your input. It is fascinating, it really is.
SAM: I think that discussion probably mutes the copyright laws and the infringement from an art perspective because it is creating new art based on prompts, brand new, inspired by other people, yes, but it's the same as we get into the code side of things as well. This is fresh, new code and new problems that it's solving.
SANTI: It is. I think it's like anything else. I think artificial intelligence is able to learn different styles, different color tones, different approaches to art. It's able to learn these things, comprehend them, and then decide to use them when appropriate based on the prompt. I don't think it's plagiarizing. I think it's learned the style approaches from other artists in the past. It studies it. Again, we're back to that comprehension. It is comprehending that.
I think it's getting to be a point where honestly we're just touching the surface now, but we're at a point now where AI is finally at a point where we have a launch pad and now over the next several years-- It's going to happen fast because we broke all the barriers that were holding us back. Now, it's just how far we want to take it. Over the next 10 years, I think AI is going to evolve faster than any other technology we've seen in our lifetime.
SAM: That's a great segue into the prompt. Should we dive in and--
SANTI: Sure. Tell you what, why don't you share your screen, and you'll have both Midjourney and DALL-E. While you do that, here's what we're going to try and do today, because it's hard to rate these things. I think we just give it a scale of 1 to 10 when we look at these images. Sam, let's you and I take into account just a few things when we're trying to rate it. Let's look at realism. How closely does it resemble the real world or scenes? Let's look at the detail. Let's rate that. Let's look at color and saturation, maybe some creativity to your point.
How creative was it based on the prompt? I guess artistic value, which is really more of a-- Does it have that aesthetic appeal? We'll look at those 5 things and then you and I can just give it 1 to 10 and see what happens.
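The rating approach described here (five criteria, each scored from 1 to 10) could be tallied with something like the following. This is only a minimal sketch: the criterion names are taken from the conversation, but the function name, the averaging step, and the sample scores are assumptions for illustration, not ratings the hosts actually gave.

```python
# Hypothetical tally for the five rating criteria mentioned in the episode.
# The averaging step and the sample numbers are illustrative assumptions.

CRITERIA = (
    "realism",
    "detail",
    "color_and_saturation",
    "creativity",
    "artistic_value",
)


def overall_score(ratings: dict[str, int]) -> float:
    """Average the five 1-to-10 ratings into a single score."""
    missing = set(CRITERIA) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for name in CRITERIA:
        if not 1 <= ratings[name] <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
    return sum(ratings[name] for name in CRITERIA) / len(CRITERIA)


# Made-up example scores, not taken from the episode.
sample = {
    "realism": 9,
    "detail": 8,
    "color_and_saturation": 7,
    "creativity": 9,
    "artistic_value": 7,
}
print(overall_score(sample))
```

Averaging is just one option; keeping the five scores separate would make it easier to see where two platforms differ.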
SAM: For anybody watching, you'll see that I'm in one of my Discord servers for art generation. You'll see some tests that I've done above. Just because of the topic, I thought it would be interesting to create some art that was generated or inspired by the movie Matrix.
SAM: I've got a couple of different prompts that should produce very different outcomes. The first is we're asking Midjourney to generate a hyper-realistic and lifelike software engineer who created the first AI machine. Surrounded by futuristic AI driven city, and then we want to give it some character, so we've asked it to be artistic, use the word noir to give it some moody feel. Then Cyberpunk because we know it's a very common form of art online and inspired by the movie Matrix as I said. To add some detail, I'm asking it to use an Unreal Engine 5 animation as the output.
That's it, let's see what that produces. Now, one of the things that I really like about Midjourney is the speed. Sometimes, if it's a really busy time, it could take a few seconds but as you can see here, the speed is already creating. It always produces a grid of four, so you have a choice.
SANTI: Sure. DALL-E does the same.
SAM: Oh, fantastic, okay. I'm looking forward to seeing if DALL-E has the same variability and able to change, upgrade, and expand, et cetera.
SANTI: I will tell you, right off the bat, that what I love about Midjourney is that you could see the process being created. The process, the image-- Whereas in DALL-E, it just shows you the image. It doesn't show you the process, it just, "Here's my results," so this is interesting. Is this one big image with four or can you click on these individually?
SAM: Yes. Once you expand and have a look, you choose your preferred option. They're all quite interesting, but I think we should go for the top right the first image that it produced. I'm simply going to upscale that image, and you have options. You can upscale by the power of 10 to make it incredibly high-res-
SANTI: Very nice.
SAM: -very detailed. You can also vary regions. If there's a small piece on there that you don't like, you can go into that image and you can just change that one particular region.
SANTI: Oh, that's awesome.
SAM: It's not a design tool, per se, obviously. When you're in charge of producing this art, the requests that come back are to change this and that tiny detail, which isn't always possible, but every month or so it's bringing out a new iteration where we can change more. I would say that this has done a really good job of being hyper-realistic.
SANTI: Yes, I like the shadowing on the faces. You have the lighting casting a shadow on certain parts of the face. I really like this. All right, you know what? Now, I'm excited. Let's go to DALL-E. I want to see this. Yes, so drop in the same prompt. Now, for those who are watching, basically Sam is inside of Microsoft Edge, which is Microsoft's browser, and he just hit that Copilot icon and just dropped in the same prompt into the text field. Just hit send.
SAM: I'm just going to ask you a question before I do because I've never actually used this before. It's the first time I've used DALL-E. Now, in Midjourney, it knows that it's going to be creating an image, so I don't have to say-
SANTI: Same here.
SAM: -it's a photo. Okay.
SANTI: Same here. Yes, it will comprehend what you're trying to do. Watch, you'll see. I asked myself the same question because I want to know, "Hey, should I be creating some type of--?" Well, look at this. This is interesting.
SAM: Very interesting.
SANTI: Tell you what, let's just stop responding, and then just add the prompt at the beginning, create an image and then drop in your rest. That's interesting. I have done it to where I just drop in the prompt without necessarily saying create an image, and it will know that I'm trying to create an image. Okay, it understood. [crosstalk] Well, look at this. High demand, can you believe it? This is what happens when you do a live podcast.
SAM: This is telling. We talked at the beginning around-- Is this something that we can realistically use on a free basis?
SAM: Now, if you don't pay for the premium upgrades in Midjourney, it has exactly the same problem. It will pick and choose based on usage, when you can generate, and you only have a very limited amount.
SANTI: Let's try it again.
SAM: It will change from second to second.
SANTI: Yes, let's try it again. See if it will give you a pass this time. Everybody decided to use DALL-E today. That's what-- Yes, interesting.
SAM: I'll tell you what we'll do is we'll jump back to Midjourney and we'll run a prompt that's very, very similar, but it's got a slightly different top and tail just to see the different output.
SANTI: Then we'll come back later and see if DALL-E's up for the challenge.
SAM: Exactly that.
SANTI: I really love this whole Midjourney AI platform. This is amazing.
SAM: Okay. For anybody not watching, the difference in prompt is instead of asking for a hyper-realistic and lifelike software engineer, we've asked for a cartoon software engineer. Instead of being inspired or utilizing Unreal Engine 5 as an animation tool, we've asked it to style like a Warner Brothers classic cartoon.
SANTI: Oh, nice.
SAM: Apart from that, the middle prompt is exactly the same.
SANTI: Got it, yes. You see, it has to comprehend what you're asking. I mean, it can't generate these things unless there's some form of comprehension in putting it together. Look how-- I love this because the image starts to appear in phases. This is what I love about Midjourney. I can see the process of the image being created. Do you find that a lot of times the first one created is the best one?
SAM: I don't think I know the answer to that because we use all of them. There isn't generally one that's worse or better. Most of our requests are very similar for the style that we want. I think it's just pop up where it appears. Okay, so--
SANTI: Look at the details here.
SAM: The detail is amazing. The colors are incredible. Obviously, the prompts are quite simple and because the prompts are simple, it's ad-libbing more. Normally, we have prompts that could be five, six, seven sentences long. This is one, but that produces such an array of styles or colors, tones, and creativity. That's where it's being creative. When you give it less information, it's forced to provide bigger options. I think we should upscale top left.
SANTI: I agree.
SAM: Top left is much more colorful.
SANTI: Yes, I like that.
SAM: The engineer in the middle is more central, very similar to the hyper-realistic piece.
SANTI: That's what you do. You hit the U2 for upscaling the second image, and there you go. Wow, look at this.
SAM: Here, you can see the difference. You could vary it slightly so that it might get rid of the birds in the background. We could vary it strongly and that might actually change the computers to servers.
SANTI: Can you just bring up the image full size so I can see? Look at that. Oh, look at the details on the suits.
Listen, there is creativity, right? There is creativity there. This is a unique image, like there's no other image like this on the web right now.
SAM: Nothing, 100% unique. You can see, like me, for fans of the movie, The Matrix, you can see that there is inspiration drawn from it, but nothing exact. It has the right style, it has some of the landscape shots, but it's not ripped off from--
SANTI: Correct. I wonder if DALL-E's working. Let's go check out DALL-E.
SAM: Yes, let's go back in.
SANTI: Come on, DALL-E. This is your time to shine.
SAM: We go back to the original request, the realistic piece.
SANTI: Yes, see if it takes it this time. For those who can't see, basically we're getting a warning saying that it can't create the image right now due to high demand, so again, this is just DALL-E built into Microsoft Edge, right?
SAM: We are not in luck.
SANTI: We are not in luck, so while we would love to match up these two platforms, this free version of DALL-E is just not working, which is a shame because I think it would have been cool to see the differences. All right. Wait, it is generating something, isn't it? Scroll down. Oh, no, it's not.
SAM: I'm going to give it one more go.
SAM: Now, we all know that me being in the UK, it's three o'clock, it's everybody's downtime right now, so I'm surprised that it's saying no because this is the slow time, but of course, for you, it's first thing.
SANTI: What I'm going to do is, I'm just going to do a little test of my own. Why don't you-- I think you have a third prompt, right?
SAM: I do.
SANTI: Why don't you go ahead and create the third prompt on Midjourney, and let me see if I can get these images created on DALL-E while you do that. If I do have success, I'll just share my screen and show you the results.
SAM: Now, the third prompt, it's the same middle prompt, but we're asking for a pencil drawing of a software engineer, and to be inspired by a well-known pencil artist. I must admit, I'm not a fan of pencil artists, so I had to look up who was popular. We have Adonna Khare. I don't know much about the artist. We shall see if it picks up a certain style. I think it's quite easy to comprehend how you could go from lifelike to cartoon because we've had filters that have done that for years. Going from lifelike to a detailed pencil drawing, not an outline drawing, a detailed pencil drawing is a very, very different style of generation.
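The three prompts in this episode share one middle section and differ only in their opening and closing. That structure can be sketched as a small template. This is a rough illustration rather than the hosts' actual workflow: the exact prompt wording is paraphrased from the descriptions in the transcript, and the names `SHARED_MIDDLE`, `VARIANTS`, and `build_prompt` are hypothetical.

```python
# Hypothetical sketch of the "same middle, different top and tail" prompts.
# The wording is paraphrased from the transcript, not the hosts' exact text.

SHARED_MIDDLE = (
    "who created the first AI machine, surrounded by a futuristic "
    "AI-driven city, artistic, noir, cyberpunk, inspired by the movie The Matrix"
)

# Each variant is an (opening, closing) pair wrapped around the shared middle.
VARIANTS = {
    "realistic": (
        "hyper-realistic and lifelike software engineer",
        "Unreal Engine 5 animation",
    ),
    "cartoon": (
        "cartoon software engineer",
        "in the style of a classic Warner Brothers cartoon",
    ),
    "pencil": (
        "pencil drawing of a software engineer",
        "inspired by Adonna Khare",
    ),
}


def build_prompt(opening: str, closing: str, middle: str = SHARED_MIDDLE) -> str:
    """Join a variable opening and closing around the shared middle section."""
    return f"{opening} {middle}, {closing}"


prompts = {name: build_prompt(o, c) for name, (o, c) in VARIANTS.items()}

for name, text in prompts.items():
    print(f"{name}: {text}")
```

Keeping the middle fixed means any difference in the output can be put down to the opening and closing alone, which is what makes the three results comparable.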
SANTI: Wow, look at this. That is unbelievable. Look.
SAM: Very nearly done.
SANTI: Oh, my goodness.
SAM: The detail in these is magnificent.
SANTI: No, this is mind-blowing. Look at the detail. Look at the tones and the edges. Wow. Is that a hand?
SAM: Yes, let's upscale the first one.
SANTI: Yes. By the way, DALL-E is working for me. I will have images to show, and I'm actually going to go ahead and drop in the other prompts while you walk us through Midjourney, and then I'll switch screens and show everybody what I have but it is very interesting. The results, I think you'll be surprised.
SAM: Okay. Over to you, Santi.
SANTI: All right.
SAM: Let me switch to share. Bring my screen up.
SANTI: I dropped in the exact same prompt you did, right? Hyper-realistic and lifelike software engineer who created the first AI machine surrounded by a futuristic AI-driven city. Exact same prompt. These are the results that I got. This is the first image.
SAM: It uses text.
SANTI: It does use text sometimes.
SAM: Big difference.
SANTI: I know, look at this second image. Look at the third. Look at this one. But here's what I find interesting. There is a woman.
SAM: That is wonderful to see. There's been some big debates around Midjourney and the fact that it stereotypes.
SANTI: Maybe because we said software engineer. We didn't say anything about a male or female, and it threw in a female software engineer into the mix. Look at the details. This is absolutely beautiful. That was-- Look at this one. The second one, right?
SAM: Instantly, I think the feel is a bit grander, a bit more movie-esque. They all feel like they would be amazing front covers for a sci-fi movie.
SANTI: Interesting. Look at this one. With the skull on the back of the jacket. It's just fascinating. All right, so anyway, so that was your first prompt. Let's see, it should have generated the image for the second prompt already because I had-- yes, and sure it did. Here is your second prompt. Now, the second prompt was the cartoon one, right? Cartoon software engineers. It says here, look at the response DALL-E has given me. I see you have changed your prompt. I'll try to create a new graphic art based on your new prompt. Here's the first one. Again, look at this. Very different.
SAM: Very different.
SANTI: Very different from the Midjourney. Look at this one.
SAM: I think that comes down to its interpretation of the word classic when we've asked for a classic Warner Brothers style.
SANTI: Look at this fourth one. Look at this fourth one. Now, in this case, it did give us four men. There was no female in this one, which I find interesting. Well, you know what? Now, I'm really curious. Let me go grab the third prompt. The one you just did with the pencil. Let me just grab that real quick. Let's drop it in here. By the way, as you can see, I'm not really prompting it to create an image. It knows I'm trying to create an image. At no time did I. It's interesting how you had a different result than I did. Now, you get to see the generation process, right, how it generates it?
You'll see that it's not as exciting as Midjourney's, right? It gives me a little prompt, a little response. See?
SAM: That's interesting. It's actually given a description of the artist in copy and text form that we have not asked for. It's also, looks like it's showing us to her website. It's giving us more detail about her, but what an amazing piece.
SANTI: Fascinating, right? It's fascinating. Now, see, this is what I'm talking about. I don't see the images being created. I just get this placeholder that says, "Hey, we're generating your image." To me, it's not as exciting. I'm just waiting for it. With Midjourney, you see it, like the layers coming in, right? It's a different experience. I do love the fact that it gave me the extra history and background on the artist that we selected. That is just fascinating. Now, she won several awards, exhibited her drawings in many galleries and museums. It just gives you this amazing response. Oh, look. Sam, look. Another female. Look at the details.
SAM: Now, interestingly, I think this looks more like a hand-drawn pencil by an incredible artist, albeit, than it did on Midjourney.
SANTI: That is true. Look at this one.
SAM: Same, yes.
SANTI: It really does look like somebody sat down and did it by hand.
SAM: I think this has definitely added another tool to our creative armory.
SANTI: Yes. As long as it's not being overwhelmed by others using it.
SAM: I think the piece that sticks out for me is that the difference, same prompt, different outcome. When you're using all AI tools and you're-- let's go back to the code discussion, you need to create a piece of code to fix a certain problem.
SAM: How different is one piece of code generated in GPT going to be from another piece of code that gets to exactly the same answer, but in a different tool? Obviously, with art, it's a bit more creative. It's not as linear as it is. There's probably a lot more options, but that could be an interesting thing to test next.
SANTI: It is fascinating. If I had to rate the overall experience of Midjourney, to me, is a better experience than using DALL-E inside of Bing. Just the overall experience, the whole platform, the way it layers in, the final image, you can see it being produced before your eyes. When it comes to Bing, I love the fact that it gives you a response because in all, and with all prompts, it gave me a response. I said, "Hey, let me work on that." "Oh, by the way, that artist that you want me to mimic, let me tell you a little bit about her while I'm creating your image." I think that's just phenomenal, right?
SAM: Yes, very good.
SANTI: I love that. From an experience standpoint, I'd give them both like a pretty high, eight, but for different reasons because I enjoy them differently. I do find Midjourney to be a little bit more engaging, maybe, the whole process.
SAM: We do have to bear in mind that this is the paid Midjourney experience.
SANTI: Yes, that's true.
SAM: In the free version, it's less, and you have to use public servers and it is a different experience.
SANTI: Good point.
SAM: I can't wait to see what DALL-E is like as a paid experience as well. I think today has shown us that the majority of requests, you will be able to use DALL-E and you will be able to get roughly what you're looking for. It'd be interesting to now spend some time in it to see how you can change individually but it's also given me enough interest to go back into the paid version if there is such a thing anymore and investigate.
SANTI: From a realistic standpoint, I think they both did a great job. I think they were both equally as good from the realistic aspect, the artistic aspect, the creativity aspect. I really think that they're at par. They were just different. Even though it was the same prompt, the result was just different. I have a hard time saying, that one is better than the other. I don't know what your take is on that.
SAM: Let's ask for some comments. I think anybody watching or listening, if you have a second to tell us which one you think produces better images, we'd love to hear. I'm similar to you, Santi. I think both have some amazing pros. I'm not seeing too many cons.
SANTI: No, not really.
SAM: Because they're just producing really eye-catching art.
SANTI: Even though DALL-E did create pencil art that really does look like somebody sat down and did it, whereas Midjourney's pencil art feels more computer-generated.
SANTI: They were both fascinating. Again, just because they are different, I can't say that one was better than the other because the quality of the art-- This is a great episode, but at the same time, it's hard because I can't really scale it. I can't say, "Hey, yes, I'm going to give them a five." I feel like they both deserve a 10. It's just that they're different. It's a different experience, right? I almost feel like, to your point, if you're really trying to find the right image, I think you need to use both platforms. Knowing that you're going to get different outputs, similar but different.
By the way, I do love the fact that we didn't specify any gender. We just said a software engineer and DALL-E twice gave us a female, and I find that fascinating.
SAM: I think that alone is worth an extra point because if it's doing that there, where else is it doing the same thing? That's a real bonus.
SANTI: DALL-E gets an 11 on a scale from 1 to 10. All right. Listen, this was a lot of fun. We'll do some post-production so that we can bring the images up for folks to actually see and admire. In fact, I'll tell you what, I'll bring the images. I'll lay them out as follows. We'll put all the DALL-E images on the left side of the screen. We'll put the Midjourney images on the right. By all means, in the comments section, tell us what you think. Which would you choose? Which platform do you prefer? Is there something that you saw or a takeaway that you had that we didn't cover that you want to mention? We'd love to hear from you.
Sam, I really want to thank you for this. This is awesome. I love the fact that you do this for a living. I think it's fascinating that this is part of your job. I really appreciate you coming on today. Folks, this is a good time to remind you to subscribe. Subscribe to Tech UNMUTED on your favorite podcast platform. That means all your audio platforms as well as YouTube so that you can see some of these visuals that we're referring to. Until next time, stay curious, stay connected. See you next time.
CLOSING VOICEOVER: Visit www.fusionconnect.com/techunmuted for show notes and more episodes. Thanks for listening.
Produced by: Fusion Connect
Tech UNMUTED, the podcast of modern collaboration, where we tell the stories of how collaboration tools enable businesses to be more efficient and connected. Humans have collaborated since the beginning of time – we’re wired to work together to solve complex problems, brainstorm novel solutions and build a connected community. On Tech UNMUTED, we’ll cover the latest industry trends and dive into real-world examples of how technology is inspiring businesses and communities to be more efficient and connected. Tune in to learn how today's table-stakes technologies are fostering a collaborative culture, serving as the anchor for exceptional customer service.