
Illustrating Dystopian Future Poems with Stable Diffusion and Dall.e 2 AIs - Comparing Directed with Non-Directed Output

Her addiction was the feeling of scoring a bargain. Image by Hugging Face Stable Diffusion Demo based on a prompt by David Arandle (TET).

Far from being fearful of AI image generators like Dall.e 2 and Stable Diffusion, I feel a new skill set is emerging for artists who embrace the technology. Chief among those skills are the ability to write 'quality' text prompts, and the ability to curate the best outputs, i.e. images, if they are being fed into a bigger project or art piece.

Case in point: one of my FB friends shared an article by Jesus Diaz, AI was made to turn David Bowie songs into surreal music videos, in which YouTuber aidontknow fed the lyrics of Bowie's Space Oddity into Midjourney AI and then curated the resulting images into a video clip for the song (embedded below).

While aidontknow says they made minimal changes to the lyrics, such as clarifying which characters are being spoken about in the song, from my experience you don't get quite that cohesive a range of images without also suggesting an art style and giving more detailed direction, such as camera shots etc.

Regardless, this inspired me to revive one of a series of dystopian future poems I wrote between 2005 and 2006 dealing with the human condition, virtual reality consumerism, and AIs. The poem is titled, Rachel. The video of me reciting it is actually the very first entry in this blog (because this blog was initially going to be an art piece telling the story of those poems; if you read that post it's actually the start of a story rather than a blog post).

One at a time I entered each line from my ten line poem into both DreamStudio's Stable Diffusion AI and Dall.e 2's AI, with no modification to the lines other than appending "An Oil painting the style of Blade Runner the movie. Wide angle lens." to the end of each prompt. This was to give every image a unified look and, when I wrote the poems, I always imagined a Blade Runner style to the art.
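If you wanted to script this step rather than paste each line in by hand, the prompt construction is just string concatenation. Here's a minimal Python sketch, with a hypothetical poem_lines list standing in for the poem's ten lines (only the first line appears; the rest of the poem isn't reproduced here):

```python
# Minimal sketch: append the same style suffix to every poem line so all
# the generated images share a unified look.
STYLE_SUFFIX = "An Oil painting the style of Blade Runner the movie. Wide angle lens."

# Hypothetical stand-in for the ten lines of "Rachel"; only line one is shown.
poem_lines = [
    "Rachael patched in a circuit wired to her brain.",
    # ...the remaining nine lines...
]

prompts = [f"{line} {STYLE_SUFFIX}" for line in poem_lines]
for prompt in prompts:
    print(prompt)
```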

You can see this Blade Runner influence in the digital art image I created for another of the poems (below) called, Stealing. This image is the first and only complete artwork I made at the time.

Stealing. One of nine dystopian future poems written by TET in 2005-2006. Art by TET.

Incidentally, you can read more about my whole concept, and read another of the poems, The Fabulous Machine, in my TET Life blog article, Virtual Reality Addiction Meets Online Shopping and Death! But I digress.

Feeding my ten lines into both AIs, I generated eight images per line with DreamStudio (which was free at the time) and four with Dall.e 2, though I only had enough free credit left to enter nine of my ten lines into the latter.
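For anyone generating locally instead of through DreamStudio's web interface, producing a batch of candidates per line for later curation might look like this sketch using the open-source diffusers library (the model checkpoint and batch size here are my assumptions, not what DreamStudio runs under the hood):

```python
# Sketch: generate several candidate images per prompt so the best one can
# be curated afterwards. Uses the open-source diffusers library, not
# DreamStudio's actual backend.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint
    torch_dtype=torch.float16,
).to("cuda")

prompt = ("Rachael patched in a circuit wired to her brain. "
          "An Oil painting the style of Blade Runner the movie. Wide angle lens.")

# Eight candidates per line, mirroring the batch size used with DreamStudio.
images = pipe(prompt, num_images_per_prompt=8).images
for i, image in enumerate(images):
    image.save(f"rachel_line01_candidate_{i}.png")
```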

I curated the best images from both AIs into a video presentation that includes the poem's words, plus YouTube library music and other sound effects. Hopefully it gives you a sense of the poem, its mood, and what it's trying to convey.

Most of the more polished, cleaner looking images are by Dream AI, which tended to fixate on the neon-lit darkness of the city depicted in Blade Runner, while the more painterly images are Dall.e 2's, which must feel the textured oil paint look is the definition of 'oil painting'.

At this point I thought I was going to finish the project, but then I started to wonder: how would my video presentation look with more directed images that actually describe more of the type of image I had in mind for each line of the poem?

For example, for the first line of the poem I originally entered this prompt:

Rachael patched in a circuit wired to her brain. An Oil painting the style of Blade Runner the movie.

For my second video presentation I entered this prompt:

Rachael, sitting on her bed, wearing a VR headset wired to a computer in her cyberpunk style bedroom, patched in a circuit wired to her brain. An Oil painting the style of Blade Runner the movie. Wide angle lens.

As you can see, it's a lot more detailed and not the exact line verbatim. Below is a side by side comparison of what I feel is the best image produced for each prompt, both generated by Dream AI.

Side by side comparison of Dream AI's output with an unedited, direct interpretation of the first line of my poem used as a prompt on the left. On the right, the prompt included more description, along the lines of what I had in mind for an image when I wrote the poem.

Which image is the better interpretation of the first line is entirely subjective, especially as what I envisioned in my head is not necessarily the vision anyone else would imagine reading my poem for the first time, because no one else has all the additional context I do.
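That said, if you wanted to isolate the effect of the extra description alone, one option when running Stable Diffusion yourself is to hold the random seed fixed across both prompts, so the prompt text is the only variable. A sketch, again with diffusers (the web tools I used pick their own seeds, so this isn't how the comparison images above were actually made):

```python
# Sketch: A/B comparison of a bare poem line versus a more directed prompt,
# with the random seed held fixed so only the prompt text changes.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

bare = ("Rachael patched in a circuit wired to her brain. "
        "An Oil painting the style of Blade Runner the movie.")
directed = ("Rachael, sitting on her bed, wearing a VR headset wired to a "
            "computer in her cyberpunk style bedroom, patched in a circuit "
            "wired to her brain. An Oil painting the style of Blade Runner "
            "the movie. Wide angle lens.")

for name, prompt in [("bare", bare), ("directed", directed)]:
    generator = torch.Generator("cuda").manual_seed(42)  # same seed both runs
    pipe(prompt, generator=generator).images[0].save(f"comparison_{name}.png")
```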

Below is my updated video presentation, using the best images generated with my more detailed input prompts (same music and sound effects, just to save time).

Note that I didn't use Dall.e 2 this time because I didn't want to pay for more credit. I also didn't use DreamStudio directly, for similar credit issues (I'm not like Rachael, spending all my money on zeroes and ones for fun). Instead I used Hugging Face's demo version of Stable Diffusion, which is slower but essentially the same AI with a few fewer settings, and completely free at the time of writing. (Insert rant here about all these AIs putting up paywalls rather than going the free, ad-supported route.)
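The demo I used is a web page, but for completeness, Hugging Face's hosted models can also be reached from code. A sketch using the huggingface_hub library's InferenceClient (the model name is an assumption, and free-tier availability and rate limits vary):

```python
# Sketch: calling a hosted Stable Diffusion model via Hugging Face's
# inference API rather than the web demo. Requires the huggingface_hub
# package; free-tier access and rate limits vary.
from huggingface_hub import InferenceClient

client = InferenceClient()  # a token can be passed for higher rate limits

prompt = ("Rachael patched in a circuit wired to her brain. "
          "An Oil painting the style of Blade Runner the movie. Wide angle lens.")

# Model name is an assumption; the demo's exact checkpoint isn't stated.
image = client.text_to_image(prompt, model="stabilityai/stable-diffusion-2-1")
image.save("rachel_line01_hf.png")
```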

Anyway, what do you think of my second video presentation?

Creating works like this really does show that the human element of writing quality prompts is very much a skill to be learned, as is the curation of the output. Not every prompt produces the results you are hoping for, particularly if the AI fixates on the wrong part of a prompt as the main subject to highlight.

Several times in my second presentation I completely scrapped detailed prompts that I thought should get good results but were just producing garbage (never is the computing adage "garbage in, garbage out" more apt than with text-to-image AI generators).

As I said in my previous musing on AIs, Is Your Next Design or Writing Partner an AI?, these algorithms do not actually think for themselves. Even if you were to use a writing AI to randomly generate prompts for an image AI, neither would have any grasp of the output as an abstract concept, or of how that concept might relate to other prompts. The human element is still key to getting the best images.

I'm tempted to try this with all my poems in this series. It seems very appropriate to use AI to generate images for poems about AI and how humans are finding more ways to hook themselves into 'the machine' for longer and longer periods at a time (not to mention the rise of corporate money machines passively draining your bank account - did I mention all the text-to-image AIs being put behind paywalls yet?).

One of my original concept sketches for Stealing, drawn alongside the poem in 2005.
I guess the ultimate experiment would be for me to execute the project the way I initially envisioned it back in 2005, with a combination of digital collage images mixed with my own hand drawn sketches. I'd also need to write the accompanying narration that links the poems together, which I think was a kind of future noir detective story. I'm not sure, because my first blog post in this blog is the only part of that I actually wrote.

The question is, now that I've been influenced by AI text-to-image generators, could I even produce what I had in mind back in 2005?

