How (not) to Use ChatGPT's New Image Generation Features
- David Recine
- May 1
- 5 min read

I conducted an experiment: who can make an eye-catching image faster, ChatGPT or a human? A few weeks ago, this would have been no contest. It would have been the human all the way! But times change quickly in the AI "Wild Wild West" we're living in. A little over a month ago, OpenAI unveiled ChatGPT's new image generation model, and it is definitely a smoother ride than the previous image-gen functionality. But don't just take my word for it. The Verge reports that Adobe and Figma, two of the biggest names in image creation and graphic design, are both adopting OpenAI's new image creation model.
Before we get to the experiment, it would be helpful to take a closer look at the current popular perception of AI imagery.
People Dislike AI Images
One of the biggest problems here is that there is simply no mistaking an AI-generated image. This is bad because scholarly studies (such as this one from MIT) show that audiences have a strong bias against content they believe is AI-generated.
This phenomenon is borne out anecdotally as well. As a Wisconsinite, I personally witnessed this bias in action on my Facebook feed in April. After the 2025 Wisconsin Supreme Court election concluded and the candidate backed by Elon Musk lost, a political image began to make the rounds: a huge piece of cheese crushing a Tesla in front of the Capitol Square in Madison. To my surprise, my friends who shared the image didn't express political anger; instead, they expressed irritation that the image might be AI! As I discussed in my LinkedIn post on this matter, it wasn't AI; it was in fact created by a fairly prominent and talented artist. But even the perception of AI can cause people to dislike an image. (You'll find the picture itself in that LI post.)

Irritation is still engagement, though, right? That non-AI image made the rounds! But I would argue that the reason people engaged so much with the image was that it wasn't AI. Even in the throes of AI anxiety, audiences are instinctively drawn to a human artist's unique "voice."
Imagery is the Sloppiest Part of AI Slop
There has been plenty of complaining about AI slop. The Guardian has even characterized it as a full-on menace to society. Interestingly, when people complain about or fight AI slop, they tend to target imagery. The Washington Post wrote a very damning article about fake movie trailers made with AI. (Full disclosure: I personally like those.) And Pinterest, one of the world's top image sharing websites, just announced measures to reduce the presence of AI slop in its system.
AI Images are Bland Images
Everyone's using AI images these days! This is understandable; with ChatGPT's late-March image generation model rollout, it's easier than ever to make a passably good AI image.
Now, if only a few people had access to this, it would be their content marketing secret weapon. Their images would be intriguingly different from everything else out there.
But since everyone has AI, our social media feeds and ads are full of pictures that all look the same!
To fully understand why this is a problem, think of a human artist or studio you like. Maybe Dr. Seuss, or Charles Schulz, or Banksy. Now imagine that the majority of the posts and ads you saw were done in that style. Even if you loved the style, you'd probably start tuning it out. It would certainly become far less special.

Still, the appeal for content designers and companies is undeniable: AI can make a passably good image far more easily and quickly than a human can, right? Let's find out.
The Experiment
I decided I wanted an image for my most recent LinkedIn post (as of this writing). I started a timer and grabbed the nearest art supplies in my office: a college-ruled notebook, a ballpoint pen, and a baggie of crayons and colored pencils that I hadn't used in ages. In several minutes, I'd created this:

In a few more minutes, I'd scanned the image and added a caption in MS Paint:

No prompt writing, and I didn't even truly prompt myself. While I knew the theme of my post, I started to draw with no plan as to where the drawing would go. In just shy of ten minutes, I'd made something eye-catching and brimming with personality. I uploaded it to LI, added some text, and made a post that was reasonably well-received.

Next, I went to ChatGPT. Using the same ethos, I prompted with an open, exploratory mind. I gave ChatGPT only the very basic idea behind my drawing, prompting it as follows:
PROMPT:
Create image of an anxious looking woman, with the words "Welcome to the AI Anxiety Club"
And this is what I got:

Not as expressive, duller colors, and it looks unmistakably "OpenAI." This is a lot less likely to catch a viewer's eye than my hand-drawn version.
Still, generating this only took a minute! With careful prompting, could I get something that had the personality of my hand-drawn image? Several iterations and several minutes later, I had this:

Not what I intended or hoped for, and still about as "AI-bland" as the first attempt. So I continued to work toward the quality of my hand-drawn image. Here is what I got in 15 minutes, compared to what I'd drawn by hand:

I persisted nonetheless. In a little over 20 minutes, this is what I'd arrived at with ChatGPT:

Let's again compare that to my hand-drawn, MS-Paint-enhanced image, which took a few seconds shy of 10 minutes:

If Everyone Drew Stick Figures, Wouldn't That Be Bland Too?
My answer to that would be no. Why? Because no one draws stick figures quite like I do. Or quite like you do. Or quite like anyone else does. If we all did stick figures, we'd still each be able to carve out our own unique content "voice" in a way that AI slop images can't. But we don't all have to do stick figures. We can take photos, or download royalty-free ones. We can iterate in a Canva or Figma template more quickly and freely than we can with AI. I would argue that AI is simply not the best option for images that will draw in audiences, lead to clicks, or drive demand gen and sales conversion.
AI's Fundamental Problem: Art is Not Like Writing
Here's why I think AI may never truly arrive when it comes to image generation: it's much harder to fix an LLM's artistic shortfalls than its writing shortfalls. Do the em dashes in your AI prose feel too robotic? Are a few turns of phrase too trite? You can retype those trouble spots in just a few minutes. Not so with an LLM's artwork! To smoothly redraw what you don't like in a ChatGPT iteration, you need specialized software, and certainly more than a few minutes' time. So I say leave art to the humans. It's faster and it's better. And even if it weren't... why let the LLM have all the fun?