I created this ‘Action Figure’ trend, and I have some thoughts.

Friday 11th April 2025




Imagine walking into a toy store and spotting yourself - boxed up, glossy-eyed, and surrounded by your daily essentials. That’s the latest AI image generation trend making waves across social media: people are posting pictures of themselves, or of famous people, as action figures created using ChatGPT.

OpenAI’s image capabilities using DALL·E have always felt good but not great compared to other image generation models. Midjourney has always been a front-runner, and for my own projects, I’ve steered towards Leonardo.ai. However, with the recent update to its model, OpenAI’s image generation has become immensely powerful, positioning the company as a strong contender in the image generation space.

Every generation starts with a good prompt, and I opted to put more detail into mine to produce an incredibly specific result.

Create a photorealistic action figure of the person in the photo. The figure should be full-body and placed inside a clear plastic box with a colourful cardboard background, just like real collectible toys.

Make the packaging look as realistic as possible - shiny plastic, a top hook for hanging, and toy-store-style design.

The figure should have dark brown hair, and be wearing a purple suit jacket and a white top, with purple suit jeans, and smart black shoes. They should be wearing purple glasses.

Place accessories next to the figure that reflect its style and image:
This should include:
- A PowerPoint clicker
- A laptop
- A microphone
- A rucksack
On the box:

At the top, write in large letters: Tommy Hills
Below that - description: Performer

Make the image as realistic as possible — as if it's a real toy you'd find in a store.
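For anyone wanting to reuse this prompt for a different person, the same wording can be filled in from a handful of fields. The snippet below is only a rough sketch of that idea: the function name `build_action_figure_prompt` and its parameters are my own hypothetical choices, and it does nothing more than assemble the text, which would still be pasted into ChatGPT alongside the reference photos.

```python
def build_action_figure_prompt(name, description, outfit, accessories):
    """Hypothetical helper: fill the action figure prompt from a few fields."""
    # One bullet per accessory, in the same style as the original prompt.
    accessory_lines = "\n".join(f"- {item}" for item in accessories)
    return "\n\n".join([
        "Create a photorealistic action figure of the person in the photo. "
        "The figure should be full-body and placed inside a clear plastic box "
        "with a colourful cardboard background, just like real collectible toys.",
        "Make the packaging look as realistic as possible - shiny plastic, "
        "a top hook for hanging, and toy-store-style design.",
        outfit,
        "Place accessories next to the figure that reflect its style and image:\n"
        f"This should include:\n{accessory_lines}",
        f"On the box:\nAt the top, write in large letters: {name}\n"
        f"Below that - description: {description}",
        "Make the image as realistic as possible - as if it's a real toy "
        "you'd find in a store.",
    ])


prompt = build_action_figure_prompt(
    name="Tommy Hills",
    description="Performer",
    outfit=(
        "The figure should have dark brown hair, and be wearing a purple suit "
        "jacket and a white top, with purple suit jeans, and smart black shoes. "
        "They should be wearing purple glasses."
    ),
    accessories=["A PowerPoint clicker", "A laptop", "A microphone", "A rucksack"],
)
print(prompt)
```

Swapping the name, description, outfit, and accessory list is enough to produce a prompt for someone else while keeping the packaging instructions identical.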

There are a few key choices in the prompt that made a noticeable difference to the result:

The other requirement? Supplying the model with reference images.

I snapped a quick portrait and added several shots from my performing portfolio. I uploaded these images directly into ChatGPT with the prompt, using the image upload feature on desktop. This gave the model a solid anchor when shaping the action figure’s face - at least in theory.

And then it was time to generate. It’s at this point I noticed how long the model takes to produce its image. It took around 5 minutes to generate, which is by far the longest of any image model I’ve tried. But here were the results:

Action figure generations from ChatGPT

Successes

Struggles

Oddities

Now this was a really strange part of the generations. In the prompt, I outlined four items: a laptop, a microphone, a PowerPoint clicker and a rucksack. Yet in all three generations, five objects were added to my action figure. In the first two generations, I have two rucksacks and in the second I have two laptops. I question why it is sticking so rigidly to five objects when I only asked for four. Perhaps there is some ‘template’ or reference in the dataset it is using that requires five objects?

End

So here’s my take on this trend: the improvements being made to image generators are making it really easy to get a fantastic result.

I’ve already started using OpenAI’s image generation in my own projects, with fantastic results. I’m excited to see what the next AI trend is going to be. My personal hope is for a Pop Vinyl trend!

Pop Vinyl generation from ChatGPT