2026-05-03 · 4 min read

Prompting GPT Image 2

OpenAI's image model is the king of text-in-image. A short guide to posters, signs, menus, and the tricks that make it sing.

There are three reasons to pick GPT Image 2 in the FluxGen model picker:

  1. You need real text in the image — a sign, a poster headline, a menu, an infographic.
  2. The image has many components and you want them arranged sensibly.
  3. You're willing to wait a bit longer for a more deliberate result.

GPT Image 2 thinks before it draws. It breaks your request into parts, decides how those parts fit together, and then generates an image that reflects that internal plan. Most prompt techniques here exist to help that plan land where you want it.

Prompt structure

OpenAI recommends one ordering: scene → subject → details → constraints. It maps to how the model plans: establish the scene, place the subject, layer in details, then apply constraints last.

For complex prompts, short labelled segments or line breaks beat a single paragraph. The model reads structure.
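As a sketch, the scene → subject → details → constraints ordering can be captured in a small helper that assembles labelled segments on separate lines. The function name and segment labels are illustrative, not part of any official API:

```python
def build_prompt(scene, subject, details=None, constraints=None):
    """Assemble a prompt in the recommended order:
    scene -> subject -> details -> constraints.
    Labelled segments on separate lines give the model structure to read.
    """
    parts = [f"Scene: {scene}", f"Subject: {subject}"]
    if details:
        parts.append("Details: " + "; ".join(details))
    if constraints:
        parts.append("Constraints: " + "; ".join(constraints))
    return "\n".join(parts)

prompt = build_prompt(
    scene="a rain-soaked city street at dusk",
    subject="a neon diner sign",
    details=["reflections in puddles", "soft cyan rim light"],
    constraints=["no people", "no watermark"],
)
```

Keeping each segment on its own line makes it easy to change one part per iteration without disturbing the rest.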

Text rendering — the actual killer feature

This is where GPT Image 2 leaves every other model behind. Three rules to get the most from it:

Quote the exact text. Always wrap the string you want rendered in straight quotes:

Billboard text (EXACT, verbatim): "Fresh and clean". Typography: bold sans-serif, high contrast, centred, tight kerning.

Specify typography as a constraint. Don't describe the mood; describe the type. Bold sans-serif. Heavy weight. White on black. Tight tracking. The model treats these as targets.

Use medium or high quality for any small text. Low quality can produce legible text on big posters but often blurs body copy on slides, menus, infographics, or anything dense.

A prompt that combines all three:

A pitch-deck slide titled "Market Opportunity" in the style of a Series A fundraising deck. Layout: TAM / SAM / SOM concentric circles on the left; three short bullets on the right with believable market-sizing numbers. Typography: bold sans-serif headline, regular weight body text. Constraint: no logos or trademarks.

That's a slide GPT Image 2 will produce competently. Most other models will produce something that looks like a slide but with garbled text.
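The quoting and typography rules can be folded into one helper that always wraps the string in straight quotes and states type as explicit targets. This is a sketch; the function name and phrasing template are assumptions, not a documented format:

```python
def text_block(exact_text, typography):
    """Build the text-rendering portion of a prompt: quote the exact
    string, mark it verbatim, and state typography as explicit targets.
    (Quality is a generation setting, not prompt text: use medium or
    high whenever the rendered text is small.)
    """
    return (f'Text (EXACT, verbatim): "{exact_text}". '
            f"Typography: {typography}.")

fragment = text_block("Fresh and clean",
                      "bold sans-serif, high contrast, centred")
```

A helper like this mostly guards against the easy mistake: paraphrasing the string you want rendered instead of quoting it verbatim.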

Picking a quality tier

A useful workflow: rough out the layout on medium, then re-run the same prompt on high once you're happy.
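That workflow is easy to mechanise. The sketch below takes any `generate(prompt, quality)` callable; an image API would slot in there, but the callable's signature is an assumption for illustration:

```python
def iterate_then_finalize(prompt, generate, drafts=3):
    """Rough out the layout on medium, then re-run the SAME prompt on
    high once the composition looks right. `generate` is any
    callable(prompt, quality) -> image; its signature is assumed here.
    """
    previews = [generate(prompt, "medium") for _ in range(drafts)]
    final = generate(prompt, "high")
    return previews, final
```

The point of keeping the prompt identical between tiers is that only the quality setting changes, so the high-quality run stays close to the preview you approved.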

Editing and reference images

GPT Image 2 accepts reference images. The single most important habit when editing:

Restate the preserve list every iteration.

Change the bottle label to read "MAGNOLIA". Keep everything else the same: the bottle shape, the cork, the lighting, the background, and the surrounding glassware. No watermark, no extra text.

If you skip the preserve list, the model drifts — backgrounds shift, lighting changes, supporting objects rearrange. The preserve list is not optional.
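One way to make restating the preserve list automatic is to keep it as data and append it to every edit instruction. A minimal sketch, with the helper name and wording as illustrative assumptions:

```python
PRESERVE = [
    "the bottle shape", "the cork", "the lighting",
    "the background", "the surrounding glassware",
]

def edit_prompt(change, preserve=PRESERVE):
    """Append the full preserve list to every edit iteration so the
    model doesn't drift between edits."""
    kept = ", ".join(preserve)
    return (f"{change} Keep everything else the same: {kept}. "
            "No watermark, no extra text.")

prompt = edit_prompt('Change the bottle label to read "MAGNOLIA".')
```

Because the list lives in one place, every iteration restates the same constraints even after a dozen edits.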

When you have multiple reference images, refer to them by index and say how they interact:

Apply the typography style from image 1 to the layout in image 2.

Things that don't work

Detailed camera specs. 85mm f/1.4 with chromatic aberration is interpreted loosely. Use camera language for vibe, not for physics.

Brand and IP names. "In the style of Studio Ghibli" either gets refused or drifts. Describe the aesthetic instead: hand-drawn animation, soft pastel palette, painterly cloud textures, gentle magical realism.

Long paragraphs of text inside the image. Even GPT Image 2 has limits. If you need a full paragraph of body copy, generate the layout with a placeholder block and overlay the text later.
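The overlay step is ordinary image compositing. A sketch using Pillow (assumed available; the function name, fixed line height, and placeholder-box convention are illustrative):

```python
from PIL import Image, ImageDraw, ImageFont

def overlay_body_copy(image_path, text, box, out_path):
    """Draw real body copy onto a generated layout. The model renders
    the layout with an empty placeholder block; drawing the paragraph
    ourselves guarantees every character is accurate.
    `box` is the (x, y) top-left corner of the placeholder block.
    """
    img = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    x, y = box
    for line in text.split("\n"):
        draw.text((x, y), line, fill="black", font=font)
        y += 14  # fixed line height; measure with font.getbbox for precision
    img.save(out_path)
```

In practice you would swap in a real typeface via `ImageFont.truetype` so the overlaid copy matches the generated headline.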

When to choose GPT Image 2 over Nano Banana

Nano Banana is faster and excellent at photographic scenes. GPT Image 2 wins decisively the moment text accuracy or layout reasoning matters: posters, slides, menus, comic panels, UI mockups, infographics — anything with structured copy. If your image has words in it, this is the model.

Two-line summary

Write the constraint-and-text parts as if you were briefing a designer. Run small text and dense layouts on high quality.

The rest is iteration. Keep the prompt mostly stable; change one thing at a time.