Prompting Nano Banana
Google's image models read sentences, not keyword lists. A short guide to working with Nano Banana 2 and Nano Banana Pro inside FluxGen.
Nano Banana is Google's image model family — Nano Banana 2 (fast) and Nano Banana Pro (sharper) both live in the FluxGen model picker. They share one core trait that sets the tone for every prompt you write: they read sentences, not keyword lists.
A prompt that works on a tag-based model — cyberpunk, neon, rain, 4k, masterpiece — produces noisy, conflicted output here. A short paragraph almost always does better. Describe the scene as if you were briefing a photographer.
The five-part formula
Google's own guide names five elements that, together, cover almost any image:
- Subject — who or what is in frame, with one specific feature
- Composition — close-up, wide shot, low angle, portrait
- Action — what the subject is doing
- Location — where the scene takes place
- Style — the overall aesthetic
You don't need to label them out loud. Just make sure each one shows up somewhere in your sentence.
A working prompt:
A close-up portrait of an elderly Japanese ceramicist with deep, sun-etched wrinkles and a warm smile, inspecting a freshly glazed tea bowl. His rustic workshop is sun-drenched. Captured with an 85mm portrait lens, soft golden-hour light through a window. Serene, masterful mood.
Subject. Action. Location. Composition (close-up + 85mm). Style (golden hour, serene). One beat per phrase.
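If you generate prompts in bulk, the five elements are easy to assemble mechanically. A minimal sketch — the helper and its field names are illustrative, not part of FluxGen's API:

```python
def build_prompt(subject, action, location, composition, style):
    """Join the five elements into one flowing, sentence-style prompt.

    The parts are joined as prose, not as a comma-separated tag list,
    because the model reads sentences.
    """
    return (
        f"{composition} of {subject}, {action}, "
        f"in {location}. {style}."
    )

prompt = build_prompt(
    subject="an elderly Japanese ceramicist with sun-etched wrinkles",
    action="inspecting a freshly glazed tea bowl",
    location="a sun-drenched rustic workshop",
    composition="A close-up portrait",
    style="Soft golden-hour light through a window, serene mood",
)
```

The point is not the function itself but the checklist it enforces: a prompt built this way can't silently drop one of the five elements.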
Pick the variant by intent
- Nano Banana 2 — fast at lower resolutions and broader on aspect ratios, including extreme panoramas. Use it for exploration, drafts, and high-volume work.
- Nano Banana Pro — slower, but with much lower error rates on small text and fine detail, and stronger at typography-heavy compositions. Use it when you've decided on the shot and want the polished version.
A common workflow: iterate ten times on Nano Banana 2 until the composition is right, then switch to Pro for the keeper.
Editing a photo you upload
Both variants accept a reference image. The rule for edits is simple: say what to change, say what to keep.
Using the provided image of the living room, change only the blue sofa to a vintage brown leather chesterfield. Keep the rest of the room — pillows, lighting, floor — unchanged.
That second sentence does most of the work. Without it, the model drifts: walls shift colour, the rug changes, lamps move. Naming the preserved elements pins them in place.
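Because the preservation sentence is the part people forget, it helps to template it in. A sketch under the same caveat — the names here are made up for illustration, not a FluxGen feature:

```python
def edit_prompt(target, replacement, keep):
    """Phrase an edit as: what changes, then what stays.

    Always appends the preservation sentence, since naming the
    preserved elements is what pins them in place.
    """
    keep_list = ", ".join(keep)
    return (
        f"Using the provided image, change only {target} "
        f"to {replacement}. "
        f"Keep the rest of the image unchanged: {keep_list}."
    )

p = edit_prompt(
    target="the blue sofa",
    replacement="a vintage brown leather chesterfield",
    keep=["pillows", "lighting", "floor"],
)
```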
For multi-reference prompts, assign each image a role:
Use image 1 as the pose reference and image 2 as the painting style. Render the figure from image 1 in the style of image 2.
Things that don't work the way you'd expect
"No cars." Negative phrasing is unreliable. Replace it with a positive description of the absence: "an empty deserted street with no signs of traffic."
Style stacking. "Watercolor + cyberpunk + Pixar" produces mush. Pick one dominant style and let the others appear as small accents at most.
Vague text in images. If you want specific words rendered, quote them: the sign reads "URBAN EXPLORER". Without quotes the model paraphrases.
Changing too many variables at once. Keep 80–90% of your prompt identical between iterations and shift one or two things. The model is then making a small move, not a fresh image.
Character consistency
When iterating on the same character across multiple prompts, repeat the distinguishing features — hair colour, glasses, outfit, build — in every prompt. Identity drifts otherwise. The first prompt sets the character; the rest re-anchor it.
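Small iterations and re-anchored identity are the same mechanical habit: hold a fixed block of text constant and vary one slot. A sketch — the character description and slot name are invented for illustration:

```python
# Fixed identity block: repeated verbatim in every prompt so the
# character doesn't drift between generations.
CHARACTER = (
    "a wiry detective with silver hair, round glasses, "
    "and a rumpled grey trench coat"
)

def iterate(scene):
    """Vary only the scene slot; everything else stays identical."""
    return f"A cinematic medium shot of {CHARACTER}, {scene}."

drafts = [
    iterate("examining a clue under a streetlamp at night"),
    iterate("sprinting across a rain-slick rooftop"),
]
```

Each draft differs from the last by one clause, so the model makes a small move rather than a fresh image.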
Two-line summary
Write a sentence, not a list.
For edits, name what changes and what stays.
That covers about 80% of what you'll do with Nano Banana in FluxGen. The other 20% is taste — and for that, just generate more.