A real lifestyle photo with light study-note annotations layered on top, not an illustration and not a flat-design poster. You upload your photo and the model preserves it as the base (no face swap, no composition changes, no over-beautification), then overlays white hand-drawn outline annotations on a handful of everyday objects in the frame. Each annotation labels its object with the English word and one short A1-A2 example sentence in white handwritten-style text. An optional tiny chibi study buddy appears in a corner if your photo has a person or a pet.

Photo vocabulary scrapbook — sunlit kitchen counter with kettle, bread, fruit bowl, mug, moka pot, basil plant labelled in white hand-drawn study notes

What you bring

A lifestyle photo: your kitchen counter, your desk, your closet, your café table, your bookshelf, anywhere with everyday objects worth labelling.
A vocab count: how many objects to annotate. 5-6 is recommended for legibility. You can ask for more, up to a sanity max of 30, but the model has to fit that many word-and-sentence pairs in the frame, so the labels may become harder to read.
Chibi study buddy: auto (a tiny mascot derived from the person or pet in your photo, only if one is present) or off (skip the chibi entirely).
Decoration density: minimal (a few sparkles, dots, arrows) or moderate (more sparkles, tiny icon glyphs, underlines).
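
For readers who think in config, the three choices above can be sketched as a small options object. This is an illustration only, not FluxGen's actual schema: the class name, the field names, the defaults, and the lower bound of 1 are all assumptions. Only the two enum value sets, the 5-6 recommendation, and the sanity max of 30 come from this page.

```python
from dataclasses import dataclass

MAX_VOCAB_COUNT = 30        # sanity max stated on this page
RECOMMENDED_MAX = 6         # upper end of the recommended 5-6 range

CHIBI_MODES = {"auto", "off"}
DENSITY_MODES = {"minimal", "moderate"}


@dataclass(frozen=True)
class ScrapbookOptions:
    """Hypothetical container for the user-supplied choices."""
    vocab_count: int = 5                 # default is an assumption
    chibi: str = "auto"                  # default is an assumption
    decoration_density: str = "minimal"  # default is an assumption

    def __post_init__(self):
        # The lower bound of 1 is an assumption; the page only states the max.
        if not 1 <= self.vocab_count <= MAX_VOCAB_COUNT:
            raise ValueError(f"vocab_count must be 1-{MAX_VOCAB_COUNT}")
        if self.chibi not in CHIBI_MODES:
            raise ValueError(f"chibi must be one of {sorted(CHIBI_MODES)}")
        if self.decoration_density not in DENSITY_MODES:
            raise ValueError(
                f"decoration_density must be one of {sorted(DENSITY_MODES)}"
            )

    def exceeds_recommended(self) -> bool:
        """True when the count is allowed but above the legible 5-6 range."""
        return self.vocab_count > RECOMMENDED_MAX
```

A count above the recommended range is accepted rather than rejected, which mirrors the page's stance that asking for more is the user's call.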

What you get back

Your original photo, preserved in its lighting, composition, identity, and authentic camera feel, with thin white hand-drawn outlines traced around 5-6 (or however many you asked for) clearly visible everyday objects, each connected to a small white-text label nearby. Each label has two lines: the English word in larger handwritten type, and one short A1-A2 example sentence (4-8 words, simple subject-verb-object, basic vocabulary, no subordinate clauses, no idioms) in slightly smaller handwritten type underneath. Useful for English learners, language teachers, social-media-friendly study notes, and personal scrapbook pages.

The photo stays a photo

The prompt's top-locked PRESERVE-THE-PHOTO section forbids: turning the photo into an illustration / painting / render, face-swapping, identity drift, over-beautification, body re-shaping, adding objects that weren't in the original, aggressive colour grading, or recropping the composition. The annotations live on top of the photo, layered over the existing frame — they don't replace it.

A1-A2 grammar discipline

Every example sentence stays at A1-A2 register: 4-8 words, simple subject-verb-object, basic high-frequency vocabulary, no subordinate clauses, no idioms, no rare words. Sample sentences: "I drink hot tea every morning." / "She reads the book at night." / "We have bread for breakfast." / "The lamp is on the desk." The text is English-only — no translations into other languages anywhere in the frame.
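
If you want to screen your own example sentences against the same rules, a rough check might look like the sketch below. It is not how the prompt enforces the register. It only counts words (4-8) and looks for a few subordinating words, and both the function name and the word list are hypothetical and deliberately short. It cannot judge vocabulary frequency or idioms.

```python
import re

# Hypothetical, intentionally small list; a real check would need far more.
SUBORDINATORS = {"because", "although", "while", "which", "unless", "whereas"}


def looks_a1_a2(sentence: str, min_words: int = 4, max_words: int = 8) -> bool:
    """Crude screen: word count in range and no listed subordinator."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not min_words <= len(words) <= max_words:
        return False
    return not any(word in SUBORDINATORS for word in words)


samples = [
    "I drink hot tea every morning.",
    "She reads the book at night.",
    "We have bread for breakfast.",
    "The lamp is on the desk.",
]
# Each sample sentence from this page has 5 or 6 words and passes the screen.
all_pass = all(looks_a1_a2(s) for s in samples)
```

A sentence such as "I drink tea because the morning is cold." stays within 8 words but fails the screen because of "because".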

Cost & timing

8 credits per run, and each run takes roughly 60-75 seconds. Your 20 signup credits cover two runs (16 credits, with 4 left over) before you need to top up.
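
The arithmetic behind that figure, as a tiny sketch (the function name is made up for illustration; the 8 and 20 are the numbers quoted above):

```python
CREDITS_PER_RUN = 8    # cost quoted on this page
SIGNUP_CREDITS = 20    # signup balance quoted on this page


def runs_available(balance: int, cost: int = CREDITS_PER_RUN) -> tuple[int, int]:
    """Return (full runs affordable, credits left over)."""
    return divmod(balance, cost)


runs, leftover = runs_available(SIGNUP_CREDITS)  # (2, 4)
```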


Inspired by @MrLarus on X, whose seed prompt "随拍学英语! 用 ChatGPT-Image2 把照片变成『看图学单词手账』" ("Learn English from snapshots! Use ChatGPT-Image2 to turn a photo into a 'picture-vocabulary study journal'") frames a lifestyle photo as a vocabulary study page. The dense Chinese original was distilled into FluxGen's scaffolding: top-locked PRESERVE-THE-PHOTO fidelity rules, an instruction fence around the user-supplied vocabCount, per-enum FORBIDDEN-cue blocks for the chibi-helper and decoration-density modes, A1-A2 grammar discipline (4-8 word sentences, no subordinate clauses, basic vocabulary only), and explicit output requirements. The user picks the count, up to a sanity max of 30, and the prompt makes clear that asking for more than the photo can fit legibly is the user's call.