A traditional product photoshoot costs $500-$3,000, takes a week, and produces 20-40 final images. AI product photography collapses that into about 10 minutes and 70 credits, but only if you follow a workflow. Generic prompts produce generic, stiff-looking output that hurts conversion rates instead of helping. This is the workflow we use to ship product photos that actually convert.

What you'll need

A reference photo of your actual product (a phone shot on white background works).
A Viral Engine account (70 free credits get you a full first session).
A clear sense of where the photo will live: PDP hero, ad creative, lifestyle for social, packaging, etc. Different uses, different shots.

Step 1: Start with a reference

This is the single biggest difference between cheap-looking AI product photos and ones that pass for real. Without a reference, the model invents a product that resembles yours but isn't yours. Customers spot the mismatch almost immediately and bounce.

Upload your reference using image-to-image on Flux or use the multi-reference feature (up to 5 reference images with @image1 through @image5 tags). The model now has to render your specific product, not a generic one.
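As a rough sketch of the multi-reference tagging, the helper below maps uploaded files to the @image1 through @image5 tags in upload order and enforces the 5-image limit. The function name and file names are hypothetical, and how the tags are then placed inside a prompt is not covered here.

```python
MAX_REFERENCES = 5  # the multi-reference feature accepts up to 5 images


def reference_tags(image_paths):
    """Map each reference image to its @imageN tag, in upload order.

    Hypothetical helper for illustration; not part of any documented API.
    """
    if not image_paths:
        raise ValueError("at least one reference image is required")
    if len(image_paths) > MAX_REFERENCES:
        raise ValueError(f"at most {MAX_REFERENCES} reference images are supported")
    return {f"@image{i}": path for i, path in enumerate(image_paths, start=1)}


tags = reference_tags(["mug_front.jpg", "mug_side.jpg", "mug_label.jpg"])
for tag, path in tags.items():
    print(tag, "->", path)
```

Keeping the mapping explicit makes it easy to check which tag refers to which angle of the product before you write the prompt.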

Step 2: Pick the model for the use case

PDP hero shot, packaging, print: Imagen 4 Ultra. The photorealism ceiling is the reason it exists.
Lifestyle scenes (product in context): Imagen 4 Standard. Strong on people-with-product.
Iteration and exploration: Nano Banana 2. Fast loop while you find the right direction.
Variations of a winning shot: Flux with seed locking. Same composition, different lighting or backgrounds.
Surgical fixes (label readability, background swap): Flux inpainting.
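If you script any part of this, the model choices above reduce to a small lookup table. This is a minimal sketch: the use-case keys are made-up labels, and the values are the model names as written in this guide, not identifiers from any API.

```python
# Use-case keys are hypothetical labels; values are the model names from this guide.
MODEL_BY_USE_CASE = {
    "pdp_hero": "Imagen 4 Ultra",
    "packaging": "Imagen 4 Ultra",
    "print": "Imagen 4 Ultra",
    "lifestyle": "Imagen 4 Standard",
    "exploration": "Nano Banana 2",
    "variations": "Flux with seed locking",
    "surgical_fix": "Flux inpainting",
}


def pick_model(use_case):
    """Return the recommended model for a use case, or fail loudly."""
    try:
        return MODEL_BY_USE_CASE[use_case]
    except KeyError:
        known = ", ".join(sorted(MODEL_BY_USE_CASE))
        raise ValueError(f"unknown use case {use_case!r}; expected one of: {known}") from None


print(pick_model("pdp_hero"))     # Imagen 4 Ultra
print(pick_model("exploration"))  # Nano Banana 2
```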

Step 3: The 5-axis prompt for products

The structure from the prompt engineering guide applies, with product-specific tweaks:

Subject (be specific about the product)

Don't say "a coffee mug." Say:

A matte black ceramic coffee mug with subtle texture, brushed metal handle, holding lightly steaming dark-roast coffee, faint foam ring at the rim

Material, finish, condition, contents. The more specific, the more your reference photo guides the model.

Style

product photography, clean, on-white, lit-for-detail
commercial product shot, branded, magazine-ready
lifestyle product photography, contextual, environmental
flat lay product, overhead, arranged
moody product still life, dark, dramatic, painterly

Composition

Where is the product in the frame? What's around it?

centered, three-quarter angle, classic product hero
off-center to the right with negative space on left for headline copy, ad creative
overhead flat lay with complementary props, lifestyle
extreme close-up of the brand mark, detail shot

Lighting (the conversion driver)

Lighting separates "AI product photo" from "actual product photo." Generic prompts get default flat lighting. Named lighting gets shipped:

soft north-window light from camera right, premium feel, painterly
hard side lighting from camera left, sculpted shadows, dramatic, masculine
high-key studio with two-to-one ratio, clean, fashion
overhead diffused softbox, even, e-commerce on-white
warm golden-hour backlight, soft fill from front, lifestyle

Camera and lens

shot on 50mm prime, f/2.8, shallow depth of field, natural perspective
shot on 85mm portrait lens, f/2, flattering compression for products with people
macro lens, 1:1 magnification, extreme detail
shot on Hasselblad medium format, f/8, premium, tack-sharp
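One way to keep the five axes separate while you work is to store each as its own field and join them only when you generate. The sketch below assumes a simple comma-joined prompt; the class name and the joining format are illustrative choices, not a required syntax.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProductPrompt:
    """One field per axis, in the order used in this guide."""
    subject: str
    style: str
    composition: str
    lighting: str
    camera: str

    def __post_init__(self):
        # Every axis must be filled in; an empty axis falls back to model defaults.
        for f in fields(self):
            if not getattr(self, f.name).strip():
                raise ValueError(f"axis {f.name!r} is empty")

    def render(self):
        """Join the axes into a single prompt string (comma-joined by assumption)."""
        return ", ".join(getattr(self, f.name).strip() for f in fields(self))


mug = ProductPrompt(
    subject="A matte black ceramic coffee mug with subtle texture, brushed metal handle",
    style="product photography, clean, on-white, lit-for-detail",
    composition="centered, three-quarter angle, classic product hero",
    lighting="overhead diffused softbox, even, e-commerce on-white",
    camera="shot on 50mm prime, f/2.8, shallow depth of field, natural perspective",
)
print(mug.render())
```

Because each axis lives in its own field, swapping the lighting or the lens later does not require touching the rest of the prompt.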

Step 4: Iterate fast

Generate 4-8 variations per prompt iteration. Look at what's working: composition, lighting, product fidelity. Refine the prompt based on what landed:

If the lighting is wrong, change only the lighting line. Don't rewrite the whole prompt.
If composition is wrong, change only the composition line.
If the product looks like a different product, increase reference image weight or add more visual detail to the subject description.

Single-axis iteration is the fastest path to a finished shot. Rewriting several axes at once makes it hard to tell which change helped and which one hurt.
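Single-axis iteration can be sketched as a function that holds four axes fixed and swaps only one. The axis names and the comma-joined rendering are assumptions made for illustration.

```python
AXES = ("subject", "style", "composition", "lighting", "camera")


def vary_axis(base, axis, options):
    """Return one prompt dict per option, changing only the named axis."""
    if axis not in AXES:
        raise ValueError(f"unknown axis {axis!r}; expected one of {AXES}")
    return [{**base, axis: option} for option in options]


def render(prompt):
    """Join the axes in a fixed order (comma-joined by assumption)."""
    return ", ".join(prompt[a] for a in AXES)


base = {
    "subject": "matte black ceramic coffee mug",
    "style": "commercial product shot",
    "composition": "centered, three-quarter angle",
    "lighting": "overhead diffused softbox",
    "camera": "shot on 50mm prime, f/2.8",
}

lighting_tests = vary_axis(base, "lighting", [
    "soft north-window light from camera right",
    "hard side lighting from camera left",
    "warm golden-hour backlight, soft fill from front",
])
for variant in lighting_tests:
    print(render(variant))
```

Every printed prompt is identical except for the lighting phrase, so any difference in the output can be attributed to that one change.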

Step 5: Inpaint to fix the last 10%

You'll often have a shot where 90% is right and 10% is broken. Don't re-roll. Inpaint.

Wrong product label. Brush the label, regenerate with a corrected prompt: "matte white label with the words 'Brand X' in dark serif type."
Cluttered background. Brush the cluttered area, regenerate with: "clean minimal studio background, gradient gray."
Mis-rendered detail (handle, lid, cap). Brush just that detail, regenerate.

Inpainting on Flux is 5 credits. Re-rolling on Imagen 4 Ultra is 25 credits. One re-roll costs as much as five inpaints, so the math favors surgical fixes.
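The trade-off is simple arithmetic. A minimal sketch using the two credit prices quoted above (the function name is hypothetical):

```python
INPAINT_CREDITS = 5   # Flux inpainting, as quoted above
REROLL_CREDITS = 25   # Imagen 4 Ultra re-roll, as quoted above


def cheaper_fix(inpaint_passes, rerolls=1):
    """Compare the total cost of N inpaint passes against M full re-rolls."""
    inpaint_total = inpaint_passes * INPAINT_CREDITS
    reroll_total = rerolls * REROLL_CREDITS
    if inpaint_total < reroll_total:
        return "inpaint"
    if inpaint_total > reroll_total:
        return "re-roll"
    return "tie"


print(cheaper_fix(1))  # inpaint (5 credits vs 25)
print(cheaper_fix(5))  # tie (25 credits vs 25)
print(cheaper_fix(6))  # re-roll (30 credits vs 25)
```

In practice, if a shot needs more than four or five separate patches, a fresh generation with a corrected prompt is worth considering.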

The 70-credit hero shot recipe

Upload reference photo of the product. (Free.)
Generate 4 Nano Banana 2 variations exploring composition. 40 credits.
Pick the strongest direction. Refine prompt.
Generate one Imagen 4 Ultra final shot at the winning composition. 25 credits.
If anything needs a surgical fix, one Flux inpaint. 5 credits.

Total: 70 credits. Time: 8 minutes. Output: ad-quality product hero shot.
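To check the budget, the recipe can be written as a list of steps and summed. The credit figures are the ones listed in the recipe; marking the inpaint as optional reflects the "if anything needs a fix" wording.

```python
# (step, credits, optional) using the figures from the recipe above
RECIPE = [
    ("upload reference photo", 0, False),
    ("4 Nano Banana 2 variations", 40, False),
    ("1 Imagen 4 Ultra final shot", 25, False),
    ("1 Flux inpaint", 5, True),
]


def total_credits(steps, include_optional=True):
    """Sum the credits for the recipe, with or without optional steps."""
    return sum(
        credits
        for _, credits, optional in steps
        if include_optional or not optional
    )


print(total_credits(RECIPE))                          # 70
print(total_credits(RECIPE, include_optional=False))  # 65
```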

Volume play: 20 product shots in an hour

Once you've nailed the prompt and lighting recipe for one product, the second product takes a third of the time. By product 5, you have a templated workflow you can run on autopilot. We've seen agencies ship 20+ finished product shots in a single hour using this loop, after the first hour of prompt refinement.

For real automation, save the winning prompt as an Agent (saved prompt template) and chain product shots in the Visual Workflow Builder. Drop in a CSV of product references, run the workflow, get back a folder of finished shots.
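The CSV step might look something like the sketch below, which turns each row into a reference-plus-prompt job. The column names (reference_path, subject), the template text, and the job dictionary shape are all hypothetical; how jobs are actually submitted to the Visual Workflow Builder is not described in this guide, so that part is left out.

```python
import csv
import io

# Hypothetical saved template: only the subject changes per product.
TEMPLATE = (
    "{subject}, commercial product shot, centered, three-quarter angle, "
    "overhead diffused softbox, shot on 50mm prime, f/2.8"
)


def build_jobs(csv_file):
    """Read product rows and pair each reference image with a filled-in prompt."""
    jobs = []
    for row in csv.DictReader(csv_file):
        jobs.append({
            "reference": row["reference_path"].strip(),
            "prompt": TEMPLATE.format(subject=row["subject"].strip()),
        })
    return jobs


# In-memory sample so the sketch runs without a file on disk.
sample = io.StringIO(
    "reference_path,subject\n"
    "mug.jpg,matte black ceramic coffee mug\n"
    "bottle.jpg,amber glass dropper bottle\n"
)
for job in build_jobs(sample):
    print(job["reference"], "|", job["prompt"])
```

With a real file, replace the StringIO sample with open("products.csv", newline="").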

Bottom line

AI product photography works when you bring a reference photo, follow a 5-axis prompt structure, iterate one axis at a time, and inpaint surgical fixes instead of re-rolling. With this workflow, a single product hero costs ~70 credits ($1.05 on the $14/mo Essential plan). A traditional $500-$3,000 photoshoot costs roughly 500-3,000x that.
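For anyone who wants to check the comparison, here is the arithmetic using only figures quoted in this article: 70 credits, $1.05 per hero shot, and $500-$3,000 for a traditional shoot. Note that it compares a whole photoshoot against a single AI image.

```python
HERO_CREDITS = 70              # credits per hero shot, from this article
HERO_COST_USD = 1.05           # quoted cost on the Essential plan
PHOTOSHOOT_USD = (500, 3000)   # quoted range for a traditional shoot


def usd_per_credit():
    """Implied price of one credit."""
    return HERO_COST_USD / HERO_CREDITS


def cost_ratio(photoshoot_usd):
    """How many times more a photoshoot costs than one AI hero shot."""
    return photoshoot_usd / HERO_COST_USD


low, high = (round(cost_ratio(p)) for p in PHOTOSHOOT_USD)
print(f"{usd_per_credit():.3f} USD per credit")                  # 0.015 USD per credit
print(f"photoshoot costs about {low}x to {high}x one hero shot")  # about 476x to 2857x
```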

More: prompt engineering guide, Imagen 4 vs Flux, Visual Workflow Builder

How to Make AI Product Photos That Sell