Blog / Prompt Engineering, 2026-04-30, 9 min read

AI Video Prompt Tips That Actually Work in 2026

You can keep image prompts loose. Video prompts cost more credits, and a bad one wastes all of them. Kling, Luma, Veo 3, and Runway each have their own preferences, but they share a core grammar: subject, motion, camera move, duration, and mood. Get those five right and the model gets out of your way.

The 5-axis video formula

  1. Subject and scene: what's in the frame
  2. Motion: what the subject is doing
  3. Camera move: how the camera relates to the action
  4. Duration and pacing: the energy of the clip
  5. Mood and lighting: the atmosphere
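The five axes compose mechanically. As a rough illustration, here is a hypothetical helper that joins them in the order the templates later in this post use (scene, motion, camera, mood, duration); it is a sketch, not any model's API:

```python
def build_video_prompt(subject, motion, camera, mood, duration):
    """Compose a video prompt from the five axes: scene anchor first,
    then motion, camera move, mood/lighting, and duration."""
    parts = [subject, motion, camera, mood, f"{duration} seconds"]
    # Normalize trailing periods so the joined prompt reads cleanly.
    return ". ".join(p.rstrip(".") for p in parts) + "."

prompt = build_video_prompt(
    subject="A matte black water bottle on wet polished slate in a rainy alley",
    motion="Water droplets bead and slide down the bottle",
    camera="Slow push-in from medium to close-up",
    mood="Cool blue practical lighting from camera left",
    duration=5,
)
```

For image-to-video, drop the `subject` axis and let the reference still carry the scene.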

1. Subject and scene: anchor the frame

Same as image prompts but more specific. Tell the model what the start frame looks like, because that's the anchor for the entire clip:

A matte black water bottle stands centered on a wet polished slate surface in a rainy alley, neon ambient light reflecting on the wet stone

If you're doing image-to-video (recommended for products and characters), the still IS the start frame. Skip the scene description and focus on motion.

2. Motion: name what changes

This is the axis most people skip. Without it, the model invents motion, usually a generic camera drift. Be explicit: "water droplets slide down the bottle," "steam rises from the cup," "her hair moves in a soft breeze." Specific motion verbs give the model something to actually animate.

3. Camera move: tell the camera what to do

If you don't specify, the model will usually do a slow drift or a static hold, and often that's wrong for the shot. Name the move: slow push-in, dolly-out, orbit, locked-off tripod.

Kling executes camera moves more cleanly than any other model. Veo 3 reads complex camera grammar (focal length changes, dolly-zoom effects). Luma adds stylization to whatever move you request.

4. Duration and pacing

Most models default to 5 seconds; you can usually pick 5 or 10. Let the pacing of the action set the duration: a quick product move fits 5 seconds, while a slow dolly-out across a wide scene earns 10.

Match the duration to what's plausible for the action. 10-second clips of someone slowly turning their head feel uncanny.

5. Mood and lighting

Lighting in video carries even more weight than in stills because it affects every frame. Named light direction beats generic adjectives: "cool blue practical lighting from camera left" gives the model far more to work with than "moody lighting."

Putting it together: copy-paste templates

Product image-to-video

[Optional: 'static start frame matches reference image.'] [Motion: what changes]. [Camera move]. [Lighting and mood]. [Duration].

Example:

Water droplets bead and slide down the bottle, light condensation forming. Slow push-in from medium to close-up. Cool blue practical lighting from camera left, soft fill from front. 5 seconds.

Character motion (image-to-video)

[Subject motion]. [Subtle background motion]. [Camera move]. [Lighting]. [Duration].

Example:

The woman turns her head slowly toward camera, eyes meeting lens, faint smile forming. Soft breeze moving wisps of hair. Locked-off tripod, no camera movement. Soft north-window light from camera right, emerald velvet backdrop. 5 seconds.

Cinematic establishing shot

[Wide scene description]. [Subtle motion in environment]. [Camera move]. [Lighting and time of day]. [Duration].

Example:

A man in a dark wool coat walks slowly across a wet cobblestone street in foggy old Edinburgh, distant streetlamps glowing warm. Slight steam rises from a manhole cover. Slow dolly-out from medium to wide. Blue hour, ambient practical lights, slight haze. 10 seconds.

Model-specific tips

Kling

The motion specialist. Be specific about physics: cloth flow, liquid, hair. Camera moves execute cleanly. Best image-to-video fidelity. Tends to default to subtle motion, so prompt explicit motion if you want energy.

Luma

Stylized, dreamy, surreal. Less literal than Kling. Forgive minor physics inaccuracies; lean into stylization. Ideal for social-first content where vibe beats realism.

Veo 3

Reads complex film grammar (lens choices, dolly-zoom, focus pulls). Native synchronized audio means you can describe sound: "sound of rain on slate, distant traffic." Veo 3 generates the audio to match.

Runway Gen-4

Strong on motion brush (region-specific motion control) and editing tools. Less responsive to detailed scene descriptions; keep prompts shorter and rely on Runway's UI controls for fine motion direction.

Common mistakes

  1. No motion specified, so the model falls back on generic camera drift.
  2. Duration mismatched to the action, like 10 seconds of a slow head turn.
  3. Re-describing the scene in an image-to-video prompt instead of focusing on motion.
  4. Leaving the camera move implicit and getting a static hold you didn't want.

The image-first workflow

The fastest path to good AI video is to nail the still frame first, then animate it. Image generation is 5-25 credits per try. Video is 100-500. Iterating on the still costs 1/20th the price.

  1. Generate the start-frame still on Imagen 4 or Flux until it's perfect.
  2. Send to video generator with image-to-video.
  3. Write a focused motion-only prompt, no scene description needed.
  4. Generate 2-3 variants. Pick the strongest motion direction.
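The credit math behind this workflow is easy to sketch. The figures below use the ranges from this post (5-25 credits per image, 100-500 per video) with mid-range placeholders, not any provider's actual pricing:

```python
def workflow_cost(image_tries, video_tries, image_cost=15, video_cost=300):
    """Total credits: iterate on the cheap still, then spend video
    credits only on the animated variants."""
    return image_tries * image_cost + video_tries * video_cost

# Ten iterations on the still plus three video variants,
# versus iterating directly in video ten times.
image_first = workflow_cost(image_tries=10, video_tries=3)  # 150 + 900 = 1050
video_only = workflow_cost(image_tries=0, video_tries=10)   # 3000
```

Even with mid-range costs, iterating on the still before animating keeps most of the experimentation on the cheap side of the ledger.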

Bottom line

Video prompts work the same way image prompts do, but with motion as the load-bearing axis. Subject, motion, camera, duration, mood. Be specific on motion verbs, name your camera move, match duration to action. Test the formula with the 70 free credits on signup, though plan to upgrade quickly because video is computationally expensive.

