Blog / Prompt Engineering, 2026-04-30, 9 min read
AI Video Prompt Tips That Actually Work in 2026
Image prompts you can keep loose. Video prompts cost more credits and waste more of them when wrong. Kling, Luma, Veo 3, and Runway each have their own preferences but they share a core grammar: subject, motion, camera move, duration, and mood. Get those five right and the model gets out of your way.
The 5-axis video formula
- Subject and scene: what's in the frame
- Motion: what the subject is doing
- Camera move: how the camera relates to the action
- Duration and pacing: the energy of the clip
- Mood and lighting: the atmosphere
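The five axes can be sketched as a simple prompt builder. This is an illustrative Python sketch, not any model's API; the class and field names are my own.

```python
from dataclasses import dataclass

@dataclass
class VideoPrompt:
    subject: str   # what's in the frame
    motion: str    # what the subject is doing
    camera: str    # how the camera relates to the action
    duration: str  # pacing / clip length
    mood: str      # atmosphere and lighting

    def render(self) -> str:
        # Join the five axes into one comma-separated prompt string.
        return ", ".join(
            [self.subject, self.motion, self.camera, self.duration, self.mood]
        )

prompt = VideoPrompt(
    subject="glass perfume bottle on wet slate",
    motion="water droplets bead and slide down the bottle",
    camera="slow push-in toward subject",
    duration="5 seconds",
    mood="soft north-window diffuse light, premium",
)
print(prompt.render())
```

Filling in each field forces you to make a decision on every axis instead of leaving one for the model to invent.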
1. Subject and scene: anchor the frame
Same as image prompts, but more specific. Tell the model what the start frame looks like, because that's the anchor for the entire clip.
If you're doing image-to-video (recommended for products and characters), the still IS the start frame. Skip the scene description and focus on motion.
2. Motion: name what changes
This is the axis most people skip. Without it, the model invents motion, usually some generic camera drift. Be explicit:
- water droplets bead and slide down the bottle
- steam slowly rises from the cup, drifting left
- the model turns her head slightly toward camera, smiling
- cloth of the dress flows in a soft breeze from camera right
- liquid pours from the bottle into the glass, splash visible
Specific motion verbs give the model something to actually animate.
3. Camera move: tell the camera what to do
If you don't specify, the model will usually do a slow drift or static hold. Often that's wrong for the shot:
- slow push-in toward subject, intimate, dramatic
- dolly out from extreme close-up to medium shot, reveal
- orbit shot, camera circles subject 90 degrees clockwise, hero turntable
- slight handheld sway, no track, naturalism, doc feel
- locked-off tripod, no camera movement, classical, formal
- crash zoom into subject's face, ad-energy, tension
- overhead drone descent, environmental, scale
Kling executes camera moves more cleanly than any other model. Veo 3 reads complex camera grammar (focal length changes, dolly-zoom effects). Luma adds stylization to whatever move you request.
4. Duration and pacing
Most models default to 5 seconds. You can usually pick 5 or 10. Use the pacing of the action to inform duration:
- Static product turntable, 5 seconds is plenty
- Character motion (turning head, smile, gesture), 5 seconds reads natural
- Cinematic establishing shot with slow push-in, 10 seconds lets it breathe
- Rapid action (pour, splash, crash), 5 seconds; longer feels weird
Match the duration to what's plausible for the action. 10-second clips of someone slowly turning their head feel uncanny.
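The duration guidance above amounts to a small lookup. A minimal sketch, assuming the four action categories from the list; the names and the default are my own, not a model parameter.

```python
# Suggested clip length in seconds per action type, per the pacing
# guidance above. Categories and values are illustrative assumptions.
SUGGESTED_DURATION = {
    "product_turntable": 5,
    "character_motion": 5,
    "establishing_shot": 10,
    "rapid_action": 5,
}

def pick_duration(action_type: str) -> int:
    # Fall back to the common 5-second default for unclassified actions.
    return SUGGESTED_DURATION.get(action_type, 5)
```

Only the slow establishing shot earns 10 seconds; everything else reads best at the default.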
5. Mood and lighting
Lighting in video carries even more weight than in stills because it affects every frame. Named light direction beats generic adjectives:
- golden hour backlight with rim halation, warm, romantic
- cool blue practical lighting from off-screen, moody, late night
- hard side light from camera left, deep shadows, dramatic
- soft north-window diffuse light, painterly, premium
- neon ambient with hard rim, cyberpunk feel, bold
Putting it together: copy-paste templates
Product image-to-video
Example: water droplets bead and slide down the bottle, slow push-in toward subject, soft north-window diffuse light, premium, 5 seconds
Character motion (image-to-video)
Example: the model turns her head slightly toward camera, smiling, slight handheld sway, golden hour backlight with rim halation, warm, 5 seconds
Cinematic establishing shot
Example: overhead drone descent, environmental, scale, hard side light from camera left, deep shadows, dramatic, 10 seconds
Model-specific tips
Kling
The motion specialist. Be specific about physics: cloth flow, liquid, hair. Camera moves execute cleanly. Best image-to-video fidelity. Tends to default to subtle motion, so prompt explicit motion if you want energy.
Luma
Stylized, dreamy, surreal. Less literal than Kling. Forgive minor physics inaccuracies; lean into stylization. Ideal for social-first content where vibe beats realism.
Veo 3
Reads complex film grammar (lens choices, dolly-zoom, focus pulls). Native synchronized audio means you can describe sound: "sound of rain on slate, distant traffic." Veo 3 generates the audio to match.
Runway Gen-4
Strong on motion brush (region-specific motion control) and editing tools. Less responsive to detailed scene descriptions; keep prompts shorter and rely on Runway's UI controls for fine motion direction.
Common mistakes
- Vague action verbs. "Moving" or "doing something" produces drift. Be specific: pours, slides, turns, beads, drifts, exhales.
- Conflicting camera moves. "Slow push-in with handheld sway and an orbit" is three things; the model averages and you get nothing clean. Pick one.
- Over-specifying duration. "Frame 1 to frame 30 the model walks, then turns at frame 45" is shot-list thinking. Models work on whole-clip vibes, not frame-by-frame instructions.
- Treating video like image prompts. Static descriptions without motion produce static-feeling clips. Motion is the differentiator; lead with it.
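The conflicting-camera-moves mistake can be caught mechanically before you spend credits. An illustrative Python check; the keyword list is my own assumption, not an exhaustive camera vocabulary.

```python
# Common camera-move keywords drawn from the examples earlier in the post.
CAMERA_MOVES = ["push-in", "dolly", "orbit", "handheld", "crash zoom", "drone"]

def camera_move_conflicts(prompt: str) -> list[str]:
    # Return every camera move mentioned in the prompt; more than one
    # means the model will average them and produce nothing clean.
    return [m for m in CAMERA_MOVES if m in prompt.lower()]

moves = camera_move_conflicts("Slow push-in with handheld sway and an orbit")
# Three moves detected, so this prompt needs to be cut down to one.
```

If the returned list has more than one entry, pick the move that matters and delete the rest.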
The image-first workflow
The fastest path to good AI video is to nail the still frame first, then animate it. Image generation is 5-25 credits per try. Video is 100-500. Iterating on the still costs 1/20th the price.
- Generate the start-frame still on Imagen 4 or Flux until it's perfect.
- Send to video generator with image-to-video.
- Write a focused motion-only prompt, no scene description needed.
- Generate 2-3 variants. Pick the strongest motion direction.
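The credit math behind this workflow is worth making explicit. A rough sketch using the midpoints of the ranges quoted above; the exact figures are illustrative.

```python
# Midpoints of the quoted ranges: 5-25 credits per image, 100-500 per video.
IMAGE_CREDITS = 15
VIDEO_CREDITS = 300

def iteration_cost(tries: int, per_try: int) -> int:
    # Total credits spent across a run of attempts.
    return tries * per_try

# Ten rounds of iteration on the still vs ten on video:
still_cost = iteration_cost(10, IMAGE_CREDITS)   # 150 credits
video_cost = iteration_cost(10, VIDEO_CREDITS)   # 3000 credits
print(video_cost // still_cost)                  # the ~20x gap
```

At roughly 20x the cost per attempt, every iteration you move from video to the still frame is credits back in your pocket.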
Bottom line
Video prompts work the same way image prompts do, but with motion as the load-bearing axis. Subject, motion, camera, duration, mood. Be specific on motion verbs, name your camera move, match duration to action. Test the formula with the 70 free credits on signup, though plan to upgrade quickly because video is computationally expensive.