Blog / Prompt Engineering, 2026-04-30, 9 min read
AI Video Prompt Tips That Actually Work in 2026
Image prompts you can keep loose. Video prompts cost more credits and waste more of them when wrong. Kling, Luma, Veo 3, and Runway each have their own preferences but they share a core grammar: subject, motion, camera move, duration, and mood. Get those five right and the model gets out of your way.
The 5-axis video formula
- Subject and scene: what's in the frame
- Motion: what the subject is doing
- Camera move: how the camera relates to the action
- Duration and pacing: the energy of the clip
- Mood and lighting: the atmosphere
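The five axes can be sketched as a simple prompt builder. This is an illustrative Python sketch, not any model's API; the class and field names are my own.

```python
from dataclasses import dataclass

@dataclass
class VideoPrompt:
    subject: str   # what's in the frame
    motion: str    # what the subject is doing
    camera: str    # how the camera relates to the action
    duration: str  # pacing / clip length
    mood: str      # atmosphere and lighting

    def render(self) -> str:
        # Join the five axes into one comma-separated prompt string.
        return ", ".join(
            [self.subject, self.motion, self.camera, self.duration, self.mood]
        )

prompt = VideoPrompt(
    subject="glass perfume bottle on wet slate",
    motion="water droplets bead and slide down the bottle",
    camera="slow push-in toward subject",
    duration="5 seconds",
    mood="soft north-window diffuse light, premium",
)
print(prompt.render())
```

Filling in each field forces you to make a decision on every axis instead of leaving one for the model to invent.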
1. Subject and scene: anchor the frame
Same as image prompts, but more specific. Tell the model what the start frame looks like, because that's the anchor for the entire clip.
If you're doing image-to-video (recommended for products and characters), the still IS the start frame. Skip the scene description and focus on motion.
2. Motion: name what changes
This is the axis most people skip. Without it, the model invents motion, usually some generic camera drift. Be explicit:
- water droplets bead and slide down the bottle
- steam slowly rises from the cup, drifting left
- the model turns her head slightly toward camera, smiling
- cloth of the dress flows in a soft breeze from camera right
- liquid pours from the bottle into the glass, splash visible
Specific motion verbs give the model something to actually animate.
3. Camera move: tell the camera what to do
If you don't specify, the model will usually do a slow drift or static hold. Often that's wrong for the shot:
- slow push-in toward subject, intimate, dramatic
- dolly out from extreme close-up to medium shot, reveal
- orbit shot, camera circles subject 90 degrees clockwise, hero turntable
- slight handheld sway, no track, naturalism, doc feel
- locked-off tripod, no camera movement, classical, formal
- crash zoom into subject's face, ad-energy, tension
- overhead drone descent, environmental, scale
Kling executes camera moves more cleanly than any other model. Veo 3 reads complex camera grammar (focal length changes, dolly-zoom effects). Luma adds stylization to whatever move you request.
4. Duration and pacing
Most models default to 5 seconds. You can usually pick 5 or 10. Use the pacing of the action to inform duration:
- Static product turntable, 5 seconds is plenty
- Character motion (turning head, smile, gesture), 5 seconds reads natural
- Cinematic establishing shot with slow push-in, 10 seconds lets it breathe
- Rapid action (pour, splash, crash), 5 seconds; longer feels weird
Match the duration to what's plausible for the action. 10-second clips of someone slowly turning their head feel uncanny.
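The duration guidance above amounts to a small lookup. A minimal sketch, assuming the four action categories from the list; the names and the default are my own, not a model parameter.

```python
# Suggested clip length in seconds per action type, per the pacing
# guidance above. Categories and values are illustrative assumptions.
SUGGESTED_DURATION = {
    "product_turntable": 5,
    "character_motion": 5,
    "establishing_shot": 10,
    "rapid_action": 5,
}

def pick_duration(action_type: str) -> int:
    # Fall back to the common 5-second default for unclassified actions.
    return SUGGESTED_DURATION.get(action_type, 5)
```

Only the slow establishing shot earns 10 seconds; everything else reads best at the default.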
5. Mood and lighting
Lighting in video carries even more weight than in stills because it affects every frame. Named light direction beats generic adjectives:
- golden hour backlight with rim halation, warm, romantic
- cool blue practical lighting from off-screen, moody, late night
- hard side light from camera left, deep shadows, dramatic
- soft north-window diffuse light, painterly, premium
- neon ambient with hard rim, cyberpunk feel, bold
Putting it together: copy-paste templates
Product image-to-video
Example: water droplets bead and slide down the bottle, slow push-in toward subject, soft north-window diffuse light, premium, 5 seconds
Character motion (image-to-video)
Example: the model turns her head slightly toward camera, smiling, slight handheld sway, golden hour backlight with rim halation, warm, 5 seconds
Cinematic establishing shot
Example: overhead drone descent, environmental, scale, hard side light from camera left, deep shadows, dramatic, 10 seconds
Model-specific tips
Kling
The motion specialist. Be specific about physics: cloth flow, liquid, hair. Camera moves execute cleanly. Best image-to-video fidelity. Tends to default to subtle motion, so prompt explicit motion if you want energy.
Luma
Stylized, dreamy, surreal. Less literal than Kling. Forgive minor physics inaccuracies; lean into stylization. Ideal for social-first content where vibe beats realism.
Veo 3
Reads complex film grammar (lens choices, dolly-zoom, focus pulls). Native synchronized audio means you can describe sound: "sound of rain on slate, distant traffic." Veo 3 generates the audio to match.
Runway Gen-4
Strong on motion brush (region-specific motion control) and editing tools. Less responsive to detailed scene descriptions; keep prompts shorter and rely on Runway's UI controls for fine motion direction.
Common mistakes
- Vague action verbs. "Moving" or "doing something" produces drift. Be specific: pours, slides, turns, beads, drifts, exhales.
- Conflicting camera moves. "Slow push-in with handheld sway and an orbit" is three things; the model averages and you get nothing clean. Pick one.
- Over-specifying duration. "Frame 1 to frame 30 the model walks, then turns at frame 45" is shot-list thinking. Models work on whole-clip vibes, not frame-by-frame instructions.
- Treating video like image prompts. Static descriptions without motion produce static-feeling clips. Motion is the differentiator; lead with it.
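The conflicting-camera-moves mistake can be caught mechanically before you spend credits. An illustrative Python check; the keyword list is my own assumption, not an exhaustive camera vocabulary.

```python
# Common camera-move keywords drawn from the examples earlier in the post.
CAMERA_MOVES = ["push-in", "dolly", "orbit", "handheld", "crash zoom", "drone"]

def camera_move_conflicts(prompt: str) -> list[str]:
    # Return every camera move mentioned in the prompt; more than one
    # means the model will average them and produce nothing clean.
    return [m for m in CAMERA_MOVES if m in prompt.lower()]

moves = camera_move_conflicts("Slow push-in with handheld sway and an orbit")
# Three moves detected, so this prompt needs to be cut down to one.
```

If the returned list has more than one entry, pick the move that matters and delete the rest.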
The image-first workflow
The fastest path to good AI video is to nail the still frame first, then animate it. Image generation is 5-25 credits per try. Video is 100-500. Iterating on the still costs 1/20th the price.
- Generate the start-frame still on Imagen 4 or Flux until it's perfect.
- Send to video generator with image-to-video.
- Write a focused motion-only prompt, no scene description needed.
- Generate 2-3 variants. Pick the strongest motion direction.
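The credit math behind this workflow is worth making explicit. A rough sketch using the midpoints of the ranges quoted above; the exact figures are illustrative.

```python
# Midpoints of the quoted ranges: 5-25 credits per image, 100-500 per video.
IMAGE_CREDITS = 15
VIDEO_CREDITS = 300

def iteration_cost(tries: int, per_try: int) -> int:
    # Total credits spent across a run of attempts.
    return tries * per_try

# Ten rounds of iteration on the still vs ten on video:
still_cost = iteration_cost(10, IMAGE_CREDITS)   # 150 credits
video_cost = iteration_cost(10, VIDEO_CREDITS)   # 3000 credits
print(video_cost // still_cost)                  # the ~20x gap
```

At roughly 20x the cost per attempt, every iteration you move from video to the still frame is credits back in your pocket.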
Bottom line
Video prompts work the same way image prompts do, but with motion as the load-bearing axis. Subject, motion, camera, duration, mood. Be specific on motion verbs, name your camera move, match duration to action. Test the formula with the 70 free credits on signup, though plan to upgrade quickly because video is computationally expensive.