Turn a Single Frame Into a Directed Shot with Gen 4
Gen 4 is an AI video generation model developed by Runway, released to paid users on April 1, 2025, following the Gen-3 Alpha generation (released in 2024). Gen 4 focuses on controllable image-to-video creation, designed to help creators generate consistent characters, locations, and objects across scenes as part of Runway's "world consistency" direction.
Operationally, Gen 4 is built around a simple but production-friendly loop: you start from an input image (required) and write a motion-first prompt to choreograph subject movement, camera movement, and environmental motion. In Runway's official Gen-4 guidance, the input image acts as the first frame, while the text prompt should primarily describe motion, not the static visual contents.
On Vidofy.ai, Gen 4 becomes easier to use as a repeatable workflow: keep your image references organized, iterate variations quickly, and export clean deliverables for editing pipelines. Gen 4 generates short clips in 5- or 10-second durations, and Runway notes that generative video outputs are created at 720p but can be upscaled to 4K, making it practical for concept-to-cut workflows where you lock motion first, then upscale for finishing.
The New Director vs. The Proven Workhorse: Gen 4 vs Gen 3
Both Gen 4 and Gen 3 are Runway video-generation models you can access on Vidofy. Gen 4 prioritizes controllable, reference-driven image-to-video work, while Gen 3 remains valuable when you want text-to-video and long-form extension workflows.
| Feature/Spec | Gen 4 | Gen 3 |
|---|---|---|
| Control modes (officially documented) | Image + text (image required) | Text to Video + Image to Video |
| Supported clip durations | 5s or 10s | 5s or 10s |
| Output sizes / aspect formats (documented pixel resolutions) | 16:9 1280x720; 9:16 720x1280; 1:1 960x960; 4:3 1104x832; 3:4 832x1104; 21:9 1584x672 | 16:9 1280x768 |
| Frame rate (FPS) | 24fps | 24fps |
| Text prompt character limit | 1000 characters | 1000 characters |
| Generation cost (credits per second) | 12 credits/second | 10 credits/second |
| Extending videos beyond the base clip | Not supported for Gen-4 generations (Extend is only available for Gen-3 Alpha and Gen-3 Turbo generations) | Up to 40s maximum extended length |
| Availability | Available on Vidofy | Available on Vidofy |
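For budgeting shots, the figures in the table can be encoded as a small helper. This is a minimal Python sketch using only the numbers documented above; the names (`generation_cost`, `GEN4_RESOLUTIONS`) are illustrative and not part of any Runway or Vidofy API, and pricing is a snapshot that can change.

```python
# Per-second credit costs and Gen 4's documented output resolutions,
# taken from the comparison table in this article.

GEN4_CREDITS_PER_SECOND = 12
GEN3_CREDITS_PER_SECOND = 10
SUPPORTED_DURATIONS = (5, 10)  # seconds

# Gen 4 aspect ratios mapped to their documented pixel resolutions
GEN4_RESOLUTIONS = {
    "16:9": (1280, 720),
    "9:16": (720, 1280),
    "1:1": (960, 960),
    "4:3": (1104, 832),
    "3:4": (832, 1104),
    "21:9": (1584, 672),
}

def generation_cost(duration_s: int,
                    credits_per_second: int = GEN4_CREDITS_PER_SECOND) -> int:
    """Return the credit cost of one clip, rejecting unsupported durations."""
    if duration_s not in SUPPORTED_DURATIONS:
        raise ValueError(
            f"clips are {SUPPORTED_DURATIONS} seconds, got {duration_s}"
        )
    return duration_s * credits_per_second

# A 10-second Gen 4 clip costs 120 credits; the same length on Gen 3 costs 100.
```

This makes the cost trade-off concrete: at 12 vs. 10 credits per second, a 10-second Gen 4 clip costs 20 credits more than its Gen 3 equivalent.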
Detailed Analysis
Analysis: Control-first image-to-video vs. text-first ideation
Gen 4 is optimized for creators who want a “directed shot” workflow: lock a frame, then specify motion. Because the image acts as the first frame, you can drive continuity and art direction through references instead of over-describing the scene in text.
Gen 3 still matters when your starting point is pure language (text-to-video) or when you need a built-in official extension path for longer sequences. On Vidofy, you can choose the model per shot—Gen 4 for consistency-critical moments, Gen 3 when text-only generation or extension is the deciding factor.
Analysis: Format flexibility for real deliverables
Gen 4 officially supports a wider set of output formats (including horizontal, vertical, square, and widescreen). In practical terms, that means you can generate the same concept for different placements (social verticals, squares, cinematic widescreen) without redesigning your entire creative approach.
Vidofy streamlines this by keeping your prompt-and-reference workflow consistent across formats—so you spend less time retooling and more time iterating toward the shot that cuts cleanly into your edit.
Verdict: Choose Gen 4 When Consistency and Control Are Non-Negotiable
How It Works
Follow these 3 simple steps to get started with our platform.
Step 1: Upload a strong reference frame
Choose an image that matches your intended composition, character look, and lighting. This becomes the visual anchor that Gen 4 animates.
Step 2: Write a motion-first prompt
Describe what moves (subject, camera, environment) in clear physical terms—like tracking, dolly, pan, drifting smoke, wind in fabric, or waves rolling.
Step 3: Generate, compare versions, and export
Iterate until the motion reads cleanly, then export your chosen take. If you need higher-resolution finishing, upscale in your workflow where supported.
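Before generating, it can help to pre-flight the constraints this article documents: an input image is required, prompts are capped at 1000 characters, and clips are 5 or 10 seconds. The sketch below is hypothetical; `build_request` and its payload fields are illustrative, not an actual Runway or Vidofy endpoint.

```python
# Hypothetical pre-flight check for a Gen 4 generation, enforcing the
# documented limits (required image, 1000-char prompt, 5s/10s durations)
# before any credits are spent. The payload shape is illustrative only.

VALID_DURATIONS = (5, 10)
PROMPT_CHAR_LIMIT = 1000

def build_request(image_ref: str, motion_prompt: str, duration_s: int = 5) -> dict:
    if not image_ref:
        raise ValueError("Gen 4 requires an input image; it becomes the first frame")
    if len(motion_prompt) > PROMPT_CHAR_LIMIT:
        raise ValueError(f"prompt exceeds {PROMPT_CHAR_LIMIT} characters")
    if duration_s not in VALID_DURATIONS:
        raise ValueError(f"duration must be one of {VALID_DURATIONS}")
    return {
        "model": "gen4",
        "image": image_ref,       # reference frame (required)
        "prompt": motion_prompt,  # motion-first description
        "duration": duration_s,
    }

# Motion-first prompt: describe what moves, not the static scene.
req = build_request(
    "hero_frame.png",
    "Slow dolly-in on the subject; wind lifts the fabric; smoke drifts in the background.",
)
```

Catching an over-length prompt or an unsupported duration locally keeps the iterate-compare-export loop in Step 3 fast, since failed generations still cost time.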
Frequently Asked Questions
What is Gen 4, and who develops it?
Gen 4 is a Runway-developed AI video generation model designed for controllable, reference-driven creation. Runway positions Gen 4 around media generation and “world consistency,” including generating consistent characters, locations, and objects across scenes.
Does Gen 4 support text-to-video without an image?
No. Gen 4 requires an input image; your image acts as the first frame, and your text prompt primarily directs motion.
What are Gen 4’s official duration limits?
Gen 4 generates clips in 5-second or 10-second durations.
Can I extend Gen 4 videos beyond the base clip?
Not via Runway’s Extend feature for Gen 4. Runway states that Extend video is only available for Gen-3 Alpha and Gen-3 Turbo generations, while Gen 4 outputs are generated as 5s or 10s clips.
Can I export Gen 4 videos in 4K?
Runway states that generative video outputs are created in 720p, but can be upscaled to 4K.
Do I have commercial rights to what I generate with Gen 4?
Yes. Runway states that the content you upload and generate is yours to use without non-commercial restrictions from Runway, and you retain ownership and rights to your generations.