Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion

Matiur Rahman Minar1, Seunghun Oh2, Ganghyeon Jeong2, Unsang Park1,2
1Department of Computer Science and Engineering, Sogang University   2Department of Artificial Intelligence, Sogang University

Teaser

Steady-Forcing is a dual-memory framework for fixed-camera nature-flow video generation. It balances stability and motion to sustain high background persistence and continuous fluid dynamics over multi-minute horizons.

City Rainy Street Scene

A calm fixed-camera rainy street scene with steady background persistence.

Tranquil River Under Concrete Bridge

A serene river scene with sustained motion and background stability.

Forest Storm Scene

A stormy hillside forest scene that highlights fluid motion under fixed-camera conditions.

City Snow Scene

A calm fixed-camera city snow scene with steady background persistence.

Volcanic Valley Scene

A volcanic valley scene with flowing lava, showing sustained motion and background stability.

Night River Scene

A dangerous river scene at night that highlights fluid motion under fixed-camera conditions.

Duration-Based Results

We evaluate Steady-Forcing on fixed-camera real-world streams at durations of 60, 120, and 240 seconds.

60-Second Evaluation

Street Food Shop Scene

A 60-second fixed-camera street food scene with steady background composition.

Empty Sea Beach Scene

A calm 60-second sea beach stream with minimal motion and strong spatial persistence.

Mountain Stream Flow

A 60-second mountain stream showcasing continuous river motion under a fixed viewpoint.

120-Second Evaluation

Forest River Long Take

A 120-second forest river sequence with sustained water motion and static framing.

Mountain Waterfall Cascade

A 120-second waterfall stream demonstrating continuous fluid dynamics from a fixed camera.

Urban Rainstorm at Night

A 120-second nighttime rainstorm stream with wet urban details and stationary framing.

240-Second Evaluation

Long Deep Smoke Rising

A full 240-second fixed-camera smoke sequence with continuous motion and solid spatial consistency.

Serene Forest River Loop

A 240-second forest river sequence with a steady fixed view and continuous water motion.

Wide Open Ocean Waves

A 240-second fixed-camera ocean wave sequence with persistent continuity.

Motivations

Standard autoregressive video generation models face a fundamental trade-off over extended horizons: they either exhibit severe background drift or suffer complete motion collapse. Steady-Forcing addresses this by decoupling spatial persistence from motion continuity. Operating entirely at inference time, without retraining, it uses a dual-memory protocol to lock static background anchors in place while sustaining continuous, natural fluid motion.
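This page does not detail the dual-memory protocol, so the following is only a toy sketch of one way such a scheme could be organized, not the authors' implementation. It assumes one memory holds a few early frames as permanent background anchors while a second, rolling memory holds the most recent frames for motion continuity. The class `DualMemory`, its parameters `num_anchors` and `window_size`, and the helper `rollout_contexts` are all hypothetical names, and integer frame indices stand in for real frame latents.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List


@dataclass
class DualMemory:
    """Toy dual-memory context: frozen anchors plus a rolling window of recent frames."""

    num_anchors: int = 2   # hypothetical: frames kept permanently as background anchors
    window_size: int = 4   # hypothetical: recent frames kept for motion continuity
    anchors: List[int] = field(default_factory=list)
    recent: Deque[int] = field(default_factory=deque)

    def update(self, frame_id: int) -> None:
        # The first few frames become permanent anchors for the static background.
        if len(self.anchors) < self.num_anchors:
            self.anchors.append(frame_id)
        # Every frame also enters the rolling window; the oldest entry is evicted.
        self.recent.append(frame_id)
        if len(self.recent) > self.window_size:
            self.recent.popleft()

    def context(self) -> List[int]:
        # Context = anchors first, then recent frames not already held as anchors.
        return self.anchors + [f for f in self.recent if f not in self.anchors]


def rollout_contexts(num_frames: int, memory: DualMemory) -> List[List[int]]:
    """Return the context visible when generating each frame after frame 0."""
    contexts: List[List[int]] = []
    memory.update(0)  # frame 0 stands in for the conditioning frame
    for t in range(1, num_frames):
        contexts.append(memory.context())
        memory.update(t)  # a real model would generate frame t at this point
    return contexts


if __name__ == "__main__":
    for t, ctx in enumerate(rollout_contexts(10, DualMemory()), start=1):
        print(f"frame {t}: context {ctx}")
```

In this sketch the context size stays bounded however long the rollout runs: the anchors never leave the context, which is the intuition behind background persistence, while the rolling window always tracks the latest frames, which is the intuition behind motion continuity.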

Standard Autoregressive Baseline

The baseline rollout suffers from gradual structural warping and drifting background elements.

Steady-Forcing (Ours)

Our dual-memory pipeline maintains rigorous background stability alongside ongoing fluid animation.

Qualitative Results

Qualitative evaluations demonstrating Steady-Forcing's ability to maintain high-fidelity static structural layouts and non-decaying natural motion paths.

Ultra Long Video Generation

Showcasing long-horizon generative streams that preserve background geometry across hundreds of frames.

Serene Forest Waterfall Stream

Ocean Waves on Rocky Coastline

Environmental Stream Dynamics

Fine-grained fluid and motion transitions applied seamlessly over stationary background compositions.

Breeze to Gale Wind Transition

Flow Velocity Acceleration

Flickering Campfire Embers

Lakeside Ripple Intensification

Volcanic Vent Smoke Swirls

Geyser Eruption Surge

Fog Formation Dynamics

Ocean Undercurrent Eddies

Raindrop Disturbance Patterns

Multi-Element Nature Streams

Coordinated landscape rollouts demonstrating multi-component conditioning, localized motion boundaries, and synchronized ecosystem behaviors.

Concurrent Rain & Wind Fronts

Waterfall Cascade with River Waves

Drifting Clouds Over Swaying Pines

Ocean Waves Intersecting Sea Spray

Desert Sand Drift & Wind Gusts

Wildlife Interaction with Water Flow

Weather & Seasonal Flow

Demonstrations of progressive atmospheric variations and environmental state changes introduced over a fixed background layout.

Sudden Rainstorm Front Onset

Rolling Fog Blanketing Meadow

Blizzard Onset Over Frozen Lake

Temporal Weather & Light Variation

Modulating global illumination, dynamic cloud occlusion, and time-of-day progression across continuous nature rollouts.

Sunset Golden Hour Progression

Overcast Cloud Shadow Tracking

Fixed-Camera Real-World Results

New user-uploaded fixed-tripod scenes demonstrating stable background persistence across real-world coastal, urban, and forest environments.

Empty Sea Beach Scene

Flooded Urban Street (Goyang-si)

Continuous Rainy City Street

Continuous Urban Town Street

Continuous Urban Fire Scene

Dangerous Night River Scene

Forest Storm Scene

Tranquil River Under Concrete Bridge

BibTeX

@article{minar2025steady,
  title={Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion},
  author={Minar, Matiur Rahman and Oh, Seunghun and Jeong, Ganghyeon and Park, Unsang},
  journal={arXiv preprint arXiv:2606.7661673},
  year={2026}
}