Google’s Lumiere brings AI video closer to real than unreal


Still frame from a teaser reel of Lumiere clips | Image: Google

Google’s new video generation AI model, Lumiere, uses a diffusion model called Space-Time U-Net, or STUNet, that figures out where things are in a video (space) and how they move and change over time. Ars Technica reports that this method lets Lumiere generate the entire video in one process instead of stitching smaller still frames together.

Lumiere starts by creating a base frame from the prompt. Then, using the STUNet framework, it approximates where objects within that frame will move, generating additional frames that flow into one another for the appearance of seamless motion. Lumiere also produces 80 frames, compared with 25 from Stable Video Diffusion.
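To make the contrast concrete, here is a toy sketch, not Google's actual code, of the two approaches: a conventional pipeline that generates sparse keyframes and interpolates between them, versus a one-pass approach that fills the whole space-time volume at once. Both functions and the shifting "motion" are illustrative stand-ins for the real diffusion process.

```python
import numpy as np

def stitch_keyframes(keyframes, total_frames):
    """Conventional pipeline (stand-in): generate a few still
    keyframes, then interpolate per pixel to reach the frame count."""
    t_key = np.linspace(0.0, 1.0, len(keyframes))
    t_all = np.linspace(0.0, 1.0, total_frames)
    video = np.empty((total_frames,) + keyframes.shape[1:])
    for i, t in enumerate(t_all):
        # Find the keyframe pair surrounding time t and blend linearly.
        j = min(np.searchsorted(t_key, t, side="right") - 1,
                len(keyframes) - 2)
        w = (t - t_key[j]) / (t_key[j + 1] - t_key[j])
        video[i] = (1.0 - w) * keyframes[j] + w * keyframes[j + 1]
    return video

def one_pass_volume(base_frame, total_frames):
    """Lumiere-style idea (stand-in): treat the clip as one
    space-time volume and produce every frame in a single pass,
    here by shifting the base frame over time."""
    return np.stack([np.roll(base_frame, shift=i, axis=1)
                     for i in range(total_frames)])

# A tiny 4x8 "image" with a bright column that drifts rightward.
base = np.zeros((4, 8))
base[:, 0] = 1.0
video = one_pass_volume(base, 80)   # 80 frames, as reported for Lumiere
print(video.shape)                   # (80, 4, 8)
```

The point of the contrast: interpolation can only blend between stills it already has, while the one-pass version decides every frame jointly, which is the intuition behind generating space and time together.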

Admittedly, I am more of a text reporter than a video person, but…
