Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions
Journal article   Open access   Peer reviewed

Ashkan Taghipour, Morteza Ghahremani, Mohammed Bennamoun, Aref Miri Rekavandi, Zinuo Li, Hamid Laga and Farid Boussaid
IEEE Access, Vol. 13, pp. 141313-141327
2025
PDF (published version, 13.15 MB)
CC BY 4.0, Open Access

Abstract

Australia; CLIP image encoding; Computational modeling; Computer architecture; Diffusion models; image-to-video generation; Noise reduction; spatial cross-attention; temporal-cross-attention; Text to video; Three-dimensional displays; Training; Video generation; Videos; Visualization
