EvTexture: Event-driven Texture Enhancement for Video Super-Resolution

University of Science and Technology of China · Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
ICML 2024

Video Demos

Abstract

Event-based vision has drawn increasing attention due to its unique characteristics, such as high temporal resolution and high dynamic range. It has recently been used in video super-resolution (VSR) to enhance flow estimation and temporal alignment. Rather than using events for motion learning, we propose in this paper the first VSR method that utilizes event signals for texture enhancement. Our method, called EvTexture, leverages the high-frequency details of events to better recover texture regions in VSR. In EvTexture, a new texture enhancement branch is presented. We further introduce an iterative texture enhancement module to progressively explore the high-temporal-resolution event information for texture restoration. This allows for gradual refinement of texture regions across multiple iterations, leading to more accurate and rich high-resolution details. Experimental results show that our EvTexture achieves state-of-the-art performance on four datasets. On the texture-rich Vid4 dataset, our method achieves a gain of up to 4.67dB over recent event-based methods.

Motivation


Comparative results of VSR methods on the City clip of Vid4. Current VSR methods, with (EGVSR and EBVSR) or without (BasicVSR++) event signals, still suffer from blurry textures or jitter effects, resulting in large errors in texture regions. In contrast, our method successfully predicts the texture regions and greatly reduces errors in the restored frames.

Comparison of VSR Methods


(a) RGB-based methods usually focus on motion learning to recover missing details from other, unaligned frames. (b) Previous event-based methods use events to enhance motion learning. (c) In contrast, our method is the first to utilize events to enhance texture restoration in VSR. The red dotted line denotes an optional branch: our method can easily be adapted to approaches that use events to enhance motion learning.

Network Architecture


(a) Following BasicVSR, our EvTexture adopts a bidirectional recurrent network in which features are propagated forward and backward. At each timestamp, it includes a motion branch and a parallel texture branch that explicitly enhances the restoration of texture regions. (b) In the texture branch, the ITE module plays a key role: it progressively refines features across multiple iterations, leveraging high-frequency textural information from events along with context information from the current frame.
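
As a rough illustration of this design, the PyTorch-style sketch below mirrors the forward propagation pass with a motion branch and an iterative texture branch. All module names (MotionBranch, ITEModule, EvTextureSketch), the layer choices, and the iteration count are hypothetical and are not taken from the official implementation.

# Minimal PyTorch-style sketch of the bidirectional recurrent design described
# above. Module names and layers are illustrative, not the official EvTexture code.
import torch
import torch.nn as nn


class MotionBranch(nn.Module):
    """Fuses the current frame feature with the propagated hidden state."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, 1, 1), nn.ReLU(inplace=True)
        )

    def forward(self, feat: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # In the real model the hidden state would first be aligned (e.g. by
        # flow-based warping); here we simply concatenate for brevity.
        return self.fuse(torch.cat([feat, hidden], dim=1))


class ITEModule(nn.Module):
    """Iterative Texture Enhancement: refines a feature over several iterations
    using high-frequency event information plus current-frame context."""
    def __init__(self, channels: int, event_bins: int, iters: int = 3):
        super().__init__()
        self.iters = iters
        self.refine = nn.Sequential(
            nn.Conv2d(channels + event_bins, channels, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, 1, 1),
        )

    def forward(self, feat: torch.Tensor, events: torch.Tensor) -> torch.Tensor:
        for _ in range(self.iters):
            # Residual update conditioned on the event voxel grid.
            feat = feat + self.refine(torch.cat([feat, events], dim=1))
        return feat


class EvTextureSketch(nn.Module):
    """Recurrence with parallel motion and texture branches (forward pass only)."""
    def __init__(self, channels: int = 64, event_bins: int = 5, scale: int = 4):
        super().__init__()
        self.extract = nn.Conv2d(3, channels, 3, 1, 1)
        self.motion = MotionBranch(channels)
        self.texture = ITEModule(channels, event_bins)
        self.upsample = nn.Sequential(
            nn.Conv2d(2 * channels, 3 * scale ** 2, 3, 1, 1), nn.PixelShuffle(scale)
        )

    def forward(self, frames: torch.Tensor, events: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, 3, H, W); events: (B, T, event_bins, H, W) voxel grids.
        b, t, _, h, w = frames.shape
        hidden = frames.new_zeros(b, self.extract.out_channels, h, w)
        outs = []
        for i in range(t):  # forward direction only; the paper also propagates backward
            feat = self.extract(frames[:, i])
            motion_feat = self.motion(feat, hidden)
            texture_feat = self.texture(feat, events[:, i])
            hidden = motion_feat
            outs.append(self.upsample(torch.cat([motion_feat, texture_feat], dim=1)))
        return torch.stack(outs, dim=1)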

Quantitative Results


Quantitative comparison (PSNR↑/SSIM↑) on Vid4, REDS4 and Vimeo-90K-T for 4× VSR. All results are calculated on Y-channel except REDS4 (RGB-channel). The input types "I" and "I+E" represent RGB-based and event-based methods, respectively. Red and blue colors indicate the best and second-best performances, respectively.
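
As a side note on the evaluation protocol, Y-channel metrics are commonly computed by converting RGB to YCbCr (BT.601) and measuring PSNR on the luminance channel only. A minimal sketch, assuming 8-bit RGB inputs; the function names are illustrative and not taken from the paper's evaluation scripts.

# Minimal sketch of Y-channel PSNR, assuming 8-bit RGB images in [0, 255].
# The BT.601 conversion follows the common VSR evaluation convention.
import numpy as np


def rgb_to_y(img: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 RGB image (0-255) to the Y channel of YCbCr."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 0.257 * r + 0.504 * g + 0.098 * b + 16.0


def psnr(pred: np.ndarray, gt: np.ndarray, max_val: float = 255.0) -> float:
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)


# Y-channel PSNR (used for Vid4 and Vimeo-90K-T); REDS4 uses RGB-channel PSNR.
# psnr_y = psnr(rgb_to_y(pred_rgb), rgb_to_y(gt_rgb))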

Qualitative Results

BibTeX

@inproceedings{kai2024evtexture,
  title={{E}v{T}exture: {E}vent-driven {T}exture {E}nhancement for {V}ideo {S}uper-{R}esolution},
  author={Kai, Dachun and Lu, Jiayao and Zhang, Yueyi and Sun, Xiaoyan},
  booktitle={Proceedings of the 41st International Conference on Machine Learning},
  pages={22817--22839},
  year={2024},
  volume={235},
  publisher={PMLR}
}