EvTexture++: Event-Driven Texture Enhancement for Video Super-Resolution

Abstract

Event-based vision has drawn increasing attention owing to its distinctive properties, including ultra-high temporal resolution and extreme dynamic range. Recent works have introduced it to video super-resolution (VSR) to enhance flow estimation and temporal alignment. In contrast, this paper shifts the focus of event signals from motion refinement to texture enhancement in VSR. We propose EvTexture++, the first event-driven framework dedicated to texture enhancement in VSR. It leverages high-frequency spatiotemporal details from events to improve texture recovery. EvTexture++ incorporates a customized texture enhancement branch, along with an iterative texture enhancement module that progressively exploits high-temporal-resolution event information for texture restoration. This enables gradual refinement of texture regions across iterations, yielding more accurate and detailed high-resolution outputs. Beyond intra-frame texture recovery, large motions can degrade inter-frame temporal consistency, particularly in texture regions, causing texture flickering. To mitigate this, we further exploit the continuous-time motion cues of events to enhance temporal consistency, introducing a temporal texture alignment module that estimates event-guided texture-aware flow for precise inter-frame texture alignment. Moreover, EvTexture++ is designed as a plug-and-play tool that can flexibly boost the performance of existing VSR models. Experiments on five datasets demonstrate that EvTexture++ achieves state-of-the-art performance. When integrated into recent VSR models, it yields significant improvements, with gains of up to 1.55 dB in PSNR on the texture-rich Vid4 dataset.

Motivation

Visual comparison on a challenging texture-rich scene. While current VSR methods, whether frame-based (e.g., MIA-VSR and IART) or event-based (e.g., EGVSR), suffer from severe over-smoothing, our EvTexture++ successfully reconstructs coherent building stripes. This is further validated by the error maps, where our method exhibits significantly lower residuals by leveraging high-frequency event information.

VSR Method Comparison

Comparison of different VSR paradigms. (a) RGB-based methods primarily rely on motion alignment to aggregate temporal information. (b) Previous event-based methods leverage events mainly to assist motion learning. (c) In contrast, EvTexture++ pioneers the use of events for explicit texture restoration, while simultaneously utilizing them to refine motion alignment for better robustness.

Network Architecture

Network architecture of EvTexture++. (a) EvTexture++ adopts a bidirectional recurrent structure with parallel event-guided texture and motion branches for spatial texture restoration and temporal texture consistency, respectively. (b) The Iterative Texture Enhancement (ITE) module iteratively refines features with richer textural details via a shared ConvGRU, leveraging high-frequency spatiotemporal event signals and the current frame context.
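The iterative refinement described in (b) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a GRU cell with 1x1 (pointwise) convolutions instead of a full ConvGRU, random untrained weights, and that each iteration consumes one temporal slice of the event representation together with the frame context. The names `PointwiseConvGRU` and `iterative_texture_refine` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class PointwiseConvGRU:
    """GRU cell with 1x1 convolutions (a simplification of a ConvGRU; assumption)."""

    def __init__(self, hidden_ch, input_ch, seed=0):
        rng = np.random.default_rng(seed)
        shape = (hidden_ch, hidden_ch + input_ch)
        # Random, untrained weights, used only to make the sketch runnable.
        self.w_update = rng.normal(0.0, 0.1, shape)
        self.w_reset = rng.normal(0.0, 0.1, shape)
        self.w_cand = rng.normal(0.0, 0.1, shape)

    @staticmethod
    def _conv1x1(weight, feat):
        # weight: (C_out, C_in), feat: (C_in, H, W) -> (C_out, H, W)
        return np.einsum("oc,chw->ohw", weight, feat)

    def __call__(self, hidden, inp):
        hx = np.concatenate([hidden, inp], axis=0)
        z = sigmoid(self._conv1x1(self.w_update, hx))   # update gate
        r = sigmoid(self._conv1x1(self.w_reset, hx))    # reset gate
        rx = np.concatenate([r * hidden, inp], axis=0)
        cand = np.tanh(self._conv1x1(self.w_cand, rx))  # candidate state
        return (1.0 - z) * hidden + z * cand

def iterative_texture_refine(hidden, context, event_bins, cell):
    """Refine `hidden` once per event slice, reusing the same cell each time."""
    for event_slice in event_bins:                      # event_slice: (H, W)
        inp = np.concatenate([context, event_slice[None]], axis=0)
        hidden = cell(hidden, inp)                      # shared weights across iterations
    return hidden

# Usage with toy shapes: 4 hidden channels, 3 context channels, 5 event slices.
cell = PointwiseConvGRU(hidden_ch=4, input_ch=3 + 1)
h0 = np.zeros((4, 8, 8))
context = np.ones((3, 8, 8))
event_bins = np.random.default_rng(1).normal(size=(5, 8, 8))
refined = iterative_texture_refine(h0, context, event_bins, cell)
```

The point of the sketch is the weight sharing: the same cell is applied at every iteration, so the number of refinement steps can follow the number of event slices without adding parameters.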

Event-guided Motion Branch

EvTexture++ further integrates event signals into the motion branch and introduces a Temporal Texture Alignment (TTA) module, which consists of an RGB-based and an event-based motion estimation and motion compensation (MEMC) component that jointly improve feature alignment. In the event-based MEMC, events are converted into voxel grids and processed by a U-Net to estimate fast, non-linear motion for alignment. The RGB-based MEMC estimates optical flow from images using SpyNet and aligns features accordingly.
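For readers unfamiliar with the voxel-grid representation mentioned above, here is a minimal sketch of one common way to build it: each event's polarity is split between its two nearest temporal bins by linear interpolation. Whether EvTexture++ uses exactly this weighting or normalization is an assumption; the function name `events_to_voxel_grid` and its parameters are illustrative.

```python
import numpy as np

def events_to_voxel_grid(x, y, t, p, num_bins, height, width):
    """Accumulate events (x, y, t, p) into a (num_bins, height, width) grid.

    Polarities are mapped to +1/-1 and distributed over the two nearest
    temporal bins with linear weights (a common convention; assumption here).
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)
    if t.size == 0:
        return voxel
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    polarity = np.where(np.asarray(p) > 0, 1.0, -1.0)

    # Normalize timestamps to the range [0, num_bins - 1].
    span = t.max() - t.min()
    if span > 0:
        t_norm = (t - t.min()) / span * (num_bins - 1)
    else:
        t_norm = np.zeros_like(t)

    lower = np.floor(t_norm).astype(np.int64)
    upper = np.clip(lower + 1, 0, num_bins - 1)
    w_upper = t_norm - lower          # weight for the later bin
    w_lower = 1.0 - w_upper           # weight for the earlier bin

    # np.add.at accumulates correctly when several events hit the same cell.
    np.add.at(voxel, (lower, y, x), polarity * w_lower)
    np.add.at(voxel, (upper, y, x), polarity * w_upper)
    return voxel

# Usage: three events on a 2x2 sensor, binned into 3 temporal slices.
grid = events_to_voxel_grid(
    x=[0, 1, 1], y=[0, 0, 1], t=[0.0, 0.25, 1.0], p=[1, 1, 0],
    num_bins=3, height=2, width=2,
)
```

In this toy example the event at t = 0.25 falls halfway between the first two bins, so it contributes 0.5 to each of them.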

Plug-in Framework

Overview of the EvTexture++ plug-in framework. During training, the frozen backbone extracts spatial features before propagation, temporal features after propagation, and bidirectional optical flow. The EvTexture++ plug-in refines propagated features conditioned on event information and the other extracted features. This flexible design can be integrated into various VSR models to consistently improve performance.

Quantitative Results

Quantitative comparison on Vid4, REDS4, and Vimeo-90K-T

Quantitative comparison at different upsampling scales

Quantitative comparison with the EvTexture++ plug-in

EvTexture++ achieves state-of-the-art performance on standard VSR benchmarks, extended scale settings, and plug-in evaluations. The plug-in variants consistently improve frozen CNN- and Transformer-based backbones, indicating that the gains come from event-guided texture cues rather than simply increasing parameter count.
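To put the reported PSNR gains in perspective (up to 1.55 dB on Vid4, per the abstract), the sketch below converts a dB gain into the corresponding change in mean squared error. It assumes the standard definition PSNR = 10 log10(MAX^2 / MSE); the helper names `psnr` and `mse_ratio_for_gain` are illustrative, and benchmark-specific protocol details (such as evaluating on a luminance channel or cropping borders) are not modeled.

```python
import numpy as np

def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

def mse_ratio_for_gain(gain_db):
    """Ratio MSE_new / MSE_old implied by a PSNR gain of `gain_db` dB."""
    return 10.0 ** (-gain_db / 10.0)

# Usage: a constant error of 0.1 on a [0, 1] image, and the 1.55 dB gain.
example_psnr = psnr(np.zeros((4, 4)), np.full((4, 4), 0.1))
ratio = mse_ratio_for_gain(1.55)
```

Under this definition, a 1.55 dB gain corresponds to roughly a 30% reduction in MSE.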

BibTeX

@article{kai2026evtexture++,
  title={{E}v{T}exture++: {E}vent-{D}riven {T}exture {E}nhancement for {V}ideo {S}uper-{R}esolution},
  author={Kai, Dachun and Lu, Jiayao and Zhang, Yueyi and Sun, Xiaoyan},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={48},
  number={6},
  pages={6642--6659},
  year={2026},
  publisher={IEEE}
}

@inproceedings{kai2024evtexture,
  title={{E}v{T}exture: {E}vent-driven {T}exture {E}nhancement for {V}ideo {S}uper-{R}esolution},
  author={Kai, Dachun and Lu, Jiayao and Zhang, Yueyi and Sun, Xiaoyan},
  booktitle={Proceedings of the 41st International Conference on Machine Learning},
  pages={22817--22839},
  year={2024},
  volume={235},
  publisher={PMLR}
}

Qualitative Results

Qualitative comparison on Vid4 for 4× VSR. Only EvTexture++ can restore vivid branches and leaves on the tulip tree.

Qualitative comparison on Vimeo-90K-T for 4× VSR. Only EvTexture++ can restore faithful and detailed stripes on clothing surfaces.

Qualitative comparison on REDS4 for 4× VSR. Only EvTexture++ can clearly recover the digits "5886" on the license plate.

Qualitative comparison on UDM10 for 8× VSR. Only EvTexture++ can restore fine details such as camera text and railings.

Qualitative comparison on CED for 4× VSR. Only EvTexture++ can recover fine wall textures and sharp facial details.

EvTexture++ plug-in variants significantly improve restoration, producing clearer textures and more visual details.
