Event-based vision has drawn increasing attention owing to its distinctive properties, including ultra-high temporal resolution and extreme dynamic range. Recent work has introduced event signals into video super-resolution (VSR) to enhance flow estimation and temporal alignment. In contrast, this paper shifts the focus of event signals from motion refinement to texture enhancement in VSR. We propose EvTexture++, the first event-driven framework dedicated to texture enhancement in VSR. It leverages high-frequency spatiotemporal details from events to improve texture recovery. EvTexture++ incorporates a customized texture enhancement branch, along with an iterative texture enhancement module that progressively exploits high-temporal-resolution event information for texture restoration. This enables gradual refinement of texture regions across iterations, yielding more accurate and detailed high-resolution outputs. Beyond intra-frame texture recovery, large motions can degrade inter-frame temporal consistency, particularly in texture regions, leading to texture flickering. To mitigate this, we further exploit the continuous-time motion cues of events to enhance temporal consistency, introducing a temporal texture alignment module that estimates event-guided texture-aware flow for precise inter-frame texture alignment. Moreover, EvTexture++ is designed as a plug-and-play tool to flexibly boost the performance of existing VSR models. Experiments on five datasets demonstrate that EvTexture++ achieves state-of-the-art performance. When integrated into recent VSR models, it yields significant improvements, with gains of up to 1.55 dB in PSNR on the texture-rich Vid4 dataset.
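To put the reported PSNR gain in perspective, the following minimal Python sketch computes PSNR from its standard definition and converts a dB gain into the implied change in mean squared error (MSE). The helper names `psnr` and `mse_ratio_for_gain`, the flat-list image representation, and the peak value of 1.0 are assumptions made for illustration. The paper's exact evaluation protocol (for example, which color channel is measured) is not described here.

```python
import math


def psnr(reference, output, peak=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values.

    Standard definition: 10 * log10(peak^2 / MSE).
    `peak` is the maximum possible pixel value (assumed 1.0 here).
    """
    if len(reference) != len(output) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - o) ** 2 for r, o in zip(reference, output)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE is multiplied when PSNR rises by `gain_db` dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, a 1.55 dB gain on a single image corresponds to roughly a 30% reduction in MSE (`mse_ratio_for_gain(1.55)` is about 0.70). Because benchmark PSNR is typically averaged over frames, this correspondence is only approximate at the dataset level.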

Visual comparison on a challenging texture-rich scene. While current VSR methods, whether frame-based (e.g., MIA-VSR and IART) or event-based (e.g., EGVSR), suffer from severe over-smoothing, our EvTexture++ successfully reconstructs coherent building stripes. This is further validated by the error maps, where our method exhibits significantly lower residuals by leveraging high-frequency event information.

Comparison of different VSR paradigms. (a) RGB-based methods primarily rely on motion alignment to aggregate temporal information. (b) Previous event-based methods leverage events mainly to assist motion learning. (c) In contrast, EvTexture++ pioneers the use of events for explicit texture restoration, while simultaneously utilizing them to refine motion alignment for better robustness.

Network architecture of EvTexture++. (a) EvTexture++ adopts a bidirectional recurrent structure with parallel event-guided texture and motion branches for spatial texture restoration and temporal texture consistency, respectively. (b) The ITE module iteratively refines features with richer textural details via a shared ConvGRU, leveraging high-frequency spatiotemporal event signals and the current frame context.
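For readers unfamiliar with ConvGRU, the iterative refinement in (b) can be read against the generic ConvGRU update below. This is the common textbook form, not necessarily the exact variant used in EvTexture++. The symbols are assumptions for illustration: $h^{(k)}$ is the refined feature after iteration $k$, $x^{(k)}$ is a hypothetical input that combines the event features for iteration $k$ with the current frame context, $*$ is convolution, $[\cdot,\cdot]$ is channel-wise concatenation, and $\odot$ is element-wise multiplication. The weights $W_z$, $W_r$, $W_h$ carry no iteration index because the ConvGRU is shared across iterations.

```latex
\begin{aligned}
z^{(k)} &= \sigma\!\left(W_z * [\,h^{(k-1)},\, x^{(k)}\,]\right), \\
r^{(k)} &= \sigma\!\left(W_r * [\,h^{(k-1)},\, x^{(k)}\,]\right), \\
\tilde{h}^{(k)} &= \tanh\!\left(W_h * [\,r^{(k)} \odot h^{(k-1)},\, x^{(k)}\,]\right), \\
h^{(k)} &= \left(1 - z^{(k)}\right) \odot h^{(k-1)} + z^{(k)} \odot \tilde{h}^{(k)}.
\end{aligned}
```

The update gate $z^{(k)}$ controls how much of the previous feature is overwritten at each step, which is what allows texture detail to accumulate gradually rather than being replaced wholesale.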

EvTexture++ further integrates event signals into the motion branch and introduces a Temporal Texture Alignment (TTA) module, which consists of an RGB-based and an event-based motion estimation and motion compensation (MEMC) component that jointly improve feature alignment. In the event-based MEMC, events are converted into voxel grids and processed by a U-Net to estimate fast, non-linear motion for alignment. The RGB-based MEMC estimates optical flow from images using SpyNet and aligns features accordingly.
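The event-to-voxel-grid conversion mentioned above can be sketched as follows. This is a dependency-free illustration of one widely used scheme, in which each event's polarity is split between the two nearest temporal bins by linear interpolation. It is an assumption that EvTexture++ uses this variant. The function name `events_to_voxel_grid` and the `(t, x, y, p)` event tuple layout are hypothetical, and a practical pipeline would use array operations instead of Python loops.

```python
import math


def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate events into a [num_bins][height][width] voxel grid.

    events: iterable of (t, x, y, p) with timestamp t, integer pixel
            coordinates (x, y), and polarity p in {-1, +1}.
    Each event's polarity is distributed over its two nearest temporal
    bins with linear (triangular) weights.
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    events = list(events)
    if not events:
        return grid

    t_start = min(e[0] for e in events)
    t_end = max(e[0] for e in events)
    span = (t_end - t_start) or 1.0  # avoid division by zero

    for t, x, y, p in events:
        if not (0 <= x < width and 0 <= y < height):
            continue  # drop events outside the sensor area
        # Normalize the timestamp to the range [0, num_bins - 1].
        t_norm = (t - t_start) / span * (num_bins - 1)
        lower = min(int(math.floor(t_norm)), num_bins - 1)
        frac = t_norm - lower
        grid[lower][y][x] += p * (1.0 - frac)
        if lower + 1 < num_bins:
            grid[lower + 1][y][x] += p * frac
    return grid
```

The interpolation keeps sub-bin timing information that a hard assignment to the nearest bin would discard, which matters when the goal is to capture fast motion between frames.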

Overview of the EvTexture++ plug-in framework. During training, the frozen backbone extracts spatial features before propagation, temporal features after propagation, and bidirectional optical flow. The EvTexture++ plug-in refines propagated features conditioned on event information and the other extracted features. This flexible design can be integrated into various VSR models to consistently improve performance.

EvTexture++ achieves state-of-the-art performance on standard VSR benchmarks, in extended scale settings, and in plug-in evaluations. The plug-in variants consistently improve frozen CNN- and Transformer-based backbones, indicating that the gains come from event-guided texture cues rather than from a larger parameter count.

@article{kai2026evtexture++,
title={{E}v{T}exture++: {E}vent-{D}riven {T}exture {E}nhancement for {V}ideo {S}uper-{R}esolution},
author={Kai, Dachun and Lu, Jiayao and Zhang, Yueyi and Sun, Xiaoyan},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume={48},
number={6},
pages={6642--6659},
year={2026},
publisher={IEEE}
}
@inproceedings{kai2024evtexture,
title={{E}v{T}exture: {E}vent-driven {T}exture {E}nhancement for {V}ideo {S}uper-{R}esolution},
author={Kai, Dachun and Lu, Jiayao and Zhang, Yueyi and Sun, Xiaoyan},
booktitle={Proceedings of the 41st International Conference on Machine Learning},
pages={22817--22839},
year={2024},
volume={235},
publisher={PMLR}
}