Recurrent generative adversarial networks for unsupervised WCE video summarization

Abstract

Wireless capsule endoscopy (WCE) produces a large number of redundant images in a single examination, which makes review laborious and time-consuming for the physician. A technique that automatically produces a shortened, informative summary of the original WCE video is therefore greatly needed. This paper considers unsupervised WCE video summarization and casts it as a sequence-to-sequence learning problem. Our key idea is to learn a deep summarizer network that minimizes the information loss between training videos and their summaries in an unsupervised way. To this end, we propose a hybrid yet effective unsupervised WCE video summarization method that combines long short-term memory (LSTM), a variational autoencoder (VAE), a pointer network (Ptr-Net), a generative adversarial network (GAN), and a de-redundancy mechanism (DM), among other techniques. The proposed model, termed Adv-Ptr-Der-SUM, adopts a generative adversarial framework consisting of a summarizer and a discriminator. The summarizer is a VAE-based LSTM architecture with Ptr-Net and DM that aims to learn the conditional probability of the output sequence and provide a compact summary. The discriminator is another LSTM that aims to distinguish the original video from the video reconstructed by the summarizer. The summarizer and discriminator are trained adversarially to optimize the summarizer and produce an optimal WCE video summary. Extensive experiments on our WCE-2019-Video dataset show that our model outperforms other video summarization approaches by a large margin in both supervised and unsupervised settings. The proposed model is also applied to two public multimedia benchmark datasets, where it achieves competitive results, verifying its effectiveness and generality.
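
The abstract does not specify how the de-redundancy mechanism works, so the following is only a generic sketch of the underlying idea and not the paper's DM: given per-frame importance scores (such as a summarizer might produce) and per-frame feature vectors, frames are picked greedily by score while near-duplicates of already selected frames are skipped. The function names, the cosine-similarity criterion, and the `max_similarity` threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_keyframes(features, scores, budget, max_similarity=0.9):
    """Hypothetical redundancy filter (an assumption, not the paper's DM).

    Visits frames in descending score order and keeps a frame only if its
    similarity to every frame kept so far is at most `max_similarity`.
    Returns at most `budget` frame indices in temporal order.
    """
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    selected = []
    for i in order:
        if len(selected) >= budget:
            break
        if all(
            cosine_similarity(features[i], features[j]) <= max_similarity
            for j in selected
        ):
            selected.append(i)
    return sorted(selected)


# Toy example: frames 0 and 1 are nearly identical, so only one of them is kept.
frames = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.5, 0.5]]
importance = [0.9, 0.8, 0.7, 0.1]
summary = select_keyframes(frames, importance, budget=2)
```

In this toy input the second frame scores highly but is skipped because it nearly duplicates the first, so the two-frame budget is spent on visually distinct frames. That is the kind of behavior a summary of highly redundant WCE footage would need.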

Keywords: Wireless capsule endoscopy, Video summarization, Variational autoencoder, Pointer network, Generative adversarial network, De-redundancy mechanism

Review history: Received 30 November 2020, Revised 14 March 2021, Accepted 17 March 2021, Available online 20 March 2021, Version of Record 9 April 2021.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.106971