Pros and cons of GAN evaluation measures

Authors:

Highlights:

Abstract

Generative models, in particular generative adversarial networks (GANs), have gained significant attention in recent years. A number of GAN variants have been proposed and utilized in many applications. Despite large strides in theoretical progress, evaluating and comparing GANs remains a daunting task. While several measures have been introduced, there is as yet no consensus as to which measure best captures the strengths and limitations of models and should be used for fair model comparison. As in other areas of computer vision and machine learning, it is critical to settle on one or a few good measures to steer progress in this field. In this paper, I review and critically discuss more than 24 quantitative and 5 qualitative measures for evaluating generative models, with a particular emphasis on GAN-derived models. I also provide a set of 7 desiderata, followed by an evaluation of whether a given measure or family of measures is compatible with them.

Keywords:

Review history: Received 11 February 2018, Revised 24 October 2018, Accepted 28 October 2018, Available online 27 November 2018, Version of Record 22 February 2019.

Official paper URL: https://doi.org/10.1016/j.cviu.2018.10.009