Multi-level spatial and semantic enhancement network for expression recognition

Authors: Yingdong Ma, Xia Wang, Lihua Wei

Abstract

Facial expression recognition (FER) on real-world databases is an active and challenging research topic. Existing CNN-based facial expression classifiers usually perform well on common expressions, such as happiness and surprise, but are less accurate on difficult expressions, such as disgust and fear. Two main factors are responsible. First, intra-class variation makes difficult expressions harder to classify than other expressions. Second, the severe data imbalance affecting difficult expressions in most FER datasets leads to overfitting during training. In this work, a new network architecture is proposed to address the intra-class variation problem. The proposed model consists of a spatial enhancement module and a semantic aggregation module to enhance fine-level expression features and high-level semantic features. To alleviate the data imbalance problem, an iterative learning method is introduced to collect difficult expression samples; new samples with inconsistent labels are classified using a fuzzy clustering algorithm. The proposed FER framework is evaluated on three real-world expression datasets. Experimental results demonstrate that the proposed method significantly improves the recognition accuracy of difficult expressions and achieves top performance compared with state-of-the-art methods.
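
The abstract does not say which fuzzy clustering algorithm is used for samples with inconsistent labels. As a rough illustration only, the sketch below assumes standard fuzzy c-means, where each sample receives a soft membership in every cluster instead of a single hard label. The function name `fuzzy_c_means`, its parameters, and the toy 2-D points are hypothetical and not taken from the paper; in practice the inputs would presumably be feature vectors of the collected samples.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (assumed here; the paper's variant is not specified).

    Returns (centers, memberships); memberships[i][k] is the degree to which
    point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of all points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [max(math.dist(points[i], c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once memberships no longer change noticeably.
        if shift < tol:
            break

    return centers, u


# Toy data: two well-separated groups standing in for sample features.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centers, toy_memberships = fuzzy_c_means(toy_points, n_clusters=2)
# Hard label = cluster with the largest membership.
toy_labels = [max(range(2), key=lambda k: row[k]) for row in toy_memberships]
```

A sample whose largest membership is only slightly above the others could be treated as ambiguous and held back, while a sample with one dominant membership could be assigned that cluster's label. How the paper actually uses the memberships is not stated in the abstract.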

Keywords: Difficult expressions, Intra-class variation, Iterative learning

Paper URL: https://doi.org/10.1007/s10489-021-02254-0