KBHN: A knowledge-aware bi-hypergraph network based on visual-knowledge features fusion for teaching image annotation

Authors:

Highlights:

Abstract

Teaching images, an important auxiliary tool in teaching and learning, differ fundamentally from general-domain images. Although visually similar images are more likely to share labels, teaching images also suffer from visual-knowledge inconsistency, which takes two forms: intra-knowledge visual difference and inter-knowledge visual similarity. To address this challenge, we present KBHN, a knowledge-aware bi-hypergraph network that considers coarse-grained visual features and also extracts fine-grained knowledge features reflecting the knowledge intention hidden in teaching images. Specifically, a visual hypergraph connects visually similar images and enriches the coarse-grained visual features by modeling high-order visual relations among teaching images. A knowledge hypergraph built on typical images aggregates images with similar knowledge information and extracts fine-grained knowledge features by modeling high-order knowledge correlations between local regions. A multi-head attention mechanism then fuses the visual and knowledge features to enrich the image representation. To train and validate the model, we construct a teaching image dataset of 20,744 real-world images annotated with 24 knowledge points. Experimental results show that KBHN, which incorporates visual-knowledge features, achieves state-of-the-art performance compared with existing methods.
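
To make the idea of a visual hypergraph concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how hyperedges are formed or which propagation rule is used. The sketch assumes one hyperedge per image containing that image and its k nearest neighbours, unit hyperedge weights, and the commonly used normalized hypergraph convolution Dv^(-1/2) H De^(-1) H^T Dv^(-1/2) X Θ. The names `knn_incidence`, `hypergraph_conv`, and `theta` are hypothetical.

```python
import numpy as np


def knn_incidence(features, k):
    """Build an incidence matrix H (vertices x hyperedges).

    Assumption: hyperedge e contains image e and its k nearest
    neighbours in feature space (Euclidean distance).
    """
    n = features.shape[0]
    sq = np.sum(features ** 2, axis=1)
    # Pairwise squared Euclidean distances.
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Force each image to rank first in its own hyperedge.
    np.fill_diagonal(dist, -np.inf)
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(dist[e])[: k + 1]
        H[members, e] = 1.0
    return H


def hypergraph_conv(X, H, theta):
    """One hypergraph convolution layer with unit hyperedge weights.

    Computes ReLU(Dv^-1/2 H De^-1 H^T Dv^-1/2 X theta), which lets each
    image aggregate features from all images sharing a hyperedge with it.
    """
    dv = H.sum(axis=1)  # vertex degrees (>= 1: each image is in its own edge)
    de = H.sum(axis=0)  # hyperedge degrees (k + 1 by construction)
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    left = (dv_inv_sqrt[:, None] * H) / de[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return np.maximum(left @ right @ X @ theta, 0.0)


# Toy usage: four "images" with 2-D features forming two tight clusters.
features = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
H = knn_incidence(features, k=1)
out = hypergraph_conv(features, H, np.eye(2))
```

In this toy case each image ends up with the average of its cluster's features, which is the smoothing effect that high-order relations are meant to provide. A knowledge hypergraph over local regions could reuse the same layer with a different incidence matrix, again as an assumption rather than the paper's stated design.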

Keywords: Teaching image annotation, Intelligent education, Visual-knowledge inconsistency, Bi-hypergraph network, Knowledge hypergraph, Visual-knowledge features fusion

Review history: Received 29 May 2022, Revised 25 August 2022, Accepted 28 September 2022, Available online 20 October 2022, Version of Record 20 October 2022.

Official paper URL: https://doi.org/10.1016/j.ipm.2022.103106