Deep online cross-modal hashing by a co-training mechanism

Authors:

Highlights:

Abstract

Batch-based cross-modal hashing retrieval methods have made great progress. However, they cannot be applied in scenarios where new data continuously arrives in a stream. To this end, a few online cross-modal hashing retrieval methods have been proposed, but they are based on shallow models, which may yield suboptimal retrieval performance. Therefore, we propose Deep Online Cross-modal Hashing by a Co-training Mechanism (DOCHCM), which introduces deep learning to online cross-modal hashing by cooperatively training two sub-networks in two stages. DOCHCM addresses two problematic aspects. First, in each round, the image sub-network incrementally learns the hash codes of the current chunk of images by preserving the semantic similarity between its output features and the hash codes of all texts; likewise, the text sub-network incrementally learns the hash codes of the current chunk of texts by preserving the semantic similarity between its output features and the hash codes of all images. Second, knowledge distillation is applied to the image and text sub-networks to avoid catastrophic forgetting, so that the two sub-networks not only learn new knowledge but also retain old knowledge. Extensive experiments on three benchmark datasets demonstrate that DOCHCM outperforms state-of-the-art cross-modal hashing retrieval methods.
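To make the two ingredients of each training round concrete, below is a minimal PyTorch sketch of the losses the abstract describes for the image sub-network: a similarity-preserving term between the chunk's output features and the accumulated text hash codes, plus a distillation term against a frozen copy of the network from earlier rounds. All names (`img_net`, `old_img_net`, the scaling by code length, the MSE forms, and the weight `lam`) are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def image_round_losses(img_net, old_img_net, images, text_codes, sim, lam=1.0):
    """One online round for the image sub-network (illustrative sketch).

    images:      current chunk of images, shape (n, ...)
    text_codes:  hash codes of all texts seen so far, shape (m, k),
                 entries in {-1, +1}
    sim:         semantic similarity between the chunk and all texts,
                 shape (n, m), e.g. +1 if a pair shares a label, -1 otherwise
    """
    k = text_codes.size(1)
    feats = img_net(images)                # (n, k) continuous outputs

    # Similarity preservation: the scaled inner product between image
    # features and text hash codes should agree with the semantic similarity.
    logits = feats @ text_codes.t() / k    # (n, m)
    sim_loss = F.mse_loss(logits, sim)

    # Knowledge distillation: keep outputs close to the frozen copy trained
    # in earlier rounds, mitigating catastrophic forgetting of old data.
    with torch.no_grad():
        old_feats = old_img_net(images)
    distill_loss = F.mse_loss(feats, old_feats)

    return sim_loss + lam * distill_loss
```

The text sub-network would use a symmetric loss, swapping the roles of the modalities: its chunk features are matched against the accumulated image hash codes.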

Keywords: Online hashing, Cross-modal retrieval, Online learning, Knowledge distillation, Deep learning

Article history: Received 7 June 2022, Revised 9 September 2022, Accepted 11 September 2022, Available online 17 September 2022, Version of Record 26 September 2022.

DOI: https://doi.org/10.1016/j.knosys.2022.109888