Domain Adaptation for POS Tagging with Contrastive Monotonic Chunk-wise Attention

Authors: Rajesh Kumar Mundotiya, Arpit Mehta, Rupjyoti Baruah

Abstract

Part-of-speech (POS) tagging is a sequence labelling task and one of the core applications of Natural Language Processing. It remains a challenging problem for low-resource languages, in part because annotated datasets are not available in ample amounts. Sequence labelling algorithms aim to model the relationships among the words of a sentence. Contrastive training has been explored as a robust approach that captures the essential features during model training. Building on it, we propose the Contrastive Monotonic Chunk-wise attention with CNN-GRU-Softmax (CMCCGS) model architecture for POS tagging, which learns optimal features in a low-resource regime. CMCCGS comprises three components: contrastive training, monotonic chunk-wise attention and CNN-GRU-Softmax, where the monotonic chunk-wise attention exploits discrete and chunk-level dependencies. We experimented on seven datasets: four domains of the Hindi treebank (Article, Conversation, Disease and Tourism), the Tweet domain of TweeBank, the Newswire domain of the Penn TreeBank (PTB) and the Tweet domain of ARK, and compared CMCCGS with several state-of-the-art models. CMCCGS obtained accuracies of \(96.63\%\), \(94.34\%\), \(91.24\%\), \(93.76\%\), \(92.30\%\), \(97.51\%\) and \(93.55\%\) on the respective domains. We further extended CMCCGS to domain adaptation, using single-source and multi-source adaptation with fine-tuning, and analysed its effects on different layers. The extremely low-resource domains, such as Tourism, Disease and the Tweet domains of TweeBank and ARK, improved in accuracy by \(+3.00\% (96.76\%)\) with Article as the source domain, \(+4.14\% (95.38\%)\) with Article and Tourism (multi-source), \(+2.93\% (95.23\%)\) with PTB, and \(+1.43\% (94.98\%)\) with PTB and TweeBank (multi-source), respectively. However, the Conversation domain had a negative impact on domain adaptation.
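
The reported gains can be cross-checked against the baseline accuracies with a few lines of arithmetic. The sketch below is only a reader's sanity check, not part of the paper: it assumes the seven accuracies map to the domains in the order they are listed, and that each gain is an absolute difference in percentage points between the adapted and the non-adapted CMCCGS accuracy. The variable and function names are illustrative.

```python
# Accuracies (%) reported in the abstract, assumed to follow the order
# in which the domains are listed.
baseline = {
    "Article": 96.63,
    "Conversation": 94.34,
    "Disease": 91.24,
    "Tourism": 93.76,
    "TweeBank": 92.30,
    "PTB": 97.51,
    "ARK": 93.55,
}

# Accuracies (%) after domain adaptation, paired with the source domain(s).
adapted = {
    "Tourism": (96.76, "Article"),
    "Disease": (95.38, "Article + Tourism"),
    "TweeBank": (95.23, "PTB"),
    "ARK": (94.98, "PTB + TweeBank"),
}


def gains(baseline, adapted):
    """Return the absolute gain in percentage points for each adapted domain."""
    return {
        domain: round(accuracy - baseline[domain], 2)
        for domain, (accuracy, _source) in adapted.items()
    }


if __name__ == "__main__":
    for domain, gain in gains(baseline, adapted).items():
        source = adapted[domain][1]
        print(f"{domain}: +{gain:.2f} points (source: {source})")
```

Under these assumptions the computed differences match the gains stated in the abstract (+3.00, +4.14, +2.93 and +1.43), which supports reading them as absolute percentage-point improvements over the non-adapted model.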

Keywords: Part of Speech tagging, Domain adaptation, Contrastive training, Monotonic chunk-wise attention

Paper URL: https://doi.org/10.1007/s11063-022-10746-4