Building a morpho-semantic knowledge graph for Arabic information retrieval
Authors:
Highlights:
• A morpho-semantic knowledge graph, CAMS-KG, is built from a vocalized Classical Arabic corpus.
• CAMS-KG combines tools for morphological analysis and disambiguation, and implements a concordance builder and a KG representation.
• The KG stores the extracted morpho-semantic knowledge, representing morphological categories as well as morphological and semantic relations.
• BM25 ranking is used to retrieve related documents for a given query.
• CAMS-KG is evaluated on two datasets (Tashkeela and ZAD). Several query expansion strategies are tested on 25 queries from the ZAD dataset.
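To make the retrieval step in the highlights concrete, here is a minimal sketch of Okapi BM25 scoring combined with a simple query expansion. It is an illustration under stated assumptions, not the paper's implementation. The function names (`bm25_scores`, `expand_query`), the parameter values `k1=1.2` and `b=0.75`, the `related` lookup table standing in for KG relations, and the transliterated toy tokens are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that is always positive.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(query_terms, related):
    """Append related terms to the query, keeping order and skipping duplicates.

    `related` is a hypothetical dict standing in for relations read from a KG.
    """
    expanded = list(query_terms)
    for term in query_terms:
        for extra in related.get(term, []):
            if extra not in expanded:
                expanded.append(extra)
    return expanded


# Toy corpus of transliterated tokens (illustrative data only).
docs = [
    ["kataba", "kitab", "qalam"],
    ["kitab", "maktaba"],
    ["shams", "qamar"],
]
related = {"kataba": ["kitab", "maktaba"]}

plain = bm25_scores(["kataba"], docs)
expanded = bm25_scores(expand_query(["kataba"], related), docs)
```

In this toy setup the unexpanded query matches only the first document, while the expanded query also gives the second document a nonzero score. That is the kind of effect a query expansion strategy aims for.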
Keywords: Morpho-semantic knowledge extraction, Classical Arabic text mining, Arabic information retrieval, Graph-based knowledge representation
Review history: Received 26 March 2019, Revised 8 September 2019, Accepted 9 September 2019, Available online 25 September 2019, Version of Record 20 October 2020.
Official URL: https://doi.org/10.1016/j.ipm.2019.102124