Building a morpho-semantic knowledge graph for Arabic information retrieval

Authors:

Highlights:

• A morpho-semantic knowledge graph, CAMS-KG, is built from a vocalized Classical Arabic corpus.

• CAMS-KG combines tools for morphological analysis and disambiguation, and implements a concordance builder and a KG representation.

• The KG stores the extracted morpho-semantic knowledge, representing morphological categories and both morphological and semantic relations.

• BM25 ranking is used to retrieve the documents related to a given query.

• CAMS-KG is evaluated on two datasets (Tashkeela and ZAD). Several query expansion strategies are tested on 25 queries from the ZAD dataset.
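
The retrieval step named in the highlights can be illustrated with a minimal sketch of Okapi BM25 scoring combined with a simple query expansion. This is not the authors' implementation: the `related` map is a hypothetical stand-in for a lookup of morphologically or semantically related terms in the KG, the transliterated tokens are toy data, and `k1 = 1.5` and `b = 0.75` are commonly used defaults rather than values reported in the source.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    Assumes `docs` is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query_terms):
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(query_terms, related):
    """Append terms related to each query term (hypothetical KG lookup)."""
    expanded = list(query_terms)
    for term in query_terms:
        for candidate in related.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Toy corpus of transliterated tokens (illustrative only).
docs = [["kitab", "ilm"], ["katib", "risala"], ["bayt", "shajar"]]
# Hypothetical relation: "kitab" and "katib" share a root.
related = {"kitab": ["katib"]}

plain = bm25_scores(["kitab"], docs)
expanded = bm25_scores(expand_query(["kitab"], related), docs)
```

With the unexpanded query only the first document receives a non-zero score. After expansion the second document, which contains only the related form, is also retrieved, which is the effect a KG-driven expansion strategy aims for.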

Keywords: Morpho-semantic knowledge extraction, Classical Arabic text mining, Arabic information retrieval, Graph-based knowledge representation

Review history: Received 26 March 2019, Revised 8 September 2019, Accepted 9 September 2019, Available online 25 September 2019, Version of Record 20 October 2020.

Official paper URL: https://doi.org/10.1016/j.ipm.2019.102124