PragmaticOIE: a pragmatic open information extraction for Portuguese language

Authors: Cleiton Fernando Lima Sena, Daniela Barreiro Claro

Abstract

Information extraction (IE) involves the extraction of useful facts from texts. IE approaches fall into two types: Traditional IE and Open IE. Traditional IE recognizes a predefined set of relationships between arguments and has typically been applied to specific domains. Open IE extracts relationship descriptors expressing any semantic relationship between a pair of arguments across different domains. However, because a sentence can have different meanings depending on the context and intention with which it is used, semantic analysis alone does not guarantee useful extractions. Extractions depend on the context and the intention inherent in a sentence, which go beyond its semantic meaning. Thus, a pragmatic analysis enhances the set of extractions by considering contextual and intentional aspects, and as a consequence new facts can be extracted from the same set of sentences. The combination of inference, context, and intention enables the extraction of implicit facts from texts, achieving a first pragmatic level. This novel approach increases the number of extracted facts by analyzing inference, context, and intention when extracting relationships from a sentence. This is the first method to analyze a first pragmatic level of a sentence within a set of Portuguese text documents. Our method was evaluated on a set of Portuguese text documents and outperforms the most relevant related work in accuracy, number of extracted facts, and minimality.

Keywords: Open information extraction, Relation extraction, Inference, Context, Intention, Pragmatics

Paper URL: https://doi.org/10.1007/s10115-020-01442-7