SAUText - a system for analysis of unstructured textual data

Authors: Grzegorz Protaziuk, Jacek Lewandowski, Robert Bembenik

Abstract

Semantic lexical resources such as ontologies are becoming increasingly important in many systems, in particular those providing access to unstructured textual data. Such resources are typically built from existing repositories and by analyzing available texts. In practice, however, building new resources of this type, or enriching existing ones, cannot be accomplished without an appropriate tool. This paper presents SAUText, a new system that provides the infrastructure for research involving semantic resources and the analysis of unstructured textual data. The system uses a dedicated repository for storing various kinds of text data and exploits parallelization to speed up the analysis. As an example of a knowledge discovery method available in the system, a new approach to synonym discovery is introduced.

Keywords: Text mining, Text analysis system, Ontology enrichment, Synonym discovery

Paper URL: https://doi.org/10.1007/s10844-015-0384-1