A new technique for detecting patterns of term usage in text corpora

Abstract

Term occurrence A is included in term occurrence B if A is a substring of B. By making a single pass through a slightly non-standard KWIC index, every recurring phrase can be detected, and its inclusion relationships with other phrases and/or single words can be computed. Results obtained by processing a corpus of 2,675 medical titles indicate that several properties definable in terms of inclusion relationships among terms are significant for vocabulary control. Preliminary results from a corpus of more than 62,000 medical titles have confirmed this finding.
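
The idea can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not describe the "slightly non-standard" KWIC index, so a sorted list of word-level title suffixes stands in for it here. The function names, the two-word minimum for a phrase, and the reading of "substring" as a proper, contiguous run of words are all assumptions made for illustration.

```python
def suffix_index(titles):
    """Sorted word-level suffixes of every title.

    Assumption: a simplified stand-in for the KWIC index in the abstract.
    """
    entries = []
    for title in titles:
        words = title.lower().split()
        for i in range(len(words)):
            entries.append(tuple(words[i:]))
    entries.sort()
    return entries


def common_prefix_len(a, b):
    """Number of leading words shared by two word tuples."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def recurring_phrases(titles, min_words=2):
    """Single pass over adjacent index entries.

    Suffixes that share a prefix sit next to each other after sorting, so
    every phrase occurring at least twice shows up as a shared prefix of
    some adjacent pair.
    """
    entries = suffix_index(titles)
    phrases = set()
    for prev, cur in zip(entries, entries[1:]):
        shared = common_prefix_len(prev, cur)
        for n in range(min_words, shared + 1):
            phrases.add(cur[:n])
    return phrases


def is_included(a, b):
    """True if a is a proper, contiguous run of words inside b (assumed reading)."""
    if len(a) >= len(b):
        return False
    return any(b[i:i + len(a)] == a for i in range(len(b) - len(a) + 1))


def inclusion_pairs(phrases):
    """All (included, including) pairs among the detected phrases."""
    return {(a, b) for a in phrases for b in phrases if is_included(a, b)}


# Hypothetical titles, invented for this example only.
titles = [
    "acute renal failure in children",
    "chronic renal failure",
    "acute renal failure",
]
phrases = recurring_phrases(titles)
pairs = inclusion_pairs(phrases)
```

On these three invented titles the sketch finds the recurring phrases "acute renal", "acute renal failure" and "renal failure", with the first and third each included in the second. The pairwise inclusion check is quadratic in the number of phrases and is kept only for clarity; the abstract indicates that the original method obtains the inclusion relationships during the same pass through the index.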

Publication history: Available online 15 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(76)90064-9