Evaluating computer-generated domain-oriented vocabularies

Abstract

It is generally accepted that natural language understanding systems are not currently able to deal successfully with unrestricted text, except in very superficial ways. Certainly no current NL system exhibits any significant degree of understanding over arbitrary subject matter. Moreover, there is no convincing reason to believe this situation will change in the near future. Successful systems, therefore, have been restricted to specific applications in particular discourse domains. In those situations where users are expected to provide the domain vocabulary (e.g., TEAM, TQA), it would be very desirable to provide at least suggestions as to what this vocabulary might be, because a good part of the difficulty in customizing a general system consists of supplying the domain vocabulary and specifying its grammatical properties. This paper discusses some methods for identifying domain vocabulary, as well as techniques for evaluating the quality of the resulting word list.

Review history: Received 19 January 1990, Accepted 19 April 1990, Available online 19 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(90)90052-4