Methodologies for subject analysis in bibliographic databases

Abstract

Techniques and methodologies for subject analysis have changed in recent years, and current research indicates that the changes may be accelerating. The review reported in this paper was undertaken to help database managers determine whether new and little-known capabilities would improve the cost-effectiveness of subject analysis operations. Sophisticated computer aids to routine procedures in subject analysis seem likely to be valuable, although the capital investment required might limit their application in a given situation. Operational machine-aided and automatic indexing systems were found to form a continuum: the same system can be used for automatic indexing (without human review of individual documents) in one application and for machine-aided indexing (with human review) in another. Commercial automatic indexing packages were also reviewed. The overall conclusion was that database producers should begin working seriously on upgrading their thesauri and codifying their indexing policies as a means of moving toward the development of machine aids to indexing, but that fully automatic indexing is not yet ready for wholesale implementation. The primary obstacle to the development of automatic indexing is the lack of machine “understanding” of natural language. Research in artificial intelligence and knowledge bases is attacking this problem, but much work remains to be done. Recommendations for action include increasing the power of the indexer interface; studying indexing policies; enriching thesauri; taking steps that will contribute to the later development of knowledge bases; considering the development of machine-aided indexing; and applying the findings of natural language processing research.