Filtering search results using an optimal set of terms identified by an artificial neural network

Abstract

Information filtering (IF) systems usually filter data items by correlating a set of terms representing the user’s interest (a user profile) with similar sets of terms representing the data items. Many techniques can be employed to construct user profiles automatically, but they usually yield large sets of terms. Various dimensionality-reduction techniques can be applied to reduce the number of terms in a user profile. We describe a new term-selection technique that includes a dimensionality-reduction mechanism based on the analysis of a trained artificial neural network (ANN) model. Its novel feature is the identification of an optimal set of terms that can correctly classify data items that are relevant to a user. The proposed technique was compared with the classical Rocchio algorithm. We found that when all the distinct terms in the training set were used to train an ANN, the Rocchio algorithm outperformed the ANN-based filtering system; however, after the new dimensionality-reduction technique was applied, leaving only an optimal set of terms, the improved ANN technique outperformed both the original ANN and the Rocchio algorithm.
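To make the baseline concrete, the following is a minimal sketch of profile-based filtering in the Rocchio style: a user profile is built from term vectors of relevant and non-relevant training items, and a new item is accepted when its cosine similarity to the profile passes a threshold. Everything specific here is an assumption for illustration and is not taken from the paper: the function names, the weights `beta` and `gamma`, the threshold, the raw term-frequency weighting, and the toy documents. In particular, `top_terms` is a naive weight cutoff that only illustrates what a reduced profile looks like; it is not the ANN-based term selection the abstract describes.

```python
import math
from collections import Counter

def term_vector(text):
    """Raw term-frequency vector for a whitespace-tokenized document."""
    return Counter(text.lower().split())

def rocchio_profile(relevant, non_relevant, beta=0.75, gamma=0.15):
    """Profile = beta * mean(relevant vectors) - gamma * mean(non-relevant vectors).

    beta and gamma are illustrative values, not taken from the paper.
    """
    profile = Counter()
    for docs, weight in ((relevant, beta), (non_relevant, -gamma)):
        for doc in docs:
            for term, tf in term_vector(doc).items():
                profile[term] += weight * tf / len(docs)
    # Keep only positively weighted terms (a common simplification).
    return {t: w for t, w in profile.items() if w > 0}

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_terms(profile, k):
    """Naive reduction: keep the k highest-weighted terms (NOT the ANN method)."""
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])

def is_relevant(profile, text, threshold=0.2):
    """Accept an item when its similarity to the profile reaches the threshold."""
    return cosine(profile, term_vector(text)) >= threshold

# Hypothetical toy training data.
relevant_docs = ["neural network training", "neural network filtering"]
non_relevant_docs = ["football match results"]

profile = rocchio_profile(relevant_docs, non_relevant_docs)
reduced = top_terms(profile, 2)

print(sorted(reduced))
print(is_relevant(reduced, "neural network model"))
print(is_relevant(reduced, "football match results"))
```

The sketch shows why profile size matters: every term kept in the profile takes part in each similarity computation, so a smaller set of well-chosen terms makes filtering cheaper and, according to the abstract's findings, can also make it more accurate.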

Keywords: Information filtering, Feature selection, User profile, Artificial neural network

Review history: Received 7 June 2004, Accepted 21 March 2005, Available online 17 May 2005.

Official paper URL: https://doi.org/10.1016/j.ipm.2005.03.020