Concept dispersion, feature interaction and their effect on particular sources of bias in machine learning

Abstract

Many fast, efficient induction algorithms have been produced which perform well across many domains. However, in more difficult problems, where only low-level representations are available, these algorithms can perform poorly and abstract feature construction is often required. Greedy hill-climbing algorithms are harmed by feature interaction, where one attribute alone provides little information about the concept. We argue that the need for, and effects of, feature construction are often swamped by the bias of the base learner. Other methods of class formation which are not as susceptible to the problems associated with low-level representation should be explored and augmented as a basis for feature construction. We present an alternative, information-theoretic approach to detecting feature interaction and develop harsher constraints on hypothesis generation, based on relative and absolute measures.
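
The feature-interaction problem described in the abstract can be illustrated with a small sketch. It is not the detection method proposed in the paper; it only uses the standard information gain measure on a hypothetical toy concept (class = x0 XOR x1, with an irrelevant attribute x2) to show why a greedy learner that scores one attribute at a time may find nothing to split on. The function names and the data set are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attrs):
    """Entropy reduction from splitting on the joint value of `attrs`."""
    groups = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        groups.setdefault(key, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy concept: class = x0 XOR x1; x2 is irrelevant.
rows = [list(bits) for bits in product([0, 1], repeat=3)]
labels = [r[0] ^ r[1] for r in rows]

# Gain of each attribute scored on its own, as a greedy learner would do.
single_gains = [information_gain(rows, labels, [a]) for a in range(3)]
# Gain of the interacting pair scored jointly.
joint_gain = information_gain(rows, labels, [0, 1])
print(single_gains, joint_gain)
```

On this toy data every single attribute has zero gain, so x0 and x1 look no more useful than the irrelevant x2, while the pair (x0, x1) scored jointly accounts for the full one bit of class entropy. Comparing joint and individual scores in this way is one simple signal of interaction; the relative and absolute measures used in the paper may differ.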

Keywords: Algorithms, Concept learning, Hypothesis generation

Article history: Received 16 June 1998, Accepted 24 June 1998, Available online 29 December 1998.

DOI: https://doi.org/10.1016/S0950-7051(98)00067-7