Discretization of continuous attributes through low frequency numerical values and attribute interdependency
Authors:
Highlights:
• A new discretization technique called LFD.
• Does not require any user input.
• Interval width, number, and frequency are determined automatically from the data.
• Minimizes information loss due to discretization by choosing low-frequency cut points.
• Categorical attributes are taken as reference points for discretization.
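
To make the idea of a low-frequency cut point concrete, here is a minimal Python sketch. It is not the LFD algorithm from the paper: the function names `low_frequency_cut_points` and `discretize` are hypothetical, the `n_cuts` parameter is a simplification (LFD is described as needing no user input), and the use of categorical attributes as reference points is not modeled at all. It only shows one plausible way to place interval boundaries at values that occur rarely in the data.

```python
from bisect import bisect_right
from collections import Counter


def low_frequency_cut_points(values, n_cuts):
    """Return the n_cuts least frequent interior values, sorted ascending.

    Illustrative assumption only: the paper's LFD chooses the number of
    cut points automatically; here it is passed in as n_cuts.
    """
    counts = Counter(values)
    # Skip the minimum and maximum so no interval is left empty.
    interior = sorted(counts)[1:-1]
    # Rank by frequency; break ties by value so the result is deterministic.
    ranked = sorted(interior, key=lambda v: (counts[v], v))
    return sorted(ranked[:n_cuts])


def discretize(values, cut_points):
    """Map each value to the index of its interval.

    A value equal to a cut point is placed in the interval above it.
    """
    return [bisect_right(cut_points, v) for v in values]


if __name__ == "__main__":
    data = [1, 1, 1, 2, 3, 3, 3, 3, 4, 5, 5, 5, 6, 6, 6, 6, 7]
    cuts = low_frequency_cut_points(data, n_cuts=2)
    print(cuts)
    print(discretize(data, cuts))
```

In this toy data the values 2 and 4 each occur once, so they are selected as boundaries, and the frequent values 1, 3, 5, and 6 fall strictly inside intervals rather than on an edge.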
Keywords: Data discretization, Data pre-processing, Data cleansing, Missing value imputation, Corrupt data detection, Data mining
Review history: Received 26 October 2014, Revised 5 October 2015, Accepted 6 October 2015, Available online 20 October 2015, Version of Record 10 November 2015.
Official paper URL: https://doi.org/10.1016/j.eswa.2015.10.005