An analysis on new hybrid parameter selection model performance over big data set

Abstract

Parameter selection, also called attribute selection, is one of the crucial tasks in the data analysis process. Selecting the wrong attributes can lead to an imprecise or even a wrong decision. A decision-maker therefore benefits from being able to select and apply a model that helps identify the best-optimized attribute set in the decision analysis process. Recently, many data scientists from various application areas have been drawn to investigating and analyzing the advantages and disadvantages of big data. One issue is that analyzing large volumes and varieties of data in a big data environment is very challenging when no suitable model is available to implement and use as a guideline. Hence, this paper proposes an alternative parameterization model that is able to generate the most optimized attribute set without requiring a high cost to learn, use, and maintain. The model integrates two models that combine correlation-based feature selection, the best-first search algorithm, and soft set and rough set theories, which complement each other as a parameter selection method. Experimental results show that the proposed model is a significant alternative model in the big data analysis process.
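
To make two of the named components concrete, the following is a minimal sketch of correlation-based feature selection (CFS) driven by a forward best-first search. It is not the paper's model: the soft set and rough set stages are omitted, absolute Pearson correlation is assumed as the correlation measure, and the function names (`pearson`, `cfs_merit`, `best_first_cfs`), the `patience` stopping rule, and the toy data are all hypothetical choices made for illustration.

```python
import heapq
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Absolute Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(cov / sqrt(vx * vy))


def cfs_merit(subset, columns, target):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean attribute-target correlation and r_ff the mean
    attribute-attribute correlation, so redundant attributes are penalized.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(pearson(columns[f], target) for f in subset) / k
    pairs = list(combinations(subset, 2))
    if pairs:
        r_ff = sum(pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)
    else:
        r_ff = 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def best_first_cfs(columns, target, patience=5):
    """Forward best-first search over attribute subsets, scored by CFS merit.

    Stops after `patience` consecutive expansions without improvement
    (an assumed stopping rule) or when no unexplored subsets remain.
    """
    start = frozenset()
    frontier = [(0.0, [], start)]  # heap entries: (-merit, sorted names, subset)
    visited = {start}
    best_subset, best_merit = start, 0.0
    stale = 0
    while frontier and stale < patience:
        _, _, subset = heapq.heappop(frontier)  # most promising subset so far
        improved = False
        for name in columns:
            child = subset | {name}
            if name in subset or child in visited:
                continue
            visited.add(child)
            merit = cfs_merit(child, columns, target)
            heapq.heappush(frontier, (-merit, sorted(child), child))
            if merit > best_merit + 1e-12:
                best_subset, best_merit = child, merit
                improved = True
        stale = 0 if improved else stale + 1
    return sorted(best_subset), best_merit


# Toy data (hypothetical): "a_copy" is redundant with "a"; "noise" is weakly related.
columns = {
    "a": [1, 2, 3, 4, 5, 6],
    "a_copy": [2, 4, 6, 8, 10, 12],
    "noise": [3, 1, 4, 1, 5, 9],
}
target = [1, 2, 3, 4, 5, 6]
selected, merit = best_first_cfs(columns, target)
print(selected, merit)
```

On the toy data, adding the redundant or noisy attribute never raises the merit above that of `"a"` alone, which is the behavior CFS is designed to produce. A reduct computed with soft set or rough set theory could then be applied to the selected attributes, but that step is outside this sketch.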

Keywords: Big data, Parameter selection, Analysis tool, Decision, Hybrid method

Review history: Received 6 March 2019, Revised 23 December 2019, Accepted 26 December 2019, Available online 30 December 2019, Version of Record 24 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105441