Learning distributed discrete Bayesian Network Classifiers under MapReduce with Apache Spark
Authors:
Highlights:
• Supervised classification on large scale and high dimensional data.
• Adaptation of a well-known approach, Bayesian Network Classifiers, to MapReduce.
• Deep analysis of the scalability properties of the new methods.
• Extensive experimentation on several synthetic and real datasets.
• Implementation on the state-of-the-art Apache Spark library; open-source code available.
Keywords: Bayesian Network Classifiers, MapReduce, Big Data, High dimensionality, Apache Hadoop, Apache Spark
Review history: Received 17 March 2016, Revised 9 June 2016, Accepted 12 June 2016, Available online 22 June 2016, Version of Record 20 December 2016.
Official paper URL: https://doi.org/10.1016/j.knosys.2016.06.013