BNPA: An R package to learn path analysis input models from a data set semi-automatically using Bayesian networks

Abstract

Epidemiologists continually seek methodologies that help them better understand how diseases work, and populations need such advances to combat disease more effectively. Several authors in the literature argue that epidemiologists should be able to develop causal models. In this area, structural equation modeling (SEM) has stood out in scientific research. Although SEM is widely used in several research areas, it has been little explored by epidemiologists. Despite its evolution and efficiency, SEM has a gap when it comes to discovering causal relationships. To fill this gap, this study developed an R package called BNPA, whose methodology combines Bayesian network structure learning (BNSL) algorithms, which learn from data, with path analysis (PA), a subarea of SEM. BNPA includes pre-processing functions. Its main algorithm semi-automatically creates an input model for PA from a data set and generates information for analyzing PA performance. An analysis of the main predictors of cardiovascular disease was performed using BNPA with data from the Canadian Community Health Survey (CCHS). Multiple linear regression (MR) was used as a gold-standard methodology; the BNPA results matched 85% of the MR results. In conclusion, BNPA is efficient and can benefit researchers, especially novices, by enabling them to build PA models from data. Furthermore, statisticians and PA experts will have more time to support these researchers instead of creating an initial model.

Keywords: Bayesian networks, Path analysis, Causal inference, R-package

Article history: Received 8 October 2020, Revised 8 April 2021, Accepted 10 April 2021, Available online 18 April 2021, Version of Record 18 April 2021.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.107042