Effective web log mining and online navigational pattern prediction
Authors:
Abstract
Accurate web log mining and efficient online prediction of navigational patterns are crucial for tuning websites and, consequently, for retaining visitors. Like any other data mining task, web log mining starts with data cleaning and preparation and ends with the discovery of hidden knowledge that cannot be extracted using conventional methods. For this process to yield good results, it must rely on good-quality input data, so data cleaning and pre-processing deserve particular attention. Online prediction, in turn, faces the challenge of scalability, so any improvement in the efficiency of online prediction solutions is valuable. In response to these concerns, we propose an enhancement to the web log mining process and to online navigational pattern prediction. Our contribution has three components. First, we propose a refined time-out-based heuristic for session identification. Second, we suggest the use of a specific density-based algorithm for navigational pattern discovery. Finally, we suggest a new approach for efficient online prediction. The conducted experiments demonstrate the applicability and effectiveness of the proposed approach.
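For readers unfamiliar with time-out-based session identification, the following is a minimal sketch of the plain, fixed-threshold variant of the heuristic. It is not the refined heuristic the paper proposes, whose details the abstract does not give. The function name `sessionize`, the `(visitor_id, timestamp, url)` record layout, and the 1800-second threshold are assumptions made for illustration only.

```python
from collections import defaultdict


def sessionize(records, timeout=1800):
    """Split each visitor's requests into sessions by inactivity gap.

    records: iterable of (visitor_id, timestamp_seconds, url) tuples
             (hypothetical layout; real logs need parsing and cleaning first).
    timeout: largest allowed gap, in seconds, between two consecutive
             requests of the same session (assumed value, not from the paper).
    """
    # Group requests per visitor.
    by_visitor = defaultdict(list)
    for visitor, ts, url in records:
        by_visitor[visitor].append((ts, url))

    sessions = []
    for visitor, hits in by_visitor.items():
        hits.sort(key=lambda h: h[0])  # order requests chronologically
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, url in hits[1:]:
            if ts - last_ts > timeout:
                # Gap exceeds the threshold: close the session, start a new one.
                sessions.append((visitor, current))
                current = []
            current.append(url)
            last_ts = ts
        sessions.append((visitor, current))
    return sessions


records = [
    ("a", 0, "/"),
    ("a", 100, "/x"),
    ("a", 5000, "/y"),  # 4900 s after the previous request by "a"
    ("b", 50, "/"),
]
sessions = sessionize(records)
```

In this sketch a single global threshold decides every session boundary; refining that decision is the point at which a proposal such as the paper's would differ.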
Keywords: Web mining, Weblog mining, Pattern analysis, Prediction, Navigation, Indexing
Review history: Received 31 October 2012, Revised 16 April 2013, Accepted 18 April 2013, Available online 3 May 2013.
Official paper URL: https://doi.org/10.1016/j.knosys.2013.04.014