Finding cohesive clusters for analyzing knowledge communities

Authors: Vasileios Kandylas, S. Phineas Upham, Lyle H. Ungar

Abstract

Documents and authors can be clustered into “knowledge communities” based on the overlap in the papers they cite. We introduce a new clustering algorithm, Streemer, which finds cohesive foreground clusters embedded in a diffuse background, and use it to identify knowledge communities as foreground clusters of papers which share common citations. To analyze the evolution of these communities over time, we build predictive models with features based on the citation structure, the vocabulary of the papers, and the affiliations and prestige of the authors. Findings include that scientific knowledge communities tend to grow more rapidly if their publications build on diverse information and if they use a narrow vocabulary.

Keywords: Gaussian Mixture Model, Text Mining, Normalized Mutual Information, Knowledge Community, Cluster Ensemble

Paper URL: https://doi.org/10.1007/s10115-008-0135-5