kdd | Knowledge Discovery and Data Mining (KDD)

  • Organizer / Publisher: ACM
  • Field: Data mining
  • CCF rank / JCR quartile: Class A
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28 - July 1, 2009.
Mismatched models, wrong results, and dreadful decisions: on choosing appropriate data mining tools.

David J. Hand

Mining web logs: applications and challenges.

Ravi Kumar

Randomization methods in data mining.

Heikki Mannila

Data mining at NASA: from theory to applications.

Ashok N. Srivastava

Network science: an introduction to recent statistical approaches.

Stanley Wasserman

Open standards and cloud computing: KDD-2009 panel report.

Michael Zeller, Robert Grossman, Christoph Lingenfelder, Michael R. Berthold, Erik Marcade, Rick Pechter, Mike Hoskins, Wayne Thompson, Rich Holada

Regression-based latent factor models.

Deepak Agarwal, Bee-Chung Chen

Frequent pattern mining with uncertain data.

Charu C. Aggarwal, Yan Li, Jianyong Wang, Jing Wang

Structured correspondence topic models for mining captioned figures in biological literature.

Amr Ahmed, Eric P. Xing, William W. Cohen, Robert F. Murphy

Name-ethnicity classification from open sources.

Anurag Ambekar, Charles B. Ward, Jahangir Mohammed, Swapna Male, Steven Skiena

Detection of unique temporal segments by information theoretic meta-clustering.

Shin Ando, Einoshin Suzuki

Collusion-resistant anonymous data collection method.

Mafruz Zaman Ashrafi, See-Kiong Ng

A viewpoint-based approach for interaction graph analysis.

Sitaram Asur, Srinivasan Parthasarathy

Optimizing web traffic via the media scheduling problem.

Lars Backstrom, Jon M. Kleinberg, Ravi Kumar

Improving clustering stability with combinatorial MRFs.

Ron Bekkerman, Martin Scholz, Krishnamurthy Viswanathan

Temporal mining for interactive workflow data analysis.

Michele Berlingerio, Fabio Pinelli, Mirco Nanni, Fosca Giannotti

Probabilistic frequent itemset mining in uncertain databases.

Thomas Bernecker, Hans-Peter Kriegel, Matthias Renz, Florian Verhein, Andreas Züfle

The offset tree for learning with partial labels.

Alina Beygelzimer, John Langford

New ensemble methods for evolving data streams.

Albert Bifet, Geoffrey Holmes, Bernhard Pfahringer, Richard Kirkby, Ricard Gavaldà

CoCo: coding cost for parameter-free outlier detection.

Christian Böhm, Katrin Haegler, Nikola S. Müller, Claudia Plant

Efficient anomaly monitoring over moving object trajectory streams.

Yingyi Bu, Lei Chen, Ada Wai-Chee Fu, Dawei Liu

Connections between the lines: augmenting social networks with text.

Jonathan Chang, Jordan L. Boyd-Graber, David M. Blei

Extracting discriminative concepts for domain adaptation in text mining.

Bo Chen, Wai Lam, Ivor W. Tsang, Tak-Lam Wong

Constrained optimization for validation-guided conditional random field learning.

Minmin Chen, Yixin Chen, Michael R. Brent, Aaron E. Tenney

Efficient influence maximization in social networks.

Wei Chen, Yajun Wang, Siyu Yang

Large-scale behavioral targeting.

Ye Chen, Dmitry Pavlov, John F. Canny

On compressing social networks.

Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, Prabhakar Raghavan

Regret-based online ranking for a growing digital library.

Erick Delage

A generalized Co-HITS algorithm and its application to bipartite graphs.

Hongbo Deng, Michael R. Lyu, Irwin King

Mining for the most certain predictions from dyadic data.

Meghana Deodhar, Joydeep Ghosh

Efficiently learning the accuracy of labeling sources for selective sampling.

Pinar Donmez, Jaime G. Carbonell, Jeff G. Schneider

Large human communication networks: patterns and a utility-driven generator.

Nan Du, Christos Faloutsos, Bai Wang, Leman Akoglu

Learning with a non-exhaustive training dataset: a case study: detection of bacteria cultures using optical-scattering technology.

Murat Dundar, E. Daniel Hirleman, Arun K. Bhunia, J. Paul Robinson, Bartek Rajwa

Turning down the noise in the blogosphere.

Khalid El-Arini, Gaurav Veda, Dafna Shahaf, Carlos Guestrin

Feature shaping for linear SVM classifiers.

George Forman, Martin Scholz, Shyamsundar Rajaram

A multi-relational approach to spatial classification.

Richard Frank, Martin Ester, Arno J. Knobbe

Scalable pseudo-likelihood estimation in hybrid random fields.

Antonino Freno, Edmondo Trentin, Marco Gori

Issues in evaluation of stream learning algorithms.

João Gama, Raquel Sebastião, Pedro Pereira Rodrigues

Heterogeneous source consensus learning via decision propagation and negotiation.

Jing Gao, Wei Fan, Yizhou Sun, Jiawei Han

Multi-focal learning and its application to customer service support.

Yong Ge, Hui Xiong, Wenjun Zhou, Ramendra K. Sahoo, Xiaofeng Gao, Weili Wu

Co-clustering on manifolds.

Quanquan Gu, Jie Zhou

Analyzing patterns of user content generation in online social networks.

Lei Guo, Enhua Tan, Songqing Chen, Xiaodong Zhang, Yihong Eric Zhao

Tell me something I don't know: randomization strategies for iterative data mining.

Sami Hanhijärvi, Markus Ojala, Niko Vuokko, Kai Puolamäki, Nikolaj Tatti, Heikki Mannila

Exploiting Wikipedia as external knowledge for document clustering.

Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, Xiaohua Zhou

TrustWalker: a random walk model for combining trust-based and item-based recommendation.

Mohsen Jamali, Martin Ester

Drosophila gene expression pattern annotation using sparse features and term-term interactions.

Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, Sudhir Kumar, Jieping Ye

Cartesian contour: a concise representation for a collection of frequent sets.

Ruoming Jin, Yang Xiang, Lin Liu

Genre-based decomposition of email class noise.

Aleksander Kolcz, Gordon V. Cormack

Characteristic relational patterns.

Arne Koopman, Arno Siebes

Collaborative filtering with temporal dynamics.

Yehuda Koren

Collective annotation of Wikipedia entities in web text.

Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti

Finding a team of experts in social networks.

Theodoros Lappas, Kun Liu, Evimaria Terzi

On burstiness-aware search for document sequences.

Theodoros Lappas, Benjamin Arai, Manolis Platakis, Dimitrios Kotsakos, Dimitrios Gunopulos

Improving data mining utility with projective sampling.

Mark Last

Meme-tracking and the dynamics of the news cycle.

Jure Leskovec, Lars Backstrom, Jon M. Kleinberg

DynaMMo: mining and summarization of coevolving sequences with missing values.

Lei Li, James McCann, Nancy S. Pollard, Christos Faloutsos

On the tradeoff between privacy and utility in data publishing.

Tiancheng Li, Ninghui Li

MetaFac: community discovery via relational hypergraph factorization.

Yu-Ru Lin, Jimeng Sun, Paul Castro, Ravi B. Konuru, Hari Sundaram, Aisling Kelliher

BBM: Bayesian browsing model from petabyte-scale data.

Chao Liu, Fan Guo, Christos Faloutsos

Large-scale sparse logistic regression.

Jun Liu, Jianhui Chen, Jieping Ye

Classification of software behaviors for failure detection: a discriminative pattern mining approach.

David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo, Chengnian Sun

Consensus group stable feature selection.

Steven Loscalzo, Lei Yu, Chris H. Q. Ding

Grouped graphical Granger modeling methods for temporal causal modeling.

Aurelie C. Lozano, Naoki Abe, Yan Liu, Saharon Rosset

Spatial-temporal causal modeling for climate change attribution.

Aurelie C. Lozano, Hongfei Li, Alexandru Niculescu-Mizil, Yan Liu, Claudia Perlich, Jonathan R. M. Hosking, Naoki Abe

Using graph-based metrics with empirical risk minimization to speed up active learning on networked data.

Sofus A. Macskassy

Characterizing individual communication patterns.

R. Dean Malmgren, Jake M. Hofman, Luis A. Nunes Amaral, Duncan J. Watts

Large-scale graph mining using backbone refinement classes.

Andreas Maunz, Christoph Helma, Stefan Kramer

Differentially Private Recommender Systems: Building Privacy into the Netflix Prize Contenders.

Frank McSherry, Ilya Mironov

WhereNext: a location predictor on trajectory pattern mining.

Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti

Correlated itemset mining in ROC space: a constraint programming approach.

Siegfried Nijssen, Tias Guns, Luc De Raedt

TANGENT: a novel, 'Surprise me', recommendation algorithm.

Kensuke Onuma, Hanghang Tong, Christos Faloutsos

Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering.

Rong Pan, Martin Scholz

An association analysis approach to biclustering.

Gaurav Pandey, Gowtham Atluri, Michael Steinbach, Chad L. Myers, Vipin Kumar

CP-summary: a concise representation for browsing frequent itemsets.

Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan

Towards efficient mining of proportional fault-tolerant frequent itemsets.

Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan

Audience selection for on-line brand advertising: privacy-friendly social network targeting.

Foster J. Provost, Brian Dalessandro, Rod Hook, Xiaohan Zhang, Alan Murray

A principled and flexible framework for finding alternative clusterings.

Zijie Qi, Ian Davidson

Learning optimal ranking with tensor factorization for tag recommendation.

Steffen Rendle, Leandro Balby Marinho, Alexandros Nanopoulos, Lars Schmidt-Thieme

Scalable graph clustering using stochastic flows: applications to community discovery.

Venu Satuluri, Srinivasan Parthasarathy

Measuring the effects of preprocessing decisions and network forces in dynamic network analysis.

Jerry Scripps, Pang-Ning Tan, Abdol-Hossein Esfahanian

Mining discrete patterns via binary matrix factorization.

Bao-Hong Shen, Shuiwang Ji, Jieping Ye

Anomalous window discovery through scan statistics for linear intersecting paths (SSLIP).

Lei Shi, Vandana Pursnani Janeja

User grouping behavior in online forums.

Xiaolin Shi, Jun Zhu, Rui Cai, Lei Zhang

Causality quantification and its applications: structuring and modeling of multivariate time series.

Takashi Shibuya, Tatsuya Harada, Yasuo Kuniyoshi

Ranking-based clustering of heterogeneous information networks with star network schema.

Yizhou Sun, Yintao Yu, Jiawei Han

Social influence analysis in large-scale networks.

Jie Tang, Jimeng Sun, Chi Wang, Zi Yang

Relational learning via latent social dimensions.

Lei Tang, Huan Liu

Constant-factor approximation algorithms for identifying dynamic communities.

Chayant Tantipathananandh, Tanya Y. Berger-Wolf

DOULION: counting triangles in massive graphs with a coin.

Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos

Category detection using hierarchical mean shift.

Pavan Vatturi, Weng-Keen Wong

Learning, indexing, and diagnosing network faults.

Ting Wang, Mudhakar Srivatsa, Dakshi Agrawal, Ling Liu

Mining broad latent query aspects from search sessions.

Xuanhui Wang, Deepayan Chakrabarti, Kunal Punera

Adapting the right measures for K-means clustering.

Junjie Wu, Hui Xiong, Jian Chen

A LRT framework for fast spatial anomaly detection.

Mingxi Wu, Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums

Quantification and semi-supervised classification methods for handling changes in class distribution.

Jack Chongjie Xue, Gary M. Weiss

Fast approximate spectral clustering.

Donghui Yan, Ling Huang, Michael I. Jordan

Effective multi-label active learning for text classification.

Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Chen

Combining link and content for community detection: a discriminative approach.

Tianbao Yang, Rong Jin, Yun Chi, Shenghuo Zhu

Efficient methods for topic model inference on streaming document collections.

Limin Yao, David M. Mimno, Andrew McCallum

Time series shapelets: a new primitive for data mining.

Lexiang Ye, Eamonn J. Keogh

Exploring social tagging graph for web object classification.

Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han

Mining social networks for personalized email prioritization.

Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon

Learning patterns in the dynamics of biological networks.

Chang Hun You, Lawrence B. Holder, Diane J. Cook

Toward autonomic grids: analyzing the job flow with affinity streaming.

Xiangliang Zhang, Cyril Furtlehner, Julien Perez, Cécile Germain-Renaud, Michèle Sebag

Parallel community detection on large networks with propinquity dynamics.

Yuzhou Zhang, Jianyong Wang, Yi Wang, Lizhu Zhou

Co-evolution of social and affiliation networks.

Elena Zheleva, Hossam Sharara, Lise Getoor

Information theoretic regularization for semi-supervised boosting.

Lei Zheng, Shaojun Wang, Yan Liu, Chi-Hoon Lee

Cross domain distribution adaptation via kernel mapping.

Erheng Zhong, Wei Fan, Jing Peng, Kun Zhang, Jiangtao Ren, Deepak S. Turaga, Olivier Verscheure

Mining rich session context to improve web search.

Guangyu Zhu, Gilad Mishne

Primal sparse Max-margin Markov networks.

Jun Zhu, Eric P. Xing, Bo Zhang

Augmenting the generalized Hough transform to enable the mining of petroglyphs.

Qiang Zhu, Xiaoyue Wang, Eamonn J. Keogh, Sang-Hee Lee

Modeling and predicting user behavior in sponsored search.

Josh Attenberg, Sandeep Pandey, Torsten Suel

Enabling analysts in managed services for CRM analytics.

Indrajit Bhattacharya, Shantanu Godbole, Ajay Gupta, Ashish Verma, Jeff Achtermann, Kevin English

Applying syntactic similarity algorithms for enterprise information management.

Ludmila Cherkasova, Kave Eshghi, Charles B. Morrey III, Joseph Tucek, Alistair C. Veitch

A case study of behavior-driven conjoint analysis on Yahoo!: front page today module.

Wei Chu, Seung-Taek Park, Todd Beaupre, Nitin Motgi, Amit Phadke, Seinjuti Chakraborty, Joe Zachariah

Seven pitfalls to avoid when running controlled experiments on the web.

Thomas Crook, Brian Frasca, Ron Kohavi, Roger Longbotham

Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data.

Srivatsava Daruru, Nena M. Marin, Matt Walker, Joydeep Ghosh

Entity discovery and assignment for opinion mining applications.

Xiaowen Ding, Bing Liu, Lei Zhang

Migration motif: a spatial - temporal pattern mining approach for financial markets.

Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.

Improving classification accuracy using automatically extracted training data.

Ariel Fuxman, Anitha Kannan, Andrew B. Goldberg, Rakesh Agrawal, Panayiotis Tsaparas, John C. Shafer

Address standardization with latent semantic association.

Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su

Catching the drift: learning broad matches from clickthrough data.

Sonal Gupta, Mikhail Bilenko, Matthew Richardson

COA: finding novel patents through text analysis.

Mohammad Al Hasan, W. Scott Spangler, Thomas D. Griffin, Alfredo Alba

Network anomaly detection based on Eigen equation compression.

Shunsuke Hirose, Kenji Yamanishi, Takayuki Nakata, Ryohei Fujimaki

OpinionMiner: a novel machine learning system for web opinion mining and extraction.

Wei Jin, Hung Hay Ho, Rohini K. Srihari

Query result clustering for object-level search.

Jongwuk Lee, Seung-won Hwang, Zaiqing Nie, Ji-Rong Wen

Grocery shopping recommendations based on basket-sensitive random walk.

Ming Li, M. Benjamin Dias, Ian H. Jarman, Wael El-Deredy, Paulo J. G. Lisboa

Learning dynamic temporal graphs for oil-production equipment monitoring system.

Yan Liu, Jayant R. Kalagnanam, Oivind Johnsen

Towards combining web classification and web information extraction: a case study.

Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi

Beyond blacklists: learning to detect malicious web sites from suspicious URLs.

Justin Ma, Lawrence K. Saul, Stefan Savage, Geoffrey M. Voelker

Clustering event logs using iterative partitioning.

Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios

SNARE: a link analytic system for graph labeling and risk detection.

Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos

Sentiment analysis of blogs by combining lexical knowledge with text classification.

Prem Melville, Wojciech Gryc, Richard D. Lawrence

Anonymizing healthcare data: a case study on the blood transfusion service.

Noman Mohammed, Benjamin C. M. Fung, Patrick C. K. Hung, Cheuk-kwong Lee

Towards a universal marketplace over the web: statistical multi-label classification of service provider forms with simulated annealing.

Kivanc M. Ozonat, Donald Young

Sustainable operation and management of data center chillers using temporal data mining.

Debprakash Patnaik, Manish Marwah, Ratnesh K. Sharma, Naren Ramakrishnan

BGP-lens: patterns and anomalies in internet routing updates.

B. Aditya Prakash, Nicholas Valler, David Andersen, Michalis Faloutsos, Christos Faloutsos

Predicting bounce rates in sponsored search advertisements.

D. Sculley, Robert G. Malkin, Sugato Basu, Roberto J. Bayardo

Mining brain region connectivity for Alzheimer's disease study via sparse inverse covariance estimation.

Liang Sun, Rinkal Patel, Jun Liu, Kewei Chen, Teresa Wu, Jing Li, Eric Reiman, Jieping Ye

Can we learn a template-independent wrapper for news article extraction from a single training site?

Junfeng Wang, Chun Chen, Can Wang, Jian Pei, Jiajun Bu, Ziyu Guan, Wei Vivian Zhang

PSkip: estimating relevance ranking quality from web search clickthrough data.

Kuansan Wang, Toby Walker, Zijian Zheng

Named entity mining from click-through data using weakly supervised latent Dirichlet allocation.

Gu Xu, Shuang-Hong Yang, Hang Li

Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy.

Jiang-Ming Yang, Rui Cai, Chunsong Wang, Hua Huang, Lei Zhang, Wei-Ying Ma

Intelligent file scoring system for malware detection from the gray list.

Yanfang Ye, Tao Li, Qingshan Jiang, Zhixue Han, Li Wan

OLAP on search logs: an infrastructure supporting data-driven applications in search engines.

Bin Zhou, Daxin Jiang, Jian Pei, Hang Li