
SIGKDD (KDD) 2009 Paper List

Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28 - July 1, 2009.

OLAP on search logs: an infrastructure supporting data-driven applications in search engines.
Intelligent file scoring system for malware detection from the gray list.
Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy.
Named entity mining from click-through data using weakly supervised latent Dirichlet allocation.
PSkip: estimating relevance ranking quality from web search clickthrough data.
Can we learn a template-independent wrapper for news article extraction from a single training site?
Mining brain region connectivity for Alzheimer's disease study via sparse inverse covariance estimation.
Predicting bounce rates in sponsored search advertisements.
BGP-lens: patterns and anomalies in internet routing updates.
Sustainable operation and management of data center chillers using temporal data mining.
Towards a universal marketplace over the web: statistical multi-label classification of service provider forms with simulated annealing.
Anonymizing healthcare data: a case study on the blood transfusion service.
Sentiment analysis of blogs by combining lexical knowledge with text classification.
SNARE: a link analytic system for graph labeling and risk detection.
Clustering event logs using iterative partitioning.
Beyond blacklists: learning to detect malicious web sites from suspicious URLs.
Towards combining web classification and web information extraction: a case study.
Learning dynamic temporal graphs for oil-production equipment monitoring system.
Grocery shopping recommendations based on basket-sensitive random walk.
Query result clustering for object-level search.
OpinionMiner: a novel machine learning system for web opinion mining and extraction.
Network anomaly detection based on Eigen equation compression.
COA: finding novel patents through text analysis.
Catching the drift: learning broad matches from clickthrough data.
Address standardization with latent semantic association.
Improving classification accuracy using automatically extracted training data.
Migration motif: a spatial-temporal pattern mining approach for financial markets.
Entity discovery and assignment for opinion mining applications.
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data.
Seven pitfalls to avoid when running controlled experiments on the web.
A case study of behavior-driven conjoint analysis on Yahoo!: front page today module.
Applying syntactic similarity algorithms for enterprise information management.
Enabling analysts in managed services for CRM analytics.
Modeling and predicting user behavior in sponsored search.
Augmenting the generalized Hough transform to enable the mining of petroglyphs.
Primal sparse Max-margin Markov networks.
Mining rich session context to improve web search.
Cross domain distribution adaptation via kernel mapping.
Information theoretic regularization for semi-supervised boosting.
Co-evolution of social and affiliation networks.
Parallel community detection on large networks with propinquity dynamics.
Toward autonomic grids: analyzing the job flow with affinity streaming.
Learning patterns in the dynamics of biological networks.
Mining social networks for personalized email prioritization.
Exploring social tagging graph for web object classification.
Time series shapelets: a new primitive for data mining.
Efficient methods for topic model inference on streaming document collections.
Combining link and content for community detection: a discriminative approach.
Effective multi-label active learning for text classification.
Fast approximate spectral clustering.
Quantification and semi-supervised classification methods for handling changes in class distribution.
A LRT framework for fast spatial anomaly detection.
Adapting the right measures for K-means clustering.
Mining broad latent query aspects from search sessions.
Learning, indexing, and diagnosing network faults.
Category detection using hierarchical mean shift.
DOULION: counting triangles in massive graphs with a coin.
Constant-factor approximation algorithms for identifying dynamic communities.
Relational learning via latent social dimensions.
Social influence analysis in large-scale networks.
Ranking-based clustering of heterogeneous information networks with star network schema.
Causality quantification and its applications: structuring and modeling of multivariate time series.
User grouping behavior in online forums.
Anomalous window discovery through scan statistics for linear intersecting paths (SSLIP).
Mining discrete patterns via binary matrix factorization.
Measuring the effects of preprocessing decisions and network forces in dynamic network analysis.
Scalable graph clustering using stochastic flows: applications to community discovery.
Learning optimal ranking with tensor factorization for tag recommendation.
A principled and flexible framework for finding alternative clusterings.
Audience selection for on-line brand advertising: privacy-friendly social network targeting.
Towards efficient mining of proportional fault-tolerant frequent itemsets.
CP-summary: a concise representation for browsing frequent itemsets.
An association analysis approach to biclustering.
Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering.
TANGENT: a novel, 'Surprise me', recommendation algorithm.
Correlated itemset mining in ROC space: a constraint programming approach.
WhereNext: a location predictor on trajectory pattern mining.
Differentially private recommender systems: building privacy into the Netflix Prize contenders.
Large-scale graph mining using backbone refinement classes.
Characterizing individual communication patterns.
Using graph-based metrics with empirical risk minimization to speed up active learning on networked data.
Spatial-temporal causal modeling for climate change attribution.
Grouped graphical Granger modeling methods for temporal causal modeling.
Consensus group stable feature selection.
Classification of software behaviors for failure detection: a discriminative pattern mining approach.
Large-scale sparse logistic regression.
BBM: Bayesian browsing model from petabyte-scale data.
MetaFac: community discovery via relational hypergraph factorization.
On the tradeoff between privacy and utility in data publishing.
DynaMMo: mining and summarization of coevolving sequences with missing values.
Meme-tracking and the dynamics of the news cycle.
Improving data mining utility with projective sampling.
On burstiness-aware search for document sequences.
Finding a team of experts in social networks.
Collective annotation of Wikipedia entities in web text.
Collaborative filtering with temporal dynamics.
Characteristic relational patterns.
Genre-based decomposition of email class noise.
Cartesian contour: a concise representation for a collection of frequent sets.
Drosophila gene expression pattern annotation using sparse features and term-term interactions.
TrustWalker: a random walk model for combining trust-based and item-based recommendation.
Exploiting Wikipedia as external knowledge for document clustering.
Tell me something I don't know: randomization strategies for iterative data mining.
Analyzing patterns of user content generation in online social networks.
Co-clustering on manifolds.
Multi-focal learning and its application to customer service support.
Heterogeneous source consensus learning via decision propagation and negotiation.
Issues in evaluation of stream learning algorithms.
Scalable pseudo-likelihood estimation in hybrid random fields.
A multi-relational approach to spatial classification.
Feature shaping for linear SVM classifiers.
Turning down the noise in the blogosphere.
Learning with a non-exhaustive training dataset: a case study: detection of bacteria cultures using optical-scattering technology.
Large human communication networks: patterns and a utility-driven generator.
Efficiently learning the accuracy of labeling sources for selective sampling.
Mining for the most certain predictions from dyadic data.
A generalized Co-HITS algorithm and its application to bipartite graphs.
Regret-based online ranking for a growing digital library.
On compressing social networks.
Large-scale behavioral targeting.
Efficient influence maximization in social networks.
Constrained optimization for validation-guided conditional random field learning.
Extracting discriminative concepts for domain adaptation in text mining.
Connections between the lines: augmenting social networks with text.
Efficient anomaly monitoring over moving object trajectory streams.
CoCo: coding cost for parameter-free outlier detection.
New ensemble methods for evolving data streams.
The offset tree for learning with partial labels.
Probabilistic frequent itemset mining in uncertain databases.
Temporal mining for interactive workflow data analysis.
Improving clustering stability with combinatorial MRFs.
Optimizing web traffic via the media scheduling problem.
A viewpoint-based approach for interaction graph analysis.
Collusion-resistant anonymous data collection method.
Detection of unique temporal segments by information theoretic meta-clustering.
Name-ethnicity classification from open sources.
Structured correspondence topic models for mining captioned figures in biological literature.
Frequent pattern mining with uncertain data.
Regression-based latent factor models.
Open standards and cloud computing: KDD-2009 panel report.
Network science: an introduction to recent statistical approaches.
Data mining at NASA: from theory to applications.
Randomization methods in data mining.
Mining web logs: applications and challenges.
Mismatched models, wrong results, and dreadful decisions: on choosing appropriate data mining tools.