SIGKDD (KDD) 2007 Paper List

Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007.

Data mining at the crossroads: successes, failures and learning from them.
Truth discovery with multiple conflicting information providers on the web.
IMDS: intelligent malware detection system.
Machine learning for stock selection.
LungCAD: a clinically approved, machine learning system for lung cancer detection.
Event summarization for system management.
Domain-constrained semi-supervised mining of tracking models in sensor networks.
Detecting changes in large data sets of payment card data: a case study.
A framework for classification and segmentation of massive audio data streams.
Extracting relevant named entities for automated expense reimbursement.
Corroborate and learn facts from the web.
Mining complex power networks for blackout prevention.
High-quantile modeling for customer wallet estimation and other applications.
Distributed classification in peer-to-peer networks.
Practical guide to controlled experiments on the web: listen to your customers not to the hippo.
Cleaning disguised missing data: a heuristic approach.
Relational data pre-processing techniques for improved securities fraud detection.
On-board analysis of uncalibrated data for a spacecraft at mars.
An event-based framework for characterizing the evolutionary behavior of interaction graphs.
Webpage understanding: an integrated approach.
Joint optimization of wrapper generation and template detection.
Mining templates from search result records of search engines.
Information distance from a question to an answer.
From frequent itemsets to semantically meaningful visual patterns.
Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming.
Detecting time series motifs under uniform scaling.
Model-shared subspace boosting for multi-label classification.
SCAN: a structural clustering algorithm for networks.
Local decomposition for rare class analysis.
Mining favorable facets.
Generalized component analysis for text with heterogeneous attributes.
Mining correlated bursty topic patterns from coordinated text streams.
Privacy-preservation for gradient descent methods.
Characterising the difference.
Scalable look-ahead linear regression trees.
Fast direction-aware proximity for graph mining.
Fast best-effort pattern matching in large attributed graphs.
A scalable modular convex solver for regularized risk minimization.
A framework for community identification in dynamic social networks.
Enhancing semi-supervised clustering: a feature projection perspective.
Weighting versus pruning in rule validation for detecting network and host anomalies.
GraphScope: parameter-free mining of large time-evolving graphs.
Use of ranked cross document evidence trails for hypothesis generation.
Statistical change detection for multi-dimensional data.
Making generative classifiers robust to selection bias.
A spectral clustering approach to optimally combining numerical vectors with a modular network.
Partial example acquisition in cost-sensitive learning.
A concept-based model for enhancing text categorization.
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases.
Practical learning from one-sided feedback.
Using hierarchical clustering for learning the ontologies used in recommendation systems.
Knowledge discovery of multiple-topic document using parametric mixture model with dirichlet prior.
Hierarchical mixture models: a probabilistic analysis.
Active exploration for learning rankings from clickthrough data.
Tracking multiple topics for finding interesting articles.
Applying collaborative filtering techniques to movie search for better ranking and browsing.
Association analysis-based transformations for protein interaction networks: a function prediction case study.
Mining optimal decision trees from itemset lattices.
Multiscale topic tomography.
Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters.
Expertise modeling for matching papers with reviewers.
Automatic labeling of multinomial topic models.
Nestedness and segmented nestedness.
A probabilistic framework for relational clustering.
Efficient mining of iterative patterns for software specification discovery.
BoostCluster: boosting clustering by pairwise constraints.
Very sparse stable random projections for dimension reduction in lalpha (0
Mining statistically important equivalence classes and delta-discriminative emerging patterns.
Cost-effective outbreak detection in networks.
A fast algorithm for finding frequent episodes in event streams.
Raising the baseline for high-precision text classifiers.
Correlation search in graph databases.
Exploiting duality in summarization with deterministic guarantees.
Detecting research topics via the correlation between graphs and texts.
Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis.
Finding low-entropy sets and trees from binary data.
Enhanced max margin learning on multimodal data mining in a multimedia database.
Trajectory pattern mining.
Constraint-driven clustering.
The minimum consistent subset cover problem and its applications in data mining.
Time-dependent event hierarchy construction.
Finding tribes: identifying close-knit individuals from employment patterns.
Semi-supervised classification with hybrid generative/discriminative methods.
Development of NeuroElectroMagnetic ontologies (NEMO): a framework for mining brainwave ontologies.
A learning framework using Green's function and kernel regularization with application to recommender system.
A framework for simultaneous co-clustering and learning from complex data.
Efficient incremental constrained clustering.
Feature selection methods for text classification.
Detecting anomalous records in categorical datasets.
Co-clustering based classification for out-of-domain documents.
Canonicalization of database records using adaptive similarity measures.
Exploiting underrepresented query aspects for automatic query expansion.
Stochastic processes and temporal data mining.
Discovering the hidden structure of house prices with a non-parametric latent manifold model.
Structural and temporal analysis of the blogosphere through community factorization.
Evolutionary spectral clustering by incorporating temporal smoothness.
Cross-language information retrieval using PARAFAC2.
Density-based clustering for real-time stream data.
Nonlinear adaptive distance metric learning for clustering.
Support feature machine for classification of abnormal brain activity.
Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus.
Modeling relationships at multiple scales to improve accuracy of large recommender systems.
Real-time ranking with concept drift using expert advice.
Extracting semantic relations from query logs.
Temporal causal modeling with graphical granger methods.
Show me the money!: deriving the pricing power of product features by mining consumer reviews.
Xproj: a framework for projected structural clustering of xml documents.
On string classification in data streams.
Predictive discrete latent factor models for large scale dyadic data.
Estimating rates of rare events at multiple resolutions.
Efficient and effective explanation of change in hierarchical summaries.
Challenges in mining social network data: processes, privacy, and paradoxes.
From mining the web to inventing the new sciences underlying the internet.
Calculating latent demand in the long tail.