SIGKDD (KDD) 2002 Paper List
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 23-26, 2002, Edmonton, Alberta, Canada.
Transforming classifier scores into accurate multiclass probability estimates.
Topic-conditioned novelty detection.
CLOPE: a fast and effective clustering algorithm for transactional data.
A unifying framework for detecting outliers and change points from non-stationary time series data.
B-EM: a classifier incorporating bootstrap with EM approach for data mining.
Making every bit count: fast nonlinear axis scaling.
Discovery net: towards a grid of knowledge discovery.
Item selection by "hub-authority" profit ranking.
Non-linear dimensionality reduction techniques for classification and visualization.
Privacy preserving association rule mining in vertically partitioned data.
What's the code?: automatic classification of source code archives.
Single-shot detection of multiple categories of text using parametric mixture models.
Combining clustering and co-training to enhance text classification using unlabelled data.
Discovering word senses from text.
Evaluating classifiers' performance in a constrained environment.
Incremental context mining for adaptive document classification.
Collusion in the U.S. crop insurance program: applied data mining.
Discovering informative content blocks from Web documents.
A robust and efficient clustering algorithm based on cohesion self-merging.
Distributed data mining in a chain store database of short transactions.
Instability of decision tree classification algorithms.
Construct robust rule sets for classification.
Clustering seasonality patterns in the presence of errors.
Finding surprising patterns in a time series database in linear time and space.
Similarity measure based on partial information of time series.
SimRank: a measure of structural-context similarity.
A model for discovering customer value for E-content.
Mining complex models from arbitrarily large databases in constant time.
Visualization support for a user-centered KDD process.
Scaling multi-class support vector machines using inter-class confusion.
SyMP: an efficient clustering approach to identify clusters of arbitrary shapes in large data sets.
Integrating feature and instance selection for text classification.
Tumor cell identification using features rules.
Tina Eliassi-Rad, Terence Critchlow, Ghaleb Abdulla.
SECRET: a scalable linear regression tree algorithm.
Learning to match and cluster large high-dimensional data sets for data integration.
CVS: a Correlation-Verification based Smoothing technique on information retrieval and term clustering.
A new two-phase sampling based algorithm for discovering association rules.
Extracting decision trees from trained neural networks.
Topics in 0-1 data.
A theoretical framework for learning from a pool of disparate data sources.
Frequent term-based text clustering.
Sequential PAttern mining using a bitmap representation.
Collaborative crawling: mining user experiences for topical resource discovery.
Mining heterogeneous gene expression data with time lagged recurrent neural networks.
On the potential of domain literature for clustering and Bayesian network learning.
Handling very large numbers of association rules in the analysis of microarray data.
ADMIT: anomaly-based data mining for intrusions.
Learning nonstationary models of normal network traffic for detecting novel attacks.
Mining intrusion detection alarms for actionable knowledge.
A system for real-time competitive market intelligence.
Learning domain-independent string transformation weights for high accuracy object identification.
Mining product reputations on the Web.
Customer lifetime value modeling and its use for customer retention planning.
Exploiting response models: optimizing cross-sell and up-sell opportunities in banking.
From run-time behavior to usage scenarios: an interaction-pattern mining approach.
Efficient handling of high-dimensional feature spaces by randomized classifier ensembles.
Predicting rare classes: can boosting make any weak learner strong?
Exploiting unlabeled data in ensemble methods.
Transforming data to satisfy privacy constraints.
Interactive deduplication using active learning.
Sequential cost-sensitive decision making with reinforcement learning.
Web site mining: a new way to spot competitors, customers and suppliers in the world wide web.
PEBL: positive example based learning for Web page classification using SVM.
Mining frequent item sets by opportunistic projection.
Privacy preserving mining of association rules.
A refinement approach to handling model misfit in text categorization.
A parallel learning algorithm for text classification.
Enhanced word clustering for hierarchical text classification.
Hierarchical model-based clustering of large datasets through fractionation and refractionation.
Shrinkage estimator generalizations of Proximal Support Vector Machines.
On effective classification of strings with wavelets.
Pattern discovery in sequences under a Markov assumption.
Relational Markov models and their application to adaptive web navigation.
Optimizing search engines using clickthrough data.
On interactive visualization of high-dimensional data using the hyperbolic plane.
Query, analysis, and visualization of hierarchically structured data using Polaris.
On the need for time series data mining benchmarks: a survey and empirical demonstration.
Bursty and hierarchical structure in streams.
ANF: a fast and scalable tool for data mining in massive graphs.
Efficiently mining frequent trees in a forest.
Mining knowledge-sharing sites for viral marketing.
Querying multiple sets of discovered rules.
DualMiner: a dual-pruning algorithm for itemsets with constraints.
Selecting the right interestingness measure for association patterns.
MARK: a boosting algorithm for heterogeneous kernel models.
Scalable robust covariance and correlation estimates for data mining.
Bayesian analysis of massive datasets via particle filters.