On-line science: the world-wide telescope as a prototype for the new computational science.
Statistical learning from relational data.
Analyzing customer behavior at Amazon.com.
Andreas S. Weigend
Towards systematic design of distance functions for data mining applications.
Charu C. Aggarwal
Generative model-based clustering of directional data.
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, Suvrit Sra
Mining distance-based outliers in near linear time with randomization and a simple pruning rule.
Stephen D. Bay, Mark Schwabacher
Adaptive duplicate detection using learnable string similarity measures.
Mikhail Bilenko, Raymond J. Mooney
An iterative hypothesis-testing strategy for pattern discovery.
Richard J. Bolton, Niall M. Adams
Efficient data reduction with EASE.
Hervé Brönnimann, Bin Chen, Manoranjan Dash, Peter J. Haas, Peter Scheuermann
Extracting semantics from data cubes using cube transversals and closures.
Alain Casali, Rosine Cicchetti, Lotfi Lakhal
Translation-invariant mixture models for curve clustering.
Darya Chudova, Scott Gaffney, Eric Mjolsness, Padhraic Smyth
Inderjit S. Dhillon, Subramanyam Mallela, Dharmendra S. Modha
SEWeP: using site semantics and a taxonomy to enhance the Web personalization process.
Magdalini Eirinaki, Michalis Vazirgiannis, Iraklis Varlamis
Inverted matrix: efficient discovery of frequent items in large datasets in the context of interactive mining.
Mohammad El-Hajj, Osmar R. Zaïane
To buy or not to buy: mining airfare data to minimize ticket purchase price.
Oren Etzioni, Rattapoom Tuchinda, Craig A. Knoblock, Alexander Yates
Fragments of order.
Aristides Gionis, Teija Kujala, Heikki Mannila
Maximizing the spread of influence through a social network.
David Kempe, Jon M. Kleinberg, Éva Tardos
PROXIMUS: a framework for analyzing very high dimensional discrete-attributed datasets.
Mehmet Koyutürk, Ananth Grama
Visualizing changes in the structure of data for exploratory feature selection.
Elias Pampalk, Werner Goebl, Gerhard Widmer
Aggregation-based feature invention and relational concept classes.
Claudia Perlich, Foster J. Provost
Cross-training: learning probabilistic mappings between topics.
Sunita Sarawagi, Soumen Chakrabarti, Shantanu Godbole
Generating English summaries of time series data using the Gricean maxims.
Somayajulu Sripada, Ehud Reiter, Jim Hunter, Jin Yu
Assessment and pruning of hierarchical model based clustering.
Jeremy Tantrum, Alejandro Murua, Werner Stuetzle
Privacy-preserving k-means clustering over vertically partitioned data.
Jaideep Vaidya, Chris Clifton
Indexing multi-dimensional time-series with support for multiple distance measures.
Michail Vlachos, Marios Hadjieleftheriou, Dimitrios Gunopulos, Eamonn J. Keogh
Mining concept-drifting data streams using ensemble classifiers.
Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han
CLOSET+: searching for the best strategies for mining frequent closed itemsets.
Jianyong Wang, Jiawei Han, Jian Pei
Mining unexpected rules by pushing user dynamics.
Ke Wang, Yuelong Jiang, Laks V. S. Lakshmanan
On detecting differences between groups.
Geoffrey I. Webb, Shane M. Butler, Douglas A. Newlands
Algorithms for estimating relative importance in networks.
Scott White, Padhraic Smyth
Screening and interpreting multi-item associations based on log-linear modeling.
Xintao Wu, Daniel Barbará, Yong Ye
CloseGraph: mining closed frequent graph patterns.
Xifeng Yan, Jiawei Han
Eliminating noisy information in Web pages for data mining.
Lan Yi, Bing Liu, Xiaoli Li
Classifying large data sets using SVMs with hierarchical clusters.
Hwanjo Yu, Jiong Yang, Jiawei Han
XRules: an effective structural classifier for XML data.
Mohammed Javeed Zaki, Charu C. Aggarwal
Fast vertical mining using diffsets.
Mohammed Javeed Zaki, Karam Gouda
Efficient elastic burst detection in data streams.
Yunyue Zhu, Dennis E. Shasha
Golden Path Analyzer: using divide-and-conquer to cluster Web clickstreams.
Kamal Ali, Steven P. Ketchpel
Empirical Bayesian data mining for discovering patterns in post-marketing drug safety.
David M. Fram, June S. Almenoff, William DuMouchel
Mining hepatitis data with temporal abstraction.
Tu Bao Ho, Trong Dung Nguyen, Saori Kawasaki, Si Quang Le, DucDung Nguyen, Hideto Yokoi, Katsuhiko Takabayashi
Information awareness: a prospective technical assessment.
David D. Jensen, Matthew J. Rattigan, Hannah Blau
The data mining approach to automated software testing.
Mark Last, Menahem Friedman, Abraham Kandel
Passenger-based predictive modeling of airline no-show rates.
Richard D. Lawrence, Se June Hong, Jacques Cherrier
Capturing best practice for microarray gene expression data analysis.
Gregory Piatetsky-Shapiro, Tom Khabaza, Sridhar Ramaswamy
Clinical and financial outcomes analysis with existing hospital patient records.
R. Bharat Rao, Sathyakama Sandilya, Radu Stefan Niculescu, Colin Germond, Harsha Rao
Critical event prediction for proactive management in large-scale computer clusters.
Ramendra K. Sahoo, Adam J. Oliner, Irina Rish, Manish Gupta, José E. Moreira, Sheng Ma, Ricardo Vilalta, Anand Sivasubramaniam
Frequent-subsequence-based prediction of outer membrane proteins.
Rong She, Fei Chen, Ke Wang, Martin Ester, Jennifer L. Gardy, Fiona S. L. Brinkman
Discovery of climate indices using clustering.
Michael Steinbach, Pang-Ning Tan, Vipin Kumar, Steven A. Klooster, Christopher Potter
Knowledge-based data mining.
Sholom M. Weiss, Stephen J. Buckley, Shubir Kapoor, Søren Damgaard
The anatomy of a multimodal information filter.
Yi-Leh Wu, Kingshy Goh, Beitao Li, Huaxin You, Edward Y. Chang
Style mining of electronic messages for multiple authorship discrimination: first results.
Shlomo Argamon, Marin Saric, Sterling Stuart Stein
Mining high dimensional data for classifier knowledge.
Raj Bhatnagar, Goutham Kurra, Wen Niu
Finding recent frequent itemsets adaptively over online data streams.
Joong Hyuk Chang, Won Suk Lee
Probabilistic discovery of time series motifs.
Bill Yuan-chi Chiu, Eamonn J. Keogh, Stefano Lonardi
Understanding captions in biomedical publications.
William W. Cohen, Richard C. Wang, Robert F. Murphy
Using randomized response techniques for privacy-preserving data mining.
Wenliang Du, Justin Zhijun Zhan
Applications of sampling and fractional factorial designs to model-free data squashing.
William DuMouchel, Deepak K. Agarwal
Experiments with random projections for machine learning.
Dmitriy Fradkin, David Madigan
Accurate decision trees for mining high-speed data streams.
João Gama, Ricardo Rocha, Pedro Medas
Correlating synchronous and asynchronous data streams.
Sudipto Guha, Dimitrios Gunopulos, Nick Koudas
A Web page prediction model based on click-stream tree representation of user behavior.
Sule Gündüz, M. Tamer Özsu
Natural communities in large linked networks.
John E. Hopcroft, Omar Khan, Brian Kulis, Bart Selman
Navigating massive data sets via local clustering.
Michael E. Houle
Mining viewpoint patterns in image databases.
Wynne Hsu, Jing Dai, Mong-Li Lee
Playing hide-and-seek with correlations.
Interactive exploration of coherent patterns in time-series gene expression data.
Daxin Jiang, Jian Pei, Aidong Zhang
Efficient decision tree construction on streaming data.
Ruoming Jin, Gagan Agrawal
A bag of paths model for measuring structural similarity in Web documents.
Sachindra Joshi, Neeraj Agrawal, Raghu Krishnapuram, Sumit Negi
Nantonac collaborative filtering: recommendation based on order responses.
A two-way visualization method for clustered data.
Yehuda Koren, David Harel
Empirical comparisons of various voting methods in bagging.
Kelvin T. Leung, Douglas Stott Parker Jr.
Mining data records in Web pages.
Bing Liu, Robert L. Grossman, Yanhong Zhai
On computing, storing and querying frequent patterns.
Guimei Liu, Hongjun Lu, Wenwu Lou, Jeffrey Xu Yu
Online novelty detection on temporal sequences.
Junshui Ma, Simon Perkins
Distributed cooperative mining for information consortia.
Satoshi Morinaga, Kenji Yamanishi, Jun'ichi Takeuchi
Learning relational probability trees.
Jennifer Neville, David D. Jensen, Lisa Friedland, Michael Hay
Graph-based anomaly detection.
Caleb C. Noble, Diane J. Cook
Carpenter: finding closed patterns in long biological datasets.
Feng Pan, Gao Cong, Anthony K. H. Tung, Jiong Yang, Mohammed Javeed Zaki
New unsupervised clustering algorithm for large datasets.
William Peter, John Chiochetti, Clare Giardina
Improving spatial locality of programs via data mining.
Karlton Sequeira, Mohammed Javeed Zaki, Boleslaw K. Szymanski, Christopher D. Carothers
Mining phenotypes and informative genes from gene expression data.
Chun Tang, Aidong Zhang, Jian Pei
Weighted Association Rule Mining using weighted support and significance framework.
Feng Tao, Fionn Murtagh, Mohsen M. Farid
PaintingClass: interactive construction, visualization and exploration of decision trees.
Soon Tee Teoh, Kwan-Liu Ma
Time and sample efficient discovery of Markov blankets and direct causal relations.
Ioannis Tsamardinos, Constantin F. Aliferis, Alexander R. Statnikov
Distributed multivariate regression based on influential observations.
Hang Yu, Ee-Chien Chang
Efficiently handling feature redundancy in high-dimensional data.
Lei Yu, Huan Liu
An adaptive nearest neighbor search for a parts acquisition ePortal.
Rafael Alonso, Jeffrey A. Bloom, Hua Li, Chumki Basu
Architecting a knowledge discovery engine for military commanders utilizing massive runs of simulations.
Philip S. Barry, Jianping Zhang, Mary McDonald
Data quality through knowledge engineering.
Tamraparni Dasu, Gregg T. Vesonder, Jon R. Wright
Similarity analysis on government regulations.
Gloria T. Lau, Kincho H. Law, Gio Wiederhold
Experimental design for solicitation campaigns.
Uwe F. Mayer, Armand Sarkissian
Towards NIC-based intrusion detection.
Matthew Eric Otey, Srinivasan Parthasarathy, Amol Ghoting, G. Li, Sundeep Narravula, Dhabaleswar K. Panda
Data-driven validation, completion and construction of event relationship networks.
Chang-Shing Perng, David Thoenen, Genady Grabarnik, Sheng Ma, Joseph L. Hellerstein
Visualizing concept drift.
Kevin B. Pratt, Gleb Tschapek
Experimental study of discovering essential information from customer inquiry.
Keiko Shimazu, Atsuhito Momma, Koichi Furukawa
Applying data mining in investigating money laundering crimes.
Zhongfei (Mark) Zhang, John J. Salerno, Philip S. Yu