SIGKDD (KDD) 2010 Paper List - Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, July 25-28, 2010 | DataLearner

Why label when you can search?: alternatives to active learning for applying human resources to build classification models under extreme class imbalance.

Josh Attenberg Foster J. Provost

The new iris data: modular data generators.

Iris Adä Michael R. Berthold

Learning with cost intervals.

Xu-Ying Liu Zhi-Hua Zhou

Cold start link prediction.

Vincent Leroy Berkant Barla Cambazoglu Francesco Bonchi

DUST: a generalized notion of similarity between uncertain time series.

Smruti R. Sarangi Karin Murthy

On the quality of inferring interests from social neighbors.

Zhen Wen Ching-Yung Lin

Privacy-preserving outsourcing support vector machines with random transformation.

Keng-Pei Lin Ming-Syan Chen

Versatile publishing for privacy preservation.

Xin Jin Mingyang Zhang Nan Zhang Gautam Das

Feature selection for support vector regression using probabilistic prediction.

Jian-Bo Yang Chong Jin Ong

Unsupervised feature selection for multi-cluster data.

Deng Cai Chiyuan Zhang Xiaofei He

An efficient algorithm for a class of fused lasso problems.

Jun Liu Lei Yuan Jieping Ye

A scalable two-stage approach for a class of dimensionality reduction techniques.

Liang Sun Betul Ceran Jieping Ye

Grafting-light: fast, incremental feature selection and structure learning of Markov random fields.

Jun Zhu Ni Lao Eric P. Xing

Probably the best itemsets.

Nikolaj Tatti

Mining top-k frequent items in a data stream with flexible sliding windows.

Hoang Thanh Lam Toon Calders

Mining uncertain data with probabilistic guarantees.

Liwen Sun Reynold Cheng David W. Cheung Jiefeng Cheng

Frequent regular itemset mining.

Salvatore Ruggieri

UP-Growth: an efficient algorithm for high utility itemset mining.

Vincent S. Tseng Cheng-Wei Wu Bai-En Shie Philip S. Yu

New perspectives and methods in link prediction.

Ryan Lichtenwalter Jake T. Lussier Nitesh V. Chawla

Suggesting friends using the implicit social graph.

Maayan Roth Assaf Ben-David David Deutscher Guy Flysher Ilan Horn Ari Leichtberg Naty Leiser Yossi Matias Ron Merom

User browsing models: relevance versus examination.

Ramakrishnan Srikant Sugato Basu Ni Wang Daryl Pregibon

Estimating rates of rare events with multiple hierarchies through scalable log-linear models.

Deepak Agarwal Rahul Agrawal Rajiv Khanna Nagaraj Kota

Mining advisor-advisee relationships from research publication networks.

Chi Wang Jiawei Han Yuntao Jia Jie Tang Duo Zhang Yintao Yu Jingyi Guo

Medical coding classification by leveraging inter-code relationships.

Yan Yan Glenn Fung Jennifer G. Dy Rómer Rosales

An integrated machine learning approach to stroke prediction.

Aditya Khosla Yu Cao Cliff Chiung-Yu Lin Hsu-Kuang Chiu Junling Hu Honglak Lee

Active learning for biomedical citation screening.

Byron C. Wallace Kevin Small Carla E. Brodley Thomas A. Trikalinos

Metric forensics: a multi-level approach for mining volatile graphs.

Keith Henderson Tina Eliassi-Rad Christos Faloutsos Leman Akoglu Lei Li Koji Maruhashi B. Aditya Prakash Hanghang Tong

TIARA: a visual exploratory text analytic system.

Furu Wei Shixia Liu Yangqiu Song Shimei Pan Michelle X. Zhou Weihong Qian Lei Shi Li Tan Qiang Zhang

Malstone: towards a benchmark for analytics on large data clouds.

Collin Bennett Robert L. Grossman David Locke Jonathan Seidman Steve Vejcik

Tropical cyclone event sequence similarity search via dimensionality reduction and metric learning.

Shen-Shyang Ho Wenqing Tang W. Timothy Liu

Using data mining techniques to address critical information exchange needs in disaster affected public-private networks.

Li Zheng Chao Shen Liang Tang Tao Li Steven Luis Shu-Ching Chen Vagelis Hristidis

Diagnosing memory leaks using graph mining on heap dumps.

Evan K. Maxwell Godmar Back Naren Ramakrishnan

Beyond heuristics: learning to classify vulnerabilities and predict exploits.

Mehran Bozorgi Lawrence K. Saul Stefan Savage Geoffrey M. Voelker

Automatic malware categorization using cluster ensemble.

Yanfang Ye Tao Li Yong Chen Qingshan Jiang

Detecting abnormal coupled sequences and sequence changes in group-based manipulative trading behaviors.

Longbing Cao Yuming Ou Philip S. Yu Gang Wei

Optimizing debt collections using constrained reinforcement learning.

Naoki Abe Prem Melville Cezar Pendus Chandan K. Reddy David L. Jensen Vince P. Thomas James J. Bennett Gary F. Anderson Brent R. Cooley Melissa Kowalczyk Mark Domick Timothy Gardinier

Data mining to predict and prevent errors in health insurance claims processing.

Mohit Kumar Rayid Ghani Zhu-Song Mei

Discovery of significant emerging trends.

Saurabh Goorha Lyle H. Ungar

Multiple kernel learning for heterogeneous anomaly detection: algorithm and aviation safety case study.

Santanu Das Bryan L. Matthews Ashok N. Srivastava Nikunj C. Oza

MineFleet®: an overview of a widely adopted distributed vehicle performance data mining system.

Hillol Kargupta Kakali Sarkar Michael Gilligan

Exploitation and exploration in a performance based contextual advertising system.

Wei Li Xuerui Wang Ruofei Zhang Ying Cui Jianchang Mao Rong Jin

Overlapping experiment infrastructure: more, better, faster experimentation.

Diane Tang Ashish Agarwal Deirdre O'Brien Mike Meyer

Evaluating online ad campaigns in a pipeline: causal models at scale.

David Chan Rong Ge Ori Gershony Tim Hesterberg Diane Lambert

The quantification of advertising: (+ lessons from building businesses based on large scale data mining).

Konrad Feldman

Data winnowing.

Yoav Freund

Data mining in the online services industry.

Qi Lu