
SIGKDD (KDD) 2014 Paper List

The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, New York, NY, USA - August 24 - 27, 2014.

Recommendation in social media: recent advances and new frontiers.
Statistically sound pattern discovery.
Sampling for big data: a tutorial.
Network mining and analysis for social applications.
Deep learning.
Correlation clustering: from theory to practice.
The recommender problem revisited: morning tutorial.
Management and analytic of biomedical big data with cloud-based in-memory database and dynamic querying: a hands-on experience with real-world data.
Computational epidemiology.
Bringing structure to text: mining phrases, entities, topics, and hierarchies.
Constructing and mining web-scale knowledge graphs: KDD 2014 tutorial.
Scaling up deep learning.
Does social good justify risking personal privacy?
Filling context-ad vocabulary gaps with click logs.
Modeling professional similarity by mining professional career trajectories.
We know what you want to buy: a demographic-based system for product recommendation on microblogs.
Large scale visual recommendations from street fashion images.
Large-scale high-precision topic modeling on Twitter.
An empirical study of reserve price optimisation in real-time bidding.
A system to grade computer programming skills using machine learning.
Automated hypothesis generation based on mining scientific literature.
Log-based predictive maintenance.
Seven rules of thumb for web site experimenters.
ISIS: a networked-epidemiology based pervasive web app for infectious disease pandemic planning and response.
Modeling impression discounting in large-scale recommender systems.
Reducing gang violence through network influence based targeting of social programs.
New algorithms for parking demand management and a city-scale deployment.
LASTA: large scale topic assignment on multiple social networks.
'Beating the news' with EMBERS: forecasting civil unrest using open source indicators.
Spatially embedded co-offence prediction using supervised learning.
Identifying tourists from public transport commuters.
Up next: retrieval methods for large scale related video suggestion.
Knock it off: profiling the online storefronts of counterfeit merchandise.
EARS (earthquake alert and report system): a real time decision support system for earthquake crisis management.
Applying data mining techniques to address critical process optimization needs in advanced manufacturing.
Predicting employee expertise for talent management in the enterprise.
A hazard based approach to user return time prediction.
FoodSIS: a text mining system to improve the state of food safety in Singapore.
Improving management of aquatic invasions by integrating shipping network, ecological, and environmental data: data mining for social good.
Scalable near real-time failure localization of data center networks.
A case study: privacy preserving release of spatio-temporal density in Paris.
Shallow semantic parsing of product offering titles (for better automatic hyperlink insertion).
Modeling mass protest adoption in social network communities using geometric Brownian motion.
Corporate residence fraud detection.
Style in the long tail: discovering unique interests with latent variable models in large scale social E-commerce.
Unveiling clusters of events for alert and incident management in large-scale enterprise IT.
Large scale predictive modeling for micro-simulation of 3G air interface load.
Budget pacing for targeted online advertisements at LinkedIn.
Activity ranking in LinkedIn feed.
Proactive workflow modeling by stochastic processes with application to healthcare operation and management.
Correlating events with time series for incident diagnosis.
Scalable hands-free transfer learning for online advertising.
Targeting direct cash transfers to the extremely poor.
Novel geospatial interpolation analytics for general meteorological measurements.
Predicting student risks through longitudinal analysis.
Mining text snippets for images on the web.
Guilt by association: large scale malware detection by mining file-relation graphs.
Bringing data science to the speakers of every language.
Big data for social good.
Information environment security.
Data science through the lens of social science.
Algorithms for interpretable machine learning.
Medicine in the age of electronic health records.
Predictive modeling in practice: a case study from Sprint.
Frontiers in E-commerce personalization.
Who are experts specializing in landscape photography?: analyzing topic-specific authority on content sharing services.
Predicting long-term impact of CQA posts: a comprehensive viewpoint.
Analyzing expert behaviors in collaborative networks.
Network structural analysis via core-tree-decomposition. (Publication of this article pending inquiry.)
Using strong triadic closure to characterize ties in social networks.
Balanced graph edge partition.
Graph sample and hold: a framework for big-graph analytics.
FAST-PPR: scaling personalized PageRank estimation for large graphs.
Efficient SimRank computation via linearization. (Publication of this article pending inquiry.)
Almost linear-time algorithms for adaptive betweenness centrality using hypergraph sketches.
The interplay between dynamics and networks: centrality, communities, and Cheeger inequality.
On the permanence of vertices in network communities.
Heat kernel based community detection.
Community detection in graphs through correlation.
Community membership identification from small seed sets.
Inside the atoms: ranking on a network of networks.
Focused clustering and outlier detection in large attributed graphs.
Temporal skeletonization on sequential data: patterns, categorization, and visualization.
Learning multifractal structure in large networks.
Core decomposition of uncertain graphs.
Minimizing seed set selection with probabilistic coverage guarantee in a social network.
Fast influence-based coarsening for large networks.
Meta-path based multi-network collective link prediction.
Activity-edge centric multi-label classification for mining heterogeneous information networks.
Who to follow and why: link prediction with explanations.
Stability of influence maximization.
MMRate: inferring multi-aspect diffusion networks with multi-pattern cascades.
Probabilistic latent network visualization: inferring and embedding diffusion networks.
Scalable diffusion-aware optimization of network topology.
A Bayesian framework for estimating properties of network diffusions.
On social event organization.
Profit-maximizing cluster hires.
FEMA: flexible evolutionary multi-faceted analysis for dynamic behavioral pattern discovery.
Event detection in activity networks.
Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs.
Open question answering over curated and extracted knowledge bases.
Entity profiling with varying source reliabilities.
Sentiment expression conditioned by affective transitions and social forces.
Integrating spreadsheet data via accurate and low-effort extraction.
Mining topics in documents: standing on the shoulders of big data.
Networked bandits with disjoint linear payoffs.
Modeling delayed feedback in display advertising.
Quantifying herding effects in crowd wisdom.
Optimal real-time bidding for display advertising.
From labor to trader: opinion elicitation via online crowds as a market.
Towards scalable critical alert mining.
Exploiting geographic dependencies for real estate appraisal: a mutual perspective of ranking and clustering.
Methods for ordinal peer grading.
Inferring gas consumption and pollution emission of vehicles throughout a city.
Grouping students in educational settings.
Semantic visualization for spherical representation.
Provable deterministic leverage score sampling.
LWI-SVD: low-rank, windowed, incremental singular value decompositions on time-evolving data sets.
Clustering and projected clustering with adaptive neighbors.
Fast DTT: a near linear algorithm for decomposing a tensor into factor tensors.
Mobile app recommendations with security and privacy awareness.
CatchSync: catching synchronized behavior in large directed graphs.
Top-k frequent itemsets via differentially private FP-trees.
Exponential random graph estimation under differential privacy.
Differentially private network data release via structural inference.
Dynamics of news events and social media reaction.
Reducing the sampling complexity of topic models.
Experiments with non-parametric topic models.
SigniTrend: scalable detection of emerging topics in textual streams by hashed significance thresholds.
TCS: efficient topic discovery over crowd-oriented service data.
Product selection problem: improve market share by learning consumer behavior.
Detecting anomalies in dynamic rating data: a robust probabilistic model for rating evolution.
GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation.
ClusCite: effective citation recommendation by information network-based clustering.
Optimal recommendations under attraction, aversion, and social influence.
Matching users and items across domains to improve the recommendation quality.
Scalable heterogeneous translated hashing.
Unifying learning to rank and domain adaptation: enabling cross-task document scoring.
Multi-task copula by sparse graph regression.
Efficient multi-task feature learning with calibration.
Personalized search result diversification via structured learning.
LaSEWeb: automating search strategies over semi-structured web data.
Identifying and labeling search tasks via query-based Hawkes processes.
Crowdsourced time-sync video tagging using temporal and personalized topic modeling.
Open-domain quantity queries on web tables: annotation, response, and consensus models.
DeepWalk: online learning of social representations.
Improved testing of low rank matrices.
Distance queries from sampled data: accurate and efficient.
Streaming submodular maximization: massive data summarization on the fly.
Efficient mini-batch training for stochastic optimization.
Scaling out big data missing value imputations: Pythia vs. Godzilla.
Correlation clustering in MapReduce.
Scalable histograms on large probabilistic data.
Fast flux discriminant for large-scale sparse nonlinear classification.
Improving the modified Nyström method using spectral shifting.
Knowledge vault: a web-scale approach to probabilistic knowledge fusion.
Online Chinese restaurant process.
Learning with dual heterogeneity: a nonparametric Bayes model.
Empirical glitch explanations.
Parallel Gibbs sampling for hierarchical Dirichlet processes via gamma processes equivalence.
Factorized sparse learning models with interpretable high order feature interactions.
Safe and efficient screening for sparse support vector machine.
Simultaneous feature and feature group selection through hard thresholding.
Gradient boosted feature selection.
Effective global approaches for mutual information based feature selection.
Active collaborative permutation learning.
Active semi-supervised learning using sampling theory for graph signals.
Large-scale adaptive semi-supervised learning via unified inductive and transductive model.
Active learning for sparse Bayesian multilabel classification.
Active-transductive learning with label-adapted kernels.
Time-varying learning and content analytics via sparse factor analysis.
Streamed approximate counting of distinct elements: beating optimal batch methods.
The setwise stream classification problem.
Detecting moving object outliers in massive-scale trajectory streams.
Prototype-based learning on concept-drifting data streams.
Utilizing temporal patterns for estimating uncertainty in interpretable early decision making.
Learning time-series shapelets.
FBLG: a simple and effective approach for temporal dependence discovery from time series data.
GLAD: group anomaly detection in social media analysis.
Sleep analytics and online selective anomaly detection.
Supervised deep learning with auxiliary networks.
Incremental and decremental training for linear classification.
Box drawings for learning with imbalanced data.
Distance metric learning using dropout: a structured regularization approach.
Large margin distribution machine.
Class-distribution regularized consensus maximization for alleviating overfitting in model combination.
Online multiple kernel regression.
An efficient algorithm for weak hierarchical lasso.
A multi-class boosting method with direct optimization.
FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning.
SMVC: semi-supervised multi-view clustering in subspace projections.
Representative clustering of uncertain data.
A Dirichlet multinomial mixture model-based approach for short text clustering.
Batch discovery of recurring rare classes toward identifying anomalous samples.
Relevant overlapping subspace clusters on categorical data.
User effort minimization through adaptive diversification.
Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS).
Topic-factorized ideal point estimation model for legislative voting network.
Leveraging user libraries to bootstrap collaborative filtering.
COM: a generative model for group recommendation.
Dual beta process priors for latent cluster discovery in chronic obstructive pulmonary disease.
Clinical risk prediction with multilinear sparse logistic regression.
From micro to macro: data driven phenotyping by densification of longitudinal electronic medical records.
Scalable noise mining in long-term electrocardiographic time-series to predict death following heart attacks.
Marble: high-throughput phenotyping from electronic health records via sparse nonnegative tensor factorization.
FUNNEL: automatic mining of spatially coevolving epidemics.
Good-enough brain model: challenges, algorithms and discoveries in multi-subject experiments.
Unsupervised learning of disease progression models.
Unfolding physiological state: mortality modelling in intensive care units.
People on drugs: credibility of user statements in health communities.
LUDIA: an aggregate-constrained low-rank reconstruction algorithm to leverage publicly released health data.
A cost-effective recommender system for taxi drivers.
Modeling human location data with mixtures of kernel densities.
Travel time estimation of a path using sparse trajectories.
Inferring user demographics and social strategies in mobile social networks.
Prediction of human emergency behavior and their mobility following large-scale disaster.
Bugbears or legitimate threats?: (social) scientists' criticisms of machine learning?
A data driven approach to diagnosing and treating disease.
Data, predictions, and decisions in support of people and society.
The battle for the future of data mining.