SIGKDD (KDD) 2013 Paper List

The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, Chicago, IL, USA, August 11-14, 2013.

The dataminer's guide to scalable mixed-membership and nonparametric Bayesian models.
Network sampling.
Entity resolution for big data.
Big data analytics for healthcare.
Mining data from mobile devices: a survey of smart sensing and analytics.
Algorithmic techniques for modeling and mining large graphs (AMAzING).
Risk-O-Meter: an intelligent clinical risk calculator.
A transfer learning based framework of crowd-selection on Twitter.
LAFT-Explorer: inferring, visualizing and predicting how your social network expands.
FIU-Miner: a fast, integrated, and user-friendly system for data mining in distributed environment.
SAE: social analytic engine for large networks.
SEA: a system for event analysis on Chinese tweets.
EventCube: multi-dimensional search and mining of structured and text data.
When TEDDY meets GrizzLY: temporal dependency discovery for triggering road deicing operations.
An online system with end-user services: mining novelty concepts from TV broadcast subtitles.
Understanding Twitter data with TweetXplorer.
KeySee: supporting keyword search on evolving events in social streams.
Real-time disease surveillance using Twitter data: demonstration on flu and cancer.
Forex-foreteller: currency trend modeling using news articles.
STED: semi-supervised targeted-interest event detection in Twitter.
A tool for collecting provenance data in social media.
AMETHYST: a system for mining and exploring topical hierarchies of heterogeneous data.
Inferring distant-time location in low-sampling-rate trajectories.
JobMiner: a real-time system for mining job-related patterns from social media.
LAICOS: an open source platform for personalized social web search.
Panel: a data scientist's guide to making money from start-ups.
U-Air: when urban air quality inference meets big data.
A privacy preserving framework for managing vehicle data in road pricing systems.
Gaussian multiple instance learning approach for mapping the slums of the world using very high resolution imagery.
An integrated framework for suicide risk prediction.
Mining for geographically disperse communities in social networks by leveraging distance modularity.
Detecting insider threats in a real corporate database of computer usage activity.
Experience from hosting a corporate prediction market: benefits beyond the forecasts.
Exploratory analysis of highly heterogeneous document collections.
Assessing team strategy using spatiotemporal data.
Discriminant malware distance learning on structural information for automated malware classification.
Efficiently rewriting large multimedia application execution traces with few event sequences.
Empirical Bayes model to combine signals of adverse drug reactions.
Heat pump detection from coarse grained smart meter data with positive and unlabeled learning.
Palette power: enabling visual search through colors.
Knowledge discovery from massive healthcare claims data.
Uncertainty in online experiments with dependent data: an evaluation of bootstrap methods.
Predictive model performance: offline and online evaluations.
Towards long-lead forecasting of extreme flood events: a data mining framework for precipitation cluster precursors identification.
Why people hate your app: making sense of user feedback in a mobile app store.
A data mining driven risk profiling method for road asset management.
Improving quality control by early prediction of manufacturing outcomes.
An integrated framework for optimizing automatic monitoring systems in large IT infrastructures.
Using co-visitation networks for detecting large scale online display advertising exchange fraud.
Modeling and probabilistic reasoning of population evacuation during large-scale disaster.
Ad click prediction: a view from the trenches.
Scalable supervised dimensionality reduction using clustering.
Amplifying the voice of youth in Africa via text analytics.
A unified search federation system based on online user feedback.
Dynamic memory allocation policies for postings in real-time Twitter search.
iHR: an online recruiting system for Xiamen Talent Service Center.
Online controlled experiments at large scale.
Analysis of advanced meter infrastructure data of water consumption in apartment buildings.
Query clustering based on bid landscape for sponsored search auction optimization.
Financing lead triggers: empowering sales reps through knowledge discovery and fusion.
Using "big data" to solve "small data" problems.
Cyber security: how visual analytics unlock insight.
Hadoop: a view from the trenches.
Targeting and influencing at scale: from presidential elections to social good.
Adaptive adversaries: building systems to fight fraud and cyber intruders.
The business impact of deep learning.
Mining the digital universe of data to develop personalized cancer therapies.
To buy or not to buy: that is the question.
On the equivalent of low-rank linear regressions and linear discriminant analysis based regressions.
Understanding evolution of research themes: a probabilistic generative model for citations.
Restreaming graph partitioning: simple versatile algorithms for advanced balancing.
Information cartography: creating zoomable, large-scale maps of information.
Synthetic review spamming and defense.
Privacy-preserving data exploration in genome-wide association studies.
Mining evidences for named entity disambiguation.
Measuring spontaneous devaluations in user preferences.
Learning mixed Kronecker product graph models with simulated method of moments.
Learning geographical preferences for point-of-interest recommendation.
FeaFiner: biomarker identification from medical data through feature generalization and selection.
Collaborative matrix factorization with multiple similarities for predicting drug-target interactions.
Trial and error in influential social networks.
On community detection in real-world networks and the importance of degree assortativity.
Efficient single-source shortest path and distance queries on large graphs.
Silence is also evidence: interpreting dwell time for recommendation from psychological perspective.
Exploiting user clicks for automatic seed set generation for entity matching.
A data-driven method for in-game decision making in MLB: when to pull a starting pitcher.
Scalable inference in max-margin topic models.
A new collaborative filtering approach for increasing the aggregate diversity of recommender systems.
A time-dependent enhanced support vector machine for time series regression.
Modeling the dynamics of composite social networks.
The bang for the buck: fair competitive viral marketing from the host perspective.
Cost-sensitive online active learning with application to malicious URL detection.
Combining latent factor model with location features for event-based group recommendation.
Cascading outbreak prediction in networks: a data-driven approach.
Making recommendations from multiple domains.
Constrained stochastic gradient descent for large-scale least squares problem.
Towards never-ending learning from time series streams.
Multi-space probabilistic sequence modeling.
Direct optimization of ranking measures for learning to rank models.
Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms.
Massively parallel expectation maximization using graphics processing units.
Scalable all-pairs similarity search in metric spaces.
Repetition-aware content placement in navigational networks.
Quadratic optimization to identify highly heritable quantitative traits from complex phenotypic features.
Location-aware publish/subscribe.
Geo-spotting: mining online location-based services for optimal retail store placement.
Link prediction with social vector clocks.
Unsupervised link prediction using aggregative statistics on heterogeneous social networks.
Multi-source deep learning for information trustworthiness estimation.
Optimizing parallel belief propagation in junction trees using regression.
A "semi-lazy" approach to probabilistic path prediction.
Fast rank-2 nonnegative matrix factorization for hierarchical document clustering.
Active search on graphs.
Mining evolutionary multi-branch trees from text streams.
Maximizing acceptance probability for active friending in online social networks.
Adaptive collective routing using Gaussian process dynamic congestion models.
Inferring social roles and statuses in social networks.
Evaluating the crowd with confidence.
Cross-task crowdsourcing.
Nonparametric hierarchal Bayesian modeling in non-contractual heterogeneous survival data.
FISM: factored item similarity models for top-N recommender systems.
Speeding up large-scale learning with a social prior.
An efficient ADMM algorithm for multidimensional anisotropic total variation regularization problems.
Spotting opinion spammers using behavioral footprints.
Accurate intelligible models with pairwise interactions.
Multi-label classification by mining label and instance correlations from heterogeneous information networks.
Who, where, when and what: discover spatio-temporal topics for Twitter users.
A space efficient streaming algorithm for triangle counting using the birthday paradox.
Simple and deterministic matrix sketching.
SIGMa: simple greedy matching for aligning large knowledge bases.
Psychological advertising: exploring user psychology for click prediction in sponsored search.
Statistical quality estimation for general crowdsourcing tasks.
Mining frequent graph patterns with differential privacy.
Mining high utility episodes in complex event sequences.
Summarizing probabilistic frequent patterns: a fast approach.
Approximate graph mining with label costs.
Mining discriminative subgraphs from global-state networks.
Debiasing social wisdom.
Trace complexity of network inference.
Collaborative boosting for activity classification in microblogs.
Scalable text and link analysis with mixed-topic link models.
Multi-label relational neighbor classification using social context features.
WiseMarket: a new paradigm for managing wisdom of online social users.
Stochastic collapsed variational Bayesian inference for latent Dirichlet allocation.
A phrase mining framework for recursive construction of a topical hierarchy.
Subsampling for efficient and effective unsupervised outlier detection ensembles.
A general bootstrap performance diagnostic.
Mining lines in the sand: on trajectory discovery from untrustworthy data in cyber-physical system.
Information cascade at group scale.
Model-based kernel for efficient time series analysis.
DTW-D: time series semi-supervised learning from a single example.
Model selection in Markovian processes.
Extracting social events for learning better information diffusion models.
The role of information diffusion in the evolution of social networks.
Confluence: conformity influence in large social networks.
Social influence based clustering of heterogeneous information networks.
Graph cluster randomization: network exposure to multiple universes.
Flexible and robust co-regularized multi-domain graph clustering.
Robust principal component analysis via capped norms.
Exact sparse recovery with L0 projections.
Robust sparse estimation of multiresponse regression and inverse covariance matrix via the L2 distance.
Fast structure learning in generalized stochastic processes with latent factors.
STRIP: stream learning of influence probabilities.
Discovering latent influence in online social activities via shared cascade Poisson processes.
Recursive regularization for large-scale classification with hierarchical and graphical dependencies.
Indexed block coordinate descent for large-scale linear classification with limited memory.
Fast and scalable polynomial kernels via explicit feature maps.
Comparing apples to oranges: a scalable solution with heterogeneous hashing.
LCARS: a location-content-aware recommender system.
Active learning and search on low-rank matrices.
Learning to question: leveraging user preferences for shopping advice.
Network discovery via constrained tensor analysis of fMRI data.
Multi-source learning with block-wise missing data for Alzheimer's disease prediction.
Succinct interval-splitting tree for scalable similarity search of compound-protein pairs with property constraints.
SVMpAUCtight: a new support vector method for optimizing partial AUC based on a tight convex upper bound.
Querying discriminative and representative samples for batch mode active learning.
MI2LS: multi-instance learning from multiple information sources.
Density-based logistic regression.
Selective sampling on graphs for classification.
Redundancy-aware maximal cliques.
Guided learning for role discovery (GLRD): framework, algorithms, and applications.
Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees.
Big data analytics with small footprint: squaring the cloud.
Beyond myopic inference in big data pipelines.
TurboGraph: a fast parallel graph engine handling billion-scale graphs in a single PC.
Linking named entities in Tweets with knowledge base via user interest modeling.
Estimating sharer reputation via social data calibration.
Automatic selection of social media responses to news.
Connecting users across social media sites: a behavioral-modeling approach.
Diversity maximization under matroid constraints.
Text-based measures of document diversity.
Representing documents through their readers.
One theme in all views: modeling consensus topics in multiple contexts.
Predicting the present with search engine data.
Optimization in learning and data analysis.
The online revolution: education for everyone.
Scale-out beyond map-reduce.