Table of Contents

General Chairs’ Welcome – This Year’s Theme: Data Science for Social Good 
Sofus A. Macskassy (Facebook, USA)
Claudia Perlich (Dstillery)

Research Track Program Chairs’ Welcome 
Jure Leskovec (Stanford University)
Wei Wang (University of California, Los Angeles)

Industry & Government Track Program Chairs' Welcome
Rayid Ghani (University of Chicago & Edgeflip)
Prem Melville (Social Alpha)
Brian Dalessandro (Dstillery)
Paul Bradley (MethodCare)

Bloomberg Welcome
Vlad K. (Bloomberg)
Shawn E. (Bloomberg)

KDD’14 Conference on Knowledge Discovery & Data Mining Organization 

KDD'14 Research Track Senior Program Committee 

KDD'14 Research Track Program Committee 

KDD'14 Industry & Government Track Senior Program Committee 

KDD'14 Industry & Government Track Program Committee 

KDD'14 Research Track Additional Reviewers 

KDD'14 Sponsors & Supporters 

Keynote Talks
Research Session 1: Location-based Services
Research Session 2: Applications to Healthcare and Medicine I
Research Session 3: Applications to Healthcare and Medicine II
Research Session 4: Recommender Systems
Research Session 5: Clustering
Research Session 6: Supervised Learning I
Research Session 7: Supervised Learning II
Research Session 8: Trend, Anomaly and Novelty Detection
Research Session 9: Data Streams
Research Session 10: Active Learning
Research Session 11: Feature Selection
Research Session 12: Statistical Techniques for Big Data
Research Session 13: Scaling-up Methods for Big Data
Research Session 14: Large-scale Optimization and Learning
Research Session 15: Web Mining
Research Session 16: Transfer Learning
Research Session 17: Recommendations and Ratings
Research Session 18: Topic Modeling
Research Session 19: Security and Privacy
Research Session 20: Dimensionality Reduction
Research Session 21: Novel Applications
Research Session 22: Crowds and Markets
Research Session 23: Text Mining
Research Session 24: Dynamic Graph Analysis
Research Session 25: Diffusion in Social and Information Networks
Research Session 26: Social and Information Networks
Research Session 27: Graph Mining and Modeling
Research Session 28: Network Community Detection
Research Session 29: Scaling-up Graph Algorithms
Research Session 30: Social Network Analysis
Industry & Government Invited Talks
Industry & Government
Panel
Tutorials
 
Keynote Talks

The Battle for the Future of Data Mining (Page 1)
Oren Etzioni (Allen Institute for AI)

Data, Predictions, and Decisions in Support of People and Society (Page 2)
Eric Horvitz (Microsoft Research)

A Data Driven Approach to Diagnosing and Treating Disease (Page 3)
Eric Schadt (Icahn School of Medicine at Mount Sinai)

Bugbears or Legitimate Threats? (Social) Scientists' Criticisms of Machine Learning? (Page 4)
Sendhil Mullainathan (Harvard University)

Research Session 1: Location-based Services

Prediction of Human Emergency Behavior and Their Mobility Following Large-scale Disaster (Page 5)
Xuan Song (The University of Tokyo)
Quanshi Zhang (The University of Tokyo)
Yoshihide Sekimoto (The University of Tokyo)
Ryosuke Shibasaki (The University of Tokyo)

Inferring User Demographics and Social Strategies in Mobile Social Networks (Page 15)
Yuxiao Dong (University of Notre Dame)
Yang Yang (Tsinghua University)
Jie Tang (Tsinghua University)
Yang Yang (University of Notre Dame)
Nitesh V. Chawla (University of Notre Dame)

Travel Time Estimation of a Path Using Sparse Trajectories (Page 25)
Yilun Wang (Microsoft Research & Zhejiang University)
Yu Zheng (Microsoft Research)
Yexiang Xue (Microsoft Research & Cornell University)

Modeling Human Location Data with Mixtures of Kernel Densities (Page 35)
Moshe Lichman (University of California, Irvine)
Padhraic Smyth (University of California, Irvine)

A Cost-Effective Recommender System for Taxi Drivers (Page 45)
Meng Qu (Rutgers Business School)
Hengshu Zhu (University of Science and Technology of China)
Junming Liu (Rutgers University)
Guannan Liu (Tsinghua University)
Hui Xiong (Rutgers University)

Research Session 2: Applications to Healthcare and Medicine I

LUDIA: an Aggregate-Constrained Low-Rank Reconstruction Algorithm to Leverage Publicly Released Health Data (Page 55)
Yubin Park (The University of Texas at Austin)
Joydeep Ghosh (The University of Texas at Austin)

People on Drugs: Credibility of User Statements in Health Communities (Page 65)
Subhabrata Mukherjee (Max Planck Institute for Informatics)
Gerhard Weikum (Max Planck Institute for Informatics)
Cristian Danescu-Niculescu-Mizil (Max Planck Institute for Software Systems)

Unfolding Physiological State: Mortality Modelling in Intensive Care Units (Page 75)
Marzyeh Ghassemi (Massachusetts Institute of Technology)
Tristan Naumann (Massachusetts Institute of Technology)
Finale Doshi-Velez (Harvard)
Nicole Brimmer (Massachusetts Institute of Technology)
Rohit Joshi (Massachusetts Institute of Technology)
Anna Rumshisky (University of Massachusetts, Lowell)
Peter Szolovits (Massachusetts Institute of Technology)

Unsupervised Learning of Disease Progression Models (Page 85)
Xiang Wang (IBM Research)
David Sontag (New York University)
Fei Wang (IBM Research)

Good-Enough Brain Model: Challenges, Algorithms and Discoveries in Multi-Subject Experiments (Page 95)
Evangelos E. Papalexakis (Carnegie Mellon University)
Alona Fyshe (Carnegie Mellon University)
Nicholas D. Sidiropoulos (University of Minnesota)
Partha Pratim Talukdar (Carnegie Mellon University)
Tom M. Mitchell (Carnegie Mellon University)
Christos Faloutsos (Carnegie Mellon University)

Research Session 3: Applications to Healthcare and Medicine II

FUNNEL: Automatic Mining of Spatially Coevolving Epidemics (Page 105)
Yasuko Matsubara (Kumamoto University)
Yasushi Sakurai (Kumamoto University)
Willem G. van Panhuis (University of Pittsburgh)
Christos Faloutsos (Carnegie Mellon University)

Marble: High-Throughput Phenotyping from Electronic Health Records via Sparse Nonnegative Tensor Factorization (Page 115)
Joyce Ho (The University of Texas at Austin)
Joydeep Ghosh (The University of Texas at Austin)
Jimeng Sun (Georgia Institute of Technology)

Scalable Noise Mining in Long-Term Electrocardiographic Time-Series to Predict Death Following Heart Attacks (Page 125)
Chih-Chun Chia (University of Michigan, Ann Arbor)
Zeeshan Syed (University of Michigan, Ann Arbor)

From Micro to Macro: Data Driven Phenotyping by Densification of Longitudinal Electronic Medical Records (Page 135)
Jiayu Zhou (Arizona State University)
Fei Wang (IBM T.J. Watson Research Center)
Jianying Hu (IBM T.J. Watson Research Center)
Jieping Ye (Arizona State University)

Clinical Risk Prediction with Multilinear Sparse Logistic Regression (Page 145)
Fei Wang (IBM T. J. Watson Research Center)
Ping Zhang (IBM T. J. Watson Research Center)
Buyue Qian (IBM T. J. Watson Research Center)
Xiang Wang (IBM T. J. Watson Research Center)
Ian Davidson (University of California, Davis)

Dual Beta Process Priors for Latent Cluster Discovery in Chronic Obstructive Pulmonary Disease (Page 155)
James C. Ross (Brigham and Women's Hospital, Harvard Medical School)
Peter J. Castaldi (Brigham and Women's Hospital, Harvard Medical School)
Michael H. Cho (Brigham and Women's Hospital, Harvard Medical School)
Jennifer G. Dy (Northeastern University)

Research Session 4: Recommender Systems

COM: A Generative Model for Group Recommendation (Page 163)
Quan Yuan (Nanyang Technological University)
Gao Cong (Nanyang Technological University)
Chin-Yew Lin (Microsoft Research)

Leveraging User Libraries to Bootstrap Collaborative Filtering (Page 173)
Laurent Charlin (Princeton University)
Richard S. Zemel (University of Toronto)
Hugo Larochelle (Université de Sherbrooke)

Topic-Factorized Ideal Point Estimation Model for Legislative Voting Network (Page 183)
Yupeng Gu (Northeastern University)
Yizhou Sun (Northeastern University)
Ning Jiang (University of Illinois at Urbana-Champaign)
Bingyu Wang (Northeastern University)
Ting Chen (Northeastern University)

Jointly Modeling Aspects, Ratings and Sentiments for Movie Recommendation (JMARS) (Page 193)
Qiming Diao (Singapore Management University)
Minghui Qiu (Singapore Management University)
Chao-Yuan Wu (Carnegie Mellon University)
Alexander J. Smola (Carnegie Mellon University and Google)
Jing Jiang (Singapore Management University)
Chong Wang (Carnegie Mellon University)

User Effort Minimization Through Adaptive Diversification (Page 203)
Mahbub Hasan (University of California, Riverside)
Abhijith Kashyap (University of California, Riverside)
Vagelis Hristidis (University of California, Riverside)
Vassilis Tsotras (University of California, Riverside)

Research Session 5: Clustering

Relevant Overlapping Subspace Clusters on Categorical Data (Page 213)
Xiao He (University of Munich)
Jing Feng (University of Munich)
Bettina Konte (University of Munich)
Son T. Mai (University of Munich)
Claudia Plant (Helmholtz Zentrum München, Technische Universität München)

Batch Discovery of Recurring Rare Classes Toward Identifying Anomalous Samples (Page 223)
Murat Dundar (IUPUI)
Halid Ziya Yerebakan (IUPUI)
Bartek Rajwa (Purdue University)

A Dirichlet Multinomial Mixture Model-Based Approach for Short Text Clustering (Page 233)
Jianhua Yin (Tsinghua University)
Jianyong Wang (Tsinghua University)

Representative Clustering of Uncertain Data (Page 243)
Andreas Züfle (Ludwig-Maximilians-Universität München)
Tobias Emrich (Ludwig-Maximilians-Universität München)
Klaus Arthur Schmid (Ludwig-Maximilians-Universität München)
Nikos Mamoulis (University of Hong Kong)
Arthur Zimek (Ludwig-Maximilians-Universität München)
Matthias Renz (Ludwig-Maximilians-Universität München)

SMVC: Semi-Supervised Multi-View Clustering in Subspace Projections (Page 253)
Stephan Günnemann (Carnegie Mellon University)
Ines Färber (RWTH Aachen University)
Matthias Rüdiger (RWTH Aachen University)
Thomas Seidl (RWTH Aachen University)

Research Session 6: Supervised Learning I

FastXML: A Fast, Accurate and Stable Tree-Classifier for eXtreme Multi-label Learning (Page 263)
Yashoteja Prabhu (Indian Institute of Technology - Delhi)
Manik Varma (Microsoft Research)

A Multi-Class Boosting Method with Direct Optimization (Page 273)
Shaodan Zhai (Wright State University)
Tian Xia (Wright State University)
Shaojun Wang (Wright State University)

An Efficient Algorithm for Weak Hierarchical Lasso (Page 283)
Yashu Liu (Arizona State University)
Jie Wang (Arizona State University)
Jieping Ye (Arizona State University)

Online Multiple Kernel Regression (Page 293)
Doyen Sahoo (Singapore Management University)
Steven C.H. Hoi (Singapore Management University)
Bin Li (Wuhan University)

Class-Distribution Regularized Consensus Maximization for Alleviating Overfitting in Model Combination (Page 303)
Sihong Xie (University of Illinois at Chicago)
Jing Gao (University at Buffalo)
Wei Fan (Huawei Noah’s Ark Lab)
Deepak Turaga (IBM T.J. Watson Research Center)
Philip S. Yu (University of Illinois at Chicago)

Research Session 7: Supervised Learning II

Large Margin Distribution Machine (Page 313)
Teng Zhang (Nanjing University)
Zhi-Hua Zhou (Nanjing University)

Distance Metric Learning Using Dropout: A Structured Regularization Approach (Page 323)
Qi Qian (Michigan State University)
Juhua Hu (Simon Fraser University)
Rong Jin (Michigan State University)
Jian Pei (Simon Fraser University)
Shenghuo Zhu (NEC Laboratories America)

Box Drawings for Learning with Imbalanced Data (Page 333)
Siong Thye Goh (Massachusetts Institute of Technology)
Cynthia Rudin (Massachusetts Institute of Technology)

Incremental and Decremental Training for Linear Classification (Page 343)
Cheng-Hao Tsai (National Taiwan University)
Chieh-Yen Lin (National Taiwan University)
Chih-Jen Lin (National Taiwan University)

Supervised Deep Learning with Auxiliary Networks (Page 353)
Junbo Zhang (Southwest Jiaotong University & Huawei Noah's Ark Lab)
Guangjian Tian (Huawei Noah's Ark Lab)
Yadong Mu (Huawei Noah's Ark Lab)
Wei Fan (Huawei Noah's Ark Lab)

Research Session 8: Trend, Anomaly and Novelty Detection

Sleep Analytics and Online Selective Anomaly Detection (Page 362)
Tahereh Babaie (University of Sydney & NICTA)
Sanjay Chawla (University of Sydney & NICTA)
Romesh Abeysuriya (University of Sydney)

GLAD: Group Anomaly Detection in Social Media Analysis (Page 372)
Rose Yu (University of Southern California)
Xinran He (University of Southern California)
Yan Liu (University of Southern California)

FBLG: A Simple and Effective Approach for Temporal Dependence Discovery from Time Series Data (Page 382)
Dehua Cheng (University of Southern California)
Mohammad Taha Bahadori (University of Southern California)
Yan Liu (University of Southern California)

Learning Time-Series Shapelets (Page 392)
Josif Grabocka (University of Hildesheim)
Nicolas Schilling (University of Hildesheim)
Martin Wistuba (University of Hildesheim)
Lars Schmidt-Thieme (University of Hildesheim)

Utilizing Temporal Patterns for Estimating Uncertainty in Interpretable Early Decision Making (Page 402)
Mohamed F. Ghalwash (Temple University)
Vladan Radosavljevic (Yahoo Labs)
Zoran Obradovic (Temple University)

Research Session 9: Data Streams

Prototype-based Learning on Concept-drifting Data Streams (Page 412)
Junming Shao (University of Mainz; University of Electronic Science and Technology of China)
Zahra Ahmadi (University of Mainz)
Stefan Kramer (University of Mainz)

Detecting Moving Object Outliers in Massive-Scale Trajectory Streams (Page 422)
Yanwei Yu (Yantai University)
Lei Cao (Worcester Polytechnic Institute)
Elke A. Rundensteiner (Worcester Polytechnic Institute)
Qin Wang (University of Science and Technology Beijing)

The Setwise Stream Classification Problem (Page 432)
Charu C. Aggarwal (IBM T. J. Watson Research Center)

Streamed Approximate Counting of Distinct Elements: Beating Optimal Batch Methods (Page 442)
Daniel Ting (Facebook)

Time-Varying Learning and Content Analytics via Sparse Factor Analysis (Page 452)
Andrew S. Lan (Rice University)
Christoph Studer (Cornell University)
Richard G. Baraniuk (Rice University)

Research Session 10: Active Learning

Active-Transductive Learning with Label-Adapted Kernels (Page 462)
Dan Kushnir (Alcatel-Lucent Bell Laboratories)

Active Learning for Sparse Bayesian Multilabel Classification (Page 472)
Deepak Vasisht (Massachusetts Institute of Technology)
Andreas Damianou (University of Sheffield, UK)
Manik Varma (Microsoft Research)
Ashish Kapoor (Microsoft Research)

Large-Scale Adaptive Semi-Supervised Learning via Unified Inductive and Transductive Model (Page 482)
De Wang (University of Texas at Arlington)
Feiping Nie (University of Texas at Arlington)
Heng Huang (University of Texas at Arlington)

Active Semi-Supervised Learning Using Sampling Theory for Graph Signals (Page 492)
Akshay Gadde (University of Southern California)
Aamir Anis (University of Southern California)
Antonio Ortega (University of Southern California)

Active Collaborative Permutation Learning (Page 502)
Jialei Wang (University of Chicago)
Nathan Srebro (Toyota Technological Institute at Chicago)
James A. Evans (University of Chicago)

Research Session 11: Feature Selection

Effective Global Approaches for Mutual Information Based Feature Selection (Page 512)
Nguyen Xuan Vinh (The University of Melbourne)
Jeffrey Chan (The University of Melbourne)
Simone Romano (The University of Melbourne)
James Bailey (The University of Melbourne)

Gradient Boosted Feature Selection (Page 522)
Zhixiang Eddie Xu (Washington University in St. Louis)
Gao Huang (Tsinghua University)
Kilian Q. Weinberger (Washington University in St. Louis)
Alice X. Zheng (GraphLab, Seattle)

Simultaneous Feature and Feature Group Selection Through Hard Thresholding (Page 532)
Shuo Xiang (Arizona State University)
Tao Yang (Arizona State University)
Jieping Ye (Arizona State University)

Safe and Efficient Screening for Sparse Support Vector Machine (Page 542)
Zheng Zhao (SAS Institute Inc.)
Jun Liu (SAS Institute Inc.)
James Cox (SAS Institute Inc.)

Factorized Sparse Learning Models with Interpretable High Order Feature Interactions (Page 552)
Sanjay Purushotham (University of Southern California)
Martin Renqiang Min (NEC Labs America)
C.-C. Jay Kuo (University of Southern California)
Rachel Ostroff (SomaLogic, Inc.)

Research Session 12: Statistical Techniques for Big Data

Parallel Gibbs Sampling for Hierarchical Dirichlet Processes via Gamma Processes Equivalence (Page 562)
Dehua Cheng (University of Southern California)
Yan Liu (University of Southern California)

Empirical Glitch Explanations (Page 572)
Tamraparni Dasu (AT&T Labs - Research)
Ji Meng Loh (New Jersey Institute of Technology)
Divesh Srivastava (AT&T Labs - Research)

Learning with Dual Heterogeneity: A Nonparametric Bayes Model (Page 582)
Hongxia Yang (IBM T.J. Watson Research Center)
Jingrui He (Arizona State University)

Online Chinese Restaurant Process (Page 591)
Chien-Liang Liu (Industrial Technology Research Institute, Taiwan)
Tsung-Hsun Tsai (National Chiao Tung University)
Chia-Hoang Lee (National Chiao Tung University)

Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion (Page 601)
Xin Luna Dong (Google, Mountain View, CA)
Evgeniy Gabrilovich (Google, Mountain View, CA)
Geremy Heitz (Google, Mountain View, CA)
Wilko Horn (Google, Mountain View, CA)
Ni Lao (Google, Mountain View, CA)
Kevin Murphy (Google, Mountain View, CA)
Thomas Strohmann (Google, Mountain View, CA)
Shaohua Sun (Google, Mountain View, CA)
Wei Zhang (Google, Mountain View, CA)

Research Session 13: Scaling-up Methods for Big Data

Improving the Modified Nyström Method Using Spectral Shifting (Page 611)
Shusen Wang (Zhejiang University)
Chao Zhang (Zhejiang University)
Hui Qian (Zhejiang University)
Zhihua Zhang (Shanghai Jiao Tong University)

Fast Flux Discriminant for Large-Scale Sparse Nonlinear Classification (Page 621)
Wenlin Chen (Washington University in St. Louis)
Yixin Chen (Washington University in St. Louis)
Kilian Q. Weinberger (Washington University in St. Louis)

Scalable Histograms on Large Probabilistic Data (Page 631)
Mingwang Tang (University of Utah)
Feifei Li (University of Utah)

Correlation Clustering in MapReduce (Page 641)
Flavio Chierichetti (Sapienza University)
Nilesh Dalvi (Trooly Inc.)
Ravi Kumar (Google Inc.)

Scaling Out Big Data Missing Value Imputations: Pythia vs. Godzilla (Page 651)
Christos Anagnostopoulos (University of Glasgow)
Peter Triantafillou (University of Glasgow)

Research Session 14: Large-scale Optimization and Learning

Efficient Mini-Batch Training for Stochastic Optimization (Page 661)
Mu Li (Carnegie Mellon University & Baidu, Inc.)
Tong Zhang (Baidu, Inc. & Rutgers University)
Yuqiang Chen (Baidu, Inc.)
Alexander J. Smola (Carnegie Mellon University & Google, Inc.)

Streaming Submodular Maximization: Massive Data Summarization on the Fly (Page 671)
Ashwinkumar Badanidiyuru (Cornell University)
Baharan Mirzasoleiman (ETH Zurich)
Amin Karbasi (ETH Zurich)
Andreas Krause (ETH Zurich)

Distance Queries from Sampled Data: Accurate and Efficient (Page 681)
Edith Cohen (Microsoft Research)

Improved Testing of Low Rank Matrices (Page 691)
Yi Li (Max-Planck Institute for Informatics)
Zhengyu Wang (Tsinghua University)
David P. Woodruff (IBM Almaden Research Center)

DeepWalk: Online Learning of Social Representations (Page 701)
Bryan Perozzi (Stony Brook University)
Rami Al-Rfou (Stony Brook University)
Steven Skiena (Stony Brook University)

Research Session 15: Web Mining

Open-Domain Quantity Queries on Web Tables: Annotation, Response, and Consensus Models (Page 711)
Sunita Sarawagi (IIT Bombay)
Soumen Chakrabarti (IIT Bombay)

Crowdsourced Time-sync Video Tagging Using Temporal and Personalized Topic Modeling (Page 721)
Bin Wu (Hong Kong University of Science and Technology)
Erheng Zhong (Hong Kong University of Science and Technology)
Ben Tan (Hong Kong University of Science and Technology)
Andrew Horner (Hong Kong University of Science and Technology)
Qiang Yang (Hong Kong University of Science and Technology)

Identifying and Labeling Search Tasks via Query-based Hawkes Processes (Page 731)
Liangda Li (Georgia Institute of Technology)
Hongbo Deng (Yahoo Labs)
Anlei Dong (Yahoo Labs)
Yi Chang (Yahoo Labs)
Hongyuan Zha (East China Normal University & Georgia Institute of Technology)

LaSEWeb: Automating Search Strategies over Semi-Structured Web Data (Page 741)
Oleksandr Polozov (University of Washington)
Sumit Gulwani (Microsoft Research)

Personalized Search Result Diversification via Structured Learning (Page 751)
Shangsong Liang (University of Amsterdam)
Zhaochun Ren (University of Amsterdam)
Maarten de Rijke (University of Amsterdam)

Research Session 16: Transfer Learning

Efficient Multi-Task Feature Learning with Calibration (Page 761)
Pinghua Gong (Arizona State University)
Jiayu Zhou (Arizona State University)
Wei Fan (Huawei Noah’s Ark Lab)
Jieping Ye (Arizona State University)

Multi-Task Copula By Sparse Graph Regression (Page 771)
Tianyi Zhou (University of Technology, Sydney, Australia & University of Washington)
Dacheng Tao (University of Technology, Sydney)

Unifying Learning to Rank and Domain Adaptation: Enabling Cross-Task Document Scoring (Page 781)
Mianwei Zhou (University of Illinois at Urbana-Champaign)
Kevin Chen-Chuan Chang (University of Illinois at Urbana-Champaign)

Scalable Heterogeneous Translated Hashing (Page 791)
Ying Wei (Hong Kong University of Science and Technology)
Yangqiu Song (University of Illinois at Urbana-Champaign)
Yi Zhen (Duke University)
Bo Liu (Hong Kong University of Science and Technology)
Qiang Yang (Hong Kong University of Science and Technology & Huawei Noah's Ark Lab, Hong Kong)

Matching Users and Items Across Domains to Improve the Recommendation Quality (Page 801)
Chung-Yi Li (National Taiwan University)
Shou-De Lin (National Taiwan University)

Research Session 17: Recommendations and Ratings

Optimal Recommendations Under Attraction, Aversion, and Social Influence (Page 811)
Wei Lu (University of British Columbia)
Stratis Ioannidis (Technicolor Los Altos Research Center)
Smriti Bhagat (Technicolor Los Altos Research Center)
Laks V.S. Lakshmanan (University of British Columbia)

ClusCite: Effective Citation Recommendation by Information Network-Based Clustering (Page 821)
Xiang Ren (University of Illinois at Urbana-Champaign)
Jialu Liu (University of Illinois at Urbana-Champaign)
Xiao Yu (University of Illinois at Urbana-Champaign)
Urvashi Khandelwal (University of Illinois at Urbana-Champaign)
Quanquan Gu (University of Illinois at Urbana-Champaign)
Lidan Wang (University of Illinois at Urbana-Champaign)
Jiawei Han (University of Illinois at Urbana-Champaign)

GeoMF: Joint Geographical Modeling and Matrix Factorization for Point-of-Interest Recommendation (Page 831)
Defu Lian (University of Science and Technology of China & Microsoft Research, Beijing, China)
Cong Zhao (University of Science and Technology of China)
Xing Xie (Microsoft Research, Beijing, China)
Guangzhong Sun (University of Science and Technology of China)
Enhong Chen (University of Science and Technology of China)
Yong Rui (Microsoft Research, Beijing, China)

Detecting Anomalies in Dynamic Rating Data: A Robust Probabilistic Model for Rating Evolution (Page 841)
Stephan Günnemann (Carnegie Mellon University)
Nikou Günnemann (Carnegie Mellon University)
Christos Faloutsos (Carnegie Mellon University)

Product Selection Problem: Improve Market Share by Learning Consumer Behavior (Page 851)
Silei Xu (Department of Computer Science and Engineering, The Chinese University of Hong Kong)
John C. S. Lui (Department of Computer Science and Engineering, The Chinese University of Hong Kong)

Research Session 18: Topic Modeling

TCS: Efficient Topic Discovery Over Crowd-Oriented Service Data (Page 861)
Yongxin Tong (Hong Kong University of Science & Technology)
Caleb Chen Cao (Hong Kong University of Science & Technology)
Lei Chen (Hong Kong University of Science & Technology)

SigniTrend: Scalable Detection of Emerging Topics in Textual Streams by Hashed Significance Thresholds (Page 871)
Erich Schubert (Ludwig-Maximilians-Universität München)
Michael Weiler (Ludwig-Maximilians-Universität München)
Hans-Peter Kriegel (Ludwig-Maximilians-Universität München)

Experiments with Non-Parametric Topic Models (Page 881)
Wray Buntine (Monash University)
Swapnil Mishra (The Australian National University)

Reducing the Sampling Complexity of Topic Models (Page 891)
Aaron Q. Li (Carnegie Mellon University)
Amr Ahmed (Google Inc.)
Sujith Ravi (Google Inc.)
Alexander J. Smola (Carnegie Mellon University and Google Inc.)

Dynamics of News Events and Social Media Reaction (Page 901)
Mikalai Tsytsarau (University of Trento)
Themis Palpanas (Paris Descartes University)
Malu Castellanos (Hewlett Packard)

Research Session 19: Security and Privacy

Differentially Private Network Data Release via Structural Inference (Page 911)
Qian Xiao (National University of Singapore)
Rui Chen (Hong Kong Baptist University)
Kian-Lee Tan (National University of Singapore)

Exponential Random Graph Estimation under Differential Privacy (Page 921)
Wentian Lu (University of Massachusetts, Amherst)
Gerome Miklau (University of Massachusetts, Amherst)

Top-k Frequent Itemsets via Differentially Private FP-Trees (Page 931)
Jaewoo Lee (Purdue University)
Chris Clifton (Purdue University)

CatchSync: Catching Synchronized Behavior in Large Directed Graphs (Page 941)
Meng Jiang (Tsinghua University)
Peng Cui (Tsinghua University)
Alex Beutel (Carnegie Mellon University)
Christos Faloutsos (Carnegie Mellon University)
Shiqiang Yang (Tsinghua University)

Mobile App Recommendations with Security and Privacy Awareness (Page 951)
Hengshu Zhu (University of Science and Technology of China)
Hui Xiong (Rutgers University)
Yong Ge (UNC Charlotte)
Enhong Chen (University of Science and Technology of China)

Research Session 20: Dimensionality Reduction

Fast Dtt — A Near Linear Algorithm for Decomposing A Tensor into Factor Tensors (Page 967)
Xiaomin Fang (Sun Yat-sen University)
Rong Pan (Sun Yat-sen University)

Clustering and Projected Clustering with Adaptive Neighbors (Page 977)
Feiping Nie (University of Texas at Arlington)
Xiaoqian Wang (University of Texas at Arlington)
Heng Huang (University of Texas at Arlington)

LWI-SVD: Low-rank, Windowed, Incremental Singular Value Decompositions on Time-Evolving Data Sets (Page 987)
Xilun Chen (Arizona State University)
K. Selçuk Candan (Arizona State University)

Provable Deterministic Leverage Score Sampling (Page 997)
Dimitris Papailiopoulos (University of Texas)
Anastasios Kyrillidis (École Polytechnique Fédérale de Lausanne)
Christos Boutsidis (Yahoo! Labs)

Semantic Visualization for Spherical Representation (Page 1007)
Tuan M. V. Le (Singapore Management University)
Hady W. Lauw (Singapore Management University)

Research Session 21: Novel Applications

Grouping Students in Educational Settings (Page 1017)
Rakesh Agrawal (Microsoft Research)
Behzad Golshan (Boston University)
Evimaria Terzi (Boston University)

Inferring Gas Consumption and Pollution Emission of Vehicles Throughout a City (Page 1027)
Jingbo Shang (Shanghai Jiao Tong University & Microsoft Research, Beijing, China)
Yu Zheng (Microsoft Research, Beijing, China)
Wenzhu Tong (Wuhan University & Microsoft Research, Beijing, China)
Eric Chang (Microsoft Research, Beijing, China)
Yong Yu (Shanghai Jiao Tong University)

Methods for Ordinal Peer Grading (Page 1037)
Karthik Raman (Cornell University)
Thorsten Joachims (Cornell University)

Exploiting Geographic Dependencies for Real Estate Appraisal: A Mutual Perspective of Ranking and Clustering (Page 1047)
Yanjie Fu (Rutgers University)
Hui Xiong (Rutgers University)
Yong Ge (University of North Carolina at Charlotte)
Zijun Yao (Rutgers University)
Yu Zheng (Microsoft Research Asia)
Zhi-Hua Zhou (Nanjing University)

Towards Scalable Critical Alert Mining (Page 1057)
Bo Zong (University of California, Santa Barbara)
Yinghui Wu (University of California, Santa Barbara)
Jie Song (LogicMonitor)
Ambuj K. Singh (University of California, Santa Barbara)
Hasan Cam (Army Research Lab)
Jiawei Han (University of Illinois at Urbana-Champaign)
Xifeng Yan (University of California, Santa Barbara)

Research Session 22: Crowds and Markets

From Labor to Trader: Opinion Elicitation via Online Crowds as a Market (Page 1067)
Caleb Chen Cao (The Hong Kong University of Science and Technology)
Lei Chen (The Hong Kong University of Science and Technology)
H. V. Jagadish (University of Michigan)

Optimal Real-Time Bidding for Display Advertising (Page 1077)
Weinan Zhang (University College London)
Shuai Yuan (University College London)
Jun Wang (University College London)

Quantifying Herding Effects in Crowd Wisdom (Page 1087)
Ting Wang (IBM T. J. Watson Research Center)
Dashun Wang (IBM T. J. Watson Research Center)
Fei Wang (IBM T. J. Watson Research Center)

Modeling Delayed Feedback in Display Advertising (Page 1097)
Olivier Chapelle (Criteo)

Networked Bandits with Disjoint Linear Payoffs (Page 1106)
Meng Fang (University of Technology, Sydney)
Dacheng Tao (University of Technology, Sydney)

Research Session 23: Text Mining

Mining Topics in Documents: Standing on the Shoulders of Big Data (Page 1116)
Zhiyuan Chen (University of Illinois at Chicago)
Bing Liu (University of Illinois at Chicago)

Integrating Spreadsheet Data via Accurate and Low-Effort Extraction (Page 1126)
Zhe Chen (University of Michigan)
Michael Cafarella (University of Michigan)

Sentiment Expression Conditioned by Affective Transitions and Social Forces (Page 1136)
Moritz Sudhof (Stanford University)
Andrés Gómez Emilsson (Stanford University)
Andrew L. Maas (Stanford University)
Christopher Potts (Stanford University)

Entity Profiling with Varying Source Reliabilities (Page 1146)
Furong Li (National University of Singapore)
Mong Li Lee (National University of Singapore)
Wynne Hsu (National University of Singapore)

Open Question Answering Over Curated and Extracted Knowledge Bases (Page 1156)
Anthony Fader (Allen Institute for Artificial Intelligence)
Luke Zettlemoyer (University of Washington)
Oren Etzioni (Allen Institute for Artificial Intelligence)

Research Session 24: Dynamic Graph Analysis

Non-Parametric Scan Statistics for Event Detection and Forecasting in Heterogeneous Social Media Graphs (Page 1166)
Feng Chen (State University of New York at Albany)
Daniel B. Neill (Carnegie Mellon University)

Event Detection in Activity Networks (Page 1176)
Polina Rozenshtein (Aalto University)
Aris Anagnostopoulos (Sapienza University of Rome)
Aristides Gionis (Aalto University & HIIT)
Nikolaj Tatti (Aalto University & HIIT)

FEMA: Flexible Evolutionary Multi-Faceted Analysis for Dynamic Behavioral Pattern Discovery (Page 1186)
Meng Jiang (Tsinghua National Laboratory for Information Science and Technology & Tsinghua University)
Peng Cui (Tsinghua National Laboratory for Information Science and Technology & Tsinghua University)
Fei Wang (IBM T.J. Watson Research Center)
Xinran Xu (Tsinghua National Laboratory for Information Science and Technology & Tsinghua University)
Wenwu Zhu (Tsinghua National Laboratory for Information Science and Technology & Tsinghua University)
Shiqiang Yang (Tsinghua National Laboratory for Information Science and Technology & Tsinghua University)

Profit-Maximizing Cluster Hires (Page 1196)
Behzad Golshan (Boston University)
Theodoros Lappas (Stevens Institute of Technology)
Evimaria Terzi (Boston University)

On Social Event Organization (Page 1206)
Keqian Li (University of British Columbia)
Wei Lu (University of British Columbia)
Smriti Bhagat (Technicolor Research)
Laks V.S. Lakshmanan (University of British Columbia)
Cong Yu (Google Research)

Research Session 25: Diffusion in Social and Information Networks

A Bayesian Framework for Estimating Properties of Network Diffusions (Page 1216)
Varun R. Embar (IBM India Research Lab)
Rama Kumar Pasumarthi (IBM India Research Lab)
Indrajit Bhattacharya (IBM India Research Lab)

Scalable Diffusion-Aware Optimization of Network Topology (Page 1226)
Elias Khalil (Georgia Institute of Technology)
Bistra Dilkina (Georgia Institute of Technology)
Le Song (Georgia Institute of Technology)

Probabilistic Latent Network Visualization: Inferring and Embedding Diffusion Networks (Page 1236)
Takeshi Kurashima (NTT Service Evolution Labs., NTT Corporation)
Tomoharu Iwata (NTT Communication Science Labs., NTT Corporation)
Noriko Takaya (NTT Service Evolution Labs., NTT Corporation)
Hiroshi Sawada (NTT Service Evolution Labs., NTT Corporation)

MMrate: Inferring Multi-Aspect Diffusion Networks with Multi-Pattern Cascades (Page 1246)
Senzhang Wang (Beihang University)
Xia Hu (Arizona State University)
Philip S. Yu (University of Illinois at Chicago)
Zhoujun Li (Beihang University)

Stability of Influence Maximization (Page 1256)
Xinran He (University of Southern California)
David Kempe (University of Southern California)

Research Session 26: Social and Information Networks

Who to Follow and Why: Link Prediction with Explanations (Page 1266)
Nicola Barbieri (Yahoo Labs)
Francesco Bonchi (Yahoo Labs)
Giuseppe Manco (ICAR-CNR)

Activity-edge Centric Multi-label Classification for Mining Heterogeneous Information Networks (Page 1276)
Yang Zhou (Georgia Institute of Technology)
Ling Liu (Georgia Institute of Technology)

Meta-Path Based Multi-Network Collective Link Prediction (Page 1286)
Jiawei Zhang (University of Illinois at Chicago)
Philip S. Yu (University of Illinois at Chicago)
Zhi-Hua Zhou (Nanjing University)

Fast Influence-based Coarsening for Large Networks (Page 1296)
Manish Purohit (University of Maryland)
B. Aditya Prakash (Virginia Tech)
Chanhyun Kang (University of Maryland)
Yao Zhang (Virginia Tech)
V.S. Subrahmanian (University of Maryland)

Minimizing Seed Set Selection with Probabilistic Coverage Guarantee in a Social Network (Page 1306)
Peng Zhang (Purdue University)
Wei Chen (Microsoft)
Xiaoming Sun (Institute of Computing Technology, CAS)
Yajun Wang (Microsoft)
Jialin Zhang (Institute of Computing Technology, CAS)

Research Session 27: Graph Mining and Modeling

Core Decomposition of Uncertain Graphs (Page 1316)
Francesco Bonchi (Yahoo Labs, Spain)
Francesco Gullo (Yahoo Labs, Spain)
Andreas Kaltenbrunner (Barcelona Media - Innovation Centre, Spain)
Yana Volkovich (Barcelona Media - Innovation Centre, Spain)

Learning Multifractal Structure in Large Networks (Page 1326)
Austin R. Benson (Stanford University)
Carlos Riquelme (Stanford University)
Sven Schmit (Stanford University)

Temporal Skeletonization on Sequential Data: Patterns, Categorization, and Visualization (Page 1336)
Chuanren Liu (Rutgers, The State University of New Jersey)
Kai Zhang (NEC Laboratories America, Inc.)
Hui Xiong (Rutgers, The State University of New Jersey)
Geoff Jiang (NEC Laboratories America, Inc.)
Qiang Yang (Hong Kong University of Science and Technology)

Focused Clustering and Outlier Detection in Large Attributed Graphs (Page 1346)
Bryan Perozzi (Stony Brook University)
Leman Akoglu (Stony Brook University)
Patricia Iglesias Sánchez (Karlsruhe Institute of Technology)
Emmanuel Müller (Karlsruhe Institute of Technology & University of Antwerp)

Inside the Atoms: Ranking on a Network of Networks (Page 1356)
Jingchao Ni (Case Western Reserve University)
Hanghang Tong (Arizona State University)
Wei Fan (Huawei Noah's Ark Lab)
Xiang Zhang (Case Western Reserve University)

Research Session 28: Network Community Detection

Community Membership Identification from Small Seed Sets (Page 1366)
Isabel M. Kloumann (Cornell University)
Jon M. Kleinberg (Cornell University)

Community Detection in Graphs through Correlation (Page 1376)
Lian Duan (New Jersey Institute of Technology)
W. Nick Street (University of Iowa)
Yanchi Liu (New Jersey Institute of Technology)
Haibing Lu (Santa Clara University)

Heat Kernel Based Community Detection (Page 1386)
Kyle Kloster (Purdue University)
David F. Gleich (Purdue University)

On the Permanence of Vertices in Network Communities (Page 1396)
Tanmoy Chakraborty (Indian Institute of Technology, Kharagpur, India)
Sriram Srinivasan (University of Nebraska)
Niloy Ganguly (Indian Institute of Technology, Kharagpur, India)
Animesh Mukherjee (Indian Institute of Technology, Kharagpur, India)
Sanjukta Bhowmick (University of Nebraska)

The Interplay Between Dynamics and Networks: Centrality, Communities, and Cheeger Inequality (Page 1406)
Rumi Ghosh (Robert Bosch LLC)
Shang-Hua Teng (University of Southern California)
Kristina Lerman (University of Southern California)
Xiaoran Yan (University of Southern California)

Research Session 29: Scaling-up Graph Algorithms

Almost Linear-Time Algorithms for Adaptive Betweenness Centrality Using Hypergraph Sketches (Page 1416)
Yuichi Yoshida (National Institute of Informatics)

Efficient SimRank Computation via Linearization (Page 1426)
Takanori Maehara (National Institute of Informatics)
Mitsuru Kusumoto (Preferred Infrastructure, Inc.)
Ken-ichi Kawarabayashi (National Institute of Informatics)

FAST-PPR: Scaling Personalized PageRank Estimation for Large Graphs (Page 1436)
Peter Lofgren (Stanford University)
Siddhartha Banerjee (Stanford University)
Ashish Goel (Stanford University)
C. Seshadhri (Sandia National Labs)

Graph Sample and Hold: A Framework for Big-Graph Analytics (Page 1446)
Nesreen K. Ahmed (Purdue University)
Nick Duffield (Rutgers University)
Jennifer Neville (Purdue University)
Ramana Kompella (Purdue University)

Balanced Graph Edge Partition (Page 1456)
Florian Bourse (ENS, France)
Marc Lelarge (INRIA-ENS, France)
Milan Vojnovic (Microsoft Research)

Research Session 30: Social Network Analysis

Using Strong Triadic Closure to Characterize Ties in Social Networks (Page 1466)
Stavros Sintos (University of Ioannina)
Panayiotis Tsaparas (University of Ioannina)

Network Structural Analysis via Core-Tree-Decomposition (Page 1476)
Takuya Akiba (The University of Tokyo)
Takanori Maehara (National Institute of Informatics)
Ken-ichi Kawarabayashi (National Institute of Informatics)

Analyzing Expert Behaviors in Collaborative Networks (Page 1486)
Huan Sun (University of California, Santa Barbara)
Mudhakar Srivatsa (IBM T.J. Watson Research Center)
Shulong Tan (University of California, Santa Barbara)
Yang Li (University of California, Santa Barbara)
Lance M. Kaplan (U.S. Army Research Lab)
Shu Tao (IBM T.J. Watson Research Center)
Xifeng Yan (University of California, Santa Barbara)

Predicting Long-Term Impact of CQA Posts: A Comprehensive Viewpoint (Page 1496)
Yuan Yao (State Key Laboratory for Novel Software Technology, China)
Hanghang Tong (Arizona State University)
Feng Xu (State Key Laboratory for Novel Software Technology, China)
Jian Lu (State Key Laboratory for Novel Software Technology, China)

Who Are Experts Specializing in Landscape Photography? Analyzing Topic-Specific Authority on Content Sharing Services (Page 1506)
Bin Bi (University of California, Los Angeles)
Ben Kao (The University of Hong Kong)
Chang Wan (The University of Hong Kong)
Junghoo Cho (University of California, Los Angeles)

Industry & Government Invited Talks

Frontiers in E-commerce Personalization (Page 1516)
Sri Subramaniam (Groupon)

Predictive Modeling in Practice (Page 1517)
Tracy De Poalo (Sprint)
Jeremy Howard (Khosla Ventures)

Medicine in the Age of Electronic Health Records (Page 1518)
Nigam Shah (Stanford)

Algorithms for Interpretable Machine Learning (Page 1519)
Cynthia Rudin (Massachusetts Institute of Technology)

Data Science Through the Lens of Social Science (Page 1520)
Drew Conway (Project Florida)

Information Environment Security (Page 1521)
Rand Waltzman (DARPA)

Big Data for Social Good (Page 1522)
Nathan Eagle (Jana)

Bringing Data Science to the Speakers of Every Language (Page 1523)
Robert Munro (Idibon)

Industry & Government

Guilt by Association: Large Scale Malware Detection by Mining File-relation Graphs (Page 1524)
Acar Tamersoy (Georgia Institute of Technology)
Kevin Roundy (Symantec Research Labs)
Duen Horng Chau (Georgia Institute of Technology)

Mining Text Snippets for Images on the Web (Page 1534)
Anitha Kannan (Microsoft)
Simon Baker (Microsoft)
Krishnan Ramnath (Microsoft)
Juliet Fiss (University of Washington)
Dahua Lin (TTI Chicago)
Lucy Vanderwende (Microsoft)
Rizwan Ansary (Microsoft)
Ashish Kapoor (Microsoft)
Qifa Ke (Microsoft)
Matt Uyttendaele (Microsoft)
Xin-Jing Wang (Microsoft)
Lei Zhang (Microsoft)

Predicting Student Risks Through Longitudinal Analysis (Page 1544)
Ashay Tamhane (IBM Research India)
Shajith Ikbal (IBM Research India)
Bikram Sengupta (IBM Research India)
Mayuri Duggirala (Tata Research Development & Design Centre, India)
James Appleton (Gwinnett County Public Schools, GA)

Novel Geospatial Interpolation Analytics for General Meteorological Measurements (Page 1553)
Bingsheng Wang (Virginia Tech)
Jinjun Xiong (IBM Thomas J. Watson Research Center)

Targeting Direct Cash Transfers to the Extremely Poor (Page 1563)
Brian Abelson (Enigma)
Kush R. Varshney (IBM Thomas J. Watson Research Center)
Joy Sun (GiveDirectly)

Scalable Hands-Free Transfer Learning for Online Advertising (Page 1573)
Brian Dalessandro (Dstillery)
Daizhuo Chen (Dstillery)
Troy Raeder (Dstillery)
Claudia Perlich (Dstillery)
Melinda Han Williams (Dstillery)
Foster Provost (NYU & Dstillery)

Correlating Events with Time Series for Incident Diagnosis (Page 1583)
Chen Luo (Jilin University)
Jian-Guang Lou (Microsoft Research)
Qingwei Lin (Microsoft Research)
Qiang Fu (Microsoft Research)
Rui Ding (Microsoft Research)
Dongmei Zhang (Microsoft Research)
Zhe Wang (Jilin University)

Proactive Workflow Modeling By Stochastic Processes with Application to Healthcare Operation and Management (Page 1593)
Chuanren Liu (Rutgers University)
Yong Ge (UNC Charlotte)
Hui Xiong (Rutgers University)
Keli Xiao (Stony Brook University)
Wei Geng (Awarepoint Corporation)
Matt Perkins (Awarepoint Corporation)

Activity Ranking in LinkedIn Feed (Page 1603)
Deepak Agarwal (LinkedIn Corporation)
Bee-Chung Chen (LinkedIn Corporation)
Rupesh Gupta (LinkedIn Corporation)
Joshua Hartman (LinkedIn Corporation)
Qi He (LinkedIn Corporation)
Anand Iyer (LinkedIn Corporation)
Sumanth Kolar (LinkedIn Corporation)
Yiming Ma (LinkedIn Corporation)
Pannaga Shivaswamy (LinkedIn Corporation)
Ajit Singh (LinkedIn Corporation)
Liang Zhang (LinkedIn Corporation)

Budget Pacing for Targeted Online Advertisements at LinkedIn (Page 1613)
Deepak Agarwal (LinkedIn)
Souvik Ghosh (LinkedIn)
Kai Wei (LinkedIn)
Siyu You (LinkedIn)

Large Scale Predictive Modeling for Micro-Simulation of 3G Air Interface Load (Page 1620)
Dejan Radosavljevik (Leiden University)
Peter van der Putten (Leiden University)

Unveiling Clusters of Events for Alert and Incident Management in Large-Scale Enterprise IT (Page 1630)
Derek Lin (Pivotal Software, Inc.)
Rashmi Raghu (Pivotal Software, Inc.)
Vivek Ramamurthy (Pivotal Software, Inc.)
Jin Yu (Pivotal Software, Inc.)
Regunathan Radhakrishnan (Pivotal Software, Inc.)
Joseph Fernandez (Visa, Inc.)

Style in the Long Tail: Discovering Unique Interests with Latent Variable Models in Large Scale Social E-Commerce (Page 1640)
Diane Hu (Etsy)
Rob Hall (Etsy)
Josh Attenberg (Etsy)

Corporate Residence Fraud Detection (Page 1650)
Enric Junqué de Fortuny (University of Antwerp)
Marija Stankova (University of Antwerp)
Julie Moeyersoms (University of Antwerp)
Bart Minnaert (Ghent University)
Foster Provost (New York University)
David Martens (University of Antwerp)

Modeling Mass Protest Adoption in Social Network Communities Using Geometric Brownian Motion (Page 1660)
Fang Jin (Virginia Tech)
Rupinder Paul Khandpur (Virginia Tech)
Nathan Self (Virginia Tech)
Edward Dougherty (Virginia Tech)
Sheng Guo (LinkedIn Incorporated)
Feng Chen (University at Albany, SUNY)
B. Aditya Prakash (Virginia Tech)
Naren Ramakrishnan (Virginia Tech)

Shallow Semantic Parsing of Product Offering Titles (for better automatic hyperlink insertion) (Page 1670)
Gabor Melli (VigLink Inc.)

A Case Study: Privacy Preserving Release of Spatio-Temporal Density in Paris (Page 1679)
Gergely Acs (INRIA)
Claude Castelluccia (INRIA)

Scalable Near Real-Time Failure Localization of Data Center Networks (Page 1689)
Herodotos Herodotou (Microsoft Research)
Bolin Ding (Microsoft Research)
Shobana Balakrishnan (Microsoft Research)
Geoff Outhred (Microsoft)
Percy Fitter (Microsoft)

Improving Management of Aquatic Invasions by Integrating Shipping Network, Ecological, and Environmental Data: Data Mining for Social Good (Page 1699)
Jian Xu (University of Notre Dame)
Thanuka L. Wickramarathne (University of Notre Dame)
Nitesh V. Chawla (University of Notre Dame)
Erin K. Grey (University of Notre Dame)
Karsten Steinhaeuser (University of Minnesota)
Reuben P. Keller (Loyola University Chicago)
John M. Drake (University of Georgia)
David M. Lodge (University of Notre Dame)

FoodSIS: A Text Mining System to Improve the State of Food Safety in Singapore (Page 1709)
Kiran Kate (IBM Research)
Sneha Chaudhari (Carnegie Mellon University, Pittsburgh, USA)
Andy Prapanca (IBM Research)
Jayant Kalagnanam (IBM Research)

A Hazard Based Approach to User Return Time Prediction (Page 1719)
Komal Kapoor (University of Minnesota)
Mingxuan Sun (Pandora Media Inc.)
Jaideep Srivastava (University of Minnesota)
Tao Ye (Pandora Media Inc.)

Predicting Employee Expertise for Talent Management in the Enterprise (Page 1729)
Kush R. Varshney (IBM T. J. Watson Research Center)
Vijil Chenthamarakshan (IBM Thomas J. Watson Research Center)
Scott W. Fancher (IBM Corporate Headquarters)
Jun Wang (IBM T. J. Watson Research Center)
Dongping Fang (IBM T. J. Watson Research Center)
Aleksandra Mojsilovic (IBM T. J. Watson Research Center)

Applying Data Mining Techniques to Address Critical Process Optimization Needs in Advanced Manufacturing (Page 1739)
Li Zheng (Florida International University)
Chunqiu Zeng (Florida International University)
Lei Li (Florida International University)
Yexi Jiang (Florida International University)
Wei Xue (Florida International University)
Jingxuan Li (Florida International University)
Chao Shen (Florida International University)
Wubai Zhou (Florida International University)
Hongtai Li (Florida International University)
Liang Tang (Florida International University)
Tao Li (Florida International University)
Bing Duan (ChangHong COC Display Devices Co., Ltd)
Ming Lei (ChangHong COC Display Devices Co., Ltd)
Pengnian Wang (ChangHong COC Display Devices Co., Ltd)

EARS (Earthquake Alert and Report System): A Real Time Decision Support System for Earthquake Crisis Management (Page 1749)
Marco Avvenuti (University of Pisa)
Stefano Cresci (National Research Council (CNR), Italy)
Andrea Marchetti (National Research Council (CNR), Italy)
Carlo Meletti (National Institute of Geophysics and Volcanology, Italy)
Maurizio Tesconi (National Research Council (CNR), Italy)

Knock It Off: Profiling the Online Storefronts of Counterfeit Merchandise (Page 1759)
Matthew F. Der (University of California, San Diego)
Lawrence K. Saul (University of California, San Diego)
Stefan Savage (University of California, San Diego)
Geoffrey M. Voelker (University of California, San Diego)

Up Next: Retrieval Methods for Large Scale Related Video Suggestion (Page 1769)
Michael Bendersky (Google, Inc.)
Lluis Garcia-Pueyo (Google, Inc.)
Jeremiah Harmsen (Google, Inc.)
Vanja Josifovski (Google, Inc.)
Dima Lepikhin (Google, Inc.)

Identifying Tourists from Public Transport Commuters (Page 1779)
Mingqiang Xue (Institute for Infocomm Research, Singapore)
Huayu Wu (Institute for Infocomm Research, Singapore)
Wei Chen (Institute for Infocomm Research, Singapore)
Gin Howe Goh (Land Transport Authority of Singapore)

Spatially Embedded Co-Offence Prediction Using Supervised Learning (Page 1789)
Mohammad A. Tayebi (Simon Fraser University)
Martin Ester (Simon Fraser University)
Uwe Glässer (Simon Fraser University)
Patricia L. Brantingham (Simon Fraser University)

'Beating the News' with EMBERS: Forecasting Civil Unrest Using Open Source Indicators (Page 1799)
Naren Ramakrishnan (Virginia Tech)
Patrick Butler (Virginia Tech)
Sathappan Muthiah (Virginia Tech)
Nathan Self (Virginia Tech)
Rupinder Khandpur (Virginia Tech)
Parang Saraf (Virginia Tech)
Wei Wang (Virginia Tech)
Jose Cadena (Virginia Tech)
Anil Vullikanti (Virginia Tech)
Gizem Korkmaz (Virginia Tech)
Chris Kuhlman (Virginia Tech)
Achla Marathe (Virginia Tech)
Liang Zhao (Virginia Tech)
Ting Hua (Virginia Tech)
Feng Chen (University at Albany, NY)
Chang-Tien Lu (Virginia Tech)
Bert Huang (University of Maryland)
Aravind Srinivasan (University of Maryland)
Khoa Trinh (University of Maryland)
Lise Getoor (University of California, Santa Cruz)
Graham Katz (CACI Inc.)
Andy Doyle (CACI Inc.)
Chris Ackermann (CACI Inc.)
Ilya Zavorin (CACI Inc.)
Jim Ford (CACI Inc.)
Kristen Summers (CACI Inc.)
Youssef Fayed (BASIS Technology)
Jaime Arredondo (University of California at San Diego)
Dipak Gupta (San Diego State University)
David Mares (University of California, San Diego)

LASTA: Large Scale Topic Assignment on Multiple Social Networks (Page 1809)
Nemanja Spasojevic (Klout, Inc.)
Jinyun Yan (Klout, Inc.)
Adithya Rao (Klout, Inc.)
Prantik Bhattacharyya (Klout, Inc.)

New Algorithms for Parking Demand Management and a City Scale Deployment (Page 1819)
Onno Zoeter (Xerox Research Centre Europe)
Christopher Dance (Xerox Research Centre Europe)
Stéphane Clinchant (Xerox Research Centre Europe)
Jean-Marc Andreoli (Xerox Research Centre Europe)

Reducing Gang Violence Through Network Influence Based Targeting of Social Programs (Page 1829)
Paulo Shakarian (Arizona State University)
Joseph Salmento (U.S. Military Academy)
William Pulleyblank (U.S. Military Academy)
John Bertetto (Chicago Police Dept.)

Modeling Impression Discounting in Large-scale Recommender Systems (Page 1837)
Pei Lee (University of British Columbia)
Laks V.S. Lakshmanan (University of British Columbia)
Mitul Tiwari (LinkedIn Corporation)
Sam Shah (LinkedIn Corporation)

ISIS: A Networked-Epidemiology Based Pervasive Web App for Infectious Disease Pandemic Planning and Response (Page 1847)
Richard Beckman (Virginia Tech)
Keith R. Bisset (Virginia Tech)
Jiangzhuo Chen (Virginia Tech)
Bryan Lewis (Virginia Tech)
Madhav Marathe (Virginia Tech)
Paula Stretz (Virginia Tech)

Seven Rules of Thumb for Web Site Experimenters (Page 1857)
Ron Kohavi (Microsoft)
Alex Deng (Microsoft)
Roger Longbotham (SW Jiaotong University)
Ya Xu (LinkedIn)

Log-based Predictive Maintenance (Page 1867)
Ruben Sipos (Cornell University)
Dmitriy Fradkin (Siemens Corporation)
Fabian Moerchen (Amazon)
Zhuang Wang (Skytree)

Automated Hypothesis Generation Based on Mining Scientific Literature (Page 1877)
Scott Spangler (IBM Research)
Angela D. Wilkins (Baylor College of Medicine)
Benjamin J. Bachman (Baylor College of Medicine)
Meena Nagarajan (IBM Research)
Tajhal Dayaram (Baylor College of Medicine)
Peter Haas (IBM Research)
Sam Regenbogen (Baylor College of Medicine)
Curtis R. Pickering (The University of Texas MD Anderson Cancer Center)
Austin Comer (The University of Texas MD Anderson Cancer Center)
Jeffrey N. Myers (The University of Texas MD Anderson Cancer Center)
Ioana Stanoi (IBM Research)
Linda Kato (IBM Research)
Ana Lelescu (IBM Research)
Jacques J. Labrie (IBM Research)
Neha Parikh (Baylor College of Medicine)
Andreas Martin Lisewski (Baylor College of Medicine)
Lawrence Donehower (Baylor College of Medicine)
Ying Chen (IBM Research)
Olivier Lichtarge (Baylor College of Medicine)

A System to Grade Computer Programming Skills Using Machine Learning (Page 1887)
Shashank Srikant (ASPIRING MINDS)
Varun Aggarwal (ASPIRING MINDS)

An Empirical Study of Reserve Price Optimisation in Real-Time Bidding (Page 1897)
Shuai Yuan (University College London)
Jun Wang (University College London)
Bowei Chen (University College London)
Peter Mason (Advance International Media)
Sam Seljan (AppNexus)

Large-Scale High-Precision Topic Modeling on Twitter (Page 1907)
Shuang-Hong Yang (Twitter, Inc)
Alek Kolcz (Twitter, Inc)
Andy Schlaikjer (Twitter, Inc)
Pankaj Gupta (Twitter, Inc)

Early Prediction of Code Blue Using Electronic Medical Records (Page 1917)
Sriram Somanchi (Carnegie Mellon University)
Samrachana Adhikari (Carnegie Mellon University)
Allen Lin (Harvard University)
Elena Eneva (Accenture)
Rayid Ghani (University of Chicago)

Large Scale Visual Recommendations from Street Fashion Images (Page 1925)
Vignesh Jagadeesh (eBay Research Labs)
Robinson Piramuthu (eBay Research Labs)
Anurag Bhardwaj (eBay Research Labs)
Wei Di (eBay Research Labs)
Neel Sundaresan (eBay Research)

We Know What You Want to Buy: A Demographic-based System for Product Recommendation on Microblogs (Page 1935)
Wayne Xin Zhao (Renmin University of China)
Yanwei Guo (Peking University)
Yulan He (Aston University)
Han Jiang (Peking University)
Yuexin Wu (Peking University)
Xiaoming Li (Peking University)

Modeling Professional Similarity by Mining Professional Career Trajectories (Page 1945)
Ye Xu (Dartmouth College)
Zang Li (LinkedIn Corporation)
Abhishek Gupta (LinkedIn Corporation)
Ahmet Bugdayci (LinkedIn Corporation)
Anmol Bhasin (LinkedIn Corporation)

Filling Context-Ad Vocabulary Gaps with Click Logs (Page 1955)
Yukihiro Tagami (Yahoo Japan Corporation)
Toru Hotta (Yahoo Japan Corporation)
Yusuke Tanaka (Yahoo Japan Corporation)
Shingo Ono (Yahoo Japan Corporation)
Koji Tsukamoto (Yahoo Japan Corporation)
Akira Tajima (Yahoo Japan Corporation)

Panel

Does Social Good Justify Risking Personal Privacy? (Page 1965)
Raghu Ramakrishnan (Microsoft)
Geoffrey I. Webb (Monash University)

Tutorials

Scaling Up Deep Learning (Page 1966)
Yoshua Bengio (University of Montreal)

Constructing and Mining Web-Scale Knowledge Graphs: KDD 2014 Tutorial (Page 1967)
Antoine Bordes (Facebook)
Evgeniy Gabrilovich (Google)

Bringing Structure to Text: Mining Phrases, Entities, Topics, and Hierarchies (Page 1968)
Jiawei Han (The University of Illinois at Urbana-Champaign)
Chi Wang (The University of Illinois at Urbana-Champaign)
Ahmed El-Kishky (The University of Illinois at Urbana-Champaign)

Computational Epidemiology (Page 1969)
Madhav Marathe (Virginia Tech)
Anil Kumar S. Vullikanti (Virginia Tech)

Management and Analytic of Biomedical Big Data with Cloud-Based In-Memory Database and Dynamic Querying: A Hands-on Experience with Real-world Data (Page 1970)
Mengling Feng (Massachusetts Institute of Technology)
Mohammad Ghassemi (Massachusetts Institute of Technology)
Thomas Brennan (Massachusetts Institute of Technology)
John Ellenberger (SAP Research)
Ishrar Hussain (SAP Research)
Roger Mark (Massachusetts Institute of Technology)

The Recommender Problem Revisited: Morning Tutorial (Page 1971)
Xavier Amatriain (Netflix)
Bamshad Mobasher (DePaul University)

Correlation Clustering: from Theory to Practice (Page 1972)
Francesco Bonchi (Yahoo Labs)
David García-Soriano (Yahoo Labs)
Edo Liberty (Yahoo Labs)

Deep Learning (Page 1973)
Ruslan Salakhutdinov (University of Toronto)

Network Mining and Analysis for Social Applications (Page 1974)
Feida Zhu (Singapore Management University)
Huan Sun (University of California, Santa Barbara)
Xifeng Yan (University of California, Santa Barbara)

Sampling for Big Data: A Tutorial (Page 1975)
Graham Cormode (University of Warwick)
Nick Duffield (Rutgers University / DIMACS)

Statistically Sound Pattern Discovery (Page 1976)
Wilhelmiina Hämäläinen (University of Eastern Finland)
Geoffrey I. Webb (Monash University)

Recommendation in Social Media: Recent Advances and New Frontiers (Page 1977)
Jiliang Tang (Arizona State University)
Jie Tang (Tsinghua University)
Huan Liu (Arizona State University)