Tutorials

Dr. Qirong Ho, A*STAR and Singapore Management University

Dr. Qirong Ho is a scientist at the Institute for Infocomm Research, A*STAR, Singapore, and an adjunct assistant professor at the Singapore Management University School of Information Systems. His primary research focus is distributed cluster software systems for Machine Learning at Big Data scales, with a view towards correctness and performance guarantees. In addition, Dr. Ho has performed research on statistical models for large-scale network analysis --- particularly latent space models for visualization, community detection, user personalization and interest prediction --- as well as social media analysis on hyperlinked documents with text and network data. Dr. Ho received his PhD in 2014, under Eric P. Xing at Carnegie Mellon University's Machine Learning Department. He is a recipient of the 2015 KDD Dissertation Award (runner-up), and the Singapore A*STAR National Science Search Undergraduate and PhD fellowships.

Abstract

The rise of Big Data has created new demand for Machine Learning (ML) systems that learn complex models, often with millions to billions of parameters, which promise adequate capacity to analyze massive datasets and to offer predictive functions on them. For example, in many modern applications such as web-scale content extraction via topic models, genome-wide association mapping via sparse structured regression, and image understanding via deep neural networks, one needs to handle Big ML problems that threaten to exceed the limits of current architectures and algorithms. In this tutorial, we present a systematic overview of modern scalable ML approaches for such applications: the insights and challenges of designing scalable and parallelizable algorithms for working with Big Data and Big Model; the principles and architectures of building distributed systems for executing these models and algorithms; and the theory and analysis necessary for understanding the behaviors and providing guarantees of these models, algorithms, and systems.

We present a comprehensive, principled, yet highly unified and application-grounded view of the fundamentals and strategies underlying a wide range of modern ML programs practiced in industry and academia. We begin by introducing the basic algorithmic roadmaps of optimization-theoretic and probabilistic-inference methods, the two major workhorse algorithmic engines that power nearly all ML programs, along with the technical developments therein that target large scales through algorithmic acceleration, stochastic approximation, and parallelization. We then turn to the challenges such algorithms face in a practical distributed computing environment, such as memory and storage limits, communication bottlenecks, resource contention, and stragglers, and review modern parallelization strategies and distributed frameworks that can actually run these algorithms at Big Data and Big Model scales, while also exposing the theoretical insights that make such systems and strategies possible. We focus on what makes ML algorithms peculiar, and how this can lead to algorithmic and systems designs that are markedly different from today’s Big Data platforms. We discuss these new opportunities in algorithms, systems, and theory for parallel machine learning in real (rather than ideal) distributed communication, storage, and computing environments.

  • Entity Search, Recommendation and Understanding

Dr. Hao Ma, Microsoft Research, USA

Dr. Hao Ma is a Researcher at the Internet Services Research Center, Microsoft Research at Redmond, USA. He obtained his Ph.D. in Computer Science at The Chinese University of Hong Kong. His research interests include Natural Language Processing, Machine Learning, Information Retrieval, Recommender Systems and Social Network Analysis. Most recently, Dr. Ma has been working on entity-related research problems and applications. He designed the core learning algorithms that power both Bing's and Microsoft's entity experience, including entity recommendation, attribute ranking, interpretation, exploration, carousel ranking, question answering, etc. He has published more than 40 research papers in prestigious conferences and journals, including WWW, SIGIR, WSDM, CIKM, AAAI, TOIS, TKDE, TMM, TIST, etc. Some of his research output has been widely reported by popular news media, such as MIT Technology Review and Search Engine Land. Dr. Ma was also a member of the team that won the Microposts Entity Linking Challenge at WWW 2014.

Abstract

Recent years have witnessed rapidly increasing interest in the research field of semantic search. Entity search and recommendation experiences powered by knowledge bases have been widely adopted by major search engine companies. Entity search, recommendation and understanding techniques differ significantly from those in traditional search problems due to the introduction of knowledge bases. The heterogeneity, semantic richness and large-scale nature of knowledge bases make traditional approaches less effective. In this tutorial, we provide a detailed interdisciplinary introduction to how entity search and recommendation work and how various entity understanding methods based on Machine Learning, Natural Language Processing and Information Retrieval techniques can further improve entity-related experiences.

This tutorial consists of four major parts. In the first part, we give a brief introduction to entities and knowledge bases. We also show how we collect information from different data sources as well as how we infer users' interests in specific entities. In the second part, we demonstrate various entity search and recommendation applications we developed and productionized in Bing and Microsoft, including entity recommendation, natural language interpretation of recommendation, attribute ranking, carousel ranking, entity exploration, factoid answers, conversational search, semantic question answering, etc. The architectures, challenges, and corresponding solutions for these systems will also be briefly introduced in the second part of the tutorial. The third part will give a deep dive into the algorithms related to entity search and recommendation, including basic non-personalized search and recommendation algorithms as well as recommendation models that tailor related entities to an individual search user's unique taste and preference. The fourth part of this tutorial will focus on how to further improve the semantic recommendation and search experience by employing other entity understanding techniques, including entity linking to knowledge bases, question answering on web documents, etc.

The tutorial will conclude by summarizing and reflecting on the semantic search applications that users are experiencing on the Web. We posit that what we have presented in the tutorial is just the tip of the iceberg of a whole area of exciting and dynamic research that is worthy of more detailed investigation for many years to come.

  • Big Data Analytics: Optimization and Randomization

Prof. Tianbao Yang, University of Iowa

Dr. Tianbao Yang is currently an assistant professor at the University of Iowa (UI). He received his Ph.D. degree in Computer Science from Michigan State University in 2012. Before joining UI, he was a researcher at NEC Laboratories America in Cupertino (2013-2014) and a Machine Learning Researcher at GE Global Research (2012-2013), mainly focusing on developing distributed optimization systems for various classification and regression problems. Dr. Yang has broad interests in machine learning and has focused on several research topics, including large-scale optimization in machine learning, online optimization and distributed optimization. His recent research interests revolve around randomized algorithms for solving big data problems. He has published over 25 papers in prestigious machine learning conferences and journals. He won the Mark Fulk Best Student Paper Award at the 25th Conference on Learning Theory (COLT) in 2012.

Abstract

As the scale and dimensionality of data continue to grow in many applications of data analytics (e.g., bioinformatics, finance, computer vision, medical informatics), it becomes critical to develop efficient and effective algorithms to solve numerous machine learning and data mining problems. This tutorial will focus on simple yet practically effective techniques and algorithms for big data analytics. In the first part, we plan to present the state-of-the-art large-scale optimization algorithms, including various stochastic gradient descent methods, stochastic coordinate descent methods and distributed optimization algorithms, for solving various machine learning problems. In the second part, we will focus on randomized approximation algorithms for learning from large-scale data. We will discuss i) randomized algorithms for low-rank matrix approximation; ii) approximation techniques for solving kernel learning problems; iii) randomized reduction methods for addressing the high-dimensional challenge. Along with the descriptions of algorithms, we will also present some empirical results to facilitate understanding and comparison between different algorithms.
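
As a point of reference for the first part of this abstract, the simplest member of the stochastic gradient descent family can be sketched in a few lines. This is only an illustrative sketch, not material from the tutorial: the function names `sgd_linear_regression` and `make_data`, the synthetic data (true slope 3, intercept 1), the step size and the number of epochs are all assumptions chosen for the example.

```python
import random


def sgd_linear_regression(data, lr=0.1, epochs=5, seed=0):
    """Fit y ~ w*x + b by plain SGD on the squared error 0.5*(w*x + b - y)**2.

    Hypothetical helper for illustration; lr and epochs are assumed values.
    """
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a fresh random order
        for x, y in data:
            err = (w * x + b) - y  # prediction error on ONE example
            w -= lr * err * x      # gradient step for the slope
            b -= lr * err          # gradient step for the intercept
    return w, b


def make_data(n=2000, seed=1):
    """Synthetic data (an assumption): y = 3x + 1 plus small Gaussian noise."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        points.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return points


w, b = sgd_linear_regression(make_data())
print("estimated slope and intercept:", w, b)
```

Each update touches a single example, so the cost per step does not depend on the dataset size, which is the property that makes this family attractive at large scale.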

  • Causal Discovery and Inference: Traditional Approach and Recent Advances

Prof. Kun Zhang, Carnegie Mellon University

Dr. Kun Zhang is an assistant professor in the philosophy department at Carnegie Mellon University (CMU). Before joining CMU, he was a senior research scientist at the Max Planck Institute for Intelligent Systems, Germany. He received his Ph.D. from the Chinese University of Hong Kong and then worked at the University of Helsinki as a postdoctoral fellow. His main research interests include causal discovery, machine learning, and large-scale data analysis. He has served as a co-organizer of a series of workshops to foster interdisciplinary research in causality.

Prof. Jiji Zhang, Lingnan University

Dr. Jiji Zhang is an associate professor of philosophy at Lingnan University. He received his PhD from Carnegie Mellon University in 2006, and subsequently taught at the California Institute of Technology before moving to Hong Kong in 2008. His research is interdisciplinary in nature, and centers on methodological, epistemological, and logical issues in causal inference and statistical inference. His work has appeared in machine learning and statistics venues as well as in philosophy venues.

Abstract

Causality is a fundamental notion in science, and plays an important role in explanation, prediction, decision making, and control. Recently, interesting advances were made in machine learning for tackling some long-standing problems in causality, such as how to distinguish cause from effect given two random variables. On the other hand, causal models provide compact descriptions of the properties of data distributions, and it has been demonstrated that causal knowledge can facilitate various machine learning tasks. This tutorial aims to give broad coverage of emerging approaches to causal discovery and causal inference from i.i.d. data and from time series, including both theoretical and practical results and related issues.

We start with the constraint-based approach to causal discovery, which relies on the conditional independence relationships in the data, and discuss its wide applicability as well as its drawbacks. We then talk about the identifiability of the causal structure implied by appropriately defined functional causal models; in particular, in the two-variable case, under what conditions (and why) is the causal direction between the two variables identifiable? We show that the independence between the noise and causes, together with appropriate structural constraints on the functional form, makes it possible. We will focus on the linear non-Gaussian causal model and the post-nonlinear causal model, due to their simplicity and generality, respectively. Next, we report some recent advances in causal discovery from time series. Assuming that the causal relations are linear with non-Gaussian noise, we focus on two problems which are traditionally difficult to solve, namely, causal discovery from subsampled data and that in the presence of confounding time series.
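
The two-variable identifiability argument can be illustrated with a toy experiment. The sketch below is not the method presented in the tutorial: it assumes a linear model y = 2x + e with uniform (hence non-Gaussian) x and e, and it uses the correlation between squared values as a crude dependence score, standing in for a proper independence test. The names `residual_dependence` and `infer_direction` are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def residual_dependence(cause, effect):
    """Regress effect on cause by least squares, then score how strongly the
    residual depends on the regressor beyond linear correlation.

    The score |corr(cause^2, residual^2)| is an assumed, crude stand-in for
    a real independence test.
    """
    mc, me = mean(cause), mean(effect)
    centered = [c - mc for c in cause]
    slope = (sum(c * (e - me) for c, e in zip(centered, effect))
             / sum(c * c for c in centered))
    resid = [(e - me) - slope * c for c, e in zip(centered, effect)]
    return abs(pearson([c * c for c in centered], [r * r for r in resid]))


def infer_direction(x, y):
    """Prefer the direction whose regression residual looks more independent."""
    forward = residual_dependence(x, y)
    backward = residual_dependence(y, x)
    return "x->y" if forward < backward else "y->x"


# Synthetic data (an assumption): x causes y, with uniform cause and noise.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
y = [2.0 * xi + rng.uniform(-1.0, 1.0) for xi in x]
print(infer_direction(x, y))
```

In the causal direction the residual is just the independent noise, so the score is near zero; in the reverse direction the residual is uncorrelated with the regressor but still dependent on it. With Gaussian cause and noise both directions would look independent, which is why non-Gaussianity matters for identifiability in the linear case.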

Finally, we address how causal knowledge can facilitate understanding and solving certain machine learning problems. We consider two learning problems, semi-supervised learning and domain adaptation, from a causal point of view, and discuss the implications of causal knowledge that help to understand or solve these problems better. We also present a number of open questions in the field of causal discovery and inference.