How can we perform deep learning efficiently? Deep learning is one of the most widely used machine learning techniques and a key driving force of the fourth industrial revolution. It outperforms many existing algorithms, and even humans, on difficult tasks including speech recognition, the game of Go, and language translation. One crucial challenge of deep learning, however, is its efficiency in both training and inference. Deep learning models require a large number of parameters, which demand a huge amount of time and space to store and run. The problem is even worse on mobile devices such as smartphones, which have limited storage and computing power. It is therefore necessary to design efficient methods for learning and inference in deep learning, which is exactly the goal of this tutorial.
We start with a brief background on deep learning, including its history, applications, and popular models such as feedforward neural networks, convolutional neural networks, and recurrent neural networks. We then describe how to compress deep learning models using techniques including pruning, weight sharing, quantization, approximation, and regularization. The audience is expected to gain substantial knowledge about reducing the time and space required to use deep learning.
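To give a concrete flavor of two of the techniques covered, the sketch below shows minimal NumPy implementations of magnitude-based pruning and uniform quantization. This is an illustrative assumption of how such methods are commonly realized, not the specific algorithms presented in the tutorial; the function names and parameters are hypothetical.

```python
import numpy as np

def prune_weights(weights, sparsity=0.5):
    """Magnitude-based pruning (illustrative sketch): zero out the
    smallest-magnitude entries so that roughly `sparsity` of the
    weights become zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_weights(weights, num_levels=16):
    """Uniform quantization (illustrative sketch): map each weight to
    the nearest of `num_levels` evenly spaced values between the
    minimum and maximum weight."""
    w_min, w_max = weights.min(), weights.max()
    step = (w_max - w_min) / (num_levels - 1)
    return np.round((weights - w_min) / step) * step + w_min

# Toy example on a random 4x4 weight matrix
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = prune_weights(W, sparsity=0.5)
W_quant = quantize_weights(W, num_levels=4)
```

After pruning, half of the entries of `W_pruned` are zero, so the matrix can be stored in a sparse format; after quantization, `W_quant` contains at most four distinct values, so each weight can be encoded with only two bits plus a small codebook.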
U Kang is an associate professor in the Department of Computer Science and Engineering at Seoul National University. He received his Ph.D. in Computer Science from Carnegie Mellon University, after receiving his B.S. in Computer Science and Engineering from Seoul National University. He won the 2013 SIGKDD Doctoral Dissertation Award, the 2013 New Faculty Award from Microsoft Research Asia, the 2016 Korean Young Information Scientist Award, the 2018 ICDM 10-Year Best Paper Award, and two best paper awards. He has published over 60 refereed articles in major data mining and database venues and holds four U.S. patents. His research interests include big data mining and machine learning.