- Tutorial One : Modification of Machine Learning Algorithms for Word Classification
- Date : Morning, January 18 (Mon), 2016
- Lecturer : Prof. Taeho Jo, Inha University
- Description : This tutorial is concerned with modifications of machine learning algorithms for processing symbolic data such as words and texts. Among the tasks of processing symbolic data, we set word classification as the scope of this tutorial and explore various types of classification tasks. We describe the process of encoding words into tables, string vectors, and graphs, as well as numerical vectors, as structured forms. We present schemes for modifying machine learning algorithms such as k nearest neighbor, Naïve Bayes, learning vector quantization, and perceptron into versions which can receive these alternative structured forms, rather than numerical vectors, as their input data. The goal of this tutorial is therefore to improve word classification performance by solving problems such as high dimensionality and sparse distribution that arise in encoding symbolic data into numerical vectors.
The topic of this tutorial spans data mining, machine learning, and information retrieval. Audiences need basic knowledge of data classification as a data mining task in order to understand the various types of classification tasks, and a basic level of data mining and information retrieval in order to understand the process of encoding words into structured forms. To understand how the machine learning algorithms are modified into the proposed versions, audiences should know the basic machine learning algorithms: k nearest neighbor, Naïve Bayes, learning vector quantization, and perceptron. This tutorial is therefore targeted at audiences whose research interests span machine learning.
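The idea of encoding words into a structured form other than a numerical vector, and then adapting a classifier to that form, can be sketched briefly. The following is a minimal illustrative sketch, assuming a toy corpus, a set-based word encoding, and an overlap (Jaccard-style) similarity inside a modified k nearest neighbor; the corpus, helper names, and similarity measure are hypothetical illustrations, not the lecturer's actual formulation.

```python
# Hypothetical sketch: encode each word by the set of documents it
# occurs in (instead of a sparse numerical vector), then classify with
# a k-NN variant whose distance is replaced by overlap similarity.
# Corpus, labels, and function names are illustrative assumptions.
from collections import Counter

corpus = {
    "doc1": "apple banana fruit sweet",
    "doc2": "apple orange fruit juice",
    "doc3": "python java code compile",
    "doc4": "java code program compile",
}

def string_vector(word):
    """Encode a word as the set of documents containing it."""
    return {doc for doc, text in corpus.items() if word in text.split()}

def similarity(sv1, sv2):
    """Jaccard-style overlap between two encoded words."""
    if not sv1 and not sv2:
        return 0.0
    return len(sv1 & sv2) / len(sv1 | sv2)

def knn_classify(word, labeled_words, k=3):
    """Modified k-NN: rank labeled words by overlap similarity
    and take a majority vote among the top k."""
    sv = string_vector(word)
    ranked = sorted(labeled_words.items(),
                    key=lambda item: similarity(sv, string_vector(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

training = {"apple": "food", "banana": "food",
            "java": "computing", "code": "computing"}
print(knn_classify("fruit", training, k=3))  # prints "food"
```

The design point of such a modification is that the classifier never builds a high-dimensional numerical vector at all, so the dimensionality and sparseness problems mentioned above simply do not arise; only a similarity between the structured encodings is required.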
- Introduction
- Types of Word Categorization
- 2.1. Hard Word Categorization
- 2.2. Soft Word Categorization
- 2.3. Hierarchical Word Categorization
- 2.4. Multiple Viewed Word Categorization
- Word Encoding Schemes
- 3.1. Numerical Vectors
- 3.2. Tables
- 3.3. String Vectors
- 3.4. Graphs
- Modification of Machine Learning Algorithms
- 4.1. K Nearest Neighbor
- 4.2. Naïve Bayes
- 4.3. Learning Vector Quantization
- 4.4. Perceptron
- Summary and Further Discussions
- Bio : Taeho Jo is currently a faculty member in the Department of Computer and Information Engineering at Inha University, South Korea. He received his Bachelor's degree from Korea University in 1994, his Master's degree from Pohang University of Science and Technology in 1997, and his PhD from the University of Ottawa in 2006. His research area spans text mining, neural networks, machine learning, and information retrieval. He has four years of experience working for industrial organizations and ten years working for academic ones, so his research is characterized as a connection between fundamental research, which creates theories, and applied research, which develops products.
- Tutorial Two : User Understanding from Large Scale Human Behavioral Data
- Date : Afternoon, January 18 (Mon), 2016
- Lecturer : Dr. Xing Xie, Microsoft Research Asia
- Description : With the rapid development of positioning, sensor, and smart-device technologies, large quantities of human behavioral data are now readily available. These data reflect various aspects of human mobility and activity in the physical world, and their availability presents an unprecedented opportunity to gain a more in-depth understanding of users and to provide them with a personalized online experience while respecting their privacy. In this tutorial, I will present a number of our recent research efforts in this direction, including user mobility understanding and prediction, location and activity recommendation, user linking across multiple networks, psychological trait inference, life pattern analysis, and driving behavior understanding.