- Tutorial One : Modification of Machine Learning Algorithms for Word Classification
- Date : Morning, January 18 (Mon), 2016
- Lecturer : Prof. Taeho Jo, Inha University
- Description : This tutorial is concerned with modifications of machine learning algorithms for processing symbolic data such as words and texts. Among the tasks of processing symbolic data, we set word classification as the scope of this tutorial and explore various types of classification tasks. We describe the process of encoding words into tables, string vectors, and graphs, as well as numerical vectors, as structured forms. We present schemes for modifying machine learning algorithms such as k nearest neighbor, Naïve Bayes, learning vector quantization, and perceptron into versions which can receive these alternative structured forms, rather than numerical vectors, as their input data. The goal of this tutorial is therefore to improve word classification performance by solving problems such as high dimensionality and sparse distribution that arise in encoding symbolic data into numerical vectors.
The topic of this tutorial spans data mining, machine learning, and information retrieval. Audiences need basic knowledge of data classification as a data mining task in order to understand the various types of classification tasks, and a basic level of data mining and information retrieval in order to understand the process of encoding words into structured forms. To understand how the machine learning algorithms are modified into the proposed versions, audiences should know the basic machine learning algorithms: k nearest neighbor, Naïve Bayes, learning vector quantization, and perceptron. This tutorial is therefore targeted at audiences whose research interests span machine learning.
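The idea of encoding words into a structured form other than a numerical vector, and then adapting a classifier to that form, can be sketched briefly. The following is a minimal illustrative sketch, assuming a toy corpus, a set-based word encoding, and an overlap (Jaccard-style) similarity inside a modified k nearest neighbor; the corpus, helper names, and similarity measure are hypothetical illustrations, not the lecturer's actual formulation.

```python
# Hypothetical sketch: encode each word by the set of documents it
# occurs in (instead of a sparse numerical vector), then classify with
# a k-NN variant whose distance is replaced by overlap similarity.
# Corpus, labels, and function names are illustrative assumptions.
from collections import Counter

corpus = {
    "doc1": "apple banana fruit sweet",
    "doc2": "apple orange fruit juice",
    "doc3": "python java code compile",
    "doc4": "java code program compile",
}

def string_vector(word):
    """Encode a word as the set of documents containing it."""
    return {doc for doc, text in corpus.items() if word in text.split()}

def similarity(sv1, sv2):
    """Jaccard-style overlap between two encoded words."""
    if not sv1 and not sv2:
        return 0.0
    return len(sv1 & sv2) / len(sv1 | sv2)

def knn_classify(word, labeled_words, k=3):
    """Modified k-NN: rank labeled words by overlap similarity
    and take a majority vote among the top k."""
    sv = string_vector(word)
    ranked = sorted(labeled_words.items(),
                    key=lambda item: similarity(sv, string_vector(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

training = {"apple": "food", "banana": "food",
            "java": "computing", "code": "computing"}
print(knn_classify("fruit", training, k=3))  # prints "food"
```

The design point of such a modification is that the classifier never builds a high-dimensional numerical vector at all, so the dimensionality and sparseness problems mentioned above simply do not arise; only a similarity between the structured encodings is required.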
- Introduction
- Types of Word Categorization
- 2.1. Hard Word Categorization
- 2.2. Soft Word Categorization
- 2.3. Hierarchical Word Categorization
- 2.4. Multiple Viewed Word Categorization
- Word Encoding Schemes
- 3.1. Numerical Vectors
- 3.2. Tables
- 3.3. String Vectors
- 3.4. Graphs
- Modification of Machine Learning Algorithms
- 4.1. K Nearest Neighbor
- 4.2. Naïve Bayes
- 4.3. Learning Vector Quantization
- 4.4. Perceptron
- Summary and Further Discussions
- Bio : Taeho Jo is currently a faculty member in the Department of Computer and Information Engineering at Inha University, South Korea. He received his Bachelor's degree from Korea University in 1994, his Master's degree from Pohang University of Science and Technology in 1997, and his PhD from the University of Ottawa in 2006. His research area spans text mining, neural networks, machine learning, and information retrieval. He has four years of experience working for industrial organizations and ten years working for academic ones, so his research is characterized as a connection between fundamental research, which creates theories, and applied research, which develops products.
- Tutorial Two : User Understanding from Large Scale Human Behavioral Data
- Date : Afternoon, January 18 (Mon), 2016
- Lecturer : Dr. Xing Xie, Microsoft Research Asia
- Description : With the rapid development of positioning, sensor, and smart-device technologies, large quantities of human behavioral data are now readily available. These data reflect various aspects of human mobility and activity in the physical world, and their availability presents an unprecedented opportunity to gain a more in-depth understanding of users and to provide them with a personalized online experience while respecting their privacy. In this tutorial, I will present a number of our recent research efforts in this direction, including user mobility understanding and prediction, location and activity recommendation, user linking across multiple networks, psychological trait inference, life pattern analysis, and driving behavior understanding.