Session 13: Introduction to textual document analysis

Posted by: System Administrator | Date posted: 2020-06-03

Title: Introduction to textual document analysis
Speaker: Assistant Professor OU Wei
Time: Thursday, June 4, 2020, 14:00-15:00
Venue: Room 526, Comprehensive Building (综合楼)
Abstract:
Textual documents are a very important source of information. According to a rough estimate by a prominent sociologist, approximately 70% of the information consumed by human society comes in the form of textual documents. The essential information in a textual document is expressed in natural language, which is usually noisy and unstructured. Extracting that information therefore requires either intensive reading by humans or computational analysis by machines. In the context of big data, relying on human readers for information extraction is usually out of the question, while machine-based analysis has become increasingly practical. In this seminar, Dr. Ou will share his knowledge of and experience with the computational analysis of textual documents. The outline of his talk is as follows.


Outline
1. Main tasks in textual document analysis
2. Numeric representations of textual documents
3. Document similarities
4. Document classification
5. Extracting keywords from documents
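
For readers new to the topic, outline items 2 and 3 can be illustrated with a minimal sketch. It is not taken from the talk itself. It assumes a simple bag-of-words representation (word counts) and cosine similarity, which are common choices but not necessarily the ones Dr. Ou will present. The function names and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as word counts (assumed scheme: lowercase, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors; 0.0 if either is empty."""
    # Only words that occur in both documents contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, for illustration only.
doc_a = bag_of_words("a method for analysing patent documents")
doc_b = bag_of_words("a method for classifying patent documents")
doc_c = bag_of_words("weather forecast tomorrow")

print(cosine_similarity(doc_a, doc_b))  # high: the two share most words
print(cosine_similarity(doc_a, doc_c))  # zero: no words in common
```

Once documents are turned into vectors in this way, the same vectors can serve as input to classification and keyword extraction, which correspond to items 4 and 5 of the outline.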



About the speaker:

OU Wei holds a PhD in Knowledge Science from the Japan Advanced Institute of Science and Technology, where he studied under the sponsorship of the China Scholarship Council (CSC) through the Chinese government's program for building high-level universities (中国政府建设高水平大学项目). Dr. Ou's research interests lie mainly in machine learning and natural language processing. Before joining IBS, Dr. Ou worked as a data scientist at a major recruitment company based in Tokyo, and he has extensive experience in both academic and industrial research. At IBS, Dr. Ou is currently focusing on the analysis of textual patent documents and on devising efficient algorithms to mine useful information from them.