Knowledge Mining in Heterogeneous Information Networks
OVERVIEW: People and informational objects are interconnected, forming gigantic, integrated information networks. When these data objects are organized into multiple types, such networks become semi-structured heterogeneous information networks. Most real-world applications that handle big data, including social media and social networks, medical information systems, online e-commerce systems, and database systems, can be structured into typed, semi-structured, heterogeneous information networks. For example, in a medical care network, objects of multiple types, such as patients, doctors, diseases, and medications, and links such as visits, diagnoses, and treatments, are intertwined, providing rich information and forming a heterogeneous information network. Effective construction, exploration, and analysis of large-scale heterogeneous information networks pose an interesting but critical challenge.
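As a rough illustration of the medical care example, a typed network can be sketched as nodes that carry a type plus links that are checked against a schema. The class name HeteroNetwork, the schema, and the sample names below are hypothetical; in particular, which node types each link type connects is an assumption made for this sketch, not something the talk specifies.

```python
from collections import defaultdict

# Assumed schema: each link type joins one source node type to one target node type.
SCHEMA = {
    "visits": ("patient", "doctor"),
    "diagnosis": ("doctor", "disease"),
    "treatment": ("disease", "medication"),
}


class HeteroNetwork:
    """Minimal typed network: nodes have a type, links must match the schema."""

    def __init__(self, schema):
        self.schema = schema
        self.node_type = {}            # node name -> node type
        self.links = defaultdict(set)  # link type -> set of (source, target)

    def add_node(self, name, ntype):
        self.node_type[name] = ntype

    def add_link(self, ltype, src, dst):
        expected = self.schema[ltype]
        actual = (self.node_type[src], self.node_type[dst])
        if actual != expected:
            raise TypeError(f"{ltype} expects {expected}, got {actual}")
        self.links[ltype].add((src, dst))

    def neighbors(self, node, ltype):
        """Targets reachable from `node` over links of type `ltype`."""
        return sorted(dst for src, dst in self.links[ltype] if src == node)


# Hypothetical sample data.
net = HeteroNetwork(SCHEMA)
net.add_node("alice", "patient")
net.add_node("dr_lee", "doctor")
net.add_node("flu", "disease")
net.add_link("visits", "alice", "dr_lee")
net.add_link("diagnosis", "dr_lee", "flu")
```

Rejecting links whose endpoint types do not match the schema is what distinguishes this from a homogeneous graph, where any node may connect to any other.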
In this talk, we present principles, methodologies, and algorithms for mining heterogeneous social and information networks and show that mining typed, heterogeneous networks is a promising frontier in data mining research. Departing from many existing network models that view data as homogeneous graphs or networks, the semi-structured heterogeneous information network model leverages the rich semantics of typed nodes and links in a network and can uncover surprisingly rich knowledge from interconnected data. This heterogeneous network modeling leads to a set of new principles and methodologies for mining and exploring interconnected data, such as rank-based clustering and classification, meta-path-based similarity search, and meta-path-based link/relationship prediction. We will also discuss our recent progress on constructing quality semi-structured heterogeneous information networks from unstructured data and point out some promising research directions.
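To make the idea of similarity search along a meta path concrete, the sketch below counts instances of the path doctor-disease-doctor in a toy data set and normalizes the counts in the style of the PathSim measure from the heterogeneous network literature. The data and the function names path_counts and similarity are hypothetical, and this is an illustrative sketch under those assumptions rather than the method presented in the talk.

```python
from collections import Counter

# Hypothetical toy data: (doctor, disease) pairs from a "diagnosis" link type.
DIAGNOSES = [
    ("dr_lee", "flu"), ("dr_lee", "asthma"),
    ("dr_kim", "flu"), ("dr_kim", "asthma"),
    ("dr_roy", "flu"), ("dr_roy", "diabetes"),
]


def path_counts(pairs):
    """Count instances of the meta path doctor-disease-doctor per doctor pair."""
    by_disease = {}
    for doctor, disease in pairs:
        by_disease.setdefault(disease, []).append(doctor)
    counts = Counter()
    for doctors in by_disease.values():
        # Every ordered pair of doctors sharing this disease is one path instance.
        for a in doctors:
            for b in doctors:
                counts[(a, b)] += 1
    return counts


def similarity(counts, a, b):
    """PathSim-style score: 2 * paths(a, b) / (paths(a, a) + paths(b, b))."""
    denom = counts[(a, a)] + counts[(b, b)]
    return 2 * counts[(a, b)] / denom if denom else 0.0


counts = path_counts(DIAGNOSES)
print(similarity(counts, "dr_lee", "dr_kim"))  # 1.0: identical disease profiles
print(similarity(counts, "dr_lee", "dr_roy"))  # 0.5: only flu is shared
```

Choosing a different meta path, for example one that passes through patients instead of diseases, would give the same pair of doctors a different score, which is how the typed semantics shape the result.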
READINGS: Yizhou Sun and Jiawei Han (2012), Mining Heterogeneous Information Networks: Principles and Methodologies, Morgan & Claypool Publishers.
Chi Wang, Marina Danilevsky, Jialu Liu, Nihit Desai, Heng Ji, and Jiawei Han, Constructing Topical Hierarchies in Heterogeneous Information Networks, Proc. 2013 IEEE Int. Conf. on Data Mining (ICDM'13), Dallas, TX, Dec. 2013.