Abstract:
Online social networks (OSNs) are complex systems woven into everyday life, and they provide a rich source of information about billions of users worldwide. To some extent, OSNs mirror real society: people perform many of the same activities online as they do offline, such as establishing social relations, sharing life moments, and expressing opinions on various topics. Understanding OSNs is therefore of immense importance. One key characteristic of human social behaviour in OSNs is its inter-relational nature, which can be represented as graphs. Because these graphs are sparse and structurally complex, analysing them is challenging and expensive.
Over the past several decades, many expert-designed approaches to graph analysis have been proposed; they have elegant theoretical properties and have successfully addressed numerous practical problems. Nevertheless, most of them are either not data-driven or do not benefit from the rapidly growing scale of data. Recently, in light of the remarkable achievements of artificial intelligence, especially deep neural network techniques, graph machine learning (GML) has emerged to provide novel perspectives for understanding and analysing graphs. However, current GML efforts are relatively immature and pay little attention to the specific scenarios and characteristics of OSNs. Based on the strengths and weaknesses of GML, this thesis discusses several aspects of how to build advanced approaches that simplify and improve OSN analytic tasks. Specifically:
1) Overcoming flat message-passing graph neural networks. Graph neural networks (GNNs), one of the most widely pursued branches of GML research, largely follow the same flat message-passing principle for representation learning: information is iteratively passed between adjacent nodes along observed edges via non-linear transformation and aggregation functions. The effectiveness of this principle has been widely demonstrated; however, two limitations need to be tackled: (i) it is costly to encode long-range information spanning the graph structure; (ii) features in high-order neighbourhoods are not encoded, because information is aggregated only across the observed edges of the original graph. To fill this gap, we propose a novel hierarchical message-passing framework that augments the existing GNN mechanism. Following this idea, we design two practical implementations, HC-GNN and AdamGNN, to demonstrate the framework's superiority.
2) Extending graph machine learning to heterophilous graphs. Existing GML approaches implicitly hold a homophily assumption, namely that nodes of the same class tend to be connected. However, previous studies have shown that addressing the heterophily scenario, where "opposites attract", is essential for network analysis and fairness studies. We demonstrate the possibility of extending GML to heterophilous graphs by simplifying supervised node classification models on heterophilous graphs (CLP) and designing an unsupervised heterophilous graph representation learning model (Selene).
3) Online social network analysis with graph machine learning. As GML approaches have proved effective on general graph analytic tasks, we carry out two practical OSN analysis projects to illustrate how GML can be employed in practice. Specifically, we propose a semantic image graph embedding (SiGraph) that improves OSN image recognition by exploiting the semantics of the associated hashtags, and a simple GNN-based neural link prediction framework (NeuLP) that boosts performance with only minor changes.
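As an illustration of the ideas behind items 1 and 2, the sketch below shows one flat message-passing step and one common way to quantify homophily. It is a minimal toy example in plain Python and is not the implementation of HC-GNN, AdamGNN, CLP, or Selene. The function names, the mean aggregator, the tanh non-linearity, the toy path graph, and the use of the edge homophily ratio (the fraction of edges joining same-class nodes) are all assumptions made for illustration.

```python
import math


def message_passing_layer(features, edges, weight):
    """One flat message-passing step (illustrative, not the thesis's model).

    Each node averages the feature vectors of itself and its direct
    neighbours, applies a linear map, then a tanh non-linearity.
    """
    n = len(features)
    # A self-loop keeps each node's own state in the aggregation.
    neighbours = {i: [i] for i in range(n)}
    for u, v in edges:  # undirected observed edges
        neighbours[u].append(v)
        neighbours[v].append(u)

    dim_in, dim_out = len(weight), len(weight[0])
    updated = []
    for i in range(n):
        # Aggregation: mean over the closed neighbourhood of node i.
        agg = [
            sum(features[j][d] for j in neighbours[i]) / len(neighbours[i])
            for d in range(dim_in)
        ]
        # Transformation: linear map followed by a non-linearity.
        updated.append([
            math.tanh(sum(agg[d] * weight[d][k] for d in range(dim_in)))
            for k in range(dim_out)
        ])
    return updated


def run_layers(features, edges, weight, num_layers):
    """Apply the same layer repeatedly (weights shared for simplicity)."""
    h = features
    for _ in range(num_layers):
        h = message_passing_layer(h, edges, weight)
    return h


def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share the same class label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy path graph 0 - 1 - 2 - 3; only node 0 carries a non-zero signal.
path_edges = [(0, 1), (1, 2), (2, 3)]
signal = [[1.0], [0.0], [0.0], [0.0]]
identity = [[1.0]]

after_two = run_layers(signal, path_edges, identity, 2)
after_three = run_layers(signal, path_edges, identity, 3)

homophilous_labels = [0, 0, 1, 1]    # most edges join same-class nodes
heterophilous_labels = [0, 1, 0, 1]  # every edge joins different classes
```

On this path graph the signal from node 0 reaches node 3 only after three layers, which illustrates why long-range information is costly for flat message passing: the number of layers must grow with the distance to be covered. The two label assignments give edge homophily ratios of 2/3 and 0 on the same edges, a simple way to see the difference between homophilous and heterophilous settings.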
Keywords: Graph machine learning, Social network analysis, Graph neural networks, Hierarchical structure, Homophily/Heterophily graphs, Link prediction, Online image content understanding.