Mining Top-k Pairs of Correlated Subgraphs in a Large Network
A summary of the VLDB 2020 research paper by Arneish Prateek, Arijit Khan, Akshit Goyal, and Sayan Ranu

[Background and Problem]

A large body of work exists on mining recurring structural patterns among a group of nodes in the form of frequent subgraphs [1, 2]. However, can we mine recurring patterns among the frequent subgraphs themselves? In this paper, we explored this question by mining correlated pairs of frequent subgraphs. Correlated subgraphs differ from frequent subgraphs in that the connections between the constituent subgraph instances are flexible.

To elaborate, in Figure 1 we highlight three regions inside the chemical structure of Taxol, an anti-cancer drug, where CCCH and O occur close together, albeit connected in a different way in each of the three instances. For simplicity, we do not consider the edge types (i.e., single or double bonds) in this example. This figure illustrates th...