An Evaluation of Backpropagation Interpretability for Graph Classification with Deep Learning

 

A summary of the IEEE BigData 2020 research paper by Kenneth Teo Tian Shun, Eko Edita Limanta, and Arijit Khan

 

[Graph Classification] Graph data are ubiquitous in many domains, such as social networks, knowledge graphs, biological networks, geo-spatial road networks, and internet-of-things. There are plenty of interest and applications in developing high-quality machine learning techniques for classifying graph objects, including cheminformatics (e.g., compounds that are active or inactive for some target) [1], [2] and bioinformatics (e.g., classifying proteins into different families) [3], malware detection and classification with call graphs [18]. There are two research directions for classification tasks on graphs: (1) Given a single large graph, infer labels of individual nodes (the node classification problem) [4], [5]; and (2) given a set of graphs with different structure and sizes, predict the class labels of unseen graphs (the graph classification problem) [6], [7], [19]. We focus on the second problem — graph classification, which is more difficult because graph datasets contain graphs with varying numbers of nodes and edges.

 

[Why Deep Learning in Graph Classification] Unlike images, general graphs do not have regular grid-like structures, and the neighborhood size of each node varies to a great extent. The lack of ordered vector representation limits the applicability of machine learning on graphs, and consequently, makes it difficult to build a classifier directly over the graph space [19], [7]. In the past, graph kernels (e.g., random walks, shortest paths, cyclic patterns, subtrees, graphlets, and subgraph kernels) [8] and feature based classification (e.g., frequent and significant subgraphs) methods [9], [10] were developed. Due to the challenges of feature engineering, coupled with the hardness of subgraph mining and isomorphism testing, deep learning methods are increasingly becoming popular. In particular, the end-to-end nature of learning in convolutional neural networks (CNNs) and their ability to extract localized spatial features, have turned them into powerful tools for learning from a large corpus of data including graphs. We focus on three state-of-the-art graph convolutional neural networks (GCNNs): GCNN with global average pooling (GCNN+GAP)[11], [5], DGCNN [19], and DIFFPOOL [7], as well as their variants.

 

[Interpretability Challenges in Deep Learning-based Graph Classification] Deep learning models and neural networks are “black-box” — it is inherently difficult to understand which aspects of the input data drive the decisions of the network. Interpretability can improve the model’s transparency related to fairness, privacy, and other safety challenges, thus enhancing the trust in decision-critical applications [12], [13]. Network data are complex, noisy, and content-rich. End users (e.g., data scientists, business analysts) often find them difficult to query and explore [14]. Hence, it is even more critical to provide human-understandable interpretation over classification results on such datasets. While interpretability for GCNNs is an open area of research, inspired by recent interpretability tools for CNNs, in this work we investigate three backpropagation based post-hoc interpretability methods: (1) saliency map with contrastive gradients (CG) [15], (2) gradient-weighted class activation mapping (Grad-CAM) [16], and (3) deep learning important features (DeepLIFT) [17]. We conduct a thorough experimental evaluation of these interpretability methods in conjunction with three state-of-the-art graph convolutional neural networks: GCNN+GAP [11], [5], DGCNN [19], and DIFFPOOL [7], and with their variants.

We design novel strategies for integrating them, measuring their efficiency and performance both qualitatively and quantitatively. We implement GCNNs and interpretability tools in a common environment, using same hyperparameters selection criteria, and present extensive empirical comparisons over five real-world graph datasets from different categories. We report their quantitative, visualization, and active subgraph based performance, compare them with the classic significant subgraph mining results [10], summarize their trade-offs, and provide guidelines for researchers and practitioners.

[Summary and our recommendation] The following table summarizes our recommendation levels of each interpretability method according to different performance metrics. The scale is from 1 to 4 stars, larger number of stars stands for higher recommendation. Clearly, there is no single winner.



(1) Considering various trade-offs, our recommended interpretability tools for graph classification are DeepLIFT, GradCAM, in combination with recent (and more accurate) GCNNs: DGCNN, DIFFPOOL, and its variant DIFFPOOL_D. (2) Grad-CAM is useful for class-discriminative visualization, whereas DeepLIFT can identify differences between two individual instances. (3) Finally, DIFFPOOL and DIFFPOOL_D can assist visualizing node clustering at various granularity, and this has the potential to identify active subgraphs, e.g., active functional groups for biological networks classification.

 

We open sourced an end-to-end framework (https://github.com/tsKenneth/interpretable-graph-classification) in which one can plug and play with different graph datasets, GCNNs, and backpropagation-based interpretability tools. In future, it would be interesting to test and model how high accuracy and fidelity could impact the user’s trust in the underlying graph classification tool, as well as to conduct usability analysis due to higher contrastivity, sparsity, explanation subgraph size, and efficiency.


For more information, click here for our paper. Code: GitHub.

 

References

[1] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. S. Pande, “MoleculeNet: A Benchmark for Molecular Machine Learning”, CoRR, vol. abs/1703.00564, 2017.

[2] D. Duvenaud, D. Maclaurin, J. A.-Iparraguirre, R. G.-Bombarelli, T. Hirzel, A. A.-Guzik, and R. P. Adams, “Convolutional Networks on Graphs for Learning Molecular Fingerprints,” NeurIPS, 2015.

[3] R. Sharan, S. Suthram, R. M. Kelley, T. Kuhn, S. McCuine, P. Uetz, T. Sittler, R. M. Karp, and T. Ideker, “Conserved Patterns of Protein Interaction in Multiple Species”, PNAS, vol. 102, no. 6, 2005.

[4] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, “Spectral Networks and Locally Connected Networks on Graphs,” ICLR, 2014.

[5] T. N. Kipf and M. Welling, “Semi-Supervised Classification with Graph Convolutional Networks”, ICLR, 2017.

[6] D. Duvenaud, D. Maclaurin, J. A.-Iparraguirre, R. G.-Bombarelli, T. Hirzel, A. A.-Guzik, and R. P. Adams, “Convolutional Networks on Graphs for Learning Molecular Fingerprints,” NeurIPS, 2015.

[7] Z. Ying, J. You, C. Morris, X. Ren, W. L. Hamilton, and J. Leskovec, “Hierarchical Graph Representation Learning with Differentiable Pooling”, NeurIPS, 2018.

[8] S. V. N. Vishwanathan, N. N. Schraudolph, R. Kondor, and K. M. Borgwardt, “Graph Kernels”, J. Mach. Learn. Res., vol. 11, 2010.

[9] X. Yan, H. Cheng, J. Han, and P. S. Yu, “Mining Significant Graph Patterns by Leap Search”, SIGMOD, 2008.

[10] S. Ranu and A. K. Singh, “GraphSig: A Scalable Approach to Mining Significant Subgraphs in Large Graph Databases”, ICDE, 2009.

[11] M. Lin, Q. Chen, and S. Yan, “Network In Network”, ICLR, 2014.

[12] M. Du, N. Liu, and X. Hu, “Techniques for Interpretable Machine Learning”, Commun. ACM, vol. 63, no. 1, 2020.

[13] Z. C. Lipton, “The Mythos of Model Interpretability”, Commun. ACM, vol. 61, no. 10, 2018.

[14] N. Jayaram, A. Khan, C. Li, X. Yan, and R. Elmasri, “Querying Knowledge Graphs by Example Entity Tuples”, IEEE Trans. Knowl. Data Eng., vol. 27, no. 10, 2015.

[15] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR, 2013.

[16] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization”, ICCV, 2017.

[17] A. Shrikumar, P. Greenside, and A. Kundaje, “Learning Important Features Through Propagating Activation Differences”, ICML, 2017.

[18] M. Fan, X. Luo, J. Liu, M. Wang, C. Nong, Q. Zheng, and T. Liu, “Graph Embedding based Familial Analysis of Android Malware using Unsupervised Learning,” ICSE, 2019.

[19] M. Zhang, Z. Cui, M. Neumann, and Y. Chen, “An End-to-End Deep Learning Architecture for Graph Classification”, AAAI, 2018.

Comments

  1. All thanks to Mr Anderson for helping with my profits and making my fifth withdrawal possible. I'm here to share an amazing life changing opportunity with you. its called Bitcoin / Forex trading options. it is a highly lucrative business which can earn you as much as $2,570 in a week from an initial investment of just $200. I am living proof of this great business opportunity. If anyone is interested in trading on bitcoin or any cryptocurrency and want a successful trade without losing notify Mr Anderson now.Whatsapp: (+447883246472 )
    Email: tdameritrade077@gmail.com

    ReplyDelete

Post a Comment

Popular posts from this blog

Measurements, Analyses, and Insights on the Entire Ethereum Blockchain Network