An Evaluation of Backpropagation Interpretability for Graph Classification with Deep Learning
A summary of the IEEE BigData 2020 research paper by Kenneth Teo Tian Shun, Eko Edita Limanta, and Arijit Khan
[Graph Classification] Graph data are ubiquitous in many domains,
such as social networks, knowledge graphs, biological networks, geo-spatial road
networks, and internet-of-things. There are plenty of interest and applications
in developing high-quality machine learning techniques for classifying graph
objects, including cheminformatics (e.g., compounds that are active or inactive
for some target) [1], [2] and bioinformatics (e.g., classifying proteins into
different families) [3], malware detection and classification with call graphs
[18]. There are two research directions for classification tasks on graphs: (1)
Given a single large graph, infer labels of individual nodes (the node classification
problem) [4], [5]; and (2) given a set of graphs with different structure and
sizes, predict the class labels of unseen graphs (the graph classification
problem) [6], [7], [19]. We focus on the second problem — graph classification,
which is more difficult because graph datasets contain graphs with varying
numbers of nodes and edges.
[Why Deep
Learning in Graph Classification] Unlike images, general graphs do not
have regular grid-like structures, and the neighborhood size of each node
varies to a great extent. The lack of ordered vector representation limits the
applicability of machine learning on graphs, and consequently, makes it
difficult to build a classifier directly over the graph space [19], [7]. In the
past, graph kernels (e.g., random walks, shortest paths, cyclic patterns, subtrees,
graphlets, and subgraph kernels) [8] and feature based classification (e.g.,
frequent and significant subgraphs) methods [9], [10] were developed. Due to
the challenges of feature engineering, coupled with the hardness of subgraph mining
and isomorphism testing, deep learning methods are increasingly becoming
popular. In particular, the end-to-end nature of learning in convolutional
neural networks (CNNs) and their ability to extract localized spatial features,
have turned them into powerful tools for learning from a large corpus of data including
graphs. We focus on three state-of-the-art graph convolutional neural networks
(GCNNs): GCNN with global average pooling (GCNN+GAP)[11], [5], DGCNN [19], and
DIFFPOOL [7], as well as their variants.
[Interpretability
Challenges in Deep Learning-based Graph Classification] Deep
learning models and neural networks are “black-box” — it is inherently
difficult to understand which aspects of the input data drive the decisions of
the network. Interpretability can improve the model’s transparency related to
fairness, privacy, and other safety challenges, thus enhancing the trust in decision-critical
applications [12], [13]. Network data are complex, noisy, and content-rich. End
users (e.g., data scientists, business analysts) often find them difficult to
query and explore [14]. Hence, it is even more critical to provide human-understandable
interpretation over classification results on such datasets. While
interpretability for GCNNs is an open area of research, inspired by recent
interpretability tools for CNNs, in this work we investigate three
backpropagation based post-hoc interpretability methods: (1) saliency map with contrastive
gradients (CG) [15], (2) gradient-weighted class activation mapping (Grad-CAM)
[16], and (3) deep learning important features (DeepLIFT) [17]. We conduct a
thorough experimental evaluation of these interpretability methods in conjunction
with three state-of-the-art graph convolutional neural networks: GCNN+GAP [11],
[5], DGCNN [19], and DIFFPOOL [7], and with their variants.
We design novel strategies for integrating them, measuring their efficiency and performance both qualitatively and quantitatively. We implement GCNNs and interpretability tools in a common environment, using same hyperparameters selection criteria, and present extensive empirical comparisons over five real-world graph datasets from different categories. We report their quantitative, visualization, and active subgraph based performance, compare them with the classic significant subgraph mining results [10], summarize their trade-offs, and provide guidelines for researchers and practitioners.
[Summary and our recommendation] The following table summarizes our recommendation levels of each interpretability method according to different performance metrics. The scale is from 1 to 4 stars, larger number of stars stands for higher recommendation. Clearly, there is no single winner.
(1) Considering various trade-offs, our recommended interpretability tools for graph classification are DeepLIFT, GradCAM, in combination with recent (and more accurate) GCNNs: DGCNN, DIFFPOOL, and its variant DIFFPOOL_D. (2) Grad-CAM is useful for class-discriminative visualization, whereas DeepLIFT can identify differences between two individual instances. (3) Finally, DIFFPOOL and DIFFPOOL_D can assist visualizing node clustering at various granularity, and this has the potential to identify active subgraphs, e.g., active functional groups for biological networks classification.
We open sourced an end-to-end framework (https://github.com/tsKenneth/interpretable-graph-classification) in which one can plug and play with different graph datasets, GCNNs, and backpropagation-based interpretability tools. In future, it would be interesting to test and model how high accuracy and fidelity could impact the user’s trust in the underlying graph classification tool, as well as to conduct usability analysis due to higher contrastivity, sparsity, explanation subgraph size, and efficiency.
For more information, click here for our paper. Code: GitHub.
References
[1] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. S. Pande, “MoleculeNet: A Benchmark for Molecular Machine Learning”, CoRR, vol. abs/1703.00564, 2017.
[2] D. Duvenaud, D. Maclaurin, J. A.-Iparraguirre, R. G.-Bombarelli, T. Hirzel, A. A.-Guzik, and R. P. Adams, “Convolutional Networks on Graphs for Learning Molecular Fingerprints,” NeurIPS, 2015.
[3] R. Sharan, S. Suthram, R. M. Kelley, T. Kuhn, S. McCuine, P. Uetz, T. Sittler, R. M. Karp, and T. Ideker, “Conserved Patterns of Protein Interaction in Multiple Species”, PNAS, vol. 102, no. 6, 2005.
[4] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, “Spectral Networks and Locally Connected Networks on Graphs,” ICLR, 2014.
[5] T. N. Kipf and M. Welling, “Semi-Supervised Classification with Graph Convolutional Networks”, ICLR, 2017.
[6] D. Duvenaud, D. Maclaurin, J. A.-Iparraguirre, R. G.-Bombarelli, T. Hirzel, A. A.-Guzik, and R. P. Adams, “Convolutional Networks on Graphs for Learning Molecular Fingerprints,” NeurIPS, 2015.
[7] Z. Ying, J. You, C. Morris, X. Ren, W. L. Hamilton, and J. Leskovec, “Hierarchical Graph Representation Learning with Differentiable Pooling”, NeurIPS, 2018.
[8] S. V. N. Vishwanathan, N. N. Schraudolph, R. Kondor, and K. M. Borgwardt, “Graph Kernels”, J. Mach. Learn. Res., vol. 11, 2010.
[9] X. Yan, H. Cheng, J. Han, and P. S. Yu, “Mining Significant Graph Patterns by Leap Search”, SIGMOD, 2008.
[10] S. Ranu and A. K. Singh, “GraphSig: A Scalable Approach to Mining Significant Subgraphs in Large Graph Databases”, ICDE, 2009.
[11] M. Lin, Q. Chen, and S. Yan, “Network In Network”, ICLR, 2014.
[12] M. Du, N. Liu, and X. Hu, “Techniques for Interpretable Machine Learning”, Commun. ACM, vol. 63, no. 1, 2020.
[13] Z. C. Lipton, “The Mythos of Model Interpretability”, Commun. ACM, vol. 61, no. 10, 2018.
[14] N. Jayaram, A. Khan, C. Li, X. Yan, and R. Elmasri, “Querying Knowledge Graphs by Example Entity Tuples”, IEEE Trans. Knowl. Data Eng., vol. 27, no. 10, 2015.
[15] K. Simonyan, A. Vedaldi, and A. Zisserman, “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR, 2013.
[16] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization”, ICCV, 2017.
[17] A. Shrikumar, P. Greenside, and A. Kundaje, “Learning Important Features Through Propagating Activation Differences”, ICML, 2017.
[18] M. Fan, X. Luo, J. Liu, M. Wang, C. Nong, Q. Zheng, and T. Liu, “Graph Embedding based Familial Analysis of Android Malware using Unsupervised Learning,” ICSE, 2019.
[19] M. Zhang, Z. Cui, M. Neumann, and Y. Chen, “An
End-to-End Deep Learning Architecture for Graph Classification”, AAAI, 2018.
Nice article, its very informative content..thanks for sharing...Waiting for the next update.
ReplyDeleteLoadRunner Training in Chennai
Loadrunner Course in Chennai
Best Loadrunner Training Institute in Chennai
All thanks to Mr Anderson for helping with my profits and making my fifth withdrawal possible. I'm here to share an amazing life changing opportunity with you. its called Bitcoin / Forex trading options. it is a highly lucrative business which can earn you as much as $2,570 in a week from an initial investment of just $200. I am living proof of this great business opportunity. If anyone is interested in trading on bitcoin or any cryptocurrency and want a successful trade without losing notify Mr Anderson now.Whatsapp: (+447883246472 )
ReplyDeleteEmail: tdameritrade077@gmail.com