Reviews: Demystifying Black-box Models with Symbolic Metamodels

Reviewer 1

Originality: Although similar in spirit to past work on approximating black-box models with interpretable surrogates and on symbolic regression (which it does a good job of citing), this work seems to go several steps further and contributes something novel in both framing the problem and solving it. (Past work on symbolic regression does not seem to use Meijer G-functions.)

Quality: The work seems technically sound, though it makes a number of arbitrary choices and I think it could do with more experiments. I elaborate on these points in the improvements section.

Clarity: The paper is very clearly written and its results are nicely presented.

Significance: I think this paper has the potential to become very significant, with immediate practical applications in medicine and accelerated science and many areas for follow-up research.

Reviewer 2

I am new to the domain of symbolic regression and found the article to be a well-written and interesting introduction to it. Yet I kept wondering to what extent the presented approach can really help in interpreting complex black-box functions. In the final example, it is clear that the results are fairly simple and interpretable, at the cost of a moderate loss in predictivity compared to the crude algorithm. More generally, however, I still don't see how combinations of Bessel functions and the like will help most practitioners. This leads to a question that, to the best of my understanding, was underinvestigated here: a more systematic approach to tuning the complexity of the metamodel, and perhaps an exploration of the Pareto front of simplicity versus predictivity.

Besides this, I also wonder why such an approach should be restricted to functions returned by an ML algorithm, rather than being trained directly on raw data. On a different question, an L2 loss is considered here, and the (meta)model fitting is performed by (local) gradient descent. Why not consider a more general class of misfits on the one hand, and allow more varied classes of (global) optimization algorithms on the other? Unless I have overlooked an important point, I don't see why the fitting problem at hand should be convex.

Last but not least, I was surprised not to find more references or comparisons to other machine learning approaches that promote interpretability, e.g. those based on kernels and their decompositions, with a variety of methods encompassing additive/ANOVA GPs and splines (Durrande, Duvenaud, Plate, Wahba, etc.), High Dimensional Model Representation, etc.:

* Plate, T.A. Accuracy versus interpretability in flexible modeling: Implementing a tradeoff using Gaussian process models. Behaviormetrika, 26:29–50 (1999).
* Li et al. Global uncertainty assessments by high dimensional model representations (HDMR). Chemical Engineering Science, Volume 57, Issue 21, pp. 4445-4460 (2002).
* Wahba, G. et al. Smoothing Spline ANOVA for Exponential Families, with Application to the Wisconsin Epidemiological Study of Diabetic Retinopathy. The Annals of Statistics, Vol. 23, No. 6, pp. 1865-1895 (1995).
* Duvenaud, D., Nickisch, H. and Rasmussen, C.E. Additive Gaussian Processes. Neural Information Processing Systems (2011).
* Durrande, N., Ginsbourger, D. and Roustant, O. Additive covariance kernels for high-dimensional Gaussian process modeling. Annales de la Faculté des sciences de Toulouse: Mathématiques 21 (3), 481-499 (2012).
* Duvenaud, D. et al. Structure Discovery in Nonparametric Regression through Compositional Kernel Search. ICML (2013).
* Durrande, N. et al. ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis. Journal of Multivariate Analysis 115, 57-67 (2013).

Finally, the term "metamodelling" is also widely used in the domain of "Computer Experiments"; see the works of authors including Santner, Wynn, Schonlau, etc. See for instance:

* Santner, T.J., Williams, B.J. and Notz, W.I. The Design and Analysis of Computer Experiments. Springer (2003), and references therein, such as:
* Sacks et al. Design and Analysis of Computer Experiments. Statistical Science, Vol. 4, No. 4, pp. 409-423 (1989).
* Jones, D.R., Schonlau, M. and Welch, W.J. Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization 13: 455 (1998). https://doi.org/10.1023/A:1008306431147

******** Update after rebuttal **********

I found the rebuttal quite constructive. While my earlier reservations have not entirely gone away (regarding the actual interpretability of the classes of functions used, as well as some arbitrariness in the way regularization is performed), I feel that the work has improved and that such approaches and their potential practical benefits are worth investigating further. As a consequence, I increased my score by one unit.
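The Pareto-front question raised in this review can be made concrete with a small sketch. It assumes that each candidate metamodel has already been scored with a complexity measure and an error against the black box; the `Candidate` type, the candidate names, and all numbers below are made up for illustration and are not taken from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str        # hypothetical label for a fitted metamodel
    complexity: int  # any "lower is simpler" score, e.g. number of terms
    error: float     # e.g. mean squared error against the black box


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that are not dominated when both
    complexity and error are to be minimized."""
    front = []
    best_error = float("inf")
    # Scan from simplest to most complex; a candidate is kept only if it
    # strictly improves on the error of every simpler candidate.
    for c in sorted(candidates, key=lambda c: (c.complexity, c.error)):
        if c.error < best_error:
            front.append(c)
            best_error = c.error
    return front


# Illustrative scores only (assumed, not results from the paper).
candidates = [
    Candidate("linear", 1, 0.30),
    Candidate("quadratic", 2, 0.12),
    Candidate("cubic", 3, 0.15),
    Candidate("two-term", 4, 0.05),
    Candidate("five-term", 8, 0.05),
]
front = pareto_front(candidates)

for c in front:
    print(f"{c.name}: complexity={c.complexity}, error={c.error}")
```

In this toy setting "cubic" and "five-term" drop out because a simpler candidate matches or beats their error. Which complexity measure is appropriate for a symbolic metamodel is exactly the open question raised above.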

Reviewer 3

After reading reviews and author feedback I raise my score to 7. The authors responded to my concern and I think this submission is a good contribution.

-----------

**tl;dr** The ability to learn interpretable meta-models with gradient descent is a good contribution. Some more empirical evidence would make the paper’s claim stronger.

**Summary** The authors propose an elegant way to create an interpretable, symbolic model by regressing on the target model's function. By using Meijer G-functions they are able to learn the structure of the symbolic model via gradient descent, in contrast to prevailing genetic methods. Building on the constructed meta-model, the paper shows how these models can be seen as a unification of two branches of the interpretation literature: feature importance estimators and local additive approximations. Finally, an empirical evaluation on synthetic data and a real-world dataset is provided. The use case on cancer data is very interesting.

**Originality** The original contribution of this paper is to show a way to learn symbolic models via gradient descent rather than with genetic methods.

**Quality** The overall quality of the paper is good. Yet a larger/better empirical evaluation is needed:
* When does this approach stop working? E.g., what about inputs with a large feature dimension, or models that create rich internal feature representations?
* Which results are reached by competitors on the given breast cancer example? Is there a runtime benefit compared to genetic models?
* Cancer data result: Did you reach this result in every experiment run? What is the variance of your approach due to gradient descent?
* What is the influence of the hyper-parameters m, n, p, q, r?

**Clarity** The paper is well written and easy to read.

**Significance** The significance is medium. The original contribution is there, but it seems the application is limited (or at least it was not shown that the method works on more complex models/input data for which such a method is needed most).

**Questions**
* Genetic models can build trees of symbolic expressions, whereas your approach seems to be limited to additive meta-models. Is this right? Is this a limitation?
* SHAP also unifies the same branches of the interpretation literature. Can you please put this into context?
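The question about run-to-run variance under gradient descent can be illustrated with a toy sketch. It does not use the paper's Meijer G-function parameterization: as a stand-in assumption, the "black box" is f(x) = sin(2x) and the metamodel is the one-parameter family g(x; w) = sin(w x), fitted with an L2 loss by plain gradient descent from several hypothetical starting values.

```python
import math

# Toy data: the "black box" is f(x) = sin(2x), sampled on a grid over [-3, 3].
XS = [i / 10.0 for i in range(-30, 31)]
YS = [math.sin(2.0 * x) for x in XS]


def l2_loss(w: float) -> float:
    """Mean squared error of the toy metamodel g(x; w) = sin(w * x)."""
    return sum((math.sin(w * x) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def l2_grad(w: float) -> float:
    """Derivative of l2_loss with respect to w."""
    return sum(
        2.0 * (math.sin(w * x) - y) * x * math.cos(w * x) for x, y in zip(XS, YS)
    ) / len(XS)


def gradient_descent(w0: float, lr: float = 0.02, steps: int = 2000) -> float:
    """Plain (local) gradient descent from the initial value w0."""
    w = w0
    for _ in range(steps):
        w -= lr * l2_grad(w)
    return w


# Restart from several assumed initial values and record (start, w, loss).
STARTS = [-1.0, 0.0, 1.5, 2.1, 3.5]
RUNS = [(w0, w, l2_loss(w)) for w0 in STARTS for w in [gradient_descent(w0)]]
BEST = min(RUNS, key=lambda run: run[2])

for w0, w, loss in RUNS:
    print(f"start={w0:+.1f}  fitted w={w:+.4f}  L2 loss={loss:.6f}")
```

Even in this one-dimensional toy problem the L2 objective is non-convex: a start near w = 2 recovers the target, while starts at or below zero settle at a different stationary point with a clearly non-zero loss. Reporting the spread of final losses over restarts, as in `RUNS`, is one simple way to quantify the variance asked about above.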

Paper ID:	6036
Title:	Demystifying Black-box Models with Symbolic Metamodels
