This paper tests the following hypothesis about features learned by a deep network trained on the ImageNet dataset: *Object features and anticausal features are closely related. Context features and causal features are not necessarily related.*

First, some definitions. Let $X$ be a visual feature (i.e. the value of a hidden unit) and $Y$ be information about a label (e.g. the log-odds of the probability of different objects appearing in the image). A causal feature is one for which the causal direction is $X \rightarrow Y$. An anticausal feature is the opposite case, $X \leftarrow Y$.

As for object features, in this paper they are features whose value tends to change a lot between the complete original image and an image in which the regions *falling inside* object bounding boxes have been blacked out (see Figure 4). Context features are the opposite, i.e. their values change a lot when the regions *outside* object bounding boxes are blacked out. Section 4.2.1 describes how "object scores" and "context scores" are computed from this description, to quantify the extent to which a feature is an "object feature" or a "context feature".

Thus, the paper investigates whether 1) object features have an anticausal relationship with object appearance information (i.e. whether the object feature's value seems to be caused by the presence of the object) and whether 2) context features are not clearly causal or anticausal.

To perform this investigation, the paper first proposes a generic neural network model (dubbed the Neural Causation Coefficient architecture, or NCC) that predicts a score for whether the relationship between an input variable $X$ and a target variable $Y$ is causal. This model is trained on datasets of $(X, Y)$ pairs, synthetically generated in such a way that we know whether $X$ caused $Y$ or the opposite.
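
To make the idea of synthetically generated, labeled cause-effect data concrete, here is a minimal sketch. It assumes a simple additive-noise model with a random cubic mechanism; these choices, and the names `sample_pair_dataset` and `make_training_set`, are illustrative assumptions of mine and not the generation procedure used in the paper.

```python
import random

def sample_pair_dataset(n_points, rng):
    """Draw one synthetic dataset of (x, y) samples with a known causal direction.

    Returns (pairs, label), where label 0 means X -> Y and label 1 means X <- Y.
    """
    # Hypothetical mechanism: a random cubic polynomial (illustrative assumption).
    a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
    noise_scale = rng.uniform(0.05, 0.3)

    # Sample the cause, then compute the effect as mechanism(cause) + noise.
    cause = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
    effect = [a * u + b * u**2 + c * u**3 + rng.gauss(0.0, noise_scale)
              for u in cause]

    # Flip a coin to decide which variable is presented as X.
    if rng.random() < 0.5:
        return list(zip(cause, effect)), 0  # X is the cause
    return list(zip(effect, cause)), 1      # X is the effect

def make_training_set(n_datasets, n_points, seed=0):
    """Build a list of (pairs, label) examples for a direction classifier."""
    rng = random.Random(seed)
    return [sample_pair_dataset(n_points, rng) for _ in range(n_datasets)]
```

Each training example here is a whole set of samples together with one direction label, which is the kind of input a model like NCC consumes.
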
The NCC architecture first embeds each individual $(X, Y)$ instance pair into a hidden representation, mean-pools these representations, and then feeds the result to fully connected layers (see Figure 3). The paper shows that the proposed NCC model achieves SOTA performance on the Tübingen dataset, a collection of real-world cause-effect observational samples.

Then, the NCC model is used to identify the features of a deep residual CNN that are most causal and most anticausal, and the average object score of each group is measured. The same is done with the context score. The finding is that, indeed, the object score is always higher for the top anticausal features than for the top causal features. For the context score, however, no such clear trend is observed (see Figure 5).

**My two cents**

I haven't been following the growing literature on machine learning for causal inference, so it was a real pleasure to read this paper and catch up a little on it. For that alone I would recommend reading this paper. The paper does a really good job of explaining the notion of *observational causal inference*. In short, it builds on the observation that if we assume IID noise on top of a causal (or anticausal) phenomenon, then causation can possibly be inferred by checking in which direction of causation the IID assumption on the noise seems to hold best (see Figure 2 for a nice illustration, where the noise is clearly IID in (a) but not in (b)).

Also, irrespective of the study of causal phenomena in images, the NCC architecture, which achieves SOTA causal prediction performance, is in itself a nice contribution.

Regarding the application to image features, one thing that is hard to wrap your head around is that, for the $Y$ variable, the study uses the log-odds at the output layer instead of the true image label.
The paper justifies this choice by highlighting that the NCC network was trained on examples where $Y$ is continuous, not discrete. On the one hand, that justification makes sense. On the other, the choice is odd, since the log-odds were in fact computed directly from the visual features, meaning that technically the values of the log-odds are directly caused by all the features (which goes against the hypothesis being tested). My best guess is that this isn't an issue only because NCC makes a causal prediction between *a single feature* and $Y$, not *from all features* to $Y$. I'd be curious to read the authors' perspective on this.

Still, this paper is certainly just scratching the surface of this topic. For instance, the paper mentions that NCC could be used to encourage the learning of causal or anticausal features, providing a new and intriguing type of regularization. This sounds like a very interesting direction for future research, which I'm looking forward to.
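
For readers who want a concrete picture of the NCC architecture summarized earlier (per-pair embedding, mean pooling, then fully connected layers), here is a minimal forward-pass sketch in plain Python. The layer sizes, the single embedding layer, the random untrained weights, and the names `init_layer`, `dense` and `ncc_forward` are all assumptions made for illustration; the actual depth and widths are given in the paper, not here.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights for one dense layer (untrained, for illustration only)."""
    weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(layer, vec, relu=True):
    """Apply one dense layer to a vector, optionally followed by a ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, vec)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

def ncc_forward(pairs, embed_layer, hidden_layer, output_layer):
    """Map a set of (x, y) samples to a single score in (0, 1)."""
    # 1) Embed every (x_j, y_j) pair independently.
    embeddings = [dense(embed_layer, [x, y]) for x, y in pairs]
    # 2) Mean-pool over samples, so the set size and ordering do not matter.
    n = len(embeddings)
    pooled = [sum(column) / n for column in zip(*embeddings)]
    # 3) Fully connected layers on the pooled representation.
    hidden = dense(hidden_layer, pooled)
    (logit,) = dense(output_layer, hidden, relu=False)
    return 1.0 / (1.0 + math.exp(-logit))
```

Because of the mean pooling, the same network can score sample sets of any size, and shuffling the samples leaves the score unchanged up to floating-point rounding. That is also why it can be applied to a single feature paired with the log-odds across many images.
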