An interesting category of machine learning papers - to which this paper belongs - are papers which use learning systems as a way to explore the incentive structures of problems that are difficult to intuitively reason about the equilibrium properties of. In this paper, the authors are trying to better understand how different dynamics of a cooperative communication game between agents, where the speaking agent is trying to describe an object such that the listening agent picks the one the speaker is being shown, influence the communication protocol (or, to slightly anthropomorphize, the language) that the agents end up using. In particular, the authors experiment with what happens when the listening agent is frequently replaced during training with a untrained listener who has no prior experience with the agent. The idea of this experiment is that if the speaker is in a scenario where listeners need to frequently "re-learn" the mapping between communication symbols and objects, this will provide an incentive for that mapping to be easier to quickly learn. https://i.imgur.com/8csqWsY.png The metric of ease of learning that the paper focuses on is "topographic similarity", which is a measure of how compositional the communication protocol is. The objects they're working with have two properties, and the agents use a pair of two discrete symbols (two letters) to communicate about them. A perfectly compositional language would use one of the symbols to represent each of the properties. To mathematically measure this property, the authors calculate (cosine) similarity between the two objects property vectors, and the (edit) distance between the two objects descriptions under the emergent language, and calculate the correlation between these quantities. In this experimental setup, if a language is perfectly compositional, the correlation will be perfect, because every time a property is the same, the same symbol will be used, so two objects that share that property will always share that symbol in their linguistic representation. https://i.imgur.com/t5VxEoX.png The premise and the experimental setup of this paper are interesting, but I found the experimental results difficult to gain intuition and confidence from. The authors do show that, in a regime where listeners are reset, topographic similarity rises from a beginning-of-training value of .54 to an end of training value of .59, whereas in the baseline, no-reset regime, the value drops to .51. So there definitely is some amount of support for their claim that listener resets lead to higher compositionality. But given that their central quality is just a correlation between similarities, it's hard to gain intuition for whether the difference is a meaningful. It doesn't naively seem particularly dramatic, and it's hard to tell otherwise without more references for how topographic similarity would change under a wider range of different training scenarios.