Publications

I have published on a wide range of topics in several different fields, all in the hope of one day conquering all the sub-fields of cognitive science. I do this not only for the thrill of dilettantism, but also because I harbour ambitions to understand cognition in all its forms: human, animal, and machine. One day I will get to bacteria, plants, and other life-forms too!

Since I am on the job market and machine learning/AI is a best seller right now, we will start with my machine learning contributions:

Testing the Limits of Fine-Tuning for Improving Visual Cognition in Vision Language Models (2025). Accepted at ICML. We find that the intuitive physics capabilities of fine-tuned VLMs are extraordinarily brittle and not very human-like.
PredictaBoard: Benchmarking LLM score predictability (2025). Accepted in ACL Findings. We propose a novel benchmark for quantifying and comparing the predictability of LLM responses.
metabench - A Sparse Benchmark of Reasoning and Knowledge in Large Language Models (2025). Accepted at ICLR. We use Item Response Theory to distill six common benchmarks down to 3% of their original size, while preserving predictive accuracy.
Bringing comparative cognition approaches to AI systems (2025). Accepted in Nature Reviews Psychology. We propose that AI Evaluation has a lot to learn from comparative cognition in terms of experimental design.
A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment (2025). Accepted at PRICAI 2025. We find that agentic LLMs aren’t very good at solving physical reasoning problems that are straightforward for humans and other animals.
The Animal-AI Environment: A Virtual Laboratory For Comparative Cognition and Artificial Intelligence Research (2024). Accepted in Behavior Research Methods. We present the latest and greatest version of the Animal-AI Environment, a tool for uniting comparative cognition and AI.
Inferring capabilities from task performance with Bayesian triangulation. Under review. We propose a novel strategy for statistically inferring latent capabilities in machine learning models using instance-level behavioural data.
Evaluating Object Permanence in Embodied Agents using the Animal-AI Environment (2022). Accepted at EBeM Workshop @ IJCAI. We present a benchmark for testing object permanence in agentic systems.
Direct Human-AI Comparison in the Animal-AI Environment (2022). Accepted in Frontiers in Psychology. We directly compare children and deep reinforcement learning agents on cognitively-inspired tasks.

In cognitive psychology, I played a small part in:

A foundation model to predict and capture human cognition (2025). Accepted in Nature. We build the most predictive “model” of human decision-making ever seen, by fine-tuning a large language model.

In (the philosophy of) comparative cognition, I have published a number of thrillers, including:

Analogies And The Associative-Cognitive Distinction In Comparative Psychology (2025). Accepted in Biology & Philosophy. I provide an account of the distinction between associative learning and cognition in terms of hypothesis generation and analogical reasoning. (Pre-Print)
Cognitive Simplicity As An Idealization (forthcoming). Accepted in Animal Behavior & Cognition. I argue that considering some explanations for animal behaviour as being simpler than others is a useful idealisation for hypothesis generation, even if this practice is based on false assumptions.
Morgan’s Canon and the Associative-Cognitive Distinction Today: A Survey of Practitioners (2024). Accepted in The Journal of Comparative Psychology. We surveyed 220 comparative psychologists and asked them about the methodological challenges that they face when studying non-human animal cognition.
The Future Is Computational Comparative Cognition (2024). Accepted in CCBR. We argue that comparative cognition needs computational modelling.
Replications, comparisons, sampling and the problem of representativeness in animal cognition research (2021). We tackle the problems of generalizability and representativeness in empirical animal cognition research.

In evolutionary biology, I have just one contribution from my time as a post-doc on the Major Transitions in the Evolution of Cognition project:

Exploring Major Transitions in the Evolution of Biological Cognition With Artificial Neural Networks. Under review. We use artificial grammar learning and artificial neural networks to empirically evaluate proposed major transitions in the evolution of biological cognition.

In linguistics, I continue to dream of making a sizeable contribution. For now, I have contributed to:

Metatheoretical linguistics: A philosopher’s guide (2024). Accepted in Biolinguistics. We review Ryan Nefdt’s The Philosophy of Theoretical Linguistics (2024), giving a balanced account of its achievements and its shortcomings over the course of 15 pages.
Oops, I Did It Again: A typology of phonological iterativity (2021). Accepted at the Penn Linguistics Conference. We present several examples and a formal analysis of iterative phonological rule applications in the phonological systems of the world’s languages. I really only contributed a small amount; it was nice that the co-authors had me on board!

If you need help getting to sleep in the evenings, you can read: