Publications
I have published on a wide range of topics in several different fields, all in the hope of one day conquering all the sub-fields of cognitive science. I do this not only for the thrill of dilettantism, but also because I harbour ambitions to understand cognition in all its forms: human, animal, and machine. One day I will get to bacteria, plants, and other life-forms too!
Since I am on the job market and machine learning/AI is a best seller right now, we will start with my machine learning contributions:
- Testing the Limits of Fine-Tuning for Improving Visual Cognition in Vision Language Models (2025). Accepted at ICML. We find that the intuitive physics capabilities of fine-tuned VLMs are extraordinarily brittle and not very human-like.
- PredictaBoard: Benchmarking LLM score predictability (2025). Accepted in ACL Findings. We propose a novel benchmark for quantifying and comparing the predictability of LLM responses.
- metabench - A Sparse Benchmark of Reasoning and Knowledge in Large Language Models (2025). Accepted at ICLR. We use Item Response Theory to distill six common benchmarks down to 3% of their original size, while preserving predictive accuracy.
- Bringing comparative cognition approaches to AI systems (2025). Accepted in Nature Reviews Psychology. We propose that AI Evaluation has a lot to learn from comparative cognition in terms of experimental design.
- A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment (2025). Accepted at PRICAI 2025. We find that agentic LLMs aren’t very good at solving physical reasoning problems that are straightforward for humans and other animals.
- The Animal-AI Environment: A Virtual Laboratory For Comparative Cognition and Artificial Intelligence Research (2024). Accepted in Behavior Research Methods. We present the latest and greatest version of the Animal-AI Environment, a tool for uniting comparative cognition and AI.
- Inferring capabilities from task performance with Bayesian triangulation. Under review. We propose a novel strategy for statistically inferring latent capabilities in machine learning models using instance-level behavioural data.
- Evaluating Object Permanence in Embodied Agents using the Animal-AI Environment (2022). Accepted at EBeM Workshop @ IJCAI. We present a benchmark for testing object permanence in agentic systems.
- Direct Human-AI Comparison in the Animal-AI Environment (2022). Accepted in Frontiers in Psychology. We directly compare children and deep reinforcement learning agents on cognitively-inspired tasks.
In cognitive psychology, I played a small part in:
- A foundation model to predict and capture human cognition (2025). Accepted in Nature. We build the most predictive “model” of human decision-making ever seen, by fine-tuning a large language model.
In (the philosophy of) comparative cognition, I have published a number of thrillers, including:
- Analogies And The Associative-Cognitive Distinction In Comparative Psychology (forthcoming). Accepted in Biology & Philosophy. I provide an account of the distinction between associative learning and cognition in terms of hypothesis generation and analogical reasoning.
- Cognitive Simplicity As An Idealization (forthcoming). Accepted in Animal Behavior & Cognition. I argue that considering some explanations for animal behaviour as being simpler than others is a useful idealisation for hypothesis generation, even if this practice is based on false assumptions.
- Morgan’s Canon and the Associative-Cognitive Distinction Today: A Survey of Practitioners (2024). Accepted in The Journal of Comparative Psychology. We surveyed 220 comparative psychologists and asked them about the methodological challenges that they face when studying non-human animal cognition.
- The Future Is Computational Comparative Cognition (2024). Accepted in CCBR. We argue that comparative cognition needs computational modelling.
- Replications, comparisons, sampling and the problem of representativeness in animal cognition research (2021). We tackle the problems of generalizability and representativeness in empirical animal cognition research.
In linguistics, I continue to dream of making a sizeable contribution. For now, I have contributed to:
- Metatheoretical linguistics: A philosopher’s guide (2024). Accepted in Biolinguistics. We review Ryan Nefdt’s The Philosophy of Theoretical Linguistics (2024), giving a balanced account of its achievements and its shortcomings over the course of 15 pages.
- Oops, I Did It Again: A typology of phonological iterativity (2021). Accepted at the Penn Linguistics Conference. We present several examples and a formal analysis of iterative phonological rule applications in the phonological systems of the world’s languages. I really only contributed a small amount; it was nice that the co-authors had me on board!
If you need help getting to sleep in the evenings, you can read: