Kale-ab Tessera

I am a third-year PhD candidate in the Bayesian and Neural Systems group at the University of Edinburgh, advised by Amos Storkey, Tim Rocktäschel (UCL), and Aris Filos-Ratsikas. My research focuses on robust multi-agent systems for dynamic, open-ended environments (multi-agent RL, agentic systems). I am also affiliated with the Multi-Agent, Reinforcement, Behavior and Learning (MARBLE) interest group.

Before starting my PhD, I worked for 4.5 years in machine learning (including 2.5 years of MARL research at InstaDeep) and a further 3 years in software engineering. I am also passionate about supporting impactful technology projects in Africa and promoting diversity in the machine learning community.

For more information, you can view my resumé.

news

Dec, 2025	🌟 Attending NeurIPS 2025 in San Diego, US.
Sep, 2025	🗣️ Talk on “Algorithms and Benchmarks for Robust Multi-Agent Coordination” at the RAIL Lab, University of the Witwatersrand.
Aug, 2025	🏅 Best poster (1st place) out of 278 submissions at the Deep Learning Indaba in Kigali, Rwanda.
Aug, 2025	📅 Co-Programme Chair for the Deep Learning Indaba and Head of Practicals and Tutorials in Kigali, Rwanda.
Aug, 2025	Our reading group is back – 🤖 RL & Agents Reading Group.
Aug, 2025	🌟 Attending RLC in Edmonton, Canada.
Mar, 2025	🌟 Attending UK Multi-Agent Systems Symposium 2025 at King’s College London.
Aug, 2024	🗣️ Taught “Introduction to ML” at DLI.
Jul, 2024	🏅 Awarded a scholarship to attend the CIFAR Deep Learning and Reinforcement Learning (DLRL) Summer School in Toronto, Canada.
Jan, 2024	🗣️ Began co-hosting the UOE RL reading group (YouTube).
Sep, 2023	🎓 Started my PhD at the University of Edinburgh (UOE), through the Informatics Global PhD Scholarship.
Aug, 2023	🛠️ PC member and Practicals Chair of DLI – notebooks 2023, RL Prac.
May, 2023	🗣️ Talk on “Introduction to Deep Reinforcement Learning” at the University of Pretoria and Indaba X Ghana.
Apr, 2023	🌟 Attended ICLR in Kigali, Rwanda.
Aug, 2022	🛠️ Co-Organiser of the ML Efficiency Workshop at the DLI.
Aug, 2022	🛠️ Programme committee member and Practicals Chair of Deep Learning Indaba (DLI) – notebooks 2022, ML Prac, RL Prac.
Jun, 2022	🗣️ Taught an “Introduction to Machine Learning” course at Africa to Silicon Valley.
Mar, 2021	🤖 Joined the Multi-Agent RL research team at InstaDeep.
Dec, 2019	🌟 Attended NeurIPS in Vancouver, Canada.
Aug, 2019	🏆 Won Best Poster (1 out of 194) at the Deep Learning Indaba, sponsored by Microsoft.

selected publications

AAMAS

Diagnosing Dec-POMDP Requirements in Cooperative MARL

Kale-ab Abebe Tessera, Leonard Hinckeldey, Riccardo Zamboni, and 2 more authors

To appear in The 25th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Oral, 2026

AAMAS

Fairness over Equality: Correcting Social Incentives in Asymmetric Sequential Social Dilemmas

Alper Demir, Hüseyin Aydın, Kale-ab Abebe Tessera, and 2 more authors

To appear in The 25th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Oral, 2026

NeurIPS

HyperMARL: Adaptive Hypernetworks for Multi-Agent RL

Kale-ab Abebe Tessera, Arrasy Rahman, Amos Storkey, and 1 more author

In The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025

Adaptive cooperation in multi-agent reinforcement learning (MARL) requires policies to express homogeneous, specialised, or mixed behaviours, yet achieving this adaptivity remains a critical challenge. While parameter sharing (PS) is standard for efficient learning, it notoriously suppresses the behavioural diversity required for specialisation. This failure is largely due to cross-agent gradient interference, a problem we find is surprisingly exacerbated by the common practice of coupling agent IDs with observations. Existing remedies typically add complexity through altered objectives, manual preset diversity levels, or sequential updates – raising a fundamental question: can shared policies adapt without these intricacies? We propose a solution built on a key insight: an agent-conditioned hypernetwork can generate agent-specific parameters and decouple observation- and agent-conditioned gradients, directly countering the interference from coupling agent IDs with observations. Our resulting method, HyperMARL, avoids the complexities of prior work and empirically reduces policy gradient variance. Across diverse MARL benchmarks (22 scenarios, up to 30 agents), HyperMARL achieves performance competitive with six key baselines while preserving behavioural diversity comparable to non-parameter sharing methods, establishing it as a versatile and principled approach for adaptive MARL. The code is publicly available at https://github.com/KaleabTessera/HyperMARL.
@inproceedings{tessera2025hypermarl,
  title = {HyperMARL: Adaptive Hypernetworks for Multi-Agent {RL}},
  author = {Tessera, {Kale-ab} Abebe and Rahman, Arrasy and Storkey, Amos and Albrecht, Stefano V},
  booktitle = {The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS)},
  year = {2025},
  url = {https://openreview.net/forum?id=56CgYnf9Dr},
}

NeurIPS Workshop

Are we going MAD? Benchmarking Multi-Agent Debate between Language Models for Medical Q&A

Andries Smit, Paul Duckworth, Nathan Grinsztajn, and 3 more authors

In Deep Generative Models for Health Workshop NeurIPS 2023, Nov 2023

Recent advancements in large language models (LLMs) underscore their potential for responding to medical inquiries. However, ensuring that generative agents provide accurate and reliable answers remains an ongoing challenge. In this context, multi-agent debate (MAD) has emerged as a prominent strategy for enhancing the truthfulness of LLMs. In this work, we provide a comprehensive benchmark of MAD strategies for medical Q&A, along with open-source implementations. This sheds light on the effective utilization of various strategies including the trade-offs between cost, time, and accuracy. We build upon these insights to provide a novel debate-prompting strategy based on agent agreement that outperforms previously published strategies on medical Q&A tasks.

arXiv

Mava: a research framework for distributed multi-agent reinforcement learning

Arnu Pretorius*, Kale-ab Abebe Tessera*, Andries P Smit*, and 8 more authors

arXiv preprint arXiv:2107.01460v1, Jul 2021

Breakthrough advances in reinforcement learning (RL) research have led to a surge in the development and application of RL. To support the field and its rapid growth, several frameworks have emerged that aim to help the community more easily build effective and scalable agents. However, very few of these frameworks exclusively support multi-agent RL (MARL), an increasingly active field in itself, concerned with decentralised decision-making problems. In this work, we attempt to fill this gap by presenting Mava: a research framework specifically designed for building scalable MARL systems. Mava provides useful components, abstractions, utilities and tools for MARL and allows for simple scaling for multi-process system training and execution, while providing a high level of flexibility and composability. Mava is built on top of DeepMind’s Acme (Hoffman et al., 2020), and therefore integrates with, and greatly benefits from, a wide range of already existing single-agent RL components made available in Acme. Several MARL baseline systems have already been implemented in Mava. These implementations serve as examples showcasing Mava’s reusable features, such as interchangeable system architectures, communication and mixing modules. Furthermore, these implementations allow existing MARL algorithms to be easily reproduced and extended. We provide experimental results for these implementations on a wide range of multi-agent environments and highlight the benefits of distributed system training.