A2R2 Group | Autonomous Agents and Robotics Research Group

Adaptive Weighting in Online Ensemble Reinforcement Learning

This project investigates ensemble reinforcement learning methods where multiple reinforcement learning algorithms are combined within a single adaptive agent. The agent dynamically adjusts the contribution of each algorithm during decision-making based on the current state and environment conditions, with the goal of improving robustness, adaptability, and online learning performance across diverse environments.

Date: September 2024 - May 2028

Persons participating in the project:

PIs: Dr. Francisco Cruz, Dr. Eduardo Benitez Sandoval, Prof. Richard Dazeley, Prof. Peter Vamplew
Associates: Charlie Stinson
Corresponding contact: charles.stinson@unsw.edu.au

Research areas:

Reinforcement Learning
Ensemble Learning
Policy Aggregation
Online Learning
Adaptive AI Systems
Dynamic Weighting
Non-Stationary Environments
Sequential Decision Making

Description:
Traditional reinforcement learning systems typically rely on a single learning algorithm, which may perform well in some situations but poorly in others. This research explores ensemble reinforcement learning approaches that combine multiple reinforcement learning algorithms into a unified adaptive agent.

The project focuses on dynamically weighting the contribution of different algorithms during action selection. Rather than following a single fixed policy, the ensemble continuously adapts how much influence each algorithm has depending on the current state, observed performance, and changing environment conditions.
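As an illustration only, the idea of weighting algorithms during action selection can be sketched for a discrete action space. This is a minimal sketch under stated assumptions, not the project's method: the names `softmax`, `aggregate_action_probs`, `member_probs`, and `scores` are hypothetical, each member algorithm is assumed to output a probability distribution over actions, and the weights are assumed to come from a softmax over per-member performance scores.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.asarray(x, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def aggregate_action_probs(member_probs, weights):
    """Mix per-member action distributions using ensemble weights.

    member_probs: shape (n_members, n_actions), each row sums to 1.
    weights: shape (n_members,), non-negative and summing to 1.
    Returns a single distribution over actions, shape (n_actions,).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ member_probs

# Hypothetical example: three member algorithms, two actions.
member_probs = np.array([
    [0.9, 0.1],   # member 0 strongly prefers action 0
    [0.2, 0.8],   # member 1 prefers action 1
    [0.5, 0.5],   # member 2 is indifferent
])

# Hypothetical performance scores; in a state-dependent scheme these
# would be recomputed for the current state before every decision.
scores = np.array([1.0, 0.0, 0.0])

weights = softmax(scores)
mixed = aggregate_action_probs(member_probs, weights)
action = int(np.argmax(mixed))  # greedy choice from the mixed distribution
```

Because the weights sum to one and each row of `member_probs` is a distribution, the mixture is itself a valid distribution, so the agent could also sample from `mixed` instead of taking the greedy action.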

The research investigates online adaptation, policy aggregation, and ensemble learning techniques across both discrete and continuous control environments. A key objective is improving robustness and adaptability in non-stationary settings where environment dynamics may shift over time. Applications include autonomous systems, game AI, adaptive control, and sequential decision-making problems requiring robust real-time adaptation.
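To make the non-stationary setting concrete, the following toy sketch tracks each member's recent performance with an exponential recency-weighted average, so that the ensemble weights can follow a shift in the environment. Everything here is an assumption made for illustration: the functions `update_scores` and `weights_from_scores`, the step size `alpha`, the `temperature`, the per-member reward signal, and the shift at step 50 are hypothetical and do not describe the project's algorithm or results.

```python
import numpy as np

def update_scores(scores, rewards, alpha=0.1):
    """Exponential recency-weighted average: recent rewards count more."""
    return scores + alpha * (rewards - scores)

def weights_from_scores(scores, temperature=0.2):
    """Turn performance scores into ensemble weights with a softmax."""
    z = scores / temperature
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

scores = np.zeros(2)         # one running score per member algorithm
history = []

for t in range(100):
    # Hypothetical shift at t = 50: member 0 performs well before it,
    # member 1 performs well after it.
    if t < 50:
        rewards = np.array([1.0, 0.0])
    else:
        rewards = np.array([0.0, 1.0])
    scores = update_scores(scores, rewards)
    history.append(weights_from_scores(scores))

weights_before = history[49]  # weights just before the shift
weights_after = history[99]   # weights at the end of the run
```

In this toy run the weight mass moves from member 0 to member 1 after the shift. A larger `alpha` would react faster but be noisier under stochastic rewards, which is the usual trade-off when tracking a moving target.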

Selected Publications:
Stinson, C., Vamplew, P., Dazeley, R., Sandoval, E., & Cruz, F. (2026, June). Trajectory-Guided Weight Adaptation for Ensemble Reinforcement Learning. In International Joint Conference on Neural Networks (IJCNN). In press.

A2R2 Research Group
Autonomous Agents and Robotics Research
School of Computer Science and Engineering
UNSW Sydney

Contact:
f.cruz@unsw.edu.au
Room 510J, Ainsworth Building (J17)
Kensington NSW 2052, Australia