Human activity Monitoring and Pose Estimation under partial Occlusion Understanding human body motion is a fundamental capability in computer vision, with broad applications in robotics, healthcare, and activity monitoring. However, correctly detecting poses remains challenging due to unavoidable factors such as occlusion, depth ambiguity, and motion uncertainty. PIs: Dr. Francisco Cruz, Dr. Eduardo Benitez Sandoval, Prof. Erik Meijering Associates: Marsha Mariya Kappan
Visual Language Agents for Safe and Adaptive Autonomous UAV Operation This project develops vision language agents that govern drone behaviour in safety critical regulated environments, combining visual perception, language grounded reasoning, and cross episode learning to enable safe and adaptive autonomous operation from simulation to real world deployment. PIs: Dr. Francisco Cruz, Dr. Imran Razzak, Dr. Mohammad Deghat Associates: Abdulrahman Althobaiti
Applied Spiking Architectures: From Dynamic Visual Sensing to Transparent Physical Actuation Developing an end-to-end, biologically inspired computing stack that pairs event-based vision with explainable Spiking Neural Networks (SNNs) to enable robust, power-efficient, and transparent perception for physical autonomous systems operating in dynamic real-world environments. PIs: Dr. Francisco Cruz, A/Prof. Leo Wu Associates: Oltan Sevinc
Emotional Sensitivity in Human-Robot Interaction (HRI) for Explainability upon Decision-Making Reasoning In social robotics, it is vital for agents to be responsive to both the surrounding physical environment and the people they interact with. Failure to adapt to continuously changing environments can negatively impact user trust and acceptance. Furthermore, increasing emphasis is placed on human alignment, where human competence is prioritized over AI intelligence. This includes explaining decision-making processes, anticipating others' needs, and understanding implicit emotional cues. Widespread adoption requires AI systems to communicate, collaborate, learn, and adapt to the people they interact with while aligning with human values. PIs: Dr. Francisco Cruz, Prof. Flora Salim, Dr. Benjamin Tag Associates: Chris Lee
Spatiotemporal Audio-Visual Robot Learning The instinctive spatial audio-visual learning ability of humans and some animals has long become a source of inspiration for embodied AI and robotics implementation. However, the bio-inspired adoption by artificial agents introduces major theoretical and technical challenges. Processing multiple data streams with diverse spatiotemporal representations, such as spatial audio-visual data, is indeed beyond the capability of conventional Machine Learning and unimodal Deep Learning methods. These challenges expose the unseen barriers to multimodal data integration, leading us to a novel perspective of Multimodal Machine Learning to facilitate spatial audio-visual interactions. PIs: Dr. Francisco Cruz, A/Prof. Vidhyasaharan Sethu, Dr. Shadi Abpeikar Associates: Hadha Afrisal
Adaptive Weighting in Online Ensemble Reinforcement Learning This project investigates ensemble reinforcement learning methods where multiple reinforcement learning algorithms are combined within a single adaptive agent. The agent dynamically adjusts the contribution of each algorithm during decision-making based on the current state and environment conditions, with the goal of improving robustness, adaptability, and online learning performance across diverse environments. PIs: Dr. Francisco Cruz, Dr. Eduardo Benitez Sandoval, Prof. Richard Dazeley, Prof. Peter Vamplew Associates: Charlie Stinson
Adaptive Multi-Source Fusion for Interactive Reinforcement Learning This project investigates how reinforcement learning agents can learn effectively from multiple imperfect sources of guidance. Rather than assuming that advice sources are uniformly reliable, the project studies how the structure of their errors affects learning, and develops adaptive fusion methods that decide when and how to use different sources of advice. PIs: Dr. Francisco Cruz, Dr. Pamela Carreno-Medrano, Prof. Claude Sammut Associates: Maher Mesto
Robust Skill Chaining for Hierarchical Reinforcement Learning Robotic manipulation has advanced rapidly in recent years, yet moving from short, isolated behaviours to complex, long-horizon tasks that involve multiple contact-rich primitives remains a significant challenge. Hierarchical Reinforcement Learning (HRL) offers a promising approach by breaking down such tasks into reusable skills, each operating over a shorter time scale. A high-level controller learns to sequence these skills during execution, enabling more sophisticated behaviour. This project investigates how primitive skills can be acquired, adapted and reused to facilitate the execution of long-horizon manipulation tasks, potentially improving efficiency and robustness in robotic systems. PIs: Dr. Francisco Cruz, A/Prof. Gelareh Mohammadi Associates: Madeleine Nouri
MERCI/PERCY: Real-Robot Multimodal Affective Dialogue Dataset and Benchmark PERCY is a real-time conversational system deployed on the ARI social robot, and MERCI is the multimodal dataset collected from live human-robot interactions using this system. The project investigates how visual and textual affect signals behave in real embodied dialogue and supports affect-aware multimedia indexing and benchmark evaluation. PIs: Dr. Francisco Cruz, Dr. Imran Razzak Associates: Zhijin Meng, Mohammed Althubyani
Teaching With Interactive Reinforcement Learning (TWIRL) Reinforcement Learning has been a very useful approach, but often works slowly, because of large-scale exploration. A variant of RL, that tries to improve speed of convergence, and that has been rarely used until now is Interactive Reinforcement Learning (IRL), that is, RL is supported by a human trainer who gives some directions on how to tackle the problem. PIs: Prof. Dr. Stefan Wermter Associates: Dr. Francisco Cruz, Dr. Sven Magg, Dr. Cornelius Weber Date: July 2013 - July 2017