Computer Science Doctoral Dissertation Proposal Defense: Yasin Findik 10/28

10/16/2024

By Yasin Findik

The Kennedy College of Sciences, Department of Computer Science, invites you to attend a doctoral dissertation proposal defense by Yasin Findik on “Advanced Cooperation Algorithms in MARL: From Discrete to Continuous”

Candidate: Yasin Findik

Date: Monday October 28, 2024

Time: 10 a.m. to 11:30 a.m. (ET)

Location: DAN 309

Committee members:

Reza Azadeh (Advisor), Miner School of Computer & Information Sciences, University of Massachusetts Lowell

Tingjian Ge (member), Miner School of Computer & Information Sciences, University of Massachusetts Lowell

Matteo Leonetti (member), Computer Science, King’s College London

Hadi Amiri (member), Miner School of Computer & Information Sciences, University of Massachusetts Lowell

Title: Advanced Cooperation Algorithms in MARL: From Discrete to Continuous

Abstract:

Rapid advances in intelligent systems have generated considerable interest in reinforcement learning (RL) due to its potential to enable autonomous agents to learn optimal behaviors through interactions with their environment. As the complexity of these environments and tasks increases, the need for agents to operate both independently and collaboratively has become more apparent. This led to the emergence of multi-agent reinforcement learning (MARL), a field focused on developing frameworks that enable multiple agents to cooperate, compete, or coexist to achieve individual or collective goals. MARL has become increasingly important in real-world applications such as robotics, autonomous vehicles, and finance, where agents must make decisions based on their own and others’ actions.

This thesis explores the challenges and opportunities presented by MARL in real-world applications, where agents must operate under partial observability, non-stationarity, and the need for coordinated decision-making. Traditional single-agent reinforcement learning (SARL) methods, while effective in isolated environments, fall short in multi-agent settings because of the additional complexity introduced by inter-agent interactions. These complexities call for more sophisticated algorithms that can handle decentralized decision-making, improve cooperation, and ensure robustness. To address these issues, this thesis presents novel methods to improve the efficiency and effectiveness of MARL systems in both discrete and continuous action domains, leveraging recent advances in deep learning and RL.

First, the thesis introduces a new cooperation strategy based on relational awareness (RA), which allows agents to work together more effectively by incorporating awareness of the relationships between agents. We evaluate the effectiveness of the proposed approach in fifteen experiments across two different discrete environments. The results demonstrate that the proposed algorithm can influence and shape team behavior, guide cooperation strategies, and accelerate agent learning. Our approach is therefore promising for multi-agent systems, especially when agents have diverse properties.
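To make the idea of relationship-aware cooperation concrete, the minimal Python sketch below shows one way a learning signal can be weighted by inter-agent relationships. The weight matrix W and the mixing rule are illustrative assumptions, not the exact formulation defended in the thesis.

# Illustrative sketch only: mix each agent's reward with its teammates'
# rewards according to a relational weight matrix W, where W[i][j] encodes
# how much agent i values agent j's outcome.
import numpy as np

def relational_rewards(rewards: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Return relationship-weighted rewards (rows of W are normalized to sum to 1)."""
    W = W / W.sum(axis=1, keepdims=True)   # each agent's weights sum to 1
    return W @ rewards                     # agent i optimizes sum_j W[i, j] * r_j

# Example: agent 0 is mostly self-interested, agent 1 is fully team-oriented.
W = np.array([[0.8, 0.2],
              [0.5, 0.5]])
r = np.array([1.0, 0.0])                   # raw environment rewards this step
print(relational_rewards(r, W))            # shaped rewards used for learning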

Another key contribution is the development of Mixed Q-Functionals (MQF), a value-based algorithm designed for continuous action domains, which significantly outperforms existing methods in terms of performance and promotes collaboration between agents. We evaluate the effectiveness of our algorithm in six cooperative multi-agent scenarios within continuous environments. Our empirical results reveal that MQF outperforms four variants of the deep deterministic policy gradient through rapid action evaluation and increased sample efficiency.
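As a rough illustration of how a value-based method can act in a continuous action space, the sketch below scores a batch of sampled candidate actions with a Q-network in a single forward pass and picks the best one. The network, sampling range, and candidate count are assumptions made for illustration; this does not reproduce the Mixed Q-Functionals architecture itself.

# Minimal sketch of value-based action selection in a continuous action space.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Score a (state, action) pair; higher is better.
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)

def select_action(q: QNetwork, state: torch.Tensor, action_dim: int,
                  n_candidates: int = 256) -> torch.Tensor:
    """Sample candidate actions in [-1, 1], evaluate them in one batch, take the argmax."""
    candidates = torch.rand(n_candidates, action_dim) * 2 - 1
    states = state.expand(n_candidates, -1)
    with torch.no_grad():
        values = q(states, candidates)
    return candidates[values.argmax()]

# Usage: one agent with a 4-dimensional observation and a 2-dimensional action.
q_net = QNetwork(state_dim=4, action_dim=2)
obs = torch.zeros(1, 4)
print(select_action(q_net, obs, action_dim=2))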

Additionally, this thesis presents the Collaborative Adaptation (CA) framework, which leverages relational networks to improve the resilience of multi-agent systems in scenarios involving unexpected failures. Empirical evaluations in discrete and continuous environments demonstrate that, when an unanticipated malfunction occurs, state-of-the-art algorithms often converge to suboptimal solutions, whereas the proposed CA framework mitigates the impact and recovers more efficiently, while also offering valuable insights into the practical deployment of MARL algorithms in dynamic, real-world applications.
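The hedged sketch below illustrates the adaptation idea only: when an agent is flagged as failed, the remaining agents redistribute their relational weights away from it so their learning signal reflects the new team composition. The detection and re-weighting rules here are assumptions for illustration, not the CA framework itself.

# Illustrative sketch: adapt a relational weight matrix after agent failures.
import numpy as np

def adapt_relational_weights(W: np.ndarray, failed: list[int]) -> np.ndarray:
    """Zero out weights toward failed agents and renormalize the healthy agents' rows."""
    W = W.copy()
    W[:, failed] = 0.0
    healthy = [i for i in range(len(W)) if i not in failed]
    W[healthy] /= W[healthy].sum(axis=1, keepdims=True)
    return W  # failed agents' own rows are left as-is; they no longer act

W = np.array([[0.6, 0.2, 0.2],
              [0.2, 0.6, 0.2],
              [0.2, 0.2, 0.6]])
print(adapt_relational_weights(W, failed=[2]))  # healthy agents ignore agent 2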