Agentic RL Researcher – Distributed Computing
Confidential
Posted: March 11, 2026
Quick Summary
Agentic and multi-agent reinforcement learning research on distributed compute clusters at Huawei Canada, within a lab that builds cloud serverless data infrastructure and networking technologies.
Job Description
Huawei Canada has an immediate permanent opening for a Researcher.
About the team:
The Distributed Data Storage and Management Lab leads research in distributed data systems, aiming to develop next-generation cloud serverless products that encompass core infrastructure and databases. This lab addresses various data challenges, including cloud-native disaggregated databases, pay-by-query user models, and optimizing low-level data transfers via RDMA. Teams within this lab create advanced cloud serverless data infrastructure and implement cutting-edge networking technologies for Huawei's global AI infrastructure.
About the job:
• Design and develop advanced Agentic Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL) algorithms for cooperative, competitive, and mixed-agent environments, including centralized training with decentralized execution (CTDE), decentralized learning, and hierarchical agent systems.
• Build scalable simulation and training platforms for large-scale agent systems, supporting self-play, population-based training, curriculum learning, and emergent behavior analysis.
• Optimize multi-agent learning performance on distributed compute clusters, improving sample efficiency, credit assignment, agent coordination, communication learning, and training stability.
• Research and prototype new approaches for multi-agent intelligence, including communication protocols, credit assignment, game-theoretic learning dynamics, meta-learning, and adaptive agent populations.
• Translate cutting-edge research in agentic AI and MARL into production-ready systems for real-world or high-fidelity simulated environments.
• Develop benchmarking frameworks and evaluation metrics for agent coordination, robustness, scalability, and safety.
• Collaborate with research, infrastructure, and product teams to deploy scalable agentic learning systems in real-world applications.
• Contribute to technical leadership and innovation through publications, patents, open-source contributions, and conference presentations.
The total target annual compensation for this position ranges from $106,000 to $156,000 depending on education, experience, and demonstrated expertise.