摘要: |
The emerging concepts of Urban Air Mobility
(UAM) and Advanced Air Mobility (AAM) open a new paradigm
for urban air transportation. A major challenge is that these
new aerial vehicles will quickly saturate the already crowded
aviation spectrum, which is an essential resource to ensure
reliable communications for safe operations. In this paper, we
consider an air transportation system where multiple aerial
vehicles transport passengers or cargo from their respective
sources to their destinations along pre-defined paths.
Throughout each flight, a minimum communication Quality of Service
(QoS) requirement must be met to ensure flight safety. Our
objective is to minimize the average mission completion time by
jointly optimizing the velocity selection and spectrum allocation
for all aerial vehicles. We formulate the optimization problem
as a multi-stage Markov Decision Process (MDP) where the
optimization variables are coupled. We propose a multi-agent Deep
Reinforcement Learning (DRL) based solution in which the
Value Decomposition Networks (VDN) algorithm is used to
select discrete actions. Additionally, we propose a heuristic greedy
algorithm as a baseline solution. Simulation results show that
our learning-based solution outperforms both the heuristic greedy
algorithm and an Orthogonal Multiple Access (OMA)
solution in minimizing the mission completion time. |