Towards AI-driven Intelligent Decision Making in Warfare PhD

Updated: 9 months ago
Location: Cranfield, ENGLAND

In October 2015, AlphaGo became the first computer Go program to beat a human professional Go player without handicap on a full-sized 19×19 board. Despite its relatively simple rules, Go is very complex (even in comparison to chess): the number of legal board positions is approximately 2 × 10^170, which is vastly greater than the number of atoms in the universe.

The artificial intelligence of the original AlphaGo was built through deep learning, with extensive training on both human and computer play. After 4 years and three evolutions, the current variant, AlphaZero, which was trained solely via self-play (i.e. reinforcement learning), was implementing game strategies never previously seen in Go play while generalizing its playing capability to other games such as chess and shogi. Within 24 hours of training, AlphaZero had achieved a superhuman level of play in these three games by defeating world-champion programs, including its predecessor AlphaGo Zero.

In that respect, Go bears considerable resemblance to the complex, cascaded decision-making scenarios involved in interwoven attack and defence patterns across a set of airborne, naval, ground and/or space assets. Specifically, progressive intelligent decision-making leads to tree-based planning and search/optimization algorithms. However, in real-world problems the dynamics governing the conflict environment are often complex and unknown. By combining a tree-based search with a learned model (as demonstrated by the MuZero algorithm (2019)), one can achieve generalisation without any prior knowledge of the underlying dynamics and environment, using the learned model to iteratively predict the outcomes associated with the action-selection policy.
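As a rough illustration of planning with a learned model, the sketch below fits a tabular model from logged transitions and then runs a depth-limited tree search over that model alone. It is a toy under stated assumptions, not MuZero itself: MuZero uses neural networks, a learned latent state and Monte Carlo tree search, whereas this example swaps in a lookup table and exhaustive lookahead. The five-cell environment and the names true_step, learn_model and plan are hypothetical and chosen only for illustration.

```python
# Hidden "true" environment (hypothetical): a line of cells 0..4.
# Entering cell 4 pays a reward of 1. The planner never calls this directly;
# it only sees the logged transitions produced from it.
def true_step(state, action):
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 and state != 4 else 0.0
    return next_state, reward

ACTIONS = (-1, +1)  # move left or right

def learn_model(transitions):
    """Build a tabular model {(state, action): (next_state, reward)} from logged data."""
    model = {}
    for state, action, next_state, reward in transitions:
        model[(state, action)] = (next_state, reward)
    return model

def plan(model, state, depth, discount=0.9):
    """Depth-limited tree search over the learned model; returns (value, best_action)."""
    if depth == 0:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for action in ACTIONS:
        if (state, action) not in model:
            continue  # the model has no prediction for this branch
        next_state, reward = model[(state, action)]
        future_value, _ = plan(model, next_state, depth - 1, discount)
        value = reward + discount * future_value
        if value > best_value:
            best_value, best_action = value, action
    if best_action is None:
        return 0.0, None  # nothing known about this state
    return best_value, best_action

# Log one transition per (state, action) pair, then plan using the model only.
log = [(s, a, *true_step(s, a)) for s in range(5) for a in ACTIONS]
model = learn_model(log)
value, action = plan(model, state=2, depth=3)
print(action, round(value, 3))
```

Starting from cell 2 with a lookahead of three steps, the search selects the move towards cell 4 because the learned model predicts the reward two steps ahead. In a realistic setting the table would be replaced by a model trained from noisy, partial observations, and the exhaustive lookahead by a sampling-based search.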

The proposed research aims to generalize this AI-driven intelligent decision-making to warfare, which is further complicated by the facts that a) the number of decision-makers is high and the decision-making can exhibit decentralized behaviour, b) the number of elements/assets involved is not fixed but dynamic, and c) the actions taken by the parties are not always apparent and the observations can be erroneous (because of sensing errors, deception, jamming or spoofing). Significant progress beyond the state of the art is therefore needed on all these critical frontiers. This study will address the following areas of research:

  • A review of existing research on optimization-based, game-theory-based and AI-driven intelligent decision-making methods, including decision trees, discrete dynamic programming, game-theoretic programming, imitation learning, deep learning and reinforcement learning.
  • Integration of computational hardware and development of CONOPS, models and interfaces for running machine learning algorithms using a fast-time computer-generated forces simulation.
  • Generalization of tree-based decision algorithms and reinforcement learning with learned models across multiple decision-makers, stochastic approximation/optimization/search methods, creation and parametrized learning/optimization of pre-structured approximate attack and defence tactics.
  • Dynamic reconfiguration of reinforcement learning and learned models with inclusion of new elements and decision-makers, model-predictive decision-making with reactive planning of time horizon and elements/assets, surrogate models and digital twins, transfer learning for cascaded decision-making.
  • Embedding of hidden information and misinformation into the game-theoretic formulation of reinforcement learning, quantification/estimation/learning of hidden information and misinformation through an iterative action-selection policy.

This research is expected to include theoretical analysis, modelling, and implementation in computational and synthetic simulated environments, in association with BAE Systems. The student is asked to consider the FCAS-TEMPEST-Loyal Wingman-End-effectors as an area of particular interest. The results of this work are expected to help build credibility with users and customers for the deployment of such AI-driven Intelligent Decision-Support Systems in the conflicts of the near future.

Cranfield is an exclusively postgraduate university that is a global leader for education and transformational research in technology and management. This PhD will be hosted by the Centre for Autonomous and Cyber-Physical Systems, one of the world’s largest centres of postgraduate education and research, with over 200 MSc and PhD students. The Centre is also hosting the UK’s EPSRC Trustworthy Autonomous Systems: Security Node with Lancaster University.

You will be encouraged and supported in publishing your own work in high-quality peer-reviewed journals. You will also have opportunities and support to present your work at relevant UK and international conferences.

You will gain knowledge of the technologies in the related disciplines, experience of algorithm development procedures in autonomy and AI, and skills in modelling, embedded programming, synthetics and simulation.


