PhD Position F/M Grounding Artificial Intelligence in the origins of human behavior

Location: Bordeaux, AQUITAINE
Deadline: 02 Nov 2021

This project will address the following research question: can the acquisition of complex behaviours in artificial agents be improved by modelling ecological conditions that may have played a role in human evolution? We will address it from the perspective of computer simulation, using state-of-the-art methods in machine learning and artificial life.

Deep Reinforcement Learning algorithms (Deep-RL; Mnih et al., 2015) allow agents with perception and action capabilities to learn complex behaviours from experience in order to maximize long-term rewards. Multi-Agent Reinforcement Learning (MARL; Leibo et al., 2017; Littman, 1994) applies the mechanisms of RL to multi-agent environments with various cooperation/competition pressures. Meta Reinforcement Learning (Finn et al., 2019; J. X. Wang et al., 2017) relies on a bilevel optimization procedure: an inner reinforcement-learning loop (occurring during the lifetime of an agent, analogous to developmental learning) is nested within an outer loop that optimizes the hyperparameters of the inner loop (occurring across successive generations of agents, analogous to evolution). Finally, Neuroevolution (D'Ambrosio et al., 2014; Najarro & Risi, 2021) uses methods from evolutionary computation to evolve the parameters of artificial neural networks (e.g. their weights, structure or learning rules).

We will benchmark these frameworks in simulated environments that seek to reproduce certain ecological conditions that may have played a role in human evolution (e.g. Maslin et al., 2015) and study how such systems can improve the acquisition of complex skills. We will consider several hypotheses and behaviours from the Human Behavioural Ecology (HBE) literature (Borgerhoff Mulder & Schacht, 2001) related to the emergence of complex tool use, shared resource management, as well as cooperation, communication and culture (see e.g. Antón et al., 2014 for existing hypotheses on these topics, as well as our recent position paper, Nisioti & Moulin-Frier, 2020).
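To make the bilevel structure of Meta Reinforcement Learning concrete, the following is a minimal sketch on a hypothetical toy problem (a two-armed bandit, not part of the project): an outer "evolutionary" loop searches over a hyperparameter of the inner loop (here, its learning rate), while each inner loop is one agent's lifetime of reward-driven learning. All names and parameter values are illustrative assumptions.

```python
import random

def inner_rl_loop(learning_rate, lifetime=200, seed=0):
    """Inner loop (developmental learning): one agent's lifetime of
    epsilon-greedy learning on a two-armed bandit."""
    rng = random.Random(seed)
    true_means = [0.3, 0.7]          # arm 1 pays off more often
    q = [0.0, 0.0]                   # action-value estimates
    total_reward = 0.0
    for _ in range(lifetime):
        if rng.random() < 0.1:                 # explore
            a = rng.randrange(2)
        else:                                  # exploit current estimates
            a = 0 if q[0] >= q[1] else 1
        r = 1.0 if rng.random() < true_means[a] else 0.0
        q[a] += learning_rate * (r - q[a])     # incremental value update
        total_reward += r
    return total_reward

def outer_evolution_loop(generations=20, pop_size=8, seed=1):
    """Outer loop (evolution): evolve the inner loop's learning rate
    across successive generations, using lifetime reward as fitness."""
    rng = random.Random(seed)
    population = [rng.uniform(0.01, 1.0) for _ in range(pop_size)]
    for g in range(generations):
        scored = sorted(population,
                        key=lambda lr: inner_rl_loop(lr, seed=g),
                        reverse=True)
        elite = scored[: pop_size // 2]
        # next generation: two mutated copies of each elite individual
        population = [max(1e-3, min(1.0, lr + rng.gauss(0, 0.05)))
                      for lr in elite for _ in range(2)]
    return max(population, key=lambda lr: inner_rl_loop(lr, seed=999))

best_lr = outer_evolution_loop()
```

The same nesting generalizes directly to the frameworks above: the inner loop becomes a Deep-RL or MARL training run, and the outer loop an evolutionary search over learning rules, network structures, or environment parameters.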

Antón, S. C., Potts, R., & Aiello, L. C. (2014). Evolution of early Homo: An integrated biological perspective. Science, 345(6192), 1236828. https://doi.org/10.1126/science.1236828

Borgerhoff Mulder, M., & Schacht, R. (2001). Human behavioural ecology. E LS.

D’Ambrosio, D. B., Gauci, J., & Stanley, K. O. (2014). HyperNEAT: The First Five Years. In T. Kowaliw, N. Bredeche, & R. Doursat (Eds.), Growing Adaptive Machines: Combining Development and Learning in Artificial Neural Networks (pp. 159–185). Springer. https://doi.org/10.1007/978-3-642-55337-0_5

Finn, C., Rajeswaran, A., Kakade, S., & Levine, S. (2019). Online Meta-Learning. ArXiv:1902.08438 [Cs, Stat]. http://arxiv.org/abs/1902.08438

Leibo, J. Z., Zambaldi, V., Lanctot, M., Marecki, J., & Graepel, T. (2017). Multi-agent Reinforcement Learning in Sequential Social Dilemmas. Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 464–473.

Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. Machine Learning Proceedings 1994, 157–163. https://doi.org/10.1016/B978-1-55860-335-6.50027-1

Maslin, M. A., Shultz, S., & Trauth, M. H. (2015). A synthesis of the theories and concepts of early human evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 370(1663), 20140064. https://doi.org/10.1098/rstb.2014.0064

Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., … Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. https://doi.org/10.1038/nature14236

Najarro, E., & Risi, S. (2021). Meta-Learning through Hebbian Plasticity in Random Networks. ArXiv:2007.02686 [Cs]. http://arxiv.org/abs/2007.02686

Nisioti, E., & Moulin-Frier, C. (2020). Grounding Artificial Intelligence in the Origins of Human Behavior. ArXiv:2012.08564 [Cs]. http://arxiv.org/abs/2012.08564

Portelas, R., Colas, C., Hofmann, K., & Oudeyer, P.-Y. (2019, October). Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments. CoRL 2019 - Conference on Robot Learning.

Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., Blundell, C., Kumaran, D., & Botvinick, M. (2017). Learning to reinforcement learn. ArXiv:1611.05763 [Cs, Stat]. http://arxiv.org/abs/1611.05763

Wang, R., Lehman, J., Clune, J., & Stanley, K. O. (2019). Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions. ArXiv:1901.01753 [Cs].
