Nguyen Sao Mai

NGUYEN Sao Mai

PhD in Cognitive Developmental Robotics

Flowers Team
INRIA Bordeaux Sud-Ouest
351 Cours de la Libération
33405 Talence Cedex
France

nguyensmai [at] gmail [dot] com

Research Interests

Keywords: Cognitive Developmental Robotics, Artificial Intelligence, Cognitive Science, Child Development, Life-Long Learning, Open-Ended Learning, Intrinsic Motivation, Imitation Learning, Interactive Learning, Active Learning.

Interested in robotics, artificial intelligence and cognitive science, I try to bridge these three fields in my work, which led me to cognitive developmental robotics. I have a special interest in linking research in machine learning, robotic embodiment, child developmental psychology and neuroscience. I also explore ways to make robots adapt to their physical and social environments.
This is why I strive to answer the questions: Can a robot learn like a child? How do children learn? How do human beings or other species learn? Can a robot learn through social interaction with humans?

Current Research

Keywords: Hierarchical Learning, Intrinsic Motivation, Goal-oriented Exploration, Data-collection Strategy, Active Learning, Interactive Learning, Learning by Demonstration, Programming by Demonstration, Imitation Learning, Mimicking, Emulation.

My work in the field of developmental cognitive robotics aims to devise a new domain bridging reinforcement learning and imitation learning, with a model of intrinsic motivation that lets learning agents learn multiple tasks, including sequential tasks, with guidance from tutors. My main contribution has been to propose a common formulation of intrinsic motivation, based on empirical progress, that lets a learning agent automatically choose its learning curriculum by actively choosing its learning strategy for simple or sequential tasks: which task to learn, whether to explore autonomously or to imitate, whether to use low-level actions or task decomposition, and which of several tutors to ask. The originality is to design a learner that does not only benefit passively from data provided by tutors, but also actively chooses when to request tutoring, what to ask and whom to ask. The learner is thus more robust to the quality of the tutoring and learns faster with fewer demonstrations.

Hierarchical Reinforcement Learning

For open-ended learning of multiple tasks, learning agents need to learn tasks of growing complexity. I hypothesize that this requires transferring knowledge between tasks through a hierarchical representation of tasks. Our research has led to several hierarchical reinforcement learning algorithms using intrinsic motivation. See publications.

Activity Recognition in Smart Homes

How can we classify the logs from ambient sensors in smart homes into activities of daily living? The data are event-driven signals, often binary values, which on their own do not give significant information about the current state of the house. We deal with these non-Markovian, irregularly sampled time series by contextualising each signal with language models. See publications.
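As a rough illustration of what "contextualising each signal" could start from, the sketch below turns an irregular stream of sensor events into a sequence of word-like tokens that a language model could then embed in context. This is an assumption-laden toy and not the pipeline used in the publications: the function names, the sensor names in the usage note and the gap thresholds are all hypothetical.

```python
from datetime import datetime


def gap_bucket(seconds):
    """Map an inter-event gap to a coarse token (thresholds are illustrative)."""
    if seconds < 60:
        return "<gap:short>"
    if seconds < 3600:
        return "<gap:medium>"
    return "<gap:long>"


def events_to_tokens(events):
    """Turn (timestamp, sensor, value) events into a flat token sequence.

    Each event becomes one "word" (sensor=value), and the irregular time
    between consecutive events is kept as a separate gap token, so that a
    sequence model sees both the order and the rough timing of the events.
    """
    tokens = []
    previous = None
    for timestamp, sensor, value in sorted(events, key=lambda e: e[0]):
        if previous is not None:
            tokens.append(gap_bucket((timestamp - previous).total_seconds()))
        tokens.append(f"{sensor}={value}")
        previous = timestamp
    return tokens
```

For example, a hypothetical kitchen motion event, a fridge door event 30 seconds later and a motion-off event about 20 minutes after that would become five tokens: three event words separated by one short and one medium gap token.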

Robot Coach for Physical Rehabilitation

I was the leader of the KERAAL experiment from 2016 to 2018, funded by the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 601116 as part of the ECHORD++ (The European Coordination Hub for Open Robotics Development) project. Our consortium, made up of roboticists and doctors, proposed to develop a robot coach for rehabilitation exercises.
We published a medical dataset for human pose analysis. See the presentation video and the publications.

Social Guidance for a Strategic Learning

My research focuses on developing a strategic learner that can learn multiple tasks with several strategies. In particular, I study the combination of autonomous exploration and imitation learning strategies to build an interactive learner that can explore its environment both via active imitation learning and autonomous goal-babbling. It learns at the same time when, whom and what to actively imitate from several available teachers, and when not to use social guidance but to rely instead on active goal-oriented self-exploration.

The challenges posed by robots operating in human environments on a daily basis and in the long term point out the importance of adaptivity to changes which can be unforeseen at design time. Therefore, the robot must learn continuously in an open-ended, non-stationary and high-dimensional space. It cannot possibly explore all of its environment to learn about everything within a lifetime. To be useful and acquire skills, the robot must on the contrary be able to know which parts to sample and what kinds of skills are interesting to learn. One way is to decide what to explore by oneself. Another way is to refer to a mentor. We call these two ways of collecting data sampling modes. The first sampling mode corresponds to algorithms developed in the literature to autonomously drive the robot towards interesting parts of the environment or useful kinds of skills. Such algorithms are called artificial curiosity or intrinsic motivation algorithms. The second sampling mode corresponds to social guidance or imitation, where the teacher indicates where to explore as well as where not to explore. Starting from the study of the relationships between these two concurrent methods, we ended up building an algorithmic architecture where the two modes intertwine into a hierarchical learning structure, called Socially Guided Intrinsic Motivation (SGIM). Four versions of SGIM have been presented, each enabling the learner to actively decide on more aspects of its learning:

- SGIM-D (Socially Guided Intrinsic Motivation by Demonstration), which uses imitation learning and actively chooses which outcome to learn to produce based on intrinsic motivation. See Nguyen and Oudeyer, 2014, Autonomous Robots.
- SGIM-IM (Socially Guided Intrinsic Motivation with Interactive learning at the Meta level), which in addition actively chooses between an autonomous learning strategy and an imitation learning strategy. See Nguyen and Oudeyer, 2012, Humanoids.
- SGIM-ACTS (Socially Guided Intrinsic Motivation with Active Choice of Teacher and Strategy), which chooses actively and hierarchically at each learning episode what and how to learn, and when, what and whom to imitate, depending on measures of the competence progress that each learning strategy and teacher induces for different kinds of outcome. See Nguyen and Oudeyer, Paladyn Journal of Behavioral Robotics, DOI 10.2478/s13230-013-0110-z.
- SGIM-PB (Socially Guided Intrinsic Motivation with Procedure Babbling), which learns sequential actions in a recursive manner, by choosing to request demonstrations of actions at the low level, or instructions about task decomposition. See Duminy et al., 2021, Applied Sciences.
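
The idea of choosing what and how to learn from measures of competence progress can be sketched in a few lines. The code below is a simplified illustration written for this page, not the published SGIM algorithms: the class name, the window-based progress measure and the epsilon-greedy exploration are assumptions chosen for brevity. A strategy here can stand for autonomous exploration or for imitation of a particular teacher.

```python
import random
from collections import defaultdict, deque


class ProgressBasedChooser:
    """Pick a (task, strategy) pair in proportion to recent competence progress."""

    def __init__(self, tasks, strategies, window=10, epsilon=0.2, seed=0):
        self.pairs = [(t, s) for t in tasks for s in strategies]
        self.epsilon = epsilon  # probability of a uniformly random choice
        self.rng = random.Random(seed)
        # Sliding window of the last competence measures for each pair.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, task, strategy, competence):
        """Store the competence measured after one learning episode."""
        self.history[(task, strategy)].append(competence)

    def progress(self, task, strategy):
        """Empirical progress: change in mean competence between the older
        and the more recent half of the window (absolute value)."""
        h = list(self.history[(task, strategy)])
        if len(h) < 2:
            return 0.0
        half = len(h) // 2
        older, recent = h[:half], h[half:]
        return abs(sum(recent) / len(recent) - sum(older) / len(older))

    def choose(self):
        """Explore at random with probability epsilon (or when nothing has
        shown progress yet); otherwise sample proportionally to progress."""
        scores = [self.progress(t, s) for t, s in self.pairs]
        if self.rng.random() < self.epsilon or sum(scores) == 0.0:
            return self.rng.choice(self.pairs)
        return self.rng.choices(self.pairs, weights=scores, k=1)[0]
```

In use, the learner would call choose() at the start of each episode, run the selected strategy on the selected task, and then call record() with the competence it measured, so that pairs which currently yield progress are sampled more often.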

I was involved in the MACSi project with the UPMC and ENSTA universities. In this project we strove to build a cognitive architecture for a developmental robotics scenario, based on perception, control and active exploration modules. It uses a Socially Guided Intrinsic Motivation algorithm in the upper level of its cognitive architecture to actively learn to recognise 3D objects by manipulation.

Below is a video of my PhD defense:

PhD Thesis (2013): A Curious Robot Learner for Interactive Goal-Babbling: Strategically Choosing What, How, When and from Whom to Learn. [bib, code]

Previous Projects

Keywords: Face-swapper, Face Tracking, Video Processing, Theory of Mind, Self-Recognition, Obstacle Avoidance

During my Master's thesis in the Department of Adaptive Machine Systems, Faculty of Engineering, Osaka University, I studied the modelling of the development process of self-consciousness in childhood between the ages of 2 months and 2 years. As social interaction depends greatly on self-knowledge, we investigated the onset of self-consciousness between the time a baby knows how his body is situated with respect to the environment, at 2 months of age, and the time he successfully passes the rouge mirror test, at 24 months.
We set up two experiments to study whether:
- children prefer familiarity of faces or contingency of movements
- a self-recognition brain activity can be measured in children through Near InfraRed Spectroscopy
Master Thesis (2010): Real-time face swapping based on head posture using particle filter towards the understanding of infant self-recognition.
During my internship at the Artificial Intelligence Laboratory, Stanford University, I completed a project for making the PUMA arm avoid moving obstacles. The summary is only available in French: Comportement réactif pour l'évitement d'obstacles en Robotique : application à un bras manipulateur.
In 2008-2009, I took part in the play “I, worker”, featuring 2 human actors and 2 robot actors. The programming and management of the robots were carried out by our group of 3 students, led by Prof. Ishiguro and Prof. Hirata. You can find information on the theater company or from the press release (e.g. BBC).
I also took part in robotic competitions such as:
- RoboCup and Japan Open in 2008 with the JEAP team. I was in charge of designing the movements for walking and kicking, as well as programming the behaviour for the football game.
- Coupe de la Robotique in 2006 with the robotics club of Ecole Polytechnique. I was the secretary-treasurer of our team of 10, and in charge of programming the trajectory and behaviour of the robot we built from scratch to play a golf game at the French national robotics challenge.
