Results from paper "Online-Learning and Planning in High Dimensions with Finite Element Goal Babbling", presented at ICDL-Epirob 2017 (PdfSlides, Video).

These videos show Finite Element Goal Babbling (FEGB) applied in different settings. The goal for the arm is to learn to move its end-effector (its tip) to any position in the room. A resolution of 1x1 is equivalent to traditional goal babbling (GB), since a single state does not allow for higher-level planning. The first part of each video shows the arm exploring freely in order to learn useful postures and to build a model of the transition probabilities between the different states. The second part shows the arm trying to reach different goals using what it learned during exploration. The agent is not allowed to improve its model during this phase.
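To make the role of the resolution concrete, here is a minimal sketch of how a 2-D sensor-space position could be mapped to a grid cell (finite element) for a given resolution. The function and its signature are illustrative assumptions, not taken from the paper's code; note how a 1x1 resolution collapses everything into one element, which is the plain GB case.

```python
def element_index(pos, bounds, resolution):
    """Map a 2-D sensor-space position to its grid cell (finite element).

    pos        -- (x, y) end-effector position
    bounds     -- ((x_min, x_max), (y_min, y_max)) workspace limits
    resolution -- (nx, ny); (1, 1) yields a single element, i.e. plain GB
    """
    idx = []
    for p, (lo, hi), n in zip(pos, bounds, resolution):
        # Clamp to the workspace, then bucket into one of n equal intervals.
        frac = (min(max(p, lo), hi) - lo) / (hi - lo)
        idx.append(min(int(frac * n), n - 1))
    return tuple(idx)
```

With a 6x6 resolution the workspace becomes 36 such elements; transitions between them are what the planning phase later exploits.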

FEGB 10'000 Iter, 100 DoF, Res 6x6, Seed 4

FEGB 10'000 Iter, 5 DoF, Res 6x6, Seed 4

FEGB 20'000 Iter, 100 DoF, Res 10x10, Seed 5

GB 10'000 Iter, 100 DoF, Res 1x1, Seed 2

Full code can be found here. To replicate the results, run 'main_record.py' with 'iterations', 'seed', 'dof' and 'resolution' set to the values in the corresponding video's description.

Abstract:

Goal babbling (GB) has proved to be a powerful tool for online learning of inverse kinematic models of high-dimensional redundant robots acting in low-dimensional sensor-spaces. Learning an inverse model alone is, however, not sufficient: an inverse model only tells the robot what posture it should adopt in order to reach a goal, not how to reach that posture. Since many environments restrict which motions are possible, this becomes a limitation.

This paper introduces a new method, Finite Element Goal Babbling (FEGB), which is a natural extension of GB. By partitioning the sensor-space into a disjoint set of finite elements, where each element is treated as an independent GB problem, a planning module can be added by observing transitions between the elements.
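The planning idea can be sketched as a graph over the finite elements: during exploration the agent records which element-to-element transitions it has observed, and at reaching time it searches that graph for a path of elements leading to the goal's element. The class below is a hypothetical illustration of this structure (the paper's actual planner and its use of transition probabilities may differ); here observed transitions are stored as edges and a shortest element sequence is found with breadth-first search.

```python
from collections import defaultdict, deque

class ElementGraph:
    """Illustrative element-level transition graph, not the paper's code."""

    def __init__(self):
        # element -> set of elements observed to be reachable from it
        self.transitions = defaultdict(set)

    def observe(self, src, dst):
        """Record one observed transition between two finite elements."""
        if src != dst:
            self.transitions[src].add(dst)

    def plan(self, start, goal):
        """Shortest observed element sequence from start to goal, or None."""
        queue = deque([[start]])
        visited = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in self.transitions[path[-1]]:
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(path + [nxt])
        return None  # goal element never observed to be reachable
```

Each step of the resulting element path can then be handed to that element's local GB controller, which is what lets FEGB route around obstacles that block a direct reach.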

The method is evaluated on a high-dimensional planar arm acting in an environment that restricts its movements. The goal is to learn to control the position of the end-effector so that it can reach any position in the environment. The results show that FEGB learns such control rapidly while naturally handling stationary obstacles and workspace limits that would prohibit the applicability of GB.