Title: See to Act and Act to See


David Hsu, National University of Singapore

In a robot system, perception and action are two essential, interlocked elements. The purpose of perception is to act; the purpose of action is, at least sometimes, to achieve improved perception. Historically, the venerable modularity principle of system design led us to decompose a robot system into separate perception and decision modules that communicate through a "narrow" information interface. This separation became a major technical barrier and confined robots to well-controlled environments for decades. Capturing perceptual uncertainty and connecting perception with robot decision-making are key to robust robot performance. In this talk, we will look at ideas for tackling this challenge through planning, learning, and, more interestingly, integrating planning and learning, in the context of autonomous driving among pedestrians and human-robot interaction.

Biography: David Hsu is a professor of computer science at the National University of Singapore (NUS) and a member of the NUS Graduate School for Integrative Sciences & Engineering. He received his PhD in computer science from Stanford University. At NUS, he co-founded the NUS Advanced Robotics Center and has since served as its Deputy Director. He is an IEEE Fellow. His research spans robotics, AI, and computational structural biology. In recent years, he has been working on robot planning and learning under uncertainty and human-robot collaboration. He, together with colleagues and students, won the Humanitarian Robotics and Automation Technology Challenge Award at the International Conference on Robotics & Automation (ICRA) 2015, the RoboCup Best Paper Award at the International Conference on Intelligent Robots & Systems (IROS) 2015, and the Best Systems Paper Award at Robotics: Science & Systems (RSS) 2017. He has chaired or co-chaired several major international robotics conferences, including the International Workshop on the Algorithmic Foundations of Robotics (WAFR) 2004 and 2010, ICRA 2016, and RSS 2015. He was an associate editor of the IEEE Transactions on Robotics. He is currently an editorial board member of the Journal of Artificial Intelligence Research and a member of the RSS Foundation Board.

Title: From Vision and Learning to Action and Understanding


Chris Pal, Polytechnique Montréal, Mila, Element AI & Canada CIFAR AI Chair

As computer vision techniques based on machine learning mature, their potential for driving complex control policies for robotics has captured the imagination of many researchers. However, understanding the behaviour of learning algorithms that couple vision and control, as well as the behaviour of the control policies such algorithms learn, remains a challenging open problem. I'll examine some recent work from my group and collaborators on these themes.

I'll begin with some examples of deep learning techniques that more explicitly account for the three- and four-dimensional structure of the visual world. I'll go on to examine a visual comparison technique based on recurrent neural networks that allows the reward signal of a reinforcement learning (RL) algorithm to be learned. This allows us to combine imitation learning techniques with RL to control physical simulations of humanoid agents simply by watching an example of a desired motion.

Going further, as deep reinforcement learning driven by visual perception becomes more widely used, there is a growing need to better understand and probe the learned agents. I'll present a new method for synthesizing visual inputs that lead to critical or risky states in RL agents: states in which a very high or a very low reward can be achieved depending on which action is taken. In our experiments we show that this method can generate insights for a variety of environments and reinforcement learning methods. I'll present some of our results on the standard Atari benchmark games as well as in an autonomous driving simulator.
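The notion of a "critical" state described above can be captured by a simple quantitative criterion: the gap between the best and worst action's estimated return. The sketch below is an illustrative assumption on my part, not the speaker's exact formulation; it assumes access to per-action Q-value estimates for a state.

```python
import numpy as np

def criticality(q_values):
    """Score how 'critical' a state is, as the gap between the
    highest and lowest per-action value estimates. A large gap
    means the choice of action matters greatly in that state;
    a small gap means all actions lead to similar returns."""
    q = np.asarray(q_values, dtype=float)
    return float(q.max() - q.min())

# In a state where every action is roughly equivalent, the gap is small.
print(criticality([1.0, 1.1, 0.9]))    # small gap: not critical

# In a state with one catastrophic action, the gap is large.
print(criticality([5.0, 4.8, -20.0]))  # large gap: critical
```

Under this assumed criterion, synthesizing inputs that maximize such a score would surface the risky situations the talk refers to.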

Biography: Dr. Chris Pal is an associate professor in the Department of Computer and Software Engineering at Polytechnique Montréal. Prior to arriving in Montreal, he was a professor in the Department of Computer Science at the University of Rochester. He has been a research scientist with the University of Massachusetts and has also been affiliated with the Interactive Visual Media Group and the Machine Learning and Applied Statistics groups at Microsoft Research. His research at Microsoft led to three patents on image processing, computer vision, and interactive multimedia. He earned his M.Math and PhD from the University of Waterloo in Canada. During his master's research he developed methods for automated cartography and the analysis of high-resolution digital aerial photography. He was also involved with a number of software engineering projects developing spatial databases for managing environmental information. His PhD research led to contributions applying probability models and optimization techniques to image, video, and signal processing. During his PhD studies, Chris was also a research assistant in the Department of Electrical and Computer Engineering at the University of Toronto, where he collaborated closely with the Banting and Best Department of Medical Research. He performed research on image processing and statistical methods for the analysis of large-scale genomics and computational molecular biology experiments using DNA microarrays. Prior to his graduate studies, Chris was with the multimedia research company Interval in Palo Alto, CA (Silicon Valley). As a result of his research at Interval, he was awarded a patent on audio signal processing.