Third Winter School on Humanoid Programming

Disentangled 3D Representations for Performance Capture of Humans

Sean Ryan Fanello

Google, California

Abstract

3D capture and rendering of humans has made remarkable progress in the last few years, approaching the quality of Image-Based Rendering (IBR) methods. These systems usually rely on multi-view capture setups and sophisticated pipelines to build a consistent, parameterized mesh of the performer. The final goal of these approaches is to capture and render high-quality, photo-realistic humans that can match the quality of Hollywood productions, without manual intervention or post-processing.

However, despite steady progress and encouraging results, these 3D capture systems still face important challenges and limitations when it comes to capturing translucent and transparent objects or thin structures, such as hair. At the same time, the computer vision community is investigating deep learning techniques to overcome the limitations of geometric and graphics approaches.

In this talk, I will show how to combine geometric pipelines with recent approaches in neural rendering to construct disentangled 3D representations for photo-realistic renderings of humans under novel viewpoints, poses, and desired lighting conditions. I will walk the audience through the current state of the art in 3D performance capture and reflectance estimation methods, highlighting the pros and cons of the various techniques. I will then show how deep learning can be applied to overcome the limitations of traditional capture pipelines. Finally, I will give examples of real-world applications that can benefit from these systems.

Biography

Sean Ryan Fanello is a Research Scientist and Manager at Google, where he leads the effort on volumetric performance capture. His research interests include digital humans, volumetric reconstruction, high-quality depth sensing, and non-rigid tracking. Sometimes deep, sometimes geometric, sometimes both. Previously, he was a Senior Scientist and a Founding Team Member at perceptiveIO, Inc., where he developed computer vision and machine learning algorithms for 3D sensing, visual recognition, and human-computer interaction. Prior to that, he was a Post-Doc Researcher in the Interactive 3D Technologies (I3D) group at Microsoft Research Redmond, where he substantially contributed to the HoloLens 3D sensing capabilities. He was also one of the leading members of the Holoportation project. He obtained his PhD in Robotics, Cognition and Interaction Technologies at the Italian Institute of Technology, in collaboration with the University of Genoa, in 2013. During his PhD he developed computer vision and machine learning techniques for the iCub humanoid robot. In 2010 he completed his Master’s Degree in Computer Engineering at Sapienza University of Rome, with a specialization in Artificial Intelligence and Pattern Recognition.