As-Rigid-As-Possible Shape Manipulation

The problem of shape manipulation is of interest to many domains, including image editing, real-time live performance, and graphical user interfaces. The goal of shape manipulation is to let users move and deform shapes in a manner akin to interacting with an object in the real world. Previous approaches to shape manipulation can be broadly categorized into (a) space-warp and (b) physics-based techniques. ...
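A core primitive behind "as-rigid-as-possible" deformation is fitting, per local element, the pure rotation that best explains how its vertices moved; the residual of that fit measures non-rigidity. The sketch below is illustrative and not from the paper: a closed-form 2D Procrustes fit of a single rotation to a point set and its deformed positions.

```python
import math

def best_rotation_2d(rest, deformed):
    """Closed-form 2D rotation fit (Procrustes): the angle that best
    maps the rest-pose points onto the deformed points in a
    least-squares sense."""
    # Center both point sets so only the rotational part remains.
    n = len(rest)
    cx_r = sum(x for x, _ in rest) / n
    cy_r = sum(y for _, y in rest) / n
    cx_d = sum(x for x, _ in deformed) / n
    cy_d = sum(y for _, y in deformed) / n
    num = 0.0  # accumulated 2D cross products p x q
    den = 0.0  # accumulated dot products   p . q
    for (px, py), (qx, qy) in zip(rest, deformed):
        px, py = px - cx_r, py - cy_r
        qx, qy = qx - cx_d, qy - cy_d
        num += px * qy - py * qx
        den += px * qx + py * qy
    return math.atan2(num, den)
```

Applying this to a point set that was rotated rigidly recovers the rotation angle exactly; any leftover per-point error after un-rotating is the "non-rigid" part an ARAP-style energy penalizes.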

October 26, 2020 · 5 min · Kumar Abhishek

DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills

The adoption of physically simulated character animation in industry remains a challenging problem, primarily because existing methods lack directability and generalizability. The goal is to amalgamate data-driven behavior specification with the ability to reproduce such behavior in a physical simulation, and several categories of approaches have tried to achieve it. Kinematic models rely on large amounts of data, and their ability to generalize to unseen situations can be limited. Physics-based models incorporate prior knowledge based on the physics of motion, but they do not perform well for “dynamic motions” involving long-term planning. Motion imitation approaches can achieve highly dynamic motions, but are limited by the complexity of the system and a lack of adaptability to task objectives. Techniques based on reinforcement learning (RL), although comparatively successful in achieving the defined objectives, often produce unrealistic motion artifacts. This paper addresses these problems with DeepMimic, a deep RL-based “framework for physics-based character animation” that combines a motion-imitation objective with a task objective. This combination lets it demonstrate a wide range of motion skills and adapt to a variety of characters, skills, and tasks by leveraging rich information from high-dimensional state and environment descriptions. It is conceptually simpler than motion-imitation-based approaches and can work with data provided as either motion capture clips or keyframed animation. While the paper presents intricate details about the DeepMimic framework, the high-level details and novel contribution claims are summarized here, skipping the common details about deep RL problem formulations. ...
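The combination of the two objectives can be sketched as a weighted reward. The error-scale constants below follow the values I recall the paper reporting for the imitation terms (pose, velocity, end-effector, center of mass), and the objective weights `w_i`/`w_g` vary by task in the paper; treat all numbers here as illustrative.

```python
import math

def imitation_reward(pose_err, vel_err, ee_err, com_err):
    """Imitation reward: a weighted sum of exponential kernels over
    squared tracking errors w.r.t. the reference motion. Weights and
    scales approximate the paper's reported values."""
    return (0.65 * math.exp(-2.0 * pose_err)    # joint orientations
            + 0.10 * math.exp(-0.1 * vel_err)   # joint velocities
            + 0.15 * math.exp(-40.0 * ee_err)   # end-effector positions
            + 0.10 * math.exp(-10.0 * com_err)) # center of mass

def total_reward(r_imitation, r_task, w_i=0.7, w_g=0.3):
    """Combined objective r = w^I * r^I + w^G * r^G; the paper sets the
    weights per task, so these defaults are hypothetical."""
    return w_i * r_imitation + w_g * r_task
```

With perfect tracking (all errors zero) the imitation reward is 1.0, and the exponential kernels decay smoothly as the simulated character drifts from the reference clip.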

October 26, 2020 · 5 min · Kumar Abhishek

Building Rome in A Day

With the advent of digital photography and the popularity of cloud-based digital image sharing websites, there has been a huge proliferation in the number of publicly accessible photographs of popular cities (and landmarks thereof) across the world. As a result, the ability to leverage these photos in a meaningful manner is of massive interest to the computer vision community. One key research area that could immensely benefit from this is city-scale 3D reconstruction. Traditionally, existing systems for this task have relied on images and data acquired in a structured manner, which keeps the computation simple. In contrast, images uploaded to the internet have no such constraints, necessitating the development of algorithms that can work on “extremely diverse, large, and unconstrained image collections”. Building upon previous research and incorporating elements from other disciplines of computer science, this paper proposes a system to construct large-scale 3D geometry from large and unorganized image collections publicly available on the internet, with the ability to process more than a hundred thousand images in a day. ...

October 19, 2020 · 5 min · Kumar Abhishek

KinectFusion: Real-Time Dense Surface Mapping and Tracking

The surge of interest in augmented and mixed reality applications can at least in part be attributed to research in “real-time infrastructure-free” tracking of a camera with the simultaneous generation of detailed maps of physical scenes. While computer vision research has enabled accurate camera tracking and dense scene surface reconstruction using structure-from-motion and multi-view stereo algorithms, these methods are not well suited for either real-time applications or detailed surface reconstruction. There has also been a contemporaneous improvement in camera technologies, especially depth-sensing cameras based on time-of-flight or structured-light sensing, such as Microsoft Kinect, a consumer-grade offering. The Kinect features a structured light-based depth sensor (sensor hereafter) and generates an 11-bit $640 \times 480$ depth map at 30 Hz using an on-board ASIC. However, these depth images are usually noisy, with ‘holes’ indicating regions where a depth reading was not possible. This paper proposes a system to process these noisy depth maps and perform real-time (9 million new point measurements per second) dense simultaneous localization and mapping (SLAM), thereby generating an incremental and consistent 3D scene model while also tracking the sensor’s motion (all 6 degrees of freedom) through each frame. While the paper presents quite an involved description of the method, the key components are briefly summarized here. ...

October 19, 2020 · 4 min · Kumar Abhishek

A Practical Model for Subsurface Light Transport

In computer graphics, a bidirectional reflectance distribution function (BRDF) is used to model light reflectance properties at a surface, and is defined as the ratio of the reflected radiance to the incident irradiance. All BRDF models assume pure surface scattering, i.e., that “light scatters at one surface point” and that light enters and exits a material at the same position; they do not model subsurface transport of incident light. Although this assumption holds for metals, translucent surfaces modeled using BRDFs exhibit a distinctly hard, computer-generated appearance and poor blending of local color and geometry features. While there have been works to model subsurface transport of light, the existing methods are either slow or inefficient for anisotropic or highly scattering translucent media (such as skin and milk). This paper attempts to address this shortcoming by proposing a model for subsurface light transport in translucent materials using the bidirectional surface scattering reflectance distribution function (BSSRDF). BSSRDFs are a generalization of BRDFs and, unlike the latter, can model light transport between any two rays that hit a surface. Since the exact BSSRDF derivation is quite involved, we only present a brief summary here, followed by its extension to a model for rendering computer graphics. ...
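Concretely, the BSSRDF $S$ relates the outgoing radiance at one surface point to the incident radiance at every other point, so computing the exit radiance requires integrating over both incoming directions and the surface area:

```latex
L_o(x_o, \vec{\omega}_o) = \int_A \int_{2\pi} S(x_i, \vec{\omega}_i; x_o, \vec{\omega}_o)\, L_i(x_i, \vec{\omega}_i)\, (\vec{n} \cdot \vec{\omega}_i)\, \mathrm{d}\omega_i\, \mathrm{d}A(x_i)
```

A BRDF is the special case in which $S$ collapses to a delta function at $x_i = x_o$, removing the area integral, which is exactly the "light enters and exits at the same position" assumption described above.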

October 12, 2020 · 5 min · Kumar Abhishek