Computer Vision Lab (CVLab)

National Tsing Hua University


Multiple View Structure Reconstruction

Thu, 25 Nov 2010 - yilin
Contour-Based Structure from Reflection
Po-Hao Huang and Shang-Hong Lai
In this paper, we propose a novel contour-based algorithm for 3D object reconstruction from a single uncalibrated image acquired under the setting of two plane mirrors. With the epipolar geometry recovered from the image and the properties of mirror reflection, metric reconstruction of an arbitrary rigid object is accomplished without knowing the camera parameters or the mirror poses. For this mirror setup, the epipoles can be estimated from the correspondences between the object and its reflection, which can be established automatically from the tangent lines of their contours. By using the property of mirror reflection as well as the relationship among the mirror plane normals, the epipoles and the camera intrinsics, we can estimate the camera intrinsic parameters, the mirror plane normals and the orientations of the virtual cameras. The positions of the virtual cameras are determined by minimizing the distance between the object contours and the projected visual cone for a reference view. After the camera parameters are determined, the 3D object model is constructed via the image-based visual hulls (IBVH) technique. The 3D model can be refined by integrating the multiple models reconstructed from different views. The main advantage of the proposed contour-based Structure from Reflection (SfR) algorithm is that it achieves metric reconstruction from an uncalibrated image without feature point correspondences. Experimental results on synthetic and real images are presented to show its performance.
IEEE Conf. on Computer Vision & Pattern Recognition (CVPR'06), New York, USA, Jun. 17-22, 2006. (pdf)
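The virtual-camera construction above rests on a standard property of mirror reflection: reflecting the real camera across a mirror plane yields the virtual camera seen in that mirror. Below is a minimal NumPy sketch of this property, not the paper's estimation pipeline; the plane parameters and camera center are hypothetical numbers for illustration.

```python
import numpy as np

def reflection_matrix(n, d):
    """4x4 reflection across the plane n . x = d (n a unit normal)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    M = np.eye(4)
    M[:3, :3] = np.eye(3) - 2.0 * np.outer(n, n)  # Householder reflection
    M[:3, 3] = 2.0 * d * n                        # offset along the normal
    return M

# The virtual camera observed in a mirror is the real camera
# reflected across the mirror plane.
n, d = np.array([1.0, 0.0, 0.0]), 0.5             # mirror plane x = 0.5
cam_center = np.array([0.2, 0.0, -1.0, 1.0])      # homogeneous coordinates
virtual_center = reflection_matrix(n, d) @ cam_center
print(virtual_center[:3])                         # -> [ 0.8  0.  -1. ]
```

Note that the reflection has determinant -1, which is why the object and its mirror image behave like a stereo pair with reversed handedness.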
Thu, 25 Nov 2010 - wei
Turntable 3D model reconstruction
Po-Hao Huang, Chia-Ming Cheng, Hsiao-Wei Chen, Hao-Liang Yang, Li-Hsuan Chin and Shang-Hong Lai
In this project, we developed a 3D model reconstruction system. Using a turntable, we can easily photograph an object in different poses. With the camera calibration method proposed by Po-Hao Huang, we can recover all the points of the 3D model, and we use the visual hull technique to reconstruct the surface of the model. In the final step, we merge all the images into a seamless texture map.
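As a rough illustration of the visual hull step, the sketch below carves a voxel grid against calibrated silhouettes: a voxel survives only if it projects inside the silhouette in every view. The function names and the 3x4 projection matrices are illustrative assumptions; the actual system obtains them from the turntable calibration described above.

```python
import numpy as np

def visual_hull(silhouettes, projections, voxels):
    """Voxel carving: keep voxels that project inside every silhouette.

    silhouettes : list of HxW binary masks, one per view
    projections : list of 3x4 camera matrices aligned with the masks
    voxels      : Nx3 array of voxel centers in world coordinates
    """
    keep = np.ones(len(voxels), dtype=bool)
    hom = np.hstack([voxels, np.ones((len(voxels), 1))])  # homogeneous
    for mask, P in zip(silhouettes, projections):
        uvw = hom @ P.T
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        h, w = mask.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        keep &= inside                      # outside the image: carve away
        keep[inside] &= mask[v[inside], u[inside]] > 0
    return voxels[keep]
```

On a turntable, the projection matrices of the different views differ only by rotations about the turntable axis, which is what makes the calibration tractable.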
Thu, 25 Nov 2010 - yilin
Silhouette-Based Camera Calibration from Sparse Views under Circular Motion
Po-Hao Huang and Shang-Hong Lai
In this paper, we propose a new approach to camera calibration from silhouettes under circular motion with minimal data. We exploit the mirror symmetry property and derive a common homography that relates silhouettes with epipoles under circular motion. With the epipoles determined, the homography can be computed from the frontier points induced by epipolar tangencies. On the other hand, given the homography, the epipoles can be located directly from the bi-tangent lines of silhouettes. With the homography recovered, the image invariants under circular motion and camera parameters can be determined. If the epipoles are not available, camera parameters can be determined by a low-dimensional search of the optimal homography in a bounded region. In the degenerate case, when the camera optical axes intersect at one point, we derive a closed-form solution for the focal length to solve the problem. By using the proposed algorithm, we can achieve camera calibration simply from silhouettes of three images captured under circular motion. Experimental results on synthetic and real images are presented to show its performance.
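As a sketch of how bi-tangent lines can be extracted in practice: for two disjoint silhouettes, the outer bi-tangents are exactly the edges of the convex hull of their union that bridge the two point sets, and the epipole lies at the intersection of those two lines. This is an illustrative helper, not the paper's code.

```python
import numpy as np
from scipy.spatial import ConvexHull

def outer_bitangents(contour_a, contour_b):
    """Outer bi-tangent segments of two disjoint contours (Nx2, Mx2).

    A convex-hull edge of the union that joins a point of one contour
    to a point of the other lies on a common outer tangent line.
    """
    pts = np.vstack([contour_a, contour_b])
    labels = np.r_[np.zeros(len(contour_a)), np.ones(len(contour_b))]
    hull = ConvexHull(pts)
    v = hull.vertices                       # hull vertices in order
    segments = []
    for i, j in zip(v, np.roll(v, -1)):     # walk consecutive hull edges
        if labels[i] != labels[j]:          # the edge bridges the two sets
            segments.append((pts[i], pts[j]))
    return segments                         # two segments for disjoint sets
```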

3D Face Modeling

Wed, 24 Nov 2010 - Savan
A System for Reconstructing 3D Shape and Expression Deformation from a Single Face Image
Shu-Fan Wang and Shang-Hong Lai
Facial expression modeling is central to facial expression recognition and expression synthesis for facial animation. In this work, we propose a manifold-based 3D face reconstruction approach to estimating the 3D face model and the associated expression deformation from a single face image. In the training phase, we build a nonlinear 3D expression manifold from a large set of 3D facial expression models to represent the facial shape deformations due to facial expressions. Then a Gaussian mixture model in this manifold is learned to represent the distribution of expression deformation. By combining the merits of the morphable neutral face model and the low-dimensional expression manifold, a novel algorithm is developed to reconstruct the 3D face geometry as well as the 3D shape deformation from a single face image with expression in an energy minimization framework. To construct the manifold for the facial expression deformations, we also propose a robust weighted feature map (RWF) based on the intrinsic geometry of human faces for robust 3D non-rigid registration. Experimental results on the CMU PIE image database and the FG-NET video database are shown to validate the effectiveness and accuracy of the proposed algorithm.
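A heavily simplified sketch of the expression-prior idea: embed training deformations in a low-dimensional space and fit a Gaussian mixture there, whose log-likelihood then serves as a prior energy. PCA stands in for the paper's nonlinear manifold, and the data shapes are hypothetical placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# One row per training scan: flattened per-vertex displacement from
# the neutral face (random placeholder data for illustration).
deformations = np.random.randn(500, 3 * 1000)

embedding = PCA(n_components=10).fit(deformations)   # stand-in for the
coords = embedding.transform(deformations)           # nonlinear manifold

gmm = GaussianMixture(n_components=5, covariance_type='full')
gmm.fit(coords)

# The negative log-likelihood acts as the expression-deformation prior
# inside an energy-minimization framework.
prior_energy = -gmm.score_samples(coords[:1])
```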
Wed, 24 Nov 2010 - Savan
Reconstructing 3D Shape, Albedo and Illumination from a Single Face Image
Shu-Fan Wang and Shang-Hong Lai
In this project, we propose a geometrically consistent algorithm to reconstruct the 3D face shape and the associated albedo from a single face image iteratively by combining the morphable model and the spherical harmonics (SH) model. The reconstructed 3D face geometry uniquely determines the SH bases, so the optimal 3D face model can be obtained by minimizing the error between the input face image and a linear combination of the associated SH bases. In this way, we are able to preserve the consistency between the 3D geometry and the SH model, thus refining the 3D shape reconstruction recursively. Furthermore, we present a novel approach to recover the illumination condition from the estimated weighting vector for the SH bases in a constrained optimization formulation independent of the 3D geometry. Experimental results show the effectiveness and accuracy of the proposed face reconstruction and illumination estimation algorithm under different face poses and multiple-light-source illumination conditions.
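The key linearity the abstract relies on can be made concrete: once the geometry (hence the surface normals) is fixed, the image is approximately the albedo times a linear combination of nine spherical harmonics evaluated at the normals, so the lighting vector follows from linear least squares. A minimal sketch, with hypothetical inputs:

```python
import numpy as np

def sh_basis(normals):
    """First nine real spherical-harmonics basis functions evaluated
    at unit surface normals (Nx3), as in Lambertian lighting models."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z ** 2 - 1.0),
        1.092548 * x * z,
        0.546274 * (x ** 2 - y ** 2),
    ], axis=1)

def estimate_lighting(intensities, normals, albedo):
    """Least-squares SH lighting coefficients from I ~ albedo * (B @ l)."""
    B = albedo[:, None] * sh_basis(normals)
    l, *_ = np.linalg.lstsq(B, intensities, rcond=None)
    return l
```

Iterating between this lighting estimate and the morphable-model shape update is the spirit of the recursion described above.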

Geometry Modeling and Processing

Sun, 19 Dec 2010 - yilin
Binary Orientation Trees for Volume and Surface Reconstruction from Unoriented Point Clouds
Yi-Ling Chen, Bing-Yu Chen, Shang-Hong Lai and Tomoyuki Nishita
Given a complete unoriented point set, we propose to build a binary orientation tree (BOT) for volume and surface representation, which roughly splits the space into the interior and exterior regions with respect to the input point set. The BOTs are constructed by performing a traditional octree subdivision, with the corners of each cell associated with a tag indicating the in/out relationship with respect to the input point set. Starting from the root cell, a growing stage efficiently assigns tags to the connected empty sub-cells. The unresolved tags of the remaining cell corners are determined by examining their visibility via the hidden point removal operator. We show that outliers accompanying the input point set can be effectively detected during the construction of the BOTs. After removing the outliers and resolving the in/out tags, the BOTs are ready to support any volume or surface representation technique. For surface representation, we also present a modified MPU implicits algorithm that reconstructs surfaces from unoriented point clouds by taking advantage of the BOTs.
Computer Graphics Forum (Proceedings of Pacific Graphics 2010), vol. 29, no. 7, pp. 2011-2019, Sep. 2010. (slides)
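The visibility test mentioned above is the hidden point removal (HPR) operator of Katz et al.: spherically flip the points around the viewpoint and keep those on the convex hull of the flipped set. A compact sketch follows; the flipping-radius factor gamma is a tunable assumption.

```python
import numpy as np
from scipy.spatial import ConvexHull

def hidden_point_removal(points, viewpoint, gamma=100.0):
    """Indices of points (Nx3) visible from `viewpoint` via HPR."""
    p = points - viewpoint
    norms = np.linalg.norm(p, axis=1, keepdims=True)
    R = norms.max() * gamma                       # flipping sphere radius
    flipped = p + 2.0 * (R - norms) * p / norms   # spherical flipping
    hull = ConvexHull(np.vstack([flipped, np.zeros((1, 3))]))
    return hull.vertices[hull.vertices < len(points)]
```

In the BOT construction, this kind of visibility reasoning resolves the in/out tags of the remaining cell corners.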
Thu, 25 Nov 2010 - rily
3D Non-rigid Registration for MPU Implicit Surfaces
Tung-Ying Lee and Shang-Hong Lai
Implicit surface representation is well suited for surface reconstruction from a large amount of noisy 3D data points with non-uniform sampling density. Previous 3D non-rigid registration methods can only be applied to mesh or volume representations, not directly to implicit surfaces. To the best of our knowledge, previous 3D registration methods for implicit surfaces can only handle rigid transformations and must keep the data points on the surface. In this paper, we propose a new 3D non-rigid registration algorithm to register two multi-level partition of unity (MPU) implicit surfaces with a variational formulation. The 3D non-rigid transformation between two implicit surfaces is a continuous deformation function, which is determined via an energy minimization procedure. Under the octree structure of the MPU surface, each leaf cell is transformed by an individual affine transformation associated with an energy that is related to the distance between two general quadrics. The proposed algorithm registers two 3D implicit surfaces directly, without sampling the two signed distance functions or polygonalizing the implicit surfaces, which makes it efficient in both computation and memory requirements. Experimental results on 3D human organ and sculpture models demonstrate the effectiveness of the proposed algorithm.
CVPR'08 Workshop on Non-Rigid Shape Analysis and Deformable Image Alignment (NORDIA), 2008. (slides)
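A drastically simplified sketch of registering to an implicit surface: fit a single global affine transform that drives sampled source points onto the zero set of the target implicit function. The actual method instead assigns one affine transform per MPU leaf cell and measures quadric-to-quadric distances without sampling; everything below is a toy stand-in.

```python
import numpy as np
from scipy.optimize import minimize

def register_to_implicit(f_target, source_pts):
    """One global affine (A, t) minimizing sum of f_target(A x + t)^2."""
    def energy(theta):
        A, t = theta[:9].reshape(3, 3), theta[9:]
        return np.sum(f_target(source_pts @ A.T + t) ** 2)
    theta0 = np.r_[np.eye(3).ravel(), np.zeros(3)]
    res = minimize(energy, theta0, method='Powell')  # derivative-free
    return res.x[:9].reshape(3, 3), res.x[9:]

# Toy target: the unit sphere as an implicit function f(x) = |x| - 1.
f_sphere = lambda X: np.linalg.norm(X, axis=1) - 1.0
src = np.random.randn(200, 3)
src = 0.8 * src / np.linalg.norm(src, axis=1, keepdims=True)  # small sphere
A, t = register_to_implicit(f_sphere, src)  # A should approach 1.25 * I
```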
Wed, 24 Nov 2010 - Savan
Geometry Image Resizing and Mesh Simplification
Shu-Fan Wang, Yi-Ling Chen, Chen-Kuo Chiang and Shang-Hong Lai
Polygonal meshes are widely used to represent the shape of 3D objects, and the generation of multi-resolution models has been a significant research topic in computer graphics. In this project, we demonstrate how to generate multi-resolution models through 2D image processing techniques. Multi-resolution models are generated by resizing the corresponding geometry images of the 3D models. By defining an appropriate energy on the 2D images that reflects the importance of the 3D vertices, we propose a modified content-aware image resizing algorithm suitable for geometry images, which also preserves the salient structures and features of the 3D models. We evaluate various image resizing techniques and show experimental results to validate the effectiveness of the proposed algorithm.
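A minimal sketch of the resizing machinery: treat the geometry image as an HxWx3 map of vertex positions, define a per-pixel energy, and remove minimal-energy seams by dynamic programming as in content-aware image resizing. The plain gradient-magnitude energy below is a placeholder; the project defines an energy that reflects the importance of the 3D vertices.

```python
import numpy as np

def vertex_energy(geom_img):
    """Gradient-magnitude energy of a geometry image (HxWx3 positions)."""
    gy, gx = np.gradient(geom_img, axis=(0, 1))
    return np.sqrt((gx ** 2 + gy ** 2).sum(axis=2))

def remove_vertical_seam(geom_img, energy):
    """Remove one minimal-cost vertical seam (classic seam carving DP)."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for i in range(1, h):                     # cumulative minimal cost
        left = np.r_[np.inf, cost[i - 1, :-1]]
        right = np.r_[cost[i - 1, 1:], np.inf]
        cost[i] += np.minimum(np.minimum(left, cost[i - 1]), right)
    seam = np.empty(h, dtype=int)             # backtrack from the bottom
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return geom_img[keep].reshape(h, w - 1, 3)
```

Each removed seam deletes one vertex per row of the geometry image, so repeated removal yields a family of progressively simplified meshes.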

3D Stereo Vision

Sun, 19 Dec 2010 - yilin
Stereo Matching Algorithm Using Hierarchical Over-segmentation and Belief Propagation
Li-Hsuan Chin
In this work, we present a novel algorithm to infer the disparity map from a pair of rectified images. We first employ image over-segmentation to construct a Content-based Hierarchical Markov Random Field (CHMRF). This image representation offers two advantages for vision applications: one is the hierarchical MRF construction, and the other is the regular graph structure. The former has been widely applied to computer vision problems to improve the efficiency of MRF optimization; the latter simplifies the message passing and the hardware implementation of MRF optimization techniques. After the construction of the CHMRF, we perform symmetric stereo matching and occlusion handling using Hierarchical Belief Propagation (HBP) on the proposed graphical model. Finally, a refinement process for the disparity map (e.g., plane fitting or bilateral filtering) is introduced to reduce the disparity errors caused by occlusion, textureless regions, or image noise. Our experimental results show that we can efficiently obtain disparity maps of accuracy comparable to most global stereo algorithms. For real stereo video sequences, we are able to accurately estimate the depth information for each frame with robust self image rectification as pre-processing.
Master Thesis, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan. (pdf)
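At the core of HBP is the min-sum message update; a minimal per-pixel sketch is given below. The hierarchical, segment-based structure of the CHMRF changes which nodes exchange messages, not this formula. Parameter names are illustrative.

```python
import numpy as np

def bp_message(data_cost, incoming, lam=1.0, trunc=4):
    """One min-sum belief-propagation message for stereo matching.

    data_cost : (L,) matching costs of the sender over L disparity labels
    incoming  : (k, L) messages from the sender's other neighbors
    lam, trunc: weight and truncation of the linear smoothness term
    """
    h = data_cost + incoming.sum(axis=0)      # aggregate local evidence
    L = len(h)
    labels = np.arange(L)
    msg = np.empty(L)
    for d in range(L):                        # truncated-linear penalty
        msg[d] = np.min(h + lam * np.minimum(np.abs(labels - d), trunc))
    return msg - msg.min()                    # normalize for stability
```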
Thu, 25 Nov 2010 - yilin
Improved Novel View Synthesis from Depth Image with Large Baseline
Chia-Ming Cheng, Shu-Jyuan Lin, Shang-Hong Lai and Jinn-Cherng Yang
In this paper, a new algorithm is developed for recovering large disocclusion regions in depth image based rendering (DIBR) systems for 3DTV. In DIBR systems, undesirable artifacts occur in the disocclusion regions when conventional view synthesis techniques are used, especially with a large baseline. Three techniques are proposed to improve the view synthesis results. The first is the preprocessing of the depth image with a bilateral filter, which helps to sharpen discontinuous depth changes as well as to smooth neighboring depths of similar color, thus preventing noise from appearing in the warped images. Secondly, on the warped image of a new viewpoint, we fill the disocclusion regions of the depth image with the background depth levels to preserve the depth structure. For the color image, we propose a depth-guided exemplar-based image inpainting that incorporates the structural strength of the color gradient to preserve the image structure in the restored regions. Finally, a trilateral filter, which simultaneously combines the spatial location, the color intensity, and the depth information to determine the weighting, is applied to enhance the image synthesis results. Experimental results are shown to demonstrate the superior performance of the proposed novel view synthesis algorithm compared to traditional methods.
In Proc. of International Conference on Pattern Recognition (ICPR), Tampa, Florida, U.S.A., Dec. 2008. (pdf)
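A sketch of the trilateral weighting described in the last step: each neighbor contributes according to the product of a spatial, a color, and a depth Gaussian. This is a direct (slow) implementation for clarity; the kernel widths are illustrative.

```python
import numpy as np

def trilateral_filter(img, depth, radius=5, s_s=3.0, s_c=10.0, s_d=2.0):
    """Filter img (HxWx3) with weights from spatial distance, color
    difference and depth difference around each pixel."""
    h, w = depth.shape
    I = np.pad(img.astype(float), ((radius,) * 2, (radius,) * 2, (0, 0)), 'edge')
    D = np.pad(depth.astype(float), radius, 'edge')
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w_s = np.exp(-(xs ** 2 + ys ** 2) / (2 * s_s ** 2))   # spatial term
    out = np.zeros_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            patch = I[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            dpat = D[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            w_c = np.exp(-((patch - I[y + radius, x + radius]) ** 2).sum(2)
                         / (2 * s_c ** 2))                # color term
            w_d = np.exp(-(dpat - D[y + radius, x + radius]) ** 2
                         / (2 * s_d ** 2))                # depth term
            wgt = w_s * w_c * w_d
            out[y, x] = (wgt[..., None] * patch).sum((0, 1)) / wgt.sum()
    return out
```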
Thu, 25 Nov 2010 - yilin
Geodesic Tree-Based Dynamic Programming for Fast Stereo Reconstruction
Chin-Hong Sin, Chia-Ming Cheng, Shang-Hong Lai and Shan-Yung Yang
In this paper, we present a novel tree-based dynamic programming (TDP) algorithm for efficient stereo reconstruction. We employ the geodesic distance transformation for tree construction, which results in sound image over-segmentation and can be easily parallelized on the graphics processing unit (GPU). Instead of building a single tree to convey messages in dynamic programming (DP), we construct multiple trees according to the image geodesic distance to allow for parallel message passing in DP. In addition to the efficiency improvement, the proposed algorithm provides visually sound stereo reconstruction results. Compared with previous related approaches, our experimental results demonstrate superior performance of the proposed algorithm in terms of efficiency and accuracy.
IEEE Embedded Computer Vision Workshop (ICCV Workshops), Kyoto, Japan, Oct. 2009. (pdf)
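A simple sequential sketch of the geodesic distance transform that drives the tree construction: shortest-path cost over the pixel grid, where each step costs the color difference between neighboring pixels. The paper computes this in parallel on the GPU; Dijkstra with a heap is the plain CPU analogue.

```python
import heapq
import numpy as np

def geodesic_distance(img, seeds):
    """Geodesic distance from seed pixels over an HxW(xC) image."""
    h, w = img.shape[:2]
    dist = np.full((h, w), np.inf)
    heap = []
    for y, x in seeds:
        dist[y, x] = 0.0
        heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue                          # stale heap entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                nd = d + np.abs(img[ny, nx].astype(float)
                                - img[y, x].astype(float)).sum()
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    heapq.heappush(heap, (nd, ny, nx))
    return dist
```

Roughly, assigning each pixel to its nearest seed under this distance yields the over-segmentation, and the per-segment trees then carry the DP messages in parallel.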

Multiview People Localization and Tracking

Thu, 25 Nov 2010 - rily
People Localization in a Camera Network Combining Background Subtraction and Scene-Aware Human Detection
Tung-Ying Lee, Tsung-Yu Lin, Szu-Hao Huang, Shang-Hong Lai, and Shang-Chih Hung
In a network of cameras, people localization is an important issue. Traditional methods utilize camera calibration and combine the results of background subtraction in different views to locate people in three-dimensional space. Previous methods usually solve the localization problem iteratively based on background subtraction results, while high-level image information is neglected. In order to fully exploit the image information, we suggest incorporating human detection into multi-camera video surveillance. We develop a novel method combining human detection and background subtraction for multi-camera human localization by using convex optimization. This convex optimization problem is independent of the image size; in fact, the problem size depends only on the number of locations of interest on the ground plane. Experimental results show that this combination performs better than background-subtraction-based methods and demonstrate the advantage of combining these two types of complementary information.
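One way to realize this kind of convex program (a sketch under assumptions, not the paper's exact formulation): let x in [0,1]^g be occupancy over g ground-plane cells, let column j of A be the synthetic image evidence of a person standing at cell j, and fit the pooled observations b with a sparsity prior. Projected ISTA solves it in a few lines; note the variable count is g, independent of the image size.

```python
import numpy as np

def localize(A, b, lam=0.1, iters=500):
    """Projected ISTA for  min_x 0.5*||Ax - b||^2 + lam*||x||_1,  0<=x<=1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * (A.T @ (A @ x - b))          # gradient step
        x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)  # shrink
        x = np.clip(x, 0.0, 1.0)                    # occupancy bounds
    return x    # high entries mark occupied ground-plane cells
```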

3D Object Pose Estimation

Thu, 25 Nov 2010 - wei
3D Object Pose Estimation
Tung-Ying Lee, Hsiao-Wei Chen, Hong-Ren Su and Shang-Hong Lai
The goal of this project is to develop a novel 3D object alignment technique for industrial robot 3D object localization. The alignment system consists of two main components: 2D local pattern alignment and 2D-3D pose estimation. The 2D local pattern alignment component quickly finds 2D affine alignments of selected local patches based on Fourier-based image matching. The affine alignment results are then sent to the second stage, which solves the 2D-3D pose estimation problem in computer vision. In this stage, an optimization-based pose estimation procedure is applied to estimate the 3D pose of the object from the 2D correspondences of the local patches. The proposed alignment technique is based on matching geometric information, so the alignment system is robust against lighting changes.
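The first stage builds on Fourier-based image matching. As a minimal sketch of its translational core, phase correlation recovers the shift between two patches from the cross-power spectrum; the full system extends this to affine alignment, and the resulting 2D-3D correspondences then feed a standard pose (PnP) solver.

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate (dy, dx) such that patch b is patch a shifted by (dy, dx).

    a, b : equally sized 2D grayscale arrays
    """
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = np.conj(Fa) * Fb
    cross /= np.abs(cross) + 1e-12         # keep phase only
    corr = np.fft.ifft2(cross).real        # impulse at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = a.shape
    if dy > h // 2: dy -= h                # wrap to signed offsets
    if dx > w // 2: dx -= w
    return dy, dx
```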