Representations and Matching Techniques for 3D Free-form Object and Face Recognition
The aim of visual recognition is to identify objects in a scene and estimate their pose. Object recognition from 2D images is sensitive to illumination, pose, clutter and occlusions. Object recognition from range data, on the other hand, does not suffer from these limitations. An important recognition paradigm is the model-based one, in which 3D models of objects are constructed offline and saved in a database using a suitable representation. During online recognition, a similar representation of the scene is matched against the database to recognize the objects present in the scene.
A 3D model of a free-form object is constructed offline from multiple range images (views) of the object acquired from different viewpoints. These views are registered in a common coordinate basis by establishing correspondences between them, and are then integrated into a seamless 3D model. Automatically establishing correspondences between overlapping views is the major problem in 3D modeling. This problem becomes more challenging when the views are unordered, so that there is no a priori knowledge about which view pairs overlap. The main challenges in the online recognition phase are clutter, due to unwanted objects and noise, and occlusion by other objects.
This thesis addresses the above challenges and investigates novel representations and matching techniques for 3D free-form rigid object and non-rigid face recognition. A robust representation based on third-order tensors is presented. The tensor representation quantizes local surface patches of an object into three-dimensional grids. Each grid is defined in an object-centered local coordinate basis, which makes the tensors invariant to rigid transformations.
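The invariance property described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `local_grid`, its parameters, and the use of simple point counts per bin are assumptions made for illustration only. The sketch expresses a patch in a local basis (an origin plus three orthonormal axes), quantizes it into a 3D grid, and relies on the fact that the grid is unchanged when the same rigid transformation is applied to both the points and the basis.

```python
import numpy as np

def local_grid(points, origin, axes, bins=10, bin_size=1.0):
    """Quantize a surface patch into a bins x bins x bins grid (illustrative).

    points : (N, 3) array of surface points in world coordinates
    origin : (3,) origin of the local coordinate basis
    axes   : (3, 3) array whose rows are the orthonormal local axes
    """
    # Express every point in the local, object-centered basis.
    local = (points - origin) @ axes.T
    # Centre the grid on the local origin and compute integer bin indices.
    half = bins * bin_size / 2.0
    idx = np.floor((local + half) / bin_size).astype(int)
    # Keep only the points that fall inside the grid.
    inside = np.all((idx >= 0) & (idx < bins), axis=1)
    grid = np.zeros((bins, bins, bins))
    # Assumption: each bin stores a point count as a stand-in for a
    # surface measure; the source does not specify the bin contents.
    np.add.at(grid, tuple(idx[inside].T), 1.0)
    return grid
```

If a patch is moved by a rotation `R` and translation `t`, and the basis moves with it (origin `R @ origin + t`, axes `axes @ R.T`), the local coordinates of every point are unchanged, so the resulting grid is the same up to floating-point rounding at bin boundaries.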
This thesis presents a novel multiview correspondence algorithm which automatically establishes correspondences between unordered views of a free-form object with O(N) complexity. It also presents a novel algorithm for 3D free-form object recognition and segmentation in complex scenes containing clutter and occlusions. The combination of the strengths of the tensor representation and the customized use of a 4D hash table for matching constitutes the basic ingredient of these algorithms. This thesis demonstrates the superiority of the tensor representation in terms of descriptiveness over an existing competitor, the spin image representation. It also demonstrates that the proposed correspondence and recognition algorithms outperform spin image recognition in terms of accuracy and efficiency.
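The general idea of hash-table matching by voting can be sketched as follows. This is a hedged illustration, not the algorithm of the thesis: the abstract does not state what the four dimensions of the hash table are, so the key used here, three grid indices `(i, j, k)` plus a hypothetical quantized patch attribute `a`, is an assumption, as are the names `build_table` and `vote`.

```python
from collections import Counter, defaultdict

def build_table(model_tensors):
    """Index model tensors by 4D key (offline phase, illustrative).

    model_tensors : dict mapping a tensor id to (a, cells), where `a` is a
    hypothetical quantized attribute and `cells` is a set of occupied
    (i, j, k) grid indices.
    """
    table = defaultdict(list)
    for tensor_id, (a, cells) in model_tensors.items():
        for (i, j, k) in cells:
            table[(i, j, k, a)].append(tensor_id)
    return table

def vote(table, a, cells):
    """Cast one vote per matching 4D key; return candidates, best first."""
    votes = Counter()
    for (i, j, k) in cells:
        # .get avoids creating empty entries in the defaultdict.
        for tensor_id in table.get((i, j, k, a), ()):
            votes[tensor_id] += 1
    return votes.most_common()
```

Because each lookup is a constant-time dictionary access, the cost of matching one query tensor in this sketch depends on its number of occupied cells rather than on a pairwise comparison with every stored tensor.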
The tensor representation is extended to automatic and pose-invariant 3D face recognition. As the face is a non-rigid object, expressions can significantly change its 3D shape. Therefore, the last part of this thesis investigates representations and matching techniques for automatic 3D face recognition which are robust to facial expressions. A number of novelties are proposed in this area along with their extensive experimental validation using the largest available 3D face database. These novelties include a region-based matching algorithm for 3D face recognition, a 2D and 3D multimodal hybrid face recognition algorithm, fully automatic 3D nose ridge detection, fully automatic normalization of 3D and 2D faces, a low-cost rejection classifier based on a novel Spherical Face Representation, and finally, automatic segmentation of the expression-insensitive regions of a face.