Accurate and Detailed Image-Based 3D Documentation of Large Sites and Complex Objects
Published in Filippo Stanco, Sebastiano Battiato, Giovanni Gallo, Digital Imaging for Cultural Heritage Preservation, 2017
The state of the art in image matching is the multi-image approach, where multiple images are matched simultaneously rather than only pairwise. In [82] the different stereo and multi-view image matching algorithms are classified according to six fundamental properties: the scene representation, photo-consistency measure, visibility model, shape prior, reconstruction algorithm, and initialization requirements. According to [90], multi-view image matching and 3D reconstruction algorithms can be classified into (1) voxel-based approaches, which require knowledge of a bounding box containing the scene and whose accuracy is limited by the resolution of the voxel grid [87, 91, 92]; (2) algorithms based on deformable polygonal meshes, which demand a good starting point (such as a visual hull model) to initialize the corresponding optimization process, therefore limiting their applicability [84, 85]; (3) multiple depth-map approaches, which are more flexible but require the fusion of individual depth maps into a single 3D model [86, 93]; and (4) patch-based methods, which represent scene surfaces by collections of small patches (or surfels) [94]. On the other hand, in [33] image matching algorithms are classified as area-based or feature-based procedures, i.e., according to the two main classes of matching primitives: image intensity patterns (windows composed of gray values around a point of interest) and features (edges and regions). This leads, respectively, to area-based (e.g., cross-correlation or Least Squares Matching (LSM) [95]) and feature-based (e.g., relational, structural, etc.) matching algorithms.
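To make the area-based class concrete, the following is a minimal sketch (not the implementation used in the cited works) of matching via normalized cross-correlation: a gray-value window around a point of interest in one image is compared against candidate windows along the same row of a rectified second image, and the candidate with the highest correlation score is taken as the match. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def normalized_cross_correlation(template, window):
    """Area-based similarity between two equally sized gray-value patches."""
    t = template - template.mean()
    w = window - window.mean()
    denom = np.sqrt((t ** 2).sum() * (w ** 2).sum())
    return float((t * w).sum() / denom) if denom > 0 else 0.0

def best_match_along_row(left, right, x, y, half=3, max_disp=20):
    """For a point (x, y) in the left image of a rectified pair, search
    candidate disparities along the same row of the right image and
    return the disparity with the highest NCC score."""
    patch = left[y - half:y + half + 1, x - half:x + half + 1]
    scores = []
    for d in range(max_disp + 1):
        xr = x - d
        if xr - half < 0:
            break
        cand = right[y - half:y + half + 1, xr - half:xr + half + 1]
        scores.append((normalized_cross_correlation(patch, cand), d))
    return max(scores)[1]
```

Feature-based methods replace the raw intensity window with extracted primitives (edges, regions) and a structural similarity measure, which makes them more robust to illumination changes at the cost of sparser matches.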
A review of silhouette extraction algorithms for use within visual hull pipelines
Published in Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2020
Guido Ascenso, Moi Hoon Yap, Thomas Allen, Simon S. Choppin, Carl Payton
Because of such limitations, researchers in biomechanics have long been trying to develop markerless motion capture tools (Fernández-Baena et al. 2012; Moeslund et al. 2006) or to adapt existing ones to their needs (Ceseracciu et al. 2011). One such tool is the visual hull, a shape-from-silhouette method first developed by Laurentini in 1994 (Laurentini 1994), which uses 2D images to reconstruct the object of interest. In biomechanics, the visual hull has been used to study the biomechanical differences between three types of tennis serves (Sheets et al. 2011; Abrams et al. 2014), to perform gait analysis (Corazza et al. 2006), to study the biomechanics of the arm during front crawl swimming (Ceseracciu et al. 2011), and to analyse the movement pattern of gymnasts (Corazza et al. 2010). The first step in the visual hull pipeline is to separate the object of interest ('foreground') from the rest of the image ('background'), a process referred to as 'silhouette extraction' in this paper. To compute a visual hull, a silhouette needs to be extracted from each camera view (of which there may be several) at each frame of a possibly long video. Consequently, an automatic method for accurate silhouette extraction is necessary (Mikhnevich and Laurendeau 2014). The extent to which the accuracy of the silhouettes influences the accuracy of the reconstructed visual hull has not been investigated. However, Grauman et al. (2003) and Gall et al. (2009) have suggested that a small segmentation error in even just one camera view could have a significant effect on the reconstructed visual hull. Though those authors did not quantify this 'significant effect', it is reasonable to assume that higher-quality silhouettes would produce higher-quality visual hulls, making silhouette accuracy a critical bottleneck in the reconstruction of a visual hull.
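As a simple illustration of what silhouette extraction means in practice, the sketch below implements per-pixel background subtraction with a fixed threshold, one of the most basic methods of the kind discussed later in this review. The function name and threshold value are illustrative assumptions, not taken from any cited study.

```python
import numpy as np

def extract_silhouette(frame, background, threshold=30):
    """Label a pixel as foreground when its gray value differs from a
    static background model by more than `threshold` levels.
    Returns a boolean silhouette mask the same shape as the frame."""
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return diff > threshold
```

Real pipelines must additionally handle shadows, illumination changes, and a non-static background, which is precisely why more advanced extraction methods are needed.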
When constructing a visual hull, a 3D point is labelled as part of the visual hull if and only if its projection lies within the silhouette in all camera views; therefore, a single view with an erroneous silhouette can spoil the quality of the entire visual hull (Nobuhara et al. 2009). Several authors who have applied the visual hull do not mention the silhouette extraction method used in their studies (Mundermann et al. 2005; Nobuhara and Matsuyama 2006), while others have used basic silhouette extraction methods (Corazza et al. 2006; Vlasic et al. 2008; Furukawa and Ponce 2006; Lazebnik et al. 2007). The main reason such basic methods were used in these publications is simply that they were published before the advanced methods available today had been developed. Nowadays there are several methods that can rival the silhouette segmentation accuracy of humans (Goyette et al. 2012) and that take as little as a hundred milliseconds to extract a silhouette from a large high-quality image (Bakkay et al. 2018). It is our hope that, by presenting a detailed review of the methods available today for accurate silhouette extraction, future researchers in biomechanics will be able to make a more informed decision with regard to which silhouette extraction method to choose when using the visual hull as a tool for markerless motion capture.
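The membership test described above can be sketched directly: a candidate 3D point is projected into every camera with its 3x4 projection matrix and accepted only if it lands inside the silhouette in all views. This is a minimal illustration of the intersection principle, with hypothetical axis-aligned cameras, not the reconstruction code of any cited study.

```python
import numpy as np

def in_visual_hull(point, projections, silhouettes):
    """Return True iff `point` (a 3-vector) projects inside the
    silhouette mask of every camera view.

    projections  -- list of 3x4 projection matrices
    silhouettes  -- list of boolean masks, one per view, indexed [row, col]
    """
    for P, sil in zip(projections, silhouettes):
        u, v, w = P @ np.append(point, 1.0)  # homogeneous projection
        if w <= 0:  # point behind the camera
            return False
        x, y = int(round(u / w)), int(round(v / w))
        rows, cols = sil.shape
        if not (0 <= x < cols and 0 <= y < rows) or not sil[y, x]:
            return False  # one failing view rejects the point
    return True
```

Sweeping this test over a voxel grid yields the (conservative) volumetric visual hull, which is why a segmentation error in any single silhouette carves away true volume from the result.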