Basic Approaches of Artificial Intelligence and Machine Learning in Thermal Image Processing
Published in U. Snekhalatha, K. Palani Thanaraj, Kurt Ammer, Artificial Intelligence-Based Infrared Thermal Image Processing and Its Applications, 2023
The SIFT algorithm was developed in 2004 by D. Lowe. It proceeds in four stages: scale-space extrema detection, keypoint localization, orientation assignment, and keypoint descriptor construction (Lowe, 2004). The scale-space of an image is given by a function L(x, y, σ), created from the convolution of Gaussian kernels at varying scales with the input image, where (x, y) are the pixel coordinates and σ is the smoothing parameter. The scale-space is split into octaves, and the number of octaves and scales depends on the size of the original image. A SIFT feature is a selected image region (also known as a keypoint) accompanied by a descriptor. The SIFT detector extracts keypoints and the SIFT descriptor computes their descriptors; it is also common to employ the detector and the descriptor separately. A SIFT keypoint is a circular, oriented image region characterized by a geometric frame with four parameters: the x- and y-coordinates of the keypoint’s center, its scale (the region’s radius), and its orientation. As keypoints, the SIFT detector employs image structures that resemble “blobs.” Because it searches for blobs across scales and positions, the SIFT detector is insensitive to translation, rotation, and rescaling of the image.
Review of Image Tampering Detection Techniques
Published in S. Ramakrishnan, Cryptographic and Information Security, 2018
In a given image, the ‘interest points’ are computed at characteristic locations such as corners, blobs, and T-junctions [2]. Their most important property is repeatability: the same interest points can be reliably located under various viewing conditions. This is followed by the representation of each point's neighborhood as a feature vector. In essence, keypoint schemes have two components: detectors and descriptors. The Harris detector is regarded as the pioneering work in keypoint detection, featuring a combined corner and edge detection scheme [3]. The keypoint detection algorithms commonly used in the literature employ a concept called scale-space representation, proposed in the seminal works [4] and [5]. Scale-space representation essentially represents an image at multiple scales, applying a Gaussian or similar filter to identify points that dominate their neighbors across these scales.
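To make the detector side concrete, a minimal Harris corner response can be sketched from the structure tensor. The 3 × 3 averaging window and the sensitivity constant k = 0.04 are conventional but illustrative choices, not prescribed by the excerpt above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def harris_response(image, k=0.04):
    # Image gradients via central differences
    Iy, Ix = np.gradient(image.astype(float))

    def window_mean(a):
        # Average over a 3x3 window (edge-padded): box-filtered structure tensor
        p = np.pad(a, 1, mode='edge')
        return sliding_window_view(p, (3, 3)).mean(axis=(2, 3))

    Sxx = window_mean(Ix * Ix)
    Syy = window_mean(Iy * Iy)
    Sxy = window_mean(Ix * Iy)
    # Harris measure: det(M) - k * trace(M)^2 -- large and positive at corners,
    # negative along edges, near zero in flat regions
    return Sxx * Syy - Sxy**2 - k * (Sxx + Syy)**2

# A white square on black: its corners respond strongly, its edges do not
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
R = harris_response(img)
print(bool(R[8, 8] > 0), bool(R[16, 8] < 0))  # corner positive, edge negative
```

The sign pattern of the response is exactly what makes the Harris measure a combined corner and edge detector: both eigenvalues of the structure tensor are large only at a corner.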
IoT-Based Smart Surveillance: Role of Sensor Data Analytics and Mobile Crowd Sensing in Crowd Behavior Analysis
Published in Khan Pathan Al-Sakib, Crowd-Assisted Networking and Computing, 2018
Sabu M. Thampi, Elizabeth B. Varghese
One of the common feature extraction methods is the scale-invariant feature transform (SIFT) [29–31]. SIFT detects the locations of interest points as maxima or minima of the difference of Gaussians (DoG) in scale space: scale-space extrema of the DoG locate the potential key points. The key points are then refined by a Taylor-series expansion of the scale-space function, low-contrast candidates are discarded, and edge responses are eliminated with a corner test akin to the Harris measure; after this step, only the strong interest points remain. An orientation histogram is then computed around each interest point and a dominant orientation is assigned, creating key points with the same location and scale. Finally, the key point descriptor is obtained from orientation histograms over the 16 × 16 neighborhood around the key point, represented as a 128-dimensional vector. SIFT is well suited to feature matching, but it fails in complex scenarios such as dense crowds, where the number of objects in the scene is very large.
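The extrema-detection step can be sketched as follows: a candidate key point must exceed (or fall below) all 26 neighbours in its 3 × 3 × 3 scale-space cube. The scale values, blob size, and contrast threshold below are illustrative assumptions, and the sketch omits the Taylor-series refinement and edge rejection described above:

```python
import numpy as np

def gaussian_blur(image, sigma):
    # Separable Gaussian blur, kernel truncated at roughly 3 sigma
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 1, image)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 0, out)

def dog_extrema(image, sigmas=(0.8, 1.6, 3.2, 6.4), thresh=0.01):
    # DoG stack: differences of adjacent Gaussian-blurred images
    L = [gaussian_blur(image, s) for s in sigmas]
    D = np.stack([b - a for a, b in zip(L, L[1:])])
    keypoints = []
    # Keep a pixel only if it is the maximum or minimum of its 3x3x3 cube
    for s in range(1, D.shape[0] - 1):
        for y in range(1, D.shape[1] - 1):
            for x in range(1, D.shape[2] - 1):
                v = D[s, y, x]
                cube = D[s-1:s+2, y-1:y+2, x-1:x+2]
                if abs(v) > thresh and (v == cube.max() or v == cube.min()):
                    keypoints.append((x, y, s))
    return keypoints

img = np.zeros((64, 64))
img[30:34, 30:34] = 1.0   # a small bright blob
kps = dog_extrema(img)
print(any(28 <= x <= 35 and 28 <= y <= 35 for x, y, s in kps))
```

The blob is recovered at the intermediate scale level because, with scales separated by a constant ratio, the DoG response peaks at the scale matching the blob's size.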
Research status and prospect of visual image feature point detection in body measurements
Published in The Journal of The Textile Institute, 2022
Wenqian Feng, Yanli Hu, Xinrong Li, Yuzhuo Li
The SIFT algorithm was proposed by David Lowe in 1999 and refined in 2004 (Lowe, 2004). SIFT is a widely used feature point recognition method, and has been successfully applied in computer vision tasks such as target detection, target tracking, and large-scale image retrieval (Acharya et al., 2018). The algorithm is divided into four steps: scale space extreme value detection, feature point localization, determination of feature point orientation, and construction of feature point descriptors. For detecting the feature points of a 2D body-measurement image, only the first two steps are needed to locate the feature points. Gaussian blurring in different scale spaces, that is, convolution of the image with Gaussian kernels, combined with interval sampling of the image, is used to construct an image pyramid. Adjacent layers within each group of this Gaussian pyramid are then subtracted to obtain difference-of-Gaussians (DoG) images, and extrema of the DoG function are detected to locate the key points (Xu et al., 2019). An advantage of this algorithm is that it is relatively insensitive to illumination changes. In complex, noisy scenes, it detects prominent feature points such as edge points, corner points, and points with distinct brightness in bright and dark areas.
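A minimal sketch of the pyramid construction described above, assuming three octaves, four scales per octave, a base scale of 1.6, and downsampling by 2 between octaves (all illustrative parameter choices):

```python
import numpy as np

def blur(image, sigma):
    # Separable Gaussian blur, kernel truncated at roughly 3 sigma
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 1, image)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 0, out)

def build_pyramids(image, n_octaves=3, n_scales=4, sigma0=1.6):
    k = 2 ** (1.0 / (n_scales - 1))   # constant multiplicative factor between scales
    gauss, dog = [], []
    current = image.astype(float)
    for _ in range(n_octaves):
        octave = [blur(current, sigma0 * k**i) for i in range(n_scales)]
        gauss.append(octave)
        # DoG images: subtract each layer from the next one in the same octave
        dog.append([octave[i + 1] - octave[i] for i in range(n_scales - 1)])
        current = current[::2, ::2]   # halve the resolution for the next octave
    return gauss, dog

rng = np.random.default_rng(0)
img = rng.random((128, 128))
gauss, dog = build_pyramids(img)
print([d[0].shape for d in dog])  # [(128, 128), (64, 64), (32, 32)]
```

Each octave contains n_scales − 1 DoG images, which is why the scale-space extremum search in SIFT only examines the interior levels of each octave.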
Fast binary shape categorization
Published in The Imaging Science Journal, 2019
The scale space of an image is defined as a function, L(x, y, σ), produced from the convolution of a variable-scale Gaussian, G(x, y, σ), with the input image, I(x, y). When the Gaussian parameter σ is small, only small curvatures are removed. At coarser scales, i.e. with a larger Gaussian parameter σ, the larger curvatures are smoothed. To detect curvatures of different levels, we apply the difference of Gaussians between two consecutive scales as in [14]. The difference-of-Gaussian images, D(x, y, σ), can be computed from the difference of two nearby scales separated by a constant multiplicative factor k. As the Gaussian images can be computed in advance, D(x, y, σ) is computed as the difference between L(x, y, kσ) and L(x, y, σ). Figure 1 shows an original image convolved using a set of Gaussian kernels separated by a constant factor k.
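The constant factor k matters because the resulting DoG approximates the scale-normalised Laplacian of Gaussian, D ≈ (k − 1)σ²∇²G, which is the standard motivation for DoG-based blob and curvature detection. A one-dimensional numerical check of this approximation, with illustrative values σ = 1.6 and k = 1.1:

```python
import numpy as np

def gauss1d(x, sigma):
    # Normalised 1D Gaussian
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-10, 10, 2001)
sigma, k = 1.6, 1.1
# Difference of two Gaussians separated by the constant factor k
dog = gauss1d(x, k * sigma) - gauss1d(x, sigma)
# Scale-normalised second derivative (1D Laplacian) of the Gaussian, times (k - 1)
log = (k - 1) * sigma**2 * (x**2 / sigma**4 - 1 / sigma**2) * gauss1d(x, sigma)
# Relative error of the approximation D ~ (k - 1) * sigma^2 * laplacian(G)
err = np.abs(dog - log).max() / np.abs(dog).max()
print(err < 0.2)
```

The agreement improves as k approaches 1; in practice k is chosen as a trade-off between this fidelity and the number of blur levels that must be computed per octave.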
A region based remote sensing image fusion using anisotropic diffusion process
Published in International Journal of Image and Data Fusion, 2022
Bikash Meher, Sanjay Agrawal, Rutuparna Panda, Ajith Abraham
The scale space technique (Babaud et al. 1986) produces coarser-resolution images by convolving the original image with a Gaussian kernel. However, this method has a major disadvantage: it is difficult to locate the meaningful edges precisely at coarser scales. Perona and Malik (1990) proposed a new definition of the scale space technique using a diffusion process called the anisotropic diffusion (AD) process, which adapts the heat equation to digital images. AD is used to pre-process the image so as to efficiently retain its texture details. The image is viewed as a heat field, with each pixel acting as a heat flow; whether a pixel diffuses to its surroundings is determined by the relationship between the current pixel and the surrounding pixels. When the intensity difference between the current pixel and its neighbours is large, the neighbours may form a boundary; the current pixel then does not diffuse across it, and the boundary is preserved. Furthermore, it is known that isotropic diffusion performs inter-region smoothing, so edges are not preserved properly. The AD process overcomes this problem by employing intra-region smoothing; the benefit of this approach is that the edges remain sharp at every coarser resolution. In image processing, the concept of AD has been applied to decompose an image: it separates the pixels of the source image into two regions, homogeneous (base layer) and non-homogeneous (detail layer). The base layers are obtained by processing the input images using the AD technique, and the detail layers are obtained by subtracting the base layers from the source images.
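A minimal sketch of Perona–Malik diffusion and the resulting base/detail decomposition; the exponential conduction function, the conduction parameter κ = 0.1, the time step, and the iteration count are illustrative choices, not the settings of the cited paper:

```python
import numpy as np

def anisotropic_diffusion(image, n_iter=20, kappa=0.1, step=0.2):
    # Perona-Malik diffusion: smooth within regions, preserve strong edges
    u = image.astype(float).copy()
    for _ in range(n_iter):
        # Differences to the four nearest neighbours
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        # Conduction coefficient g(|grad|): close to 1 in flat areas,
        # close to 0 across strong edges, so edges block the heat flow
        g = lambda d: np.exp(-(d / kappa) ** 2)
        u += step * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u

# A vertical step edge plus noise: diffusion removes the noise, keeps the step
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
base = anisotropic_diffusion(noisy)   # base (homogeneous) layer
detail = noisy - base                 # detail (non-homogeneous) layer
flat = np.s_[:, :14]                  # a flat region away from the edge
print(bool(np.abs(base - clean)[flat].mean() < np.abs(noisy - clean)[flat].mean()))
```

Because the conduction coefficient vanishes across the step, the base layer keeps the edge sharp while the noise, whose local differences are small relative to κ, is diffused into the detail layer.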