MEMS Fabrication
Published in Mohamed Gad-el-Hak, MEMS, 2005
One example of a smart pixel configuration is embodied in the retina chip shown in Figure 3.28 [IMEC, 1994]. This integrated circuit chip works like the human retina, selecting only the necessary information from a presented image to greatly speed up image processing. The chip features 30 concentric circles of 64 pixels each. The pixels increase in size from 30 × 30 μm on the inner circle to 412 × 412 μm on the outer circle, and the circle radius increases exponentially with eccentricity. The center of the chip, called the fovea, is filled with 104 pixels measuring 30 × 30 μm placed in an orthogonal pattern. The total chip area is 11 × 11 mm. The chip is designed for applications in which real-time coordination between sensory perception and system control is of prime concern. The main application area is active vision, with potential applications in robot navigation, surveillance, recognition, and tracking. The system covers a wide field of view with a relatively low number of pixels without sacrificing overall resolution, which leads to a significant reduction in the required image-processing hardware and calculation time. The fast but rather insensitive large pixels on the rim of the retina chip quickly pick up a sudden movement in the scenery (peripheral vision), prompting a robot equipped with this “eye” to redirect itself toward the movement in order to focus on the moving object with the more sensitive fovea pixels.
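A minimal sketch of the log-polar geometry described above, assuming only the figures quoted in the text (30 rings of 64 pixels each, pixel pitch growing from 30 μm to 412 μm, plus 104 foveal pixels); the per-ring growth factor, ring radii, and foveal radius are illustrative, not taken from the chip's actual design data:

```python
import math

N_RINGS = 30          # concentric circles of pixels
PIXELS_PER_RING = 64  # pixels on each circle
P_INNER = 30.0        # inner-ring pixel pitch in micrometres
P_OUTER = 412.0       # outer-ring pixel pitch in micrometres

# Exponential growth of pixel pitch with ring index (eccentricity),
# chosen so that ring 0 has 30 um pixels and ring 29 has 412 um pixels.
GROWTH = (P_OUTER / P_INNER) ** (1.0 / (N_RINGS - 1))

def ring_layout(fovea_radius_um=250.0):
    """Return (radius, pixel_pitch) for each ring, starting just outside
    an assumed foveal region. The foveal radius is a placeholder value."""
    radius = fovea_radius_um
    layout = []
    for i in range(N_RINGS):
        pitch = P_INNER * GROWTH ** i
        radius += pitch            # each ring sits one pixel pitch further out
        layout.append((radius, pitch))
    return layout

if __name__ == "__main__":
    for i, (r, p) in enumerate(ring_layout()):
        print(f"ring {i:2d}: radius ~{r:7.1f} um, pixel {p:5.1f} x {p:5.1f} um")
    # Total pixel count: 30 rings x 64 pixels + 104 foveal pixels.
    print("total pixels:", N_RINGS * PIXELS_PER_RING + 104)
```

The sketch shows why such a layout covers a wide field of view cheaply: the pixel count grows linearly with the number of rings while the covered radius grows roughly exponentially.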
Planning and robotics
Published in Janet Finlay, Alan Dix, An Introduction to Artificial Intelligence, 2020
Controlling a robot’s limbs is not intrinsically difficult, but it typically involves a complicated series of translations between co-ordinate systems. Feedback can be used to compensate for slackness and inaccuracy and also allows local planning. It enables closed-loop control, which is more robust than preplanned open-loop control. Pressure feedback is especially useful, as it allows compliant motion to be used to position objects. Many mobile robots use wheels or tracks, but some walk on one, two or more legs. Again, it is usually best not to preplan movements, but instead to let the robot constantly start to fall over and recover. Active vision uses movement of the robot or camera adjustments to give more information about a scene and resolve ambiguities.
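As an illustration of the closed-loop idea (not of any controller from the book), here is a minimal sketch of proportional feedback on a single joint; the gain, time step, and actuator error model are all assumed for the example:

```python
def closed_loop_step(target_angle, measured_angle, gain=2.0, dt=0.01):
    """One cycle of simple proportional (closed-loop) control: the command
    depends on the *measured* error, so slack or drift in the joint is
    continually corrected rather than accumulating as in open-loop replay."""
    error = target_angle - measured_angle
    velocity_command = gain * error
    return measured_angle + velocity_command * dt  # new commanded angle

# Toy simulation: the joint responds imperfectly (10% undershoot plus a small
# constant disturbance). A preplanned open-loop trajectory would drift, but
# the feedback loop keeps pulling the joint back toward the target.
angle = 0.0
for _ in range(500):
    commanded = closed_loop_step(target_angle=1.0, measured_angle=angle)
    angle += 0.9 * (commanded - angle) - 0.0005   # imperfect actuator
print(f"final angle: {angle:.3f} rad (target 1.0)")
```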
Active vision systems and catadioptric vision systems for robotic applications
Published in João Manuel, R. S. Tavares, R. M. Natal Jorge, Computational Modelling of Objects Represented in Images, 2018
The issue of integrating panoramic and omnidirectional vision systems into robotic devices has to take into account the relevant information to be extracted (and its usefulness for the tasks that have to be performed). In many robotic applications, the difficulties that researchers faced when trying to extract information from images (or sequences of images) acquired under uncontrolled conditions led to the development of solutions based on active systems, in which several camera parameters can be actively changed. This type of solution resulted from the realization that extracting information from images could be facilitated if the acquisition parameters could be changed to suit the specific nature of the data to be extracted. This kind of approach was strongly influenced by the views of the psychophysicist J. J. Gibson (Gibson 1983; Gibson 1987), who developed the concept of “ecological perception”. According to his view, it would be a mistake to consider the problems of visual perception from an abstract and static point of view: perception depends entirely on a “matrix of stimuli” and is a direct consequence of the properties of the environment. Gibson’s ideas require active organisms; perception is a function of the interaction between the organism and the environment. These ideas were first adopted in computer vision by Ruzena Bajcsy, who proposed the concept of active perception (Bajcsy 1985). In 1987, in a paper presented at the First International Conference on Computer Vision (ICCV), Yiannis Aloimonos and co-workers proposed the concept of active vision (Aloimonos et al. 1988) and formally proved the advantages of the approach. In particular, they proved that an active observer can transform problems such as shape-from-shading, shape-from-contour and shape-from-texture from ill-posed into well-posed problems. Two other papers were important for establishing solutions based on active vision, namely (Bajcsy 1988) and (Ballard 1991). In both papers, active vision is defined as a sequential and interactive process for selecting and analyzing parts of a scene. The main idea is that such a process can decrease the amount of computation required by the visual process, since it reduces the amount of information to be processed by selecting only the features of the visual scene that are relevant to the task at hand. Active vision draws inspiration from the methods and processes that both mammals and insects use to extract information from the environment; the wide variety of biological vision systems suggests that the solutions are highly dependent on the specific task or problem to be dealt with.
Viewpoint optimization for aiding grasp synthesis algorithms using reinforcement learning
Published in Advanced Robotics, 2018
B. Calli, W. Caarls, M. Wisse, P. Jonker
Active vision methods are utilized in many robotics applications, e.g. surveillance, inspection, object recognition, tracking, and path planning (a comprehensive list and a comparison of methods can be found in [17] and [18], respectively). Among these applications, the next-best-view planning problem for object recognition (active object recognition) [19] resembles viewpoint optimization for grasp synthesis. Active object recognition algorithms provide methods to alter the viewpoint of the vision sensor in order to obtain a descriptive view of a 3D object for recognizing it. One of the common solutions in the literature is view selection by maximizing mutual information [20–22]: the next viewpoint is planned by searching the whole action space for the action that maximizes the mutual information between object class and observation, given the previous actions and observations. Another common approach is increasing the discriminative information among the class predictions by entropy minimization [23,24]. Learning techniques are also utilized for active object recognition, in which a policy that maps states to actions is learned to increase the discriminative information; a very common way of learning this policy is via reinforcement learning algorithms [25–27]. While designing our viewpoint optimization strategy for grasp synthesis, we applied a framework similar to [27] with the following differences:

- We train our system specifically for boosting grasp synthesis rather than object recognition performance, which requires a completely different process.
- Our method is designed to generate brief and continuous motions; the policy does not output discrete camera positions on the viewsphere for the next-best view but generates local exploration directions in the camera coordinate frame.
- While the camera moves on the viewsphere, we continuously fuse the acquired point clouds in order to benefit from all the available data. In this way, even brief motions can provide the information required for a good grasp. This approach requires a different state description than [27].
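To make the mutual-information view-selection idea cited above concrete, here is a minimal sketch (not the authors' algorithm, nor that of [20–22]) that scores candidate viewpoints by the expected reduction in class entropy; the class prior, candidate views, and observation model p(observation | class, view) are all invented for the example:

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (zero entries handled safely)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_entropy_after_view(prior, likelihoods):
    """Expected posterior entropy over object classes after observing from one
    candidate view. `likelihoods[c, o]` = p(observation o | class c, view)."""
    p_obs = prior @ likelihoods                       # marginal probability of each observation
    expected_h = 0.0
    for o, p_o in enumerate(p_obs):
        if p_o == 0:
            continue
        posterior = prior * likelihoods[:, o] / p_o   # Bayes update for this outcome
        expected_h += p_o * entropy(posterior)
    return expected_h

def next_best_view(prior, views):
    """Pick the view whose expected observation most reduces class entropy,
    i.e. maximizes the mutual information I(class; observation | view)."""
    gains = [entropy(prior) - expected_entropy_after_view(prior, lik) for lik in views]
    return int(np.argmax(gains)), gains

if __name__ == "__main__":
    prior = np.array([0.5, 0.3, 0.2])                     # belief over 3 object classes
    views = [                                              # 2 candidate views, 2 possible observations each
        np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]),    # discriminative view
        np.array([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]),    # uninformative view
    ]
    best, gains = next_best_view(prior, views)
    print("information gain per view:", gains, "-> choose view", best)
```

Searching over discrete candidate views like this is what distinguishes classical next-best-view planning from the continuous, local exploration directions generated by the policy described above.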