Neural Networks for Medical Image Computing
Published in K. Gayathri Devi, Kishore Balasubramanian, Le Anh Ngoc, Machine Learning and Deep Learning Techniques for Medical Science, 2022
V.A. Pravina, P.K. Poonguzhali, A Kishore Kumar
Unlabeled images were exploited effectively by formulating a self-supervised learning algorithm; the learned image features are useful for subsequent image analysis. The context-restoration task was applied to ultrasound, CT, and magnetic resonance images [2]. In the analysis stage, feature maps were extracted from the input images by convolutional units interleaved with downsampling layers. The reconstruction stage consisted of convolutional units with upsampling layers, which restored the context information in the output images. In all three modalities, the algorithm yielded improved performance.
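The context-restoration pretext task can be illustrated with a minimal sketch: patches of an image are repeatedly swapped, and a network is then trained to recover the original from the corrupted version. This is an illustrative NumPy construction of the training pair only, not the implementation from [2]; the function name and defaults are assumptions.

```python
import numpy as np

def make_context_restoration_pair(image, num_swaps=10, patch=8, rng=None):
    """Build a (corrupted, target) pair for the context-restoration pretext
    task: repeatedly swap two non-overlapping patches, then train a network
    to restore the original image from the corrupted version."""
    rng = np.random.default_rng(rng)
    corrupted = image.copy()
    h, w = image.shape[:2]
    for _ in range(num_swaps):
        # Draw two patch locations until they do not overlap, so the swap
        # only rearranges pixel context and preserves the intensity histogram.
        while True:
            y1, x1 = rng.integers(0, h - patch + 1), rng.integers(0, w - patch + 1)
            y2, x2 = rng.integers(0, h - patch + 1), rng.integers(0, w - patch + 1)
            if abs(y1 - y2) >= patch or abs(x1 - x2) >= patch:
                break
        tmp = corrupted[y1:y1 + patch, x1:x1 + patch].copy()
        corrupted[y1:y1 + patch, x1:x1 + patch] = corrupted[y2:y2 + patch, x2:x2 + patch]
        corrupted[y2:y2 + patch, x2:x2 + patch] = tmp
    return corrupted, image
```

Because only patch positions change, the corrupted image keeps the same pixel intensity distribution as the original, which is one motivation given for this pretext task.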
Big Data Analytics and Machine Learning for Industry 4.0: An Overview
Published in G. Rajesh, X. Mercilin Raajini, Hien Dang, Industry 4.0 Interoperability, Analytics, Security, and Case Studies, 2021
Nguyen Tuan Thanh Le, Manh Linh Pham
Self-supervised learning is a specific kind of supervised learning in which the machine learns without human-annotated labels [19]. Labels are still present, but they are generated from the input data itself, typically by a heuristic algorithm [19].
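A common instance of such a heuristic label generator is image rotation: the rotation applied to each image becomes its label, with no human annotation involved. The sketch below is illustrative (the function name is an assumption, not from [19]):

```python
import numpy as np

def rotation_pretext_batch(images):
    """Create labeled training pairs without human annotation: each image
    is rotated by 0, 90, 180, and 270 degrees, and the label is simply the
    index k of the rotation that was applied."""
    inputs, labels = [], []
    for img in images:
        for k in range(4):
            inputs.append(np.rot90(img, k))  # heuristic transformation
            labels.append(k)                 # label derived from the input itself
    return inputs, labels
```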
Height estimation from single aerial imagery using contrastive learning based multi-scale refinement network
Published in International Journal of Digital Earth, 2023
Wufan Zhao, Hu Ding, Jiaming Na, Mengmeng Li, Dirk Tiede
Because the majority of images are unlabeled, much research has sought to optimize neural network training with only a small number of annotated datasets. Self-supervised learning creates a pretext task from unlabeled input alone so that the network acquires useful visual representations before performing task-specific supervised operations such as classification or object detection. Doersch, Gupta, and Efros (2015) divide an image into many non-overlapping patches and train a neural network to estimate their relative placements. Gidaris, Singh, and Komodakis (2018) add a rotation pretext: the input image is rotated and the network predicts the amount of rotation applied. Gao, Sun, and Liu (2022) proposed a self-supervised approach for scene classification that leverages the spatial relationship between object proposals and their context. Similarly, Li et al. (2022) proposed a contrastive learning approach for semantic segmentation of aerial images that learns to segment objects based on their visual similarity and spatial context.
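The relative-placement pretext of Doersch, Gupta, and Efros (2015) can be sketched as sampling a centre patch and one of its eight neighbours, with the neighbour's grid position as the label. This is a simplified NumPy illustration under assumed names, not the authors' implementation (which also adds gaps and jitter between patches):

```python
import numpy as np

# The 8 neighbour offsets (dy, dx) around the centre cell of a 3x3 grid.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def relative_patch_pair(image, patch=16, rng=None):
    """Sample (centre_patch, neighbour_patch, label) from an image.
    The label (0-7) indexes which of the 8 neighbouring grid cells the
    second patch was taken from; the network must predict it."""
    rng = np.random.default_rng(rng)
    # Top-left corner of the centre cell, chosen so all 9 cells fit.
    cy = int(rng.integers(patch, image.shape[0] - 2 * patch + 1))
    cx = int(rng.integers(patch, image.shape[1] - 2 * patch + 1))
    label = int(rng.integers(0, 8))
    dy, dx = OFFSETS[label]
    centre = image[cy:cy + patch, cx:cx + patch]
    neighbour = image[cy + dy * patch:cy + dy * patch + patch,
                      cx + dx * patch:cx + dx * patch + patch]
    return centre, neighbour, label
```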
A Corneal Surface Reflections-Based Intelligent System for Lifelogging Applications
Published in International Journal of Human–Computer Interaction, 2023
Tharindu Kaluarachchi, Shamane Siriwardhana, Elliott Wen, Suranga Nanayakkara
Self-supervised learning (SSL) is a type of representation learning that aims to represent a given data distribution in lower dimensions. What distinguishes SSL is that it can do this without annotated data, using a certain set of tasks called pretext tasks. For image data, these tasks can be as simple as predicting whether an image is upright or rotated by 90°, 180°, or 270°. Examples of such tasks are rotation recognition (Gidaris et al., 2018), solving a 3 × 3 jigsaw puzzle (Kim et al., 2018), determining a relative patch location (Doersch et al., 2015), and identifying whether two modified image segments come from the same image or from different images. Labels for these tasks can be generated automatically (self-generated) while training is happening, which is the main reason this paradigm is called self-supervised rather than unsupervised. These pretext tasks help the network learn the general nature of the data distribution, which can later serve as the starting point for a downstream task such as object classification. This can be done in two ways: the first is to fine-tune the SSL-trained network; the second is to use the frozen network as a feature extractor and train a smaller network on those features (see Figure 3). SSL training of the model is generally referred to as pretraining, since another round of training or fine-tuning is expected later. The second stage requires only a very small amount of annotated data because the network is already familiar with the data distribution. In our work, we use self-supervisory signals to pretrain the object recognition model.
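The jigsaw pretext task mentioned above can be sketched in a few lines: the image is cut into a 3 × 3 grid of tiles, the tiles are shuffled by a permutation drawn from a fixed set, and the network must predict which permutation was used. This is an illustrative NumPy construction with assumed names, not the setup of Kim et al. (2018); in practice a small fixed subset of the 9! possible permutations is chosen so the prediction stays a tractable classification problem.

```python
import numpy as np

def jigsaw_pretext(image, perms, rng=None):
    """Cut `image` into a 3x3 grid of tiles, shuffle them with a permutation
    drawn from the fixed set `perms`, and return (puzzle, label), where the
    label is the index of the permutation that was applied."""
    rng = np.random.default_rng(rng)
    h, w = image.shape[0] // 3, image.shape[1] // 3
    tiles = [image[i * h:(i + 1) * h, j * w:(j + 1) * w]
             for i in range(3) for j in range(3)]
    label = int(rng.integers(len(perms)))
    shuffled = [tiles[k] for k in perms[label]]
    # Reassemble the shuffled tiles into a single image.
    puzzle = np.block([[shuffled[3 * i + j] for j in range(3)]
                       for i in range(3)])
    return puzzle, label
```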
Self-supervised optical flow derotation network for rotation estimation of a spherical camera
Published in Advanced Robotics, 2021
Dabae Kim, Sarthak Pathak, Alessandro Moro, Atsushi Yamashita, Hajime Asama
In this paper, we proposed a self-supervised learning approach for rotation estimation of a spherical camera. In general, fully supervised learning approaches require a large amount of labeled data, which is difficult to acquire. By contrast, our self-supervised approach accomplishes training without any labeled data. The approach is unique to spherical cameras, owing to their property that optical flow can be derotated to decouple the rotational and translational optical flow components. For the regression of the camera rotation, we adopted the optical flow moment, computed from the derotated optical flow. We experimentally confirmed that the estimation error of our approach decreased compared with the previous SfMLearner approach, and that its performance was comparable to that of a fully supervised learning approach. This implies that our approach can effectively estimate the camera rotation without any labeled data. In addition, several ablation studies demonstrated that batch normalization improved the estimation performance and that optical flow served as more robust training data than raw images. Finally, transfer learning with newly captured datasets was conducted, confirming a further performance improvement.
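The decoupling property can be sketched geometrically: for bearings on the unit sphere, a pure camera rotation with angular velocity ω induces the instantaneous flow v = -ω × p at each bearing p, so subtracting that component removes the rotational part of the flow. This is a simplified small-motion illustration with assumed function names, not the paper's flow-moment formulation, and the sign convention (camera rotating with ω) is an assumption.

```python
import numpy as np

def rotational_flow(bearings, omega):
    """Optical flow on the unit sphere induced by a pure camera rotation
    with angular velocity `omega`: each bearing p moves with v = -omega x p
    (instantaneous, small-motion approximation)."""
    return -np.cross(omega, bearings)

def derotate(flow, bearings, omega):
    """Subtract the rotation-induced component from the measured flow,
    ideally leaving only the translational part."""
    return flow - rotational_flow(bearings, omega)
```

Under this model, derotating the flow of a purely rotating camera leaves a zero residual, which is what makes the residual flow a useful translation-only signal.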