Physics-Guided Recurrent Neural Networks for Predicting Lake Water Temperature
Published in Anuj Karpatne, Ramakrishnan Kannan, Vipin Kumar, Knowledge-Guided Machine Learning, 2023
Xiaowei Jia, Jared D. Willard, Anuj Karpatne, Jordan S. Read, Jacob A. Zwart, Michael Steinbach, Vipin Kumar
Researchers have explored different ways to physically inform a model's starting state. One way to inform the model initialization is to use transfer learning. In transfer learning, a model can be pre-trained on a related task prior to being fine-tuned with limited training data to fit the desired task. The pre-trained model serves as an informed initial state that is ideally closer to the desired parameters for the target task than a random initialization. For example, researchers in computer vision have used large-scale datasets such as ImageNet to pre-train network models with the aim of learning useful feature extractors before fine-tuning them on the target dataset [16]. Similarly, in scientific problems, one way to harness physical knowledge is to use the physics-based model's simulated data to pre-train the ML model, which also alleviates data paucity issues. Here the availability of simulation data is not a limitation, which makes it possible to train even highly complex ML models. However, the simulation data may be inaccurate due to the approximations and parameterizations used in physics-based models. The pre-trained model is generally expected to do only as well as the physics-based model used for generating the simulation data. Depending on the bias of the physics-based model, varying amounts of true observations may be needed to fine-tune the pre-trained model into a quality model.
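A minimal PyTorch sketch of this pre-train-then-fine-tune idea, not the authors' actual model: `TempLSTM`, the tensor shapes, and the random `sim_*`/`obs_*` tensors are illustrative stand-ins for a physics-based simulator's output and a small set of real observations.

```python
import torch
import torch.nn as nn

class TempLSTM(nn.Module):
    """Toy sequence model standing in for the chapter's RNN."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out)  # predicted temperature at every time step

def fit(model, x, y, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()

# Hypothetical data: abundant simulator output, scarce real observations.
sim_x, sim_y = torch.randn(64, 100, 8), torch.randn(64, 100, 1)
obs_x, obs_y = torch.randn(4, 100, 8), torch.randn(4, 100, 1)

model = TempLSTM(n_features=8)
fit(model, sim_x, sim_y, epochs=50, lr=1e-3)  # pre-train on simulations
fit(model, obs_x, obs_y, epochs=20, lr=1e-4)  # fine-tune on observations
```

The fine-tuning stage reuses the pre-trained weights as the informed initial state, typically with a smaller learning rate so the limited observations refine rather than overwrite what was learned from the simulator.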
Deep Learning Algorithms for Cognitive IoT Solutions
Published in Pethuru Raj, Anupama C. Raman, Harihara Subramanian, Cognitive Internet of Things, 2022
Pethuru Raj, Anupama C. Raman, Harihara Subramanian
Transfer Learning – Most DL applications use the transfer learning method, which involves fine-tuning a pre-trained model. We start with an existing network such as GoogLeNet and feed in new data, making the few necessary tweaks to the NN. Once the model reaches sufficient accuracy, it can take on the new task. The advantage of this approach is that there is no need to provide a large amount of data to arrive at a competent model. When the amount of data is not massive, the time, energy, and space complexities are bound to be lower. Transfer learning needs an exposed interface to the internals of the chosen existing network, so that those internals can be modified and enhanced for the new task.
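As a rough illustration (not from the chapter), the sketch below loads an ImageNet-pre-trained GoogLeNet from torchvision and swaps its final layer for a hypothetical five-class task; the exposed `fc` attribute is exactly the kind of interface to the network's internals the text describes.

```python
import torch.nn as nn
from torchvision import models

# GoogLeNet pre-trained on ImageNet; torchvision exposes the network's
# internals, so its final layer can be swapped for the new task.
net = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)

for p in net.parameters():  # freeze the pre-trained weights
    p.requires_grad = False

net.fc = nn.Linear(net.fc.in_features, 5)  # hypothetical 5-class new task
# Only the newly added net.fc is then trained on the (small) new dataset.
```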
A Study and Comparative Analysis of Various Use Cases of NLP Using Sequential Transfer Learning Techniques
Published in R. Sujatha, S. L. Aarthy, R. Vettriselvan, Integrating Deep Learning Algorithms to Overcome Challenges in Big Data Analytics, 2021
R. Mangayarkarasi, C. Vanmathi, Rachit Jain, Priyansh Agarwal
This section briefly discusses the application of transfer learning to various tasks. Previously, those tasks were implemented using machine learning techniques and, more recently, using Deep Learning architectures. One of the major problems of machine learning is that the models are highly dependent on large amounts of high-quality data. Unfortunately, such datasets are rarely available and are highly expensive to access. With the help of transfer learning, there is less need for high-quality data. In transfer learning, a pre-trained model that has already been trained on a task with abundant labeled training data is adapted to handle similar tasks with less data. These pre-trained models are often faster than traditional machine learning models and can contribute to the SOTA in a variety of tasks. This chapter demonstrates transfer learning models in designing predictive modeling tasks such as sentiment analysis (SA) and named entity recognition (NER).
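A hedged sketch of this sequential transfer learning setup for SA, using the Hugging Face transformers library; the checkpoint name, two-label setup, and toy batch are illustrative assumptions rather than the chapter's configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Pre-trained checkpoint (illustrative choice); the sequence-classification
# head on top is newly initialised and must be fine-tuned on labelled data.
name = "bert-base-uncased"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# Toy sentiment-analysis batch: 1 = positive, 0 = negative.
batch = tok(["great movie", "terrible plot"], padding=True, return_tensors="pt")
loss = model(**batch, labels=torch.tensor([1, 0])).loss
loss.backward()  # an optimizer step on top of this completes one fine-tuning step
```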
Swarm of reconnaissance drones using artificial intelligence and networking
Published in Australian Journal of Multi-Disciplinary Engineering, 2023
Aashrith A Jain, Akarsh Saraogi, Pawan Sharma, Vibhav Pandit, Shivakumar R Hiremath
MediaPipe Holistic utilises several techniques such as data augmentation, transfer learning, and model ensembling. Data augmentation involves generating additional training data by applying various transformations to the existing data, such as rotation, scaling, and cropping. Transfer learning involves using a pre-trained model as a starting point and fine-tuning it on a new dataset. Model ensembling involves combining the outputs of multiple models to improve accuracy and reduce the risk of overfitting. To ensure optimal performance and maximum efficiency, the method is configured to utilise only the essential landmarks, such as those of the body and hands, while avoiding superfluous landmarks, which would only increase the complexity and running time of the program. It is worth noting that the complete MediaPipe Holistic system includes over 540 landmarks, but the ML model utilises only the landmarks that are necessary for the tasks at hand, as sketched below.
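A minimal sketch of that landmark selection with the Python mediapipe package; the input file name is a placeholder, and the flattening of coordinates is one illustrative choice, not the paper's exact pipeline.

```python
import cv2
import mediapipe as mp

holistic = mp.solutions.holistic.Holistic(static_image_mode=True)
frame = cv2.imread("frame.jpg")  # placeholder input image
results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

# Keep only body and hand landmarks; skip the dense face mesh, which
# accounts for most of the 540+ landmarks the full system produces.
keypoints = []
for lm_set in (results.pose_landmarks,
               results.left_hand_landmarks,
               results.right_hand_landmarks):
    if lm_set is not None:  # a part may be out of frame
        keypoints.extend((lm.x, lm.y, lm.z) for lm in lm_set.landmark)
```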
Enhancing deep learning techniques for the diagnosis of the novel coronavirus (COVID-19) using X-ray images
Published in Cogent Engineering, 2023
Maha Mesfer Meshref Alghamdi, Mohammed Yehia Hassan Dahab, Naael Homoud Abdulrahim Alazwary
An improved ResNet-50 CNN architecture called COVIDResNet was proposed in another study (Farooq & Hafeez, 2020). The experiment was conducted by progressively resizing the input images to 128 x 128 x 3, 224 x 224 x 3 and 229 x 229 x 3 pixels and selecting the learning rate automatically when fine-tuning the network at each stage. The results showed high accuracy and computational efficiency for multi-class classification. In another study, a 24-layer CNN model for the classification of COVID-19 and normal images was developed by Panwar et al. (2020). The model, called nCOVnet, was trained on an X-ray dataset and produced an accuracy of up to 97%. Zhang et al. developed a new deep-learning-supported anomaly detection model for COVID-19 using X-ray images. When the threshold was set to 0.25, their model produced a sensitivity of 90% and a specificity of 87.84%. Similarly, another transfer learning-based CNN model for detecting COVID-19 was proposed by Narin et al. (2021). They used the ResNet50, InceptionV3, and Inception-ResNetV2 pre-trained models for transfer learning, and their simulation results showed that the ResNet50-based model produced the best results. In the following section, we provide details of how our study, which focused on multiclass classification of COVID-19, was carried out.
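As a hedged illustration of the ResNet50-based transfer learning these studies describe (not any one paper's exact configuration), the final layer of the pre-trained network is replaced to match the number of X-ray classes:

```python
import torch.nn as nn
from torchvision import models

# ImageNet-pre-trained ResNet50 with its final layer replaced for three
# illustrative classes (e.g., COVID-19 / normal / pneumonia).
net = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
net.fc = nn.Linear(net.fc.in_features, 3)
# The whole network (or just net.fc) is then fine-tuned on labelled X-rays.
```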
Classification of brain tumours from MR images with an enhanced deep learning approach using densely connected convolutional network
Published in Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2023
R. Meena Prakash, R. Shantha Selva Kumari, K. Valarmathi, K. Ramalakshmi
Transfer learning is a machine learning method in which knowledge gained in one task is transferred to learn a new related task. Pre-trained models that have already been trained on a huge database of images are used and fine-tuned to classify new data. Hence, it is not necessary to train from scratch, and the computational complexity is greatly reduced. Transfer learning is successfully employed in image classification, especially where limited training data are available, as with medical images. The popular pre-trained models are VGG-16, ResNet50, DenseNet121, GoogLeNet and others, which are trained on the huge ImageNet database consisting of more than 1 million images. The VGG16 architecture is shown in Figure 2. It consists of 13 convolutional layers with ReLU activation and three dense layers. The filter size of the convolutional layers is 3 × 3, and the layers have channel depths of 64, 128, 256 and 512. Softmax activation is used in the last fully connected layer (FCL) (Simonyan and Zisserman 2015).
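A minimal torchvision sketch of fine-tuning VGG16 as described; the four-class head is an illustrative assumption, not the paper's exact setup.

```python
import torch.nn as nn
from torchvision import models

# VGG16 pre-trained on ImageNet: the 13 convolutional layers live in
# `features`, the three dense layers in `classifier`.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

for p in vgg.features.parameters():  # reuse the convolutional feature extractor
    p.requires_grad = False

# Swap the last fully connected layer for a hypothetical 4-class tumour task.
vgg.classifier[6] = nn.Linear(vgg.classifier[6].in_features, 4)
```

Freezing the convolutional block and retraining only the classifier head is what keeps the computational cost far below training from scratch, which is the main appeal noted above for medical imaging datasets.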