Quantum Artificial Intelligence for the Science of Climate Change
Published in Thiruselvan Subramanian, Archana Dhyani, Adarsh Kumar, Sukhpal Singh Gill, Artificial Intelligence, Machine Learning and Blockchain in Quantum Satellite, Drone and Network, 2023
Manmeet Singh, Chirag Dhara, Adarsh Kumar, Sukhpal Singh Gill, Steve Uhlig
AI algorithms suffer from two main problems: one is the availability of good quality data and the other is the computational resources needed to process big data at the scale of planet Earth. The impediments to the growth of AI-based modelling can be understood from the way language models have developed over the past decade. In the early days of deep learning's success, advances were concentrated in computer vision, while Natural Language Processing (NLP) lagged behind. Many researchers tried different algorithms for NLP problems, but the only solution that broke the ice was increasing the depth of the neural networks. Present-day Generative Pre-trained Transformer (GPT), Bidirectional Encoder Representations from Transformers (BERT) and Text-To-Text Transfer Transformer (T5) models are the evolved versions from that era. Maximizing gains from the rapid advances in AI algorithms requires that they be complemented by hardware developments; quantum computing is an emerging field in this regard [14,15].
An Unexpected Renaissance Age
Published in Alessio Plebe, Pietro Perconti, The Future of the Artificial Mind, 2021
Alessio Plebe, Pietro Perconti
Transformer proved to be the turning point, boosting DL for language, with rapid progress over just a few years. The BERT (Bidirectional Encoder Representations from Transformers) model by Devlin et al. (2019), as the name reveals, applies attention to both the left and the right context of every word in the sentence. In the original Transformer, the attention score is first computed for each word in the sentence with respect to all the other words, but in the decoder the output words are generated one at a time, so attention is used only with respect to the previous words. But above all, BERT establishes a new general approach to most language application tasks. A basic model captures as much knowledge about a language as possible by training on huge text corpora, up to several billion words. One or a few additional layers are then added on top of this model, and a second training pass fine-tunes it for a custom language task. Almost every natural language processing task can be performed this way, with the advantage of a deeper and more intimate understanding of how the language works. Tasks include sentence classification, question answering, named entity recognition, automatic summarization, sentiment analysis, conversation, and machine translation. A similar approach is followed by OpenAI's GPT (Generative Pre-trained Transformer) (Brown et al., 2020), the latest version of which, GPT-3, generated the article in The Guardian shown at the beginning of this section.
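The pretrain-then-fine-tune recipe described in this excerpt can be made concrete with a short sketch. The snippet below is a minimal illustration, assuming the Hugging Face `transformers` and PyTorch libraries (neither is named in the excerpt): it loads a pretrained BERT encoder, adds a single classification layer on top, and runs one fine-tuning step for sentence classification.

```python
# Minimal sketch of the BERT pretrain/fine-tune recipe (assumes the
# Hugging Face `transformers` and `torch` packages; illustrative only).
import torch
from transformers import BertTokenizer, BertModel

class SentenceClassifier(torch.nn.Module):
    def __init__(self, num_labels=2):
        super().__init__()
        # Pretrained base model: already trained on huge text corpora.
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # One additional layer added on top for the custom task.
        self.head = torch.nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Classify the sentence from the [CLS] token's representation.
        return self.head(out.last_hidden_state[:, 0])

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = SentenceClassifier(num_labels=2)

# Second, much shorter training pass: fine-tune on task-specific labels
# (two toy sentences stand in for a real labelled dataset).
batch = tokenizer(["a great movie", "a dull movie"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

logits = model(batch["input_ids"], batch["attention_mask"])
loss = torch.nn.functional.cross_entropy(logits, labels)
loss.backward()
optimizer.step()
```

The design point the excerpt makes is visible in the code: the expensive language knowledge lives in the pretrained `bert` module, while the task-specific part is a single cheap linear layer trained in the second pass.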
Working Machines
Published in Robert H. Chen, Chelsea Chen, Artificial Intelligence, 2022
In 2020, the Elon Musk-cofounded OpenAI announced its Generative Pre-trained Transformer version 3 (GPT-3), which could not only write computer programs but also, through training on an enormous database crawled from the Internet and billions of parameters, compose prose and poetry using reinforcement and unsupervised learning, and indeed write any text up to 50,000 words.
Improving social media use for disaster resilience: challenges and strategies
Published in International Journal of Digital Earth, 2023
Nina S. N. Lam, Michelle Meyer, Margaret Reams, Seungwon Yang, Kisung Lee, Lei Zou, Volodymyr Mihunov, Kejin Wang, Ryan Kirby, Heng Cai
To enable social media research by a wide range of researchers with diverse perspectives, some of the technical challenges will need to be addressed. First (Strategy 17), developing user-friendly open-source software tools for social media data collection and processing is a fundamental step. The tools should be available to end-users (e.g. social scientists, first responders) with minimal technical detail (e.g. server configurations, programming against APIs) so that they can focus on their core studies and responsibilities, and should allow users to select keywords and regions of interest through a graphical user interface (GUI) and begin collecting social media posts in real time. One critical requirement will be the ability to integrate the latest developments in natural language processing (e.g. BERT, GPT: Generative Pre-trained Transformer) for analyzing social media posts. Building a social media cyberinfrastructure that contains essential data and algorithms will help generate more research and lead to a better understanding of social media data use for disaster response and management.
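As a rough illustration of the kind of pipeline such a tool would wrap behind a GUI, the sketch below filters posts by disaster-related keywords and runs them through a pretrained BERT-family classifier. Everything here is an assumption for illustration, not part of the strategy described above: the keyword list, the example posts, the choice of the Hugging Face `transformers` library, and the use of an off-the-shelf sentiment model in place of a disaster-tuned one.

```python
# Hypothetical sketch: filter posts by keyword, then classify them with a
# pretrained BERT-family model (assumes the `transformers` package).
from transformers import pipeline

# In a real tool these would come from the GUI and a streaming API.
keywords = ["flood", "evacuate", "power outage"]          # user-selected
posts = [
    "Water is rising fast on Main St, we need to evacuate!",
    "Great weather for a picnic today.",
    "Power outage across the whole parish since 6am.",
]

# Keep only posts mentioning at least one keyword of interest.
relevant = [p for p in posts if any(k in p.lower() for k in keywords)]

# An off-the-shelf sentiment classifier stands in for a disaster-tuned model.
classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
for post, result in zip(relevant, classifier(relevant)):
    print(f"{result['label']:8s} ({result['score']:.2f})  {post}")
```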
Recent advances in artificial intelligence for video production system
Published in Enterprise Information Systems, 2023
YuFeng Huang, ShiJuan Lv, Kuo-Kun Tseng, Pin-Jen Tseng, Xin Xie, Regina Fang-Ying Lin
GPT (Generative Pre-trained Transformer) is a powerful language model that has been widely used for automatic story generation. Source code for GPT-based story generation can be found on GitHub, a popular platform for hosting and sharing code repositories. By studying this code, developers and researchers can understand the inner workings of the model, explore techniques for fine-tuning GPT on story datasets, and experiment with novel approaches for generating coherent and engaging narratives. With such a tool for creating compelling narratives, they can explore the possibilities of AI-assisted storytelling and advance the field of automated narrative generation.
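For a concrete sense of what such repositories expose, the snippet below is a minimal sketch of story continuation. It assumes the openly released GPT-2 weights and the Hugging Face `transformers` library, which the excerpt does not name; the prompt is invented for illustration.

```python
# Minimal story-continuation sketch using the openly released GPT-2
# weights (assumes the Hugging Face `transformers` package).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The lighthouse keeper found a sealed letter on the rocks."
inputs = tokenizer(prompt, return_tensors="pt")

# Sampling (rather than greedy decoding) keeps the narrative varied.
output = model.generate(
    **inputs,
    max_length=80,
    do_sample=True,
    top_p=0.9,          # nucleus sampling
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Fine-tuning on a story dataset follows the same pattern: the pretrained weights are loaded as above, then training continues on the story corpus so that sampled continuations adopt its style and structure.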
ChatGPT versus engineering education assessment: a multidisciplinary and multi-institutional benchmarking and analysis of this generative artificial intelligence tool to investigate assessment integrity
Published in European Journal of Engineering Education, 2023
Sasha Nikolic, Scott Daniel, Rezwanul Haque, Marina Belkina, Ghulam M. Hassan, Sarah Grundy, Sarah Lyden, Peter Neal, Caz Sandison
OpenAI’s ChatGPT (officially Chat Generative Pre-Trained Transformer) released its popular GPT-3 version in October 2020, following GPT-2 in February 2019 and GPT-1 in 2018. ChatGPT is a Large Language Model (LLM) that uses a form of machine learning called ‘unsupervised learning’ to generate its responses. This involves training the model on large amounts of text data to learn patterns and relationships between words and phrases. When presented with a new prompt or question, ChatGPT uses its learned knowledge to generate a response that is contextually relevant and grammatically correct (OpenAI 2023b; Bubeck et al. 2023). The first model had 117 million parameters, the second 1.5 billion, and the third version (used in this study) 175 billion (OpenAI 2023c). As can be seen, the increase in parameter count over such a short time has been substantial. Parameter count matters because the software uses machine learning to learn autonomously (van Dis et al. 2023); with this increase in scale, GPT-3 can capture even more complex patterns and relationships in language, resulting in more sophisticated and nuanced responses.