Explore chapters and articles related to this topic
Multi-Agent System for Text Mining
Published in Wahiba Ben Abdessalem Karaa, Nilanjan Dey, Mining Multimedia Documents, 2017
Safa Selmi, Wahiba Ben Abdessalem Karaa
The most frequent applications using NLP include the following:
Machine translation refers to the automated translation of text from one human language to another assisted by computer [6].
Information retrieval (IR) is generally concerned with the representation, storage, organization, and access to information items such as text documents, sound, images, or data [6].
Information extraction (IE) is the process of deriving, from digital text documents written in natural language, structured information that expresses relationships between entities and transforming them into a structured representation (e.g., a database) [6].
Automatic summarization is the creation of a shortened version of a text by means of a computer program. The generated document contains the most important points of the original document [6].
Speech recognition is a computer-driven conversion of a speech signal (i.e., voice) into readable text [6].
Low-Resource Language Document Summarization: A Challenge
Published in Pallavi Vijay Chavan, Parikshit N Mahalle, Ramchandra Mangrulkar, Idongesit Williams, Data Science, 2022
Pranjali Deshpande, Sunita Jahirabadkar
Every day, an enormous amount of e-text is generated on the internet across varied domains. Any document consists of some meaning-bearing sentences and some miscellaneous sentences. In certain domains, such as legal and medical documentation, long documents are produced that contain very little actual contextual information. In such cases, automatic document summarization proves extremely beneficial. A summary is a condensed piece of text produced from one or more documents. It contains the contextually important information from the original document and should be no more than half the length of the original. Automatic summarization can follow two approaches: extractive and abstractive. An extractive summary contains key phrases or sentences exactly as they are written in the source document, whereas in abstractive summarization the contextually important sentences are rewritten as new sentences in the summary. Extractive summarization is the simpler of the two, because the knowledge and techniques used for natural language understanding tasks are sufficient. Abstractive summarization, in addition, requires statistical applications for natural language generation as well as natural language understanding. One more level of complexity is added to the summarization task when the source documents are written in a low-resource language (LRL). Every language is unique. In any natural language processing application, from the phonetics stage to the pragmatics stage, ambiguities of various forms exist. Because of this linguistic diversity, building a generic summarization model is an impossible task. In this context, the chapter focuses on the various constraints posed by LRL diversity. Section 15.2 presents a literature survey of approaches and techniques for handling LRL documents, followed by the probable approaches to LRL summarization and the conclusion.
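The extractive approach and the half-length constraint described above can be illustrated with a minimal sketch. It is not the chapter's method: the word-frequency scoring, the naive sentence splitter, and the names `split_sentences`, `tokenize`, `extractive_summary`, and `max_ratio` are all assumptions chosen for illustration, and no stop-word removal or language-specific processing is attempted.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase word tokens; adequate for English-like text only.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def extractive_summary(text, max_ratio=0.5):
    """Pick whole sentences, unchanged, within a word budget of max_ratio."""
    sentences = split_sentences(text)
    total_words = sum(len(tokenize(s)) for s in sentences)
    budget = int(total_words * max_ratio)

    # Score each sentence by the average document frequency of its words.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)

    chosen, used = [], 0
    for i in ranked:
        n = len(tokenize(sentences[i]))
        if used + n <= budget:
            chosen.append(i)
            used += n

    # Emit the selected sentences in their original order.
    return " ".join(sentences[i] for i in sorted(chosen))


if __name__ == "__main__":
    doc = "Cats sleep a lot. Cats eat fish. Dogs bark."
    print(extractive_summary(doc))
```

Because the sketch depends on whitespace tokenization and punctuation-based sentence boundaries, it would need different preprocessing for many low-resource languages, which is one concrete form of the constraints the excerpt mentions.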
Mining Eye-Tracking Data for Text Summarization
Published in International Journal of Human–Computer Interaction, 2023
Meirav Taieb-Maimon, Aleksandr Romanovski-Chernik, Mark Last, Marina Litvak, Michael Elhadad
The increasing volume of data in general, and of textual data specifically (Rydning, 2018), has made text summarization technologies a necessity (Nenkova & McKeown, 2011; Radev et al., 2002; Zhao et al., 2022) over the last five decades, since the first work of Luhn (1958). Text summarization must perform two main tasks: (a) identify central information in the source document; (b) generate concise text that captures the central information coherently and fluently. Human text summarization is a complex, subjective and time-consuming intellectual activity (Lin, 2004; Salton et al., 1997). When people summarize text, they usually read it entirely to develop an understanding, and then write a summary of its main points (Allahyari et al., 2017). Automatic text summarization aims to replace the time-consuming process of manually extracting important information. Traditional summarization methods rely only on the input text to determine what content to convey in the summary, and they generate the summary most often by extracting elements from the source document (extractive methods) or by generating new text guided by selected content (abstractive methods). However, automatic summarization is a non-trivial task, since computers lack human knowledge and language understanding capability (Allahyari et al., 2017). Supervised text summarization systems (Litvak & Last, 2013; Liu & Lapata, 2019; Makino et al., 2019), which train machine learning algorithms on large text summarization corpora, have only partially overcome these limitations.
ISSE: a new iterative sentence scoring and extraction scheme for automatic text summarization
Published in International Journal of Computers and Applications, 2022
Saeed Hosseinabadi, Manoochehr Kelarestaghi, Farshad Eshghi
Large volumes of text available on the web and the ever-decreasing time available to users have made the selection and understanding of related documents a difficult task. Thus, automatic text summarization has emerged as a helpful tool for users in the last decade. The goal of text summarization is to preserve the essential information in a shorter version of the original text. Automatic summarization is the process of summarizing a text document with a computer program. Automatic summarizers are classified according to their summarization methods and operation domains (single-document vs. multi-document) [1]. Summarization methods can be further categorized as extractive or abstractive. Extractive summarization selects and puts together the most important sentences or phrases from the original text to reach a shorter form, without changing the selected sentences. Abstractive summarization, on the other hand, uses linguistic methods to interpret the text and generate a shorter version of it. In the latter, new forms of words and sentences might appear in the resulting summary. Interpreting text requires many Natural Language Processing (NLP) techniques, making it a complicated process.
Toward Better Understanding Older Adults: A Biography Brief Timeline Extraction Approach
Published in International Journal of Human–Computer Interaction, 2023
Ning An, Fang Gui, Liuqi Jin, Hong Ming, Jiaoyun Yang
Text summarization is the task of creating a concise summary that captures the essential information from the original texts. There are mainly two types of automatic summarization methods: abstractive and extractive summarization (Saranyamol & Sindhu, 2014). Abstractive summarization methods parse and comprehend the semantic information of input texts and reconstruct the words and sentences to generate summaries. They may generate summaries that include words or phrases that never appeared in the input texts (Khan & Salim, 2014). Extractive summarization methods select the most critical sentences from input texts and use them to generate the summaries (Nenkova & McKeown, 2012).
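One observable consequence of this distinction is that an extractive summary introduces no vocabulary absent from the input, while an abstractive one may. The following is a small sketch of how that could be checked. The helper names `novel_words` and `novelty_ratio` and the example texts are hypothetical, and the simple tokenizer is an assumption rather than anything taken from the cited works.

```python
import re


def tokenize(text):
    # Lowercase word tokens; a deliberately simple tokenizer (assumption).
    return re.findall(r"[a-z0-9']+", text.lower())


def novel_words(source, summary):
    """Return the sorted summary words that never occur in the source."""
    source_vocab = set(tokenize(source))
    return sorted(set(tokenize(summary)) - source_vocab)


def novelty_ratio(source, summary):
    """Fraction of summary tokens that do not occur in the source."""
    summary_words = tokenize(summary)
    if not summary_words:
        return 0.0
    source_vocab = set(tokenize(source))
    unseen = sum(1 for w in summary_words if w not in source_vocab)
    return unseen / len(summary_words)


if __name__ == "__main__":
    source = ("The committee approved the budget after a long debate. "
              "Members voted late at night.")
    extractive = "The committee approved the budget after a long debate."
    abstractive = "The budget passed following lengthy discussion."

    print(novel_words(source, extractive))   # no unseen words expected
    print(novel_words(source, abstractive))  # lists the newly introduced words
    print(novelty_ratio(source, abstractive))
```

A ratio of zero is consistent with extraction but does not prove it, since a rewritten sentence can reuse only source vocabulary.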