Explore chapters and articles related to this topic
The Application of Text Mining in Detecting Financial Fraud: A Literature Review
Published in Deepmala Singh, Anurag Singh, Amizan Omar, S.B. Goyal, Business Intelligence and Human Resource Management, 2023
Pratibha Maurya, Anurag Singh, Mohd Salim
Text is a common means of data exchange in the modern world. Text mining encompasses a variety of subfields, including natural language processing (NLP), information retrieval, web mining, computational linguistics, data extraction, and data mining. Text mining makes it possible to extract structured data automatically from unstructured and semi-structured materials (Kautish, 2008; Kautish and Thapliyal, 2013), which makes it commercially valuable. It is a novel technique for analysing massive sets of unstructured documents with the goal of extracting knowledge or non-trivial patterns. Document files come in a variety of forms, including text files, flat files, and PDF files, and are assembled from a number of sources, including message boards, newsgroups, emails, online chat, text messages, and websites (Bagale et al., 2021). Humans are capable of rapidly resolving problems and of identifying and applying linguistic patterns to text (Singh & Gite, 2015). Computers, on the other hand, struggle with difficulties such as spelling, context, slang, and variation. Nonetheless, combining human language abilities with computing capabilities makes it possible to analyse text quickly or in enormous quantities in order to make sense of unstructured data. A computer can analyse unstructured data using the text-mining technique. Fraud detection is a priority for financial sector organisations (Figure 12.1).
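As a rough illustration of extracting structured data from unstructured text, the sketch below pulls monetary amounts and dates out of a free-text message with regular expressions. The message, the `USD` amount format, the ISO-style dates, and the names `MESSAGE`, `AMOUNT_RE`, `DATE_RE`, and `extract_fields` are all assumptions made for this example and do not come from the chapter. Real text-mining pipelines are considerably more involved.

```python
import re

# Hypothetical free-text message; wording and figures are invented for illustration.
MESSAGE = (
    "Hi team, please wire USD 12,500.00 to account 00123456 by 2023-03-14. "
    "A second transfer of USD 980.50 is due on 2023-04-01."
)

# Assumed formats: amounts written as "USD 1,234.56", dates as YYYY-MM-DD.
AMOUNT_RE = re.compile(r"USD\s+([\d,]+\.\d{2})")
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")


def extract_fields(text):
    """Return the amounts (as floats) and dates found in a free-text string."""
    amounts = [float(a.replace(",", "")) for a in AMOUNT_RE.findall(text)]
    dates = DATE_RE.findall(text)
    return {"amounts": amounts, "dates": dates}


record = extract_fields(MESSAGE)
print(record)
```

The resulting dictionary is a small structured record that could be stored in a table and, for instance, checked against fraud rules.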
Role of Artificial Intelligence in COVID-19
Published in Salah-ddine Krit, Vrijendra Singh, Mohamed Elhoseny, Yashbir Singh, Artificial Intelligence Applications in a Pandemic, 2022
S. Lalitha, H. T. Bhavana, K. N. Madhusudhan, Prascheth, Harshitha
Structured data analysis is based on machine learning and deep learning algorithms, whereas unstructured data analysis relies on Natural Language Processing algorithms. Figure 1.1 shows the ML and DL processing in healthcare systems. Machine learning algorithms usually extract features, known as patients’ “traits,” from the available data, together with the outcome of the patients’ diagnosis. ML has different algorithms that serve different purposes [8]. Some of the algorithms and techniques used in the development of machine learning models for COVID-19 are: Support Vector Machine, Neural Networks, Decision trees, KNN (K Nearest Neighbours), Logistic regression, Random forest, and Linear regression.
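To make one of the listed techniques concrete, here is a minimal sketch of KNN (K Nearest Neighbours) classification in plain Python. The training points, the two features (temperature and oxygen saturation), the labels, and the names `knn_predict` and `TRAIN` are hypothetical and not clinically meaningful. They are not taken from the chapter and only show how patient "traits" and known outcomes could feed a classifier.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (temperature in C, oxygen saturation in %) -> outcome.
TRAIN = [
    ((36.6, 98.0), "negative"),
    ((36.9, 97.0), "negative"),
    ((37.0, 99.0), "negative"),
    ((38.5, 92.0), "positive"),
    ((39.0, 90.0), "positive"),
    ((38.8, 93.0), "positive"),
]

prediction = knn_predict(TRAIN, (38.7, 91.5))
print(prediction)
```

In practice the features would be scaled before computing distances, since raw units otherwise decide which trait dominates the vote.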
Data Lakes: A Panacea for Big Data Problems, Cyber Safety Issues, and Enterprise Security
Published in Mohiuddin Ahmed, Nour Moustafa, Abu Barkat, Paul Haskell-Dowland, Next-Generation Enterprise Security and Governance, 2022
A. N. M. Bazlur Rashid, Mohiuddin Ahmed, Abu Barkat Ullah
Variety refers to the diverse types of data: structured, semi-structured, or unstructured. Structured data constitutes about 5% of all existing data, and refers to the tabular data in relational databases or spreadsheets. In contrast, unstructured data usually lacks the structural organization required for analysis purposes. Audio, video, text, and images are examples of unstructured data. Semi-structured data lies in between structured and unstructured data, and does not follow any strict standards. A typical example of semi-structured data is the Extensible Markup Language (XML), which is a textual language for exchanging data on the Web containing machine-readable user-defined data tags. According to IBM, 80% of data is unstructured [13].
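A small sketch can show what "in between" means for XML: the user-defined tags are machine-readable, yet records need not share the same fields. The example below flattens a made-up XML document into tabular rows using Python's standard `xml.etree.ElementTree` module. The `<orders>` document, its tag names, and the function `orders_to_rows` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; the tags are user-defined and the second
# order carries an optional <note> element that the first one lacks.
XML_DOC = """
<orders>
  <order id="1"><customer>Ada</customer><total currency="USD">19.99</total></order>
  <order id="2"><customer>Lin</customer><total currency="EUR">5.00</total><note>gift</note></order>
</orders>
"""


def orders_to_rows(xml_text):
    """Flatten each <order> element into a dictionary with fixed columns."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for order in root.findall("order"):
        total = order.find("total")
        rows.append({
            "id": int(order.get("id")),
            "customer": order.findtext("customer"),
            "total": float(total.text),
            "currency": total.get("currency"),
            "note": order.findtext("note"),  # None when the optional tag is absent
        })
    return rows


rows = orders_to_rows(XML_DOC)
for row in rows:
    print(row)
```

The optional `note` column is where the semi-structured nature shows: a relational table would need an explicit nullable column for it.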
Discussion of “Experiences with big data: Accounts from a data scientist’s perspective”
Published in Quality Engineering, 2020
Timothy J. Robinson, Richard C. Giles, Rasika U. Rajapakshage
KFKRS have provided an excellent overview of the challenges associated with accessing and storing data. As we consider data acquisition and storage, it’s important to note that there are two general data types: structured and unstructured data. Structured data are data that can be organized in a spreadsheet or in a relational database. A classic example might be data generated from a factorial design where the columns would make up the experimental factors and the response(s) of interest and the rows would represent level combinations of the factors and the resulting values of the response variable(s). Unstructured data, on the other hand, are data that are not organized in a pre-defined manner and are not generated from a pre-defined data model. Common examples of unstructured data include email, social media posts, MP3 files, photos, video, imaging data, PDF files of text data, sensor data and many others. It is estimated that today, nearly 80% of all data that is generated is unstructured in nature [see for example, Sint et al. (2009), and Rogers (2019)].
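The factorial-design example of structured data can be sketched in a few lines: each row is one combination of factor levels, and a response column waits to be filled in by the experiment. The factor names, their levels, the `response` column, and the function `full_factorial` are hypothetical choices for this illustration.

```python
from itertools import product

# Hypothetical experimental factors and their levels.
FACTORS = {
    "temperature": [150, 180],
    "pressure": [1, 2],
    "catalyst": ["A", "B"],
}


def full_factorial(factors):
    """Return one row per level combination, with an empty response column."""
    names = list(factors)
    rows = []
    for combo in product(*factors.values()):
        row = dict(zip(names, combo))
        row["response"] = None  # to be recorded when the run is performed
        rows.append(row)
    return rows


design = full_factorial(FACTORS)
for row in design:
    print(row)
```

With three two-level factors this yields 2 × 2 × 2 = 8 rows, each with the same columns, which is exactly the shape a spreadsheet or relational table expects.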
Big Data technologies to process spatial and attribute data when designing and operating mine-engineering systems
Published in International Journal of Image and Data Fusion, 2019
Yuri A. Stepanov, Alexander V. Stepanov
Standard examples of unstructured data include document files, electronic messages, audio files, digital images, etc. Although all these files have some structure (for example, electronic messages contain an address, a subject, the ‘body’ of the letter, etc.), they are usually stored in a form that makes it impossible to classify them in an easy or logical manner. This contrasts with data obtained by entering information into electronic forms, or as a result of computations or other computer transactions, during which sets of structured information are created automatically. Each set of data is eventually systemised and transformed into a structured set of data.
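The partial structure of an electronic message can be illustrated with Python's standard `email` package, which separates the header fields from the free-text body. The raw message below and the function name `split_email` are invented for this sketch. Note that the headers parse cleanly, while the body remains unstructured text that would need further processing.

```python
from email import message_from_string

# Hypothetical raw message: structured headers, a blank line, then a free-text body.
RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Hi Bob, the report is attached. Numbers look odd in March.\n"
)


def split_email(raw):
    """Separate the structured header fields from the unstructured body."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "body": msg.get_payload().strip(),
    }


parsed = split_email(RAW)
print(parsed)
```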
Systematic Survey: Secure and Privacy-Preserving Big Data Analytics in Cloud
Published in Journal of Computer Information Systems, 2023
Arun Amaithi Rajan, Vetriselvi V
Unstructured data refer to information that has no predetermined conceptual definition and is difficult for traditional databases or data models to interpret or analyze. The majority of big data are made up of unstructured data, which include facts, dates, and numbers. Examples of this kind of big data include satellite imaging, mobile activities, audio, and video files. The amount of unstructured data is expanding as a result of the Instagram photos and YouTube videos we publish and watch.