The Uses of Big Data in the Health Sector
Published in Soraya Sedkaoui, Mounia Khelfaoui, Nadjat Kadi, Big Data Analytics, 2022
Fatima Mana, Redouane Ensaad, Djazia Hassini
The term “big data” appeared in 2000. It refers to the raw material of information before sorting, arrangement, and processing; in this initial form it cannot be used. Information is data that have been processed, analyzed, and interpreted, and that can be used to establish relationships between phenomena and to support decision-making. Raw data can be divided into three types [1]:
Structured Data: Data organized in the form of tables or databases for processing.
Unstructured Data: The largest proportion of data, generated daily by people as text, images, videos, e-mails, etc.
Semi‑Structured Data: A type of structured data that is not organized in tables or databases.
Data Lakes: A Panacea for Big Data Problems, Cyber Safety Issues, and Enterprise Security
Published in Mohiuddin Ahmed, Nour Moustafa, Abu Barkat, Paul Haskell-Dowland, Next-Generation Enterprise Security and Governance, 2022
A. N. M. Bazlur Rashid, Mohiuddin Ahmed, Abu Barkat Ullah
Variety refers to the diverse types of data: structured, semi-structured, or unstructured. Structured data constitutes about 5% of all existing data and refers to the tabular data in relational databases or spreadsheets. In contrast, unstructured data usually lacks the structural organization required for analysis purposes. Audio, video, text, and images are examples of unstructured data. Semi-structured data lies between structured and unstructured data and does not follow any strict standards. A typical example of semi-structured data is the Extensible Markup Language (XML), a textual language for exchanging data on the Web that contains machine-readable, user-defined data tags. According to IBM, 80% of data is unstructured [13].
Role and Support of Image Processing in Big Data
Published in Ankur Dumka, Alaknanda Ashok, Parag Verma, Poonam Verma, Advanced Digital Image Processing and Its Applications in Big Data, 2020
Ankur Dumka, Alaknanda Ashok, Parag Verma, Poonam Verma
Data volume can be quantified in terms of size (terabytes, petabytes, etc.) as well as in terms of the number of records, tables, transactions, and files. The large amount of data can be explained by the number of sources it comes from, such as logs, clickstreams, social media, satellite data, etc. Data can be divided into unstructured, semi-structured, and structured data. Unstructured data are data such as text, human languages, etc., whereas semi-structured data are data such as eXtensible Markup Language (XML) or Rich Site Summary (RSS) feeds; when unstructured and semi-structured data are converted into a form that can be processed by a machine, they are termed structured data. Some data, such as audio or video, are hard to categorize. There are also streaming data, which are available on a real-time basis. Another type is multi-dimensional data, which can be drawn from a data warehouse to add historic context to big data. Thus, with big data, variety is just as big as volume. The velocity of data is the speed or frequency with which data are generated or delivered.
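The conversion from semi-structured to structured data mentioned in this excerpt can be illustrated with a minimal sketch. It assumes a tiny, made-up RSS-like feed (the titles and links are hypothetical) and flattens each item into a fixed-width row using only the Python standard library; it is one possible illustration, not the method used by the excerpt's authors.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS-like feed; titles and links are invented for illustration.
FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title></item>
  </channel>
</rss>"""


def feed_to_rows(xml_text):
    """Flatten each <item> element into a (title, link) row."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link")  # None when the tag is absent
        rows.append((title, link))
    return rows


rows = feed_to_rows(FEED)
for row in rows:
    print(row)
```

The second item has no link tag, which is typical of semi-structured input: the tags describe the data, but nothing forces every record to carry the same fields. The structured form makes that gap explicit as a missing value in a fixed column.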
A semantic model for enterprise application integration in the era of data explosion and globalisation
Published in Enterprise Information Systems, 2023
H.Y. Yu, Akinola Ogbeyemi, W.J. Lin, Jingyi He, Wei Sun, W.J. Zhang
The modern era of data explosion and globalisation has several important features for manufacturing and service systems and their combination (see the definition of these systems in Wang et al., 2014a). The first feature is that data representation has a mixture of formats, including structured, semi-structured and unstructured ones. Structured data are data represented in a format whose semantics can be understood by computers. The format in which humans naturally express data is unstructured with respect to computers, that is, computers cannot understand unstructured data if no ‘translator’ is available. Any format between structured and unstructured is semi-structured, e.g., XML, HTML, etc. The second feature is that the amount of data is huge, uncertain, and volatile, and the data creation rate is high, which refers to the so-called big data (Singh 2019; Wigan and Clarke 2013). It is noted that big data usually takes an unstructured data format. The third feature is the emergence of the concept of cloud computing (Langmead and Nellore 2018; Rajaraman 2014). The fourth feature is that not only data from manufacturing and service systems but also data on political and economic policies play a role in the decision making of manufacturers or service providers. This is the reason that globalisation is today highly constrained.
Data-driven Begins with DATA; Potential of Data Assets
Published in Journal of Computer Information Systems, 2022
Hannu Hannila, Risto Silvola, Janne Harkonen, Harri Haapasalo
Data assets of a company are classifiable based on their nature as structured, semi-structured, and unstructured. An online tech dictionary [32] explains these as follows: 1) Structured data is stored, processed, and accessed based on the data model. It is typically stored in tables in a database and managed using Structured Query Language (SQL). 2) Semi-structured data is a type of structured data but lacks the strict data model structure. 3) Unstructured data is information that does not reside in a traditional row-column database and often includes text and multimedia content. An estimated 80–90% of enterprise data is unstructured [32]. Unstructured data is often associated with big data, first described by Laney [1], who provided three dimensions of high V’s: volume (the size of the data), velocity (the rate of change of the data), and variety (different data formats and types, structured and non-structured). Afterward, other V dimensions have been proposed, such as veracity, variability, and value [29]. Big data is also often described as semi-structured data (such as the XML standard or similar), unstructured data (such as weblogs and social media data), and real-time data (such as event data, spatial data, and machine-generated data) [26 p642, 45].
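As a minimal sketch of what "stored in tables and managed using SQL" looks like in practice, the following example uses Python's built-in sqlite3 module with an in-memory database. The customer table, its columns, and its rows are hypothetical and exist only to illustrate a fixed data model.

```python
import sqlite3

# In-memory database; the table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, country TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer (id, name, country) VALUES (?, ?, ?)",
    [(1, "Ada", "FI"), (2, "Ben", "SE"), (3, "Cia", "FI")],
)

# Because every row follows the same data model, aggregation is one query.
counts = conn.execute(
    "SELECT country, COUNT(*) FROM customer GROUP BY country ORDER BY country"
).fetchall()
conn.close()

print(counts)
```

The schema is declared before any data arrives, which is the property that semi-structured and unstructured data lack.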
Schema on read modeling approach as a basis of big data analytics integration in EIS
Published in Enterprise Information Systems, 2018
Slađana Janković, Snežana Mladenović, Dušan Mladenović, Slavko Vesković, Draženko Glavić
XML (eXtensible Markup Language) has been a de facto standard for the exchange of information in the past two decades and, consequently, it also plays a major role in the field of data integration. XML files are a typical example of semi-structured data (Gandomi and Haider 2015). Modern data integration software enables the transformation of data from XML files into other types of data warehouses (Big Data included) and vice versa. Another popular self-documenting data interchange format is JSON (JavaScript Object Notation).
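The kind of transformation between self-documenting formats described here can be sketched with the Python standard library. This is a simplified illustration, not the integration software the authors refer to: the order document is hypothetical, XML attributes are ignored, and repeated sibling tags would overwrite one another.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical order document, invented for illustration.
ORDER_XML = (
    "<order><id>17</id>"
    "<customer><name>Ada</name><city>Belgrade</city></customer>"
    "</order>"
)


def element_to_dict(elem):
    """Convert an element to nested dicts; leaf elements become their text.

    Simplification: attributes are dropped and repeated child tags
    overwrite each other, so this only suits documents with unique tags.
    """
    children = list(elem)
    if not children:
        return elem.text
    return {child.tag: element_to_dict(child) for child in children}


root = ET.fromstring(ORDER_XML)
as_json = json.dumps({root.tag: element_to_dict(root)}, sort_keys=True)
print(as_json)
```

Both representations carry their field names alongside the values, which is what makes them self-documenting; note that all leaf values arrive as strings, since XML text carries no type information on its own.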