Advances in Urban Remote Sensing
Published in Guoqing Zhou, Urban High-Resolution Remote Sensing, 2020
Distributed databases. As the data collection point for the storage nodes of a distributed system, a distributed database assigns data storage tasks to partitions according to workload performance, so as to support query processing across various data models and to improve the throughput and speed of data transmission and storage (Hercezelaya et al. 2019; Muzammal et al. 2019). The main categories are NoSQL (Not Only SQL), NewSQL, and SQL engines. NoSQL improves system performance by abandoning transactional ACID semantics and the complex relational model, at the cost of weaker data consistency and integrity (e.g., Etcd, HBase, MongoDB); NewSQL retains transactional ACID semantics and the relational model without reducing the scalability and performance of the system (e.g., CockroachDB, TiDB, Spanner). As an extension of NoSQL, an SQL engine mainly receives and processes tasks on top of a NoSQL system (e.g., Hadapt, Hive) (Meng and Ci 2013; Ruan et al. 2019). As shown in Figure 3.2, the storage nodes of a distributed database are connected to one another in a network and trust one another, which forms a trust boundary used to judge whether a malicious attack is taking place. External users must pass data access control to cross the trust boundary and query or process internal data, which is an effective measure for ensuring that the data in the database are not damaged (Rani and Sharma 2019).
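The idea of assigning storage tasks to partitions according to workload can be illustrated with a minimal Python sketch. It assumes a deliberately simple policy (send each write to the currently least-loaded partition); the `Partition` class, the `assign` function, the node names, and the `load` metric are all hypothetical and are not taken from the cited works or from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    name: str
    load: int = 0  # hypothetical workload metric, e.g. number of stored records
    keys: list = field(default_factory=list)


def assign(partitions, key, cost=1):
    """Place `key` on the least-loaded partition and return that partition's name."""
    # min() returns the first partition with the smallest load, so ties
    # are broken by list order.
    target = min(partitions, key=lambda p: p.load)
    target.keys.append(key)
    target.load += cost
    return target.name


# Three storage nodes that start with different (assumed) loads.
partitions = [
    Partition("node-a"),
    Partition("node-b", load=2),
    Partition("node-c", load=1),
]

placements = [assign(partitions, k) for k in ["img-001", "img-002", "img-003", "img-004"]]
print(placements)
```

Real systems typically combine such load information with hashing or range partitioning so that a key can be located again without a central lookup; the sketch leaves that out for brevity.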
Storage and databases for big data
Published in Jun Deng, Lei Xing, Big Data in Radiation Oncology, 2019
Tomas Skripcak, Uwe Just, Ida Schönfeld, Esther G.C. Troost, Mechthild Krause
All previously described models are commonly known under the term “not only SQL” (NoSQL). They depart from relational data stores in order to overcome the limitations of those stores in scale-out scenarios, and they do not give explicit ACID guarantees. They also excel in analytic workloads; however, their usage in transactional workloads is not always optimal. Each NoSQL solution normally comes with its own data processing APIs, and there is very little standardization within this field (except SPARQL). NoSQL systems are not trivial to use, and the missing standard query language leads to a fragmented user base, each part of which often consists of a small set of specially trained individuals. In response to these issues, NewSQL databases have been developed. They are founded on the revival of the relational logical data model (Figure 3.4e), ACID compliance, and SQL as a standard interface. At the level of low-level implementation details, they are, in fact, closer to their NoSQL cousins than to their SQL predecessors, which allows them to provide competitive horizontal scalability. NewSQL databases can be considered as alternatives to NoSQL systems for transactional as well as analytic workloads, but because they respect the relational logical model of data, they may not fit scenarios where schema on read is required.
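The difference between a schema enforced when data are written (the relational approach) and a schema applied only when data are read (common in NoSQL stores) can be sketched as follows. This is an illustration under stated assumptions, not the chapter's own example: SQLite from the Python standard library stands in for a relational engine, a plain list of JSON strings stands in for a document store, and the `patient` table and `dose_gy` field are hypothetical.

```python
import json
import sqlite3

# Schema on write: the table definition is checked at insert time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, dose_gy REAL NOT NULL)"
)


def insert_relational(row):
    """Return True if the row satisfies the schema, False if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO patient (id, dose_gy) VALUES (?, ?)",
            (row["id"], row.get("dose_gy")),  # a missing field becomes NULL
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Schema on read: any document is accepted; structure is interpreted later.
documents = []


def insert_document(doc):
    documents.append(json.dumps(doc))
    return True


def read_doses():
    """Apply the 'schema' at read time; absent fields come back as None."""
    return [json.loads(d).get("dose_gy") for d in documents]


complete = {"id": 1, "dose_gy": 2.0}
incomplete = {"id": 2}  # no dose recorded

relational_results = [insert_relational(complete), insert_relational(incomplete)]
document_results = [insert_document(complete), insert_document(incomplete)]
doses = read_doses()
```

In this sketch the incomplete record is rejected by the relational table but accepted by the document list, where the missing field only surfaces when the data are read.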
Security, integrity, and privacy of cloud computing and big data
Published in Muhammad Imran Tariq, Valentina Emilia Balas, Shahzadi Tayyaba, Security and Privacy Trends in Cloud Computing and Big Data, 2022
Muhammad Salman Mushtaq, Muhammad Yousaf Mushtaq, Muhammad Waseem Iqbal, Syed Aamer Hussain
Conventionally, relational databases (SQL databases, RDBMS) have been used along with structured data management techniques [25]. However, these are useful mainly for small datasets with a specific type of data, and they are not adequate for the task at hand. The traits of big data, that is, the velocity, volume, and variety of data, are beyond the capabilities of conventional relational databases. The other storage options are NoSQL ("not only SQL") databases, NewSQL databases, and file systems. These are distributed in nature and scalable. NoSQL is the term used for non-relational databases, which include different models such as the key-value pair model, the document model, and the column-family model. The key-value pair model is a simple model with minimal data collision and a simple programming model. The document model is similar, using a key to identify every document; however, unlike in the key-value model, the data in the document model can be queried. Column-family models, in comparison, are based on Google Bigtable [26]. NewSQL is a kind of relational distributed database that offers SQL capabilities together with the scalability of NoSQL [27]. NewSQL systems include Spanner and MemSQL [28,29]. In the file system category, Google has a scalable file system for distributed data storage named the Google File System (GFS) [30], and Apache has its own file system named the Hadoop Distributed File System (HDFS) [31]. GFS is meant for large-scale user data storage, while HDFS is the most used big data system and supports redundancy, consistency, and scalability in parallel distributed architectures.
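The distinction drawn above between the key-value model and the document model can be sketched in a few lines of Python. This is a toy illustration using in-memory dictionaries, not the API of any real NoSQL product; the store names, the keys, and the `find` helper are hypothetical.

```python
# Key-value model: the store only understands keys; values are opaque bytes.
kv_store = {
    "user:1": b'{"name": "Ada", "city": "London"}',
    "user:2": b'{"name": "Lin", "city": "Wuhan"}',
}
value = kv_store["user:1"]  # lookup by key is the only operation available

# Document model: each document is still identified by a key, but the store
# understands its fields, so the contents can be queried.
doc_store = {
    "user:1": {"name": "Ada", "city": "London"},
    "user:2": {"name": "Lin", "city": "Wuhan"},
}


def find(store, **criteria):
    """Return the keys of all documents whose fields match every criterion."""
    return sorted(
        key
        for key, doc in store.items()
        if all(doc.get(f) == v for f, v in criteria.items())
    )


matches = find(doc_store, city="London")
print(matches)
```

Answering the same question against `kv_store` would require the application to fetch and decode every value itself, which is the practical meaning of "opaque" here.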
NewSQL Database Management System Compiler Errors: Effectiveness and Usefulness
Published in International Journal of Human–Computer Interaction, 2022
As explained in Section 2, we deemed comparing DBMSs utilizing SQL with DBMSs utilizing some other query language difficult in terms of internal validity. On the other hand, a recent study (Taipalus et al., 2021) compared the SQL compiler usability of traditional RDBMSs. For these reasons, in this study, we chose to focus on NewSQL systems using the SQL compiler usability framework reported in a previous study (Taipalus et al., 2021). We deemed it more interesting to focus on popular NewSQL systems, even though measuring popularity is rather difficult. Based on three NewSQL studies (Kaur & Sachdeva, 2017; Pavlo & Aslett, 2016; Schreiner et al., 2019), we identified four popular NewSQL database management systems for this study: CockroachDB (v19.2.2), SingleStore (7.0.10, previously known as MemSQL), NuoDB (build 4.0.4-2), and VoltDB (Community 9.2.2). All these systems implement relational or semi-relational data models, use SQL as their query language, and were built from the ground up in the 2010s (Grolinger et al., 2013). Additionally, DB-Engines ranks these four DBMSs high in popularity among NewSQL systems, when NewSQL systems are defined as in Section 2.1. In regard to different types of errors, we focus on syntax errors, and based on a previously reported framework (Taipalus et al., 2018), we focus on the 16 most common syntax errors in SQL queries. These previously reported syntax errors and our corresponding tests are reported in Table 1. The tests and the queries within them are in turn based on those reported in a previous study (Taipalus et al., 2021), but adjusted to account for the chosen four NewSQL systems. In the next subsections, we detail the data collection, hypotheses, and analyses, which are summarized in Figure 2.