IoT, Big Data, and Analytics
Published in Vijay Kumar, Mangey Ram, Predictive Analytics, 2021
Priyanka Vashisht, Vijay Kumar, Meghna Sharma
Here, data and event analytics are time-based in order to expose trends and their patterns. Applications such as health monitoring and weather forecasting include time series that help to find a “long-term change in the mean level” [35]. A number of sensor databases have been developed for storing data collected from sensors. Bader et al. [36] evaluated 12 well-known time series sensor databases against 27 decisive factors. These factors are categorized into six groups: (i) function, (ii) clustering, (iii) granularity, (iv) tagging of data and long-term storage, (v) interfaces and extensibility, and (vi) license and support. The authors claim that their work is an extensive, repeatable, open-source benchmark for enterprise readiness. Databases that meet all the requirements of time-based big IoT data are KairosDB (2018), InfluxDB (2018b), and MonetDB [33]. KairosDB is built on a distributed NoSQL database management system [37] and uses Apache Cassandra to store its data. Three column families are used for storing time series–based sensor data, namely the Row Key Index Column Family, the Data Points Column Family, and the String Index Column Family [38, 39]. The database applies a data compression technique during the write process [40]. InfluxDB organizes the time series–based IoT data in a database into logical groups known as measurements. Each time series is identified by a unique tag within its measurement. InfluxDB supports a query language similar to SQL [41]. Values and their time stamps are compressed and stored independently, using an encoding scheme that depends on the type of data and its characteristics. Storing them independently permits the same encoding scheme to be used for all time stamps, while allowing different encodings for different field types.
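The two InfluxDB ideas described above, identifying a series by its measurement and tags and storing time stamps separately from values, can be illustrated with a minimal Python sketch. It is not InfluxDB's storage engine: the names `series_key`, `delta_encode`, `delta_decode`, and `SeriesStore` are hypothetical, and delta encoding is only one assumed example of a time stamp encoding scheme, chosen because regularly sampled sensor data yields small, repetitive differences.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def series_key(measurement: str, tags: Dict[str, str]) -> str:
    """Build a series identifier from a measurement name and its tags.

    Tags are sorted so the same tag set always yields the same key.
    (Hypothetical helper, not an InfluxDB API.)
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_part}" if tag_part else measurement


def delta_encode(timestamps: List[int]) -> List[int]:
    """Keep the first time stamp, then store successive differences."""
    if not timestamps:
        return []
    return [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by accumulating the differences."""
    decoded, total = [], 0
    for d in deltas:
        total += d
        decoded.append(total)
    return decoded


class SeriesStore:
    """Toy store keeping time stamps and values in separate columns."""

    def __init__(self) -> None:
        self._timestamps: Dict[str, List[int]] = defaultdict(list)
        self._values: Dict[str, List[float]] = defaultdict(list)

    def write(self, measurement: str, tags: Dict[str, str],
              timestamp: int, value: float) -> None:
        key = series_key(measurement, tags)
        self._timestamps[key].append(timestamp)
        self._values[key].append(value)

    def encoded_columns(self, key: str) -> Tuple[List[int], List[float]]:
        """Return (delta-encoded time stamps, raw values) for one series.

        The value column is left unencoded here; a real system could pick
        a different encoding per field type.
        """
        return delta_encode(self._timestamps[key]), list(self._values[key])


# Example: three readings from one hypothetical temperature sensor.
store = SeriesStore()
sensor_tags = {"sensor": "s1", "room": "lab"}
for ts, reading in [(1000, 21.5), (1010, 21.7), (1020, 21.6)]:
    store.write("temperature", sensor_tags, ts, reading)

key = series_key("temperature", sensor_tags)
encoded_ts, values = store.encoded_columns(key)
```

Because the time stamp column is encoded on its own, the same scheme applies regardless of whether the value column holds floats, integers, or strings, which is the design point the paragraph makes.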
An open source approach to the design and implementation of Digital Twins for Smart Manufacturing
Published in International Journal of Computer Integrated Manufacturing, 2019
Violeta Damjanovic-Behrendt, Wernher Behrendt
Knowledge of the common and distinct features of open source software is an important factor when deciding which of the available tools to choose. For example, capturing, storing and managing a large amount of time-series data requires considerable effort and research on combining open source technologies for the storage mechanisms and the data analytics engine. Churilo (2018) discusses the results of a recent benchmark of time-series workloads for InfluxDB 1.4.2, an open source time-series database, and Elasticsearch 5.6.3 (see Table 7 in Section 4.3 for details). The benchmark shows that InfluxDB outperforms Elasticsearch in two tests: write throughput (InfluxDB’s is 9.9x greater than Elasticsearch’s) and disc space usage (InfluxDB uses 13.1x less disc space than Elasticsearch’s time-series optimised configuration).