Accelerating design processes using data-driven models
Published in Juhani Ukko, Minna Saunila, Janne Heikkinen, R. Scott Semken, Aki Mikkola, Real-time Simulation for Sustainable Production, 2021
Emil Kurvinen, Iines Suninen, Grzegorz Orzechowski, Jin H. Choi, Jin-Gyun Kim, Aki Mikkola
Data-driven models also make it possible to better understand and utilize the growing amount of raw data that is generated and collected during the lifecycle of a modern machine. The data-driven approach extracts relationships from the data and offers accurate system responses without relying on classical laws and equations. Frequently, the data-driven approach yields higher efficiencies than classical approaches. However, identifying the appropriate application is challenging, a lack of understanding of the underlying laws can be a drawback, and model effectiveness and reliability are strongly case- and objective-dependent. Nevertheless, the data-driven paradigm is developing rapidly and becoming widely applicable. Recent advances in physics-based data-driven formulations will only reinforce that trend.
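The idea of extracting a system response purely from data, without the governing law, can be sketched in a few lines. The "machine" response below is a hypothetical stand-in: the surrogate sees only sampled input-output pairs, never the underlying equation.

```python
import numpy as np

# Hypothetical machine response; the data-driven model never sees this law,
# only the samples generated from it during the "lifecycle".
rng = np.random.default_rng(0)
load = rng.uniform(0.0, 10.0, 200)                      # collected input data
response = 3.0 * load**2 + 5.0 * load + rng.normal(0.0, 1.0, 200)

# Data-driven surrogate: fit a polynomial purely from the samples.
coeffs = np.polyfit(load, response, deg=2)
surrogate = np.poly1d(coeffs)

# The surrogate reproduces the system response without the governing equation.
print(surrogate(4.0))   # close to 3*16 + 5*4 = 68
```

The surrogate is only as trustworthy as the data that produced it, which is the case- and objective-dependence noted above.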
Human monitoring systems for health, fitness and performance augmentation
Published in Adedeji B. Badiru, Cassie B. Barlow, Defense Innovation Handbook, 2018
Mark M. Derriso, Kimberly Bigelow, Christine Schubert Kabban, Ed Downs, Amanda Delaney
In a data-driven model, data are used to understand the relationships between the potential predictor variables and the performance outcome. Although such models are driven by data, the variables considered should be included because they represent the correct, or theorized, variables related to the performance outcome. Variables and data should not be included for reasons such as (1) the sensor computes these additional variables anyway, (2) they justify excessive or previous data-collection efforts on the same subjects, or (3) the effort to include additional sensors that are currently accessible is minimal. Instead, the data and variables collected should be hypothesized to have a relationship with the performance outcome. This hypothesized relationship defines the form of the model to be fit to the data.
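The point can be made concrete with a minimal sketch: the hypothesized relationship (here, a hypothetical linear link between heart rate and a performance score) fixes the model form, and an extra sensor channel with no theorized link is deliberately left out even though it is available. All variable names and values are illustrative, not from any monitoring system.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150
heart_rate = rng.uniform(60, 180, n)        # hypothesized predictor
cabin_temp = rng.uniform(18, 25, n)         # extra channel the sensor records anyway
performance = 100.0 - 0.3 * heart_rate + rng.normal(0.0, 2.0, n)

# The hypothesized relationship (performance ~ heart rate) defines the model
# form; cabin_temp is excluded because no relationship was theorized for it.
X = np.column_stack([np.ones(n), heart_rate])
beta, *_ = np.linalg.lstsq(X, performance, rcond=None)
print(beta)   # approximately [100, -0.3]
```

Fitting only the hypothesized form keeps the model interpretable and avoids spurious relationships from incidentally collected variables.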
Combining Theory and Data-Driven Approaches for Epidemic Forecasts
Published in Anuj Karpatne, Ramakrishnan Kannan, Vipin Kumar, Knowledge-Guided Machine Learning, 2023
Lijing Wang, Aniruddha Adiga, Jiangzhuo Chen, Bryan Lewis, Adam Sadilek, Srinivasan Venkatramanan, Madhav Marathe
The main concept of a data-driven model is to find relationships between the input and output without explicit knowledge of the physical behavior of the system. Both statistical models and deep learning models are examples of purely data-driven models. They employ statistical and time series-based methodologies to learn patterns in historical epidemic data and leverage those patterns for forecasting.
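A minimal sketch of such a purely data-driven forecaster, assuming toy weekly case counts and a simple order-2 autoregressive form (one of the time series-based methodologies mentioned above):

```python
import numpy as np

# Toy weekly case counts of a rising-then-falling wave; the model learns
# the pattern from history alone, with no epidemiological mechanism.
cases = np.array([12, 20, 33, 52, 80, 115, 150, 170, 165, 140, 105, 72], float)

# Fit an AR(2) model by least squares: c_t ~ a1*c_{t-1} + a2*c_{t-2}
X = np.column_stack([cases[1:-1], cases[:-2]])
y = cases[2:]
a, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-step-ahead forecast for the next week
forecast = a[0] * cases[-1] + a[1] * cases[-2]
print(forecast)
```

Real epidemic forecasters layer far more structure (seasonality, exogenous signals, deep architectures), but the input-to-output learning principle is the same.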
Machine learning predicts bioaerosol trajectories in enclosed environments: Introducing a novel method
Published in Aerosol Science and Technology, 2023
Zhijian Liu, Jiaqi Chu, Zhenzhe Huang, Haochuan Li, Xia Xiao, Junzhou He, Weijie Yang, Xuqiang Shao, Haiyang Liu
The data-driven model requires training data. To obtain these data, the conventional force-analysis method was employed to solve the motion equation of bioaerosols and generate the training set. The differential equation governing the motion of an individual bioaerosol under the influence of various forces is:

$$\frac{d\vec{u}_p}{dt} = F_D\left(\vec{u} - \vec{u}_p\right) + \frac{\vec{g}\left(\rho_p - \rho\right)}{\rho_p} + \vec{F} \qquad (2)$$

where $\vec{u}$ and $\vec{u}_p$ are the velocity of airflow and bioaerosols, respectively; $\rho$ and $\rho_p$ are the density of airflow and particles, respectively; $\vec{g}$ is gravitational acceleration; and $\vec{F}$ represents the additional forces acting on the bioaerosols. To solve the trajectory, Equation (2) should be combined with the following Equation (3):

$$\frac{d\vec{x}_p}{dt} = \vec{u}_p \qquad (3)$$

where $\vec{x}_p$ is the position of the bioaerosols.
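An explicit-Euler sketch of how Equations (2) and (3) generate one trajectory sample. The drag response coefficient `F_D`, densities, and the uniform airflow field are illustrative values, not the paper's CFD data, and the additional force term is set to zero.

```python
import numpy as np

rho, rho_p = 1.2, 1000.0          # air / particle density, kg/m^3 (assumed)
g = np.array([0.0, 0.0, -9.81])   # gravitational acceleration
F_D = 50.0                        # drag response coefficient, 1/s (assumed)

def airflow(x):
    # Placeholder airflow field: uniform 0.1 m/s in +x.
    return np.array([0.1, 0.0, 0.0])

x = np.zeros(3)                   # bioaerosol position
u_p = np.zeros(3)                 # bioaerosol velocity
dt = 1e-3
for _ in range(2000):             # integrate 2 s of motion
    u = airflow(x)
    # Eq. (2): du_p/dt = F_D (u - u_p) + g (rho_p - rho) / rho_p  (F = 0 here)
    du_p = F_D * (u - u_p) + g * (rho_p - rho) / rho_p
    u_p = u_p + du_p * dt
    # Eq. (3): dx_p/dt = u_p
    x = x + u_p * dt
print(x)   # drifts ~0.2 m in +x with the flow while settling under gravity
```

Repeating this integration over many initial conditions and flow fields yields the trajectory dataset on which the data-driven model is trained.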
Parameter estimation of unknown properties using transfer learning from virtual to existing buildings
Published in Journal of Building Performance Simulation, 2021
The lack of detailed data for the training dataset is the main challenge in training a new data-driven model. In addition, training a model with a large dataset often takes a long time. To overcome these issues, transfer learning (TL) was introduced and has been applied to many tasks in various fields, including image classification and natural language processing. The aim of TL is to transfer knowledge acquired from a 'source' domain to a 'target' domain in order to improve the performance of a model with reduced training time. In this study, the authors used TL for identifying unknown factors because TL can inherit knowledge learned from 'source' data, as exemplified in Figure 4. While the conventional machine learning approach requires a separate model for each task, the TL approach can use one model for many tasks through rapid fine-tuning (Figure 4). In other words, a data-driven model trained with 'source' data, i.e. a pre-trained model, can be re-used after fine-tuning with target data that may not be sufficient on its own for a training dataset.
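The pre-train-then-fine-tune pattern can be sketched with a linear model standing in for the building model: the weights are first fit on abundant 'source' (virtual-building) data, then adjusted with a few gradient steps on a small 'target' set. All data and coefficients are synthetic placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)

# 'Source' domain: abundant simulated (virtual-building) data.
Xs = rng.uniform(-1, 1, (1000, 3))
ys = Xs @ np.array([2.0, -1.0, 0.5])

# Pre-train: ordinary least squares on the source data.
w = np.linalg.lstsq(Xs, ys, rcond=None)[0]

# 'Target' domain: only 20 measurements, with slightly shifted behaviour.
Xt = rng.uniform(-1, 1, (20, 3))
yt = Xt @ np.array([2.3, -0.9, 0.5])

# Fine-tune: a few gradient steps starting from the pre-trained weights,
# instead of training a new model from scratch on the small target set.
lr = 0.1
for _ in range(500):
    grad = Xt.T @ (Xt @ w - yt) / len(yt)
    w -= lr * grad
print(w)   # moves from [2, -1, 0.5] toward [2.3, -0.9, 0.5]
```

Because fine-tuning starts near a good solution, it needs far less target data and far fewer iterations than training from scratch, which is the efficiency TL offers.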
Evaluation of novel-objective functions in the design optimization of a transonic rotor by using deep learning
Published in Engineering Applications of Computational Fluid Mechanics, 2021
A. Zeinalzadeh, M.R. Pakatchian
Each neural network is trained over a design space and is responsible for predicting one objective function. In this regard, the Latin hypercube method is utilized for near-random sampling of the geometrical variables. The initial dataset, comprising 500 members, is divided into training, validation, and test data, which are normalized with 'MinMaxScaler' from the scikit-learn library in Python™. All members are solved by MISES, and the related objective functions are then calculated. The training data are used during the learning process to fit the data-driven model. Because the validation dataset also influences model fitting (it guides hyperparameter tuning and early stopping), it can be regarded as part of the training set. The test data are completely independent of the training data but follow the same probability distribution; they are used to obtain the performance characteristics of the final model. In this study, 5 to 10% of the data are used for validation, 10 to 15% for testing, and the remaining dataset is used for the training of each neural network.
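A sketch of this split-and-normalize step, assuming an 80/10/10 split of the 500 members and random stand-ins for the Latin-hypercube design variables. Note that the scaler is fit on the training subset only, so no information from the validation or test data leaks into the normalization.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Random stand-ins for 500 Latin-hypercube samples of 4 design variables.
rng = np.random.default_rng(3)
data = rng.uniform(0.0, 5.0, (500, 4))
rng.shuffle(data)

# 80% train, 10% validation, 10% test (within the proportions quoted above).
n_train, n_val = 400, 50
train = data[:n_train]
val = data[n_train:n_train + n_val]
test = data[n_train + n_val:]

# Fit MinMaxScaler on the training data only, then apply it to all subsets.
scaler = MinMaxScaler().fit(train)
train_s, val_s, test_s = (scaler.transform(s) for s in (train, val, test))
print(train_s.min(), train_s.max(), len(train), len(val), len(test))
```

Validation and test values may fall slightly outside [0, 1] after scaling, which is expected when the scaler is fit on the training subset alone.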