Data Collection and Analysis
Published in James William Martin, Lean Six Sigma for the Office, 2021
When Lean was introduced to manufacturing decades ago, data was collected directly from the process being analyzed. Data collection consisted of people walking into manufacturing operations and auditing inventory, measuring floor area, and calculating operational throughput rates. Additional historical information, such as machine downtime, lead times, and safety and quality issues, was taken from operational reports. This information, representing an operational snapshot, was brought into a hands-on Kaizen workshop to create a VFM, analyze it, and look for ways to improve the process. The analysis identified non-value-adding (NVA) operations and process waste. People did this work in person. In contrast, over the past few decades, work has increasingly been done through information technology (IT) systems and applications. It is virtual. Also, it is not unusual for large organizations to have several hundred software applications managed by different teams and business owners. Improvement teams now need to understand how and where data is stored, as well as how it flows from source systems to consuming systems to complete work. Tracing the movement of metadata from source to consuming applications is called data lineage or data mapping.
Big Data
Published in James William Martin, Operational Excellence, 2021
From a data perspective, a business process is an aggregation of metadata from several sources that builds work products such as a customer order profile. How an organization's systems use metadata is important for information governance. Data mapping, or data lineage, traces the sources of metadata and how it flows through various IT applications to create work products, such as an order or invoice. Data lineage shows whether metadata came from a trusted source. It is also important when there are quality issues or when business rules need to be modified. Software tools are used to crawl through IT applications and trace the end-to-end flow of metadata. Visualizing metadata lineage makes high-volume transaction flows visible, which helps focus process improvement efforts, including the evaluation of metadata performance against business rules. In summary, because of the number of applications and platforms used by organizations, process improvement professionals need to understand metadata quality, lineage, and governance to work effectively with business data stewards, business stakeholders, and others to improve process quality and efficiency.
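The tracing described above can be sketched as a walk over a directed graph of systems. The following is a minimal illustration, not the method used by any particular lineage tool: the system names, the `EDGES` list, and the `TRUSTED_SOURCES` set are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lineage edges: (upstream system, downstream system).
EDGES = [
    ("crm", "order_service"),
    ("pricing_db", "order_service"),
    ("order_service", "invoice_app"),
    ("spreadsheet_upload", "invoice_app"),
]
# Assumed governance list of sources considered trusted.
TRUSTED_SOURCES = {"crm", "pricing_db"}


def build_upstream_index(edges):
    """Map each system to the set of systems that feed it directly."""
    parents = defaultdict(set)
    for upstream, downstream in edges:
        parents[downstream].add(upstream)
    return parents


def root_sources(system, parents):
    """Return the original sources (nodes with no upstream) that reach `system`."""
    roots, stack, seen = set(), [system], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        upstream = parents.get(node, set())
        if not upstream and node != system:
            roots.add(node)  # nothing feeds this node, so it is an origin
        stack.extend(upstream)
    return roots


parents = build_upstream_index(EDGES)
sources = root_sources("invoice_app", parents)
untrusted = sources - TRUSTED_SOURCES  # origins that are not on the trusted list
print(sorted(sources))
print(sorted(untrusted))
```

Here an invoice is traced back to every system with no upstream feed, and any origin missing from the assumed trusted list is flagged for a data steward to review.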
Agile Project Management and Data Analytics
Published in Seweryn Spalek, Data Analytics in Project Management, 2018
A key failure point in data analytics projects is not planning enough time to get the data required for the project. Often the time needed to work with the data is underestimated or not known at the start of the project. Working with multiple technology platforms, multiple data structures, data sampling, data integration, and merging data into a final set for modeling takes time and planning. Additionally, once a final data set is created, the project team should document the data lineage so that the data processing logic is clear, which makes it possible to recreate the data set for iterative development (Gartner, 2015; Halper, 2016; Larson & Chang, 2016). Data lineage deals with the rawest form of data (data at its lowest grain), which enables efficiencies in data movement and storage, resulting in a higher return on investment in data analytics projects.
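One way to document lineage so that a final data set can be recreated is to log every processing step alongside a fingerprint of its output. The sketch below assumes a toy in-memory data set; the step functions `drop_missing` and `to_cents` and the record fields are hypothetical and stand in for whatever processing a real project performs.

```python
import hashlib
import json


def drop_missing(rows):
    """Remove records with no amount."""
    return [r for r in rows if r["amount"] is not None]


def to_cents(rows):
    """Convert amounts to integer cents."""
    return [{**r, "amount": round(r["amount"] * 100)} for r in rows]


# The documented processing logic, in the order it is applied.
STEPS = [drop_missing, to_cents]


def fingerprint(rows):
    """Short, deterministic hash of the data at one stage."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_with_lineage(rows, steps):
    """Apply each step and record its name, row count, and output fingerprint."""
    lineage = [{"step": "raw", "rows": len(rows), "fingerprint": fingerprint(rows)}]
    for step in steps:
        rows = step(rows)
        lineage.append(
            {"step": step.__name__, "rows": len(rows), "fingerprint": fingerprint(rows)}
        )
    return rows, lineage


raw = [
    {"id": 1, "amount": 12.5},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 3.0},
]
final, lineage = run_with_lineage(raw, STEPS)
# Replaying the same steps on the same raw data reproduces the data set.
rebuilt, lineage_again = run_with_lineage(raw, STEPS)
for entry in lineage:
    print(entry["step"], entry["rows"], entry["fingerprint"])
```

Because the log starts from the raw data and names each step, a later iteration can rerun the same steps and compare fingerprints to confirm that the rebuilt data set matches the original.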
Lessons from a Marine Spatial Planning data management process for Ireland
Published in International Journal of Digital Earth, 2021
Sarah Flynn, Will Meaney, Adam M. Leadbetter, Jeffrey P. Fisher, Caitriona Nic Aonghusa
Our findings from the pilot show that metadata records containing information about the dataset, including how it can be accessed, will likely be the first information about the dataset that is read. This record should contain enough information about the data, collection methods, and format of the data to enable the re-user to determine whether the data are fit for purpose or not. Data lineage is an important first step for data governance; a visual representation of data lineage helps to track data from its origin to its destination. MSP users require data of high integrity, uniformity and correctness. ‘A picture paints a thousand words …’: the ability to visually identify strengths, weaknesses, opportunities and threats within a process will greatly enhance the delivery of the highest standards of evidence used for planning and decision making for MSP.
Artificial Intelligence Governance For Businesses
Published in Information Systems Management, 2023
Johannes Schneider, Rene Abraham, Christian Meske, Jan Vom Brocke
Data lineage includes tracking the origin of data, where it moves, and how it is processed. In AI in particular, this includes connecting data, pre-processing steps, and the models that use the pre-processed data (for training; Vartak et al., 2016). The same data might be used in many different ML models, e.g., through transfer learning. Thus, proper governance should ensure that any updates, e.g., improvements in data quality, are propagated adequately to all relevant stakeholders.
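Propagating an update to everyone affected amounts to a downstream walk over the lineage graph. The following is a minimal sketch under stated assumptions: the artifact names, the `DOWNSTREAM` mapping, and the `OWNERS` table are hypothetical, and a real ML platform would supply this information from its own metadata store.

```python
from collections import deque

# Hypothetical lineage: each artifact maps to the artifacts built from it.
DOWNSTREAM = {
    "raw_reviews": ["clean_reviews"],
    "clean_reviews": ["sentiment_model", "base_language_model"],
    "base_language_model": ["support_chat_model"],  # e.g. via transfer learning
}
# Assumed mapping from artifact to the stakeholder responsible for it.
OWNERS = {
    "clean_reviews": "data-engineering",
    "sentiment_model": "analytics-team",
    "base_language_model": "ml-platform",
    "support_chat_model": "support-tools",
}


def affected_artifacts(changed, downstream):
    """Breadth-first list of everything derived, directly or not, from `changed`."""
    order, seen, queue = [], {changed}, deque([changed])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected = affected_artifacts("raw_reviews", DOWNSTREAM)
to_notify = sorted({OWNERS[a] for a in affected if a in OWNERS})
print(affected)
print(to_notify)
```

The walk also reaches the model that depends on the data only indirectly through transfer learning, which is the case that is easiest to miss without recorded lineage.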