Collaboration with version control
Published in Tiffany Timbers, Trevor Campbell, Melissa Lee, Data Science, 2022
Tiffany Timbers, Trevor Campbell, Melissa Lee
Once you are done with your edits, you can “save” them by committing your changes. When you commit a file in a repository, the version control system takes a snapshot of what the file looks like. As you continue working on the project, you will likely make many commits to a single file over time; this generates a useful version history for that file. On GitHub, if you click the green “Commit changes” button, it will save the file and then make a commit (Figure 12.13).
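The idea of a commit as a snapshot that accumulates into a version history can be sketched in a few lines of Python. This is a toy model for illustration only, not how Git or GitHub actually store data; the `ToyHistory` class, its method names, and the example file contents are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ToyHistory:
    """Hypothetical, simplified model of one file's version history."""

    # Each entry is (commit_id, message, content), oldest first.
    commits: list = field(default_factory=list)

    def commit(self, content: str, message: str) -> str:
        # Derive an identifier from the snapshot's position and content.
        # (Assumption: a real system hashes much richer metadata.)
        payload = f"{len(self.commits)}:{content}".encode("utf-8")
        commit_id = hashlib.sha1(payload).hexdigest()
        self.commits.append((commit_id, message, content))
        return commit_id

    def checkout(self, commit_id: str) -> str:
        # Return the file content exactly as it was at the given commit.
        for cid, _message, content in self.commits:
            if cid == commit_id:
                return content
        raise KeyError(commit_id)


history = ToyHistory()
first = history.commit("x = 1\n", "Add x")
second = history.commit("x = 1\ny = 2\n", "Add y")
```

Because every commit keeps its own snapshot, `history.checkout(first)` still returns the earlier version of the file after later commits have been made.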
A review of the state of the art in business intelligence software
Published in Enterprise Information Systems, 2022
Gautam Srivastava, Muneeswari S, Revathi Venkataraman, Kavitha V, Parthiban N
All the necessary data is obtained from the metadata stage (Huang et al. 2017; Jia, Liu, and Yang 2014; Lv et al. 2020; Sivapathasundaram, Cheng, and Petridis 2020; Tang, Srivastava, and Liu 2020; Zhu et al. 2017). (1) Internal sources such as ERP, CPM, Sales, and Human Resource Management (HRM) systems contain data that is used in the supply chain management of the organization (Richards et al. 2019). This data is located in an internal metadata repository. The meta layer handles the data format, encoding and decoding, performance metrics, domain constraints, quality alerts, definitions, and other aspects of the organization that assume new interests (Berthold et al. 2010). (2) External sources of data include data from users, managers, suppliers, policy agencies, sales, finance, and business competitors. The acquired information is extracted, transformed, and loaded (ETL) into storage devices. Ultimately, the data from these sources is processed in the warehouse for decision-making.
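The extract, transform, and load steps mentioned above can be illustrated with a minimal sketch. It assumes a small CSV export as the source and an in-memory SQLite database as the storage target; the sample data, the `sales` table, and the function names are hypothetical and are not taken from the article.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an internal sales system.
RAW_SALES = "region,amount\nnorth, 120.50\nsouth,80\nnorth,not_available\n"


def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise region names and drop rows with invalid amounts."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # skip rows whose amount is not numeric
        cleaned.append((row["region"].strip().lower(), amount))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the storage table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_SALES)), conn)

# Query the loaded data, as a decision-support step might.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
```

Keeping the three stages as separate functions makes it possible to swap the source or the storage target without touching the cleaning rules.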
SuperMat: construction of a linked annotated dataset from superconductors-related publications
Published in Science and Technology of Advanced Materials: Methods, 2021
Luca Foppiano, Sae Dieb, Akira Suzuki, Pedro Baptista de Castro, Suguru Iwasaki, Azusa Uzuki, Miren Garbine Esparza Echevarria, Yan Meng, Kensei Terashima, Laurent Romary, Yoshihiko Takano, Masashi Ishii
Annotation guidelines include the principles and the rules that describe what constitutes desired information for the SuperMat dataset and how to annotate it. They include detailed descriptions of the specific rules that have been defined for each type of information to be annotated, with one or more definitions and examples illustrating what to annotate in different cases, exceptions, and references. We used an online system to track the discussions and decisions when a question or a comment was raised, and provided a link to such issues in the respective description or example. In addition, the guidelines include linking rules that provide information on how to correctly connect the entities in a relationship. The guidelines were built using a dynamic markup language (called reStructuredText) and stored in a git (https://git-scm.com/) version control system repository. We deployed them as HTML files on the web, which were updated automatically after each modification. They can be accessed at https://supermat.readthedocs.io.
An architecture for synchronising cloud file storage and organisation repositories
Published in International Journal of Parallel, Emergent and Distributed Systems, 2019
Gil Andriani, Eduardo Godoy, Guilherme Koslovski, Rafael Obelheiro, Mauricio Pillon
Cloud-based file storage comprises two main actors: data storage repositories and synchronisation clients. The internal architecture of repositories is usually composed of modules for monitoring and indexing data, implemented as a distributed file system, focusing on delivering availability, authenticity, and confidentiality to users. Complementarily, cloud providers offer services to control versioning, quota sharing and collaborative editing. For their part, file synchronisation clients are responsible for keeping a directory structure and its files synchronised with the cloud repository. File transfer events are realised by synchronisers and observers, which act on the metadata associated with the files. There are two observers, one placed locally and one remotely, responsible for identifying changes in files generated by sharing users. When needed, a synchronising module performs the transfer of files (partial or total) between local and remote repositories. Popular clients share some common features, such as incremental synchronisation, encryption and data compression.
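The role of an observer can be illustrated with a small sketch that compares two metadata snapshots of a directory and reports which files a synchroniser would need to transfer. It assumes that a content hash is the only metadata tracked and that files are given as an in-memory mapping; the function names and example files are hypothetical and do not describe any particular client.

```python
import hashlib


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Build per-file metadata: here, only a SHA-256 digest of the content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def detect_changes(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two metadata snapshots and classify the differences."""
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "modified": sorted(
            path
            for path in current.keys() & previous.keys()
            if current[path] != previous[path]
        ),
    }


# Hypothetical directory state before and after a user's edits.
before = snapshot({"a.txt": b"one", "b.txt": b"two"})
after = snapshot({"a.txt": b"one!", "c.txt": b"three"})
changes = detect_changes(before, after)
```

Only the paths listed in `changes` need attention, which is the basic idea behind incremental synchronisation: unchanged files are never transferred.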