Digitalization in the Energy Sector
Published in Muhammad Asif, Handbook of Energy Transitions, 2023
Muhammad Umer, Muhammad Abid, Tahira Nazir, Zaineb Abid
Data privacy can be protected to some extent by removing identifiable information associated with an individual or a group. Doing so requires accounting for quasi-identifiers. A quasi-identifier is a piece of information that does not by itself identify an individual but can do so when combined with other quasi-identifiers. Research has been conducted globally on algorithms that offer a greater level of confidentiality and safeguard against individual-level identification through quasi-identifiers. In practice, implementing such research in everyday use demands careful control over the selection and validity of test data, so that data-based frameworks can function and add value while maintaining a high degree of privacy (Asghar, Dán, Miorandi, & Chlamtac, 2017; Khatoun & Zeadally, 2017).
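To make the linkage risk concrete, the following minimal sketch (with entirely hypothetical data) shows how two quasi-identifiers that are harmless in isolation can single out one individual when joined against a public dataset:

```python
# Public voter-style records: names are known here.
public_records = [
    {"name": "Alice", "zip": "10001", "birth_year": 1980},
    {"name": "Bob",   "zip": "10001", "birth_year": 1975},
    {"name": "Carol", "zip": "10002", "birth_year": 1980},
]

# "Anonymized" release: names removed, but quasi-identifiers kept.
released = [
    {"zip": "10001", "birth_year": 1980, "diagnosis": "flu"},
]

def link(released_row, public):
    """Return the public records matching the released row's quasi-identifiers."""
    return [p for p in public
            if p["zip"] == released_row["zip"]
            and p["birth_year"] == released_row["birth_year"]]

matches = link(released[0], public_records)
# The zip/birth-year pair matches exactly one person, re-identifying the record.
print(matches[0]["name"])  # -> Alice
```

Neither the zip code nor the birth year identifies anyone on its own; the combination does.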
Security and Privacy Issues in Biomedical AI Systems and Potential Solutions
Published in Saravanan Krishnan, Ramesh Kesavan, B. Surendiran, G. S. Mahalakshmi, Handbook of Artificial Intelligence in Biomedical Engineering, 2021
k-Anonymity (Sweeney 2002) is a concept used frequently in relation to PPDM. Related concepts include p-sensitive anonymity (Wibowo 2018; Cooper and Elstun, 2018); l-diversity (Tu et al., 2019; Machanavajjhala et al., 2006), a group-based anonymization technique that focuses on decreasing the “granularity” of the dataset; t-closeness, which is essentially an improvement over l-diversity; and quasi-identifiers, attributes that may have linkage with an external dataset (Veličković et al., 2017). However, these methods are not foolproof, and we often need to look at newer techniques such as differential privacy to ensure maximal privacy and security. For example, these methods fail when an adversary has auxiliary information from secondary sources. They also tend to overfit or over-generalize, resulting in misleading predictions, which is very harmful to biomedical AI systems. A relevant study in this regard is Pycroft and Aziz (2018), where the idea of k-anonymity is improved by introducing “semantic linkage k-anonymity” to balance privacy loss and accuracy.
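The limitation noted above can be illustrated with a small sketch (hypothetical data and helper names): a table can satisfy k-anonymity yet still leak a sensitive value when an equivalence class lacks diversity, which is exactly the gap that l-diversity targets:

```python
from collections import defaultdict

def equivalence_classes(records, quasi_ids):
    """Group records by their quasi-identifier values."""
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[q] for q in quasi_ids)].append(r)
    return classes

def is_k_anonymous(records, quasi_ids, k):
    """Every equivalence class must contain at least k records."""
    return all(len(c) >= k
               for c in equivalence_classes(records, quasi_ids).values())

def is_l_diverse(records, quasi_ids, sensitive, l):
    """Every equivalence class must contain at least l distinct sensitive values."""
    return all(len({r[sensitive] for r in c}) >= l
               for c in equivalence_classes(records, quasi_ids).values())

# Hypothetical generalized table: it is 2-anonymous, but the "40-49"
# class holds a single disease value, so it is not 2-diverse --
# anyone known to be in that class must have the flu.
table = [
    {"age": "30-39", "zip": "100**", "disease": "flu"},
    {"age": "30-39", "zip": "100**", "disease": "cancer"},
    {"age": "40-49", "zip": "100**", "disease": "flu"},
    {"age": "40-49", "zip": "100**", "disease": "flu"},
]

print(is_k_anonymous(table, ["age", "zip"], 2))           # True
print(is_l_diverse(table, ["age", "zip"], "disease", 2))  # False
```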
Privacy in Wireless Sensor Networks: Issues, Challenges, and Solutions
Published in Shafiullah Khan, Al-Sakib Khan Pathan, Nabil Ali Alrajeh, Wireless Sensor Networks, 2016
In the data generalization method, sensitive data or quasi-identifiers are mapped to another domain, which enables statistical disclosure control. There is a clear distinction between sensitive data and quasi-identifiers. Attributes that should not be publicly disclosed are called sensitive data or sensitive attributes. Quasi-identifiers are attributes, or combinations of attributes, that are nonsensitive on their own but, when combined with external data, are capable of identifying private records. Another important term in reference to PPDM is the “equivalence class,” defined as the set of tuples that cannot be distinguished from each other with respect to the quasi-identifiers. The motivating factor behind the generalization method is that many attributes in the data set are quasi-identifiers: when they are analyzed against publicly available records, sensitive data (which is masked or randomized) can be recovered. For example, if a patient’s name is obfuscated in a medical database, an attacker can still identify the patient by analyzing attributes such as zip code, birth date, attending doctors, and so on. In fact, k-anonymity was developed to address this problem of indirect inference, which identifies private data from public records [29].
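The generalization step described above can be sketched as follows (the masking widths and field names are illustrative choices, not prescribed by the chapter): each quasi-identifier is mapped to a coarser domain, so that many original values collapse into the same generalized value and form equivalence classes.

```python
def generalize_zip(zip_code, keep=3):
    """Coarsen a zip code by masking its trailing digits."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def generalize_age(age, width=10):
    """Map an exact age onto a range of the given width."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

record = {"zip": "10013", "age": 37, "diagnosis": "flu"}
generalized = {
    "zip": generalize_zip(record["zip"]),  # '100**'
    "age": generalize_age(record["age"]),  # '30-39'
    "diagnosis": record["diagnosis"],      # sensitive value, left as-is
}
print(generalized)
```

Every patient aged 30–39 in the 100** zip area now falls into the same equivalence class, so the record can no longer be pinned to one individual by these two attributes alone.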
Methods and tools for healthcare data anonymization: a literature review
Published in International Journal of General Systems, 2023
Olga Vovk, Gunnar Piho, Peeter Ross
Certain methods are intended for specific cases and specific types of datasets. For example, m-invariance is a privacy protection technique for the re-publication of dynamic datasets with sensitive personal information (Jayabalan and Rana 2018). k-join-anonymity permits more effective generalization to reduce information loss on microdata; it was designed to provide the same level of accuracy as k-anonymity, but it may be vulnerable to attribute disclosure (Anjum et al. 2018). S-diversity overcomes similarity attacks in the worst-case scenario using a clustering algorithm (Pawar, Ahirrao, and Churi 2018). (α, k)-anonymity is designed to protect both identification and relationships to sensitive information in data; however, a risk of attribute disclosure remains due to the possibility of linkage attacks (Anjum et al. 2018). (k, e)-anonymity is based on separating the published sensitive values and quasi-identifiers into different tables. Accuracy is improved because the quasi-identifiers do not need to be generalized.
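The table-separation idea behind (k, e)-anonymity can be sketched as follows (hypothetical data and helper names; a full implementation would choose the partition so every group meets both the size-k and range-e conditions, which this sketch only checks after the fact):

```python
records = [
    {"age": 27, "zip": "10013", "salary": 52000},
    {"age": 29, "zip": "10014", "salary": 55000},
    {"age": 41, "zip": "10021", "salary": 90000},
    {"age": 44, "zip": "10022", "salary": 98000},
]

def separate(rows, group_size=2):
    """Split rows into a quasi-identifier table and a sensitive table,
    joined only by a group id, so the quasi-identifiers stay exact."""
    qi_table, sensitive_table = [], []
    for i, row in enumerate(rows):
        gid = i // group_size
        qi_table.append({"group": gid, "age": row["age"], "zip": row["zip"]})
        sensitive_table.append({"group": gid, "salary": row["salary"]})
    return qi_table, sensitive_table

def respects_e(sensitive_table, e):
    """Check that each group's sensitive values span a range of at least e."""
    groups = {}
    for row in sensitive_table:
        groups.setdefault(row["group"], []).append(row["salary"])
    return all(max(v) - min(v) >= e for v in groups.values())

qi, sens = separate(records)
# Within a group, a salary can only be tied to the group, not to one row.
print(respects_e(sens, 3000))  # True
print(respects_e(sens, 5000))  # False: group 0 spans only 3000
```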
A review of Automatic end-to-end De-Identification: Is High Accuracy the Only Metric?
Published in Applied Artificial Intelligence, 2020
Vithya Yogarajan, Bernhard Pfahringer, Michael Mayo
PHI is categorized into explicit identifiers and quasi-identifiers. Explicit identifiers, such as name, phone number, and social security number, are directly linked to a patient. Quasi-identifiers, such as age, gender, race, and zip code, are not directly connected to a patient but can be linked to external data sources and consequently used to identify a patient, posing the same risk to patient privacy as explicit identifiers.
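This two-way categorization suggests a simple de-identification pass, sketched below with hypothetical field names: explicit identifiers are suppressed outright, while quasi-identifiers are coarsened rather than removed, since they may still carry analytic value.

```python
EXPLICIT_IDS = {"name", "phone", "ssn"}       # directly identifying
QUASI_IDS = {"age", "gender", "race", "zip"}  # identifying only in combination

def deidentify(record):
    """Suppress explicit identifiers; coarsen quasi-identifiers; keep the rest."""
    out = {}
    for field, value in record.items():
        if field in EXPLICIT_IDS:
            continue                           # drop entirely
        elif field == "zip":
            out[field] = value[:3] + "**"      # generalize trailing digits
        elif field == "age":
            out[field] = f"{(value // 10) * 10}s"  # bucket into a decade
        else:
            out[field] = value
    return out

patient = {"name": "Jane Doe", "ssn": "123-45-6789",
           "age": 42, "gender": "F", "zip": "10013", "diagnosis": "flu"}
print(deidentify(patient))
# {'age': '40s', 'gender': 'F', 'zip': '100**', 'diagnosis': 'flu'}
```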
e-DMDAV: A new privacy preserving algorithm for wearable enterprise information systems
Published in Enterprise Information Systems, 2018
Zhenjiang Zhang, Xiaoni Wang, Lorna Uden, Peng Zhang, Yingsi Zhao
K-anonymity is a method used to address this problem in a partially optimal way. A release provides k-anonymity protection if every combination of quasi-identifier attribute values appearing in the release is shared by at least k records (Sweeney 2002).
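Sweeney's condition can be checked directly, as in this minimal sketch over hypothetical generalized records:

```python
def satisfies_k_anonymity(release, quasi_ids, k):
    """Sweeney's condition: every record's quasi-identifier combination
    must appear in at least k records of the release."""
    for row in release:
        key = tuple(row[q] for q in quasi_ids)
        count = sum(1 for other in release
                    if tuple(other[q] for q in quasi_ids) == key)
        if count < k:
            return False
    return True

release = [
    {"age": "2*", "zip": "537**"},
    {"age": "2*", "zip": "537**"},
    {"age": "3*", "zip": "537**"},
]
# The lone "3*" record breaks 2-anonymity for the full release.
print(satisfies_k_anonymity(release, ["age", "zip"], 2))      # False
print(satisfies_k_anonymity(release[:2], ["age", "zip"], 2))  # True
```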