: George T. Duncan, Mark Elliot, Juan Jose Salazar Gonzalez
: Statistical Confidentiality: Principles and Practice
: Springer-Verlag
: 9781441978028
: 1
: CHF 47.50
:
: Methods of Empirical and Qualitative Social Research
: English
: 200
: Watermark
: PC/MAC/eReader/Tablet
: PDF

Because statistical confidentiality embraces the responsibility for both protecting data and ensuring its beneficial use for statistical purposes, those working with personal and proprietary data can benefit from the principles and practices this book presents. Researchers can understand why an agency holding statistical data does not respond well to the demand, “Just give me the data; I’m only going to do good things with it.” Statisticians can incorporate the requirements of statistical confidentiality into their methodologies for data collection and analysis. Data stewards, caught between those eager for data and those who worry about confidentiality, can use the tools of statistical confidentiality to satisfy both groups.
The eight chapters lay out the dilemma that data stewardship organizations (such as statistical agencies) face in resolving the tension between protecting data from snoopers and providing data to legitimate users; explain disclosure risk and explore the types of attack that a data snooper might mount; present the methods of disclosure risk assessment; give techniques for statistical disclosure limitation of both tabular data and microdata; identify measures of the impact of disclosure limitation on data utility; provide restricted access methods as administrative procedures for disclosure control; and finally explore the future of statistical confidentiality.



Preface
Contents
1 Why Statistical Confidentiality?
1.1 What Is Statistical Confidentiality?
1.2 Stakeholders in the Statistical Process
1.3 The Data Stewardship Organization's Dilemma
1.4 The Value of Statistical Data
1.5 Why Are DSOs Concerned About Statistical Confidentiality?
1.5.1 A Difficult Context for a DSO
1.5.1.1 Privacy Worries
1.5.1.2 Confidentiality Concerns
1.5.1.3 Changing Legal and Social Context
1.5.1.4 Sensitivity to Social Impact—"Group Harm"
1.5.2 Providing Data and Protecting Confidentiality
1.5.3 Consequences of a Confidentiality Breach
1.5.4 What Motivates a DSO to Provide Confidentiality?
1.5.4.1 Legal Requirements and Fair Information Practices
1.5.4.2 Pragmatic Considerations
1.5.4.3 Ethical Obligations
1.6 High-Quality Statistical Data Raise Confidentiality Concerns
1.6.1 Characteristics of High-Quality Statistical Data
1.6.2 Disclosure Risk Problems Stemming from Characteristics of High-Quality Statistical Data
1.7 Disclosure Risk and the Concept of the Data Snooper
1.8 Strategies of Statistical Disclosure Limitation
1.8.1 Restricted Access
1.8.2 Restricted Data
1.9 Summary
2 Concepts of Statistical Disclosure Limitation
2.1 Conceptual Models of Disclosure Risk
2.1.1 Elements of the Disclosure Risk Problem
2.1.1.1 Microdata
2.1.1.2 Deliberate Linkage
2.1.1.3 Aggregate Data
2.1.1.4 Attribution and Subtractive Attack
2.1.1.5 Linking Tables
2.1.1.6 Hierarchical Tables
2.1.1.7 Linking Anonymized Data Sets
2.1.1.8 Spontaneous Recognition
2.1.2 Perceived and Actual Risk
2.1.3 Scenarios of Disclosure
2.1.3.1 Motivation
2.1.3.2 Means
2.1.3.3 Opportunity
2.1.3.4 Types of Attacks
2.1.3.5 Key Variables
2.1.3.6 Target Variables
2.1.3.7 Effect of Data Divergence
2.1.3.8 Likelihood of Success
2.1.4 Data Environment Analysis
2.2 Assessing the Risk
2.2.1 Uniqueness
2.2.2 Matching/Reidentification Experiments
2.2.3 Disclosure Risk Assessment for Aggregate Data
2.3 Controlling the Risk
2.3.1 Metadata Level Controls
2.3.2 Distorting the Data
2.3.3 Controlling Access
2.4 Data Utility Impact
2.5 Summary
3 Assessment of Disclosure Risk
3.1 Thresholds and Other Proxies
3.2 Risk Assessment for Microdata: Types of Matching
3.2.1 File-Level Risk Metrics
3.2.1.1 Population Uniqueness
3.2.1.2 The Proportion of Sample Uniques that are Population Unique
3.2.1.3 The Skinner and Elliot Method
3.2.2 Record-Level Risk Metrics
3.2.2.1 Probability Modeling Approaches
3.2.2.2 Special Uniqueness
3.3 Record Linkage Studies
3.3.1 Using an External Data Set
3.3.2 Using the Pre-SDL Data Set
3.3.2.1 Distance-Based Record Linkage
3.3.2.2 Probabilistic Record Linkage
3.4 Risk Assessment for Count Data
3.5 What is at Risk?: Understanding Sensitivity
3.6 Summary
4 Protecting Tabular Data
4.1 Basic Concepts
4.1.1 Structure of a Tabular Array
4.1.2 Risky Cells
4.1.2.1 Dominance Rule or (n, k)-Rule
4.1.2.2 Prior/Posterior Ambiguity Rule
4.1.2.3 n-Rule
4.1.3 The Secondary Problem: The Data Snooper's Knowledge
4.1.3.1 A Priori Knowledge
4.1.3.2 The Output Pattern
4.1.4 Disclosure Limitation
4.1.5 Loss of Information
4.1.6 The DSO's Problem
4.1.7 Disclosure Auditing
4.2 Four Methods to Protect Tables
4.2.1 Cell Suppression
4.2.2 Interval Publication
4.2.3 Controlled Rounding
4.2.4 Cell Perturbation
4.2.5 All-in-One Method
4.3 Other Methods
4.3.1 Table Redesign
4.3.2 Introducing Noise to Microdata
4.3.3 Data Swapping
4.3.4 Cyclic Perturbation
4.3.5 Random Rounding
4.3.6 Controlled Tabular Adjustment
4.4 Summary
5 Providing and Protecting Microdata
5.1 Why Provide Access?
5.2 Confidentiality Concerns
5.3 Why Protect Microdata?
5.4 Restricted Data
5.4.1 In Order to Limit Disclosure, What Shall We Mask?
5.5 Matrix Masking
5.6 Masking Through Suppression
5.7 Local S