
Statistical Confidentiality Principles and Practice

:	George T. Duncan, Mark Elliot, Juan José Salazar-González
:	Statistical Confidentiality Principles and Practice
:	Springer-Verlag
:	9781441978028
:	1
:	CHF 47.50

:	Methods of Empirical and Qualitative Social Research
:	English

:	200
:	Watermark
:	PC/MAC/eReader/Tablet
:	PDF

Because statistical confidentiality embraces the responsibility for both protecting data and ensuring its beneficial use for statistical purposes, those working with personal and proprietary data can benefit from the principles and practices this book presents. Researchers can understand why an agency holding statistical data does not respond well to the demand, “Just give me the data; I’m only going to do good things with it.” Statisticians can incorporate the requirements of statistical confidentiality into their methodologies for data collection and analysis. Data stewards, caught between those eager for data and those who worry about confidentiality, can use the tools of statistical confidentiality toward satisfying both groups.
The eight chapters lay out the dilemma of data stewardship organizations (such as statistical agencies) in resolving the tension between protecting data from snoopers and providing data to legitimate users; explain disclosure risk and explore the types of attack that a data snooper might mount; present the methods of disclosure risk assessment; give techniques for statistical disclosure limitation of both tabular data and microdata; identify measures of the impact of disclosure limitation on data utility; provide restricted access methods as administrative procedures for disclosure control; and finally explore the future of statistical confidentiality.

	Preface
	Contents
	1 Why Statistical Confidentiality?
	1.1 What Is Statistical Confidentiality?
	1.2 Stakeholders in the Statistical Process
	1.3 The Data Stewardship Organization's Dilemma
	1.4 The Value of Statistical Data
	1.5 Why Are DSOs Concerned About Statistical Confidentiality?
	1.5.1 A Difficult Context for a DSO
	1.5.1.1 Privacy Worries
	1.5.1.2 Confidentiality Concerns
	1.5.1.3 Changing Legal and Social Context
	1.5.1.4 Sensitivity to Social Impact: "Group Harm"
	1.5.2 Providing Data and Protecting Confidentiality
	1.5.3 Consequences of a Confidentiality Breach
	1.5.4 What Motivates a DSO to Provide Confidentiality?
	1.5.4.1 Legal Requirements and Fair Information Practices
	1.5.4.2 Pragmatic Considerations
	1.5.4.3 Ethical Obligations
	1.6 High-Quality Statistical Data Raise Confidentiality Concerns
	1.6.1 Characteristics of High-Quality Statistical Data
	1.6.2 Disclosure Risk Problems Stemming from Characteristics of High-Quality Statistical Data
	1.7 Disclosure Risk and the Concept of the Data Snooper
	1.8 Strategies of Statistical Disclosure Limitation
	1.8.1 Restricted Access
	1.8.2 Restricted Data
	1.9 Summary
	2 Concepts of Statistical Disclosure Limitation
	2.1 Conceptual Models of Disclosure Risk
	2.1.1 Elements of the Disclosure Risk Problem
	2.1.1.1 Microdata
	2.1.1.2 Deliberate Linkage
	2.1.1.3 Aggregate Data
	2.1.1.4 Attribution and Subtractive Attack
	2.1.1.5 Linking Tables
	2.1.1.6 Hierarchical Tables
	2.1.1.7 Linking Anonymized Data Sets
	2.1.1.8 Spontaneous Recognition
	2.1.2 Perceived and Actual Risk
	2.1.3 Scenarios of Disclosure
	2.1.3.1 Motivation
	2.1.3.2 Means
	2.1.3.3 Opportunity
	2.1.3.4 Types of Attacks
	2.1.3.5 Key Variables
	2.1.3.6 Target Variables
	2.1.3.7 Effect of Data Divergence
	2.1.3.8 Likelihood of Success
	2.1.4 Data Environment Analysis
	2.2 Assessing the Risk
	2.2.1 Uniqueness
	2.2.2 Matching/Reidentification Experiments
	2.2.3 Disclosure Risk Assessment for Aggregate Data
	2.3 Controlling the Risk
	2.3.1 Metadata Level Controls
	2.3.2 Distorting the Data
	2.3.3 Controlling Access
	2.4 Data Utility Impact
	2.5 Summary
	3 Assessment of Disclosure Risk
	3.1 Thresholds and Other Proxies
	3.2 Risk Assessment for Microdata: Types of Matching
	3.2.1 File-Level Risk Metrics
	3.2.1.1 Population Uniqueness
	3.2.1.2 The Proportion of Sample Uniques that are Population Unique
	3.2.1.3 The Skinner and Elliot Method
	3.2.2 Record-Level Risk Metrics
	3.2.2.1 Probability Modeling Approaches
	3.2.2.2 Special Uniqueness
	3.3 Record Linkage Studies
	3.3.1 Using an External Data Set
	3.3.2 Using the Pre-SDL Data Set
	3.3.2.1 Distance-Based Record Linkage
	3.3.2.2 Probabilistic Record Linkage
	3.4 Risk Assessment for Count Data
	3.5 What is at Risk?: Understanding Sensitivity
	3.6 Summary
	4 Protecting Tabular Data
	4.1 Basic Concepts
	4.1.1 Structure of a Tabular Array
	4.1.2 Risky Cells
	4.1.2.1 Dominance Rule or (n, k)-Rule
	4.1.2.2 Prior/Posterior Ambiguity Rule
	4.1.2.3 n-Rule
	4.1.3 The Secondary Problem: The Data Snooper's Knowledge
	4.1.3.1 A Priori Knowledge
	4.1.3.2 The Output Pattern
	4.1.4 Disclosure Limitation
	4.1.5 Loss of Information
	4.1.6 The DSO's Problem
	4.1.7 Disclosure Auditing
	4.2 Four Methods to Protect Tables
	4.2.1 Cell Suppression
	4.2.2 Interval Publication
	4.2.3 Controlled Rounding
	4.2.4 Cell Perturbation
	4.2.5 All-in-One Method
	4.3 Other Methods
	4.3.1 Table Redesign
	4.3.2 Introducing Noise to Microdata
	4.3.3 Data Swapping
	4.3.4 Cyclic Perturbation
	4.3.5 Random Rounding
	4.3.6 Controlled Tabular Adjustment
	4.4 Summary
	5 Providing and Protecting Microdata
	5.1 Why Provide Access?
	5.2 Confidentiality Concerns
	5.3 Why Protect Microdata?
	5.4 Restricted Data
	5.4.1 In Order to Limit Disclosure, What Shall We Mask?
	5.5 Matrix Masking
	5.6 Masking Through Suppression
	5.7 Local S