Statistical Disclosure Control
* under development *
This page contains links to various SDC papers (only Felix Ritchie's at present). Here you'll also find the ACRO (Automatic Checking of Research Outputs) Stata code and guides. Please contact us for details while this is in pilot phase.
Felix Ritchie's research papers on SDC
- Disclosure control of analytical outputs
2006, working paper (republished in WISERD Data Resources stream)
Explains model-based assessment of output for disclosure risks, using linear regression as an example and showing it to be non-disclosive in normal situations. Re-published as WISERD Working Paper no. 5 in 2011 with spelling mistakes removed and a conclusion added
[Final paper]
Full citation: Ritchie F. (2006) "Disclosure Control of Analytical Outputs". Mimeo: Office for National Statistics. Edited and reprinted as WISERD Data and Methods Working Paper no. 5 (2011).
- Statistical disclosure control in a research environment
2007, working paper (republished in WISERD Data Resources stream)
Explains the need for SDC designed for research environments, introducing the concept of the 'research zoo'. Outlines the principles-based output SDC model
[Final paper] [Presentation]
Full citation: Ritchie F. (2007) Statistical disclosure control in a research environment, mimeo, Office for National Statistics. Edited and reprinted as WISERD Data and Methods Working Paper no. 6 (2011).
- Disclosure detection in research environments in practice
2008, published
Describes how principles-based output SDC can work, and introduces the 'safe-unsafe' model; also explains the need for researcher engagement and training
[Final paper] [Conference paper]
Full citation: Ritchie F. (2008) “Disclosure detection in research environments in practice”, in Work session on statistical data confidentiality 2007; Eurostat; pp399-406
- Guidelines for the checking of output based on microdata research
2010, published; with NSIs of UK, Germany, Netherlands, Italy
Final report for Eurostat on best-practice guidelines, for both SDC and training of researchers; largely based on the UK VML model but brings in some additional rules and formalises the idea of 'rules of thumb'
[Final paper]
Full citation: Brandt M., Franconi L., Guerke C., Hundepool A., Lucarelli M., Mol J., Ritchie F., Seri G. and Welpton R. (2010), Guidelines for the checking of output based on microdata research, Final report of ESSnet sub-group on output SDC
- Output-based disclosure control for regressions
2012, working paper
Revised version of Ritchie (2006): corrects drafting errors, revises guidelines, and places concern over malicious analysis in institutional context. Guidelines REPLACE the ones in Ritchie (2006)
[Final paper]
Full citation: Ritchie F. (2012) "Output-based disclosure control for regressions". Working papers in economics no. 1209. University of the West of England, Bristol.
- Operationalising safe statistics: the case of linear regression
2014, working paper
Outlines a four-stage method for determining the status of a statistic as 'safe' or 'unsafe'. This leads to a discussion on the subjective nature of 'safety' and the need for realistic assessments of risk rather than a focus on theoretical worst-case scenarios.
[Final paper] [Presentation]
Full citation: Ritchie F. (2014) "Operationalising safe statistics: the case of linear regression", Working papers in Economics no. 1410, University of the West of England, Bristol. September
- User-focused threat identification for anonymised microdata
2015, working paper; with Hans-Peter Hafner, Rainer Lenz
This paper argues that anonymisation strategies for research datasets are seriously flawed: they over-emphasise the risks to the producer, undervalue the cost to the researcher, and are based upon theoretical worst-cases rather than a realistic assessment of the evidence. It illustrates the argument by showing how a change in perspective dramatically reduces the perturbation applied to the Community Innovation Survey Scientific Use Files
[Final paper] [Presentation]
Full citation: Hafner H.-P., Ritchie F. and Lenz R. (2015) "User-centred threat identification for anonymized microdata". Working papers in Economics no. 1503, University of the West of England, Bristol. March
- Principles- versus rules-based output statistical disclosure control in remote access environments
2015, working paper; with Mark Elliot
For fifty years SDC has been rules-based: telling researchers what to do. This paper discusses the increasingly popular 'principles-based' approach which provides better security and better data utility at a lower cost than traditional models. Originally drafted as a note for the Administrative Data Research Network. The UWE working paper is very similar to the published IQ paper. For implementation issues, see the more recent paper by Ritchie and Welpton
[Final paper] [Similar presentations]
Full citation: Ritchie F. and Elliot M. (2015) "Principles- versus rules-based output statistical disclosure control in remote access environments", IASSIST Quarterly v39 pp5-13
- Operationalising principles-based output SDC
2015, draft; with Richard Welpton
Guide to PBOSDC for output checkers - VERY EARLY DRAFT SO COMMENTS VERY WELCOME!
[Note that the McAfee scanner keeps suggesting there is a potential virus in this file, which I've checked repeatedly using the same McAfee scanner; it appears to be McAfee being over-enthusiastic about web docs. If you're downloading the version which is 190500 bytes and uploaded 12.17 on 10th August 2016, that's the checked file.]
[Further information]
Full citation: Ritchie F. and Welpton R. (2015) "Operationalising principles-based output SDC", mimeo
- Ensuring the confidentiality of statistical outputs from the ADRN. Technical Report
2017, published; with Philip Lowthian
Simple, low-jargon introduction to output statistical disclosure control, particularly the principles-based approach
[Final paper]
Full citation: Lowthian P. and Ritchie F. (2017) Ensuring the confidentiality of statistical outputs from the ADRN. Technical report no. 3. Administrative Data Research Network
- Analyzing the disclosure risk of regression coefficients
2019, published
This paper argues that disclosure risk from publishing regression coefficients is negligible in practical environments. It also points out that looking for deliberate falsification of regression outcomes to disclose data points is irrelevant for most environments, and the one where it matters (remote job systems) is better handled by non-statistical methods
[Final paper] [Further information]
Full citation: Ritchie, F. (2019). Analyzing the disclosure risk of regression coefficients. Transactions on data privacy, 12(2), 145-173
- User-focused threat identification for anonymised microdata
2019, published; with Hans-Peter Hafner, Rainer Lenz
The paper shows how a different perspective on the threat environment and the public benefit can lead to radically different outcomes for data anonymization
[Final paper]
Full citation: Ritchie, F., Hafner, H., & Lenz, R. (2019). User-focused threat identification for anonymised microdata. Statistical Journal of the IAOS, 35(4), 703-713. https://doi.org/10.3233/SJI-190506.