Quantifying Privacy – Calculating the Risk of Harm from Identity Theft

The Problem

How harmful would it be if some elements of personal information were disclosed online? Can we quantify the risk of harm? How much disclosed information is too much?

Project Overview

The proposition underlying this project is that privacy engineering requires a quantitative methodology for making the inevitable trade-offs between privacy and the benefits derived from disclosing personal information. This methodology will be based on a mathematical model relating information disclosure to the harm that may arise from that disclosure, in the context of cyber identity theft. It will follow the risk management approach advocated in existing proposals, in which the risk of harm is the product of two factors: the likelihood that disclosing specific elements of personal information causes harm, and the level of that harm.
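As a minimal sketch of the risk definition above, the following Python snippet computes risk as likelihood multiplied by harm level. The class name, the element names, and every numeric value are hypothetical placeholders chosen for illustration; they are not empirical data or part of the project's methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisclosureRisk:
    """Risk attached to disclosing one element of personal information."""
    element: str       # hypothetical information element
    likelihood: float  # assumed probability that disclosure leads to harm
    harm_level: float  # assumed magnitude of harm, in arbitrary units

    def risk(self) -> float:
        # Risk = likelihood of harm x level of harm
        return self.likelihood * self.harm_level


# Illustrative values only (assumptions, not measurements).
examples = [
    DisclosureRisk("date_of_birth", likelihood=0.02, harm_level=500.0),
    DisclosureRisk("bank_account_number", likelihood=0.10, harm_level=2000.0),
]

for item in examples:
    print(f"{item.element}: risk = {item.risk():.2f}")
```

A single product of this kind summarises each element with one likelihood and one harm level, which is the simplification the rest of the project sets out to relax.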

However, the approach adopted in this project differs from those proposals in two important respects. Firstly, the methodology will treat the likelihood of harm as a function of the level of harm so that, for example, it can account for a high likelihood of a low level of harm alongside a low likelihood of a very high level of harm. A statistical model of harm levels and their likelihoods will be constructed from empirical data. Secondly, it explicitly recognises that elements of personal information are rarely disclosed in isolation; they are most commonly disclosed in groups of two or more, with the composition of each group depending on the context. The methodology will therefore be based on a probabilistic model of disclosure, also constructed from empirical data, which will capture the statistical dependencies between the information elements.
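The two refinements can be illustrated with a small Python sketch. It assumes a discrete distribution over harm levels and a joint probability table over two information elements. All names and numbers are hypothetical; the project's actual models are to be built from empirical data and may take a quite different form.

```python
# Refinement 1: likelihood as a function of harm level.
# Hypothetical discrete distribution mapping harm level -> likelihood.
harm_distribution = {0.0: 0.90, 100.0: 0.08, 10000.0: 0.02}


def expected_harm(dist):
    """Sum of harm level x likelihood over all harm levels."""
    return sum(level * p for level, p in dist.items())


def prob_harm_at_least(dist, threshold):
    """Likelihood that harm reaches or exceeds the given level."""
    return sum(p for level, p in dist.items() if level >= threshold)


# Refinement 2: elements are disclosed together, not independently.
# Hypothetical joint probabilities over (name disclosed, birth date disclosed).
NAME, BIRTH_DATE = 0, 1
joint_disclosure = {
    (True, True): 0.30,
    (True, False): 0.40,
    (False, True): 0.05,
    (False, False): 0.25,
}


def marginal(joint, index):
    """Probability that the element at `index` is disclosed."""
    return sum(p for combo, p in joint.items() if combo[index])


def conditional(joint, target, given):
    """Probability that `target` is disclosed, given `given` is disclosed."""
    p_both = sum(p for combo, p in joint.items() if combo[target] and combo[given])
    return p_both / marginal(joint, given)


print("expected harm:", expected_harm(harm_distribution))
print("P(harm >= 10000):", prob_harm_at_least(harm_distribution, 10000.0))
print("P(birth date):", marginal(joint_disclosure, BIRTH_DATE))
print("P(birth date | name):", conditional(joint_disclosure, BIRTH_DATE, NAME))
```

In this toy table the birth date is more likely to be disclosed when the name is also disclosed than it is overall, which is the kind of statistical dependency an independence assumption would miss.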