A science based on hacked data

How should researchers deal with hacked and leaked data? Is it ethical to use it, and if so, under what circumstances? Two ETH researchers address these questions in the journal Nature Machine Intelligence.

Data sets made available through hacks and leaks can be a unique and valuable source for scientific work. But using such data requires a clear ethical justification, write Marcelo Enca and Evi Faina, who work on bioethics and big data ethics at the two Swiss Federal Institutes of Technology (ETH) in Zurich and Lausanne.

With their paper, they want to spark a debate in academia on the topic, particularly in light of the current frequency and scale of data breaches. They critically examine the legal and ethical limits of using such data in research.

Hacked data flows into studies

In 2015, a group of hackers stole the data of millions of users of the dating portal “Ashley Madison”. The information was leaked to the public – and found its way into research. At least three studies on infidelity based on this data have been published in academic journals, Enca and Faina explain. WikiLeaks data sets have also served as a basis for research, for example to develop models for predicting conflict.

The researchers also show that the ethical dilemma of using data of illegal origin goes back a long way. In the early 1990s, for example, controversy erupted over whether it was morally permissible to use data from medical experiments carried out in the concentration camps of the National Socialist regime.

Legality alone is not enough

The authors of one Ashley Madison study noted that they “discussed the use of the data with several people, including lawyers, who emphasized that the data could be used for research purposes now that it was publicly available. It can be used for research purposes in the same way as for journalism.”

But even if the use is legal – or at least not penalized – the question of whether it is compatible with good research practice remains, according to the ETH researchers. After all, scientists have a social responsibility and a moral obligation. Failing to live up to them may cause indirect harm to individuals, lead to a loss of public trust in research, or jeopardize the reputation of a scientific institution.

Six suggestions

To counter this, the ETH researchers offer six suggestions for dealing with such data. Among other things, scientists must state transparently how they obtained the data. They must also show that the compromised data set is a unique source of information that could not be collected in any other way. In addition, they must demonstrate that the proposed research has high social value and that its benefits clearly outweigh its drawbacks. If the data allow data subjects to be identified, their express and informed consent must also be obtained.

Marcelo Enca and Evi Faina stress that research ethics and data ethics provide the analytical tools needed to guide the debate on how to deal with data of illegal origin, to build trust in science, and to safeguard research integrity.


