Removing the pump handle: Stewarding data at times of public health emergency

Written by Reema Patel on 20th May 2020

There is a saying, often misattributed to Mark Twain, that “History never repeats itself, but it rhymes”. Seeking to understand the implications of the current crisis for the effective use of data, I’ve drawn on the nineteenth-century cholera outbreak in London’s Soho to identify some “rhyming patterns” that might inform our approaches to data use and governance at this time of public health crisis.

Where better to begin than with the work of Victorian pioneer John Snow? In 1854, Snow’s use of a dot map to illustrate clusters of cholera cases around public water pumps, and of statistics to establish the connection between the quality of water sources and cholera outbreaks, led to a breakthrough in public health interventions — and, famously, the removal of the handle of a water pump in Broad Street.
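To make the idea of clustering cases around pumps concrete, here is a minimal sketch in Python. It assumes a simplified version of the approach: each case is assigned to its nearest pump by straight-line distance, and the cases are then counted per pump. The coordinates, pump names and function names are invented for illustration and are not Snow's historical data or his actual method.

```python
from collections import Counter
from math import hypot

# Hypothetical pump locations on an arbitrary grid (not historical coordinates).
PUMPS = {
    "Pump A": (0.0, 0.0),
    "Pump B": (4.0, 1.0),
    "Pump C": (1.0, 5.0),
}

# Hypothetical case locations, invented for illustration only.
CASES = [(0.2, 0.1), (0.5, -0.3), (-0.4, 0.6), (0.1, 0.9), (3.8, 1.2), (1.1, 4.7)]


def nearest_pump(case, pumps):
    """Return the name of the pump closest to a case (straight-line distance)."""
    return min(
        pumps,
        key=lambda name: hypot(case[0] - pumps[name][0], case[1] - pumps[name][1]),
    )


def cases_per_pump(cases, pumps):
    """Count how many cases fall nearest to each pump, including pumps with none."""
    counts = Counter({name: 0 for name in pumps})
    counts.update(nearest_pump(case, pumps) for case in cases)
    return dict(counts)


counts = cases_per_pump(CASES, PUMPS)
print(counts)
```

A count that is much higher for one pump than for its neighbours is a pattern worth investigating. It is not proof of cause on its own, which is why Snow paired the map with statistics on water sources.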

Data is vital

We owe a lot to Snow, especially now. His example teaches us that data has a central role to play in saving lives, and that the effective use of (and access to) data is critical for enabling timely responses to public health emergencies.

Take, for instance, transport app Citymapper’s rapid redeployment of its aggregated transport data. In the early days of the Covid-19 pandemic, this formed part of an analysis of compliance with social distancing restrictions across a range of European cities. There is also the US-based health weather map, which uses anonymised and aggregated data to visualise levels of fever, specifically those linked to influenza-like illnesses. This data helped model early indications of where, and how quickly, Covid-19 was spreading.

Another example of a real-time resource is the web-based dashboard established by Johns Hopkins University, which has very rapidly mapped the growth in Covid-19 cases, deaths and recoveries across the world. At a national level, there is Singapore’s Against Covid 19 dashboard, which traces how the outbreak is evolving over time, as well as its impacts in relation to age and gender. Many of these tools have been helping policymakers and national healthcare systems to predict and plan for demand, and to adapt their strategies as the pandemic unfolds.

Ethics and human rights still matter

As the current crisis evolves, many have expressed concern that the pandemic will be used to justify the rapid roll-out of surveillance technologies that do not meet ethical and human rights standards, and that this will be done in the name of the “public good”. Examples of these technologies include symptom- and contact-tracing applications. Privacy experts are also increasingly concerned that governments will collect more personal data than is necessary or proportionate to respond to the public health crisis.

Many ethical and human rights considerations (including those listed at the bottom of this piece) are at risk of being overlooked at this time of emergency, and governments would be wise not to press ahead regardless, ignoring legitimate concerns about rights and standards. Instead, policymakers should begin to address these concerns by asking how we can prepare (now and in future) to establish clear and trusted boundaries for the use of data (personal and non-personal) in such crises.

Democratic states in Europe and the US have not, in recent memory, prioritised infrastructures and systems for a crisis of this scale — and this has contributed to our current predicament. Contrast this with Singapore, which suffered outbreaks of SARS and H1N1, and channelled this experience into implementing pandemic preparedness measures.

We cannot undo the past, but we can begin planning and preparing constructively for the future, and that means strengthening global coordination and finding mechanisms to share learning internationally. Getting the right data infrastructure in place has a central role to play in addressing ethical and human rights concerns around the use of data.

The importance of trust

Returning to the 1850s, although John Snow had persuaded government officials to remove the handle of the water pump he’d linked to cholera cases in Soho, his own explanation of the cause of cholera outbreaks — that it was a water-borne disease — was rejected for months. The Board of Health issued a report that said, “We see no reason to adopt this belief” — prompting Snow to continue to gather data about cases of cholera, tracing them back to the pump. Scientific orthodoxy at the time preferred the “miasma” theory — that cholera was caused by the inhalation of vapours in the atmosphere — and it took considerable time for Snow’s hypothesis to be taken seriously. In the meantime, people were falling ill and dying.

This highlights another lesson of this famous story. Data taken in isolation is quite literally no panacea. There can be a discrepancy between what the data says we should do and what governments want to do: short-term economic and political pressures push against the evidence base, compounding a natural resistance to change. At its annual Pumphandle Lecture, the John Snow Society ceremonially removes and reattaches a pump handle to commemorate the medical world’s ongoing struggle against such forces.

In a climate where the legitimacy of good data, expertise and credible evidence has taken a hammering, it is important that we make the case for the central role of data in informing decision-making (especially in times of crisis). While data plays a role in responding to specific problems and challenges, in isolation it has its limitations if our institutions do not find ways of first sourcing and then gathering accurate data, deeply understanding the insights that emerge from it, and then swiftly reacting to the trends and patterns identified.

These problems are cultural. Making the best possible use of data means we need systems that are able to adapt, and quickly, as well as a willingness for the assumptions and models underlying our evidence base to be questioned and unpicked (which requires more transparency about the assumptions, not less). We also need trust in institutions at a time when levels of trust have declined, and the ability to pivot rapidly and depart from established traditions. If there is one thing this crisis will do, it is to force us to reimagine (yet again) the relationship between data, people and decision-making.

Where next?

While data can save lives at times of global public health crisis (and, for Covid-19, it is already helping to do so), it can only do this effectively if its use, management and governance are underpinned by clear rules (grounded in law, ethics and human rights) about how best to use data, and if there is trust in institutions to use data well. Rethinking the data governance ecosystem matters more now than ever.

Considerations for data sharing in public health emergencies

  • Purpose limitation: ensuring that the purposes of data collection technologies such as contact tracing are tightly constrained and specific to legitimate use for emergency scenarios in a public health crisis, and are not deployed for other unintended purposes such as surveillance, commercial, marketing, advertising, or other research purposes.
  • Necessity and proportionality: ensuring data access and sharing mechanisms are necessary and proportionate to achieve their intended goals, by answering questions such as “Is this necessary to achieve the intended goal?” or “Could something less intrusive/invasive have the same or a similar effect?”.
  • Transparency around data collection, and a clear deletion date: there needs to be clear indication of what data is collected, for what purpose, who has access to it and when it will be deleted.
  • Clear boundaries to data sharing: assurances that data access will be limited to what is necessary, and that data will not be shared with commercial or public bodies, such as enforcement and immigration agencies, for purposes outside of emergency use.
  • Clear boundaries to use and storage: assurances establishing that there are clear boundaries for use and storage of personal data, with clear accountability for the parties involved in data processing.
  • Anonymisation and data security: although anonymisation should not be seen as a silver bullet, it’s preferable to use anonymous (or aggregate) data sets to increase protection, and it’s necessary to employ strong information security (regardless of whether the data is anonymised or not). As the European Data Protection Supervisor reminded us in its statement on the European Commission’s plan to collect telecommunications data in the fight against Covid-19, “effective anonymisation requires more than simply removing obvious identifiers”.
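
As a rough illustration of the last point, the Python sketch below goes one step beyond dropping an obvious identifier. It groups records into coarse bands and withholds any group whose count is too small to publish safely. Small-count suppression is one common technique chosen here for illustration. It is not something this article or the European Data Protection Supervisor prescribes, and it does not guarantee anonymity by itself. The records, field names, 20-year age bands and the threshold of 3 are all hypothetical.

```python
from collections import Counter

# Hypothetical individual-level records, invented for illustration only.
RECORDS = [
    {"phone": "000-0001", "district": "North", "age": 34},
    {"phone": "000-0002", "district": "North", "age": 25},
    {"phone": "000-0003", "district": "North", "age": 38},
    {"phone": "000-0004", "district": "North", "age": 71},
    {"phone": "000-0005", "district": "South", "age": 45},
    {"phone": "000-0006", "district": "South", "age": 52},
    {"phone": "000-0007", "district": "South", "age": 41},
    {"phone": "000-0008", "district": "South", "age": 58},
]


def age_band(age, width=20):
    """Replace an exact age with a coarse band such as '20-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate(records, min_count=3):
    """Count records per (district, age band), ignoring the phone field.

    Groups smaller than min_count are reported as None (suppressed), because
    a very small group can still single out an individual.
    """
    counts = Counter((r["district"], age_band(r["age"])) for r in records)
    return {group: (n if n >= min_count else None) for group, n in counts.items()}


summary = aggregate(RECORDS)
print(summary)
```

In this invented data set, the one person aged 71 in the North district would be the only member of their group. Removing the phone number alone would still leave that person identifiable, so the sketch suppresses the group's count.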


A version of this article was originally published on the blog of the Ada Lovelace Institute. Edited and republished with permission.

About the author

Reema Patel is head of public engagement at the Ada Lovelace Institute.
