
Reach into our bodies, grab the genotype, reach into the medical system and we grab our records, and we use it to build something together

Imagine the discoveries that could result from a giant pool of freely available health and genomic data. John Wilbanks is working to build it.

human experiments

Performing a medical or genomic experiment on a human requires informed consent and careful boundaries around privacy. But what if the data that results, once scrubbed of identifying marks, was released into the wild? John Wilbanks thinks through the ethical and procedural steps to create an open, massive, mine-able database of data about health and genomics from many sources. One step: the Portable Legal Consent for Common Genomics Research (PLC-CGR), an experimental bioethics protocol that would allow any test subject to say, "Yes, once this experiment is over, you can use my data, anonymously, to answer any other questions you can think of." Compiling piles of test results in one place, Wilbanks suggests, would turn genetic info into big data, giving researchers the potential to spot patterns that simply aren't viewable up close.

A campaigner for the wide adoption of data sharing in science, Wilbanks is also a Senior Fellow with the Kauffman Foundation and a Research Fellow at Lybba, and is supported by Sage Bionetworks.

In February 2013, the US government responded to a We the People petition spearheaded by Wilbanks and signed by 65,000 people, and announced a plan to open up taxpayer-funded research data and make it available for free.

The goal of Consent to Research is to play a part in the transformation of health from something we experience passively to something we experience actively. Our health is more than the medical visits we make, or the genomes we carry inside us, or our environments. We can now measure an enormous set of data about an individual, and if there were an enormous pool of publicly available health data, it would be far easier to apply powerful modeling techniques and begin to develop new kinds of hypotheses about the connections between our health, our DNA, and our choices.

Our strategy to achieve this goal is to create a massive pool of openly available, user-contributed data about health and disease. Right now data about people’s health is expensive and complex to access, and the sample sizes are tiny.

The problem is that right now, it’s not easy to donate your data to health research.

Thus, our tactics are to create standard, free tools for managing data donation and personal data, promote the use of open standards, and embed our tools into clinical studies. These tools help people not only gather data about themselves and their health, but also get that data into the hands of data-driven research scientists. Everything we do is free.

To help potential data donors, we make systems that let data donation happen: informed consent processes, institutional review board protocols, tools to extract data from the systems where it normally lives, and a software system that can receive data and get it to health researchers. 



I can't tell if 65,000 signatures on a petition is enough to make progress.

With the government shutdown, I can't see whether that number has gone up since Feb 2013. However, I don't think that taxpayer-funded research is going to lead this. I think the innovation will come from private bodies and individual researchers taking this on.

Here is an example:

The only issue that bothers me is how the data gets used: if my genotype shows a predisposition to a condition, my insurance premiums could increase or a mortgage company could decline to fund a 30-year term. As commercial organizations, they have every right to do this, but where does it stop? Here's a gender example, but it could equally be applied to genotypes:

"The value of using group membership in judging unobserved characteristics is uncontroversial most of the time, and so is hardly noticed. For example, automobile insurance companies consider young unmarried males as a relevant group in determining driver insurance premiums because they tend to have more car accidents than older males or young women. These higher insurance rates also help cut down the number of auto accidents by reducing driving by accident-prone young males. Yet given that group membership is almost always an imperfect predictor of unobserved characteristics, some individuals will be treated much worse (or better) than their true characteristics justifies. In the driving case, young unmarried males who are careful and responsible drivers will pay more for insurance than they would in a world with better information. They might be discouraged from driving because they suffer from the bad driving of other young unmarried males."

This is going to be an interesting debate, and one very pertinent to the ethics of Big Data (or any data). Given the slow pace of legislative reform and the fast pace at which data is being collected, analyzed and used, I can see a gap where all sorts of things will happen.
