Sharing my genotype

On October 1, I uploaded a file containing my genotype to two public websites.

Since I’m not a geneticist, the raw file doesn’t mean very much to me. It’s a 23andMe file full of As, Cs, Ts, and Gs. Sitting on my computer, it is basically of no use. It just takes up space.

I uploaded it both to prove a professional point – I’m asking people to share, and that means I should be sharing my own data as well, right? – and out of basic curiosity. I was in the Newark, NJ airport when I downloaded the file and re-uploaded it, first to the Sage Bionetworks Synapse system and then to the OpenSNP wiki.

Synapse is where I am putting the data that I want syndicated to computational researchers in a rigorous way. Before I could put my file there, I had to go through the PLC informed consent process and enroll in the clinically approved PLC-CGR study here. Synapse is where I will deposit more and more data over time.

It’s also a good example of the difference between a clinically valid environment like Synapse and a wiki. To get at my genotype inside Synapse, you have to go through a username and password process and sign a set of terms in which you agree not to try to re-identify me or to harm me, and in general to act like a nice person. It’s a low bar to clear, but it’s still a bar. On OpenSNP, anyone can get my genotype.

But I was curious about uploading it somewhere with no control, because my experience is often that removing control creates more good things than bad things. Within a few hours of my uploading to OpenSNP, admins at another site, SNPedia, had (at my request) made a copy on their own wiki. And then some neat things happened. I got a Promethease report annotating my genotype in a different way than 23andMe had done – not as complete, but with some startlingly different and new claims about hypertension that make a ton of sense in the context of my family history.

I also got the most curious email from a genetic genealogist in the UK who analyzes every open SNP file he can find for genealogy purposes. It contained the soothing sentence “there is no suggestion of consanguinity in your pedigree” – which gave me a laugh, as I’m from Tennessee.

It was quite surprising how much I got back, and how quickly. This all took less than 72 hours from download of my genomic information to receipt of notice that my parents aren’t inbred, and it was just from one simple file moving about just a bit.

I know that the outcomes could have been horrible. I could have discovered I had a rare genetic disease. I could have discovered my parents were cousins. And some no doubt will have this experience.

But my gut is that people will learn this no matter what. Cheap sequencing and automated annotation software essentially guarantee that at some point, someone will sequence you – hospital? jail? doctor? nightclub? The possibilities are endless. The question is, will the individual citizen have the rights and powers to choose where the information goes, and why? Or will that power rest with the collector, in the name of “protecting” the citizen from uncomfortable knowledge?

It’s a big choice we have to make, whether or not to let people own their own data, to let them share it, to allow people to make bad decisions and learn scary things through technology.

It’s not an easy choice, but it is at least a simple choice. Either we’re in charge of our own data, or we’re not.

John Wilbanks is the Chief Commons Officer at Sage Bionetworks and a Senior Fellow in Entrepreneurship at the Ewing Marion Kauffman Foundation. He runs the Consent to Research Project.

3 Comments

  1. Harry McIntosh says:

    I think you’ve hit on something that could be VERY useful. Who knows what discoveries could come out of making more genetic data available to researchers. (That’s the great thing about research–you never know what you’ll run into.)

    As useful as sharing genetic data would be, I have to wonder if combining that with health history data would be even more useful. My genetic data may (or may not) indicate that I have an increased chance of having high cholesterol, but knowing exactly what my cholesterol level IS could make my genetic data much more useful. Take that a step further, and know how much my cholesterol went down on drug X, and you’re really making progress.

    The problem with sharing health history data is that there aren’t (as far as I know) good mechanisms for doing it. Even if I find a website (like WeConsent.us) where I can upload my health data for researchers to use, it won’t help unless I have an electronic copy of the data and it’s in a form that can be used. My impression is that both of these things (the ability for patients to get an electronic copy of their records, and having those records in a form researchers can use) are generally still missing (at least in the U.S.). I hope that will change.

  2. joyclee says:

    Brilliant article. Thanks for taking the first step. This is an amazing and innovative experiment that could be more powerful than the most generously funded study in traditional research! I love it!

  3. Graham King says:

    The consent form at [1] is broken. It always says ‘Invalid field value for field “today”.’

    [1] https://plc-cgr.weconsent.us/legalconsent/www/wizard/step9.action