Friday, March 29, 2013

We have raw results from Ancestry - now what?

There has been a lot of chatter around the web and on social media, including this blog, regarding the release of the raw data from Ancestry for our autosomal DNA (aka the AncestryDNA test). Some of it has not been very flattering regarding the "possible" comments supposedly made by some Ancestry officials at the RootsTech 2013 conference.

Roberta, on the DNAeXplained – Genetic Genealogy blog, posted a take that at first blush had me madder than a hornet. I was especially unhappy with one of the comments attributed to Kenny Freestone, the Ancestry product development manager, and I quote, "that their primary focus is to keep things simple for the newer users." This was in regard to providing advanced tools to help those of us who took the AncestryDNA test further refine the DNA matches we have received via their testing system.

In chasing this story down, I actually found someone who was present at the event where Mr. Freestone supposedly made the infamous comment mentioned above. Honestly, after reviewing the record, I think Roberta may have jumped the gun just a bit. I can find no evidence from anyone who attended those get-togethers that he actually made that comment.

CeCe Moore at the "Your Genetic Genealogist" blog has an extensive post on all this at
http://www.yourgeneticgenealogist.com/2013/03/ancestrydna-raw-data-and-rootstech.html.

So for now I will put this chromosome browser issue and Mr. Freestone to rest. But let it be known by all that I will be keeping at least one eye on you, Mr. Freestone. I just don't trust everything coming out of your shop. I have had issues in the past with some of the supposedly great ideas you and your software engineers have generated. I have been a paying Ancestry customer since December 2000, and I honestly just don't trust your software programming staff to do the smart thing all the time (your old search vs. new search templates/results, for example).

I won't rehash here everything that CeCe covered in her post. If you want the whole story, I encourage you to click on the link above and read all of it. I think you will find it very interesting. Below I will cover some of the more interesting things she mentioned that have an immediate impact on those of us who have spent our cash with Ancestry taking their autosomal DNA test.

CeCe mentioned that she had sent her raw DNA file to several 3rd party providers and received the following comments:

* "After working with it a bit, John Olson announced on the site that he expects that Gedmatch will be accepting AncestryDNA uploads in about two weeks."

You can view John's GEDmatch website at http://gedmatch.com/.

* "David Pike told me that he has updated his tools to work with the AncestryDNA files."

David Pike's DNA Comparison Utilities can be viewed at http://www.math.mun.ca/~dapike/FF23utils/

* "Leon Kull has reportedly updated his HIR search site to work with them as well."

Leon Kull's website is located at http://hiropractic.snpology.com/22/

* "Dr. Ann Turner has created an Excel macro to convert the AncestryDNA files to 23andMe format."

I'm still searching for this tool, so if anyone knows where this Excel macro can be found, please email me.
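In the meantime, for anyone curious what such a conversion might involve, here is a minimal Python sketch. To be clear, this is not Dr. Turner's macro, which I have not seen. It assumes the AncestryDNA file is tab-delimited with "#" comment lines, a header row, and five columns (rsid, chromosome, position, allele1, allele2), and that the 23andMe layout has four columns (rsid, chromosome, position, genotype) with the two alleles joined together. It also assumes "0" marks a no-call on the Ancestry side and "-" on the 23andMe side. The function name and all of those layout details are my assumptions, so check them against your own files. Chromosome labels are passed through unchanged.

```python
def ancestry_to_23andme_lines(lines):
    """Convert AncestryDNA-style rows to 23andMe-style rows (a sketch).

    Assumed input columns:  rsid, chromosome, position, allele1, allele2
    Assumed output columns: rsid, chromosome, position, genotype
    """
    out = ["# rsid\tchromosome\tposition\tgenotype"]
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and "#" comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip the column header row and any malformed rows.
        if fields[0].lower() == "rsid" or len(fields) != 5:
            continue
        rsid, chrom, pos, a1, a2 = fields
        # Assumption: "0" is a no-call in the input, "-" in the output.
        a1 = "-" if a1 == "0" else a1
        a2 = "-" if a2 == "0" else a2
        out.append("\t".join([rsid, chrom, pos, a1 + a2]))
    return out
```

To try it on a real download, you would read the file into a list of lines (for example with `open("my_ancestry_raw.txt").readlines()`, where the file name is hypothetical), pass the list to the function, and write the returned lines to a new text file.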

CeCe further wrote on her blog:

"At the "Ask the Expert" Genetic Genealogy panel that I moderated at RootsTech on Saturday:
* "Bennett Greenspan told the audience that Family Tree DNA will be accepting AncestryDNA transfers into Family Finder starting on May 1st. "

* "Dr. Catherine Ball confirmed that the raw data file is not phased and that they are delivering it as they receive it from the chip manufacturer Illumina. She also confirmed what Dr. Ann Turner had already discovered - the data labeled as "Chromosome 25" is from the PAR region. Further, the "Chromosome 23" label refers to the X chromosome data and "Chromosome 24" refers to the Y chromosome."
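If you want to see those numeric labels translated while poking around in your own raw data file, something like the following Python sketch might help. The 23/24/25 mapping comes straight from the panel notes above. Everything else is my assumption: the function names, and the idea that the file is tab-delimited with "#" comment lines, a header row starting with "rsid", and the chromosome in the second column.

```python
from collections import Counter

# Mapping as described at the RootsTech panel (per CeCe's notes).
CHROMOSOME_LABELS = {"23": "X", "24": "Y", "25": "PAR"}


def label_chromosome(code):
    """Return a readable label for a chromosome code; 1-22 pass through."""
    code = str(code).strip()
    return CHROMOSOME_LABELS.get(code, code)


def count_snps_by_chromosome(lines):
    """Count rows per chromosome label (assumed tab-delimited layout)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and "#" comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip the header row and rows without a chromosome column.
        if len(fields) < 2 or fields[0].lower() == "rsid":
            continue
        counts[label_chromosome(fields[1])] += 1
    return counts
```

Feeding it the lines of a raw data file gives a quick tally of how many SNPs sit under X, Y, PAR, and each numbered chromosome.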

Additional notes from CeCe:
* "Unlike Family Tree DNA, AncestryDNA is not removing any SNPs from the data - medically relevant or not. "

* "The overlap between AncestryDNA's raw data file and 23andMe's should be around 690,000 SNPs due to the fact that they are both using the same Illumina OmniExpress Plus base chip. The ~10,000 SNP difference can be accounted for due to a different set of poorly preforming probes and test SNPs. Family Tree DNA's should have a similar overlap for the same reasons."

* "There is no mitochondrial DNA included in the raw data file because it is not included on the Illumina chip that they are using. (23andMe adds the mtDNA SNPs)."
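If you have raw data from both companies and want to check the overlap figure for yourself, a rough Python sketch along these lines could do the counting. It assumes both files are tab-delimited, use "#" for comment lines, and put the rsid in the first column. Those layout details and the function names are my assumptions, not anything CeCe or Ancestry published.

```python
def read_rsids(lines):
    """Collect the rsid from the first column of each data row."""
    rsids = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and "#" comment lines.
        if not line or line.startswith("#"):
            continue
        first = line.split("\t")[0]
        # Skip a column header row if one is present.
        if first.lower() == "rsid":
            continue
        rsids.add(first)
    return rsids


def snp_overlap(lines_a, lines_b):
    """Return (shared, only_in_a, only_in_b) SNP counts for two files."""
    a = read_rsids(lines_a)
    b = read_rsids(lines_b)
    return len(a & b), len(a - b), len(b - a)
```

The first number returned is the count of SNPs the two files share, which is the figure to compare against the estimate quoted above.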

CeCe did hear from Ancestry that a search function is in the works but no firm date of availability has been announced. This search function will allow us to filter our list of matches by surname, location and username.

Ancestry is also working on improving the genetic ethnicity feature. CeCe mentioned that "a number of AncestryDNA personnel acknowledged to me over the weekend that certain "ethnicities" (i.e. - Scandinavian) are overestimated for many customers. However, they also emphasized that much of the perceived problem with their admixture analysis stems from the question of "where and when". What they mean by this is that it is very difficult (and sometimes impossible) to pinpoint where specific DNA signatures were at an exact time in history."

CeCe also mentioned, "The good news is that AncestryDNA customers don't have to wait for this update to gain more insight into their ancestral origins. Now that AncestryDNA has made the raw data available, customers will be able to upload their raw data file to the various third party sites to try out the admixture calculators and/or send it to Dr. McDonald for his very highly regarded analysis."

I hope to have more details on all of this in the very near future on this blog so please stay tuned.

--Larry aka The Chief