Thursday, April 7, 2016

Mining DNA data with GEDmatch

My readers may have wondered where I have been for the last six months. I have been analyzing my autosomal DNA using GEDmatch.

Image: “DNA” by Виталий Смолыгин

The best definition of GEDmatch I have found is in a PDF document, “Using GEDmatch,” that Kitty Cooper highlighted in a blog post:

“GEDmatch is a FREE, non-profit, “do-it-yourself” genomics website that allows DNA testers to upload raw data from FTDNA, AncestryDNA, and 23andMe to compare with a large database of data that has been voluntarily uploaded by other testers.”

After learning more about GEDmatch from my cousin Sallie Atkins, I decided to try it out. For anyone who learns best by listening and seeing, I recommend watching Angie Bush’s video “GEDMatch Basics” before even opening GEDmatch.

Image by 112.Georgia (own work), CC BY-SA 3.0, via Wikimedia Commons

To upload your data, follow the instructions (shown in the video) for the company that tested your autosomal DNA. Heed the caution given on several websites and blogs about this process: GEDmatch is run by volunteers and can be so inundated by users that the program seems to freeze at times. Just be patient (and grateful for the wonderful tools this site offers) and try again. One more caveat on uploading: if another person in your household is on the internet while you are uploading, the upload may fail.

After your data is on GEDmatch, you can begin using the tools to identify the people who match you. Aside from the technical aspects of getting your data on GEDmatch, something very important to your success in connecting with your “matches” often gets overlooked. After you run the “one to many” query and see all those potential matches, what do you do?


Let’s look at how to communicate with our matches. Rachel Ramey talks about just this subject in her post “A Few Things I’ve Learned as a Beginner at GEDmatch.” Among her tips: include your kit number in your message and explain why you are contacting the person. Also, highlight any surnames that you want your match to consider.
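For readers who like to keep a reusable template, here is a minimal sketch in Python of a message that covers those three tips (kit number, reason for contact, surnames). The function name, wording, kit number, and surnames are all made up for illustration; adapt them to your own situation.

```python
def build_message(kit_number, reason, surnames):
    """Assemble a short first-contact message for a DNA match.

    All wording here is a hypothetical template, not an official format.
    """
    surname_list = ", ".join(surnames)
    return (
        "Hello,\n\n"
        f"My GEDmatch kit number is {kit_number}. "
        f"I am writing because {reason}.\n"
        f"Surnames I am researching: {surname_list}.\n\n"
        "Do any of these names appear in your tree?\n"
    )


# Example values are invented for this sketch.
message = build_message(
    kit_number="A000001",
    reason="we appear as matches in the one-to-many results",
    surnames=["Smith", "Jones"],
)
print(message)
```

Keeping the template in one place makes it easy to send consistent messages and to change the wording once for everyone.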

You will be surprised when you see just how many matches GEDmatch delivers in the one-to-many tool. I liked Ms. Ramey’s suggestion that you save GEDmatch tables into Excel or another spreadsheet tool of your choice.

Image: 2007 Nuno Pinheiro, David Vignoni, David Miller, Johann Ollivier Lapeyre, Kenneth Wimer and Riccardo Iaconelli / KDE / LGPL 3

You can do so much with the data columns using the “Sort” feature in Excel. Be sure to use the advanced sort where you can choose primary and secondary columns on which to sort. Depending on the sort, you can see different patterns in your data.
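If you prefer a script to a spreadsheet, the same primary-and-secondary sort can be sketched in a few lines of Python. This is only an illustration: the column names (kit, name, total_cm, largest_cm) and the sample rows are invented, and your saved table will likely have different headings.

```python
import csv
import io

# Hypothetical sample data; column names and values are made up for illustration.
SAMPLE_CSV = """kit,name,total_cm,largest_cm
A000001,Match One,45.2,12.1
A000002,Match Two,102.7,30.4
A000003,Match Three,45.2,20.3
"""


def sort_matches(rows, primary, secondary):
    """Sort rows from largest to smallest on the primary column,
    using the secondary column to break ties."""
    return sorted(
        rows,
        key=lambda row: (float(row[primary]), float(row[secondary])),
        reverse=True,
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
sorted_rows = sort_matches(rows, "total_cm", "largest_cm")

for row in sorted_rows:
    print(row["kit"], row["name"], row["total_cm"], row["largest_cm"])
```

Here the two matches that share the same total are ordered by their largest segment, which is the kind of pattern a secondary sort column brings out.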

Well, now we have our data uploaded to GEDmatch, we have developed a template message to our matches, and we have some spreadsheets where we can organize our data in different views. What do we do when we start receiving e-mail responses from the matches? That’s where Excel again comes into play. And again, Kitty Cooper comes to our rescue. In her post of Jan 17, 2014, Ms. Cooper offers a guest blog post by Jim Bartlett, “Organizing Your Autosomal Information with a Spreadsheet” (actually with two spreadsheets).

As you might have concluded from the topics covered in this post, understanding genetic genealogy is not a walk in the park. The field demands a lot of study and concentrated effort if you wish to harvest the rich information from DNA testing, including getting the most out of the list of matches you receive. But in my experience, nothing in my genealogy research has given me the volume of information that DNA testing has. It’s just there waiting for me (and you) to analyze and massage it into a usable format.

A great advantage for anyone interested in exploring how to use DNA is the large number of online resources. The basic go-to site for information is the International Society of Genetic Genealogy (ISOGG). In addition to the professionals I mentioned in this post, be sure to check out Roberta Estes’ blog, DNAeXplained, for her amazing ability to explain esoteric subjects. Emily Aulicino is another person who is so adept at decoding technical information in her blog, Genealem's Genetic Genealogy, that non-scientists can understand the concepts.

I wrote this blog post for those who have been hesitant to try their hand at incorporating autosomal DNA into their genealogy research and for those who have tried but got bogged down because they didn’t know about the resources out there to help them.

categories: DNA