Blair Genealogy

Interpreting DNA Test Results

Below is a presentation I gave at the Guild of One-Name Studies DNA Seminar in Cheltenham, UK, on February 20, 2010. I hope others will find it useful.

Almost the first question a Project Administrator gets from a participant is "I've just got my test results, what do they mean?" Most who have run a DNA Project for any length of time will probably agree that interpreting DNA test results is not always easy or straightforward, and usually gets more difficult as the size of a project grows.

I don't pretend to be an expert on the subject, but I'll do my best to explain the method I use in my Blair DNA Project.

  

Overview

I'm going to discuss the limitations of DNA testing, some of the things that DNA testing can do, how DNA test results are used, and finally the methods I use to group participants.

  

A Tool

DNA testing has little value on its own, but when used with other conventional methods of tracing your ancestry, DNA becomes a valuable tool.

  

DNA Can Not

It's important for both Project Administrators and participants to understand the limitations of DNA testing. There are a number of things that DNA testing can NOT do.

It can not tell you who your ancestors are. A person's test results on their own have virtually no value. You can't plug your DNA results into some magic formula and find out who your ancestors were. You need to compare your results with the test results of others to determine if you may somehow be connected.

It can not tell two participants with matching (or near matching) test results who their common ancestor is. Even with an exact match the results won't tell you who your common ancestor is.

DNA testing can not even tell two participants with matching (or near matching) test results exactly how far back their common ancestor existed.

Finally, it can not by itself prove a connection that two participants suspect from their paper trails.

  

DNA Can

So what CAN DNA testing do?

If two participants closely match on their test results this is an indication that they share a common ancestor. Just how strong this connection is depends on the strength of the DNA match, which I'll discuss later in the presentation.

It can also give you a rough idea of how far back your common ancestor lived.  This also depends on the strength of the DNA match plus the paper trails. 

While DNA testing can not prove that suspected lines are connected, it can provide evidence to support that premise. How strong this evidence is depends on the paper trails and the strength of the DNA match.

Finally, DNA testing can prove that two individuals or suspected lines are NOT connected. No matter how good the paper trail may be, if there are too many DNA mutations between them, they can not be related.

  

Science vs Probability

Genetic Genealogy is something of a dichotomy. On one hand, we use the extremely accurate and precise science of DNA testing to get our test results; on the other hand, we then apply approximate mutation rates and probability to interpret those results.

  

Science vs Probability

One of the primary purposes of DNA testing is to determine if two participants share a common ancestor and hopefully, by using conventional research, determine who that common ancestor is.

The most common method to do this is to calculate the estimated Time to Most Recent Common Ancestor (TMRCA). In other words, estimate how many generations you have to go back to find the common ancestor of two participants.

To do this you have to use statistics and probabilities.

The actual calculations of the TMRCA are far too complex to discuss here, but they depend on knowing the number of mutations and the rate of mutation.

  

Number of Mutations

There are two common ways to count the number of mutations.

The first method, called the infinite allele model, counts any change in a marker as a single mutation. Each marker is scored as either a match or a non-match. If a marker does not match, the difference is assumed to be a single mutation, however large it is.

The second method considers the actual difference between the values of markers that do not match. These differences are then added to give the genetic distance.

In the example shown, XXX mismatches YYY on 3 markers, but two of those markers differ in value by 2 and the third by 1, giving a genetic distance of 2 + 2 + 1 = 5.
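The two ways of counting can be sketched in a few lines of Python. This is only an illustration: the marker names and values are hypothetical, chosen so that three markers mismatch and two of them differ by 2, and the function names are not taken from any existing tool.

```python
# Hypothetical haplotypes: marker name -> repeat count (values are made up).
xxx = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS388": 12}
yyy = {"DYS393": 13, "DYS390": 26, "DYS19": 15, "DYS391": 9, "DYS388": 12}


def count_mismatches(a, b):
    """Infinite allele counting: each non-matching marker is one mutation."""
    shared = a.keys() & b.keys()
    return sum(1 for marker in shared if a[marker] != b[marker])


def genetic_distance(a, b):
    """Sum the absolute differences in value over the markers both tested."""
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


print(count_mismatches(xxx, yyy))   # 3
print(genetic_distance(xxx, yyy))   # 5
```

Only markers present in both results are compared, so a 25 marker test can be compared against a 37 marker test on the 25 markers they share.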

  

Mutation Rates

The second factor in computing the Time to the Most Recent Common Ancestor is the mutation rate of the markers. The mutation rate has a marked effect on the calculation of TMRCA. Doubling the average mutation rate effectively cuts the TMRCA in half. For example, a 36 for 37 match with an average mutation rate of 0.002 produces a 50% chance of sharing a common ancestor within 12 generations. The same 36 for 37 match with an average mutation rate of 0.004 produces a 50% chance of sharing a common ancestor within 6 generations.
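The halving effect can be seen in a simplified model. The sketch below is not the FTDNATip calculation. It assumes infinite allele counting, a single average mutation rate per marker per generation, independent markers, and no prior knowledge of the genealogy. The function name and its parameters are hypothetical.

```python
from math import comb, exp


def prob_mrca_within(generations, markers, mismatches, rate):
    """Probability that the common ancestor lived within `generations`.

    Assumes a marker still matches after t generations with probability
    exp(-2 * rate * t), markers are independent, and every t is equally
    likely before the DNA results are considered.
    """
    q = exp(-2 * rate * generations)
    # Under these assumptions the result reduces to a binomial tail in q.
    still_matching = markers - mismatches
    return sum(
        comb(markers, j) * q**j * (1 - q) ** (markers - j)
        for j in range(still_matching)
    )


for rate in (0.002, 0.004):
    for gens in (6, 12):
        p = prob_mrca_within(gens, markers=37, mismatches=1, rate=rate)
        print(f"rate={rate}  within {gens} generations: {p:.0%}")
```

Under these assumptions a 36 for 37 match comes out at roughly 52% within 12 generations at a rate of 0.002, and at the same probability within 6 generations at a rate of 0.004, because only the product of rate and generations enters the formula.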

  

Mutation Rates

One of the most important things to remember is that mutations occur at random. There's no way to predict which marker will mutate or when a marker will mutate.

The mutation rates used to calculate the Time to Most Recent Common Ancestor are estimates based on past observations.

  

Mutation Rates

Despite the critical nature of mutation rates in calculating the TMRCA there is still no real consensus on either individual marker or average mutation rates. Various studies and companies have attempted to determine the mutation rates of individual markers as well as average mutation rates for groups of markers, but the rates seem to vary from study to study and from company to company. Even with all the DNA testing that's been done, given the random nature of mutations, there is still not enough data to reach agreement on mutation rates.

Family Tree DNA (FTDNA) has developed a program called FTDNATip that automatically calculates the TMRCA between two participants in a surname project. It uses individual marker mutation rates that FTDNA has developed. FTDNATip has been criticized by many for using mutation rates that are too high, thus producing results that are overly optimistic.

There are links to several sites with differing mutation rates at the bottom of this page.

  

FTDNATip Report

This is a reproduction of an actual FTDNATip report of two participants in the Blair DNA Project. There are two things you should notice:

First, the two participants mismatch on 2 markers but they have a genetic distance of 3, meaning that they mismatch on one of their markers by two and on the other by one.

Second, the probabilities are stated as WITHIN a certain number of generations, i.e. a 68.67% probability of sharing a common ancestor WITHIN 8 generations.

  

TMRCA

TMRCA is a very broad estimate based on uncertain mutation rates. The probabilities will vary depending on the mutation rates used.

It's imperative that you realize that the probabilities are WITHIN x number of generations. If the results state 89% within 12 generations it means there's an 89% probability that your common ancestor existed sometime between generation 1 and generation 12. It does not mean there is an 89% probability your common ancestor was 12 generations ago.
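The difference between "within 12 generations" and "12 generations ago" can be illustrated with a cumulative probability function. The one below is a hypothetical stand-in, not FTDNATip: it assumes infinite allele counting, one average mutation rate, and no genealogical information, and its default values are illustrative only.

```python
from math import comb, exp


def within(generations, markers=37, mismatches=1, rate=0.002):
    """Cumulative probability of a common ancestor within `generations`.

    Simplified model (an assumption of this sketch): each marker still
    matches after t generations with probability exp(-2 * rate * t).
    """
    q = exp(-2 * rate * generations)
    return sum(
        comb(markers, j) * q**j * (1 - q) ** (markers - j)
        for j in range(markers - mismatches)
    )


p_within_12 = within(12)
# Probability that the ancestor falls in the 12th generation alone is the
# increase in the cumulative figure from generation 11 to generation 12.
p_in_12th = within(12) - within(11)

print(f"within 12 generations:        {p_within_12:.1%}")
print(f"in the 12th generation alone: {p_in_12th:.1%}")
```

The cumulative figure adds up the chances for every generation from the first to the twelfth, so it is always much larger than the chance for any single generation.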

The TMRCA results are based solely on the number of mutations and the mutation rates. The calculation doesn't know your surname or the surname of the participant you're comparing your results to.

The TMRCA results do not take your genealogies into account either. The calculation does NOT know that your paper trails go back 3, 4, or 10 generations without a common ancestor.

All it knows is that you and your comparison have X mutations and you are using a mutation rate of Y.

  

TMRCA Alternative

For those who don't like dealing with statistics, FTDNA provides "Guides for Interpreting Genetic Distance within Surname Projects" which are descriptive rather than mathematical. There are guides for 12, 25, 37, and 67 marker tests. Links to these guides are provided at the bottom of this page.

  

Grouping Participants

Placing your participants into various groups based on their DNA test results is one of the most important things a Project Administrator can do. I strongly recommend that you start doing this as soon as matches start to appear. As your project grows it becomes much easier to add new matches to an existing group or start a new group.

I group participants primarily based on the strength of their DNA match, but I also consider their paper trail to a lesser degree.

The strength of the DNA match depends on 1) the number of markers tested, 2) the number of markers that match, and 3) the existence of rare or unusual marker values.

A 36 for 37 marker match with a rare value on one or more of the markers is a stronger match than a 24 for 25 marker match with common values on all the markers.

  

Grouping Participants

I normally only group participants who have tested at least 25 markers. I will include a participant with only 12 markers if he is an exact match and shares a known common ancestor with someone who is already included in a group.

As a rule of thumb I consider two participants to be a match if there is about a 50% chance they share a common ancestor within 12 generations. I use FTDNATip and normally consider 23 of 25, 33 of 37, and 61 of 67 a match close enough to include in a group.
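The rule of thumb above can be written down as a small check. This is a sketch only: the function name is hypothetical, and returning False for test sizes other than 12, 25, 37, and 67 is an assumption. Since these thresholds are what is "normally" considered, a borderline result would still call for judgment.

```python
# Minimum matching markers per test size, taken from the rule of thumb above.
MIN_MATCHING = {25: 23, 37: 33, 67: 61}


def close_enough_to_group(markers_tested, markers_matching,
                          shares_known_ancestor=False):
    """Return True if a comparison meets the rule-of-thumb thresholds."""
    if markers_tested == 12:
        # 12-marker results qualify only as an exact match backed by a
        # known common ancestor with someone already in the group.
        return markers_matching == 12 and shares_known_ancestor
    threshold = MIN_MATCHING.get(markers_tested)
    return threshold is not None and markers_matching >= threshold


print(close_enough_to_group(37, 33))   # True
print(close_enough_to_group(37, 32))   # False
print(close_enough_to_group(12, 12))   # False
print(close_enough_to_group(12, 12, shares_known_ancestor=True))   # True
```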

  

Marker Mismatches

As groups grow in size it becomes increasingly difficult to calculate all the marker mismatches between members of a group. Fortunately there is an online utility that will do all the number crunching for you. McGee's Y-DNA Comparison Utility allows you to copy data from a spreadsheet or other source, paste it into the program, and produce a chart like the one shown here. This particular chart has been somewhat enhanced but the McGee program gives you all the data in an almost identical format.

By creating this matrix you can see the exact number of mismatches between any two participants.

Note that in addition to the 7 actual participants, I've included a hypothetical Anc02.
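For readers who want to see what such a matrix involves, here is a minimal Python sketch. It is not McGee's utility and does not reproduce its output format. The participant labels and marker values are hypothetical, and every list is assumed to hold the same markers in the same order.

```python
from itertools import combinations

# Hypothetical participants; each list holds values for the same markers
# in the same order.
results = {
    "P1": [13, 24, 14, 11, 11, 14],
    "P2": [13, 24, 14, 10, 11, 14],
    "P3": [13, 25, 14, 11, 11, 15],
}


def mismatch_matrix(results):
    """Count mismatching markers for every pair of participants."""
    matrix = {name: {name: 0} for name in results}
    for a, b in combinations(results, 2):
        count = sum(x != y for x, y in zip(results[a], results[b]))
        matrix[a][b] = matrix[b][a] = count
    return matrix


matrix = mismatch_matrix(results)
names = list(results)
print("    " + "  ".join(names))
for row in names:
    print(row + "  " + "  ".join(f"{matrix[row][col]:>2}" for col in names))
```

A hypothetical ancestor such as Anc02 can be added to the dictionary like any other participant, so that every member is compared against it as well.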

  

Ancestral Haplotype

One of the things I do for each of my groups is create an Ancestral Haplotype for that group. He's known as Anc01 or Anc02, etc., and is the hypothetical "common ancestor" of the participants in the Group. Although it's impossible to know his actual DNA results, it is possible to deduce his most likely test results based on the results of his descendants. In its simplest form the ancestral haplotype is simply the most frequent marker values of the participants in the group. This example illustrates the ancestral haplotype of 4 fictitious participants. Note that although each participant mismatches the other participants on 2 markers, they all match the hypothetical ancestor on 24 of 25 markers.

As you add more participants to a group it is possible that the ancestral haplotype will change. If a group contains a large number of participants who share a known common ancestor with a distinct marker value you may have to make adjustments so you do not skew the haplotype.
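The simplest form of the ancestral haplotype, the most frequent value of each marker, can be sketched as follows. The four participants and all marker values are fictitious, and the function names are hypothetical. The sketch makes no adjustment for a large sub-group that would skew the result.

```python
from collections import Counter

# Four fictitious participants tested on 25 markers. Each starts from the
# same made-up values and carries one mutation on a different marker.
base = [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29, 17,
        9, 10, 11, 11, 25, 15, 19, 29, 15, 15, 17, 17]
participants = {}
for number, mutated_marker in enumerate([3, 8, 12, 20], start=1):
    values = list(base)
    values[mutated_marker] += 1
    participants[f"P{number}"] = values


def ancestral_haplotype(haplotypes):
    """Take the most frequent value of each marker across the group."""
    columns = zip(*haplotypes)
    return [Counter(column).most_common(1)[0][0] for column in columns]


def matches(a, b):
    """Count the markers on which two haplotypes have the same value."""
    return sum(x == y for x, y in zip(a, b))


anc = ancestral_haplotype(participants.values())
for name, values in participants.items():
    print(name, "matches the ancestral haplotype on", matches(anc, values), "of 25")
```

Any two of these participants match each other on only 23 of 25 markers, yet each matches the deduced ancestor on 24 of 25. When two values are tied for most frequent, `most_common` returns the one seen first, so a tie is a case to resolve by hand rather than by this code.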

  

Unusual Marker Values

Sharing a rare value on one or more of your markers can be a strong indication that participants share a common ancestor, provided the rest of their DNA results support that conclusion. It can be especially valuable in the case of borderline groupings.

In Group 3 of the Blair DNA Project we have 17 participants with a value of 26 on DYS#390, which occurs only about 1% of the time. Of these same participants, 14 also have values of 12/14 on DYS#385a/b, which occurs less than 4% of the time.

Several websites have developed frequency distributions for the various marker values. I've included the addresses of these sites at the bottom of the page. I used the Sorenson Molecular Genealogy Foundation website.

  

Conventional Research

One of the major reasons for DNA testing is to either support or refute conventional research, so using conventional research to place someone in a DNA group may seem illogical. Conventional research should ONLY be used as a tie breaker.

No matter how good the paper trails may be, if the DNA results don't match, I won't put the participants in the same group.

But what if the DNA results are inconclusive or borderline? Then I look at the conventional research. Do the participants claim to share a common ancestor? If so, how far back is this common ancestor? How complete are their paper trails?  Are there any inconsistencies in their paper trails that would make them suspect? Whether I include them in the same group depends on the answers to all of these questions.

Sometimes instead of asking "What is the probability that these two participants ARE related?" it's better to ask the question "What is the probability that these two participants are NOT related?"

  

Conclusions

 

DNA References for Interpreting DNA Test Results

McGee's Y-DNA Comparison Utility - http://www.mymcgee.com/tools/yutility.html

Sorenson Molecular Genealogy Foundation (SMGF) - Y-Chromosome Database - http://www.smgf.org/pages/ydatabase.jspx

Marker Mutation Rates

WorldFamilies.net Marker & Mutation Comparison - http://www.worldfamilies.net/marker

Leo Little - Mutation Rate Effects - http://freepages.genealogy.rootsweb.ancestry.com/~geneticgenealogy/ratestuff.htm

Wikipedia - List of DYS markers - http://en.wikipedia.org/wiki/List_of_DYS_markers

Marker-to-DYS Conversion Chart with Mutation Rates - http://micbarnette.bravepages.com/dys_conversion_chart.html

TMRCA Calculators

Clan Donald USA TMRCA Calculator - http://dna-project.clan-donald-usa.org/tmrca.htm

TMRCA Calculator - http://www.dnacalculator.org/tmrcaCalculator.php

Moses Walker TMRCA Calculator - http://www.moseswalker.com/mrca/calculator.asp?q=2

FTDNA Interpreting Genetic Distance within Surname Projects

12 Markers - http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=12

25 Markers - http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=25

37 Markers - http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=37

67 Markers - http://www.familytreedna.com/genetic-distance-markers.aspx?testtype=67

Frequency Distribution of Marker Values

Sorenson Molecular Genealogy Foundation (SMGF) - Y-Chromosome Marker Details - http://www.smgf.org/ychromosome/marker_details.jspx

Y-Base Statistics - http://www.ybase.org/statistics.asp

Leo Little data from FTDNA data and Y-search - http://freepages.genealogy.rootsweb.ancestry.com/~geneticgenealogy/yfreq.htm

This WebPage was last updated 01/17/2013
