Wednesday, 17 January 2018

Tracing Lineage II back to the 1650s

I am delighted to introduce Rick Neeley as author of this guest post. Rick recently tested his cousin and established that his "Gleason" line belongs to Lineage II - the North Tipperary Gleesons. He specifically belongs to Branch F and tests positive for all the SNP markers associated with that branch (BY14189, BY14193, BY14194, BY14195, BY14197 ... via the Z255 SNP Pack).

However, there are several interesting aspects of Rick's story that are worth mentioning:
  • Rick's "Gleeson" ancestors spelt their name a slightly different way, namely CLESSON. This demonstrates how surnames can change over time into quite different forms from the original spelling. It may be that all the Clessons in the US are related as a result of Rick's colonial ancestor. We need more Gleesons in the project to assess this.
  • Rick has traced his CLESSON line all the way back to the 1650s, making it the longest pedigree in Lineage II. This pedigree contains 10 generations, about double the size of the average pedigree in Lineage II. This just goes to show that there are extensive pedigrees out there and the DNA project will really benefit from finding more.
  • Rick's Gleeson ancestors mingled with the ancestors of those in Lineage I (the English Gleasons), thus providing a direct link between the two major groups within the project. This simply illustrates what we already know: it is a small world and we are all connected to each other, sometimes in the most amazing ways.

Here is Rick's cousin's direct male line pedigree ...
1. Matthew Clesson b. c1651 Ireland d. 1716 Deerfield MA, married Mary Phelps 1670 & Susannah Hodge 1701
2. Capt. Joseph Clesson b. 1674 Northampton MA, d. 1753 Lake George NY
3. Lt Matthew Clesson b. 1713 Deerfield MA, d. 1756 Deerfield MA
4. Joseph Clesson Sr b. 1756 Deerfield MA, d.1816 Deerfield MA
5. Joseph Clesson Jr b. 1791 Deerfield MA, d. aft.1850 Peoria IL
6. Jarvis S. Clesson b. 1820 Shelbourne MA, d. 1876 Shelbyville IL
7. George Frederick Clesson b. 1863 Beecher City IL, d. 1934 Oklahoma City OK
8. Willard Ray Clesson b. 1893 Matthewson OK
9. Cecil Elbert Clesson b. Pibroch, Alberta Canada, d. 1965 Olympia WA
10. GEC 687631

And below is Rick's fascinating account of his ancestry. Please feel free to leave any comments at the end of the blog post, particularly if you have any insights into how Rick's earliest ancestor might have left Ireland. Enjoy!
Maurice Gleeson
Jan 2018



Matthew Clesson (Gleeson), Irish Colonial Pioneer of Northampton, Massachusetts

INFORMATION GATHERED FROM: “Genealogical Dictionary of the First Settlers of New England;” “History of Northampton, Massachusetts;” “Irish Pedigrees, Volume 1;” “The Journal of the American Irish Historical Society, Volume 17;” “The History of Deerfield, Massachusetts;” “Family and Landscape: Deerfield Homelots from 1671;” “Soldiers in King Philip’s War 1675-1677;” “The Stebbins Genealogy,” 1904; and “Joseph Stebbins: A Pioneer at the Outbreak of the Revolution,” 1916.

Our family knew nothing about the origin of Matthew Clesson, born about 1651, other than that he was an Irish immigrant. I recently received results from Y-DNA testing of my first cousin showing that I belong to Lineage II, Branch F of the North Tipperary Gleeson family tree through my mother, whose maiden name was Clesson. The closest match within this group is Philip Gleeson, who traces his Gleeson ancestors back to North Tipperary, Ireland. My mother had spent several years researching her Clesson ancestry. We had no idea that the family name was originally Gleeson. I don’t know if the change in spelling was intentional or accidental, but every written record we have found here in the USA for my earliest Clesson ancestor, Matthew Clesson, spells the name fairly consistently as Clesson, with minor differences. What follows is somewhat unusual, since there were relatively few Irish immigrants in colonial America at this time. This is my lineage to Matthew Clesson, with some of the Clesson story that I know to date. There are lots of Josephs and some Matthews to keep track of in this story. The names in bold lettering are my direct ancestor grandfathers after Matthew Clesson.

Matthew Clesson (my seventh great-grandfather) born about 1651, was an Irish immigrant who probably first came to Northampton, Massachusetts, as a servant to one of the early planters. Nearly all the first emigrants in Northampton from Ireland were children or young persons who came over for the express purpose of being servants. [1]

The earliest record I have found in Northampton for Matthew Clesson is in 1664 when he was employed for the year as a “cow keeper or calf keeper” on the commons to “keepe the cowes...to have pay in wheate 3s 3d pr bush.” It was custom for cow keepers to be children and youths as they were required by law to busy themselves in some useful occupation. Sometime before 1665, Matthew was granted three acres of land as the other Irishmen “haue it granted theme not a horn lote.” In the years ahead, no other servant in Northampton accumulated as much land as Matthew. 

Matthew’s dwelling house was burned down during an Indian raid in King Philip’s War in 1675. He was one of twelve persons to whom land was granted inside the fortifications in compensation for his losses. He was quite prosperous and accumulated considerable property, owning at one time fifty-nine acres of land lying in twelve different parcels, all of which, with the exception of six acres, he purchased.



There are several questions that remain unanswered about Matthew Clesson. Who were his parents? How did he get to Northampton, Massachusetts? Was he an orphan? What family or master did he serve? If he was an indentured servant, he had a better financial footing than any other in Northampton. As early as 1667, a few years after being a “cow keeper,” Matthew bought the home lot with a house and barn that was later burned in 1675. 

Matthew also married well. There is a list of prominent Northampton property owners in the town records. On this list is Deacon Nathaniel Phelps, Sr., who arrived as a child with his Puritan family and father, William Phelps, in 1630, on the Mary and John, establishing with other Puritan families the Massachusetts Bay Colony. Nathaniel Phelps was one of the first settlers in Northampton and was elected their first Constable in 1656. [2] He was a founding member of their church, and owned a considerable amount of land as one of the original town members receiving land grants. Both Matthew Clesson and Nathaniel Phelps are on a list of Northampton townspeople as contributors to Harvard College for the year 1676. Matthew married Nathaniel Phelps’ daughter Mary in 1670, and in 1702, the same year that his father-in-law Nathaniel died, Matthew Clesson was given a home lot of four acres. It’s possible that Matthew was a servant for Nathaniel Phelps' family. This area had frequent Indian raids and records may not have been kept or may have been destroyed. If records exist which mention Matthew’s origin, they might be in court minute books, deed records, or an official document that might have recorded his name and previous residence or origin. They are probably archived and would require an onsite, page-by-page search.




Matthew Clesson seems to have been something of a man even though the town classed him with the “other Irishmen.” He was twice married and had a family of nine children with his first wife, Mary, several of whom became prominent citizens of Northampton, Deerfield, and other towns in the Connecticut River Valley. Matthew appears as a signer on a petition to the General Court by various inhabitants of Northampton on 4 November 1668: “Respecting the laying of Custome of Trybute vpon Corne or other provissions that are brought into the severall Portes within this Collony.” He signed an Oath of Allegiance on 8 February 1678. Matthew would have been at least 24 years of age to take the oath. He became a Freeman in 1690. 



In the “History of Northampton” book written by Trumbull there is an account of a Matthew Clesson who was an active participant in the military operations and conflicts between his neighbors and the Indians in late April 1709, during Queen Anne’s French and Indian War. I expect Matthew would have been in his late fifties at this time, and his son, Matthew, born in 1681 (my seventh great-uncle) would have been about twenty-eight years old. I don’t know which Matthew Clesson the story refers to. If it was his son, Matthew, a couple of months later on 27 June 1709, he died of his wounds during the fight in which his brother Joseph was captured. There will be more on this later. The story goes that Matthew was in a scouting expedition of fifteen men led by Captain Wright. They followed the Connecticut River to the White River and over the mountains to the French River. They made canoes and sailed down to Lake Champlain where they killed and scalped a party of Indians on the lake. On their return up the French River, they discovered another party of Indians with a captive from New England. They fired upon the party, killing several Indians. The captive swam for the shore and was seized and burned on the spot by the Indians. Four members of the Captain Wright expedition were killed, and one was wounded. After returning on 28 May 1709, those engaged in the expedition petitioned the court and were awarded twelve pounds to Captain Wright and six pounds to each man as a bounty for their eight Indian scalps. 

The Gleeson Lineage II pedigree is well represented at this time in early colonial Massachusetts by Matthew Clesson and his heirs. But the Gleason Lineage I pedigree is also represented. Thomas Gleason (Gleson), the son of Thomas and Anne (Armesby) Gleson, was born 3 September 1609, in Cockfield, Suffolk, England, and died in Cambridge, Massachusetts, about 1687. He married Susanna Page on 31 July 1634, in Cockfield. Susanna was baptized 4 December 1614, in Ingham, Suffolk and probably died in Boston, Massachusetts, 24 January 1691. Thomas and Susanna had several sons, including William, Phillip, and Isaac, who were born in Watertown and Cambridge, Massachusetts and fought or died in King Philip’s War. His son, Isaac, was in the Connecticut River Valley living in Enfield, Connecticut, and fought at the Battle of Turner’s Falls on 19 May 1676, about ten miles northeast of the Deerfield, Massachusetts, settlement. One can wonder if Isaac Gleason knew any of the Clessons. A possible connection is through Matthew’s youngest son, Samuel, who married Abigail Bushrod, whose father, Peter Bushrod, was also in the battle. Both Samuel Clesson and Isaac’s son, Isaac Gleason, were listed as descendants of soldiers in the “Falls Fight” and thereby claimants of land granted to the soldiers by an act of the Court in August 1741.



Matthew Clesson, born about 1651, made out a will in 1713 and is believed to have died 17 November 1716, in Deerfield, Massachusetts. He married 22 December 1670, Northampton, Massachusetts, Mary, daughter of Deacon Nathaniel Phelps and Elizabeth Copley. Mary died 15 April 1687, in Northampton, Massachusetts; some years later, Matthew married his second wife, Susanna Hodge/Hedge, on 21 November 1701. 

Matthew and Mary had nine children: 
  1. Mary, b. 13 Aug. 1672; d. 11 Dec. 1672 
  2. Thankful, b. 19 Sept. 1673; d. about 1761; m. (1) 1690 Joseph Mason, (2) 28 October 1695, Samuel Davis 
  3. JOSEPH, (my sixth great-grandfather) b. 23 April 1675; d. 4 June 1753; m. about 1704, Hannah Arms 
  4. Elizabeth, b. August 1677; d. 16 July 1709; m. 30 November 1698, John Hannum, Jr. 
  5. Mary, b. 20 November 1679; d. after 1750; m. 6 April 1701, Benjamin Bartlett
  6. William, b. 3 Jan. 1680; d. before 1709 
  7. Matthew, b. 31 December 1681; mortally wounded at Deerfield, Massachusetts, by Indians, 23 June 1709, and d. 27 June; he was engaged to be married to Sarah Mattoon, who shared his estate with his brothers and sisters by direction of the Probate Judge. 
  8. John, b. 1 April 1685; d. before 1709 
  9. Samuel, b. April 1687; d. 8 September 1767; m. 24 May 1716, Abigail Bushrod, daughter of Peter Bushrod and Elizabeth Hannum 

The Clessons came of sturdy stock and the sons of the Irish servant Matthew are mentioned very frequently in accounts of the border warfare with the French and Indians. 



Matthew’s son Joseph (my sixth great-grandfather) was a soldier in King William's War (1689-1697). At the age of fifteen, he was one of the American parties engaged in the "Pomeroy Pursuit" from the Deerfield garrison in 1688. He was a resident of Deerfield from 1705 to 1709 and of Northampton from 1712 to 1724. According to official accounts of Queen Anne’s War (1702-1713) and of the Indian massacres on the border, Joseph Clesson, while on a scouting patrol on 23 June 1709, was captured by a party of French and Indians commanded by de Rouville. He was taken to Canada but either escaped or was released. He was an active participant in "Father Rasle's War" 1721 to 1725. Father Rasle was a Jesuit priest and missionary to the Abenaki Indians. The English believed that Rasle was the mastermind who planned many Indian raids on their homes and settlements. Joseph is mentioned as a captain of the military forces at Deerfield in 1713. In 1730, he bought a home lot and house in Deerfield, Massachusetts. A rebuilt house on that lot is owned by Historic Deerfield, houses their Silver collection, and is known today as the Clesson House. 



Joseph’s younger brother, Matthew, (my seventh great-uncle) was also a known Indian fighter during Queen Anne’s War and took a prominent part in the battle in which his older brother was captured while on a scouting patrol 23 June 1709, which I discussed earlier. However, on 24 June 1709, the day after that battle, Matthew received a mortal wound while fighting a party of French and Indians in defence of the settlement. He died four days later. Matthew’s son, also named Matthew, was listed as a member of the settlement's military force and was a captain at this time. 

During the French and Indian War, Capt. Joseph Clesson commanded a company of Massachusetts soldiers and died from the rigors of military service on 4 June 1753; he was buried in the camp burial ground near Fort William Henry in New York. He was married to Hannah Arms and had 10 children by her. 



Capt. Joseph’s son, Matthew Clesson (my fifth great-grandfather), born in 1713, was also prominent in military affairs of the settlement. He was in the frontier service under Captain Kellogg at the age of nineteen. By 1747, Matthew was a lieutenant. On 4 August 1747, he led a scouting party from Fort Dummer towards Lake Champlain and Canada. He was sent there by Governor Shirley to watch the movement of the French and Indians who were reported to be forming an army for a raid. He led another scouting party in 1755 to Lake George. Worn out by the hardships of this expedition, he died on 4 or 24 October 1756. It has been recorded that his motto was "Kill them all! Nits will become lice".



Two of Lt. Matthew Clesson’s sons, Joseph, Sr., born 1756, (my fourth great-grandfather) and Matthew, born 1748, were patriots in the Revolutionary War. Joseph, Sr. fought at the Siege of Boston in 1776. The Deerfield Clesson House was left to Lt. Matthew Clesson, who left it to his three sons, of whom Joseph, Sr. was the last one to have it. By 1798, the house had deteriorated greatly. In 1814, the house was torn down, and what was intended to be the rear part of a new house was built on the lot at about the same time. Between 1830 and 1837, it was moved to its current location on the lot. His son, Joseph, Jr., born 1791, (my third great-grandfather) inherited the new house when his father died in 1816. He sold it to Eliphalet Dickinson in 1818, and from there it passed out of the family. 



The house in the picture above was lived in by the Clessons for only two or three years, but appears to have been built by them, or under their direction. 



The Clesson family has close ties with other Revolutionary War patriots. Joseph Clesson, Jr. (my third great-grandfather) married into the Stebbins family, a prominent English family in Deerfield, Massachusetts. He married Mehitable Stebbins on 10 November 1814, in Deerfield, MA. Mehitable’s father, Joseph Stebbins, Jr., and Joseph Clesson, Jr.’s father, the patriot Joseph Sr., knew each other in Deerfield and both fought in the battles during the Siege of Boston. Little did they know that their son and daughter would marry years later, in 1814 in Deerfield, making Mehitable a third great-grandmother and her father, Joseph Stebbins, Jr., a fourth great-grandfather of mine. 

Joseph Stebbins, Jr. served throughout the Revolutionary War. His rebel and military career began in 1773, as a leader of the “Sons of Liberty” in Deerfield, and he was the first to respond, as the lieutenant of the Minuteman Company that answered the Lexington Alarm on 20 April 1775, when a rider galloping through town called “to arms...Gage has fired on the people! Minute men to the rescue! Now is the time! Cambridge is the place!” 

In Cambridge, Joseph Stebbins was promoted to captain in the Continental Army by General Ward and placed under the command of Colonel Prescott. His officer’s commission was later signed by John Hancock and hangs in Deerfield Memorial Hall. His commission was issued in the same room and by the same body of men which had commissioned George Washington Commander-in-Chief eleven days earlier on 19 June 1775. Captain Stebbins was at the Battle of Bunker Hill 17 June 1775. He was one of the company commanders tasked the night before the battle to build the Redoubt and was “in the thick of the fight” defending the Redoubt the following day. He was later at the Battle of Saratoga in 1777, fighting at Stillwater and Bemis Heights and witnessed General Burgoyne’s surrender to General Horatio Gates. He was promoted to Lt. Colonel in 1781, and Colonel of the Militia in May 1788 to assist Governor John Hancock in Shays’s Rebellion. All three of his officer commissions were signed by John Hancock. 

Richard Alan Neeley
January 2018



Some comments from Maurice:

[1] Matthew Clesson would have been a child around the time of Oliver Cromwell's conquest of Ireland. I wonder if he was one of the many children and youths rounded up and sent to the New World as (un)indentured servants. If so, there may be Court Records which show what age he was assigned by the Court upon his arrival. This "age assignment" was necessary because many children did not know how old they were and the Court would decide. The age assigned determined how many more years of indentured labour the child / youth had to serve before being set free from his indentureship. [Reference: Without Indentures, Richard Hayes Phillips, 2013]

[2] Nathaniel's father William was from Crewkerne, Somerset and was born about 1593. You can read about him here. Coincidentally, there was a family of Phelps in North Tipperary in the mid-1600s, which has been discussed in a previous post ... The Phelps Connection. They originated from Tewkesbury, Gloucestershire in the 1620s.




Friday, 17 November 2017

FTDNA Holiday Sale until Dec 31 2017

FamilyTreeDNA have launched their Annual Holiday Sale. This runs from the last day of the Annual FTDNA Conference (Nov 12th 2017) until the end of the year. So now is the time to buy FTDNA tests and take advantage of some of their lowest prices ever. They also make perfect Birthday, Thanksgiving & Christmas gifts for friends and family.

2017 Holiday Sale Discounts

There are discounts on many of their products including upgrades on mtDNA and Y-DNA. The discounts represent approximately a 10-30% reduction from the usual price.


There is a special offer regarding the Big Y test. The usual price is $575 but there is a $100 discount in the sale. Further discounts are possible with the vouchers described below. But everyone who buys a Big Y test will automatically get a FREE upgrade to the Y-DNA-111 test. So if you have only tested your Y-DNA to the 37 marker level, buying the Big Y will get you a free upgrade to 111 markers (which would normally cost you $188).

Even if you haven't done a Y-DNA-37 test yet, you can order it at the Sale Price, and use a voucher for a further discount, and then once it has registered on the system, you can order the Big Y test and get the $100 Sale Price discount, and any additional voucher discount, and a free upgrade to 111 markers. This is a very good deal indeed!
So if you were very lucky, you could get the Y-DNA-37 for $109 (using a $20 voucher) plus the Big Y for $375 (using a $100 voucher) and the free upgrade to 111 markers. This would normally cost $169 + $575 + $188 = $932 but you would be getting it for $484. This is only about 52% of the price you would normally pay.
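
For anyone who wants to check the sums for their own combination of tests, here is a small sketch of the calculation. It uses only the prices quoted in this post; the variable names are made up for illustration, and it assumes you manage to get both vouchers, which is not guaranteed.

```python
# Prices in US dollars, as quoted in this post.
# Assumption: both the $20 and the $100 voucher are available to you.
list_prices = {"Y-DNA-37": 169, "Big Y": 575, "Y-DNA-111 upgrade": 188}
paid_prices = {"Y-DNA-37": 109, "Big Y": 375, "Y-DNA-111 upgrade": 0}

normal_total = sum(list_prices.values())   # what the three items usually cost
paid_total = sum(paid_prices.values())     # what you pay in the sale
fraction_paid = paid_total / normal_total  # share of the usual price

print(f"Normal cost: ${normal_total}")
print(f"Sale cost:   ${paid_total}")
print(f"You pay {fraction_paid:.0%} of the normal price")
```

Swap in the prices and vouchers that apply to you to see how your own deal compares.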



As mentioned above, you can use Holiday Reward vouchers to lower the sale prices even further. These will be issued every Monday until the end of the Sale but each voucher only lasts for 7 days so you have to use them quickly. In effect, this may reduce the cost of the Family Finder atDNA test to $49 and Y-DNA-37 to $109.

A $20 voucher for the Y-DNA-67 test

To access your voucher, simply log on to your FTDNA account and click on the Holiday Reward icon on your home page. If you make a purchase during the Sale, you frequently get a Bonus Reward as well. This gives further discounts on other tests.



And if you want to use the voucher for yourself, simply click on the Enjoy Rewards button and the product will be added to your Cart and the discount applied. Alternatively you can give the voucher to friends or family by clicking on the Share Rewards button. Each voucher can only be used once, and must be used before the weekly deadline.


A lot of people donate any vouchers they are not using so check the ISOGG Facebook group and Genetic Genealogy Ireland Facebook group for any unused vouchers that you might be able to take advantage of. Be warned, they go fast so you might have to try several before you find one that works.

Enjoy the Sale!

Maurice Gleeson
Nov 2017







Tuesday, 15 August 2017

Version 3 of the Mutation History Tree for Lineage II

Below is the updated version of the Mutation History Tree for Gleeson Lineage II (the North Tipperary Gleesons). Previous versions were published in December 2015 (version 1) and December 2016 (version 2). A pdf version of the tree can be downloaded from Dropbox via this link ... L2 MHT v3a

To see where you sit in the tree, find your G-number from the table at the bottom of this post (taken from our WFN Results page).


So what does it tell us?
  • The Gleeson Lineage II family tree currently has 11 major branches. And there are likely to be a lot more.
  • It looks like the Gleeson surname has been around for quite some time. The first branch to branch off was Branch F (far right). This is a pretty ancient branch and dates (very roughly) from about 1050 AD, not too far away from the presumed date of origin of the Gleeson surname.
  • There are probably some branches that have simply died out over the passage of time ... and what we are seeing here is simply a modern day snapshot of the remains of the "clan" that once was. In times past, some of the branches might have been much more prominent, and others much less prominent.
  • Age estimates of the branching points are very crude because the dating methodology has severe limitations. It is hoped that these can be improved with time.
  • Some branches are associated with a particular area or townland in North Tipperary (e.g. Branch C1 - Garryard; Branch E - Curraghneddy). It is hoped that as more people join the project and supply their MDKA information (particularly birth location) that more and more branches will be associated with specific locations. This in turn will help members with their individual genealogical research.


The Tree
The Pedigrees (and Key)
The (previously) Unique SNP markers
Click to enlarge ... or download the high-quality pdf version

The tree consists of several parts:
  • the tree itself, illustrating the branching pattern based on SNP & STR marker data
  • the pedigrees associated with each member in the tree (plus details of their MDKA / EKA)
  • a key to the tree, and numbered footnotes
  • the unique SNPs identified for those members who undertook the Big Y test

The tree has expanded considerably since the last version. The results of the tenth Big Y test are now included (from our Clan Gathering Chairman, Michael G. Gleeson). These came back from the lab in late December 2016 and underwent additional analysis by Alex Williamson for inclusion in the Gleeson portion of his Big Tree. These results confirmed the existence of Branch F (which had previously been merely predicted to exist on the basis of STR marker data). They also split up the "A5629 SNP Block" which up to that point consisted of 4 SNP markers. Thereafter it was split into an upstream branch characterised by the SNP A5631, and a downstream SNP block characterised by the 3 SNPs A5627, A5629, & A5630.

These results made A5631 the apparent over-arching Gleeson-specific SNP for Lineage II (i.e. only Gleesons have been discovered to share this particular SNP marker). Thus, A5631 could be the DNA marker that defines membership of the larger Gleeson "Clan".

Lineage II Gleesons on the Big Tree illustrating the old "A5629 SNP Block"
(from Nov 2016)
The current version of the Lineage II Gleeson portion of the Big Tree
showing how the previous A5629 SNP Block is now split in two (Aug 2017)

In addition to the 10th set of Big Y results, fifteen people expressed an interest in doing the new Z255 SNP Pack and the results of 13 of these people have now come back from the lab. This revised SNP Pack contains almost 50 SNP markers that are either shared only by Lineage II members or are unique to Lineage II members, and represents over 95% of all shared and unique Lineage II-specific SNPs (see this previous blog post). So the Pack is very specific for Lineage II. These SNP markers were identified via the 10 Big Y tests previously undertaken by our project members and were incorporated into the revised SNP Pack by the team at FTDNA.

A review of some preliminary results of these SNP Pack tests was discussed in a previous blog post. The updated results are included in a table at the bottom of this post.

The data from these 13 sets of new results have been added to the tree and as a result, the branching pattern has expanded considerably. The previous version of the tree consisted of 6 branches (known or predicted) but the new version contains 11 branches:
  • Branch A has been split in two (A1 & A2) and two new members added (see red G-numbers: G95 & G113).
  • Branch B has remained intact and has gained a new member (G107).
  • Branch E was previously thought to be more closely related to Branch B but the new SNP Pack results indicate that it is in fact more closely related to Branch C. Thus Branch E's attachment to the tree has been changed.
  • Branch C has been split into two (C1 & C2) - the latter has gained a new member (G89) thanks to the new SNP Pack results.
  • Branch D has split into two also (D1 & D2). This is not a big surprise as the anticipated common ancestor of the original 2 members of this branch was some 14 generations ago. This branch has also gained a new member (G106) due to the SNP Pack results.
  • Branch F has also remained intact and has gained a new member (G104), again due to the new SNP Pack results. This is an unusual branch and appears to be the oldest branch in the project so far. Its connection to the rest of the group is some 30-32 generations ago, approximately 1050 AD, taking it very far back in time, almost to the predicted origin of the Gleeson surname.
  • Branch G is a new branch within the tree. It consists of just two people and they are not particularly closely related to each other. Both tested with the new Z255 SNP Pack and only tested positive for the more upstream Lineage II SNP markers (A5631 & the A5627/29/30 SNP Block). This too is a relatively old branch and its connection to the rest of the tree is some 25 generations ago (about 1200 AD). 
  • Branch H is also a new branch and may be a similar age to Branch G (i.e. about 25 generations ago). However, the members of this branch have tested positive for marker BY5706 (which is one step further downstream than Branch G). None of the 4 members in this branch are particularly closely related, so I would expect this branch to split up into further sub-branches in due course.

Version 1 of this Mutation History Tree contained 16 of the project members of Lineage II, version 2 placed 20 of the 31 members (65%) on the tree, and version 3 is the most comprehensive to date and contains 32 of the 36 members currently in Lineage II (89%).  The remaining 4 members cannot be placed with reasonable accuracy and will require further testing to enable placement.

Altogether, of the 36 members in Lineage II, 23 (64%) have downstream SNP data available - 10 via the Big Y test, and 13 via the new Z255 SNP Pack. The SNP Pack proved to be a great success and an 89% placement rate is quite impressive. The placement rate increased from 65% to 89% as a result of the SNP Pack testing.
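
The percentages quoted above are easy to reproduce. The sketch below simply divides the member counts given in this post; the function and variable names are my own, chosen for illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole

# (members placed on the tree, members in Lineage II) for each version,
# using the counts quoted in this post.
placed = {"version 2": (20, 31), "version 3": (32, 36)}
for version, (on_tree, members) in placed.items():
    print(f"{version}: {on_tree} of {members} placed = {percent(on_tree, members):.0f}%")

# Members with downstream SNP data: 10 Big Y tests plus 13 SNP Pack tests.
snp_tested = 10 + 13
print(f"SNP data available: {snp_tested} of 36 = {percent(snp_tested, 36):.0f}%")
```

The same function can be reused as new members join and the counts change.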

Interestingly, some members were sufficiently closely related to other members of the group that SNP testing was not necessary. In some cases a definite relationship was already known, and in other cases the STR-based Genetic Distance was sufficiently close that placement was possible ... with reasonable confidence. The caveat here is that there may be a degree of Convergence obscuring the true relationship between certain members. And as a result, some people who have not undergone SNP-testing may need to be moved onto a different branch in the future. 

There were several questions that I had hoped the revised Z255 SNP Pack testing would answer:
  1. Are 10 Big Y tests enough to identify all/most of the downstream SNPs associated with Lineage II?
  2. How many future members are likely to be placed on the tree by just using the revised Z255 SNP Pack?
  3. Will there be a need for future Big Y testing within the group? or has the testing undertaken by group members so far helped reduce the cost for future members?

Now that the results of the SNP Pack testing are in, we can look at these questions one by one and see to what extent we have an answer for each.

The 10 Big Y tests certainly did identify a lot of the downstream branches of the tree, but not all of them. If we take "downstream" to mean (crudely) less than 18 generations ago (i.e. less than 600 years), then between the Big Y testing and the Z255 SNP Pack testing, six (6) downstream branches were identified (Branches A1, B, E, C1, C2, F). The remaining 5 branches did not have a "sufficiently downstream" SNP identified (Branches A2, D1, D2, G, H).
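
To give a feel for how "generations ago" translates into calendar years, here is a rough sketch. It is not the dating method used for the tree. It assumes an average of 32 years per generation, a figure I have picked only because it roughly agrees with the dates quoted in this post, and it counts back from 2017. Both values are assumptions you can change.

```python
def approx_year(generations_ago, years_per_generation=32, reference_year=2017):
    """Rough calendar year for a branching point.

    years_per_generation=32 and reference_year=2017 are illustrative
    assumptions, not values stated by the project.
    """
    return reference_year - generations_ago * years_per_generation

# Branching points mentioned in this post.
for label, generations in [("'downstream' threshold", 18),
                           ("Branches G and H", 25),
                           ("Branch F", 31)]:
    print(f"{label}: {generations} generations ago is roughly {approx_year(generations)} AD")
```

With these assumptions, 18 generations is a little under 600 years, and the results for 25 and 31 generations land within a few decades of the approximate dates given for Branches G, H and F. The estimates are very crude, as discussed elsewhere in this post.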

Also, the exercise identified new branches that were not predicted from the original Big Y testing. It is therefore likely that additional new branches will continue to be identified over time as more people join the project and undertake SNP testing.

So, although the SNP Pack testing did provide a lot of additional useful information, and has improved the structure of the Mutation History Tree, its coverage of "downstream SNPs" (using the arbitrary threshold of approximately 18 generations) is only about 50%. This fact alone indicates that there will be a need for Big Y testing in the future, but perhaps much more selectively (thus saving money for project members).

Now that the structure of the tree is quite developed, and as it continues to "mature", it will become easier and easier to place future members on a particular branch of the tree and in many instances will obviate the need for SNP testing. At this stage it is difficult to know how often this will happen.

For future members who are not easy to place, the options will be a 67 or 111 STR upgrade, the Z255 SNP Pack, or the Big Y test. In most instances, the SNP Pack might be the test of choice if the new member appears to be a possible match to one of the "downstream branches". But if it is not possible to place the new member anywhere on the existing tree, then the Big Y test might be preferred.

Accurately dating when each branch arose remains a problem and there are several reasons for this:
  1. In order for the dating to be accurate, the branching structure must be accurate. And for some people there is insufficient data to place them confidently on the tree. In such cases, it may be necessary to upgrade to 67 or 111 STR markers, or do the Z255 SNP Pack, or do the Big Y test.
  2. There is an inherent problem with any dating methodology used. Statistically, it may produce very accurate results. But from a genealogical perspective, the results are very inexact. Even at the 111 STR marker level, one can often expect to find a range of +/- 300 years on either side of the midpoint estimate. The same is true for dating using SNPs. 
  3. Dating using STRs appears to work best for people who are relatively closely related (say within the last 500 years) and dating using SNP markers may be more "exact" for people who are related 500-1000 years ago. Only further research will help clarify this.
  4. FTDNA's TiP tool uses proprietary information and its methodology is not public knowledge. As a result there is no way of checking the science behind it. It may be that its estimates are incorrect. Last year (2016) the algorithms were adjusted and new TMRCA estimates were generated for the same results. But there is no way of knowing if this was an improvement or not. I suspect that the TiP tool may underestimate the age of more distant (upstream) branching points because it does not accurately take into account the extent of parallel and back mutations.
  5. Dave Vance's SAPP programme uses Ken Nordtvedt's Interclade Ageing methodology. I don't know much about this method but it may be a better way of using STR data to estimate TMRCA. And as the SAPP Programme is automated, it takes a lot of the hard work out of the calculations. Potentially.
  6. Ultimately, dating the branching points will involve a mixture of the above techniques and the best that can be achieved may simply be a "best guess".

So the take home message is that all time points in the tree should only be taken as a very rough guide.

As the tree grows and expands, more and more people will be able to use it to help their own genealogical research. Already we are making connections and breaking down Brick Walls for members in Branches B and C1. 

More will follow in time.

Maurice Gleeson
Aug 2017



The members of Gleeson Lineage II (from the WFN Results page)
... find your G-number above and then locate yourself on the tree

Below is the revised spreadsheet of the results of the recent Z255 SNP Pack testing. The previous blog post only included 12 sets of results - the 13th set of results effectively split Branch D into two separate branches.

click to enlarge ...
or download a high-quality pdf version
via this Dropbox link here











Note that some SNP markers have more than one name (e.g. A5631 is also called Y17108). This confusing situation arises because different institutions give the same SNP different names. The best place to see which SNPs have alternative names is to go to the Gleeson portion of the tree on YFULL. Just search for A5631 (use Cmd+F on a Mac or Ctrl+F on a PC). Note that the YFULL tree does not have as many datapoints as the Big Tree or FTDNA's haplotree.







Thursday, 25 May 2017

Convergence - quantifying Parallel & Back Mutations (Part 1)

In a recent post I explored the concept of Convergence and made the point that the mechanism by which Convergence arises is via a combination of Parallel Mutations and Back Mutations in the STR marker values. These mutations are changes that occurred at some time in the past but because they remain hidden to us in the present, we cannot tell when they occurred or how frequently they occurred just by looking at two sets of STR results from people living today.

However, there is a way around this problem. Or at least a partial solution.

By using a combination of STR data and SNP data we have been able to build a Mutation History Tree for the North Tipperary Gleeson's (Lineage II of the Gleason DNA Project). This tree is a "best fit" tree, by which I mean a tree constructed in such a way as to explain the STR & SNP data in the most parsimonious way i.e. with the fewest number of branches that will accommodate or "fit" the data. This approach is also called the "maximum parsimony" approach and is often used when building cladograms or phylogenetic trees. The Mutation History Tree (MHT) is simply another type of cladogram.

But a key point here is that this "best fit" tree is likely to change as more data becomes available. And to illustrate this point, I'm going to compare the current version of the tree (Dec 2016) with the next version that is being prepared following the recent availability of new data from 12 sets of Z255 SNP Pack results.

Below is the current version of the MHT for Lineage II. By comparing each mutation in the tree with every other one, we can identify which mutations are Back Mutations (occurring on a single line of descent) and which are Parallel Mutations (occurring on two or more lines of descent). I have highlighted the Back Mutations in yellow and the Parallel Mutations in green.


Back Mutations in yellow, Parallel Mutations in green
from Gleeson Lineage II MHT (version Dec 2016)

Parallel Mutations occur in the following lines of descent:
  • CDYb 40-39 ... A, E, D, F (4 times)
  • CDYa 39-38 ... A, B, C, F (4 times)
  • 464c 17-16 ... A x2, D (3 times)
  • 461 12-11 ... A, B (2 times)
  • 576 18-19 ... A, D (2 times)
  • 390 23-24 ... A, B, C (3 times)
  • 390 24-23 ... B, C (2 times)
  • 456 16-15 ... B, D (2 times)
  • and so on ...
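For anyone who wants to repeat this tally on their own tree, here is a minimal Python sketch. It only covers the mutations itemised in the list above (the "and so on" entries are not included), and the dictionary layout and function name are illustrative assumptions rather than part of any project tool.

```python
# Mutation -> lines of descent on which it appears (transcribed from the list above).
lines_by_mutation = {
    "CDYb 40-39": ["A", "E", "D", "F"],
    "CDYa 39-38": ["A", "B", "C", "F"],
    "464c 17-16": ["A", "A", "D"],  # "A x2": two separate sub-lines within Branch A
    "461 12-11": ["A", "B"],
    "576 18-19": ["A", "D"],
    "390 23-24": ["A", "B", "C"],
    "390 24-23": ["B", "C"],
    "456 16-15": ["B", "D"],
}


def parallel_mutations(table):
    """Return {mutation: occurrences} for changes seen on two or more lines."""
    return {m: len(lines) for m, lines in table.items() if len(lines) >= 2}


counts = parallel_mutations(lines_by_mutation)
for mutation, n in counts.items():
    print(f"{mutation}: {n} times")
print("Occurrences among the itemised mutations:", sum(counts.values()))
```

A mutation seen on only one line would simply be filtered out, which is the whole of the "parallel" test in this sketch.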
Back Mutations are more difficult to count, and to conceptualise. Whether you consider the value as mutating forward or back is entirely dependent on your reference point. If our anchor is the upstream Z255 branch, then the original value of marker 390 (for example) is 24, mutating (forward) to 23 on the Z16438 branch, and then back to 24 (in parallel) on Branches A, B & C, and then back to 23 (again in parallel) on Branches B & C. So there are several points to make here:
  • this is in fact a Back Mutation that occurs in parallel in 3 separate lines of descent. It is thus both a Back Mutation (relative to its earlier value of 24 on the Z255 branch) and a Parallel Mutation, occurring at (presumably) different time points in Branches A, B & C. It is thus coloured yellow and green.
  • It can also be considered a Triple Mutation relative to the Z255 branch - in the sense that it mutates forward to 23 then back to 24, then back to 23 again. But what happens if it flips forward and back 5 times? What would we call that? And what do we call it if it goes two steps forward and one step back? This is where terminology fails us. I'm not sure if there is a standardised way of describing these different kinds of mutation (if there is, please leave a comment below).
  • the mutation 390 24-23 occurs in Branches B & C ... relative to its value of 24 in the Z255 branch, this could be considered a Parallel Forward Back Forward Mutation ... for Pete's Sake!!

But let us focus just on the Back Mutations that occur downstream of the branch characterised by the STR mutation (710 36-37), just above the A5627 SNP Block. This "710 branch" incorporates all the Gleeson's of Lineage II, from Branch A to F.* On this overarching branch for Lineage II, the value of the STR marker 390 is 23 and Back Mutations are as follows:
  • 390 24-23 ... B, C ... this is the only Back Mutation below the "710 branch"
  • And it is also a Parallel Mutation
  • All the other yellow Back Mutations are relative to the upstream Z255 branch, and not our downstream "710 branch", and so are not counted in this particular exercise.

So, let's generate some statistics from these numbers:
  • The total number of mutations below the "710 branch" (irrespective of whether they are forward or back) is 71.
  • There are 69 Forward Mutations (i.e. away from the original value of the relevant marker on the "710 branch")
  • There are 2 Back Mutations 
  • There are 26 Parallel Mutations
  • Forward Mutations outnumber Back Mutations by a ratio of 34.5 : 1 (69 to 2)
  • Parallel Mutations outnumber Back Mutations by a ratio of 13 : 1
  • There are 16 people in this tree, and if we make the big assumption that the "710 branch" starts 1000 years ago (i.e. roughly at the time of the introduction of the Gleeson surname), then over the course of 1000 years, the rate of each type of mutation is (crudely) as follows:
    • Forward Mutations = 69/16 = 4.3125 mutations per "line of descent" per 1000 years
    • Back Mutations = 2/16 = 0.125 mutations per "line of descent" per 1000 years
    • Parallel Mutations = 26/16 = 1.625 mutations per "line of descent" per 1000 years
These are crude estimates but they give some idea of the relative importance of Parallel Mutations compared to Back Mutations. And applying this information to the phenomenon of Convergence, it would seem that Back Mutations play a very minor role compared to Parallel Mutations.
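The arithmetic behind these figures can be reproduced with a few lines of Python. This is only a sketch: the counts are the ones quoted above, the 1000-year span is the same "big assumption", and the variable and function names are made up for illustration.

```python
# Counts quoted above for mutations below the "710 branch".
forward, back, parallel = 69, 2, 26
lines_of_descent = 16   # people in the tree
span_years = 1000       # assumed age of the "710 branch"


def per_line_rate(count, lines=lines_of_descent):
    """Crude rate: mutations per line of descent over the whole span."""
    return count / lines


rates = {
    "forward": per_line_rate(forward),
    "back": per_line_rate(back),
    "parallel": per_line_rate(parallel),
}
forward_to_back = forward / back
parallel_to_back = parallel / back

for name, rate in rates.items():
    print(f"{name}: {rate} per line of descent per {span_years} years")
print(f"forward : back = {forward_to_back} : 1")
print(f"parallel : back = {parallel_to_back} : 1")
```

Changing `span_years` or `lines_of_descent` shows how sensitive these crude rates are to the starting assumptions.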

In a subsequent post we will see how these calculations stand up when we add in additional data from 12 SNP Pack results and reconfigure the tree into the next version of the "best fit" model. And we will also attempt to quantify the total number of Back & Parallel Mutations below the upstream marker Z255.

Maurice Gleeson
May 2017

* the Big Y results of a 10th member of the group indicate that this branch is characterised by the SNP A5631 although this result is not reflected in this version of the MHT







Wednesday, 24 May 2017

Z255 SNP Pack Results - a first look

Last month (April), 14 members of the Gleason DNA Project underwent testing with the newly revised Z255 SNP Pack. FamilyTreeDNA were very quick to process the requests and the results of 12 of these members have already been returned from the lab.

Below are the top-line results of these first 12 tests. Only the most relevant SNP marker data has been extracted below and individual sections can be enlarged by clicking on the image. A pdf version of the complete data can be downloaded using this Dropbox link here.

To orientate you to the table, the SNP marker results of each of the 12 members are arranged in columns B to M. Each column has the initials, G-number, and kit number for each member. SNP markers highlighted in pink tested positive in that particular individual. The various branches of the Mutation History Tree for the North Tipperary Gleeson's (Lineage II) are indicated at the top and at the sides and are coloured in the same colours as in the current diagram of the tree (see below).





So what do the results tell us?

First off, working down the tree from marker Z255, all 12 members tested positive for the following markers: Z255, Z16437, Z16438, and BY2852. All but one person tested positive for BY2853 and BY2854.

Then we arrive at the Gleeson-specific markers, starting off with A5631 - everyone tested positive for this marker, as previously predicted from the results of the 10th Big Y test (from our Clan Gathering Chairman, Michael G Gleeson, 371202). These results came in after the tree diagram below was drawn and effectively confirmed that Branch F (predicted solely from STR data) did in fact exist. 

Below A5631, the Gleeson Tree splits into 2 branches. The first of these branches is Branch F and the first impressive results of the SNP Pack testing have revealed that 2 of the 12 members (G-104, G-97) belong to this branch. Their SNP data has helped define a new SNP block below A5631 consisting of the following SNPs (which up until now were "Private SNPs", present only in our Chairman, MGG-371202): 
  • BY14189
  • BY14193
  • BY14194
  • BY14195
  • BY14197

The other branch below A5631 is characterised by a SNP block consisting of 3 SNPs: A5627, A5629 & A5630. These 3 SNP markers are shared by Branches A through E in the Lineage II group. The remaining ten of the 12 members tested positive for these 3 SNPs, with one exception: member G-75, whose result came back negative for SNP A5627. This could be a back mutation in this SNP, and so FTDNA are retesting this single SNP marker to be sure.

Current version of the MHT for Gleeson Lineage II

Below the A5627 Block, there are two branches:
  • A5628, which in turn splits into Branch A and Branch B ... and possibly Branch E
  • BY5706, which in turn splits into Branch C and Branch D ... and possibly Branch E

From the above, you will see that the placement of Branch E on the current Gleeson Tree is in some doubt. This is because we have had no SNP data available for Branch E up until now and its placement in the tree has been based on STR data alone. Its ambivalent placement became apparent when we ran Dave Vance's SAPP programme using the STR data and this showed that it could equally be predicted to be nearest to Branch B as Branch C - this was discussed in a previous post (Dec 2016). However, a second major benefit of the SNP Pack testing is the revelation of the correct placement of Branch E. 
  • Member G-75 (MG, 371160) from Branch E tests positive for the SNPs BY5706, BY5707, BY5708 & BY5709. 
  • The first (BY5706) is common to both Branches C & D, and the latter three are SNPs that characterise Branch C. 
  • Thus Branch E is now identified as a sub-branch of Branch C.
And on the topic of Branch C, two of the 12 members (G-89 & G-22) tested positive for Branch C SNP markers (BY5707, BY5708 & BY5709). Not only that, but their results effectively split Branch C into 2 different sub-branches: C1, C2
  • C1 - characterised by the SNP A13116 and shared by members G-89 and G-66
  • C2 - characterised by the SNP Block A13110, A13112 & A13113, and shared by members G-22 & G-71


This new categorisation leaves member G-71 with the Private SNPs A13111 and FGC19590; and leaves member G-66 with the Private SNPs A13114 & A13115.

None of the 12 members tested positive for the SNPs characterising Branch A, Branch B or Branch D.  However ... 
  • 1 member tested positive for A5628 and therefore sits on a new branch adjacent to Branch A and Branch B
  • 4 members (G-81, G-108, G-77 & G-18) tested positive for BY5706 but no downstream markers - they therefore sit on a new branch (or branches) adjacent to Branch C and Branch D
  • 2 members (G-94 & G-110) only tested positive for the A5627 SNP Block and nothing further downstream. They therefore sit on a new, relatively upstream branch (or branches) on the Tree.
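The placement logic used throughout this post (find the most downstream branch for which a member tests positive) can be sketched in Python. The branch names and SNPs below are taken from this post, but the data structure, the function names and the two example members are illustrative assumptions, and only part of the tree is modelled.

```python
# Branch-defining SNPs and parent branches, as described in this post.
TREE = {
    "A5631": {"parent": None, "snps": {"A5631"}},
    "Branch F": {"parent": "A5631",
                 "snps": {"BY14189", "BY14193", "BY14194", "BY14195", "BY14197"}},
    "A5627 block": {"parent": "A5631", "snps": {"A5627", "A5629", "A5630"}},
    "A5628": {"parent": "A5627 block", "snps": {"A5628"}},
    "BY5706": {"parent": "A5627 block", "snps": {"BY5706"}},
    "Branch C": {"parent": "BY5706", "snps": {"BY5707", "BY5708", "BY5709"}},
    "C1": {"parent": "Branch C", "snps": {"A13116"}},
    "C2": {"parent": "Branch C", "snps": {"A13110", "A13112", "A13113"}},
}


def depth(branch):
    """Number of steps from this branch up to the top of the modelled tree."""
    steps = 0
    while TREE[branch]["parent"] is not None:
        branch = TREE[branch]["parent"]
        steps += 1
    return steps


def place(positive_snps):
    """Deepest branch whose defining SNPs are all positive (None if none fit)."""
    hits = [b for b, info in TREE.items() if info["snps"] <= positive_snps]
    return max(hits, key=depth) if hits else None


# Hypothetical members, modelled loosely on the results described above.
upstream_only = {"A5631", "A5627", "A5629", "A5630", "BY5706"}
branch_c_member = {"A5631", "A5629", "A5630", "BY5706", "BY5707", "BY5708", "BY5709"}

print(place(upstream_only))
print(place(branch_c_member))
```

Because each branch is checked on its own SNPs, a single unexpected negative further upstream (such as a possible back mutation in A5627) does not stop a member from being placed on a downstream branch.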

In the next post we will look at the revised Mutation History Tree for Gleeson Lineage II, incorporating these new SNP Pack results.

We'll also look at what this SNP Pack testing has told us about the nature of the evolution of the Gleeson surname and how it has helped individual members with their genealogical research.

Maurice Gleeson
May 2017





Saturday, 20 May 2017

Convergence - what is it?

There are several phenomena encountered in the analysis of Y-DNA STR data that can throw a genetic spanner in the works, and Convergence is one of them!

In genetic genealogy, Convergence occurs when two men have DNA signatures that are exactly or nearly identical, but have evolved that way purely by chance. As a result, the two men will show up in each other's list of matches and will give the false impression that they may be closely related (e.g. within the last several hundred years) when in fact they are much more distantly related (e.g. within the last several thousand years). The problem is we cannot tell that Convergence has occurred simply by looking at the two men's STR results. It is hidden from our view. We cannot see it just by looking at the present-day STR data. And the danger is that if the two men think they are closely related, they may start chasing their common connection, thinking that they will find the answer via further documentary research, when in fact there is little hope of that at all. Their "close match" is a red herring. And their pursuit of the Common Ancestor is a wild goose chase.

So what can we do about it? How can we recognise it? How can we avoid it wasting our precious research time?

Confusion

The concept is occasionally discussed in Facebook groups or on various blogs, but there tends to be quite a lot of confusion around what it actually means. And there are a variety of quite understandable reasons for this. 

Firstly, there isn't a standard definition for Convergence, so how it is used varies from person to person.  Some people apply it only to exact matches, others apply it to exact and close matches. Moreover, the concept of Convergence is closely tied up with the concept of lack of Divergence. Both are different phenomena, but their effects and consequences are very similar. Another contributing factor is the fact that it is difficult to see it or detect it in practice. We know that it exists, but we have no way of identifying it just by comparing two sets of STR results. In other words, it's largely a hidden phenomenon (like Black Holes). It is only when we do SNP testing that the extent of Convergence becomes apparent. And the problem is that not enough people have done SNP testing. 

The good news is that more and more people are doing SNP testing and as they do, the extent of Convergence becomes more apparent. The Lineage II members in the Gleason DNA Project are trailblazers in this regard and we will explore the results of the recent Z255 SNP Pack testing in subsequent blog posts.

But in this post, we will look at an example of Convergence from the Gleason DNA Project in order to illustrate some of the key characteristics and consequences of Convergence. In later posts, we will look at clues that may indicate that Convergence is present, attempt to quantify the number of Back Mutations & Parallel Mutations that occur over time (using the Mutation History Tree that we have previously constructed for Lineage II - the North Tipperary Gleeson's), and finally we will attempt to quantify Convergence itself.

But first of all, let's look at some of the aspects of the definition of the term.

Definition

A general definition for the term convergence from the Concise Oxford English Dictionary illustrates some general characteristics of convergence that are worth exploring because they are of relevance to how the term is applied in genetic genealogy and to the analysis of Y-DNA STR data in particular:
converge  1. come together from different directions so as eventually to meet
convergent  2. Biology (of unrelated animals and plants) showing a tendency to evolve superficially similar characteristics ...
There are several important aspects to these definitions that we can apply to the analysis of STR data (e.g. your 37 marker data). First of all, the sense that things were initially apart, but then they come together. Secondly, the idea that two things can look the same or similar on the surface, but in fact they have come from very different directions. And thirdly, the idea that two things can evolve from something different into something the same.

Let's look at how this more general concept can be applied to the analysis of Y-STR data.

And a good starting point is the description of Convergence on the ISOGG Wiki:
Convergence (also known as evolutionary convergence) is a term used in genetic genealogy to describe the process whereby two different genetic signatures (usually Y-STR-based haplotypes) have mutated over time to become identical or near identical resulting in an accidental or coincidental match.
One can think of convergence as producing misleading matches – two men appear to be more closely related than they actually are. The same situation may result (very occasionally) if there is an exceptional lack of divergence. In other words, so few mutations occurred in the descendants of a common ancestor over the course of time that the common ancestor may appear to have lived only a few hundred years ago when in fact he lived much further back than that, perhaps several thousand years ago.

So let's pick apart some of the key elements of this definition. You might like to refamiliarise yourself with some basic concepts, such as the different types of DNA markers (STRs and SNPs), and what you are actually seeing when you look at the DNA Results page.

Basic Concepts

Firstly, the above description of Convergence refers to the genetic signature - the Y-STR haplotype. This is the string of numbers you see associated with your results on the DNA Results page of the project. I like to think of it as if all the Y-chromosomes of the men in the group were stacked up on top of each other, in such a way that the individual markers along the chromosome were all aligned, with one column for each marker. Thus in the diagram below, each of the men has a value of 13 for the first marker. The values for the second marker are a mixture of 23 and 24. And so on.

The Y-STR results for the men of Lineage II
(click to enlarge)

Another key point in the above description is the concept that some markers mutate over time e.g. the number changes from 14 to 15. These mutations are identified by comparing the value in each square to the modal value for the entire group (i.e. the most frequent value among the men in that group). The most frequent values for each of the markers are used to generate the "modal haplotype" which is a virtual signature constructed from these most frequent values (and is represented by the row marked "MODE", the 3rd row from the top in the diagram above).

Mutations are indicated by coloured squares. If the value for any marker is the same as the modal value for that marker (i.e. the most common value among the men in that group), then the square that the value is in will not have a colour. If however, the value is higher than the norm, it will be coloured pink; if it is lower than the norm, it will be coloured purple.

If you and someone else have exactly the same string of numbers, you will have the same coloured squares and the same "no-colour" squares. If you are not exactly identical, you will have some coloured squares that the other person does not have ... and vice versa. In other words, the sequence of numbers, and hence colours, will be different. Each coloured square represents a mutation - a small minor increase or decrease in the number (compared to the norm) for that particular marker, in that particular individual.
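To make the idea of the modal haplotype and the colour-coding concrete, here is a small Python sketch. The four rows of values are invented for illustration (they are not real project data), and the function names are assumptions, not part of any FTDNA tool.

```python
from collections import Counter

# Hypothetical mini-table: four men, three markers (not real project data).
results = {
    "man_1": [13, 24, 14],
    "man_2": [13, 23, 14],
    "man_3": [13, 24, 15],
    "man_4": [13, 24, 14],
}


def modal_haplotype(rows):
    """Most frequent value in each marker column (ties go to the first value seen)."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*rows)]


def colour_row(row, mode):
    """Label each value relative to the mode, mirroring the colour scheme above."""
    return ["pink" if value > m else "purple" if value < m else "none"
            for value, m in zip(row, mode)]


mode = modal_haplotype(list(results.values()))
colours = {name: colour_row(row, mode) for name, row in results.items()}

print("MODE:", mode)
for name, row_colours in colours.items():
    print(name, row_colours)
```

Counting the coloured squares in a row gives the number of markers on which that man differs from the modal haplotype.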

Convergence in theory

Let's imagine that some distant ancestor living 10,000 years ago gave rise to four distinct lines of descent surviving today (represented by the men A, B, C, and D in the diagram below). Let's look at what happened to their first 37 STR markers over time, and let's assume that mutations only occurred in 5 of these STR markers, as shown in the diagram below. How did the values  change over the passage of time, from 10,000 years ago to the present day? And how many of the descendants of this ancestor "match" each other today?

In descendant A, only one of these 5 STR markers mutated. It underwent a single mutation (from 13 to 14) about 6000 years ago, and that was the only mutation over the span of 10,000 years. This is a rather extreme example of "lack of Divergence".

Descendant B had several mutations in his line of descent, but only affecting the first and the fifth markers. These show progressive "forward mutations" away from their original values. With the first marker, the mutations go forward in an upward direction (14,15,16,17) whilst with the fifth marker they go forward in a downward direction (15,14,13,12). This latter may seem counterintuitive but it serves to emphasise that "forward" means "away from" the original value, no matter if it is up numerically or down numerically.

Descendant C also has experienced mutations in only the first and fifth marker. But here we see two examples of a Back Mutation. The first marker shows a forward mutation 6000 years ago (13 becomes 12) but this has gone back to 13 by 4000 years ago. It then undergoes another forward mutation by the time of the present day (13 to 14). Similarly, the fifth marker undergoes a forward mutation (16 to 17) by 4000 years ago but a Back Mutation by 2000 years ago.

Descendant D undergoes mutations on all 5 of his STR markers. A Back Mutation occurs with the second marker between 2000 years ago and the present day (15 to 14); and likewise with the third marker (12 to 13); and likewise with the fifth marker (17 to 16). Two Back Mutations occur with the fourth marker (29 to 30 by 4000 years ago; and 31 to 30 by the present day).
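The forward/back distinction can be expressed as a simple rule: a step is a Back Mutation if it moves the value closer to the ancestral value, and a Forward Mutation otherwise. The Python sketch below applies that rule to three of the value histories described above; the function name and the rule for labelling steps are illustrative assumptions.

```python
def classify_steps(history):
    """Label each change in a marker's value history as forward or back.

    history[0] is the ancestral value. A step counts as "back" when it moves
    the value closer to the ancestral value, otherwise as "forward".
    """
    ancestral = history[0]
    labels = []
    for before, after in zip(history, history[1:]):
        if after == before:
            continue  # no mutation in this interval
        closer = abs(after - ancestral) < abs(before - ancestral)
        labels.append("back" if closer else "forward")
    return labels


# Value histories taken from the descriptions of Descendants B and C above.
b_fifth = classify_steps([16, 15, 14, 13, 12])
c_first = classify_steps([13, 12, 13, 14])
c_fifth = classify_steps([16, 17, 16])

print("B, fifth marker:", b_fifth)
print("C, first marker:", c_first)
print("C, fifth marker:", c_fifth)
```

Note that the labels depend entirely on which value is taken as the starting point, which is why the choice of reference branch matters so much when counting Back Mutations.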


Mutations over time in 4 distinct lines of descendants

Remember, these are four distinct lines of descent, with the MRCA (Most Recent Common Ancestor) represented by the first row of 5 STR markers in the diagram above. So now let's look to see if any of the mutations that occurred in these four individual lines of descent occurred in parallel i.e. the same mutational change occurred in two completely separate lines of descent.

Have a look at the first marker in A, B and C. All three men developed the same mutation on this marker - a change from a value of 13 to 14. In Lines A and B this change occurred in parallel around 6000 years ago. In Line C, the change occurred in parallel around about the present day.

There is a similar parallel mutation between Line C and D. Look at the fifth marker - it increases in value from 16 to 17 around about 6000 years ago in Line D and 4000 years ago in Line C.

And there is a parallel back mutation present in Lines C and D also - the fifth marker switches from 17 to 16 about 2000 years ago in Line C and around about the present day in Line D.

With Back Mutations you are only looking at a single line of descent. With Parallel Mutations we are comparing two or more lines of descent. And we will see that in practice Parallel Mutations are much more common than Back Mutations and have a much greater role to play in the development of Convergence.

The STR results of living people today tell us nothing about their evolutionary history

Which brings us to Convergence itself. Let's look at the Genetic Distance between each of these lines of descent. This helps to make the point that the DNA results from living people are only a snapshot in time. They do not tell us anything about how those STR values have evolved over the past 10,000 years:
  • A and B have a Genetic Distance (GD) of 7. This is made up of a 3-step difference on the first marker (14 vs 17) and a 4-step difference on the fifth marker (16 vs 12). And as these were the only changes on their first 37 markers, the GD would be written as 7/37. This exceeds FTDNA's threshold for declaring a match (i.e. 4 steps or less over the first 37 markers; written as 0-4/37) and so A and B would not appear in each other's list of matches.
  • A and C have a GD of zero. They are an exact match. Their GD for the first 37 markers is thus 0/37. They appear in each other's match list and the match looks really close. They think they have a common ancestor in the last few hundred years. They start comparing family trees, looking for the elusive ancestor. They will never find him. This is a wild goose chase. This is the consequence of Convergence.
  • A and D have a GD of 2 (or 2/37). This GD falls within the threshold for declaring a match. They both appear in the other's match list. They email each other, looking for the common ancestor - another wild goose chase. Another example of Convergence and its consequences.
  • B and C have a GD of 7/37. No match.
  • B and D have a GD of 9/37. No match.
  • C and D have a GD of 2/37. It's a match. It's Convergence. They don't know that. They spend months researching their connection. It's a wild goose chase.
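The Genetic Distances above can be checked with a short Python sketch that simply adds up the step differences marker by marker. Some values are assumptions: the present-day values for markers 2 to 4, and the first marker of Line D, are inferred from the descriptions and distances in the text rather than read from the diagram, and real FTDNA calculations treat some markers differently from this simple step count.

```python
from itertools import combinations

# Present-day values of the five markers that changed in the example above.
# Markers 2-4 and Line D's first marker are inferred from the text (assumption).
haplotypes = {
    "A": [14, 14, 13, 30, 16],
    "B": [17, 14, 13, 30, 12],
    "C": [14, 14, 13, 30, 16],
    "D": [12, 14, 13, 30, 16],
}
MATCH_THRESHOLD_37 = 4  # "4 steps or less over the first 37 markers"


def genetic_distance(a, b):
    """Sum of step differences, marker by marker."""
    return sum(abs(x - y) for x, y in zip(a, b))


distances = {
    (p, q): genetic_distance(haplotypes[p], haplotypes[q])
    for p, q in combinations(haplotypes, 2)
}

for (p, q), gd in distances.items():
    verdict = "match" if gd <= MATCH_THRESHOLD_37 else "no match"
    print(f"{p} vs {q}: GD {gd}/37 -> {verdict}")
```

Nothing in this calculation uses the history of the markers, which is exactly why a converged pair such as A and C is indistinguishable from a genuinely close match.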

The STR results of people living today tell us virtually nothing about how those STR marker values have evolved over time. They may have come from a relatively recent common source, or they may have come from widely differing directions.


Below is another way of conceptualising how the numerical value of a single STR marker might evolve over time. This marker started out with a value of 8 for the common ancestor of 4 distinct lines of descent. But by the time of the present day, two lines had a value of 9, one had a value of 13 and one had a value of 5. But the evolutionary history of these 4 lines of descent is peppered with Back Mutations and Parallel Mutations:

  • Back Mutations
    • Line 2 (red) - 14 becomes 13 some time between 1000 years ago and the present day (0)
    • Line 4 (purple) - 4 to 5 between 1000 and 0 years ago
    • Line 3 (green) - 5 to 6, 6 to 7, and 7 to 8 between 7000 (7K) and 4000 (4K) years ago
  • Parallel Mutations
    • 8 to 9 in Line 2 (10K to 9K), Line 1 (7K to 6K), and Line 3 (2K to 1K)
    • 8 to 7 in Line 3 (10K to 9K) and Line 4 (9K to 8K)
    • 7 to 6 in Line 3 (9K to 8K) and Line 4 (7K to 6K)
    • 6 to 5 in Line 3 (8K to 7K) and Line 4 (4K to 3K)


The evolution of values in a single STR marker over time in 4 descendant lines
of a common ancestor who lived some 10,000 years ago


The consequence of all these Parallel & Back Mutations is that the present day descendants of two of the lines (green Line 3 & blue Line 1) have exactly the same numerical value for this STR marker despite the fact that their evolutionary histories are so different.

This is an example of the evolutionary history for a single STR marker. And if this is representative of all STR markers, then the chances that the values for a particular marker will converge over time are really quite high. But our DNA results usually consist of 37 markers (the standard test most people start with) so what are the chances of the first 37 markers evolving in such a way as to result in convergence of a sufficient number of STR values to cause a coincidental match? ... well, the probability of that happening would be a lot lower. And the probability would be lower still with 67 markers, and lower still with 111 markers. But because so many people have tested (over 600,000 currently), we do see the phenomenon occurring even at higher marker levels (67 and 111).

And in a subsequent post we will look at clues to the presence of Convergence, so that you can look at your own or anyone's list of matches and adjust your suspicion level accordingly.

Convergence in practice

And to illustrate these points, I have temporarily moved one of the ungrouped project members into Lineage II, namely member Jim Treacy (B38804)*. He is third from the end in the diagram below. Don't worry about not being able to read the text (you can click to enlarge the diagram if you like) - just focus on the coloured squares. 

The Y-STR results for the men of Lineage II (with a Treacy third from the end)
(click to enlarge)

And Jim has no coloured squares for the first half of the markers. It is only when we reach the 19th marker in the row that he has a pink square with the value 16 inside it - everyone else in that column has a value of 15 for that marker, except for one person who has a value of 14. And as we continue along Jim's row, there are 4 other coloured squares, bringing the total to 5. This can be expressed as a Genetic Distance of 5/37 from the modal haplotype (i.e. the 3rd row from the top, which - to remind you - is a virtual signature constructed from the most frequent values for each of the markers).

Now a GD of 5/37 between two men would mean that they do not appear in each other's list of matches (because FTDNA have set the threshold for "declaring" a match to be 4/37 or less). But among Jim's list of matches at the 37 marker level, there are two members of Lineage II (with a GD of 4/37). And at the 67 marker level, Jim has 6 members of Lineage II among his matches (with a GD of 6 to 7/67). So it looks (on the surface) as though Jim is relatively closely related to our Lineage II group. And this suggests (on the surface) that there may be a common ancestor some time in the past several hundred years, maybe somewhere between 1700-1850 (on the basis of TMRCA calculations based on the TiP Report).

So what do we do next? Do we start looking for documentary evidence? Do we go back to the church records and land records and old newspapers to see if there is mention of a Gleeson-Treacy connection? 

We could do. But it would be a wild goose chase. Because the Treacy-Gleeson connection is a red herring. And we know this because we have done SNP testing.

Jim has done the Big Y test, as have 10 of the members of Lineage II. Both Jim and Lineage II members belong to Haplogroup R, and both share some SNP markers in common. Each marker characterises a branching point in the Tree of Mankind and a SNP Progression is a list of these SNP markers down to the finer "more downstream" branches of the Tree. Here are the SNP Progressions for Jim and for the Lineage II Gleeson's:
  • R-P312> Z290 > L21> DF13 > ZZ10 > Z255 > Z16437 > A557 > Z29008 > A10891
  • R-P312> Z290 > L21> DF13 > ZZ10 > Z255 > Z16437 > Z16438 > BY2852 > A5631

You can see that the branching points are exactly the same ... until marker Z16437. Thereafter, Jim goes down one branch and the Gleeson's go down another one. Now, let's be clear: the Gleeson's and Jim do share a common ancestor. And if he was around today he would test positive for the SNP marker Z16437. But his children would have evolved along different paths - one path taking us down to our present-day Jim Treacy, the other taking us down to our present-day Gleeson's. You can see where Jim and the Gleeson's are placed on the Tree of Mankind in the diagram below.


Gleeson's to the left, Treacy's to the right, & about 1500 years in between
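The comparison of the two SNP Progressions can also be done mechanically. The short Python sketch below takes the two progressions exactly as listed above, walks along them side by side, and reports the last SNP they share and the branch each takes afterwards. The function names are made up for this illustration.

```python
def parse_progression(text):
    """Split a progression such as 'R-P312> Z290 > L21' into SNP names."""
    return [snp.strip() for snp in text.split(">")]

def split_at_divergence(a, b):
    """Return (shared prefix, remainder of a, remainder of b)."""
    shared = []
    for snp_a, snp_b in zip(a, b):
        if snp_a != snp_b:
            break
        shared.append(snp_a)
    return shared, a[len(shared):], b[len(shared):]

# The two SNP Progressions as given in the text.
jim = parse_progression(
    "R-P312> Z290 > L21> DF13 > ZZ10 > Z255 > Z16437 > A557 > Z29008 > A10891"
)
gleeson = parse_progression(
    "R-P312> Z290 > L21> DF13 > ZZ10 > Z255 > Z16437 > Z16438 > BY2852 > A5631"
)

shared, jim_branch, gleeson_branch = split_at_divergence(jim, gleeson)
print("Last shared SNP:", shared[-1])
print("Jim's branch:   ", " > ".join(jim_branch))
print("Gleeson branch: ", " > ".join(gleeson_branch))
```

The last shared SNP is the branching point whose age tells us roughly when the common ancestor lived.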

And when did this common ancestor live? YFULL dates the formation of Z16437 to 1650 years ago. The two markers downstream of this, A557 (Jim Treacy) and A5631 (Gleeson), both have formation dates of 1400 years ago. So from this we can say that the common ancestor of Treacy & the Gleeson's lived somewhere between 1400 and 1650 years ago. Or to give it an actual date (by subtracting from 1950, the approximate birth year for members of Lineage II), sometime between 300 and 550 AD (1950 - 1650 = 300 and 1950 - 1400 = 550).

This is clearly a lot further back in time than the 1700-1850 AD estimate suggested by the STR data.

So this is a great example of Convergence. By chance, Jim's STR signature has evolved over time to approximate that of the Gleeson's of Lineage II and as a result, he looks a lot more closely related to the group than he actually is.

Maurice Gleeson
May 2017

* a big thank you to Jim for allowing me to use his name and his results in this example