Wednesday 14 December 2016

L2 MHT - "difficult to place" people

When we put together the revised Mutation History Tree for Lineage II (L2 MHT), there were 9 people who could not be confidently allocated to sub-branches of the tree. We are going to use two techniques to attempt to place them on the MHT and the first of these is Dave Vance's SAPP programme. We will explore Robert Casey's methodology in a subsequent post.

The SAPP programme is like a turbo-charged version of Fluxus especially designed for genetic genealogists. Fluxus is the software programme I used to help generate the first version of the MHT last year. It is a programme that uses STR data to generate a phylogenetic tree (a.k.a. cladogram or phylogram or Mutation History Tree). The SAPP programme also generates a phylogenetic tree based on STR data, but it has some additional features that make it way superior and far more elegant and user-friendly: 
  • the output is more like a family tree, and less like assembly instructions for Swedish furniture (it's an oldie but a goodie)
  • unlike Fluxus, it incorporates SNP data so that the upper branches of the tree can be anchored effectively 
  • it recognises similar STR signatures and takes these into account when grouping people 
  • it recognises people with known genealogical relationships and groups them together
These features make the SAPP programme a great time-saver and an excellent way of double-checking your work if you have created your MHT manually, as I have. It takes a lot of trial and error (40 minutes in my case) to get the data input "just right" but once you have done it correctly, the output is impressive.

Below is the SAPP Tree output from the SAPP programme for L2 (with some of my own graphic additions) and below it (for comparison) the output from my manually created L2 MHT. You can download higher quality pdf versions of these files from Dropbox by clicking on the captions below each individual graph. And that's the first point - the detail in the images is not easy to see. There is a lot of information concentrated in a small space and that makes reading it very challenging. We have the same problem when trying to navigate through our family tree - it will never all fit on the same page. It is best to click on the Dropbox link (the caption below each graph) so you can view it in a separate browser window, or download the file and open it in a separate programme on your computer.

This SAPP Tree includes all 29 members of Lineage II. The 9 previously ungrouped individuals are indicated by a dashed red border around the relevant boxes. Most of them are sitting away from the SNP-confirmed branches of the tree - the exceptions are G77 under Branch A and G99 & G81 under Branch D.

What you can just about make out without enlarging the image is the colour-coded branches in each version of the tree. There is good concordance between the two trees with regard to Branch A & Branch B (both SNP-confirmed branches) - both have the same membership, the same (or complementary) STR mutations listed, and similar placement on the larger tree in relation to each other.

However, although Branch C (also SNP-confirmed) & Branch E have the same membership in both trees, & the same (or comparable) STR mutations listed, their placement in the SAPP Tree is different - SAPP says Branch E could be genetically closer to Branch C than Branch B. Further SNP testing will be needed to determine this.

Branch F is split in two in the SAPP Tree, with members G97 & G98 sitting quite distantly away from G65. Which tree has the correct placement can only be decided by additional SNP testing. The split in Branch F has also caused a split in Branch D - this is not too surprising as the two members on this branch (G05 & G68) are very distantly related and I am sure this particular branch will split into several smaller branches in due course as more SNP testing is undertaken. In addition, Branch E has also split Branch D and has been placed very differently compared to the other tree.

The SAPP Tree for Gleeson L2
(click image to enlarge, click caption to download pdf)
My L2 MHT for Gleeson Lineage II
(click image to enlarge, click caption to download pdf)

The SAPP Tree raises some important considerations for the Gleeson L2 MHT:
  • It revealed a few data omissions on my part (so it was a good way of verifying my data)
  • It generated Genetic Distance tables which I found very useful (see below). The maximum Genetic Distance (GD) between any two members in L2 was 12/37, 12/67, and 18/111 … and Adjusted GDs (taking into account Back & Parallel Mutations) were a staggering 30/37, 29/67, & 29/111
  • In Branch A, the SAPP programme identified a "more parsimonious" configuration of the branch, as a result of which I have slightly modified my version of Branch A (in other words, it identified a better configuration that made better sense of the data - see diagram below). The revised L2 MHT is available from Dropbox here.

The Old & New Versions of Branch A
  • In addition, all 9 people who could not be placed on the tree previously have now been allocated (provisionally) to specific sub-branches. This allows us to see to whom they are (potentially) most closely related. However, the confidence with which these 9 members have been placed on the SAPP Tree is relatively low (compared to the other members, who have either been SNP-tested or have relatively unique Y-STR Signatures and/or supportive Genetic Distance data). Thus their positions on the SAPP Tree have to be taken with a grain of salt. To confirm whether or not these 9 members have been accurately placed on the tree will require additional SNP testing. 
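
For readers curious how a Genetic Distance table like the one mentioned above is put together, here is a minimal Python sketch. It is not how the SAPP programme itself works: it uses the simple step-wise count (add up the differences in repeat numbers, marker by marker), it does not attempt the Adjusted GD for Back & Parallel Mutations, and testing companies and SAPP may count some markers differently. The function names and the marker values for P1, P2 and P3 are invented for illustration and are not taken from the L2 data.

```python
from itertools import combinations


def genetic_distance(a, b):
    """Simple step-wise GD: sum of absolute differences in repeat
    counts over the markers that both people have been tested on."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def gd_table(haplotypes):
    """Return {(person1, person2): GD} for every pair of people."""
    return {
        (p, q): genetic_distance(haplotypes[p], haplotypes[q])
        for p, q in combinations(sorted(haplotypes), 2)
    }


# Invented STR values for three hypothetical people (not real L2 members).
haplotypes = {
    "P1": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    "P2": {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10},
    "P3": {"DYS393": 12, "DYS390": 24, "DYS19": 15, "DYS391": 11},
}

table = gd_table(haplotypes)

# The pair with the largest GD, i.e. the "maximum GD" within the group.
furthest_pair = max(table, key=table.get)

for pair, gd in table.items():
    print(pair, gd)
print("Maximum GD:", furthest_pair, table[furthest_pair])
```

With real data you would have 37, 67 or 111 markers per person rather than four, but the table is built the same way: one GD for each pair of people, and the maximum GD is simply the largest entry.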

I have been in discussions with Thomas Krahn from YSEQ and have negotiated a specially-priced SNP Testing Strategy for the Gleesons of Lineage II. That will be outlined in detail in the next post.

Maurice Gleeson
Dec 2016


Outputs of the SAPP Programme for Gleeson Lineage II





3 comments:

  1. Maurice -
    Thank you for all you are doing!

    Merry Christmas
    Mary and Jim Petty

  2. I enjoy reading your studies but confess that I am occasionally lost. Okay, more than occasionally. I followed the link to the SAPP program. I couldn't tell how to create a file for upload to the site. Can you give guidance?

    1. Hi Jane, it is not an easy process. The best way is to use the sample file that Dave Vance supplies here (https://drive.google.com/file/d/0B1oWf7A5py4ASVNVbFpCUF9Ibm8/view) and adapt it for use with your own data. If you are having trouble, drop Dave an email (davevance01@gmail.com) - he is very helpful.
      Best, Maurice
