Wednesday, 7 December 2016

Building the Mutation History Tree - Placement

In the previous post we looked at how we can group people together within Lineage II to form sub-branches. Once we have our sub-branches, the next step is to place each of them on the larger Tree of Mankind.

The starting point is the Modal Haplotype (MH) of the Z255 subclade. I obtained this from two sources - the R-L21 Haplogroup Project and Nigel McCarthy's Group E. The two agree on every marker except CDYb, which has a value of 39 in Nigel's version and a value of 40 in the Z255 Project's version. I arbitrarily chose the value of 40 for my version of the tree.

The Z255 Modal Haplotype (red text) & Branch Modal Haplotype (mutations in green highlight)

I then defined the modal haplotype for each individual branch (called the Branch Modal Haplotype, or BMH, in the Results Spreadsheet) and marked in green highlight every value that differed from the Z255 MH. This identified those mutations which were common to all or several of the sub-branches, and which therefore potentially occurred quite early on (i.e. relatively far upstream) on the Tree, after the Z255 mutation/marker. These included the following:

  • a value of 19 on marker 20 (dys448) for all sub-branches
  • a value of 13 on marker 9 (dys439) for all sub-branches except Branch D
  • values of 16 & 8 on markers 13 & 14 (dys458 & dys459a) respectively, for all sub-branches except Branch F
  • and so on ...
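For readers who prefer to see this step in code, here is a minimal Python sketch of deriving a Branch Modal Haplotype and listing where it differs from the parent MH. The function names (`modal_haplotype`, `differences`) and all marker values are invented for illustration; they are not the real Z255 MH or the project's results, and in practice I did this by eye in the spreadsheet.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common value for each marker across a group of haplotypes."""
    markers = haplotypes[0].keys()
    return {
        m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
        for m in markers
    }


def differences(branch_mh, parent_mh):
    """Map each differing marker to a (parent value, branch value) pair."""
    return {
        m: (parent_mh[m], v)
        for m, v in branch_mh.items()
        if v != parent_mh[m]
    }


# Hypothetical values, for illustration only.
parent_mh = {"dys393": 13, "dys390": 24, "dys391": 11, "dys448": 18}
members = [
    {"dys393": 13, "dys390": 23, "dys391": 10, "dys448": 19},
    {"dys393": 13, "dys390": 23, "dys391": 10, "dys448": 19},
    {"dys393": 14, "dys390": 23, "dys391": 10, "dys448": 19},
]

bmh = modal_haplotype(members)
diffs = differences(bmh, parent_mh)
print(diffs)
```

The single member with a different dys393 value does not change the modal value, so that marker does not appear among the branch's differences.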

I then visually inspected each sub-branch in turn, marker by marker, and identified which marker values differed from the Z255 MH and whether these values were unique to that branch. Thus marker 4 (dys391) has a value of 10 in Branch F, a value that is unique to this group and therefore helps define this branch. Similarly, marker 2 (dys390) has a value of 23 for each member of Branch F and thus is also potentially branch-defining (although, as it is also present in Branch A, it could instead sit further up the tree as a mutation common to both).
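The same check can be sketched in Python: given each branch's differences from the Z255 MH, separate the values carried by only one branch from those carried by several. The function name `classify_mutations` is hypothetical, and the input is a toy subset built around the examples mentioned in this post rather than the full spreadsheet.

```python
def classify_mutations(branch_diffs):
    """Split (marker, value) pairs into those unique to one branch and those shared.

    branch_diffs maps a branch name to {marker: value} for every marker on
    which that branch's modal haplotype differs from the parent MH.
    """
    carriers = {}
    for branch, diffs in branch_diffs.items():
        for marker, value in diffs.items():
            carriers.setdefault((marker, value), []).append(branch)

    unique = {k: v[0] for k, v in carriers.items() if len(v) == 1}
    shared = {k: sorted(v) for k, v in carriers.items() if len(v) > 1}
    return unique, shared


# Toy subset for illustration; the other markers and branches are omitted.
branch_diffs = {
    "A": {"dys390": 23, "dys448": 19},
    "D": {"dys448": 19},
    "F": {"dys390": 23, "dys391": 10, "dys448": 19},
}

unique, shared = classify_mutations(branch_diffs)
print(unique)
print(shared)
```

Values in `unique` are candidates for branch-defining mutations, while values in `shared` are candidates for being moved further up the tree.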

Branch-defining mutations on Branch F (bold outline)

I also identified mutations that were specific to individuals and were therefore not branch-defining. Examples include ...

  • a value of 14 for marker 1 (dys393) in member G-68 (row 28)
  • a value of 15 for marker 3 (dys19) in member G-79 (row 15)
  • a value of 14 for marker 9 (dys439) in member G-79 (row 15)
  • and so on ... 
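As a rough sketch, mutations specific to individuals can be found by comparing each member's results against the Branch Modal Haplotype. The function name `private_mutations`, the member labels and the marker values below are all made up for illustration and are not taken from the Results Spreadsheet.

```python
def private_mutations(members, branch_mh):
    """Return, per member, the markers whose values differ from the branch modal haplotype."""
    result = {}
    for name, haplotype in members.items():
        diffs = {m: v for m, v in haplotype.items() if v != branch_mh[m]}
        if diffs:
            result[name] = diffs
    return result


# Hypothetical branch modal haplotype and members, for illustration only.
branch_mh = {"dys393": 13, "dys19": 14, "dys439": 13}
members = {
    "member-1": {"dys393": 13, "dys19": 14, "dys439": 13},
    "member-2": {"dys393": 14, "dys19": 14, "dys439": 13},
    "member-3": {"dys393": 13, "dys19": 15, "dys439": 14},
}

private = private_mutations(members, branch_mh)
print(private)
```

If two or more members turn out to share the same off-modal value, that may point to a further sub-grouping within the branch rather than to separate individual mutations, so the output is a prompt for inspection rather than a final answer.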


The end result is a draft "tree" for each sub-branch. This exercise is best done with pen and paper initially because there will be a lot of crossing out and moving markers around. You can see in the diagram below that the marker values that are shared by all members of Branch F are written in the upper part of the tree, and the values that are unique to specific individuals result in a branching pattern in the lower part of the tree. Marker values that also occur in other branches and might therefore be better placed further up the tree are indicated with arrows pointing upwards.

Identifying Branch-specific & Individual mutations for Branch F

Once the mutations for each sub-branch had been defined, the next step was to try to hook the various sub-branches together. This was a game of chicken and egg, trying to figure out if some mutations could have occurred earlier in the tree than others. If placing them earlier in the tree resulted in a simpler version of the tree, then the particular mutation was moved up accordingly (this is analogous to the "maximum parsimony" approach used in the Fluxus software programme). Doing so often required additional upstream branches to be created in order to "fit them in".
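The "simpler tree" test can be illustrated with a small Python sketch that counts mutation events before and after moving shared mutations onto a new upstream node. This is a deliberately simplified illustration of the parsimony idea, not how Fluxus works internally; the function names (`hoist_shared`, `count_events`) are hypothetical and the input is a toy example covering only two branches.

```python
def hoist_shared(branch_mutations):
    """Move mutations carried by every listed branch onto one upstream node."""
    sets = [set(m) for m in branch_mutations.values()]
    upstream = set.intersection(*sets)
    remaining = {b: set(m) - upstream for b, m in branch_mutations.items()}
    return upstream, remaining


def count_events(upstream, remaining):
    """Count mutation events: one per upstream mutation plus one per branch-level mutation."""
    return len(upstream) + sum(len(m) for m in remaining.values())


# Toy example: (marker, value) pairs assigned to two branches.
branch_mutations = {
    "A": {("dys390", 23), ("dys448", 19)},
    "F": {("dys390", 23), ("dys391", 10), ("dys448", 19)},
}

# Without a shared upstream node, each branch carries its own copy of every mutation.
events_before = count_events(set(), branch_mutations)

upstream, remaining = hoist_shared(branch_mutations)
events_after = count_events(upstream, remaining)

print(events_before, events_after)
```

Fewer events after hoisting means the rearranged tree explains the same marker values with fewer independent mutations, which is the sense in which it is "simpler".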

Lastly, once the tree had been fully worked out on paper, it could easily be transferred into digital format by drawing it in Excel.

Simples!
Maurice Gleeson
Dec 2016



The Mutation History Tree for Lineage II (L2 MHT)

A more detailed account of the Grouping & Placement process can be found in this YouTube video.






