Phylogenetics
Tasks for the lab report
Analyse two of the alignments given above, your choice (one should be
DNA and one aa sequences), and use at least two of the three main
methods of phylogenetic reconstruction: distance, parsimony, maximum
likelihood. The trees should include some kind of statistical testing.
Hemoglobin B from various species
(unaligned): Amino acid sequence


Figure 1. (above) A neighbor joining tree with a bootstrap test,
constructed from the hemoglobin B aa sequences of 12 taxa. Substitution
model: Poisson correction (it corrects the observed proportion of
differing amino acids for multiple substitutions at the same site;
judged here to be a better choice than PAM); all done in MEGA 3. The
figure below shows the neighbor joining tree of the same sequences
displayed in NJ-plot.

Figure 2. The phylogeny tree constructed with the maximum parsimony
method. Bootstrap did not work on maximum parsimony using MEGA 3.
A substitution model was not applicable (why?: The method is completely
different from NJ. It does not use substitution models. Maximum
parsimony is based on the principle of minimizing the number of
evolutionary changes.)

Figure 3. Pairwise distance matrix for the aa sequences using Poisson
correction as the substitution model. These distance values are the
basis for the neighbor joining phylogeny tree seen in figure 1.
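
As an illustration of how a Poisson-corrected distance such as those in
figure 3 could be computed, here is a minimal sketch. It assumes the
standard formula d = -ln(1 - p), where p is the proportion of differing
amino acid sites. The function names and the two short peptide strings
are made up for the example; they are not the lab data and this is not
the code MEGA 3 uses.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("sequences differ at every site; distance undefined")
    return -math.log(1.0 - p)


# Toy aligned peptides (hypothetical), differing at 1 of 10 sites.
print(round(poisson_distance("MVHLTPEEKS", "MVHLTAEEKS"), 4))
```

The correction only matters when p is large: for small p the corrected
distance is almost equal to p itself.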
An 896 bp segment of mtDNA d-loop
for five primates from Brown et al. (1982)

Figure 4. The neighbor joining phylogeny tree, built with the Kimura
2-parameter substitution model. Bootstrap values are indicated in the
figure. The figure indicates that the gorilla is the closest relative
of the chimpanzee-human group, i.e. it shares the most recent common
ancestor with that group (a living species is not itself an ancestor).
This agrees well with the generally accepted model of hominoid
evolution.

Figure 5. The phylogeny tree built with the maximum parsimony method.
In contrast to the neighbor joining tree (figure 4 above), this tree
groups the gorilla and the chimpanzee together for the mtDNA d-loop
sequence, with the human as their closest relative.

Figure 6. The distance matrix that forms the basis for the neighbor
joining phylogeny tree (figure 4).
Conclusion
For the distance matrix values, the closer a value is to zero, the more
similar the two sequences in that pairwise comparison are. In figure 6,
this means that chimpanzee and human (0.095) are more similar than
human and gibbon (0.212). This is reflected in the neighbor joining
phylogeny tree (figure 4).
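
The DNA distances in figure 6 were computed with the Kimura 2-parameter
model. The sketch below shows how such a pairwise distance could be
calculated, assuming the standard formula
d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P is the proportion of
transitions and Q the proportion of transversions. The function name and
the toy sequences are hypothetical, and MEGA 3 may handle gaps and
ambiguous bases differently.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence is not A, C, G or T are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy example (hypothetical): one transition and one transversion in 10 sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```

Treating transitions and transversions separately matters for mtDNA,
where transitions are typically much more frequent than transversions.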
Neighbor joining (NJ) is an algorithm for inferring a branching tree
diagram from the distance matrix. It works by successively clustering
pairs of taxa together. NJ allows contemporary tips to sit on branches
of unequal length, i.e. it does not assume a molecular clock. NJ is
therefore suitable for datasets whose sequences have evolved at widely
varying rates.
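
The successive clustering can be sketched as below. This is a minimal
illustration of the standard NJ procedure (pick the pair with the
smallest Q value, join it, recompute distances to the new node), not the
implementation used by MEGA 3. The function name, the Newick output
format and the five-taxon toy matrix are assumptions made for the
example; the matrix is not the data from figure 3 or figure 6.

```python
def neighbor_joining(names, dist):
    """Build an unrooted NJ tree and return it as a Newick string.

    names: list of taxon labels; dist: symmetric matrix (list of lists).
    """
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    nodes = list(names)
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}

    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Pick the pair that minimises Q = (n - 2) * d(a, b) - r(a) - r(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from a and b to their new parent node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a}:{len_a:.4f},{b}:{len_b:.4f})"
        # Distances from the new node to every remaining node.
        d[new] = {}
        for c in nodes:
            if c in (a, b):
                continue
            d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    # Join the last three nodes at a central point.
    x, y, z = nodes
    len_x = 0.5 * (d[x][y] + d[x][z] - d[y][z])
    len_y = 0.5 * (d[x][y] + d[y][z] - d[x][z])
    len_z = 0.5 * (d[x][z] + d[y][z] - d[x][y])
    return f"({x}:{len_x:.4f},{y}:{len_y:.4f},{z}:{len_z:.4f});"


# Hypothetical toy matrix for five taxa a-e.
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(neighbor_joining(toy_names, toy_dist))
```

Because the two branch lengths of each joined pair are computed
separately, sister taxa can end up at different distances from their
common node, which is how NJ accommodates unequal rates.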
"Bootstrap is a method that is
analogous to cutting the data matrix into individual columns of data
and throwing the characters into a hat. A character is then drawn at
random from this hat and it becomes the first character of the new
datamatrix. The character is then replaced in the hat, the hat is
shaken and again another character is drawn from the hat. This process
is repeated until our new pseudoreplicate is the same size as the
original. Some characters will be sampled more than once and some will
not be sampled at all. This process is repeated many times (say,
100-1,000) and phylogenetic trees are reconstructed each time. After
the bootstrap procedure is finished, a majority-rule consensus tree is
constructed from the optimal tree from each bootstrap sample."
The bootstrap support for any internal branch is the number of times
(usually given as a percentage of the replicates) it was recovered
during the bootstrapping procedure. Bootstrap values over 50 are here
considered valid. In short, the bootstrap estimates how strongly the
data support each grouping in the tree, and it proceeds by resampling
the characters (columns) of the original data matrix with replacement.
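
The resampling step can be sketched as follows. This is a minimal
illustration under stated assumptions: `bootstrap_columns` draws
alignment columns with replacement, and `bootstrap_support` counts how
often each grouping is recovered. The tree-building step is replaced by
a hypothetical stand-in, `closest_pair_clade`, which simply reports the
two most similar taxa as one group; a real analysis would build an NJ or
parsimony tree for each pseudoreplicate instead. The toy alignment is
made up.

```python
import random
from itertools import combinations


def bootstrap_columns(alignment, rng):
    """Return a pseudoreplicate: columns drawn with replacement.

    alignment: dict mapping taxon name -> aligned sequence.
    The replicate has the same number of columns as the original.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def closest_pair_clade(alignment):
    """Hypothetical stand-in for tree building: group the closest pair."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    pair = min(combinations(sorted(alignment), 2), key=lambda p: hamming(*p))
    return {frozenset(pair)}


def bootstrap_support(alignment, build_clades, replicates=100, seed=0):
    """Percentage of replicates in which each grouping was recovered."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(replicates):
        replicate = bootstrap_columns(alignment, rng)
        for clade in build_clades(replicate):
            counts[clade] = counts.get(clade, 0) + 1
    return {clade: 100.0 * n / replicates for clade, n in counts.items()}


# Toy alignment (hypothetical): t1 and t2 are identical, t3 differs everywhere.
toy = {"t1": "AAAAAAAA", "t2": "AAAAAAAA", "t3": "CCCCCCCC"}
print(bootstrap_support(toy, closest_pair_clade, replicates=50, seed=1))
```

A fixed seed is used only so that the sketch is reproducible; it is not
part of the bootstrap method itself.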
Maximum parsimony (a character-based method) means that a phylogenetic
tree that explains a given data set (aligned sequences) with fewer
evolutionary events is preferred over a tree that requires more
evolutionary events. It follows the principle of simpler solutions
being preferred over more complex ones.
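
Counting the evolutionary events a tree requires can be illustrated with
the Fitch algorithm, a common way of scoring a fixed tree under
parsimony. The sketch below assumes rooted binary trees written as
nested tuples and equal cost for every change; the function names, the
taxon labels W to Z and the toy alignment are hypothetical, and the
search over all possible trees that a program like MEGA 3 performs is
left out.

```python
def fitch_site(tree, states):
    """Fitch pass for one site.

    tree: a taxon name (leaf) or a tuple (left, right).
    states: dict mapping taxon name -> character at this site.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch_site(left, states)
    right_states, right_changes = fitch_site(right, states)
    shared = left_states & right_states
    if shared:
        # The children can agree on a state: no extra change needed.
        return shared, left_changes + right_changes
    # No shared state: one change is required on this node's branches.
    return left_states | right_states, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Total minimum number of changes over all alignment columns."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Toy alignment and two candidate trees (all hypothetical).
toy = {"W": "AAG", "X": "AAG", "Y": "CTG", "Z": "CTA"}
tree_1 = (("W", "X"), ("Y", "Z"))
tree_2 = (("W", "Y"), ("X", "Z"))
print(parsimony_score(tree_1, toy), parsimony_score(tree_2, toy))
```

The tree with the lower score needs fewer changes to explain the
alignment and is the one parsimony prefers.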
Note also that the NJ-plot function "swap nodes" does not alter the
data; it only changes how the tree is displayed.
(info found in lectures, google and
http://www.bioinf.org/molsys/glossary.html)
Hints in Mega3
1. Align (export to Mega file)
2. Phylogeny
3. Distances: compute pairwise