Introduction to Bioinformatics lab5: Multiple alignment & Phylogeny

Nov 04, 2009.

First, download and install clustalw, clustalx, treeview and phylip on your disk.

Section I: Multiple alignment exercise

1. Carboxypeptidases

Carboxypeptidases are enzymes that cut peptides at the carboxyl terminus. They are secreted by the pancreas to aid digestion. The file carboxypep.fasta contains 18 carboxypeptidase sequences from humans, cows, rats, pigs, etc. Align the sequences using ClustalX.
2. Beta-lactamases

Beta-lactamases are enzymes (produced by bacteria) that cleave antibiotics of the beta-lactam family (which includes penicillin), thus making the bacteria resistant to those drugs. The sequences of 12 beta-lactamases can be found in the file bcII-all.fasta. Align them in ClustalX. Choose Alignment | Output Format Options | Output Order | Input to prevent ClustalX from rearranging the sequences.
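Before aligning, it can be useful to confirm that an input file really contains the expected number of sequences (18 and 12 in the two exercises above) and to record the input order, so that you can check afterwards that ClustalX kept it. The following is a minimal Python sketch, not part of the lab software. The function name read_fasta_ids and the toy sequences are made up for illustration, and the sketch assumes plain FASTA with one ">" header line per sequence.

```python
from io import StringIO


def read_fasta_ids(handle):
    """Return the identifiers (first word of each '>' header) in file order."""
    ids = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            words = line[1:].split()
            # An empty header is kept as "" so that the count stays correct.
            ids.append(words[0] if words else "")
    return ids


# Made-up three-sequence example standing in for a real FASTA file.
demo = StringIO(">seqA human\nMKLV\nAAGT\n>seqB cow\nMKIV\n>seqC rat\nMRLV\n")
ids = read_fasta_ids(demo)
print(len(ids), ids)
```

To use it on the lab files, pass an open file instead of the StringIO object, for example open("carboxypep.fasta"). If the file matches the description above, the reported count should be 18.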
Section II: phylip exercise

Introduction

The aim of this exercise is to gain a basic understanding of phylogenetic trees. These are essential in systematic and evolutionary biology, but they are also important tools in ecology and medicine (as we will see here).

Inference Methods

The way of constructing a tree in today's lab is described below.
1. In-class exercise
Note: You should view the whole output (outfile) or outtree file in a text editor (e.g. Notepad).

Section III: Take-home exercise

Evolution of HIV and SIV

There are four different subspecies of the African green monkey, Cercopithecus aethiops. They inhabit different, but partially overlapping, areas south of the Sahara. A lentivirus, which is a type of retrovirus, has been isolated from each of these subspecies. The lentivirus is called SIV, simian immunodeficiency virus, which is a misleading name because it has not been shown to cause immunodeficiency in its natural hosts. The lentiviruses are subspecies-specific, which has led to the conclusion that the virus is ancient. SIV resembles HIV in many respects, and the two probably have a common ancestor.

Their RNA genomes are about 10 kb in length. A large fraction of the RNA is taken up by the gag, pol and env genes and the long terminal repeats. In addition, there are five or six shorter genes, some of which are unique to lentiviruses.

HIV and SIV are extraordinarily diverse from a genetic point of view. Even within a single infected individual, the viruses may differ at the RNA level. The term for these different viruses is quasi-species. HIV is divided into two groups, HIV-1 and HIV-2, which are further subdivided into subtypes. Globally circulating strains of HIV are known to be highly recombinogenic, but in humans recombination can only occur between viruses that are replicating within the same cell. It has been shown that coinfection can occur, but it is very rare.

In this lab you will examine the evolution of HIV and SIV.
Exercise

The three files env.txt, gag.txt and pol.txt contain the protein sequences from the isolates in the list below.
Do a phylogenetic analysis of the proteins. Use neighbor joining with 100 bootstrap replicates as shown above. Also run the same analysis, but skip seqboot. This will give you the real neighbor joining tree (the tree based on the real data, as opposed to the bootstrap trees, which are based on random sampling from the real data). From the consensus tree, you can see the support for the nodes on the real tree.

All the trees are unrooted. That means that they do not say anything about the direction of evolution. Take that into account when you compare the trees. To simplify, use the same outgroup in all the trees. To know which number to type in the outgroup box, count the sequences from the top of the alignment. (The numbers depend on the alignment, so the same taxon could have a different number in the alignments for the different proteins.) Use your biological knowledge to select an appropriate outgroup.

If you want, you can also run protpars to search for the best tree under the maximum parsimony criterion and compare the results.

Please answer the following question:
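As background for the bootstrap step described above: "random sampling from the real data" means that each replicate is built by drawing alignment columns at random, with replacement, until the replicate is as long as the original alignment. The following Python sketch illustrates that idea only. It is not PHYLIP's seqboot code, and the function name bootstrap_alignment, the taxon names and the toy sequences are all made up for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that results can be reproduced.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    # The same columns are used for every taxon, so columns stay intact.
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Made-up toy alignment; names and residues are illustrative only.
toy = {"taxon1": "MKLVA", "taxon2": "MKIVA", "taxon3": "MRLVG"}
rng = random.Random(0)  # fixed seed so that reruns give the same replicates
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(len(replicates), replicates[0])
```

Each replicate has the same taxa and the same length as the original, but some columns appear several times and others not at all. A tree is then built from every replicate, and the consensus tree summarizes how often each grouping recurs.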
Updated on Aug. 12, 2007 by Wu