Introduction to Bioinformatics - COIN81


COIN 81 Week Six

In week 6 we'll start our journey into phylogenetics and motif analysis. You'll use Biology Workbench to investigate HIV mutation, compare the envelope proteins of retroviruses, and participate in a scenario-based lesson on the emergence of SARS. You'll start out by viewing the HIV mutation PowerPoint, which introduces the problem space and the use of phylogenetics as a tool for comparative sequence alignment and for determining diversity and divergence, two key measures of viral evolution. After a comparison of DNA, we'll use ClustalW multiple sequence alignments of retroviruses to determine evolutionary linkage. The same exercise can be performed on prions.
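To make diversity and divergence a little more concrete, here is a minimal Python sketch. It assumes that diversity means the average pairwise difference among sequences sampled at one time point, and that divergence means the average difference between those sequences and an earlier reference sequence; the exercise materials may define them slightly differently. The function names and the short DNA strings are made up for illustration and are not taken from the HIV data set.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def diversity(seqs):
    """Mean pairwise p-distance among sequences from one sample (assumed definition)."""
    dists = [p_distance(a, b) for a, b in combinations(seqs, 2)]
    return sum(dists) / len(dists)


def divergence(seqs, reference):
    """Mean p-distance from each sequence to a reference sequence (assumed definition)."""
    return sum(p_distance(s, reference) for s in seqs) / len(seqs)


# Toy, already-aligned sequences invented for this example (not real HIV data).
founder = "ATGCATGCAT"
later_visit = ["ATGCATGCAT", "ATGTATGCAT", "ATGTATGCAA"]

print("diversity: ", round(diversity(later_visit), 3))
print("divergence:", round(divergence(later_visit, founder), 3))
```

In the actual exercise the alignment and distances come from Biology Workbench; the sketch is only meant to show what the two numbers summarize.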

After you gain confidence with those tasks, we'll try our hand at comparing the SARS virus with other coronaviruses. This was an actual activity performed in 2003 as SARS emerged in China. Follow the steps in the phylogenetics exercise carefully, and work through creating dendrograms of each of the SARS proteins. I have prepared six proteins in text format; it is a time-consuming chore.

The main page for starting the assignments can be found at this link.

Biology Workbench - Biology Workbench is a web-based tool that runs on an SDSC server. There is a PDF that you can read to help you get started.

Swami - Swami, the Next Generation Biology Workbench, builds on Biology Workbench and adds more tools.

Phylogenetics - HIV mutation can be studied using phylogenetics. Start out by reading this paper, which describes the actual experiment you will be running. You'll also need to download the files as a zipped archive.

Phylogenetics - Retrovirus ENV protein comparison. You'll run a multiple sequence alignment of the retroviral ENV protein from HIV-1, SIV-1, HIV-2, SIV-2, HTLV-1, STLV-1, HTLV-2, and STLV-2. You'll download the sequences (and submit them) from this text file.

Phylogenetics - Emergence of SARS. This is also a scenario-based exercise based on the following sequence and modeling research, published in early spring 2003, which looked at sequencing and multiple sequence alignments of SARS, and compared it with other coronaviruses to attempt to understand where it had emerged from. You'll download the sequences (and submit them) from this text file.

Phylogenetics - Comparing hemoglobin and myoglobin using these sequences. You should try to locate protein sequences using GenBank, UniProt, or Swiss-Prot (TrEMBL) to make sure you understand how to search for, download, and format sequences for performing multiple sequence alignments.
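As a rough illustration of the formatting step, the Python sketch below writes sequences in FASTA format, the plain-text layout that multiple sequence alignment tools such as ClustalW accept, and reads them back. The helper names `to_fasta` and `parse_fasta`, the record names, and the residue strings are all placeholders invented for this example; they are not real hemoglobin or myoglobin sequences, and downloading from the databases is not shown.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for name, seq in records:
        # Remove spaces/newlines left over from copy and paste, and uppercase.
        cleaned = "".join(seq.split()).upper()
        lines.append(f">{name}")
        for i in range(0, len(cleaned), width):
            lines.append(cleaned[i:i + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read FASTA text back into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Placeholder records: the names and residues are made up for illustration.
example_records = [
    ("example_globin_A", "acdefghikl mnpqrstvwy"),
    ("example_globin_B", "ACDEFGHIKL"),
]

fasta_text = to_fasta(example_records, width=10)
print(fasta_text)
print(parse_fasta(fasta_text))
```

Each record starts with a `>` header line followed by the sequence, so several proteins can be pasted into an alignment tool as a single block of text.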

Assignment four will take a full week, and is the foundation you'll build on for motif analysis and the study of avian influenza virus (AIV). I find phylogenetics and multiple sequence alignments to be among the most interesting bioinformatics tools and methods for understanding both protein and viral evolution.