Understanding the Evolution of the Eukaryotic Cell: The Endosymbiotic Theory


In this section you will learn how to search databases for a protein sequence. First, we are going to choose a mitochondrial gene to search for. We need one that is likely to be highly conserved, in other words, one that hasn't changed much over millions of years. Genes that are required for cellular respiration, many of which reside in the mitochondrial genome, fall into this category. We are therefore going to use the protein sequence of the succinate dehydrogenase (succinate deH) gene of the eukaryote Reclinomonas americana. R. americana is a very simple unicellular eukaryote, and its mitochondrial genome is thought to have diverged less from its ancestral form than those of most other eukaryotes. This is important to us since we are looking for mitochondrial sequences that have changed as little as possible. The succinate deH enzyme is involved in the citric acid cycle of cellular respiration and is crucial to cell survival. We would therefore expect the succinate deH gene to have changed very little over billions of years of divergent evolution, because any mutation that drastically altered this gene would result in cell death. In other words, such mutations in the succinate deH gene would be selected against.

Using the Ndjinn Tool to Search Sequence Databases

1. Select the "Ndjinn - Multiple Database Search" tool by clicking on it, then hit "Run".



2. On the next screen, you need to define the parameters of your search. In the search box at the top of the page type: "Reclinomonas americana succinate dehydrogenase". This tells the search engine what to look for. Also, notice the "Hits per page" drop-down menu immediately below the input box. This drop-down menu allows you to decide how many sequences you want to display. For this exercise, you want to see all of the sequences that are found, so select "All".



Scroll down the page. Below the input box, you will see a list of many different databases, all containing a variety of sequence information. The databases are separated into two distinct groups. The first group contains sequences from many different organisms (for example, "GBBCT" contains a large number of sequences from many different bacteria), whereas the second group contains the entire genome sequences of specific organisms (for example, "Mthe" contains the entire genome sequence of the archaeon Methanobacterium thermoautotrophicum).

3. Since you are looking for sequences from a unicellular eukaryotic organism, click on the box that is next to the GBINV database. This is the GenBank Invertebrate Sequences database. It holds DNA sequences from invertebrates and, in GenBank's classification, from unicellular eukaryotes such as R. americana.



4. Scroll back up to the top of the screen and click on "Search". You will then be sent to a page that contains the results of your search. At the time this tutorial was written, this particular search yielded one match. If you get a different number of matches, do not worry. Inconsistencies in the number of search results can occur because new sequences are being added to the databases on a daily basis.

5. Select "GBINV:2258325 Reclinomonas americana mitochondrial DNA, complete genome" by clicking on the box next to it.



In most cases, we would now click on the "Import Sequence(s)" button at the bottom of the page. This would import the selected sequence into your personal database labeled "Endosymbiosis" under Protein Tools, to use whenever you want. However, here we are faced with a special case.

The "Add New Protein Sequence" Tool
Unfortunately, the R. americana mitochondrial gene sequences are not available individually in the protein databases: they are only present as part of the entire R. americana mitochondrial genome, which is very large and hence difficult to work with. In order to retrieve a specific gene sequence from the R. americana mitochondrial genome, we need to do a little extra work. Specifically, we need to find the succinate deH gene sequence within the mitochondrial genome, copy it, and then add it back to the Workbench as an individual sequence using the "Add New Protein Sequence" tool.

1. Make sure that "GBINV:2258325 Reclinomonas americana mitochondrial DNA, complete genome" is selected.



2. In order to view the different sequences within the mitochondrial genome, click on "Show Records". If you are using Netscape, the Records will appear in a separate window; if you are using Internet Explorer, the Records will overwrite the existing window. Scroll down past "References" until you see the area called "Features". Scroll down a little further and you will see white boxes with checkmarks in them next to a "CDS" label. These are the different genes and their sequences. The R. americana mitochondrial genome contains 67 protein-coding genes.

3. To quickly find the succinate deH gene, you can use the "Edit" menu of your web browser and select "Find (on This Page)".
NOTE: Older versions of web browsers for Macintosh computers may not have the "Find" tool. If that's the case, simply scroll down the list until you find the right match.



4. Type "succinate" in the box that appears and click on "Find". You should find three separate succinate deH sequences (each with its own CDS checkbox). They are the sequences for "subunit 3", "subunit 4" and "subunit 2".



5. For the purpose of this exercise, we will use "subunit 2", the last one of the three.

6. The actual protein sequence of this gene is directly below it, labeled "translation", and is written as a string of letters in which each letter stands for one amino acid. Notice that the protein sequences start with the letter M, for "Methionine", which is the amino acid encoded by the start codon. Highlight the "succinate:ubiquinone oxidoreductase subunit 2" sequence (be sure not to include the quotation marks in the sequence) and hit Control+C or select "Copy" from the Edit menu (see picture below).



7. Now, you need to get back to the "Protein Tools" homepage. If you are using Explorer, hit the "Return" button at the bottom of the Records window; if you are using Netscape, simply close the Records window.



8. Select "Add New Protein Sequence" in the tool menu and click "Run". Label your selected sequence (e.g., "Reclinomonas americana succinate dehydrogenase, subunit 2") and then paste the sequence in the box indicated by hitting Control+V or selecting "Paste" from the Edit menu.



9. When you are done, hit the "Update" button at the bottom of the page. The screen will rearrange; then simply hit the "Save" button. This will return you to the Protein Tools homepage, where you will see the sequence that you just entered.
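Step 6 warns against copying the quotation marks along with the sequence, and a pasted sequence can also pick up line breaks and spaces. If you would like to check a copied sequence before pasting it, the short Python sketch below shows one way to do so. It is not part of the Workbench; the function names `clean_protein_sequence` and `to_fasta` are our own, and the check assumes that only the 20 standard one-letter amino acid codes should appear.

```python
import textwrap

# The 20 standard amino acids in one-letter code (an assumption of this sketch;
# rarer codes such as X or U would be reported as unexpected).
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_protein_sequence(pasted: str) -> str:
    """Drop quotation marks, whitespace, and digits, then check the letters."""
    sequence = "".join(ch for ch in pasted.upper() if ch.isalpha())
    unexpected = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"unexpected residue codes: {unexpected}")
    return sequence


def to_fasta(label: str, sequence: str, width: int = 60) -> str:
    """Format a labelled sequence as FASTA text, wrapped at `width` columns."""
    return f">{label}\n" + "\n".join(textwrap.wrap(sequence, width))


# Toy fragment, made up for illustration (not the real subunit 2 sequence).
cleaned = clean_protein_sequence('"MKT LLV\n  AGQ"')
print(to_fasta("toy sequence", cleaned))
```

Here the stray quotation marks, the space, and the line break are removed, leaving a single unbroken string of letters that is safe to paste into the sequence box.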


Note that the steps in the previous section (using the "Add New Protein Sequence" tool) were necessary only because we needed just a small part of the R. americana mitochondrial sequence.

The above procedure can be used to isolate any gene sequence from an entire genome or from a fragment containing multiple genes. In fact, we will use this same procedure later on in this tutorial to isolate a different R. americana mitochondrial gene sequence.
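For readers who prefer to script this procedure, here is a minimal Python sketch of the same idea: scan the "Features" text of a record, find each "CDS" entry, and pull out its "translation". It assumes the record follows the usual GenBank flat-file layout (feature names indented five spaces, with /product and /translation qualifiers on the lines beneath). The function name `cds_translations` is our own, and the sample text is a toy excerpt with made-up locations and sequences, not the real R. americana record.

```python
import re

# A new feature starts with five spaces, a feature key, then its location.
# Qualifier and continuation lines are indented further, so they do not match.
FEATURE_START = re.compile(r"^ {5}(\S+)\s+\S", re.MULTILINE)


def cds_translations(features_text: str) -> dict:
    """Map each CDS /product name to its /translation protein sequence."""
    starts = list(FEATURE_START.finditer(features_text))
    results = {}
    for i, match in enumerate(starts):
        if match.group(1) != "CDS":
            continue  # skip gene, tRNA, and other non-CDS features
        end = starts[i + 1].start() if i + 1 < len(starts) else len(features_text)
        block = features_text[match.start():end]
        product = re.search(r'/product="([^"]*)"', block)
        translation = re.search(r'/translation="([^"]*)"', block)
        if product and translation:
            # Qualifier values may wrap across lines: tidy the whitespace.
            name = " ".join(product.group(1).split())
            results[name] = "".join(translation.group(1).split())
    return results


# Toy Features excerpt: locations and translations are made up for illustration.
SAMPLE = '''\
     gene            1..30
                     /note="toy entry"
     CDS             1..30
                     /product="succinate:ubiquinone oxidoreductase subunit 3"
                     /translation="MKTLLVAGQ"
     CDS             40..90
                     /product="succinate:ubiquinone oxidoreductase subunit 2"
                     /translation="MSTNPRLW
                     IKEF"
'''

hits = {name: seq for name, seq in cds_translations(SAMPLE).items()
        if "subunit 2" in name}
print(hits)
```

Filtering on "subunit 2" plays the role of the browser's "Find" step. Note that this sketch keys the results by product name, so if two CDS entries shared the same name, only the last one would be kept.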
