In this section we are going to use the BLAST tool to search for sequences that are similar (or homologous) to the eukaryotic R. americana mitochondrial succinate dehydrogenase (deH) subunit 2 gene.
Using the BLAST Tool to Find Homologous (Similar) Sequences
1. Check the box next to the succinate deH subunit 2 sequence.
2. Select "BLASTP - Compare a PS to a PS DB" (BLASTP - Compare a Protein Sequence to a Protein Sequence Database) in the tool menu, then click "Run".

3. On the next page, we must select the database(s) to be searched. We want to compare the succinate deH sequence to both eukaryotic and prokaryotic sequences. The best database to use in this case is "SwissProt", a very extensive collection of both eukaryotic and prokaryotic sequences. If you wish, you can select any of the other databases as well. To select multiple databases, hold down the CTRL key on your keyboard while selecting. You can use up to 16 databases at one time; however, each database added can significantly increase processing time and may make the results difficult to interpret.
4. All the remaining options on the page allow us to fine-tune our search, but that won't be necessary, so simply click on "Submit".

When this tutorial was written, the results were as follows:

To find out about each sequence, select it and click on "Show Record(s)". Right away we see that three of the top seven matches (the ones check-marked above) are bacteria and therefore PROKARYOTES! In other words, you took a eukaryotic mitochondrial gene sequence, searched a database containing both eukaryotic and prokaryotic sequences, and obtained among the closest matches several prokaryotic sequences.
When you view the information for each sequence you should notice two things:
a. The very first match is the Reclinomonas americana mitochondrial sequence. This makes sense: the sequence we used to search for similar sequences is itself present in the SwissProt database and is obviously the closest match to itself.
b. The prokaryotes whose sequences are most homologous to that of the R. americana succinate deH gene are: Rickettsia prowazekii, Rickettsia conorii and Paracoccus denitrificans. More important than their names is the fact that they are all ALPHA-PROTEOBACTERIA. This suggests that the prokaryotic cell that was engulfed well over a billion years ago, eventually giving rise to mitochondria, was most likely related to present-day alpha-Proteobacteria.
E Value
The E value, or "Expect" value, is the most intuitive way to rank the results of a search. It estimates the statistical significance of a search result by giving the number of matches with a given score that would be expected to occur purely by chance in a search of a database of a particular size. For example, an E value of 2 indicates that two matches with that score would be expected by chance alone. The E value changes with the size of the database: in a larger database, more chance matches with a given score are expected. Search results with E values much higher than 0.1 are unlikely to reflect true sequence relatives, but it is sometimes useful to examine hits with lower significance (E values between 0.1 and 10) for short regions of sequence similarity. In the absence of longer similarities, these short regions may allow the tentative assignment of biochemical activities to the sequence in question. The significance of any such regions must be assessed on a case-by-case basis. Essentially, the smaller the E value, the less likely it is that the match arose by chance, and so the more likely it is that the sequence is truly related (homologous) to the sequence that was BLASTed. A reported E value of zero means that the expected number of chance matches is too small to display; this typically represents a perfect or near-perfect match.
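To make the dependence on score and database size concrete, here is a minimal Python sketch of the standard Karlin-Altschul relation E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length and n the total database length. The values of LAMBDA and K, the query length and the database sizes below are illustrative assumptions, not the values the Biology Workbench actually used; real values depend on the scoring matrix and gap penalties.

```python
import math

# Assumed, illustrative statistical parameters. Real values depend on the
# scoring matrix and gap penalties chosen for the search.
LAMBDA = 0.267
K = 0.041


def expect_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance hits scoring at least raw_score.

    Implements E = K * m * n * exp(-lambda * S).
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical query length and database sizes (in residues).
QUERY_LEN = 240
for db_len in (1e6, 1e8):
    for score in (40, 80, 160):
        e = expect_value(score, QUERY_LEN, db_len)
        print(f"database={db_len:.0e}  score={score:>3}  E={e:.3g}")
```

Two trends should be visible in the printed table: E falls off exponentially as the score rises, and for a fixed score E grows in direct proportion to the size of the database.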
Using the CLUSTALW Tool to Align Two Sequences for Comparison
We are now going to determine how similar the R. americana succinate deH subunit 2 sequence is to an alpha-proteobacterial sequence using a tool called CLUSTALW. CLUSTALW is used to align two sequences one on top of the other so that it is possible to see where and how they differ. The alignment process takes place by comparing the two sequences and finding common regions within them. The Biology Workbench then uses an algorithm to compute the most likely position in which the two sequences line up.
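The idea of lining up two sequences can be sketched with a small global alignment routine. The Python below is a simplified Needleman-Wunsch alignment with an assumed toy scoring scheme (match +1, mismatch -1, gap -2); it is not the algorithm or the scoring that CLUSTALW itself uses, and the two short sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical toy sequences: the second lacks the first two residues.
top, bottom = needleman_wunsch("MKTAYIAKQR", "TAYIAKQR")
print(top)
print(bottom)
```

The shorter sequence is padded with dashes wherever it has no counterpart in the longer one, which is the same convention the alignment display uses.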
1. Import the bacterial sequence that is the closest match to the R. americana gene (it is the third one from the top) by check-marking it and clicking on the "Import Sequence(s)" button.

2. You should now be back at the Protein Tools homepage, and you should see two sequences listed.
3. Select both sequences by clicking on the boxes next to them and highlight "CLUSTALW - Multiple Sequence Alignment" in the tool menu. Click on "Run".
4. On the next screen click on "Submit" (once again, it is not necessary to change the default settings on this page).

The two sequences are now aligned one on top of the other and a color-coding system is used to differentiate highly conserved regions (matching amino acids) from non-conserved regions. Identical amino acids are highlighted in blue and non-identical amino acids are in black. Certain pairs of amino acids are in green. The green color signifies that, although the amino acids are different, they have similar chemical properties.
You can also see that the bacterial sequence (labeled DHSB_RICCN) has a string of amino acids at the beginning, but there are only dashes (----) for the R. americana sequence. This is because the bacterial sequence is longer than the R. americana sequence: the alignment tool had to slide the R. americana sequence down to align the two. All amino acids that are colored blue are identical in both sequences and, as you can see, there is a very large region of homology. This explains the very low E value (5 x 10^-93), and the correspondingly high score, that the bacterial sequence received in the BLASTP search.
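The color coding amounts to classifying each column of the alignment. The Python sketch below counts identical, similar, different and gapped columns for a pair of already-aligned sequences and reports percent identity over the ungapped columns. The groups of chemically similar amino acids and the two aligned strings are assumptions made for illustration; they are not the groupings or the sequences used by the Biology Workbench.

```python
# Assumed groupings of chemically similar amino acids (illustrative only).
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
    set("AG"),    # small
]


def classify_column(x, y):
    """Label one alignment column as gap, identical, similar or different."""
    if x == "-" or y == "-":
        return "gap"
    if x == y:
        return "identical"
    if any(x in group and y in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def summarize(aligned_a, aligned_b):
    """Return (column counts, percent identity over ungapped columns)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    counts = {"identical": 0, "similar": 0, "different": 0, "gap": 0}
    for x, y in zip(aligned_a, aligned_b):
        counts[classify_column(x, y)] += 1
    ungapped = len(aligned_a) - counts["gap"]
    percent_identity = 100.0 * counts["identical"] / ungapped if ungapped else 0.0
    return counts, percent_identity


# Hypothetical aligned fragments: the second starts later, hence the dashes.
bacterial = "MSEIVKLRDE"
mitochondrial = "---IVRLKDQ"
counts, identity = summarize(bacterial, mitochondrial)
print(counts)
print(f"{identity:.1f}% identical over ungapped columns")
```

Leading dashes are counted as gaps and excluded from the identity percentage, so a longer sequence does not by itself lower the reported similarity.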
A high level of homology between bacterial and eukaryotic mitochondrial sequences, especially considering that two billion years have passed since eukaryotic and prokaryotic organisms went down separate evolutionary paths, strongly supports the theory that mitochondria are the evolutionary products of bacteria.
5. When you are finished viewing the alignment, "Return" to the Protein Tools homepage.