In this section you will learn how to search protein databases and analyze sequences that you import into the Workbench.
Using the Ndjinn Tool to Search Sequence Databases
1. To get to the Protein Tools homepage, click on the "Protein Tools" button at the top of the window. The Protein Tools homepage is "Empty", meaning that no protein sequences have been saved here yet. Importing sequences from any number of protein databases will change this -- the word "Empty" will disappear and in its place will be the list of imported protein sequences.
2. In the scrollable menu at the top of the page, select the "Ndjinn - Multiple Database Search" tool by clicking on it and hit "Run".

3. On the next screen, you need to define the parameters of your search. In the search box at the top of the page, type "rpoK". This tells the search engine which protein sequences to search for. The rpoK subunit is one of the many subunits that make up the RNA polymerase enzyme. We are asking the Workbench to search for the sequence of an individual subunit of RNA polymerase, rather than for the sequences of all of the RNA polymerase subunits, to avoid being overwhelmed by results. Also, notice the "Hits per page" drop-down menu immediately below the input box. This drop-down menu allows you to decide how many sequences you want to display. For this exercise, you want to see all of the sequences that are found, so select "All".

Scroll down the page. Below the input box, you will see a list of many different databases, all containing a variety of sequence information. The databases are separated into two distinct groups: the first group contains sequences from many different organisms (for example, "GBBCT" contains a large number of sequences from many different bacteria), whereas the second group contains the entire genome sequences of specific organisms (for example, "Mthe" contains the entire genome sequence of Methanobacterium thermoautotrophicum).
4. We are going to use the database called "SWISSPROT" because it contains protein sequences from many different organisms. Scroll down the page and check the box next to this database.

5. Scroll back up to the top of the screen and click on "Search". The next screen to appear will contain the results of the search. At the time this tutorial was written, the search engine found 14 sequences using "rpoK" as the search string. If you get a different number of matches, do not worry. The number of search results can vary because new sequences are added to the databases on a daily basis. What do the descriptions that appear in the results window (e.g. sulac, metth, metja...) mean? With this strange nomenclature, or naming system, it is almost impossible to know which organisms' sequences you are looking at.
6. To get more information about these sequences, select all of them by clicking on them with your cursor, then click on the "Show Record(s)" button below the results window.

A new screen will appear. This screen contains information about the sequences listed in the results window, for example, the name of the organism that the sequence belongs to and that organism's classification, that is, whether it is a eukaryote, a bacterium or an archaebacterium.
7. Scroll down the Records window or use the "Find" function in your browser (in the "Edit" menu) to locate the "rpoK" sequence from the archaebacterium Methanobacterium thermoautotrophicum. Once you have identified the M. thermoautotrophicum rpoK sequence, select it by clicking on the box next to it.
8. Scroll to the bottom of the records files and click on the "Import Sequence(s)" button. This will send the M. thermoautotrophicum rpoK sequence to the Protein Tools homepage.

9. Click "Return" in the new window that appears after you click on the "Import Sequence(s)" button. You should now be back on the Protein Tools homepage.
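The short descriptions from step 5 (sulac, metth, metja) look like Swiss-Prot organism codes, and the selection made in step 7 amounts to filtering the hits by protein name and organism. The sketch below illustrates that idea outside the Workbench. It is not part of the Workbench itself: the function names `describe_hit` and `pick_hits` are hypothetical, the label format `protein_organism` (for example `rpok_metth`) is an assumption about how the hits are written, and the lookup table covers only the three codes mentioned in this tutorial.

```python
# Minimal sketch: decode result labels and pick the hit of interest.
# Assumption: labels follow the Swiss-Prot pattern PROTEIN_ORGANISM.
# The table is illustrative and lists only the codes named in this tutorial.
ORGANISM_CODES = {
    "SULAC": "Sulfolobus acidocaldarius",
    "METTH": "Methanobacterium thermoautotrophicum",
    "METJA": "Methanococcus jannaschii",
}


def describe_hit(label):
    """Return (protein name, organism name) for a label such as 'rpok_metth'."""
    protein, sep, code = label.strip().upper().partition("_")
    if not sep:
        # No underscore present: treat the whole label as an organism code.
        protein, code = "", protein
    organism = ORGANISM_CODES.get(code, "unknown organism code")
    return protein, organism


def pick_hits(labels, protein, organism_fragment):
    """Keep labels whose protein name matches and whose organism name
    contains the given fragment (both comparisons ignore case)."""
    wanted = []
    for label in labels:
        name, organism = describe_hit(label)
        if name == protein.upper() and organism_fragment.lower() in organism.lower():
            wanted.append(label)
    return wanted


if __name__ == "__main__":
    hits = ["rpok_sulac", "rpok_metth", "rpok_metja"]
    for hit in hits:
        print(hit, "->", describe_hit(hit))
    print(pick_hits(hits, "rpoK", "thermoautotrophicum"))
```

In the Workbench, the "Show Record(s)" screen remains the authoritative way to identify an organism, since a real search may return codes that are not in this small table.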