Biology Literature Research (BioLitRes) 101 -
Part 3 - Writing Queries.


Overview: In this section you will learn how to write good queries for searching any type of scientific literature library, database, portal, or repository.

As with many library resources, technology has made searching for research literature vastly easier than it used to be. Once upon a time, literature indexes existed only in print. Indexers would carefully go through periodicals and compile citation details about their content for publication in print. The indexes offered several ways to look up those details: a list of article titles, a list of authors, and a list of subjects. Some, but not all, included abstracts (article summaries). Major indexes would publish new volumes quarterly, or more or less frequently depending on the amount of material they indexed. Some had a thesaurus containing the controlled vocabulary used for the index's subject terms. Looking for materials could well involve going through each and every volume, months and years back, to find articles of interest. Needless to say, scholarly work was a laborious process.

Technology has vastly improved access to literature. The major print indexes of yore have given way to electronic bibliographic databases. Indexers still go over periodicals (and, for a number of indexes, other types of materials) and create a record for each article. These records contain what are called fields, which hold details about the article. There is a field for the article title, a field for the article author(s), and a field for each component of the citation. There may also be a field for subject headings (aka descriptors) that help the searcher see what the main topics of the article are. Some databases use a controlled vocabulary (either from a thesaurus or another source) and others don't. When available, most subject terms are assigned by the indexer, although some systems use an algorithm that assigns them automatically from the article details, references, or text. The number and type of fields depend on the database. The more types of details that are indexed, the more types of fields are created to hold them, so that the information in the record can be searched by category. Search engines have various features such as limits (date, type of material, language, category subsets, etc.), search history, display options, and file output options. Technology has made it possible to search all records, no matter how many years they span, at the same time for a query. There is great flexibility in defining what is searched for. Needless to say, it is much easier to identify literature than in the past. Knowing this, you will have a better understanding of, and appreciation for, the amount of behind-the-scenes work that makes your search happen.
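To make the idea of records and fields concrete, here is a minimal Python sketch. It is only an illustration: the `Record` class, the `search_field` helper, and the two sample records are invented for this example and do not reflect how any particular database stores or searches its data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """One bibliographic record; each attribute plays the role of a field."""
    title: str
    authors: List[str]
    journal: str
    year: int
    subjects: List[str] = field(default_factory=list)  # subject headings (descriptors)


def search_field(records, field_name, term):
    """Return the records whose named field contains the term (case-insensitive)."""
    term = term.lower()
    hits = []
    for rec in records:
        value = getattr(rec, field_name)
        # Treat single-valued and multi-valued fields the same way.
        values = value if isinstance(value, list) else [str(value)]
        if any(term in v.lower() for v in values):
            hits.append(rec)
    return hits


# Made-up sample records, for illustration only.
records = [
    Record("Example article on sickle cell anemia", ["Author A"],
           "Example Journal", 1998, ["Anemia, Sickle Cell", "Hemoglobins"]),
    Record("Example article on gene mutation", ["Author B"],
           "Example Journal", 2003, ["Mutation"]),
]

title_hits = search_field(records, "title", "sickle cell")
subject_hits = search_field(records, "subjects", "mutation")
```

Searching the title field and searching the subject field are separate lookups here, which is the sense in which a record "can be searched by category".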
 
As good as search engines are today, they are not yet at the level where you can type your topic statement and get back the answer you seek. Why? Because understanding natural language is not a trivial problem for search engines. The next best thing, then, is to rewrite your topic statement as a query in a language that the search engine can understand. To a search engine, a query is an expression made up of descriptors, operators, and filters, and it uses a set of grammar rules to interpret the meaning of the query and translate it into a search operation.

To write good queries you need a good set of descriptors, operators for combining them, and a mechanism for filtering, setting limits, or expanding the scope (truncation of terms).

GUIDELINES

STEP 1: Rewrite concepts as descriptors that will be understood by the specific search engine.
Descriptors: link to keywords and controlled vocabulary
We already developed a list of descriptors in Parts I and II. We need to make sure that they are 'understood' by the search engine we are going to use.
Even though each search engine has its own set of keywords, you are likely to do just fine if you refined your topic statement as per Parts I and II.


Example 1:

Type of source(s) of information: R=reference, S=secondary, P=primary, W=web. Priority: 1=highest.

| Concept | Alternative descriptor | Type of source(s) of information | Priority | Location |
|---|---|---|---|---|
| Human genes and chromosome | same | Reference, secondary | 1 | Biology library |
| Central dogma | same | Reference, secondary | 2 | Biology library |
| Gene Mutation | same | Reference, secondary | 3 | Biology library |
| Hemoglobin and red blood cells | same | Reference, secondary | 4 | Biology library |
| Human genetic disease | same | Reference, secondary | 5 | Biology library |
| Sickle cell anemia aka sickle cell disease aka SCD | same | Reference, secondary | 6 | Biology library |
| Name of gene for SCD | TBD | Primary, web-based | 7 | NCBI PubMed |
| Gene database record | TBD | web-based | 8 | NCBI Entrez |
| Gene sequence | TBD | Primary, web-based | 9 | NCBI GenBank |


Add descriptors to the table of concepts in your worksheet.

STEP 2:  Write your queries by combining terms with operators
Operators: link to Combining terms
The operators we use to combine terms in a query are the Boolean operators AND, OR, and NOT.
Some search engines also accept grouping operators [ ] ( ) { } to combine terms.

Example 1:

| Operator1 | Concept1 | Operator2 | Concept2 | Operator3 | Concept3 |
|---|---|---|---|---|---|
| | hemoglobin | AND | genetic disease | | |
| | Sickle cell anemia | OR | Sickle cell disease | | |
| NOT | Protein sequence | AND | Gene sequence | AND | Sickle cell anemia |


Use the table in your worksheet to write your queries; write one query per line
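The worksheet rows in Step 2 can also be assembled into query strings mechanically. The Python sketch below is one way to do it, not a rule of any search engine: `build_query` is a hypothetical helper, and putting double quotes around multi-word concepts is an assumption (many search engines treat quoted text as a phrase, but check the help pages of the one you use).

```python
def build_query(row):
    """Join (operator, concept) pairs, left to right, into one query string.

    An empty operator means the Operator cell was left blank.
    Multi-word concepts are quoted so they read as phrases (an assumption).
    """
    parts = []
    for operator, concept in row:
        if not concept:
            continue  # skip empty Concept cells
        phrase = f'"{concept}"' if " " in concept else concept
        if operator:
            parts.append(operator)
        parts.append(phrase)
    return " ".join(parts)


# The three rows from the Step 2 example, one query per row.
rows = [
    [("", "hemoglobin"), ("AND", "genetic disease")],
    [("", "Sickle cell anemia"), ("OR", "Sickle cell disease")],
    [("NOT", "Protein sequence"), ("AND", "Gene sequence"), ("AND", "Sickle cell anemia")],
]

queries = [build_query(row) for row in rows]
for query in queries:
    print(query)
```

Writing one query per row, as the worksheet asks, keeps each combination of operators easy to read and easy to change.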

STEP 3: Refine your queries further by using limits, filters, truncation
Truncation: link to Truncation
Limits: restrict the set of values that a given index field may take for a record to be considered of possible interest; for example, year = after 1995
Filters: similar to limits; for example, language it is written in = English

Example 1:

Query expression:

| Query meaning | Operator1 | Concept1 | Operator2 | Concept2 | Operator3 | Concept3 |
|---|---|---|---|---|---|---|
| All variants of human hemoglobin | | Hemoglob* | AND | (Human | OR | Homo sapiens) |

Refine your queries (if necessary) in your worksheet.
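To see what truncation and limits do to a result set, the Python sketch below applies the truncated term `Hemoglob*` together with a year limit and a language filter to a few made-up records. The helper names (`truncation_to_regex`, `apply_limits`), the field names, and the sample records are assumptions for illustration; a real search engine performs these steps internally and may define truncation differently.

```python
import re


def truncation_to_regex(term):
    """Turn a truncated term such as 'Hemoglob*' into a case-insensitive pattern.

    The '*' is read as "any run of word characters", so 'Hemoglob*' covers
    hemoglobin, hemoglobins, hemoglobinopathies, and so on.
    """
    escaped = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(rf"\b{escaped}\b", re.IGNORECASE)


def apply_limits(records, pattern, after_year=None, language=None):
    """Keep records whose title matches the pattern and that pass the limits."""
    hits = []
    for rec in records:
        if not pattern.search(rec["title"]):
            continue
        if after_year is not None and rec["year"] <= after_year:
            continue  # limit: year = after <after_year>
        if language is not None and rec["language"] != language:
            continue  # filter: language it is written in
        hits.append(rec)
    return hits


# Made-up records, for illustration only.
records = [
    {"title": "Hemoglobin variants in humans", "year": 2001, "language": "English"},
    {"title": "Hemoglobinopathies: an overview", "year": 1990, "language": "English"},
    {"title": "Human hemoglobins revisited", "year": 1999, "language": "French"},
    {"title": "Myoglobin structure", "year": 2005, "language": "English"},
]

pattern = truncation_to_regex("Hemoglob*")
all_matches = apply_limits(records, pattern)
limited = apply_limits(records, pattern, after_year=1995, language="English")
```

Truncation widens the search (three of the four titles match), while the limit and the filter narrow it again (only one record remains).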
