MCB 3421 Computer Lab Assignment 4 (Databank searches, part A: ENTREZ)

Your name:
Your email address:

1. (less than 20 minutes)
Use PubMed in NCBI's Entrez to find an article written by Carl R. Woese (famous scientist, co-discoverer of the Archaea), published in the journal Proceedings of the National Academy of Sciences of the United States of America, with the words "primary kingdoms" in the title of the paper. Try to use Boolean operators and field tags; if you cannot recall the tags, use the Preview/Index tool under advanced search (link on top right).
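(Aside: A field-tagged query is just a string of terms, each followed by a tag in square brackets and joined by Boolean operators. The short Python sketch below shows one way such a string could be assembled. The helper names tagged and all_of are made up for this illustration; the tags [au] (author), [ti] (title), [ta] (journal) and [dp] (publication date) are PubMed's short field tags. The sketch covers only the author and title restrictions, so treat it as a starting point rather than the finished query.)

```python
def tagged(term: str, tag: str) -> str:
    """Attach a PubMed field tag to a term, e.g. 'Woese CR' + 'au' -> 'Woese CR[au]'."""
    return f"{term}[{tag}]"


def all_of(*terms: str) -> str:
    """Require every term by joining them with the Boolean operator AND."""
    return " AND ".join(terms)


# Author and title-word restrictions from the task above.
# The journal ([ta]) and the year ([dp]) can be added in the same way.
query = all_of(tagged("Woese CR", "au"), tagged("primary kingdoms", "ti"))
print(query)
```

The printed string can be pasted into the PubMed search box as is.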

What query found the 1977 article?

How many related citations (link in right hand bar) are linked to this article?
(In the window that gives you the title, authors and abstract, click on the link labeled "related citations" in the fourth table on the right-hand side; the number is in the header "Results 1 to 20 of xxx".)
When was the most recent one published? (Hint: In the Display settings pull-down menu, set the "Sort by" option to Pub Date.)
(Aside: If you wonder why prokaryotes is the wrong choice when asked in an exam about the names of the three domains of life, check the 2009 opinion piece from Norm Pace. A more readable and complete summary of the history is in Jan Sapp's article at )

2. (ca. 7 minutes) (Note: NCBI's pull-down menus do not always work well in Safari -- use Firefox)
In NCBI's PubMed, find the earliest paper co-authored by Senejani and Gogarten. What is the topic of the paper?

To learn about inteins, select Books as the target database to search (pull-down menu on the left, below the black bar) and search for intein homing, then click on the <see top results> link for the first book and scroll down to the images on homing and splicing (section 11.3.4; somewhat informative). For more information, check Wikipedia on inteins.

3. (ca. 5 minutes)
Dr. Gogarten seems obsessed with an important protein called ATP synthase. Is he interested in anything else? How many articles has he published that are NOT related to the ATP synthase OR ATPase?
What query did you assemble?
How many articles did you find?
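
(Aside: PubMed processes Boolean operators from left to right, so parentheses decide what a NOT applies to. The Python sketch below shows one way to group the unwanted terms before excluding them, and how the same string could be passed to NCBI's E-utilities esearch service instead of the web page. The helper names any_of, without and esearch_url are made up for this illustration, and the sketch only builds the URL; it does not contact NCBI.)

```python
from urllib.parse import urlencode

# Base address of the E-utilities search service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def any_of(*terms: str) -> str:
    """Group alternatives in parentheses so that OR is evaluated first."""
    return "(" + " OR ".join(terms) + ")"


def without(base: str, unwanted: str) -> str:
    """Exclude everything matched by `unwanted` from the hits for `base`."""
    return f"{base} NOT {unwanted}"


def esearch_url(query: str, db: str = "pubmed") -> str:
    """Build (but do not fetch) an esearch URL for the given query."""
    return f"{ESEARCH}?{urlencode({'db': db, 'term': query})}"


query = without("Gogarten[au]", any_of("ATP synthase", "ATPase"))
print(query)
print(esearch_url(query))
```

Without the parentheses, the query would read "Gogarten[au] NOT ATP synthase OR ATPase", which PubMed evaluates left to right and which therefore adds every ATPase article back in.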

4. (10 minutes)
For a scientist of your choice (e.g., your advisor, or someone who publishes in your field of interest), use ISI's Web of Science database (click on the Web of Science tab at the top) to search for articles that cite this author, i.e., you need to use a cited reference search (link below the Web of Science tab).
Which was the most cited article? (There does not seem to be an elegant solution to this; you actually need to scroll through the pages.)
How often was it cited?
When was the most recent citation?
Did you find any interesting article?
Was this article available online?
Repeat the exercise using Google Scholar. By default, Google lists articles in order of how often they were cited (the rich-get-richer principle). Did you identify the same article? Was the number of citations similar or identical?

5. (5 minutes)
Using PubMed, search for articles co-authored by Taiz and Gogarten.

a) How many articles did you retrieve?

b) Using the "find related data" pull down menu in the right bar, display all Nucleotide Links and all Protein Links.
How many did you find?

c) Do all the different protein entries really refer to different sequences?

What might explain your finding?

Go to the nucleotide entry for gi|167559. How many related sequences (link in right hand bar) does this entry have?

Go to the protein encoded by gi|167559 (gi|167560). How many related sequences (link in right hand bar) does this entry have?

How do you explain the difference?

6. (10 minutes)
Using Entrez, search Protein (use drop-down box to select the Protein database) for 19888400 (this is a gi number, see historical note)
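
(Aside: The same record can also be requested without the web page, through NCBI's E-utilities efetch service. The Python sketch below only builds the request URL for the gi number given above; it does not contact NCBI. The helper name efetch_url is made up for this illustration, and the sketch assumes that efetch still resolves gi numbers for the protein database.)

```python
from urllib.parse import urlencode

# Base address of the E-utilities record retrieval service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(identifier: str, db: str = "protein") -> str:
    """Build (but do not fetch) a URL asking for one record in FASTA text format."""
    params = {"db": db, "id": identifier, "rettype": "fasta", "retmode": "text"}
    return f"{EFETCH}?{urlencode(params)}"


url = efetch_url("19888400")
print(url)
```

Opening the printed URL in a browser should show the sequence as plain FASTA text, if the assumption above holds.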

Click on the BLink link in the lower part of the right-hand column (under related information). Read more about BLink here (especially note the last paragraph)

Do you notice anything interesting about the alignments?

The top of the BLink page contains colored squares that show how often the links are found in different organisms. How many plant and metazoan sequences are linked? If you display them (select "only plant" or "only metazoa"), how are the homologs annotated?

7. (8 minutes)
To what domain (superkingdom), phylum (kingdom), and family does Thermoplasma belong? (Use the Taxonomy search. Click on the "Thermoplasma" link that is returned as the result of the search. In the line labeled "lineage", if you hover the mouse pointer over the names, it tells you which taxonomic category you are pointing at.)

How many protein and genome sequences are available for Thermoplasma acidophilum? How many are available for the genus Thermoplasma? (In the taxonomy browser, go to Thermoplasma and check protein and genome in the header, then click on <Display>.)


Check the appropriate radio button below before pressing the submit button:

Send email to your instructor (and yourself) upon submit
Send email to yourself only upon submit (as a backup)
Show summary upon submit but do not send email to anyone