Comments on Assignment #1:

 

Questions 1-5 and 10:  Use of the new query interface is highly recommended.  In particular, the INDEX option lets you formulate more complex searches.  

 

Reminder: searching “All Fields” does not give exactly the same result as combining the results of searching each field separately.  The organism field lets you select sequences by organism, based on NCBI’s taxonomic system. 

 

Ad 6:  The protein and nucleotide databanks searched by Entrez are highly redundant.

 

Ad 7 to 9:  Searching with an amino acid sequence finds more divergent sequences than searching with the encoding nucleotide sequence. Therefore, you are better off using the protein rather than the nucleotide sequence!
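
One reason for this is that the genetic code is degenerate: several codons encode the same amino acid, so two genes can differ at many nucleotide positions and still encode the same protein.  The following Python sketch illustrates this with two made-up 15-nucleotide sequences (seq_a and seq_b are invented for illustration, not real genes) and a partial codon table that contains only the codons used here:

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "TTA": "L",
    "AGC": "S", "TCA": "S",
    "CGT": "R", "AGA": "R",
    "GGT": "G", "GGA": "G",
}


def translate(dna):
    """Translate a DNA string codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical sequences that differ only by synonymous codon choices.
seq_a = "ATGCTTAGCCGTGGT"
seq_b = "ATGTTATCAAGAGGA"

print("nucleotide identity:", identity(seq_a, seq_b))
print("protein identity:   ", identity(translate(seq_a), translate(seq_b)))
```

The two DNA strings match at fewer than half of their positions, yet both translate to the same peptide.  A nucleotide search would see a weak match where a protein search sees a perfect one.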

 

Ad 10:  A good search strategy would be to use the index to put something like the following into the search field:

("ribulose bisphosphate carboxylase"[All Fields] OR "ribulose bisphosphate carboxylase large chain"[All Fields] OR "ribulose bisphosphate carboxylase large subunit"[All Fields] OR "rbcl"[All Fields]) AND "archaea"[Organism]
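
If you prefer to assemble such a query programmatically rather than by clicking through the index, a minimal Python sketch could look like the following.  The helper name build_query is made up for this example; it only builds the query string shown above and does not contact Entrez:

```python
def build_query(terms, field="All Fields", organism=None):
    """Join search terms with OR and optionally restrict by organism.

    Hypothetical helper for illustration; it only formats a string.
    """
    tagged = [f'"{term}"[{field}]' for term in terms]
    query = "(" + " OR ".join(tagged) + ")"
    if organism is not None:
        query += f' AND "{organism}"[Organism]'
    return query


terms = [
    "ribulose bisphosphate carboxylase",
    "ribulose bisphosphate carboxylase large chain",
    "ribulose bisphosphate carboxylase large subunit",
    "rbcl",
]

query = build_query(terms, organism="archaea")
print(query)
```

The printed string can be pasted into the search field.  Keeping the synonyms in a list makes it easy to add or drop alternative names for the protein.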

 

When you look at the numbers in the index, keep in mind that if you change the selected field, the main part of the table only updates after you hit the view button.  The same is true for the display button in the clipboard.  You first need to display or view the items you want to work with; only then does the following command (save, OR, …) act on the selected items!

 

Ad 11 and 12:  Entrez accesses a highly redundant databank, so under related sequences you get many more entries than you get hits in a BLAST search of a non-redundant databank.  The ATPase-like protein involved in flagellar assembly is related to the catalytic and regulatory subunits of the F- (=bacterial), V- (=vacuolar or eukaryotic) and A- (=archaeal) ATPases.  It also shows weak but reproducible similarity to a rho transcription terminator subunit.  In addition to these subunits, whose similarity is already apparent at the level of the primary sequence, there are many other nucleotide binding domains that appear to be homologous to the ATP binding domain of the ATPases (including myosin, HSP70, hexokinase and many others).