Comments on Assignments #3:

 

What are Inteins?

Inteins (short for INTervening proTEIN sequence) are self-splicing intervening sequences similar to introns, but they are not only transcribed into mRNA, they are also translated into protein.  The splicing occurs at the protein level: the intein is released, and the two exteins are joined together.  Inteins, like introns, often contain or encode an endonuclease (domain).  This endonuclease cuts copies of the gene that do not contain the intein/intron.  The disrupted copy is then repaired using the intein-containing copy as a template, which copies the intein/intron into the previously intein/intron-free gene.  For this reason these endonucleases are also called homing endonucleases.  At least at first sight they appear to be an excellent example of a selfish gene.  Inteins are usually found in the most conserved parts of proteins.

More on inteins is here and here.

 

 

When you have 2 genomes that are not among the choices offered by the Genome and Bioinformatics sites, how can you compare these 2 genomes?

 

As far as I know, all of the completed and released genomes are offered as options by these two sites. 

If you have a genome sequence that is not at NCBI (which means it has not yet been released), you could use a program like MAGPIE to find and annotate ORFs.  Most of the NCBI programs can also be downloaded and run locally.  Eugene Koonin also provides a software package called SEALS, which might be useful for working with unreleased data.
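To make the idea of "finding ORFs" concrete, here is a toy sketch in Python.  It is not how MAGPIE or any NCBI program works; it only scans the three forward reading frames for stretches that run from an ATG to the next in-frame stop codon.  The function name find_orfs and the min_codons threshold are assumptions made for this illustration, and a real annotation tool would also search the reverse strand and allow alternative start codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=100):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 0-based with an exclusive end; the stop codon is
    included in the reported span. min_codons counts the codons from
    the ATG up to (not including) the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

For a real genome one would keep the default length threshold (or raise it), since short ATG-to-stop stretches occur frequently by chance.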

 

 

 

If A is shown to be homologous to B, and B is shown to be homologous to C, is then A homologous to C? 

In general homology follows a chain rule, and the answer to the question is yes.  However molecular sequences are complicated and can arise through domain shuffling.  To apply the chain rule, one needs to consider the homology of the individual sequence positions. 

- If the demonstration of homology is based on the same part of sequence B, then the answer is yes. 

- If B contains two domains, and domain 1 is homologous to A, and domain 2 is homologous to C, then the answer is NO.
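The two cases above can be sketched as a small check on where in sequence B the two alignments fall.  This is only an illustration under stated assumptions: the Hit record, the function names, and the min_overlap value of 30 positions are all hypothetical, and the coordinates would have to come from whatever alignments were used to demonstrate the A-B and B-C homology.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    """Region of sequence B (1-based, inclusive) aligned to a partner sequence."""
    partner: str
    b_start: int
    b_end: int

def overlap_length(h1, h2):
    """Number of positions of B that are covered by both alignments."""
    return max(0, min(h1.b_end, h2.b_end) - max(h1.b_start, h2.b_start) + 1)

def chain_rule_applies(hit_a, hit_c, min_overlap=30):
    """True if both homologies rest on (largely) the same part of B.

    min_overlap is an arbitrary illustrative threshold, not an
    established criterion.
    """
    return overlap_length(hit_a, hit_c) >= min_overlap

# Same part of B supports both homologies: A and C are homologous.
same_region = chain_rule_applies(Hit("A", 1, 200), Hit("C", 50, 250))

# A matches domain 1 of B, C matches domain 2 of B: no conclusion about A and C.
two_domains = chain_rule_applies(Hit("A", 1, 150), Hit("C", 200, 400))
```

In the first example the alignments share 151 positions of B, so the chain rule can be applied; in the second they share none, which is the two-domain case.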

The main problem in understanding the chain rule is that there is a difference between homology and detecting homology.  There are many more homologues in the data banks than can be detected using BLAST or randomization approaches. 
!!! The failure to demonstrate homology between two sequences (E > 10^-4) does not constitute a demonstration of non-homology !!! It just means that you were not able to prove the homology.

 
 

Is there a difference between kingdoms and domains?

 
Yes.  Archaea, Bacteria and Eucarya (formerly known as Archaebacteria, Eubacteria and Eukaryotes) are the current names for the three domains (formerly known as Ur-kingdoms) of life.  Each of these domains consists of several kingdoms.  The currently accepted kingdoms within the Archaea are the Cren- and the Euryarchaeota.  (The former are also called Eocytes by some.)
 
 

“Aeropyrum and Archaeoglobus, which was thought to be related, but there is no ORF in Aeropyrum shared by Archaeoglobus.”

 
This cannot be.  Using the biogadgets site’s comparative inventories and an E-value cutoff of < 10^-60, there are 191 matches.
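A count like this can also be reproduced locally.  The sketch below is not the biogadgets tool; it assumes you have run BLAST yourself with one genome's ORFs as queries against the other genome and saved the standard 12-column tabular output, in which the first column is the query ID and the eleventh column is the E-value.  The function name and the example rows in the usage note are made up for illustration.

```python
import csv
import io

def count_queries_below_cutoff(tabular_text, cutoff=1e-60):
    """Count distinct query ORFs with at least one hit whose E-value is below cutoff.

    Expects tab-separated BLAST tabular output (12 standard columns);
    blank lines and comment lines starting with '#' are skipped.
    """
    queries = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if float(row[10]) < cutoff:  # column 11 holds the E-value
            queries.add(row[0])      # column 1 holds the query ID
    return len(queries)
```

Each query ORF is counted once no matter how many hits it has, so the result is the number of ORFs in one genome with a strong match in the other, not the number of alignments.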