SALVE FOUNDATION HOME  
Phylogenetic trees and evolutionary comparison  
Page: Data banks - NCBI - main page    
 
     
   
THIS PAGE discusses the main features of biological databases. We visit NCBI (National Center for Biotechnology Information) as an example of a data bank. NCBI offers detailed help pages and even tutorials, so here we only help you get started. We want you to understand the philosophy of searching and some of the possibilities at NCBI that are relevant to tree building, and we point out a few technical tips to make life easier.  
You know all this? Go to "Alignment".
     

How can DNA sequences be obtained?

For some questions, sequencing a particular gene cannot be avoided. The good news is that once a gene is sequenced, it is normally uploaded to a public database. In fact, most journals require authors to deposit their sequences before accepting their manuscripts for publication. This way anyone reading the article can check the original data; moreover, such sequences can be used for other types of analysis.

Data banks:

The three major nucleotide databases are available online:

  • GenBank (maintained by the National Center for Biotechnology Information),
  • EMBL (European Molecular Biology Laboratory) and
  • DDBJ (DNA Data Bank of Japan)

Each database contains the same data, because they exchange it regularly while you sleep. Data banks obtain sequences directly from research laboratories, companies, the scientific literature and patents, and they are continuously updated. You can not only download sequences for free, but also use many other useful tools and information in the database. In this course we shall visit GenBank as an example.

 

A short visit to NCBI

Most databanks actually consist of several interconnected databases. Creating such interconnected databases as well as building efficient search software is quite complicated. Using them, however, is not too difficult.

The starting page of NCBI (National Center for Biotechnology Information) gives you several options. The options are arranged in a practical rather than a logical way. In the deep blue navigation bar, links to search options and databases are mixed up, to confuse the enemy. However, since any search you do leads to a specific home page with an enlarged navigation bar, at this point we discuss only the search options, and only briefly.

Basically there are two search options, BLAST and Entrez.

The BLAST retrieval system finds sequences according to similarity. If you have a nucleotide or amino acid sequence, you can use BLAST to search for and retrieve all sequences similar to yours. Similarity, of course, can be defined in many ways, which we will discuss later. The only thing you should remember about BLAST at this point is that if you need a similarity search, hit BLAST in the navigation bar.
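
To make "similarity can be defined in many ways" a little more concrete, here is a minimal sketch of one very simple definition: the percentage of identical positions between two sequences of equal length. This is only an illustration and is not how BLAST scores similarity; the function name percent_identity and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring upper/lower case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical 10-base sequences that differ only at the last position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

A definition like this assumes the sequences are already lined up position by position, which is one reason other definitions of similarity exist.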

All the other data sets are searched with words, by the so-called Entrez system. Word search is simple.

For example, if you are interested in finding out whether Neandertal man was among your ancestors, you simply type "Neandertal" into the SEARCH bar and hit GO. The default search option is GenBank, so you get back references to sequences that contain the word "Neandertal" in some field.
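
The same kind of word search can also be expressed as a web address. The sketch below is an assumption that goes beyond this page: it uses NCBI's E-utilities "esearch" service rather than the SEARCH bar, and it assumes "nuccore" as the name of the nucleotide database. The helper name build_search_url is hypothetical. The code only builds the address; it does not send any request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities "esearch" service (assumed here;
# the page itself only describes the SEARCH bar on the web site).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(term: str, db: str = "nuccore", retmax: int = 20) -> str:
    """Return an esearch address for a word search; nothing is sent."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("Neandertal"))
```

Opening the printed address in a browser should return a list of matching record identifiers, although the counts will differ from those quoted on this page.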

At present, 4 sequences are related to Neandertal, that is, to the species Homo sapiens neanderthalensis. Before we go on searching for sequences, we shall spend some time on a few details of Entrez search.

PRACTICE: Now, please open the NCBI homepage and search GenBank (the default window) for NEANDERTAL. Results will be shown in the "Entrez Nucleotide" window. Now type the word with a different spelling, "neandertHal". At this point I got only a single sequence. Unfortunately, spelling can be critical when searching with words. When in doubt, it is useful to truncate the word with an asterisk, such as neandert*. I got 5 hits.
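
Why truncation helps can be shown with a small local sketch. It uses Python's fnmatch module to imitate an asterisk at the end of a word; Entrez's own matching rules may differ in detail, and the helper name matches and the word list are only for illustration.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, word: str) -> bool:
    """Case-insensitive match where '*' stands for any run of characters."""
    return fnmatchcase(word.lower(), pattern.lower())


if __name__ == "__main__":
    # Spellings that may occur in sequence records.
    spellings = ["Neandertal", "Neanderthal", "neanderthalensis"]
    for pattern in ("neandertal", "neanderthal", "neandert*"):
        found = [w for w in spellings if matches(pattern, w)]
        print(pattern, "->", found)
```

Each exact spelling finds only itself, while the truncated form neandert* finds all three spellings.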

While word search is easy, results can be quite frustrating. Fortunately, there are some ways to focus your search. The possibilities differ among the various databases, so we show them separately.

   
Next comes a brief description of NCBI databases and Entrez search in general.
     

 

   
     
     
     
     
     

 

 

     
Page written by: Anikó Schrott and Peter Kabai  
Edited by: Peter Kabai  
     
written: 2001-05-04, modified.: 2001-05-04