
BIRLA INSTITUTE OF SCIENTIFIC RESEARCH, JAIPUR

Summer Training Exercises



Day 3


Section - 1

Sequence and Genome Analysis

Various Genome Databases:

NCBI Genome Browser
http://www.ncbi.nlm.nih.gov/genome

Ensembl Genome Browser
http://www.ensembl.org/index.html

UCSC Genome Browser
http://hgdownload.cse.ucsc.edu/downloads.html

ORF Finder

The ORF Finder (Open Reading Frame Finder) is a graphical analysis tool which finds all open reading frames of a selectable minimum size in a user's sequence or in a sequence already in the database.

This tool identifies all open reading frames using the standard or alternative genetic codes.

The deduced amino acid sequence can be saved in various formats and searched against the sequence database using the WWW BLAST server. ORF Finder is helpful in preparing complete and accurate sequence submissions, and it is also packaged with the Sequin sequence submission software.

Steps:

  1. Open the ORF Finder webpage: http://www.ncbi.nlm.nih.gov/gorf/gorf.html

  2. Paste the sequence in FASTA format or provide the accession number of the sequence.

  3. Provide the nucleotide range if needed.

  4. Specify genetic code.

  5. Click on 'Run Orffind' button for the prediction results.
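
To make the idea behind the steps above concrete, here is a minimal Python sketch of an ORF scan. It is not the ORF Finder implementation: it assumes the standard genetic code, ATG as the only start codon, and a hypothetical `min_len` parameter standing in for the selectable minimum size. The function names are illustrative only.

```python
# Minimal ORF scan sketch (assumptions: standard genetic code, ATG-only starts).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30):
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, frame, start, end, orf_sequence) tuples. start and end
    are 0-based, end-exclusive, and relative to the strand being scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3  # include the stop codon
                    if end - start >= min_len:
                        orfs.append((strand, frame, start, end, s[start:end]))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTGGGTAACC", min_len=9):
        print(orf)
```

The web tool offers more than this sketch does, such as alternative genetic codes, so its results may differ from what this code reports.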

GENSCAN

GENSCAN is a program to identify complete gene structures in genomic DNA. It is a GHMM-based program that can be used to predict the location of genes and their exon-intron boundaries in genomic sequences from a variety of organisms. It also predicts peptide sequence of predicted genes of given genomic DNA.

Steps:

  1. Open the GENSCAN webpage: http://argonaute.mit.edu/GENSCAN.html

  2. Select 'Organism': choose the organism set that matches the source organism of the input DNA sequence.

  3. Set the 'Suboptimal exon cutoff' value (0.5 is recommended here).

  4. Choose the 'Print Option': predicted peptides only, or both predicted peptides and CDS.

  5. Browse for or paste the genomic DNA sequence (less than 1 Mb).

  6. Click on 'Run GenScan' button for the prediction results.
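
Before step 5 it can help to check the input locally. The sketch below reads the first record of a FASTA text, reports its length, and flags symbols outside A, C, G, T and N. It assumes that "1 Mb" means one million bases; `MAX_BASES` and the function names are hypothetical and not part of GENSCAN.

```python
MAX_BASES = 1_000_000  # assumption: "1 Mb" read as one million bases


def read_fasta_sequence(text):
    """Return the sequence of the first FASTA record as one uppercase string."""
    seq_lines = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header or seq_lines:
                break  # a second record starts here; ignore the rest
            seen_header = True
            continue
        seq_lines.append(line)
    return "".join(seq_lines).upper()


def check_for_submission(text, max_bases=MAX_BASES):
    """Summarise length and unexpected symbols for the first FASTA record."""
    seq = read_fasta_sequence(text)
    return {
        "length": len(seq),
        "within_limit": len(seq) < max_bases,
        "unexpected_symbols": sorted(set(seq) - set("ACGTN")),
    }


if __name__ == "__main__":
    demo = ">demo\nACGT\nNNRA\n>second\nTTTT\n"
    print(check_for_submission(demo))
```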

Terms used in GENSCAN results:



Gn.Ex : gene number, exon number (for reference)

Type : Init = Initial exon (ATG to 5' splice site)

Intr = Internal exon (3' splice site to 5' splice site)

Term = Terminal exon (3' splice site to stop codon)

Sngl = Single-exon gene (ATG to stop)

Prom = Promoter (TATA box / initiation site)

PlyA = poly-A signal (consensus: AATAAA)

S : DNA strand (+ = input strand; - = opposite strand)

Begin : beginning of exon or signal (numbered on input strand)

End : end point of exon or signal (numbered on input strand)

Len : length of exon or signal (bp)

Fr : reading frame (a forward strand codon ending at x has frame x mod 3)

Ph : net phase of exon (exon length modulo 3)

I/Ac : initiation signal or 3' splice site score (tenth bit units)

Do/T : 5' splice site or termination signal score (tenth bit units)

CodRg : coding region score (tenth bit units)

P : probability of exon (sum over all parses containing exon)

Tscr : exon score (depends on length, I/Ac, Do/T and CodRg scores)
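
As a rough illustration of how the columns above relate to each other, the following Python sketch parses one exon row into named fields and checks that Len and Ph agree with Begin and End. The rows are hypothetical, not real GENSCAN output. The sketch also assumes whitespace-separated rows with all 13 columns filled, so promoter and poly-A rows are not handled, and it takes the length as the absolute difference of Begin and End plus one so that either coordinate order works.

```python
# Column names taken from the legend above.
FIELDS = ["Gn.Ex", "Type", "S", "Begin", "End", "Len", "Fr", "Ph",
          "I/Ac", "Do/T", "CodRg", "P", "Tscr"]
EXON_TYPES = {"Init", "Intr", "Term", "Sngl"}
INT_FIELDS = ("Begin", "End", "Len", "Fr", "Ph", "I/Ac", "Do/T", "CodRg")
FLOAT_FIELDS = ("P", "Tscr")


def parse_exon_row(line):
    """Split one exon row into a dict keyed by the legend's column names."""
    parts = line.split()
    if len(parts) != len(FIELDS) or parts[1] not in EXON_TYPES:
        raise ValueError("not a complete exon row: %r" % line)
    row = dict(zip(FIELDS, parts))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    return row


def is_consistent(row):
    """Check Len against Begin/End and Ph against Len modulo 3."""
    length = abs(row["End"] - row["Begin"]) + 1
    return row["Len"] == length and row["Ph"] == length % 3


if __name__ == "__main__":
    # Hypothetical rows, made up for illustration only.
    for text in ("1.01 Init + 1201 1350 150 0 0 88 95 160 0.912 14.20",
                 "1.02 Intr - 5400 5300 101 1 2 70 60 90 0.500 5.10"):
        row = parse_exon_row(text)
        print(row["Gn.Ex"], row["Type"], row["Len"], is_consistent(row))
```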

BISR Primer Machine

Steps:

  1. Open the BISR Primer webpage: http://bioinfo.bisr.res.in/cgi-bin/project/primer/index.cgi

  2. Paste the sequence in FASTA format or provide the accession number of the sequence.

  3. Provide the nucleotide range if required.

  4. To try the tool out, you can use the provided test file instead.

  5. Click the 'Proceed' button to get the prediction results.

  6. The results list nucleotide positions in the given sequence, ordered by rank.

Exercise on EMBOSS

https://www.ebi.ac.uk/Tools/emboss/
