Summer Training Exercises
Section - 1: Databases and tools in CADD
Various Databases:
Protein Data Bank
Repository for 3D structural data of biological macromolecules. The structures are determined by techniques such as X-ray crystallography, electron microscopy, and NMR spectroscopy.
The data are freely accessible on the internet, e.g. at www.rcsb.org.
Key resource for structural biology: most journals require experimentally determined structures to be deposited in the PDB before publication.
Protein Data Bank (PDB)
http://www.rcsb.org/pdb/home/home.do
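Entries can also be retrieved from a script instead of the browser. The following is a minimal sketch, assuming the RCSB file-server URL pattern https://files.rcsb.org/download/<ID>.pdb; the entry ID 1CRN and the function names are illustrative choices and are not part of the original exercise.

```python
import re
import urllib.request

# Assumed URL pattern of the RCSB file server; verify against the RCSB documentation.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_download_url(pdb_id: str) -> str:
    """Return the download URL for a classic four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB IDs are a digit followed by three alphanumeric characters.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id: str) -> str:
    """Download the entry and return its text (requires network access)."""
    with urllib.request.urlopen(pdb_download_url(pdb_id), timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_pdb("1CRN") to download.
    print(pdb_download_url("1crn"))
```

The URL builder is kept separate from the download so the ID check can be used without network access.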
PubChem
PubChem, released in 2004, provides information on the biological activities of small molecules.
PubChem is organized as three linked databases within the NCBI's Entrez information retrieval system.
These are PubChem Substance, PubChem Compound, and PubChem BioAssay.
PubChem also provides a fast chemical structure similarity search tool.
PubChem
https://pubchem.ncbi.nlm.nih.gov
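PubChem records can also be queried through its PUG REST web interface. The sketch below only illustrates the idea: the URL layout and the property name "CanonicalSMILES" are assumptions (PubChem has changed its SMILES property names over time), and the function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed base URL of the PubChem PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_lookup_url(compound_name: str, prop: str = "CanonicalSMILES") -> str:
    """Build a PUG REST URL that asks for one property of a named compound as plain text."""
    # Percent-encode the name so spaces and slashes cannot break the URL path.
    name = urllib.parse.quote(compound_name.strip(), safe="")
    return f"{PUG_REST_BASE}/compound/name/{name}/property/{prop}/TXT"


def fetch_smiles(compound_name: str) -> str:
    """Query PubChem and return the first line of the reply (requires network access)."""
    with urllib.request.urlopen(smiles_lookup_url(compound_name), timeout=30) as response:
        return response.read().decode("utf-8").splitlines()[0].strip()


if __name__ == "__main__":
    # Prints the query URL only; fetch_smiles("aspirin") would perform the request.
    print(smiles_lookup_url("aspirin"))
```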
DrugBank Version 5.0
DrugBank Version 5.0
https://www.drugbank.ca/
Section - 2: Chemical Sketching - Molinspiration
Introduction
Biologically active compounds can act as drugs, and this software helps with sketching molecules and
predicting their properties. The Simplified Molecular Input Line Entry Specification (SMILES) is a line
notation for molecules. SMILES strings include connectivity but do not include 2D or 3D coordinates.
Hydrogen atoms are usually implicit and are not written out. Other atoms are represented by their element
symbols, e.g. B, C, N, O, F, P, S, Cl, Br, and I. The symbol "=" represents a double bond and "#" represents
a triple bond. Branching is indicated by parentheses (). Ring closures are indicated by matching pairs of
digits, and aromatic atoms may be written in lowercase.
Name | Formula | SMILES String |
Methane | CH4 | C |
Ethanol | C2H6O | CCO |
Benzene | C6H6 | C1=CC=CC=C1 or c1ccccc1 |
Ethylene | C2H4 | C=C |
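The notation rules above can be checked with a few lines of code. This is a rough sketch, not a full SMILES parser: it only recognises the element symbols listed in the introduction, their lowercase aromatic forms, and bracket atoms, and it ignores two-digit ring closures written with "%". The function names are hypothetical.

```python
import re

# Bracket atoms first, then two-letter symbols, so "Cl" is not read as "C" plus "l".
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOFPSI]|[bcnops]")
BRACKET_PATTERN = re.compile(r"\[[^\]]+\]")


def count_heavy_atoms(smiles: str) -> int:
    """Count atom tokens; a bracket atom such as [H] is also counted as one atom."""
    return len(ATOM_PATTERN.findall(smiles))


def ring_closures_balanced(smiles: str) -> bool:
    """Check that every ring-closure digit is opened and closed again."""
    # Drop bracket atoms so digits inside them (isotopes, charges) are ignored.
    stripped = BRACKET_PATTERN.sub("", smiles)
    open_rings = set()
    for char in stripped:
        if char.isdigit():
            # First occurrence opens the ring, second occurrence closes it.
            open_rings ^= {char}
    return not open_rings


if __name__ == "__main__":
    for name, smiles in [("Methane", "C"), ("Ethanol", "CCO"),
                         ("Benzene", "c1ccccc1"), ("Ethylene", "C=C")]:
        print(name, count_heavy_atoms(smiles), ring_closures_balanced(smiles))
```

Because hydrogens are implicit, the count covers only the non-hydrogen atoms written in the string.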
STEPS:
- Open the site
http://www.molinspiration.com - click on "free on-line cheminformatics services"
- Use the | icon to draw a single bond and the || icon to draw a double bond
- The red cross is the eraser
- Sketch the molecule aspirin.
- To obtain its SMILES string instead, open the site
http://www.ncbi.nlm.nih.gov/sites/entrez?db=pccompound - search for aspirin
- Take the SMILES string from the "Canonical SMILES" entry in PubChem
- Paste the SMILES string CC(=O)OC1=CC=CC=C1C(=O)O into the "paste SMILES here" field.
- Click on "Calculate Properties" and "Predict Bioactivity"
- You will get the output for the predicted bioactivity
- You will get the output for the calculated properties.
GPCR ligand | -0.66 |
Ion channel modulator | -0.91 |
Kinase inhibitor | -0.49 |
Nuclear receptor ligand | -1.23 |
miLogP | 1.434 |
TPSA | 63.604 |
natoms | 13 |
MW | 180.159 |
nON | 4 |
nOHNH | 1 |
nviolations | 0 |
nrotb | 3 |
volume | 155.574 |
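The nviolations value can be understood as a count of Lipinski "Rule of 5" violations. The sketch below recomputes that count from the values listed above, assuming the usual thresholds (logP at most 5, molecular weight at most 500, at most 10 hydrogen-bond acceptors, at most 5 hydrogen-bond donors) and assuming that nON and nOHNH stand for the acceptor and donor counts. The function name is hypothetical.

```python
def rule_of_five_violations(logp: float, mw: float, n_acceptors: int, n_donors: int) -> int:
    """Count how many of the four Rule of 5 limits are exceeded."""
    checks = [
        logp > 5,          # lipophilicity limit
        mw > 500,          # molecular weight limit
        n_acceptors > 10,  # hydrogen-bond acceptors (N and O atoms)
        n_donors > 5,      # hydrogen-bond donors (OH and NH groups)
    ]
    return sum(checks)


# Values copied from the aspirin output above.
aspirin = {"miLogP": 1.434, "MW": 180.159, "nON": 4, "nOHNH": 1}

if __name__ == "__main__":
    print(rule_of_five_violations(aspirin["miLogP"], aspirin["MW"],
                                  aspirin["nON"], aspirin["nOHNH"]))
```

For the aspirin values none of the four limits is exceeded, which agrees with the reported nviolations of 0.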
Section - 3: Protein Structure Prediction: Homology Modeling
Homology Modeling using SWISS-MODEL
Steps:
- Take the protein sequence, in FASTA format, whose structure is to be modelled. This is our 'Target Sequence'.
- Go to the webpage http://swissmodel.expasy.org/
- Paste the target sequence into the SWISS-MODEL workspace in FASTA format
(you can also upload the target sequence).
- You can provide a project title and an email ID.
- You now have 2 options: you can either search for templates or build models directly.
- The results appear after a while.
- The detailed results can also be downloaded.
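Before pasting a target sequence into the workspace, it can help to check that it really is valid FASTA. The following is a minimal sketch of such a check; the example record is a made-up sequence used only for illustration, and the function names are hypothetical.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text: str) -> dict:
    """Parse FASTA text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate header: {header!r}")
            records[header] = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            records[header].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}


def invalid_residues(sequence: str) -> list:
    """Return the characters that are not one of the 20 standard amino-acid letters."""
    return sorted(set(sequence.upper()) - STANDARD_AMINO_ACIDS)


if __name__ == "__main__":
    # Hypothetical target sequence, not taken from the exercise.
    example = ">target_protein hypothetical example\nMKTAYIAKQR\nQISFVKSHFS\n"
    for name, seq in parse_fasta(example).items():
        print(name, len(seq), invalid_residues(seq))
```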
Section - 4: Protein Structure Validation
Structure Analysis and Verification using the SAVES Server
Steps:
- Open the SAVES server webpage: http://services.mbi.ucla.edu/SAVES/
- Upload the PDB file.
- Run all programs.
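Before uploading a model, a quick sanity check of the PDB file can catch empty or truncated files. The sketch below counts residues per chain from ATOM records, assuming the standard fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). The helper make_atom_line and the demo coordinates are hypothetical and exist only to build a small test input.

```python
from collections import defaultdict


def residues_per_chain(pdb_text: str) -> dict:
    """Count distinct residues per chain using the ATOM records of a PDB file."""
    seen = defaultdict(set)
    for line in pdb_text.splitlines():
        if line.startswith("ATOM  ") and len(line) >= 26:
            chain_id = line[21]                # column 22
            res_number = line[22:26].strip()   # columns 23-26
            insertion_code = line[26:27]       # column 27, may be blank
            seen[chain_id].add((res_number, insertion_code))
    return {chain: len(residues) for chain, residues in seen.items()}


def make_atom_line(serial, name, resname, chain, resseq, x, y, z) -> str:
    """Hypothetical helper: format one ATOM record in fixed columns for the demo."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    demo = "\n".join([
        make_atom_line(1, "N", "ALA", "A", 1, 0.0, 0.0, 0.0),
        make_atom_line(2, "CA", "ALA", "A", 1, 1.5, 0.0, 0.0),
        make_atom_line(3, "N", "GLY", "A", 2, 3.0, 1.0, 0.0),
        make_atom_line(4, "N", "SER", "B", 1, 9.0, 9.0, 9.0),
    ])
    print(residues_per_chain(demo))
```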
Section - 5: Homology Modeling using Chimera
- Open the target sequence in Chimera
(File -> Open -> select the target file from the file browser)
or download the target sequence online via
File -> Fetch by ID (provide an ID for one of the listed databases)
This step opens the target protein sequence in the 'sequence viewer'.
- To view secondary structures in this target sequence
go to (in 'sequence viewer')
Structure -> Secondary Structure -> Show Actual / Show Predicted
- To show/hide these structures in 'sequence viewer'
go to (in 'sequence viewer')
Info -> Region Browser
- Template Sequence Search
To model the structure of the target protein, we first search for a protein of known structure whose sequence is highly similar to the target sequence. For this, BLAST is run and the target sequence is aligned against the sequences of the protein structures in the PDB database. We take a highly similar sequence (preferably from a structure with a low resolution value, i.e. a well-resolved structure) as the 'Template Sequence'.
The process is as follows:
go to (in 'sequence viewer')
- a) Info -> Blast Protein
- b) Select the target protein and click OK
- c) Select the program 'blast' and the database 'pdb' (you can enter the desired E-value and the matrix used for BLAST) and click OK
- View the alignment of the target sequence to the BLAST results
For this, select a protein from the results and click on 'Show in MAV'.
(MAV stands for MultAlign Viewer)
- Loading the Template Structure into Chimera
Select the best-hit template protein from the results above and click on 'Load Structure'.
- Modelling the Structure of the Target Sequence
go to (in 'MultAlign Viewer')
- a) Structure -> Modeller (homology)
- b) Choose the target sequence
- c) Choose the template sequence
- d) Enter a valid MODELLER key (to run Modeller via the web service) or run Modeller locally
- e) Optionally set the 'Advanced Options'
- f) Click OK
Modelling the structure of the target protein on the template takes some time.
- The 'Modeller Result' box opens when the models are built. We can check our models by selecting each model in turn.
- To show/hide the template and models in the viewer
go to (in 'Chimera Viewer')
Favorites -> Model Panel
(use the checkbox next to each name to show/hide structures)
In the SWISS-MODEL workspace, press the 'Build Model' button.
SWISS-MODEL provides the 3 best results according to their scores.
For each result it provides the sequence identity, the alignment, and the structure.
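Sequence identity is the share of aligned positions at which target and template carry the same residue. The sketch below shows one common way to compute it from two already aligned sequences, skipping gap positions; tools differ in the exact denominator they use, so the value need not match what SWISS-MODEL reports. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent of identical residues over the aligned, gap-free positions."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # positions with a gap in either sequence are not compared
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    print(percent_identity("MKTAYIAK-QR", "MKSAYLAKEQR"))
```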