The determination that the human genome comprises only approximately
35,000 genes - not 60,000 to 100,000 as previously thought - has directed
even more attention to the role of proteins and, therefore, to the field
of structural genomics. One goal of this field is to reveal the structures
of all the key “functional” sites of any human protein, information that
should make it much easier to develop highly specific drugs, thus leading
to more effective, and safer, pharmaceuticals.
ab initio:
From the beginning (Latin)
ab initio modeling: In
silico & molecular
Modeling glossary
ab initio
protein modeling:
Predict
3D structure from sequence without using a homologous model/ template; this
technology is not at the stage of being broadly applicable to drug discovery.
[CHI Structural
proteomics report]
Ab initio
methods use the physicochemical properties of the amino acid sequence of
a protein to literally calculate a 3D structure (lowest energy model) based
on protein folding. As opposed to determining the structure of an entire
protein, ab initio methods are typically used to predict and model
protein folds (domains). This method is gaining ground considerably, in part due
to the development of novel mathematical approaches, a boost in available
computational resources (for example, tera- and petaFLOPS supercomputers),
and considerable interest from researchers investigating protein-ligand
(or drug) interactions. [Christopher Smith "Bioinformatics,
Genomics, and Proteomics" Scientist 14[23]:26, Nov. 27, 2000] http://the-scientist.com/yr2000/nov/profile_001127.html
Related
terms
protein structure prediction
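The idea of calculating a lowest energy model from sequence alone can be illustrated with a toy. The sketch below is not a real ab initio method: it assumes the simplified 2D "HP" lattice model (each residue is either hydrophobic, H, or polar, P), scores a conformation as -1 for every pair of H residues that touch on the lattice without being chain neighbours, and enumerates every self-avoiding chain exhaustively. The function names and the scoring rule are illustrative assumptions, and the approach is only practical for very short chains.

```python
# Toy 2D HP lattice model (an illustrative assumption, not a production method).
STEPS = ((0, 1), (0, -1), (-1, 0), (1, 0))


def energy(seq, coords):
    """-1 for each non-consecutive H-H pair on adjacent lattice sites."""
    index_at = {c: i for i, c in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if seq[i] != "H":
            continue
        # Look only right and up so every adjacent pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                total -= 1
    return total


def fold(seq):
    """Return (lowest energy, coordinates) over all self-avoiding walks."""
    if not seq:
        raise ValueError("sequence must not be empty")
    best = [1, None]  # any real conformation has energy <= 0

    def extend(coords):
        if len(coords) == len(seq):
            e = energy(seq, coords)
            if e < best[0]:
                best[0], best[1] = e, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in coords:  # keep the chain self-avoiding
                coords.append(nxt)
                extend(coords)
                coords.pop()

    extend([(0, 0)])
    return best[0], best[1]


if __name__ == "__main__":
    e, coords = fold("HPPH")
    print(e, coords)
```

Exhaustive enumeration grows exponentially with chain length, which is one reason practical methods rely on the novel mathematical approaches and large computational resources mentioned above.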
ab initio
structure prediction:
Prediction of
a protein’s structure based on amino acid sequence alone — that is, without
mapping the structure to structures of known sequences.
Broader term: protein structure prediction (ab initio structure
prediction is the narrower term).
atomic resolution data: NMR & X-ray
crystallography
biological function: Functional
genomics glossary
CASP:
Critical Assessment of Techniques for Protein Structure
Prediction [Protein Structure Prediction Center, Lawrence Livermore National
Lab, US] http://predictioncenter.llnl.gov/
Links to CASP meeting results and information on the "Ten most wanted"
proteins solicitation.
comparative modeling: See homology modeling.
evolutionary homology: Functional genomics glossary
fold alignment:
A critical step in homology modeling,
because it provides the key structures for the model. If suitably
matched folds cannot be identified, a type of fold assignment known as
protein threading can be used.
fold recognition:
Methods of protein fold recognition attempt
to detect similarities between protein 3D structures that are not accompanied
by any significant sequence similarity. There are many approaches, but
the unifying theme is to try to find folds that are compatible with a
particular sequence. Unlike sequence-only comparison, these methods take
advantage of the extra information made available by 3D structure. In effect,
they turn the protein folding problem on its head: rather than predicting
how a sequence will fold, they predict how well a fold will fit a sequence.
Robert B. Russell, Guide to Structure Prediction "Fold recognition
methods and links" Sept. 1999 http://www.bmm.icnet.uk/people/rob/CCP11BBS/foldrec.html
Related terms: threading; Protein
structure glossary: protein folding, protein folds
foldedness:
Methods for analyzing "foldedness" of expressed
proteins include NMR and circular dichroism spectroscopies.
granularity: Computers & computing glossary
Hidden Markov Models (HMM): In
silico & Molecular
modeling glossary
homology model: A model of a protein, whose three-dimensional
structure is unknown, built from, e.g., the X-ray coordinate data of similar
proteins or using alignment techniques and homology arguments.
[IUPAC Computational] Related terms:
Functional
genomics glossary homology; Sequencing
glossary alignment
homology modeling:
This procedure, also termed comparative modeling or
knowledge-based modeling, develops a three-dimensional model from a protein
sequence based on the structures of homologous proteins. ... Care must be used
in applying the term, "homology modeling." In fact, as noted above
some authors prefer alternative names for the procedure. One must recognize that
homology does not necessarily imply similarity. Homology has a precise
definition: having a common evolutionary origin [6,7]. Thus, homology is
a qualitative description of the nature of the relationship between two or more
things, and it cannot be partial. Either there is an evolutionary relationship
or there is not. An assertion of homology usually must remain an hypothesis.
Supporting data for a homologous relationship may include sequence or
three-dimensional similarities, the relationships between which can be described
in quantitative terms. David R. Bevan, Molecular Modeling of Proteins and
Nucleic Acids, Dept. of Biochemistry, Virginia Tech,
1997-2003 http://www.biochem.vt.edu/modeling/homology.html
A computational method for determining the
structure of a protein based on its similarity to known structures. The accuracy
of structures determined by homology modeling depends largely on the degree of
sequence similarity between the unknown and the known protein sequence.
The most successful tool for prediction of
protein structure from sequence, but with significant room for improvement.
CMBI Homology Modelling Course
http://www.cmbi.kun.nl/gvteach/hommod/index.shtml
Center for Molecular and Biomolecular Informatics, Univ. of Nijmegen,
Netherlands, 2001. Dictionary http://www.cmbi.kun.nl/gvteach/dictionary.shtml
45 definitions.
Related terms: structural homology; Sequencing
glossary sequence homology; Proteins glossary hypothetical
protein; In silico & Molecular
Modeling Compare with similarity
NIGMS National Institute of General Medical Sciences:
Part of
NIH, supports biomedical research not targeted to specific diseases or
disorders. Divisions of Cell Biology and Biophysics; Genetics and Developmental
Biology; and Pharmacology, Physiology, and Biological Chemistry support
research http://www.nigms.nih.gov/
NIGMS Structural Genomics Initiatives
http://www.nigms.nih.gov/funding/psi.html
pharmacophore:
Pharmaceutical
biology glossary
protein folding problem: See protein structure
prediction
protein informatics: Proteomics
glossary
protein production:
A major bottleneck and challenge in structural
genomics.
protein sequence space: [J.] Maynard-Smith's (1970. Natural Selection and the concept of a protein space. Nature 225: 563-564) concept of a "protein
sequence space" in which each site in an alignment is represented on its own axis and the number
of axes required to represent all conceivable variants for a protein is equal to the number of sites
in its sequence. Each sequence occupies a unique point in this space; variants differing at one site
are adjacent (Hamming) neighbours. The collection of all viable sequence variants for a
particular protein forms a localized interconnected "neighbourhood" of points within the space.
This representation has proved conceptually intuitive and analytically powerful
...
In protein sequence space, constraints are reflected in the multidimensional shape of the
cluster of points that make up the "neighbourhood" of variants viable for a specific protein. The
boundary defining the edge of this neighbourhood is characteristic of the protein's function and
can be thought of as its functional "signature". [Gavin JP Naylor,
"Measuring Shifts In Function and Evolutionary Opportunity Using
Variability Profiles: A Case Study of the Globins" also Journal of
Molecular Evolution 51 (3): 223-233 Sept. 2000] http://bioinfo.mbb.yale.edu/e-print/protspace-jme/text.pdf
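The notions of Hamming neighbours and an interconnected neighbourhood can be made concrete with a short sketch. It assumes equal-length, already aligned sequences, and the function names and example variants are hypothetical; it only checks whether a set of variants can be traversed by single-site changes without leaving the set.

```python
def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of sites")
    return sum(x != y for x, y in zip(a, b))


def is_interconnected(variants):
    """True if every variant is reachable from every other one through
    steps between Hamming neighbours (distance 1) that stay in the set."""
    variants = list(dict.fromkeys(variants))  # drop duplicates, keep order
    if not variants:
        return True
    seen = {variants[0]}
    stack = [variants[0]]
    while stack:
        current = stack.pop()
        for other in variants:
            if other not in seen and hamming(current, other) == 1:
                seen.add(other)
                stack.append(other)
    return len(seen) == len(variants)


if __name__ == "__main__":
    viable = ["ACDE", "ACDF", "ACEF", "GCEF"]  # hypothetical variants
    print(hamming("ACDE", "ACEF"))             # differs at two sites
    print(is_interconnected(viable))
    print(is_interconnected(viable + ["WWWW"]))
```

A four-site sequence lives in a four-axis space here; real proteins need one axis per aligned site, so the same check applies but over far more dimensions.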
Protein Structure Initiative:
Aims at determination of the 3D
structure of all proteins. This aim can be achieved in four steps: Organize
known protein sequences into families; Select family representatives
as targets; Solve the 3D structure of targets by X-ray crystallography
or NMR spectroscopy; Build models for other proteins by homology to solved
3D structures. http://www.structuralgenomics.org/
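The four steps can be sketched as a small planning routine. This is a simplified illustration and not the initiative's actual pipeline: it assumes equal-length, pre-aligned sequences, groups them greedily by fraction of identical residues against a caller-chosen threshold, and treats the first member of each family as its representative. All names and the example sequences are hypothetical.

```python
def fraction_identical(a, b):
    """Fraction of identical positions (assumes pre-aligned, equal length)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_into_families(sequences, threshold):
    """Steps 1 and 2: greedy grouping into families. Each family is a list
    of names whose first entry is the representative chosen as target."""
    families = []
    for name, seq in sequences.items():
        for family in families:
            representative = sequences[family[0]]
            if fraction_identical(representative, seq) >= threshold:
                family.append(name)
                break
        else:
            families.append([name])
    return families


def plan_structures(families):
    """Steps 3 and 4: map each target (to be solved experimentally) to the
    family members whose models would be built by homology to it."""
    return {family[0]: family[1:] for family in families}


if __name__ == "__main__":
    sequences = {              # hypothetical example data
        "p1": "MKTAYIAK",
        "p2": "MKTAYLAK",
        "p3": "GGSWWPND",
    }
    families = group_into_families(sequences, threshold=0.5)
    print(plan_structures(families))
```

The threshold and the choice of representative are the design decisions that matter most in practice; the values used here are placeholders only.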
protein structure prediction:
Methods for
protein structure prediction have matured to the point where models produced by
prediction algorithms can be used to understand and test hypotheses about
biological function. The goal of this community wide effort is to provide
structural and functional insights into biologically important proteins,
particularly those that are intractable to experimental structural
determination. Ten Most Wanted, Critical Assessment of Techniques for Protein
Structure Prediction, CASP, Lawrence Livermore National Lab, US http://predictioncenter.llnl.gov/
Involves primary sequence alignment,
secondary and tertiary structure prediction and homology modelling.
Protein 3D structures are encoded
by a linear sequence of amino acid residues. To predict 3D structure from
sequence is a task challenging enough to have occupied a generation of
researchers. Have we finally succeeded? The bad news is: we still cannot
predict structure for any sequence. The good news is: we have come closer,
and growing databases facilitate the task. A solution of the structure
prediction problem would supposedly change experimental molecular biology
more than any other theoretical method. We may witness such a breakthrough
in the near future. However, the lessons from the Asilomar prediction contests
were that we may need a common framework to coordinate the efforts of
the researchers in the field. "Neural networks for protein structure prediction:
hype or hit?" Burkhard Rost, Dec. 1999 http://www.embl-heidelberg.de/~rost/Papers/pre1999_tics/paper.html
Narrower term: ab initio
protein structure prediction Related terms: In
silico & Molecular
Modeling glossary
protein structure, primary, secondary, tertiary
and quaternary:
Protein Structure glossary.
protein threading:
See threading.
RNA structural genomics:
Structural genomics, the systematic determination of all
macromolecular structures represented in a genome, is focused at present
exclusively on proteins. It is clear, however, that RNA molecules play a variety
of significant roles in cells, including protein synthesis and targeting,
many forms of RNA processing and splicing, RNA editing and modification,
and chromosome end maintenance. To comprehensively understand the biology of a cell, it will ultimately be necessary to know the identity of all encoded RNAs,
the molecules with which they interact and the molecular structures of these
complexes. This report focuses on the feasibility of structural genomics of RNA,
approaches to determining RNA structures and the potential usefulness of an RNA
structural database for both predicting folds and deciphering biological
functions of RNA molecules. [Jennifer A. Doudna "Structural Genomics of
RNA" Nature Structural Biology 7 (11) supp: 954-956 (Nov. 2000)] http://www.euchromatin.org/Doudna1.htm
signal transduction: Metabolic
engineering glossary
similarity:
Quantity that indicates, for example, the percentage of
identical amino acids between two sequences. Similarity is an observed quantity
that might, for example, be expressed as the percentage of residues that are similar
between two aligned sequences. Similarity is a bad measure, because it is
subjective: the author of the software decides whether Gln and Asp are similar
or not. The percentage identity is a much better measure. There
is an important difference between similarity and homology. Similarity is a
value between 0.0 and 1.0, or between 0 and 100%. On the other hand, there are
no degrees of homology. The sequences are either homologous or not. Center for Molecular and Biomolecular Informatics,
Dictionary, Univ. of Nijmegen,
Netherlands, 2001 http://www.cmbi.kun.nl/gvteach/dictionary.shtml
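The difference between percentage identity and percentage similarity can be shown with a minimal sketch. It assumes two sequences that are already aligned, with "-" marking gaps, and skips any column containing a gap. The residue groups used for "similar" are a hypothetical choice made only for illustration, which is exactly the subjectivity the definition warns about; identity needs no such choice.

```python
# Hypothetical grouping of "similar" residues; another author could
# reasonably draw these groups differently.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"),
                  set("DE"), set("NQ"), set("ST")]


def _compared_columns(aligned_a, aligned_b, gap):
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [(a, b) for a, b in zip(aligned_a, aligned_b)
            if a != gap and b != gap]


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of gap-free columns holding identical residues."""
    columns = _compared_columns(aligned_a, aligned_b, gap)
    if not columns:
        return 0.0
    return 100.0 * sum(a == b for a, b in columns) / len(columns)


def percent_similarity(aligned_a, aligned_b, groups=SIMILAR_GROUPS, gap="-"):
    """Percentage of gap-free columns whose residues are identical or fall
    in the same group; the result depends on the chosen groups."""
    columns = _compared_columns(aligned_a, aligned_b, gap)
    if not columns:
        return 0.0

    def similar(a, b):
        return a == b or any(a in g and b in g for g in groups)

    return 100.0 * sum(similar(a, b) for a, b in columns) / len(columns)


if __name__ == "__main__":
    a = "MKT-AYIAKQR"
    b = "MKSLAYLAEQR"
    print(percent_identity(a, b))    # 7 of 10 compared columns: 70.0
    print(percent_similarity(a, b))  # 9 of 10 with these groups: 90.0
```

Changing the groups changes the similarity figure while the identity figure stays fixed, and neither number by itself establishes homology.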
structural bioinformatics:
Involves the process of determining
a protein's three-dimensional structure using comparative primary sequence
alignment,
secondary and tertiary structure prediction methods, homology modeling,
and crystallographic diffraction pattern analyses. Currently, there is
no reliable de novo predictive method for protein 3D-structure determination.
Over the past half-century, protein structure has been determined by purifying
a protein, crystallizing it, then bombarding it with X-rays. The X-ray
diffraction pattern from the bombardment is recorded electronically and
analyzed using software that creates a rough draft of the 3D structure.
Biological scientists and crystallographers then tweak and manipulate the
rough draft considerably. The resulting spatial coordinate
file can be examined using modeling-structure software to study the gross
and subtle features of the protein's structure. Christopher Smith "Bioinformatics,
Genomics, and Proteomics" Scientist 14[23]:26, Nov. 27, 2000 http://the-scientist.com/yr2000/nov/profile_001127.html
Related terms Algorithms,
In silico & Molecular
Modeling.
Structural Biology Industrial Platform:
Fifteen companies, including
representatives of some of Europe's largest pharmaceutical industries,
have formed the Structural Biology Industrial Platform to work with each
other, the European Commission and Research Centres in Europe to promote
structural biology research, training and development. http://www.sbip.org/
structural genomics:
Focuses on the physical aspects of the genome through the
construction and comparison of gene maps and sequences, as well as gene
discovery, localization, and characterization. Brush up on your 'omics, Chemical
& Engineering News, 81(49): 20, Dec. 2003 http://pubs.acs.org/cen/coverstory/8149/8149genomics1.html
The fast-developing
fields of structural and functional genomics -- studies of proteins encoded by
the entire genome -- are being brought to bear on the problem of understanding
the root of many cancers. A protein's structure can tell researchers much about
its function, information that ultimately is needed to understand a protein's
link to cancer. By determining the detailed, three- dimensional structure of
proteins, researchers are better able to understand how each protein functions
normally and how faulty protein structures can cause disease. David Brand,
MacCHESS moves into cancer research through structural genomics, Cornell,
2001 http://www.news.cornell.edu/Chronicle/01/2.22.01/MacCHESS.html
Involves quickly determining the 3D structures of large numbers of
proteins
(or other complex biological molecules, such as nucleic acids), ultimately
accounting for an organism’s entire proteome. Footnote: As traditionally
defined, the term structural genomics referred to the use of sequencing
and mapping technologies, with bioinformatic support, to develop complete
genome maps (genetic, physical, and transcript maps) and to elucidate genomic
sequences for different organisms, particularly humans. Now, however, the
term is increasingly used to refer to high-throughput methods for determining
protein structures.
Many of the criticisms leveled at the Human Genome Project in
the mid-1980s have been redirected toward structural genomics. Unlike
high-throughput genome sequencing, it is not a simple matter to decide
when a structural genomics effort has reached completion. SK Burley et
al “Structural genomics: beyond the Human Genome Project” Nature Genetics
23: 151 Oct. 1999
Related term:
structural proteomics
A good explanation of structural genomics: Joint
Center for Structural Genomics http://www.jcsg.org/help/robohelp/Definitions/Structural_Genomics.htm
Structural genomics project links
Human Proteome/Structural Genomics Pilot Project, Brookhaven National
Laboratory, US http://www.proteome.bnl.gov/
A pilot project to examine the feasibility of high-throughput
determination of 3-dimensional structures of proteins by x-ray crystallography,
starting from genome sequences.
Human Proteomics Initiative, Swiss Institute of Bioinformatics, European
Bioinformatics Institute http://us.expasy.org/sprot/hpi/
A major project to annotate all known human sequences
according to the quality standards of Swiss-Prot. This means providing, for
each known protein, a wealth of information that includes the description of its
function, its domain structure, subcellular location, post-translational
modifications, variants, similarities to other proteins, etc.
An effort to annotate, describe and distribute to the life science community
a large amount of highly curated information concerning human protein sequences.
Structural Genomics Initiative,
NIGMS, US http://www.nigms.nih.gov/funding/psi.html
Structural genomics databases Databases & software
directory.
structural genomics technologies:
NMR
& X-Ray Crystallography
structural homology: Identify
3D structures of proteins or domains in the same family as a sequence of
interest.
Related terms:
homology Functional
genomics glossary homology modeling Molecular
modeling glossary
structural homology
protein:
The degree of 3-dimensional shape similarity between proteins. It
can be an indication of distant AMINO
ACID SEQUENCE HOMOLOGY and used for rational DRUG
DESIGN. [MeSH 2003]
structural proteomics:
Often referred to as structural genomics, this
discipline involves determining the 3D structures of large numbers of proteins,
ultimately accounting for an organism's entire proteome. It adds critical
information in at least two points in the drug discovery pathway: (1) target
identification, or selecting a pathway in which a drug might function, and (2)
medicinal chemistry, or the actual design of compounds to modulate this pathway.
A high-throughput, system-wide means of determining gene
function. It
typically involves using high-throughput X-ray diffraction methods to determine
the structure of proteins encoded by at least one member of each gene
family in the genome. This approach is coupled with the use of bioinformatics
as a tool in structural proteomics and computational modeling
to determine structures of other proteins in the same family. Conversely, an
important goal of structural proteomics is the creation of databases of
structures. [CHI Target
Validation report]
When asked to identify bottlenecks in the [structural proteomics] field,
several academic and industry scientists pointed to the need for faster and more
reliable protein production and purification strategies, rather than stronger
beams at the X-ray crystallization step.
structure from sequence:
See protein structure prediction,
structural homology
structure prediction problem: The protein secondary structure
prediction problem has become a classic, challenging problem for the artificial-intelligence
and machine learning community. Virtually every conceivable
computational technique in these fields (e.g., information theory [6, 12, 13], artificial
neural networks [15, 20, 22], cascaded networks [18, 19, 27], hybrid systems
[28], nearest neighbor methods [21], hidden Markov chains [4], machine
learning [17, 25], mutual information [26]) has been applied in the context of
protein structure prediction. The reason for this attention is well-founded and
clear: If protein structure, even secondary structure, can be accurately
predicted from the now abundantly available gene and protein sequences, such
sequences become immensely more valuable for the understanding of drug
design, the genetic basis of disease, the role of protein
structure in its enzymatic, structural, and signal transduction
functions, and basic physiology from molecular to cellular, to fully systemic
levels. In short, the solution of the protein structure prediction problem (and
the related protein folding problem) will bring on the second phase of
the revolution. [Peter Munson et al. "Protein Secondary Structure
Prediction", NIH, 1994] http://abs.cit.nih.gov/reprints/text3.html
target identification: Targets
glossary
threading:
In this approach, a target sequence is “threaded”
through a library of 3D folds to try to find a match. This method
is used when no sequence of known structure is clearly related to the target sequence.
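Threading a sequence through a fold library can be illustrated with a deliberately crude sketch. It is not how real threading programs score a fit: it assumes each fold is reduced to a string of burial states ("b" for buried, "e" for exposed), rewards hydrophobic residues at buried positions and polar residues at exposed ones, and slides the sequence along each fold without gaps. The hydrophobic residue set, the fold names, and the scoring rule are all illustrative assumptions.

```python
HYDROPHOBIC = set("AVLIMFW")  # illustrative choice, not a standard scale


def fit_score(sequence, burial):
    """+1 where residue type matches the environment, -1 where it does not."""
    if len(sequence) != len(burial):
        raise ValueError("sequence and burial pattern must match in length")
    score = 0
    for residue, environment in zip(sequence, burial):
        matches = (residue in HYDROPHOBIC) == (environment == "b")
        score += 1 if matches else -1
    return score


def thread(sequence, fold_library):
    """Rank folds by their best ungapped fit to the sequence."""
    n = len(sequence)
    ranked = []
    for name, burial in fold_library.items():
        if len(burial) < n:
            continue  # fold too short to hold the sequence
        best = max(fit_score(sequence, burial[i:i + n])
                   for i in range(len(burial) - n + 1))
        ranked.append((best, name))
    return sorted(ranked, reverse=True)


if __name__ == "__main__":
    library = {                # hypothetical fold library
        "fold_A": "bebebebe",  # alternating buried/exposed
        "fold_B": "bbbbeeee",  # buried half, exposed half
    }
    print(thread("LKVEIDAS", library))
```

The sketch asks how well each fold fits the sequence rather than how the sequence folds, which is the inversion that fold recognition methods rely on.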
toxicoproteomics: Proteomics glossary
Bibliography
CHI, Structural Proteomics: High-Throughput
Approaches Fuel Drug Discovery and Development, Cambridge Healthtech Institute, Malorye
Branca, Allan Haberman, Deidre Lockwood 2001
Joint Center for
Structural Genomics Technologies http://www.jcsg.org/scripts/prod/technologies1.html
Nature Structural Biology, Structural genomics supplement, Nov.
2000
IUPAC definitions are
reprinted with the permission of the International Union of Pure and Applied
Chemistry.