Related glossaries include Drug Discovery & Development, Functional genomics
Informatics: Algorithms & data analysis, Chemoinformatics, Clinical informatics, Drug discovery informatics, Genomic informatics, IT infrastructure, Protein informatics
Technologies: Microarrays, Sequencing. See Genomic Informatics for data analysis
using these technologies, as well as gene specific informatics. Systems
biology and cellular and physiological biology informatics are in this
Bioinformatics glossary.
Biology: DNA, Expression, Proteins, Sequences, DNA & beyond

Bioinformatics
June 7-8, 2012 • Singapore Program | Register
| Download Brochure
annotation:
The annotation process identifies sequence features on the contigs such as
variation, sequence tagged sites, FISH-mapped clone regions, transcript
alignments, known and predicted genes, and gene models. This stage provides
contig, RNA, and protein records with added feature annotation. In addition,
organism specific features, such as Gene Trap clones for mouse, will also be
annotated. NCBI Annotation Information, 2008 http://www.ncbi.nlm.nih.gov/genome/guide/build.shtml
The value of a genome
is only as good as its annotation. At the Sanger Institute, we are providing
high quality manual curation in addition to automated prediction provided by
Ensembl. Finished genomic sequence is analysed on a clone by clone basis using a
combination of similarity searches against DNA and protein databases as well as
a series of ab initio gene predictions. Manual Curation of the Human
Genome, Wellcome Trust, Sanger Institute, 2003 http://www.sanger.ac.uk/HGP/havana/
Each fragment of DNA contains unique features. A DNA fragment may encode a
portion of a gene or a gene control sequence, or the fragment may be a portion
of a genome that has no apparent function. Bioinformaticists perform detailed
analysis of DNA fragments, comparing new DNA sequences with previously annotated DNA
sequences, identifying common characteristics, and assigning known or
putative functions to the DNA sequence. Cross-species DNA sequence
comparison is quite common and can reveal genes shared between organisms.
A bioinformatic study may also require peptide-to-peptide comparisons, allowing
common structural features of proteins to define the function of a DNA fragment
encoding a specific protein or enzyme.
Explanatory notes, comments, analysis and commentaries added to a database.
May refer to sequence data or protein structures and includes predictions, characterizations,
summaries, and other detailed information, including gene function. Annotation can be manual (as in SWISS-PROT) or automated (as in TrEMBL).
Since annotation is highly skilled and labor intensive, efforts are being
made to automate the process, at least for preliminary data.
Related terms: annotated databases, curated databases, comparative genome annotation,
distributed annotation system, genome annotation; SNPs
& genetic variations: Genetic Annotation Initiative. Narrower
terms: baseline annotation, computational annotation, distributed sequence
annotation; Proteomics: annotation - proteins
BioConductor:
An open source and open development software project to provide tools for the
analysis and comprehension of genomic data (bioinformatics). http://www.bioconductor.org/
bioinformatics:
Bioinformatics, April 25-26, 2012 • Boston, MA
There is a greater need than
ever for bioinformatics experts and experimental biologists throughout Europe to
work together towards common goals that will expedite biological research. This
conference will bring together experimentalists, computational biologists as
well as bioinformaticians and showcase the next generation of informatics
resources for life science research. Bioinformatics, October 12-13, 2011 • Hannover, Germany
In recent years, biologists and medical researchers have increasingly
relied on computational methods to perform investigations. Bioinformatics is not
only an integral part of basic life science research but also plays an important
role in converting basic science results to application and/or commercial tools.
The interdisciplinary fields of
Bioinformatics and Computational Biology are locked in a high stakes race with
analytical instrument developers and innovators. The pace and scope of change in
many fields of biomedical research rivals what we once associated only with
semiconductor devices. This report explores the interlocking challenges facing
instrumentation advances, computational demands and our evolving systems biology
knowledge. Key challenges presented in this report include: instrumentation capable of generating
terabytes of raw data daily; storage requirements for human gene sequences; the need for cross platform data analysis
standards; appropriateness of analysis & modeling applications; and database data quality and annotation protocols.
Insight Pharma Reports, Bioinformatics & Computational Biology, 2009
The Bioinformatics and
Computational Biology program, which supports the National Centers for
Biomedical Computing, aims to develop novel, cutting-edge software and data
management tools to effectively mine the vast wealth of biomedical data
generated from sophisticated modern laboratory techniques and facilitate data
sharing between researchers. NIH Common Fund http://commonfund.nih.gov/bioinformatics/index.aspx
Roughly, bioinformatics describes any use of computers to handle biological information. In practice the definition used by most people is narrower; bioinformatics to them is a synonym for "computational molecular
biology" - the use of computers to characterise the molecular components of living things.
[Damian Counsell, bioinformatics.org FAQ] http://bioinformatics.org/faq/#whatIsBioinformatics
See the bioinformatics.org FAQ above for tight and loose definitions of bioinformatics,
and information on how long the term has been used. The
definition of bioinformatics is not universally agreed upon. Generally speaking,
we define it as the creation and development of advanced information and
computational technologies for problems in biology, most commonly molecular
biology (but increasingly in other areas of biology). As such, it deals with
methods for storing, retrieving and analyzing biological data, such as nucleic
acid (DNA/RNA) and protein sequences, structures, functions, pathways and
genetic interactions. Some people
construe bioinformatics more narrowly, and include only those issues dealing
with the management of genome project sequencing data. Others construe
bioinformatics more broadly and include all areas of computational biology,
including population modeling and numerical simulations. Biomedical
informatics is a slightly broader umbrella that includes not only
bioinformatics, but other areas of informatics in biology, medicine and
health-care. They are closely related. Russ Altman, Guide to
informatics at Stanford University, 2006 http://www-helix.stanford.edu/people/altman/bioinformatics.html
We have coined the term Bioinformatics for the study of informatic processes
in biotic systems. Our Bioinformatic approach typically involves spatial, multi-leveled
models with many interacting entities whose behavior is determined
by local information. [Theoretical Biology Group, Univ. of Utrecht, Netherlands,
Paulien Hogeweg, Director] http://www-binf.bio.uu.nl/
Original definition was “the study of informatic processes in biotic
systems”. Paulien Hogeweg, MIRROR beyond MIRROR, puddles of LIFE, in Artificial
Life, ed. C.G. Langton, Addison Wesley, 297-316, 1988 [Nick Saville's
homepage, Theoretical Biology and Bioinformatics, Utrecht Univ., Netherlands,
1997]
Despite
the apparent fatigue of the linguistic use of the term itself, bioinformatics
has grown perhaps to a point beyond recognition. We explore both historical
aspects and future trends and argue that as the field expands, key questions
remain unanswered and acquire new meaning while at the same time the range of
applications is widening to cover an ever increasing number of biological
disciplines. Rise
and Demise of Bioinformatics? Promise and Progress, Christos
A. Ouzounis, PLOS Computational
Biology April 2012 http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002487
The earliest Medline reference I've found to bioinformatics is William Bain's
"Bioinformatics in Europe - the federation strikes back" in
Trends in Biotechnology 11(6): 217-218, June 1993.
Narrower terms: bacterial bioinformatics, comparative bioinformatics, functional bioinformatics,
glycobioinformatics, medical bioinformatics, molecular bioinformatics, pharmaceutical bioinformatics,
protein bioinformatics; Protein informatics: structural bioinformatics. Related terms: European
Bioinformatics Institute EBI, Open Bioinformatics Foundation; Algorithms: data mining
Carole Goble, Seven
Deadly Sins of Bioinformatics, 2007 http://www.slideshare.net/dullhunk/the-seven-deadly-sins-of-bioinformatics

bioinformatics visualization: Special issue of
Informatics Visualisation, vol. 4 no. 3, Sept. 2005 guest editors Chris North
& Theresa-Marie Rhyne http://people.cs.vt.edu/~north/BioVisCFP.html
biojava.org: An open-source project dedicated to providing Java
tools for processing biological data. This will include objects for manipulating
sequences, file parsers, CORBA interoperability, access to ACeDB,
dynamic programming, and simple statistical routines. The BioJava library
is useful for automating those daily and mundane bioinformatics tasks.
http://www.biojava.org/
biological computing:
IT infrastructure
biological databases:
Biological databases have inherent complications stemming from
the nature of the information they contain and the dependence of computational
methods on these data. Most biological data are not digital, making
machine-readability of the data (for automated data-mining) impossible. In addition,
the lack of standardized nomenclature and ontology, the use of protein aliases
(leading to ambiguity), the lack of interoperability across databases, and the
presence of errors in database annotations have hindered and complicated the use
of computational methods. Defining the Mandate of Proteomics in the
Post-Genomics Era, Board on International Scientific Organizations, National Academy
of Sciences, 2002 http://www.nap.edu/books/NI000479/html/R1.html
bioMOBY:
An international group of biological data hosts, biological
data service providers, and coders whose aim is to set standards for biological
data representation, distribution, and discovery. http://biomoby.org/
BIONLP.org:
Natural language processing of biology text. [Bob
Futrelle, Computer Science, Northeastern Univ., US, 2002] http://www.ccs.neu.edu/home/futrelle/bionlp/
BioPax: Biological Pathways Exchange. A collaborative effort to
create a data exchange format for biological pathway data. http://www.biopax.org/
Related terms: metabolic
pathways
bioperl.org:
An international association of developers of open
source Perl tools for bioinformatics, genomics and life science research.
We work closely with our friends and colleagues at biojava.org, biopython.org
and bioxml.org. The Bioperl server provides an online resource for modules,
scripts, and web links for developers of Perl-based software for
life science research. http://bio.perl.org/
biopython.org:
An international association of developers of
freely available Python tools for computational molecular biology. biopython.org
provides an online resource for modules, scripts, and web links for developers
of Python-based software for life science research.
http://www.biopython.org/
biosemiotics:
http://en.wikipedia.org/wiki/Biosemiotics
BioWidget Consortium Home Page, Computational Biology & Informatics
Lab, Univ. of Pennsylvania, US. The bioWidgets toolkit is a collection
of Java Beans (used for development of graphics applications and/or applets
in the genomics domain). http://www.cbil.upenn.edu/bioWidgets/
BISTI Consortium:
Established in May 2000 to serve as the focus
of biomedical computing issues at the NIH and to facilitate implementation
of the BISTI recommendations. The Consortium is composed of senior-level
representatives from the NIH centers and institutes and representatives
of other Federal agencies concerned with bioinformatics and computational
applications. The mission of the BISTI Consortium is to make optimal use
of computer science and technology to address problems in biology and medicine
by fostering new basic understandings, collaborations, and transdisciplinary
initiatives between the computational and biomedical sciences. http://www.bisti.nih.gov/bistic2.cfm
cancer informatics & bioinformatics: Explosive
growth of biological information has resulted from many advances in the field of
molecular biology and sophisticated techniques and equipment used to carry out
rapid genome sequencing. These advances have enabled improvement in biological
research and clinical medicine. Bioinformatics & Cancer
informatics, February 21-23, 2012 • San Francisco, CA
cellular bioinformatics:
The lesser developed branch of bioinformatics that focuses on the understanding
of the functioning living cell. As such it has to integrate DNA, mRNA, protein
and metabolic data. Because of the complexity of the problem, it also needs to
invoke mathematical modeling. ... The branch of cellular bioinformatics that
focuses on understanding on the basis of all the known experimental data is also
called computational biochemistry. Hans Westerhoff, Vrije Universiteit, Netherlands
http://www.bio.vu.nl/hwconf/papers/cellbioinf.html
cloud based bioinformatics: Leveraging cloud computing technology, bioinformatics tools
can be made available to anyone anywhere when they need them. This conference
will feature successful cases of large scale on demand computing in the cloud,
and translational bioinformatics analysis conducted in the cloud, as well as the
software that lets users create and share standardized research pipelines and
workflows with fast turnaround time and lower cost. Cloud-Based Bioinformatics, October 10-11, 2012 • Vienna,
Austria
comparative bioinformatics:
The main
focus of the group is the development of novel algorithms for the comparison of
multiple biological sequences. Multiple comparisons have the advantage of
precisely revealing evolutionary traces, thus allowing the identification of
functional constraints imposed on the evolution of biological entities. Most
comparisons are currently carried out on the basis of sequence similarity. Our
goal is to extend this scope by allowing comparisons based on any relevant
biological signal such as sequence homology, structural similarity, genomic
structure, functional similarity and more generally any signal that may be
identified within biological sequences. Using such heterogeneous signals serves
two complementary purposes: (i) producing better models that take advantage of
the signal evolutionary resilience, (ii) improving our understanding of the
evolutionary processes that lead to the diversification of biological functions.
Centre for Genomic Regulation, Barcelona, Spain http://pasteur.crg.es/portal/page/portal/Internet/02_Research/01_Programmes/01_Bioinformatics_Genomics/03_Comparative_Bioinformatics
comparative systems biology:
My research projects in comparative
systems biology have four main thrusts: whole-genome functional
annotation, multi-clustering of molecular profiles, cross-condition analysis
of functional genomics data, and computationally-driven design of biological
experiments. The research I am conducting with my life science colleagues in
comparative systems biology has the goal of providing precise functional
annotations to hypothetical genes in model organisms and in newly-sequenced
genomes; delineating similarities and differences in cellular networks
activated in different diseases; identifying core cellular pathways common to
response networks for multiple stresses in various model organisms; and
refining our understanding of the molecular basis of disease resistance in
plant-pathogen interactions. Research interests, TM Murali, Computer Sciences,
Virginia Tech, http://people.cs.vt.edu/~murali/research.html
Broader term: systems biology. Google
= about 126 May 7, 2007; about 203 Oct. 15, 2007
computational annotation:
The workshop began with a series of presentations on computational annotation
and experimental approaches to biological confirmation of functional elements in
the genomes of both model organisms and the human. Subsequent to those
discussions, NHGRI outlined its proposal for a pilot project to exhaustively
determine all functional elements in a small fraction (~1 percent) of the human
genome. Initial Inventory of Functional Elements to Identify: The participants
recommended that both protein-coding genes and non-protein-coding genes need
to be identified. For each of these, the complete (full-length) coding sequence
and all variants, as well as the transcriptional regulatory elements (e.g.,
promoters and enhancers) and post-transcriptional regulatory elements (e.g.,
cis-acting RNA elements) should be described. All pseudogenes should be
identified. A number of global sequence features, such as sites of methylation,
sequence variation, evolutionary history of sequence blocks and repetitive
elements were suggested for inclusion, as were a number of chromosomal elements,
such as origins of replication, nuclease hypersensitive sites, matrix attachment
sites and histone modifications. Workshop on the Comprehensive Extraction of
Biological Information from Genomic Sequence, Bethesda, Md. July 23-24, 2002, http://www.genome.gov/10005568
computational annotation technologies:
Several
‘wet bench’ technologies and resources were discussed. These included DNA
array studies, RT-PCR/ cDNAs, in situ hybridization, chromatin
immunoprecipitation, RNAi, knockout mice, and antibody analysis of protein
function. A broad range of computational approaches were also considered to be
critical for inclusion. These included both comparative sequence analysis of
multiple genomic sequences to identify conserved elements and automated
prediction of functional elements, including coding sequences, promoters,
alternative splice variants and other highly conserved regions. The importance
of ensuring close collaboration between experimental and computational
approaches was stressed. Workshop on the Comprehensive Extraction of Biological
Information from Genomic Sequence, Bethesda, Md. July 23-24, 2002, http://www.genome.gov/10005568
computational biology: The development and application of data-analytical
and theoretical methods, mathematical modelling and computational
simulation techniques to the study of biological, behavioral, and social
systems. Biomedical Information Science and Technology
Initiative BISTI Bioinformatics at the NIH, 2000 http://www.bisti.nih.gov/
I find that people use "computational biology" when discussing that subset of bioinformatics (in the broadest sense) closest to the field of classical general biology.
Computational biologists interest themselves more with evolutionary, population and theoretical biology rather than cell and molecular biomedicine. It is inevitable that molecular biology is profoundly important in computational biology, but it is certainly not what computational biology is all about (see next paragraph). In these areas of computational biology it seems that computational biologists have tended to prefer statistical models for biological phenomena over
physico-chemical ones. This is often wise...
One computational biologist (Paul J Schulte) did object to the above and makes the entirely valid point that this definition derives from a popular use of the term, rather than a correct one. Paul works on water flow in plant cells and points out that biological fluid dynamics is a field of computational biology in
itself - and this, like any application of computing to biology, can be described as computational biology...
Where we disagree, perhaps, is in his conclusion from
this - which I reproduce in full: "Computational biology is not a "field", but an "approach" involving the use of computers to study biological processes and hence it is an area as diverse as biology itself." Richard Durbin, Head of Informatics at the Wellcome Trust Sanger Institute, expressed an interesting opinion on this distinction in an interview:
"I do not think all biological computing is bioinformatics, e.g. mathematical modelling is not bioinformatics, even when connected with
biology-related problems. In my opinion, bioinformatics has to do with management and the subsequent use of biological information, particular genetic information."
[Damian Counsell, bioinformatics.org FAQ,
2001] https://bioinformatics.org/faq/#definitionOfCompbiol
A field of biology concerned with
the development of techniques for the collection and manipulation of
biological data, and the use of such data to make biological discoveries
or predictions. This field encompasses all computational methods and theories
applicable to molecular biology and areas of computer-based techniques
for solving biological problems including manipulation of models and datasets.
MeSH, 1997 Google = about 90,900 Aug. 20, 2002;
about 331,000 July 26, 2004; about 1,430,000 May 7, 2007 Related terms:
protein informatics
Computational biology FAQ, Robert D. Phair, US, 2000 http://www.bioinformaticsservices.com/bis/resources/faq/faq.html
conceptual biology:
As we see it, is not a distinct type of science, but rather
it has a different source: the information in databases... By logical, critical
analysis of existing facts and models, one can generate a hypothesis in which
predictions are formulated in testable terms, and then search for relevant
information among published reports of experiments that may have had a different
purpose altogether. MG Blagosklonny and AB Pardee, Unearthing the gems:
Conceptual Biology, Nature 416 (6879): 373, 28 March 2002
The iterative process
of analysing existing facts and models available in published literature to
generate new hypotheses. Julie C. Barnes, Conceptual
biology: a semantic issue and more, Nature 417(6889): 587-588, 6 June 2002
Related terms: Research
meta-analyses, meta- analysis
curated databases:
Often less complete than primary databases, but
they have less redundancy and the added value of scientific annotation;
therefore, a biologically significant sequence should be easier to find in such
a database and of greater value. Naturally, the degree of redundancy and
annotation in such a database depends on the experience, skills, aims, and
devotion of its curators. ... The only proper way to curate databases is the way groups like those that
developed OMIM [Online Mendelian Inheritance
in Man], SWISS-PROT and most commercial databases have done it, that
is, through making scientific judgments as data are cleaned up and merged,
under the supervision of a curator. Other curated databases include LocusLink,
RefSeq, & SGD (Saccharomyces Genome Database).
databases:
Collections of data in machine-readable form, which
can be manipulated by software to appear in varying arrangements and subsets.
Genetic information is stored in different ways in
different databases, which makes it hard to compare their holdings. So
while computational biologists are trying to improve the quality of the
databases, they are also working to build bridges between them. So
far, they have had only limited success … each database has its own Web
site with unique navigation tools and data storage formats that make such
searching difficult … programs can’t easily recognize data that are not
stored in a uniform way. Elizabeth Pennisi “Seeking Common language in a Tower
of Babel” Science: 449 Oct. 15 1999
Databases & software describes and provides links
to around 200 databases and about 30 software tools. Narrower terms: annotated
databases, curated databases, federated databases, integrated databases,
interoperability, non-redundant databases, proprietary databases, redundant
databases, relational databases, flat files, indexed flat files.
distance functions or similarity scores:
The key issue in comparing expression
profiles is deciding what it means for two profiles to be
"similar." Mathematically, we need a function that takes two
expression profiles and calculates a similarity score. It is sometimes easier to
work with the opposite concept of distance, and people often speak of distance
functions instead of similarity scores. Many similarity or distance functions
are used in microarray work, and there is no consensus as to which one is best. Narrower terms: Euclidean distance, Pearson correlation
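As a rough illustration of the idea above, the following Python sketch computes a Pearson correlation as a similarity score and turns it into a distance by taking 1 minus the correlation. The function names and the three example profiles are made up for this illustration; the conversion to a distance is one common convention, not the only one.

```python
import math


def pearson_similarity(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Co-variation of the two profiles around their own means.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_a * var_b)


def correlation_distance(a, b):
    """Opposite concept: 0 for perfectly correlated profiles, 2 for anti-correlated."""
    return 1.0 - pearson_similarity(a, b)


# Hypothetical expression profiles measured over four conditions.
gene_x = [1.0, 2.0, 3.0, 4.0]
gene_y = [2.0, 4.0, 6.0, 8.0]   # same shape as gene_x, twice the magnitude
gene_z = [4.0, 3.0, 2.0, 1.0]   # mirror image of gene_x

print(pearson_similarity(gene_x, gene_y))
print(correlation_distance(gene_x, gene_z))
```

Because the correlation ignores overall magnitude, gene_x and gene_y count as highly similar here even though their absolute values differ, which may or may not be what a given analysis wants.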
distributed annotation system: A client-server system in
which a single client integrates information from multiple servers. It allows a
single machine to gather up genome annotation information from multiple distant
web sites, collate the information, and display it to the user in a single view.
Little coordination is needed among the various information providers.
Biodas.org http://biodas.org/
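The client-side collation described above can be sketched in Python as follows. This is not the actual DAS protocol or any Biodas API: the "servers" are in-memory stand-ins, and make_server, collate, and the example features are all hypothetical names and data invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    source: str   # which (hypothetical) server supplied the feature
    start: int    # 1-based start coordinate on the reference sequence
    end: int      # inclusive end coordinate
    label: str


def make_server(name, features):
    """Return a stand-in for a remote annotation server (no network access)."""
    def query(start, end):
        # Report every feature that overlaps the requested region.
        return [Annotation(name, s, e, label)
                for s, e, label in features
                if s <= end and e >= start]
    return query


def collate(servers, start, end):
    """Ask every server about the same region and merge the answers."""
    merged = []
    for query in servers:
        merged.extend(query(start, end))
    # Sort by position so the result can be displayed as a single view.
    return sorted(merged, key=lambda a: (a.start, a.end, a.source))


gene_server = make_server("genes", [(100, 900, "gene A"), (5000, 6000, "gene B")])
snp_server = make_server("variation", [(450, 450, "SNP 1"), (7000, 7000, "SNP 2")])

single_view = collate([gene_server, snp_server], 1, 1000)
for a in single_view:
    print(f"{a.start:>5}-{a.end:<5} {a.source:<10} {a.label}")
```

The servers share only the reference coordinates; neither needs to know the other exists, which is the sense in which little coordination is needed among providers.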
dynamic modeling:
Mathematical
approaches to studying biological variation have changed little in several
decades. There is a need to develop new dynamic models to illuminate how systems
interact and evolve. Just as important, it is critical to study the nature of
biological and mathematical assumptions of models and statistics. Tools for
analyzing and interpreting data on the architecture of complex phenotypes should
be developed in the context of real biological information. Genetic
Architecture, Biological Variation and Complex Phenotypes, PA-02-110, May 29,
2002- June 5, 2005 http://grants1.nih.gov/grants/guide/pa-files/PA-02-110.html
Euclidean distance: Commonly used distance function, which works by
treating each expression profile as defining a point in a multidimensional
space.
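A minimal Python sketch of this distance, assuming each profile is simply a list of numbers with one entry per experimental condition (the function name and example values are illustrative only):

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two profiles treated as points."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    # Sum the squared difference along every dimension, then take the root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two hypothetical profiles measured over three conditions.
profile_1 = [0.0, 0.0, 0.0]
profile_2 = [1.0, 2.0, 2.0]

print(euclidean_distance(profile_1, profile_2))  # sqrt(1 + 4 + 4)
```

Unlike a correlation-based score, this distance is sensitive to the absolute magnitude of the measurements, so profiles are often normalized first when only their shape matters.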
European Bioinformatics
Institute EBI,
Hinxton, Cambridge, UK.
An EMBL outstation. http://www.ebi.ac.uk/
functional bioinformatics: The
emerging field of functional bioinformatics focuses on the development of ontologies
or concept classifications fed into algorithms used to perform computations
of the functions of biomolecules. "About
bioinformatics", George Washington Univ. Medical Center, 2002 http://www.gwumc.edu/bioinformatics/about/bioinfo.htm
An emerging subfield of
bioinformatics that is concerned with ontologies and algorithms for
computing with biological function. Functional bioinformatics is the
computational counterpart of functional genomics ... is concerned
with managing and analyzing functional genomics data, such as gene
expression experiments and large-scale knock-out experiments. ...
emphasizes large-scale computational problems, such as problems involving
complete metabolic networks and genetic networks. Peter D. Karp
"An ontology for biological function based on molecular
interactions" Bioinformatics Ontology 16 (3): 269-285, 2000
Related terms: Functional genomics,
Metabolic Engineering Ontologies & taxonomies
genochemistry
genomic chemistry: The volume of data from biological and chemical
studies has been increasing exponentially in recent years. In particular,
there are now 150 billion sequences within GenBank, 60k protein structures
in PDB, and 50 million chemicals with unique structures (as of Sept.
7, 2009, CAS). As a result, one of the most important challenges has
been the annotation of genetic sequences to their functions, and enzymes
(encoded by their sequences) to their substrate profiles. A
systematic study of chemistry that links the enzyme's sequence information
(including SNP) and substrate structural diversity is needed. It
differs from traditional disciplines in many ways and requires a
restructuring of established methods, the standardization of the data
collection process, and new bioinformatics and modeling tools. It can take
the form of extended biocatalysis complemented by bioinformatics and
molecular modeling. We tentatively refer to this discipline as
Genochemistry. IUPAC, Genochemistry -- chemistry
designed for life sciences: Towards a guideline and a framework of
genochemistry, 2010 IUPAC
Project Number 2009-021-3-300. A
glossary of specialized terms will be included.
glycobioinformatics,
glycoinformatics: Glycosciences
I2B2 Informatics for Integrating Biology
& the Bedside: An NIH-funded National
Center for Biomedical Computing based at Partners HealthCare System
[Boston]. http://www.i2b2.org/
information overload:
Biomedicine is in the middle of revolutionary advances. Genome projects, microassay methods like DNA chips, advanced radiation sources for crystallography and other instrumentation, as well as new imaging methods, have exceeded all expectations, and in the process have generated a dramatic information overload that requires new resources for handling, analyzing and interpreting data. Delays in the exploitation of the discoveries will be costly in terms of health benefits for individuals and will adversely affect the economic edge of the country.
Opportunities in Molecular Biomedicine in the Era of Teraflop Computing: March 3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular Modeling and Bioinformatics Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign
Many
of today's problems stem from information overload and there is a desperate need
for innovative software that can wade through the morass of information and
present visually what we know. The development of such tools will depend
critically on further interactions between the computer scientists and the
biologists so that the tools address the right questions, but are designed in a
flexible and computationally efficient manner. It is my hope that we will
see these solutions published in the biological or computational literature.
Richard J. Roberts, The early days of bioinformatics publishing, Bioinformatics
16 (1): 2-4, 2000
"Information overload" is not an overstatement these days. One of the biggest challenges is to deal with the tidal wave of data, filter out extraneous
noise and poor quality data, and assimilate and integrate information on a
previously unimagined scale. Google = about 118,000
July 19, 2002; about 249,000 Oct. 22, 2004; about 1,480,000 Nov 18, 2009
Where's
my stuff? Ways to help with information overload, Mary Chitty, SLA
presentation June 10, 2002, Los Angeles CA
Wikipedia http://en.wikipedia.org/wiki/Information_overload
integrated databases:
Integration [of databases] typically is
accomplished by creating small, object-oriented software elements, or “wrappers”,
that let a single overlaying, often browser-like, desktop application interact
with all the pieces. The original separate systems are intact and
functional, and new ones can be added, while the underlying complexity
is transparent to users. There are still many challenges … but computing
environments are becoming more unified, flexible and expandable. A. Thayer
“Bioinformatics for the Masses” Chemical & Engineering News 78(6):
19-32 Feb. 7, 2000
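The wrapper approach described above can be sketched in Python roughly as follows. The two source systems, their storage formats, and every class and function name here are hypothetical; the point is only that each wrapper hides one system's format behind the same lookup method, so the overlaying application never touches the underlying data directly.

```python
class FlatFileWrapper:
    """Wraps a hypothetical pipe-delimited flat file: SYMBOL|description."""

    def __init__(self, text):
        self._rows = [line.split("|", 1) for line in text.strip().splitlines()]

    def lookup(self, symbol):
        return [{"source": "flat_file", "symbol": s, "note": note}
                for s, note in self._rows if s == symbol]


class KeyValueWrapper:
    """Wraps a hypothetical keyed store with its own field names."""

    def __init__(self, records):
        self._records = records

    def lookup(self, symbol):
        record = self._records.get(symbol)
        if record is None:
            return []
        return [{"source": "key_value", "symbol": symbol, "note": record["location"]}]


def search_all(wrappers, symbol):
    """The overlaying application: one query, every wrapped system."""
    hits = []
    for wrapper in wrappers:
        hits.extend(wrapper.lookup(symbol))
    return hits


# Illustrative contents only; the original systems stay intact behind the wrappers.
flat = FlatFileWrapper("TP53|tumor protein p53\nBRCA1|BRCA1 DNA repair associated")
keyed = KeyValueWrapper({"TP53": {"location": "chromosome 17"}})

for hit in search_all([flat, keyed], "TP53"):
    print(hit["source"], hit["symbol"], hit["note"])
```

Adding a new source system then means writing one more wrapper with a lookup method, without changing search_all or the existing wrappers.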
Integration of information in OMIM [Online Mendelian Inheritance in Man] and the published working draft of the International
Human Genome Sequencing Consortium (Nature 15 Feb. 2001) has been facilitated
by ties to NCBI's RefSeq and LocusLink databases. Are there other good
examples of integrated databases? Related terms: Bio-Ontology Standards
Group, Data Model Standards Group; Functional
genomics: Gene Ontology
integration:
Integration of the various types of large- scale data is currently receiving
much attention. There appears, however, to be little agreement on what exactly
is meant by "integration", not to mention how to achieve it. The word
"integration" is being attached to almost any analysis that involves
the combined use of two or more large datasets. Lars J.
Jensen, Peer Bork, Quality analysis and integration of large- scale molecular
data sets. Drug Discovery Today: Targets, 3(2): 51-56
integration (of databases):
Allows researchers to increase the value
they get from the data, because it increases the base of information they can
access and allows for more robust searching. Related
terms: IT infrastructure
middleware, Object Oriented modeling OOM, object
protocol model OPM; Maps genomic & genetic memory mapped data structures
Interoperable Informatics Infrastructure Consortium I3C:
http://www.i3c.org/
LSID Life Sciences
Identifiers: Cover pages http://xml.coverpages.org/lsid.html
life sciences informatics:
Informatics is
essential at every step of genomics- based drug discovery and development. The
commercial landscape of life sciences information technology has changed
dramatically in recent years. Bioinformatics,
in particular, has gone through a dramatic boom/bust. While IT companies are
looking to the drug discovery and development arena as a new market opportunity,
pharmaceutical companies are faced with rising pressure to reduce (or at
least control) costs, and have a growing need for new informatics tools to help
manage the influx of data from genomics, and turn that data into tomorrow's
drugs. Key IT tools, such as high- performance computing, Web services, and
grids, are being used to improve the speed and efficiency of drug discovery and
development. True breakthroughs are still lacking, particularly in key areas
such as gene prediction, data mining, protein structure modeling and prediction,
and modeling of complex biological systems. However, most experts agree that IT
and bioinformatics are essential to reaching the improved productivity the
pharmaceutical industry craves.
molecular bioinformatics:
Conceptualizing biology in terms of
molecules (in the sense of physical- chemistry) and then applying
"informatics" techniques (derived from disciplines such as applied
math, CS [computer science] and statistics) to understand and organize the
information associated with these molecules on a large- scale. Mark Gerstein
"What is Bioinformatics?" MB&B 474b3, 2001 http://bioinfo.mbb.yale.edu/what-is-it.html
molecular information theory: In our laboratory we
use Claude
Shannon's information
theory, computers (Unix, Pascal and PostScript
graphics on Sun workstations) and genetic engineering (protein and DNA gels,
cloning, sequencing and magnetic bead technology) to study genetic control
patterns on DNA and RNA. "Molecular Information Theory" Tom
Schneider, National Cancer Institute, US, 2002 http://www.lecb.ncifcrf.gov/~toms/introduction.html
molecular pattern recognition:
Developing computational methodologies
for the analysis and interpretation of large-scale expression datasets
generated by DNA microarray experiments. Analysis of genome-wide
expression patterns and their correlations with phenotypes of interest
may provide unique insights into the structure of genetic networks and
into biological processes not yet understood at the molecular level.
Whitehead/ MIT [US] Genome Center's Molecular Pattern Recognition
web site. http://www.genome.wi.mit.edu/MPR/index.html Broader term: pattern recognition. Related terms Expression
molecular systems biology:
An integrative discipline that seeks to explain the properties and behaviour of
complex biological systems in terms of their molecular components and their
interactions. Nature Publishing, Molecular Systems Biology aims &
scope http://www.nature.com/msb/authors/index.html#Aims-and-scope
Broader term: systems biology
NCBI National Center for Biotechnology Information:
Established
in 1988 as a national resource for molecular biology information,
NCBI creates public databases, conducts research in computational
biology, develops software tools for analyzing genome data, and disseminates
biomedical information - all for the better understanding of molecular
processes affecting human health and disease. Part of NIH. http://www.ncbi.nlm.nih.gov
non-redundant databases:
Researchers at the National Center for
Biotechnology Information (NCBI) coined the term "nr" database
(nonredundant database) to refer to a database in which the obviously
redundant entries have been merged. These entries are typically those that are
100%, character- by- character identical, and algorithms exist that can remove
such redundancy. Although such a database has less redundancy than a primary
database, a substantial amount of redundancy remains, and it can be removed only
by a curator using scientific judgment.
Many databases try to be “non-redundant”.
Unfortunately, biological data is too complex to fit a simple definition
of redundancy … Each “non- redundant” database has its own definition of
redundancy. George Church Lab, Harvard Medical School, US http://arep.med.harvard.edu/seqanal/db.html
Examples of non- redundant databases include UniGene
and SWISS- PROT,
while
DDBJ/ EMBL/ GenBank are redundant databases.
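The automatic step described above, merging entries that are 100% character-by-character identical, can be sketched as follows. This is an assumed minimal approach, not NCBI's actual procedure; the function name and the toy records are hypothetical. Note that the near-duplicate differing by one character survives, which is the residual redundancy that only a curator can judge.

```python
from typing import Dict, List, Tuple


def merge_identical(records: List[Tuple[str, str]]) -> List[Tuple[List[str], str]]:
    """Merge records whose sequences are exactly identical.

    records: (identifier, sequence) pairs.
    Returns (identifiers, sequence) pairs, one per distinct sequence,
    in order of first appearance.
    """
    merged: Dict[str, List[str]] = {}
    for rec_id, seq in records:
        # Exact string match only: no alignment, no tolerance for mismatches.
        merged.setdefault(seq, []).append(rec_id)
    return [(ids, seq) for seq, ids in merged.items()]


# Toy records, invented for illustration only.
toy_records = [
    ("entryA", "ATGGCC"),
    ("entryB", "ATGGCC"),  # exact duplicate of entryA: merged
    ("entryC", "ATGGCA"),  # one character differs: kept separately
]
nr = merge_identical(toy_records)
print(nr)
```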
ontologies
proteomics: Protein informatics
Open Bioinformatics Foundation
OPEN-BIO:
The purpose of the foundation is to act as an umbrella
organization for the various bio*.org projects that grew out of the original BioPerl
project. The goal of the foundation is to provide financial, administrative and
technical assistance for our various open source life science projects. http://open-bio.org/
Narrower terms: biojava.org, bioperl.org, biopython.org, bioxml.org
Related
term: biocorba.org
prediction: Narrower
terms: exon prediction, gene prediction,
ORF prediction, protein sequence prediction; Protein
informatics protein structure prediction; Related terms:
recognition
proprietary databases:
Fee- based, copyrighted databases
(in contrast to public databases such as those at DDBJ/ EMBL/ GenBank). Some databases charge subscription fees to commercial
organizations, with other arrangements available to non- profits. Also
referred to as private databases. Compare: public databases
protein
bioinformatics protein informatics: Protein
informatics
public databases: Freely
accessible databases such as GenBank/ EMBL/ DDBJ, ArrayExpress or BLOCKS.
There has been much debate about public vs. proprietary databases.
recognition:
Narrower terms: computational gene recognition, gene recognition, molecular recognition.
recognition site: Pharmaceutical
biology
research informatics:
The explosion of genomic information, from
sequences
and gene expression to SNPs and protein structures,
is of limited value for pharmaceutical researchers without powerful software
capable of interpretation and comparisons. Data mining, multiple- location data sharing, and computational enhancement
of biology and chemistry projects remain key topics, as do integration of these efforts,
legacy information systems, the very different languages
and perspectives of chemists and biologists, and the organizational issues
of compartmentalization.
self- organizing map:
A type of mathematical cluster analysis
that is particularly well suited for recognizing and classifying features
in complex, multidimensional data. The method has been implemented in a
publicly available computer package, GENECLUSTER, that performs the analytical
calculations and provides easy data visualization … Expression patterns
of some 6,000 human genes were assayed, and an online database was created.
GENECLUSTER was used to organize the genes into biologically relevant clusters
that suggest novel hypotheses about hematopoietic differentiation. P. Tamayo
et al. “Interpreting patterns of gene expression with self- organizing maps:
methods and application to hematopoietic differentiation” PNAS 96(6): 2907-2912,
Mar 16, 1999
Similar to
k-means, but the algorithm organizes
the clusters in a two- dimensional grid, such that clusters that are close
together in the grid are more similar than those further apart. This is a very
useful feature when working with large numbers of clusters. Related term: neural networks
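The grid behaviour described above can be illustrated with a small self-organizing map in plain Python. This is a generic textbook-style sketch, not the GENECLUSTER implementation: the Gaussian neighbourhood, the linear decay of learning rate and radius, the initialisation from randomly chosen data vectors, and all names and toy profiles are assumptions made for illustration. Each sample pulls its best-matching node toward itself and pulls that node's grid neighbours by a smaller amount, which is what makes nearby nodes end up similar.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Grid = List[List[Vector]]


def best_matching_unit(grid: Grid, x: Vector) -> Tuple[int, int]:
    """Return (row, col) of the node whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data: List[Vector], rows: int, cols: int,
              epochs: int = 50, lr0: float = 0.5, seed: int = 0) -> Grid:
    """Train a rows x cols map on data and return the node weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0
    # Start every node at a copy of a randomly chosen data vector.
    grid = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)    # neighbourhood shrinks, floor 0.5
        for x in rng.sample(data, len(data)):        # visit samples in random order
            br, bc = best_matching_unit(grid, x)
            for r in range(rows):
                for c in range(cols):
                    dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-dist2 / (2.0 * radius ** 2))
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return grid


# Toy expression profiles, invented for illustration only.
profiles = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [1.0, 0.9, 1.0], [0.9, 1.0, 0.9]]
som = train_som(profiles, rows=2, cols=2)
assignments = [best_matching_unit(som, p) for p in profiles]
print(assignments)
```

The difference from k-means is the neighbourhood term h: with h set to 1 for the winning node and 0 elsewhere, the update reduces to an online k-means-style step with no grid ordering.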
semantic
systems biology: Semantic technologies are playing an increasingly
important role in capturing and modeling biological knowledge. Semantic
systems biology can complement the bottom-up approach with data-driven
generation of hypotheses. Therefore, Semantic Systems Biology (SSB)
is a systems biology approach that uses semantic description of knowledge
about biological systems to facilitate integrated data analysis. About
Semantic Systems Biology http://www.semantic-systems-biology.org/about
spatio
temporal dynamics: Local interactions in space can give rise to large
scale spatio temporal patterns (e.g. (spiral) waves, spatio- temporal
chaos (turbulence), stationary (Turing- type) patterns and transitions
between these modes). Their occurrence and properties are largely
independent of the precise interaction structure. They are indeed seen to
occur at many organizational levels of biotic systems. Space can be either
'real' space or a state space, e.g. 'phenotype space' in models of
speciation or 'shape space' in immunological models of shape- based
receptor interactions. We show that such spatio- temporal patterns have
important consequences for fundamental bioinformatic processes. Paulien
Hogeweg, Overview of Research 1993- 1998, Utrecht University, Netherlands,
1999 http://www-binf.bio.uu.nl/overview/node3.html
standards: Related terms: Bio-ontology Standards Group, CORBA,
Data
Model Standards Group, object protocol model OPM. EBI [European Bioinformatics
Institute] is also
working on standards. Microarrays
MAML, MGED,
MIAME
synthetic
biology: A) the design and construction of new
biological parts, devices, and systems, and B) the re-design of existing,
natural biological systems for useful purposes. http://syntheticbiology.org/
Life Reinvented,
Wired on synthetic biology, Jan 2005 http://www.wired.com/wired/archive/13.01/mit.html?pg=1
systems
bioinformatics:
With the completion of the Human
Genome Project, the scientific community is now faced with the even greater
challenge of analyzing the resulting data from this and other large-scale genome
projects to better understand the networks underlying biological function. Second
International Computational Systems Bioinformatics Conference To be Held August
11-14, 2003 at Stanford University, IEEE CS Bioinformatics Technical Chair via
BizWire http://quickstart.clari.net/qs_se/webnews/wed/bx/Bca-ieee-cs_csb2003.RMsB_DuP.html
Google
= about 1,230 Sept. 2, 2003; about 8,240 May 25, 2005
systems biology:
A
discipline at the intersection of biology, mathematics, engineering and the
physical sciences that integrates experimental and computational approaches to
study and understand biological processes in cells, tissues and organisms.
Studies at the systems level are distinguished not only by their quantitative
nature in data collection and mathematical modeling, but also by their focus on
interactions among individual elements such as genes, proteins and metabolites.
These studies often integrate data from multiple levels of the biological
information hierarchy in an environmental and evolutionary context and pay
particular attention to dynamic processes that vary in time and space.
Successive iterations of experiment and theory development are characteristic of
systems biology. When applied to human health, systems biology models are
intended to predict physiological behavior in response to natural and artificial
perturbations and thereby contribute to the understanding and treatment of human
diseases. National Institute of General Medical Sciences, NIH National Centers
for Systems Biology http://www.nigms.nih.gov/Initiatives/SysBio/
This
report focuses on the current and future applications of Systems Biology in drug
discovery, specifically in pinpointing optimal individual targets, and
combinations of targets, to overcome metabolic pathway redundancies, leading to
efficacious and safe products. Insight Pharma
Reports, Systems
biology: A disruptive technology, 2008
The label
“systems biology” is pretty awful, except, of course, for the many
even worse labels that have been tried. More important is what SB seeks to
do: transform biology and health care into a rigorous, predictive science
offering a richer understanding of biology and a vastly improved approach
to drug development and medicine. SB would build on the molecular biology
revolution and elucidate the wiring diagrams (and their rules) buried in
the data. John Russell, BioIT World, Sept 2007 http://www.bio-itworld.com/issues/2007/sept/cover-story/
Systems
biology is frequently defined as the study of all of the elements in a
biological system and their relationship to one another in response to
perturbation. Advances in science and technology are enabling the
development of this emerging and cross-disciplinary field by allowing
researchers to explore how biological components function as a network in
cells, tissues and organisms. Recently, pharmaceutical companies have
begun to embrace systems approaches in an effort to better understand
physiology, pathogenic processes and pharmacological responses. This
review focuses on recent advances within three core areas of systems
biology: data collection, data analysis, and the integration and sharing
of data. Susie
Stevens and J. Rung, Advances in systems biology: measurement, modeling
and representation, Current Opinion in Drug Discovery and Development,
2006 Mar; 9(2): 240- 250.
Systems biology is the
study of an organism, viewed as an integrated and interacting network
of genes, proteins and biochemical reactions which give rise to life. Instead of
analyzing individual components or aspects of the organism, such as sugar
metabolism or a cell nucleus, systems biologists focus on all the components and
the interactions among them, all as part of one system. These interactions are
ultimately responsible for an organism's form and functions. Systems Biology,
the 21st century science, Institute for Systems Biology, Seattle, 2010 http://www.systemsbiology.org/Intro_to_ISB_and_Systems_Biology/Systems_Biology_--_the_21st_Century_Science
There are two opinions on what systems biology
is supposed to be. One group sees systems biology as another level of
combining data from different levels (like DNA, RNA and
protein level) (see [Leroy] HOOD). Another group wants to combine classical molecular
and cell biology with systems theory and focus on the new forms of behavior that
emerge when systems of genes and proteins are studied in a wholistic way. For
this they need data from all those different levels as well, of course. That is
why they see systems biology as a cooperative effort, with systems theory
providing a theoretical framework and a new view on things for biologists, along
with lots of experience with complex systems, and biology providing in-depth
knowledge of the field of application as well as practical handling experience.
This data is the basis for developing the kind of detailed models
that are necessary for such studies of systemic properties and behavior. For
both groups, the goal is to reach a new level of understanding of biological
systems often referred to as 'systems level' understanding. A glossary for
Systems Biology, Systems Biology Group, Stuttgart http://www.sysbio.de/projects/glossary/Systems_Biology.shtml
The very nature of systems biology requires integrating data from a
variety of sources generated and interpreted by people skilled in different
areas -- engineering, computer science, biology, physics, mathematics, and
statistics. Key considerations in this process include the generation of
quantitative data, barriers in communication across departments, and
organizational challenges.
Glossary for systems biology, Institutes for
System Dynamics and Control and for Systems Theory in Engineering of the
University of Stuttgart 100 + definitions, 2002 http://www.sysbio.de/projects/glossary/index.shtml
What is systems
biology? Institute for Systems Biology,
Seattle WA http://www.systemsbiology.org/Default.aspx?pagename=whatissystemsbiology
Wikipedia http://en.wikipedia.org/wiki/Systems_biology
Google = about 865,000
May 25, 2005; about 1,530,000 Nov 10, 2006; about 11,600,000 Feb 14 2011 Narrower terms: comparative
systems biology, molecular
systems biology; hepatocyte
systems biology, semantic systems biology ; In
silico & molecular modeling applied systems biology, in silico biology ;
Metabolic engineering
signal transduction Pharmaceutical
biology integrative biology-
thresholding:
The researcher defines minimum and maximum values that
are considered reliable; measurements that are too low or too high are dropped
from the dataset or marked as unreliable. It also makes sense to subtract the
minimum value from all other measurements, because this reflects baseline noise.
This approach implicitly assumes that microarrays normally operate in the linear
part of the dynamic range, and that the transitions between the linear and flat
regimes occur abruptly. Broader term: normalization
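The procedure described above can be written as a short function. This is a minimal sketch under stated assumptions: the function name, the choice to mark unreliable values as None rather than drop them, and the toy intensities and cut-offs are all hypothetical. The researcher-defined minimum is treated as the baseline and subtracted from every retained measurement.

```python
from typing import List, Optional


def threshold_and_baseline(values: List[float], low: float,
                           high: float) -> List[Optional[float]]:
    """Mark values outside [low, high] as unreliable and subtract the baseline.

    Values below low or above high become None (marked unreliable).
    Every retained value has low subtracted, treating low as baseline noise.
    """
    return [v - low if low <= v <= high else None for v in values]


# Toy intensities and cut-offs, invented for illustration only.
raw = [10, 50, 200, 20000]
cleaned = threshold_and_baseline(raw, low=20, high=16000)
print(cleaned)  # [None, 30, 180, None]
```

Keeping None placeholders preserves the position of each measurement, which matters when the values are later matched back to genes; dropping them instead would be equally consistent with the definition.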
Bibliography
Bioinformatics and Genomics Gateway, BioMedCentral http://www.biomedcentral.com/gateways/bioinformaticsgenomics/
Systems
Biology Gateway, BioMedCentral http://www.biomedcentral.com/gateways/systemsbiology/
|
|