

Cheminformatics/ Chemoinformatics Glossary & taxonomy
Evolving terminology for emerging technologies

Comments? Questions? Revisions?  Mary Chitty mchitty@healthtech.com
Last revised March 28, 2008



Chemoinformatics is the application of informatics tools to solve discovery chemistry problems. From library shaping to ADME-Tox prediction via virtual screening, computational chemistry is an integral component of hit and lead generation. Coverage this year will include case studies of several approaches and tools that helped to identify compounds with a balanced ADME-Tox profile together with high potency and selectivity. Creating large in silico virtual libraries of compounds vastly increases the efficiency of mining chemical space and considerably reduces time and costs in drug discovery.

Related glossaries include:
Applications: Drug Discovery & Development, Pharmacogenomics
Informatics: Algorithms & data analysis, Bioinformatics, Computers & computing, Databases & Software Directory, In silico & Molecular Modeling, Information management & interpretation
Technology: Chemistry & biology

affinity based data mining: Algorithms & data analysis glossary

CML Chemical Markup Language:  Wikipedia  http://en.wikipedia.org/wiki/Chemical_Markup_Language 

chemical informatics:  Chemical informatics is the application of computer technology to chemistry in all of its manifestations. Much of the current use of cheminformatics techniques is in the drug industry, but chemical informatics is now being applied to problems across the full range of chemistry. Chemical informaticians often work with massive amounts of data. They construct information systems that help chemists make sense of the data, attempting to predict the properties of chemical substances from a sample of data, much as Mendeleev did many years ago when he accurately predicted the existence and properties of unknown elements in the periodic table. Thus, through the application of information technology, chemical informatics helps chemists organize and analyze known scientific data and extract new information from that data to assist in the development of novel compounds, materials, and processes. Chemical Informatics at Indiana University, 2007 http://www.chembiogrid.org/related/resources/ciatiu.html 

Chemical informatics and cyberinfrastructure collaboratory at Indiana University http://www.chembiogrid.org/index.html 

chemical information: Many people view chemoinformatics as an extension of chemical information, which is a well established concept covering many areas that employ chemical structures, data storage and computational methods, such as compound registration databases, on- line chemical literature, SAR analysis and molecule- property calculation. Timothy Ritchie "Chemoinformatics; manipulating chemical information to facilitate decision- making in drug discovery" Drug Discovery Today 6(16): 813-814, Aug. 2001

chemical information system: Must include registration, computed and measured properties, chemical descriptors and inventory. The primary purpose is to be able to identify a chemical substance, find compounds similar to the target compound and determine the location of the compound. To effectively build it, an object definition of the chemical sample is paramount…The hub [central database] of the chemical information system is the inventory system. Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375-384, 1998
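
The "find compounds similar to the target compound" requirement can be illustrated with a minimal sketch. The source does not name a similarity measure; the Tanimoto coefficient on structural fingerprints is assumed here as one commonly used choice. The fingerprints (sets of "on" bit positions), the compound names, and the function names are hypothetical and only for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen for this sketch: two empty fingerprints score 0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def most_similar(target_fp, inventory, top_n=3):
    """Rank inventory compounds (name -> fingerprint) by similarity to the target."""
    scored = [(name, tanimoto(target_fp, fp)) for name, fp in inventory.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical toy inventory: names and bit positions are made up.
inventory = {
    "cmpd-001": frozenset({1, 2, 3, 8}),
    "cmpd-002": frozenset({2, 3, 4}),
    "cmpd-003": frozenset({9, 10}),
}
target = frozenset({1, 2, 3})
ranking = most_similar(target, inventory, top_n=2)
```

In a real system the fingerprints would come from a cheminformatics toolkit and the inventory from the registration database; the ranking logic stays the same.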

cheminformatics: The application of informatics tools to solve discovery chemistry problems. From library shaping to ADME-Tox prediction via virtual screening, computational chemistry is an integral component of hit and lead generation. Cheminformatics, World Pharmaceutical Congress,  May 23-25, 2006 • Philadelphia, PA

The practice of  finding the "best- fitting" compounds to address particular targets. The field encompasses diversity analysis and library design, virtual screening, rational drug design, and tools and approaches for predicting activity and other properties from structure.

Going by Google.com hit counts, cheminformatics currently seems to be the most used form of this word, having overtaken chemoinformatics. See the Glossary FAQ question #3 for details and methodology.

Related terms: Drug discovery & development, In silico & Molecular Modeling

chemi-informatics: See chemoinformatics

chemodescriptors: Hawkins DM, Basak SC, Kraker J, Geiss KT, Witzmann FA, Combining Chemodescriptors and Biodescriptors in Quantitative Structure-Activity Relationship Modeling, J Chem Inf Model. 46(1): 9-16, Jan 23, 2006

chemoinformatics: Chemoinformatics is a scientific discipline that has evolved in the last 40 years at the interface between chemistry and computer science. It has been realized that in many areas of chemistry, the huge amount of data and information produced by chemical research can only be processed and analyzed by computer methods. Furthermore, many of the problems faced in chemistry are so complex that novel approaches utilising solutions that are based on informatics methods are needed. Thus, methods were developed for building databases on chemical compounds and reactions, for the prediction of physical, chemical and biological properties of compounds and materials, for drug design, for structure elucidation, for the prediction of chemical reactions and for the design of organic syntheses. Obernai Declaration, Chemoinformatics in Europe: Research and Teaching, May 29-31, 2006 Obernai, France  http://infochim.u-strasbg.fr/chemoinformatics/Obernai_declaration.php 

The focus [of chemoinformatics] is placed on four traditional research areas: chemical database systems, computer-assisted structure elucidation systems, computer-assisted synthesis design systems, and 3D structure builders. WL Chen, Chemoinformatics: past, present, and future. Journal of Chemical Information and Modeling, 46(6): 2230-2255, Nov 2006

Chemoinformatics is an integral part of the discipline of knowledge management. Nicholas J. Hrib, Norton P. Peet "Chemoinformatics: are we exploiting these new science tools?" Drug Discovery Today 5 (11): 483-485, Nov. 2000

Increasingly incorporates "compound registration into databases, including library enumeration; access to primary and secondary scientific literature; QSAR (Quantitative Structure Activity Relationships) and similar tools for relating activity to structure; physical and chemical property calculations; chemical structure and property databases, chemical library design and analysis; structure-based design and statistical methods. Because these techniques have traditionally been considered the realms of scientists from different disciplines, differences in computer systems and terminology provide a barrier to effective communication. This is probably the single most challenging problem that chemoinformatics must solve." M Hann and R Green "Chemoinformatics – a new name for an old problem?" Current Opinion in Chemical Biology 3: 379-383, 1999

An emerging area, which annotates small molecules and also libraries with structure – function, synthesis, and all other relevant data used to design and develop better drugs. "Combinatorial Chemistry" Nature Biotechnology 18:  Supplement Oct. 2000, from Nature Biotechnology 16, 691– 693, 1998 

Mixing of information technology and management to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization. ... In chemoinformatics there are really only two [primary] questions: 1) what to test next and 2) what to make next. The main processes within drug discovery are lead identification, where a lead is something that has activity in the low micromolar range, and lead optimization, which is the process of transforming a lead into a drug candidate. Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375-384, 1998

Related terms: cheminformatics, chemi-informatics, chemometrics, computational chemistry.  

chemometrics: The application of statistics to the analysis of chemical data (from organic, analytical or medicinal chemistry) and design of chemical experiments and simulations. IUPAC Computational

The science of relating measurements made on a chemical system or process to the state of the system via application of mathematical or statistical methods. International Society of Chemometrics "ISC symbol and definition of chemometrics" 1997  Wikipedia chemometrics  http://en.wikipedia.org/wiki/Chemometrics

Related terms: In silico & molecular modeling glossary 3D-QSAR, comparative molecular field analysis (CoMFA), QSAR

ClogP values: In silico & molecular modeling glossary

computational biology: Bioinformatics glossary

computational chemistry: Chemistry & biology glossary Related terms: In silico & molecular modeling Computer Aided Molecular Design CAMD, molecular graphics

data mining: Nontrivial extraction of implicit, previously unknown and potentially useful information from data, or the search for relationships and global patterns that exist in databases. Bob Klevecz "The Whole EST Catalog" Scientist 12 (2): 22 Jan 18 1999. See also Algorithms & data analysis glossary

data warehouse: Algorithms & data analysis glossary

drug design: Includes not only ligand design, but also pharmacokinetics (Pharmacogenomics) and toxicity, which are mostly beyond the possibilities of structure- and/or computer-aided design. Nevertheless, appropriate chemometric (Chemoinformatics) tools, including experimental design and multivariate statistics, can be of value in the planning and evaluation of pharmacokinetic and toxicological experiments and results. Drug design is most often used instead of the correct term "ligand design". IUPAC Computational

The molecular designing of drugs for specific purposes (such as DNA- binding, enzyme inhibition, anti- cancer efficacy, etc.) based on knowledge of molecular properties such as activity of functional groups, molecular geometry, and electronic structure, and also on information cataloged on analogous molecules. Drug design is generally computer- assisted molecular modeling and does not include pharmacokinetics, dosage analysis, or drug administration analysis.  MeSH, 1989

An iterative process involving drug discovery, lead optimization and chemical synthesis with the aim of maximizing functional activity and minimizing adverse effects. 

Narrower terms: rational drug design, structure- based drug design, molecular design; In silico & molecular modeling 3D-QSAR, QSAR, Computer Aided Molecular Design, Computer Assisted Drug Design CADD, Computer Assisted Molecular Modeling CAMD, de novo design 

GUI Graphical User Interface: Computers & computing glossary

genetic algorithm GA:  Algorithms & data analysis glossary

hydrophilicity: The tendency of a molecule to be solvated by water. IUPAC Medicinal Chem

hydrophobicity: The association of non-polar groups or molecules in an aqueous environment which arises from the tendency of water to exclude non-polar molecules. (See also Lipophilicity). IUPAC Medicinal Chem

Immersive Virtual Reality IVR: New futuristic technique [which] enables the user to literally become a part of his or her data and to use additional senses. Although IVR has not yet enjoyed widespread use in scientific disciplines, it has been cost-effective in architectural design. Nicholas J. Hrib, Norton P. Peet "Chemoinformatics: are we exploiting these new science tools?" Drug Discovery Today 5 (11): 483-485, Nov. 2000

Related term: In silico & molecular modeling glossary VRML

information silos: The cultural aspects impeding communication between different groups can be immense, are often not recognized or articulated, and greatly impede interdisciplinary research.   Wikipedia http://en.wikipedia.org/wiki/Information_silo 

Google = about 2,220 July 7, 2003; about 38,400 Feb. 20, 2006; about 74,000 Nov. 10, 2006

in silico: In silico & molecular modeling glossary

 InChI: IUPAC International Chemical Identifier  http://www.iupac.org/publications/ci/2006/2806/4_tools.html 

Lipinski’s rules of five: See rules of five
Christopher Lipinski, Pfizer on Reducing the Investment Made in Likely Drug Development Failures, CHI's GenomeLink 15.1 http://www.healthtech.com/newsarticles/issue15_1.asp

lipophilicity: Represents the affinity of a molecule or a moiety for a lipophilic environment. It is commonly measured by its distribution behaviour in a biphasic system, either liquid- liquid (e.g., partition coefficient in octan-1-ol/water) or solid/liquid (retention on reversed- phase high performance liquid chromatography (RP-HPLC) or thin- layer chromatography (TLC) system).  (See also Hydrophobicity). IUPAC Medicinal Chem
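
The octan-1-ol/water partition coefficient mentioned above is usually reported on a log scale (log P). A minimal sketch of that arithmetic follows; the function name and the concentration values are illustrative assumptions, not measured data.

```python
import math


def log_p(conc_octanol, conc_water):
    """log10 of the partition coefficient P = [solute]octanol / [solute]water.

    Both concentrations must be in the same units and refer to the same
    (neutral) species at equilibrium.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical equilibrium concentrations (same arbitrary units in both phases).
example_log_p = log_p(conc_octanol=50.0, conc_water=0.5)
```

A positive log P means the compound prefers the octanol phase (more lipophilic); a negative value means it prefers water.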

"plug and play" systems: Required for effective chemoinformatics systems. Must be designed backward from the answer to the data to be captured and systems should be in components where each component has one simple task…modular systems that can "plug and play" into other systems. Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375- 384, 1998

predictive data mining: Algorithms & data analysis glossary Used in structure- function correlations.

Principal Components Analysis PCA:  Algorithms & data analysis glossary

rules of five: Lipinski’s rules. Set of criteria for predicting the oral bioavailability of a compound on the basis of simple molecular features (molecular weight, CLogP, numbers of hydrogen-bond donors and acceptors). Often used to profile a library or virtual library with respect to the proportion of drug-like members which it contains. IUPAC Combinatorial

An algorithm, developed by Christopher A. Lipinski (of Pfizer) and colleagues, in which many of the cutoff numbers are five or multiples of five. There are actually four rules, and Pfizer has developed a number of additional criteria for adoption of lead candidates. Advanced Drug Delivery Reviews 23: 3-25, 1997.

Reducing the investment made in likely drug development failure. CHI's Genome Link 15.1 http://www.healthtech.com/newsarticles/issue15_1.asp   Christopher Lipinski on the rules of five (see section 8.4)  There are actually 50+ rules now.
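
As a rough illustration of how the four rules are applied to profile a library, the sketch below uses the commonly cited cutoffs (molecular weight over 500, CLogP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The class and function names, the "at most one violation" default, and the descriptor values are assumptions made for this example; in practice the descriptors would be computed by a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    molecular_weight: float  # daltons
    clogp: float             # calculated log P
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d):
    """Return a list naming each of the four rules that the compound breaks."""
    violations = []
    if d.molecular_weight > 500:
        violations.append("molecular weight > 500")
    if d.clogp > 5:
        violations.append("CLogP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def is_drug_like(d, max_violations=1):
    """Assumed convention: tolerate at most `max_violations` broken rules."""
    return len(rule_of_five_violations(d)) <= max_violations


def drug_like_fraction(library):
    """Proportion of drug-like members in a (virtual) library."""
    if not library:
        return 0.0
    return sum(is_drug_like(d) for d in library) / len(library)


# Hypothetical compounds with made-up descriptor values.
small = Descriptors(350.0, 2.5, 2, 5)
large = Descriptors(720.0, 6.3, 6, 12)
fraction = drug_like_fraction([small, large])
```

The `max_violations` parameter is exposed because groups differ on how strictly they apply the rules; setting it to 0 requires all four criteria to be met.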

Structure Activity Relationship SAR: The relationship between chemical structure and pharmacological activity for a series of compounds  [IUPAC Medicinal Chemistry]

Compounds are often classed together because they have structural characteristics in common including shape, size, stereochemical arrangement, and distribution of functional groups. Other factors contributing to structure- activity relationship include chemical reactivity, electronic effects, resonance, and inductive effects. [MeSH, 1972]

Narrower terms: In silico & molecular Modeling  3D-QSAR, QSAR; Related terms: NMR SAR by NMR ; Algorithms & data management cluster analysis, Principal Components Analysis PCA, recursive partitioning 

structure based design: A design strategy for new chemical entities based on the three- dimensional (3D) structure of the target obtained by X-ray or nuclear magnetic resonance (NMR) studies, or from protein homology models. [IUPAC Computational]

structure based drug design:  Structure-based design (SBD) has been in use within the pharmaceutical industry for over twenty-five years. Given the multi-disciplinary nature of drug discovery and development, SBD can hardly be the unique success factor. However, SBD is playing an increasingly important role. SBD of compound properties are still developing and growing in acceptance. In this program, we wish to highlight some recent breakthroughs and successes using SBD, including the accompanying interest in ligand-based. Structure Based Drug Design: Sophisticated Approaches to Drug Discovery, June 25-27 2008, Boston MA

For years researchers have sought a more rational approach to designing drugs rather than screening for hits and leads. The rapidly growing body of structural information emerging as a result of genomic-derived targets and industrialization of protein structure determination is dramatically altering the data drug designers have to work with. Newer approaches, such as co-crystallization of ligands with a given target, allow structural techniques to be used for screening, which then further facilitates the design process.

Since the early 1980s, industry has been interested in structural biology as part of the discipline of direct structure- based drug design, which combines structural biology with computational and medicinal chemistry in order to design drugs - rather than merely selecting drugs - that modulate a protein target of interest.  

Broader term: drug design. Related terms: rational drug design; In silico & Molecular modeling; Drug targets target structure

"silo systems": Legacy method for many information systems, a system built to collect, store and report one laboratory’s data. Each "silo system" holds the data differently and may be in a different technology … the results of the systems cannot easily be interchanged … This is as much a corporate structure and resource problem as it is a technical problem. Contrast with "plug and play". Frank Brown "Chemoinformatics: What is it and How does it Impact Drug Discovery" Annual Reports in Medicinal Chemistry 33: 375- 384,1998

Related term: information silos

stereochemical formula (stereoformula): A three- dimensional view of a molecule either as such or in a projection. IUPAC Compendium

stereochemistry: See stereochemical formula (stereoformula)

Structure Activity Relationship (SAR): See the Structure Activity Relationship SAR entry above. See also Drug Discovery & Development

systems biology: Genetic manipulation & disruption glossary

virtual database assembly: A crucial activity as it enables access to the large number of drug-like molecules that could theoretically be made... can serve several purposes: for example, to generate a maximally diverse virtual library for lead generation, a biased library aimed at a specific target or target family, or a lead optimization library. Nicholas J. Hrib, Norton P. Peet "Chemoinformatics: are we exploiting these new science tools?" Drug Discovery Today 5 (11): 483-485, Nov. 2000

virtual library: A library which has no physical existence, being constructed solely in electronic form or on paper. The building blocks required for such a library may not exist, and the chemical steps for such a library may not have been tested. These libraries are used in the design and evaluation of possible libraries. IUPAC Combinatorial Chemistry 

Related terms:  Combinatorial libraries & synthesis; In silico & molecular modeling in silico
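
A virtual library of this kind can be pictured as the full set of building-block combinations on a common scaffold. The sketch below enumerates such combinations with plain string labels; the scaffold template, the substituent names, and the function names are hypothetical placeholders, not real chemical notation or a toolkit API.

```python
from itertools import product
from math import prod


def enumerate_virtual_library(scaffold, building_blocks):
    """Yield one label per combination of building blocks.

    `scaffold` is a format string with one named field per substitution
    position; `building_blocks` maps each position to its candidate blocks.
    """
    positions = sorted(building_blocks)
    for combo in product(*(building_blocks[p] for p in positions)):
        yield scaffold.format(**dict(zip(positions, combo)))


def library_size(building_blocks):
    """Number of members, computed without enumerating them."""
    return prod(len(blocks) for blocks in building_blocks.values())


# Hypothetical building blocks for a two-position scaffold.
blocks = {
    "R1": ["methyl", "ethyl"],
    "R2": ["phenyl", "benzyl", "pyridyl"],
}
members = list(enumerate_virtual_library("core({R1})({R2})", blocks))
```

Because the size is the product of the building-block counts, it grows very quickly with each added position, which is why such libraries are normally filtered in silico rather than synthesized in full.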

virtual molecules: It has also become clear that even the most efficient combinatorial chemistry approaches can generate only a minute fraction of the 1 x 10^40 virtual drug molecules that could be prepared. Timothy Ritchie "Chemoinformatics; manipulating chemical information to facilitate decision-making in drug discovery" Drug Discovery Today 6(16): 813-814, 16 Aug. 2001

virtual screening: In silico & molecular modeling glossary.

XML:  Computers & computing glossary 

Bibliography
ChemBioGrid related sites, Chemical Informatics and Cyberinfrastructure Collaboratory, Indiana Univ., US, 2007  http://www.chembiogrid.org/related/index.html 
Chemical Informatics Letters glossary, Jonathan Goodman, 2005,  100 + definitions    http://www-jmg.ch.cam.ac.uk/CIL/gloss.html 
COSMAS: Cheminformatics, Ontologies & Statistical Mining Aggregation Site http://www.methylsalicylate.com/cosmas/ Cheminformatics blog.
IUPAC International Union of Pure and Applied Chemistry, Glossary of Medicinal Chemistry, 1998. 100+ terms. http://www.chem.qmw.ac.uk/iupac/medchem/

IUPAC International Union of Pure and Applied Chemistry, Glossary of Terms used in Computational Drug Design, H. van de Waterbeemd, R.E. Carter, G. Grassy, H. Kubinyi, Y. C. Martin, M.S. Tute, P. Willett, 1997. 125+ definitions. http://www.iupac.org/reports/1997/6905vandewaterbeemd/glossary.html

IUPAC International Union of Pure and Applied Chemistry, Standard XML data dictionaries for chemistry, current project http://www.iupac.org/projects/2002/2002-022-1-024.html 

Virtual Computational Chemistry Laboratory
http://www.vcclab.org


IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry. 
