Sequencing DNA & proteins Glossary & taxonomy
Evolving Terminologies for Emerging Technologies

Comments? Suggestions Revisions? Mary Chitty
Last revised October 20, 2014.


The "race" to sequence the Human Genome was not a 100-yard dash, but a marathon.  Although the Human Genome Project finished well ahead of schedule, and a number of genes have been identified, we have only begun to glimpse what specific genes do and how we might better use this knowledge for therapeutic interventions.  Teasing apart the interactions of genes and proteins, delineating changes throughout the cell cycle, and correlating changes with health and disease will take even more time.  But with complete sequences and cross-species comparisons, we can expect new insights and faster progress over time. Sequencing DNA is only a first step towards finding what functions are connected with specific sequences. Sequencing proteins (and determining the structures and functions of proteins) is ongoing.

Related glossaries include  Applications: Biomarkers, Molecular Diagnostics, Molecular Medicine.  Informatics: Bioinformatics, Drug discovery informatics, Sequencing informatics terms in Genomic informatics.
Technologies: Chromatography & electrophoresis, Microarrays, Genomic technologies.  Biology: Functional genomics, Genomics, Pharmacogenomics, Proteins, Protein Structures, Proteomics, SNPs & genetic variations, Sequences - DNA & beyond.

$1,000 genome: Molecular Diagnostics

clinical sequencing: Clinical Sequencing, February 10-12, 2014, San Francisco, CA

coverage [sequencing]: Coverage is the average number of reads representing a given nucleotide in the reconstructed sequence ... Sometimes a distinction is made between sequence coverage and physical coverage. Sequence coverage is the average number of times a base is read (as described above). Physical coverage is the average number of times a base is read or spanned by mate paired reads.  Wikipedia, Shotgun sequencing, accessed Jan 10, 2011
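The two averages can be illustrated with a short calculation. This is only a sketch: the function names are hypothetical, the run figures are invented for illustration, and `insert_size` is assumed to mean the full span of a mate pair, including both reads.

```python
def sequence_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Average number of times a base is read: total bases read / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_reads * read_length / genome_length


def physical_coverage(num_pairs: int, insert_size: int, genome_length: int) -> float:
    """Average number of times a base is read or spanned by a mate pair.

    Assumption: insert_size is the full span of the pair, reads included.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_pairs * insert_size / genome_length


if __name__ == "__main__":
    # Hypothetical run: 1,000,000 mate pairs (2,000,000 reads) of 100 bases,
    # 500-base inserts, on a 5,000,000-base genome.
    print(sequence_coverage(2_000_000, 100, 5_000_000))  # 40.0
    print(physical_coverage(1_000_000, 500, 5_000_000))  # 100.0
```

Physical coverage comes out higher than sequence coverage here because each pair spans bases between its two reads that are never actually read.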

de novo sequencing: Determination of sequences (of genes or amino acids) whose sequence is not yet known. Can be done with LC/MS/MS or nanoelectrospray MS/MS.

From the Latin "de novo," meaning "from the beginning." See also Mass spectrometry

deep sequencing: Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced. The sequencing may be done by analysis of the synthesis or ligation products, hybridization to preexisting sequences, etc.  MeSH 2011

DNA sequencing:

exome sequencing: Targeted sequencing of all protein-coding regions in the human genome. It now offers an unprecedented opportunity for systematic, genome-wide discovery of somatic mutations in tumor tissue. Cancer. New epigenetic drivers of cancers, Elsässer SJ, Allis CD, Lewis PW. Science. 2011 Mar 4;331(6021):1145-6.

genotype: The genetic constitution of an organism as revealed by genetic or molecular analysis, i.e. the complete set of genes, both dominant and recessive, possessed by a particular cell or organism. IUPAC Biotech

The observed alleles at a genetic locus for an individual. NHLBI

The genetic constitution of the individual; the characterization of the genes.  MeSH 1968

genotyping:  The determination of relevant nucleotide-base sequences in each of the two parental chromosomes. May refer to identifying one or more nucleotides, up to the entire gene sequence of an organism. Compare phenotype. Used for diagnosis and for assessing drug efficacy and toxicity. Utilizes genomic DNA that, after digestion, reacts with a SNP array to obtain an individual SNP pattern. These variations can, for instance, provide information about the diagnosis of a certain disease, or the effectiveness or side effects of a certain drug.

The genetic scientific community is exploding with new robust tools that explore the connections between genotypes and phenotypes. Falling prices, as genotyping platforms move from developing to mature, result in abundant data to interrogate and analyze. In addition, as more detailed clinical classification of patients is performed, stronger genetic associations of complex diseases are discovered. Genotyping Tools, June 2009, San Francisco CA

Genotyping implies (though I haven't found this in print) determining known variants, as opposed to discovery of new ones. Related terms: SNPs & other genetic variations; Broader term: sequencing; Narrower terms: haplotyping, genome wide association studies

What is the difference between genotyping and sequencing? 23andMe

GWAS Genome-Wide Association Studies: Genomic informatics

haplogroups: Groups of similar haplotypes.  Wikipedia 

haplotype: The genetic constitution of individuals with respect to one member of a pair of allelic genes, or sets of genes that are closely linked and tend to be inherited together such as those of the MAJOR HISTOCOMPATIBILITY COMPLEX. MeSH, 1987

A haplotype is the set of SNP alleles along a region of a chromosome. Theoretically there could be many haplotypes in a chromosome region, but recent studies are typically finding only a few common haplotypes. Developing a Haplotype map of the human genome, 2001 

A particular pattern of sequential SNPs found on a single chromosome. These SNPs tend to be inherited together over time and can serve as disease-gene markers. The examination of single chromosome sets (haploid sets), as opposed to the usual chromosome pairings (diploid sets), is important because mutations in one copy of a chromosome pair can be masked by normal sequences present on the other copy. 

From “haploid genotype.”  The key idea is that alleles often travel together. Related terms: haplotyping, haplotyping technologies; Cell biology: diploid, haploid, ploidy; Maps & mapping: haplotype map, HapMap; Narrower terms: SNPs & genetic variations: haploinsufficiency, haplotype block, SNP haplotype
Wikipedia gives at least three different meanings for haplotype (accessed Jan 10, 2011).

haplotyping:  Haplotyping involves grouping subjects by haplotypes, or particular patterns of sequential SNPs, found on a single chromosome. These SNPs tend to be inherited together over time and can serve as disease-gene markers. Haplotyping—A Key Approach to Studying Genetic Variation  CHI's GenomeLInk 14.2  

Somatic cells, as opposed to germ cells, have two copies of each chromosome. A given single-base position may be homozygous for the wild-type base (each chromosome has the normal allele), homozygous for a SNP base (each chromosome has the altered allele), or heterozygous for two different bases (one chromosome has the normal allele and the other has the abnormal allele). The examination of single chromosome sets (haploid sets), as opposed to the usual chromosome pairings (diploid sets), is important because mutations in one copy of a chromosome pair can be masked by normal sequences present on the other copy.  Genes tend to travel in packs. This is good news for pharmacogenomics. Broader terms: genotyping, sequencing
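As a rough illustration of the two ideas in this entry, the sketch below classifies a single-base position from its two alleles and then groups subjects by the haplotypes they carry. The function names, subject labels, and allele strings are hypothetical, and the haplotypes are assumed to be already phased (one string of SNP alleles per chromosome copy).

```python
from collections import defaultdict


def classify_site(allele_a: str, allele_b: str, wild_type: str) -> str:
    """Classify one single-base position from the alleles on the two chromosome copies."""
    if allele_a == allele_b:
        return "homozygous wild-type" if allele_a == wild_type else "homozygous SNP"
    return "heterozygous"


def group_by_haplotype(phased: dict[str, tuple[str, str]]) -> dict[str, list[str]]:
    """Map each haplotype (string of SNP alleles on one chromosome) to its carriers."""
    groups: dict[str, list[str]] = defaultdict(list)
    for subject, (hap1, hap2) in phased.items():
        for hap in {hap1, hap2}:  # a set, so a homozygous carrier is listed once
            groups[hap].append(subject)
    return {hap: sorted(subjects) for hap, subjects in groups.items()}


if __name__ == "__main__":
    print(classify_site("A", "G", wild_type="A"))  # heterozygous
    # Hypothetical phased data: three SNP positions on each chromosome copy.
    subjects = {
        "s1": ("ACG", "ACG"),
        "s2": ("ACG", "GTA"),
        "s3": ("GTA", "ACT"),
    }
    for haplotype, carriers in sorted(group_by_haplotype(subjects).items()):
        print(haplotype, carriers)
```

Real genotype data is usually unphased, so inferring which alleles sit together on one chromosome is a separate step that this sketch does not attempt.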

haplotyping technologies: Include microarrays, mass spectrometry, sequencing

horizontal sequencing: The obvious alternative [to vertical sequencing] is to perform all four reactions in one vial and determine the sequence by comparing determined oligonucleotide mass differences with expected data (horizontal sequencing). Eckhard Nordhoff, Christine Luebbert, Gabriela Thiele, Volker Heiser, and Hans Lehrach, Rapid determination of short DNA sequences, Nucleic Acids Res 28(20), Oct 15, 2000.  Related term: vertical sequencing

Maxam-Gilbert sequencing & Sanger sequencing: The two basic sequencing approaches, Maxam-Gilbert and Sanger, differ primarily in the way the nested DNA fragments are produced. Both methods work because gel electrophoresis produces very high resolution separations of DNA molecules; even fragments that differ in size by only a single nucleotide can be resolved. Almost all steps in these sequencing methods are now automated.

Maxam-Gilbert sequencing (also called the chemical degradation method) uses chemicals to cleave DNA at specific bases, resulting in fragments of different lengths. A refinement to the Maxam-Gilbert method known as multiplex sequencing enables investigators to analyze about 40 clones on a single DNA sequencing gel.

Sanger sequencing (also called the chain termination or dideoxy method) involves using an enzymatic procedure to synthesize DNA chains of varying length in four different reactions, stopping the DNA replication at positions occupied by one of the four bases, and then determining the resulting fragment lengths. Primer on Molecular Genetics,  Oak Ridge National Lab, US
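The chain termination idea can be sketched as a toy simulation: each of the four reactions yields chains that stop at one particular base, and ordering all chains by length recovers the sequence. This is an idealized model with hypothetical function names. It assumes every position produces exactly one terminated chain and that lengths are resolved perfectly, which a real gel only approximates.

```python
def chain_termination_fragments(synthesized: str) -> dict[str, list[int]]:
    """Return, for each of the four reactions, the lengths of its terminated chains.

    A chain of length n ends at position n of the synthesized strand, so the
    reaction for base X holds every n where that position is X.
    """
    reactions: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized.upper(), start=1):
        if base not in reactions:
            raise ValueError(f"unexpected base: {base!r}")
        reactions[base].append(length)
    return reactions


def read_gel(reactions: dict[str, list[int]]) -> str:
    """Read the sequence by ordering all chains from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in reactions.items() for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    lanes = chain_termination_fragments("GATTACA")
    print(lanes)            # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
    print(read_gel(lanes))  # GATTACA
```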

microsequencing: Sequencing of proteins or peptides in very small amounts (sub-microgram), sometimes for use as probes.

minisequencing: A solid-phase method for the detection of any known point mutation or allelic variation of DNA. In this method, amplified, biotinylated DNA sequences containing the mutation site are immobilized onto a streptavidin-coated microplate, and primer extension reactions are carried out using labeled nucleotides. Incorporation of the labeled nucleotide depends on the genotype and is analyzed using an ELISA technique. The assay method allows automation. Photometry applications, Labsystems Oy, Finland, no longer on website

Single base sequencing. 

multilocus sequence typing: Direct nucleotide sequencing of gene fragments from multiple housekeeping genes for the purpose of phylogenetic analysis, organism identification, and typing of species, strain, serovar, or other distinguishable phylogenetic level. MeSH 2011

next generation sequencing:  This report focuses on current and innovative NGS technologies, services and markets. Next Generation Sequencing Generates Momentum, July 2011

Next-generation sequencing now makes it possible to determine the sequence of a genome at accessible prices and in a short period of time. DNA-Seq

As the cost of genome sequencing falls dramatically and new technologies emerge, researchers face ever-increasing amounts of data.  What are the best ways to analyze and interpret this data? How do you select the best sequencing technology for your needs? How will this technology impact diagnostics and medical care?  

Sequencing a genome is only the beginning. Several layers of analysis are necessary to convert raw sequence data into understanding of functional biology. First, error sources in the original raw data from multiple platforms and diverse applications must be accounted for. Then, as computational methods for assembly, alignment, and variation detection continue to advance, a broad range of genetic analysis applications including comparative genomics, high-throughput polymorphism detection, analysis of coding and non-coding RNAs, and identifying mutant genes in disease pathways can be addressed.  

Next Generation Sequencing (NGS) Leaders: A community created to advance the use and value of next-generation sequencing through community-based knowledge sharing.

optical mapping: Stretching DNA molecules in nanochannels allows structural and copy-number variations to be visualized like beads on a string. Channeling DNA for optical mapping Yael Michaeli  & Yuval Ebenstein  Nature Biotechnology 30,  762–763 (2012)  doi:10.1038/nbt.2324 published online August

Optical Mapping is the process which allows the creation of a genome or chromosome sized restriction enzyme map of an organism, from very small quantities of high molecular weight DNA. The DNA is run through nanochannels, fixed in place, stained, digested and visualised using an optical microscope. The individual fragments within the molecules of DNA are then measured and the molecules are assembled together according to matching patterns of cleavage, thus creating a de novo restriction enzyme map. This entire process can be carried out within a week, a similar speed to the generation of NGS sequence. Scaffold contigs of NGS data can be digested in silico and aligned to the optical map, allowing ordering, orientation and gap sizing information to be inferred. Optical Mapping as a Complementary Technology to NGS
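The in silico digestion step mentioned above can be sketched as follows. The function name and the example contig are hypothetical, and the enzyme is an illustrative choice (EcoRI, which recognizes GAATTC and cuts after the G). The sketch treats the contig as a linear molecule and searches one strand only, which is enough for a palindromic site.

```python
def digest_in_silico(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return the ordered fragment lengths produced by cutting at every site.

    cut_offset is the position of the cut within the recognition site
    (0 means the cut falls just before the first base of the site).
    """
    sequence = sequence.upper()
    site = site.upper()
    if not site or not 0 <= cut_offset <= len(site):
        raise ValueError("site must be non-empty and cut_offset must fall within it")
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    # Drop zero-length pieces that arise when a cut falls at either end.
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


if __name__ == "__main__":
    contig = "AAAGAATTCCCCCGAATTCTT"  # hypothetical 21-base contig with two sites
    print(digest_in_silico(contig, "GAATTC", cut_offset=1))  # [4, 10, 7]
```

The ordered list of fragment lengths is the pattern that would be compared with the fragment sizes measured on the optical map.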

Next generation sequencing (NGS) is revolutionizing all fields of biological research but it fails to extract the full range of information associated with genetic material. Optical mapping of DNA grants access to genetic and epigenetic information on individual DNA molecules up to 1 Mbp in length.  Beyond sequencing: optical mapping of DNA in the age of nanotechnology and nanoscopy Michal Levy-Sakin,  Yuval Ebenstein Current Opinion in Biotechnology Volume 24, Issue 4, August 2013, Pages 690–698

pathogen sequencing: In the future, more pathogens will have their genomes completely sequenced to determine not only how the pathogen causes disease, but what, if any, treatments will be most effective. The DNA sequences of viruses like HIV, human papilloma virus (HPV), and hepatitis C (HCV) are already being characterized and therapies prescribed based on this genetic information. To perform these types of diagnoses, DNA sequencing will have to become faster, more cost effective, simpler to perform, and more accessible to clinical laboratories. 

published working drafts - human genome: International Human Genome Sequencing Consortium special issue: Nature 409 (6822) 15 Feb 2001

Human Genome [Celera Genomics sequence] special issue: Science 291 (5507) Feb. 16, 2001

resequencing: Eric Lander, director of the Whitehead Institute's Center for Genome Research and professor of biology at MIT, notes "The human genome will need to be sequenced only once, but it will be resequenced thousands of times, in order, for example to unravel the polygenic factors underlying human susceptibilities and predispositions … Re-sequencing will also provide the ultimate tool for genotyping studies." E. Lander, "The New Genomics," Science 274: 536, 25 Oct. 1996

A previously sequenced site is resequenced for SNP discovery or other purposes.  DNA resequencing involves sequencing a DNA region where a reference sequence for the region is already available. These studies provide important insight into the function of genes and the evolution of genes and populations. Applications abound, including comparative genomics, high-throughput SNP detection, identifying mutant genes in disease pathways, profiling transcriptomes for organisms where little information is available, researching lowly expressed genes, and identifying newly emerging or genetically engineered bacterial and viral strains.

RNA sequencing: RNA-Seq and Transcriptome Analysis, December 5-6, 2013, Lisboa, Portugal; March 7-8, 2012, San Diego, CA

SNP Single Nucleotide Polymorphism: SNPs & Genetic Variations

SNP scoring: Involves methods to determine the genotypes of many individuals for particular SNPs that have already been discovered. ... tools are just beginning to emerge and many more robust technologies are needed.  NIH, Methods for Discovering and Scoring Single Nucleotide Polymorphisms, Request for Applications Jan. 9, 1998

SNP, Perkin Elmer 

Sanger sequencing: See under Maxam-Gilbert sequencing.

scanning, scoring: SNPs & other genetic variations

sequence inversion: The deletion and reinsertion of a segment of a nucleic acid sequence in the same place, but flipped in an opposite orientation. MeSH 2010
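A minimal sketch of this definition is given below. It assumes double-stranded DNA written as one strand, so flipping a segment in place means replacing it with its reverse complement. The function name and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(sequence: str, start: int, end: int) -> str:
    """Return sequence with the segment sequence[start:end] inverted in place.

    Assumption: the string is one strand of double-stranded DNA, so the
    flipped segment reads as the reverse complement of the original segment.
    """
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("start and end must satisfy 0 <= start <= end <= len(sequence)")
    flipped = sequence[start:end].translate(COMPLEMENT)[::-1]
    return sequence[:start] + flipped + sequence[end:]


if __name__ == "__main__":
    print(invert_segment("TTACGGAA", 2, 6))  # TTCCGTAA
```

Inverting the same segment a second time restores the original sequence, which makes a convenient sanity check.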

sequencing: Proteins, nucleic acids -- Analytical procedures for the determination of the order of amino acids in a polypeptide chain or of nucleotides in a DNA or RNA molecule. IUPAC Compendium  

Largely automated now. Full DNA sequencing is the "gold standard" for genotyping.    

Narrower terms: next generation sequencing, shotgun sequencing, de novo sequencing, microsequencing, minisequencing, multiplex sequencing, Sanger sequencing, sequencing by synthesis.  Related terms: genotyping, GWAS Genome-Wide Association Studies, haplotyping, sequencing data analysis & storage, sequencing data management

sequencing by synthesis: Promising new sequencing technologies, based on sequencing by synthesis (SBS), are starting to deliver large amounts of DNA sequence at very low cost. Polymorphism detection is a key application. Quality scores and SNP detection in sequencing-by-synthesis systems, Brockman W, Alvarez P, Young S, Garber M, Giannoukos G, Lee WL, Russ C, Lander ES, Nusbaum C, Jaffe DB, Genome Research 2008 Jan 22 [Epub ahead of print]

sequencing - cost of: Cheap and easy genome sequencing has been both a blessing and a curse. We are able to find an incredible wealth of variation, but for the most part we have no easy way to tell whether a difference might contribute to a disease or not. The poster child for this problem is autism. Lots of genome wide association studies (GWAS) have been done and lots of rare variants in lots of different genes have been found – unfortunately, way too many to pick out the ones that really matter.  Luckily our friend yeast can help. Yeast winnows down GWAS hits in autism, SGD Database 2013

The cost of sequencing a single DNA base was about $10 [when the Human Genome Project was initiated]; today, sequencing costs have fallen about 100-fold to $0.10 to $0.20 a base and still are dropping rapidly. Human Genome News 11 (1-2) Nov. 2000  Related term: Molecular Diagnostics $1,000 genome
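The arithmetic behind these figures can be checked with a few lines. The function names are hypothetical, and the genome size of 3 billion bases is a round-number assumption for one copy of the human genome, not a figure from the source.

```python
GENOME_BASES = 3_000_000_000  # assumed round figure for one human genome copy


def fold_reduction(old_cost: float, new_cost: float) -> float:
    """How many times cheaper the new per-base cost is than the old one."""
    return old_cost / new_cost


def genome_cost(cost_per_base: float, genome_bases: int = GENOME_BASES) -> float:
    """Cost of reading every base once at the given per-base price."""
    return cost_per_base * genome_bases


def per_base_cost_for_budget(budget: float, genome_bases: int = GENOME_BASES) -> float:
    """Per-base price at which one pass over the genome fits the budget."""
    return budget / genome_bases


if __name__ == "__main__":
    print(fold_reduction(10.0, 0.10))  # 100.0
    print(fold_reduction(10.0, 0.20))  # 50.0
    print(f"${genome_cost(0.10):,.0f} per genome at $0.10 a base")
    print(f"${per_base_cost_for_budget(1000):.2e} a base for a $1,000 genome")
```

Under these assumptions, the quoted 100-fold drop corresponds to the $0.10 end of the range, and a $1,000 genome would need a per-base price several orders of magnitude lower still.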

sequencing - high-throughput: Uses robotics, automated DNA-sequencing machines and computers.

sequencing informatics

shotgun sequencing: Sequencing method which involves randomly sequencing tiny cloned pieces of the genome, with no foreknowledge of where on a chromosome the piece originally came from. This can be contrasted with "directed" [sequencing] strategies, in which pieces of DNA from adjacent stretches of a chromosome are sequenced. Directed strategies eliminate the need for complex reassembly techniques. Because there are advantages to both strategies, researchers expect to use both random (or shotgun) and directed strategies in combination to sequence the human genome. DOE Glossary

Shotgun sequencing comes of age, Tabitha Powledge, Scientist Dec. 31, 2002.  A hybrid of the whole genome shotgun and clone-by-clone approaches is probably best.
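The reassembly that shotgun strategies require can be illustrated with a toy greedy merger, which repeatedly joins the two pieces with the longest suffix-to-prefix overlap. This is a simplified sketch, not how production assemblers work: the function names and reads are hypothetical, reads are assumed error-free and all from the same strand, and repeats are ignored.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if under min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if k > 0 and a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge reads greedily by best overlap until no overlap of min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Three overlapping reads from a hypothetical 15-base sequence, in shuffled order.
    reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
    print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

When reads fail to overlap by at least `min_len` bases, the function returns several contigs rather than one, which mirrors the gaps that directed strategies are used to close.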

third-generation sequencing (TGS): Sequencing single DNA molecules without the need to halt between read steps (whether enzymatic or otherwise).  A window into third-generation sequencing, Glossary, Eric E. Schadt, Steve Turner, Andrew Kasarskis, Human Molecular Genetics 19 (R2): R227-R240.

vertical sequencing: Among the many proposed concepts for sequencing DNA using mass spectrometry, the most successful has been to combine Sanger cycle sequencing with MALDI-TOF-MS. Four nucleobase-specific oligonucleotide ladders are generated in separate reaction vials, which are then separated and detected inside the mass spectrometer. The sequence is determined by comparing the recorded spectra (vertical sequencing). Eckhard Nordhoff, Christine Luebbert, Gabriela Thiele, Volker Heiser, and Hans Lehrach, Rapid determination of short DNA sequences, Nucleic Acids Res 28(20), Oct 15, 2000.  Related term: horizontal sequencing

viral genotyping: Genomic data is enabling researchers to predict a patient's response to therapy based on the viral genotype for viral infections. HIV genotyping is an early example of how treatment decisions are made based on the genotype of the virus.

whole genome shotgun sequencing: Whole Genome Shotgun (WGS) sequencing projects are incomplete genomes or incomplete chromosomes that are being sequenced by a whole genome shotgun strategy. WGS projects may be annotated, but annotation is not required. The pieces of a WGS project are the contigs (overlapping reads), and they do not include any gaps. NCBI Whole Genome Shotgun Submissions  Broader term: shotgun sequencing method.  Related term: GWAS Genome-Wide Association Studies

454 Glossary, Roche Diagnostics 1996-2011
DOE, Human Genome Project Information, Oak Ridge National Laboratory, Dictionary of Genetic Terms. 2007, 100+ definitions.  
Ensembl Glossary 
IUPAC  International Union of Pure and Applied Chemistry, Glossary for Chemists of terms used in biotechnology. Recommendations, Pure & Applied Chemistry 64 (1): 143-168, 1992. 200 + definitions.
IUPAC International Union of Pure and Applied Chemistry, Glossary of Terms used in Bioinorganic Chemistry, Recommendations, 1997. 450+ definitions.
MeSH Medical Subject Headings, (PubMed Browser) National Library of Medicine, Revised annually.  250,000 entry terms, 19,000 main headings. 
NCBI (US) BLAST Glossary, 2000. 40+ definitions
NHGRI (National Human Genome Research Institute), Talking Glossary of Genetic Terms, 100+ definitions.   Includes extended audio definitions.

IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.