
Pharmaceutical Sequences – DNA & beyond
Evolving terminology for emerging technologies

Comments? Suggestions? Revisions? Mary Chitty mchitty@healthtech.com
Last revised March 23, 2012 



Gene definitions, DNA Proteins, Protein Structures and RNA are sub-categories inextricably linked to this glossary. 
Other related glossaries include Applications: Genomics, Proteomics; Informatics: Algorithms, In silico & Molecular Modeling; Technologies: Microarrays & protein chips, Sequencing; Biology: Biomolecules, Expression, Glycosciences.
Not until the technologies for working with nucleic acids and proteins are better integrated will these fields be more visibly interdependent.

3' [three prime] bias: 

3' [three prime] flanking region: The region of DNA which borders the 3' end of a transcription unit and where a variety of regulatory sequences are located. MeSH, 2002

3' UTR (three prime): The sequence at the 3' end of messenger RNA that does not code for product. This region contains transcription and translation regulating sequences. MeSH, 1999

Region at the 3' end of a mature transcript (following the stop codon)  that is not translated into a protein. [DDBJ/ EMBL/ GenBank Feature Table]  http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

A term that identifies one end of a single- stranded nucleic acid molecule. The 3' end is that end of the molecule which terminates in a 3' hydroxyl group. The 3' direction is the direction toward the 3' end. Nucleic acid sequences are written with the 5' end to the left and the 3' end to the right, in reference to the direction of DNA synthesis during replication (from 5' to 3'), RNA synthesis during transcription (from 5' to 3'), and the reading of mRNA sequence (from 5' to 3') during translation. Broader term: UTR. Related terms: 5' (5-prime); Gene amplification & PCR: primer extension

5' (5-prime):  The sequence at the 5' end of the messenger RNA that does not code for product. This sequence contains the ribosome binding site and other transcription and translation regulating sequences. MeSH, 1999

A term that identifies one end of a single- stranded nucleic acid molecule. The 5' end is that end of the molecule which terminates in a 5' phosphate group. The 5' direction is the direction toward the 5' end. Nucleic acid sequences are written with the 5' end to the left and the 3' end to the right, in reference to the direction of DNA synthesis during replication (from 5' to 3'), RNA synthesis during transcription (from 5' to 3'), and the reading of mRNA sequence (from 5' to 3') during translation.  [Mouse Genome Informatics] Related term: 3' (3-prime)

5' Flanking Region:  The region of DNA which borders the 5' end of a transcription unit and where a variety of regulatory sequences are located.  MeSH 2002

5' UTR (five prime): Region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein. [DDBJ/ EMBL/ GenBank Feature Table]   http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

5' Untranslated Region: That portion of an mRNA from the 5' end to the position of the first codon used in translation. Related terms: 3' UTR, 3' (3-prime); Gene amplification glossary: primer extension. Broader term: UTR

ATCG: See adenine, base, base pair, thymine, cytosine, guanine

adenine (A): A nitrogenous base, one member of the base pair AT (adenine/ thymine). [DOE]

alternative exons: Gene definitions
alternative promoters: See under promoter
alternative splicing: Gene definitions Broader term: splicing; Related terms: pre- mRNA splicing, protein splicing, RNA splicing, trans- splicing
alternative transcripts: Expression, genes & beyond

amino acid sequence:  The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining protein conformation. MeSH, 1966

attenuator: In prokaryotes. 1) region of DNA at which regulation of termination of  transcription occurs, which controls the expression of some bacterial operons;  2) sequence segment located between the promoter and the first structural gene that causes partial termination of transcription. [DDBJ/ EMBL/ GenBank Feature Table]  http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

base: Adenine, cytosine, guanine, thymine, and (only in RNA) uracil. Related terms: base pair, nucleotide [DOE]  

Called bases because they are alkaline (basic) in the acidic DNA structure. Base and base pair used "fairly indiscriminately" by molecular biologists [Bains]

base pair (bp): Two bases which form a "rung of the DNA ladder." A DNA nucleotide is made of a molecule of sugar, a molecule of phosphoric acid, and a molecule called a base. The bases are the "letters" that spell out the genetic code. In DNA, the code letters are A, T, G, and C, which stand for the chemicals adenine, thymine, guanine, and cytosine, respectively. In base pairing, adenine always pairs with thymine, and guanine always pairs with cytosine. [NHGRI]  Narrower terms: adenine, cytosine, guanine, thymine, uracil
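
The pairing rule above (A with T, G with C) can be illustrated with a minimal Python sketch. The function names and the four-base example sequence are hypothetical, chosen only for illustration; they are not drawn from any of the cited glossaries.

```python
# Watson-Crick pairing in DNA as stated above: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base pairing partner of each base in a DNA sequence."""
    return "".join(DNA_PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3' like the input."""
    return complement(seq)[::-1]


example = "ATGC"                         # hypothetical sequence, written 5' to 3'
paired = complement(example)             # partner bases, read 3' to 5'
opposite = reverse_complement(example)   # same partner strand, read 5' to 3'
```

Because sequences are conventionally written 5' to 3', the opposite strand is usually reported as the reverse complement rather than the plain complement.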

biological macromolecules: Biomolecules glossary
cDNA complementary DNA: Gene definitions 
carbohydrate sequence: Glycosciences glossary

central dogma: Horace Freeland Judson quotes Francis Crick talking about the central dogma: "Nobody tried to go from protein sequence back to nucleic acid, because that just wasn't on. You see. But I don't think it was ever discussed. ... Jim, [Watson] you might say, had it first. DNA makes RNA makes protein. That became then the general idea. ... what are all the possible information flows?" [Judson asked why he had called it the central dogma.] "It was because, I think, of my curious religious upbringing. Because Jacques [Monod] has since told me that a dogma is something which a true believer cannot doubt!" Crick laughed. ... "But that wasn't what was in my mind. My mind was, that a dogma was an idea for which there was no reasonable evidence. You see?!" And Crick gave a roar of delight. "I just didn't know what dogma meant. And I could just as well have called it the 'Central Hypothesis' - you know. Which is what I meant to say. Dogma was just a catch phrase.  ... And it's a negative hypothesis, so it's very very difficult to prove.... The central dogma is much more powerful [than Crick's sequence hypothesis], and therefore in principle you might have to say it could never be proved. But its utility - there was no doubt about that. Because if you didn't believe that, you could invent theories, unlimited theories, whereas if you just put in that one assumption, ... then, essentially you were on the right track you see." ... "In looking back I am struck not only by the brashness which allowed us to venture powerful statements of a very general nature, but also by the rather delicate discrimination used in selecting what statements to make. Time has shown that not everybody appreciated our restraint." [HF Judson, Eighth Day of Creation, Cold Spring Harbor Laboratory Press 1996 pp. 333-334]

Francis Crick "Central dogma of molecular biology" Nature 227 (5258): 561-563 Aug. 8, 1970 [historical article clarifying original explanation]

The Oxford English Dictionary makes clear the duality of dogma, particularly in the context of dogmatic, defined as "accepted as true instead of being based upon experience, particularly if done in an imperious, arrogant manner".  Dogma is defined as "systematised beliefs" (sometimes deprecating). Dogmatic physicians are cited as "an ancient sect" which "endeavoured to discover by reasoning the essence and occult causes" of disease.  Related terms: transcription, translation  

central dogma exceptions: Reverse transcription, prions, retroviruses?   
1. Reverse transcriptase and RNA genomes. DNA is not the only molecule of heredity in nature and, as David Baltimore and Howard Temin showed, the flow of information from DNA to RNA is not the only pathway possible. 2. Catalytic RNAs (ribozymes). Proteins are not the only structures capable of catalyzing a reaction. Tom Cech demonstrated the catalytic nature of certain classes of introns (intervening sequences) that are able to "self-splice." In addition, Harry Noller has shown that the synthesis of the peptide bond during protein synthesis is catalyzed by the 23S rRNA of the ribosome. 3. Heritable proteins. Stanley Prusiner has given us the novel name "prion" (proteinaceous infectious particle) to describe the agent responsible for a number of slow, neurological infectious diseases, including scrapie, bovine spongiform encephalopathy (mad cow disease) and Creutzfeldt- Jakob disease. [Martinez Hewlett, Molecular Biology 411, Univ. of Arizona, Tucson US] http://www.blc.arizona.edu/marty/411/Modules/mod4.html

cis-acting sequences: The sequences just 5' of the start site of transcription are the most important for the initiation of transcription. This is where the transcription complex is built. In general, this region is called the promoter. For eukaryotes, several sequences seem to be conserved among many genes. One such sequence is the TATA box. The sequence is located about 30 bases upstream (-30) from the transcription start site and is the one sequence required for any significant transcription to occur. Other sequences aid in transcription but are not always part of the promoter. The two most commonly found are the CCAAT box (called the CAT box) and the GC box. Because mutants of these three sequences only express mRNAs at low levels, these are considered the most important sequences of the basic transcription complex. [Phillip McClean, "Control of gene expression in eukaryotes," North Dakota State Univ. 1997]  http://www.ndsu.nodak.edu/instruct/mcclean/plsc431/geneexpress/eukaryex3.htm

Does not usually code for proteins. Compare trans-acting. Expression glossary

cis-splicing: RNA glossary
cis-trans: Gene definitions
clone, cloning: Cell biology glossary
coding region(s): Gene definitions
codon: RNA glossary
 

cytosine (C): A nitrogenous base, one member of the base pair GC (guanine and cytosine). [DOE]

DNA: DNA glossary Narrower terms: exons, genes, introns, LINES, SINES
DNA-directed RNA polymerase: RNA glossary
DNA - RNA - protein: See central dogma  Related term: transposons
How are these two terms different?

ds: Double-stranded (DNA or RNA).

downstream:  Identifies sequences proceeding farther in the direction of expression; for example, the coding region is downstream from the initiation codon, toward the 3' end of an mRNA molecule. Sometimes used to refer to a position within a protein sequence, in which case downstream is toward the carboxyl end which is synthesized after the amino end during translation. [Lemon]

draft genome sequence [human]: Sequencing glossary
EST Expressed Sequence Tag: DNA glossary

enhancer: A cis- acting sequence that increases the utilization of (some)  eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter. Eukaryotes and eukaryotic viruses. [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html 

At the 5' and 3' end of the gene, enhancers are located, which respond to the signals mediated by the proteins regulating the function of the gene. Enhancers can also be located within the introns. The regulative effect of the enhancers is either positive or negative. In the latter case they are often called silencers [for reviews concerning enhancers and silencers, see for example 141, 142]. ... In the cis-trans test, the E- g-/E+ g+ cis-heterozygote is phenotypically wild, whereas the E- g+/E+ g- trans-heterozygote is phenotypically mutant. Thus the cis-trans test gives a positive result. This means that we cannot on the basis of a genetic test alone distinguish between an enhancer and the transcription unit regulated by it; biochemical evidence is needed. Thus, by definition, the regulatory elements of a transcription unit, such as enhancers, have to be included in the gene itself.  Petter Portin in "The Origin, Development and Present Status of the Concept of the Gene: A Short Historical Account of the Discoveries" Current Genomics, 2000  http://www.bentham.org/cg/sample/cg1-1/Portin.pdf

enhancer elements (genetics): Cis- acting DNA sequences which can increase transcription of genes. Enhancers can usually function in either orientation and at various distances from a promoter.  [MeSH, 1988]  Related term: promoter 

exons: Gene definitions
exteins: See under inteins. Related terms: inteins, protein splicing
gene expression: Expression glossary  See also Microarrays
gene identification, gene prediction: In silico & Molecular modeling glossary

genetic code:  The sequence of nucleotides, coded in triplets (codons) along the mRNA, that determines the sequence of amino acids in protein synthesis. The DNA sequence of a gene can be used to predict the mRNA sequence, and the genetic code can in turn be used to predict the amino acid sequence. [DOE] 
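
As a rough illustration of the two prediction steps in the definition above (DNA to mRNA, then mRNA to amino acids), the following Python sketch builds the standard codon table and translates a short sequence. The function names and the nine-base example are hypothetical. The sketch assumes the standard genetic code only and takes the coding (sense) strand as input, so predicting the mRNA is just a matter of replacing T with U.

```python
from itertools import product

# Standard genetic code, laid out with bases ordered U, C, A, G at each
# codon position. "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def dna_to_mrna(coding_strand: str) -> str:
    """Predict the mRNA from the coding (sense) DNA strand: T becomes U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = dna_to_mrna("ATGGCCTGA")   # hypothetical gene fragment
peptide = translate(mrna)         # one-letter amino acid codes
```

Here the three codons AUG, GCC and UGA read as methionine, alanine and stop, so the predicted peptide is "MA".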

The notion of a “code” as the key to information transfer was not articulated publicly until late 1954, when [George] Gamow, Martynas Ycas, and Alexander Rich published an article that defined the code idiom for the first time since Watson and Crick casually mentioned it in a 1953 article. Yet the concept of coding applied to genetic specificity was somewhat misleading, as translation between the 4 nucleic acid bases and the 20 amino acids would obey the rules of a cipher instead of a code. As Crick acknowledged years later, in linguistic analysis, ciphers generally operate on units of regular length (as in the triplet DNA scheme), whereas codes operate on units of variable length (e.g., words, phrases). But the code metaphor worked well, even though it was literally inaccurate, and in Crick’s words, “‘Genetic code’ sounds a lot more intriguing than ‘genetic cipher’.” Codes and the information transfer metaphor were extraordinarily powerful, and heredity was often described as a biological form of electronic communication. [Richard A. Pizzi "Genetic ciphering" Modern Drug Discovery  4 (3): 65- 66 Mar. 2001] http://pubs.acs.org/subscribe/journals/mdd/v04/i03/html/03timeline.html
Who Wrote the Book of Life? A History of the Genetic Code. Lily E. Kay, Stanford University Press, 2000.    Related term: central dogma

genomic DNA: DNA glossary

genomic sequence: In April 2003, the sequence of the human genome will be essentially complete. For the scientific community now to make the best use of that fundamental information resource, the identity and precise location of all sequence-based functional elements in the genome must be determined. While many of the protein-coding genes are already known, many others remain to be identified. Beyond open reading frames, non- protein- coding genes, transcriptional regulatory elements and determinants of chromosome structure and function remain largely unknown. A comprehensive encyclopedia of all of these features is needed to utilize fully the sequence of the human genome to understand human biology better, to predict potential disease risks, and to stimulate the development of new therapies and other interventions to prevent and treat disease.

The sequence- based functional elements that will be targeted include, but are not limited to: Transcribed sequences, including both protein- coding and non- protein- coding. A description of the gene structure with transcriptional start sites, polyadenylation sites, along with all alternative transcripts, is an example. Conserved non- coding sequences that may represent functional elements. Cis- acting elements that regulate transcription and/ or chromatin structure. These elements include promoters, enhancers, and insulators. Sequence features that affect/ control chromosome biology. Examples include origins of replication and hot spots for recombination. Epigenetic changes, such as DNA methylation and chromatin modifications. Workshop on the Comprehensive Extraction of Biological Information from Genomic Sequence, Bethesda, Md. July 23-24, 2002, http://www.genome.gov/10005568 http://grants1.nih.gov/grants/guide/rfa-files/RFA-HG-03-003.html

global regulators: Expression glossary

guanine (G): A nitrogenous base, one member of the base pair GC (guanine and cytosine). [DOE]

human sequence: See Sequencing glossary  draft sequence, finished sequence, published sequence, working draft 
initiation codon: RNA glossary

inteins:  Wikipedia http://en.wikipedia.org/wiki/Intein   Internal protein sequences.   Related terms: exteins, protein splicing.

intergenic DNA: DNA glossary

interspersed repetitive sequences: Copies of transposable elements interspersed throughout the genome, some of which are still active and often referred to as "jumping genes". There are two classes of interspersed repetitive elements. Class I elements (or RETROELEMENTS - such as retrotransposons, retroviruses, LONG INTERSPERSED NUCLEOTIDE ELEMENTS and SHORT INTERSPERSED NUCLEOTIDE ELEMENTS) transpose via reverse transcription of an RNA intermediate. Class II elements (or DNA TRANSPOSABLE ELEMENTS - such as transposons, Tn elements, insertion sequence elements and mobile gene cassettes of bacterial integrons) transpose directly from one site in the DNA to another. [MeSH, 1999]  Narrower terms: LINES, SINES

intron: DNA glossary
junk DNA: DNA glossary

LCR Locus Control Region: See locus control region

LINEs Long Interspersed Nuclear Elements or Long INterspersed Elements: Families of long (average length = 6,500 bp), moderately repetitive (about 10,000 copies) elements. LINEs are cDNA copies of functional genes present in the same genome; also known as processed pseudogenes. [FAO Glossary] 

Highly repeated sequences, 6K- 8K base pairs in length, which contain RNA polymerase II promoters. They also have an open reading frame that is related to the reverse transcriptase of retroviruses but they do not contain LTRs (long terminal repeats). Copies of the LINE 1 (L1) family form about 15% of the human genome. The jockey elements of Drosophila are LINEs. [MeSH, 1999]
Related terms: non-coding, retrotransposons. 

LTR Long Terminal Repeat: A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses. [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html  Broader term: terminal repeat sequences

locus control region: A regulatory region first identified in the human beta- globin locus but subsequently found in other loci. The region is believed to regulate transcription by opening and remodeling chromatin structure. It may also have enhancer activity. [MeSH, 1998]

mRNA messenger RNA: RNA glossary 
methylation: Proteins glossary
 
messenger RNA: See mRNA  
mobile genetic elements: Narrower terms: transposons, LINES, retrotransposons, SINES
non-coding DNA: DNA glossary
non-coding first exons: Gene definitions 
nucleic acids: DNA or RNA
ORESTES open reading frame expressed sequence tags: DNA glossary

ORF Open Reading Frame: Corresponds to a stretch of DNA that could potentially be translated into a polypeptide;  i.e., it begins with an ATG "start" codon and terminates with one of the 3 "stop" codons. For an ORF to be considered as a good candidate for coding a bona fide cellular protein, a minimum size requirement is often set, e.g., many of the systematic sequencing groups define an ORF as a stretch of DNA that would code for a protein of 100 amino acids or more. An ORF is not usually considered equivalent to a gene or locus until there has been shown to be a phenotype associated with a mutation in the ORF, and/ or an mRNA transcript or a gene product generated from the ORF's DNA has been detected.  [SGD glossary, Stanford Univ. US] http://genome-www.stanford.edu/Saccharomyces/help/glossary.html#fasta

Reading frames where successive nucleotide triplets can be read as codons specifying amino acids and where the sequence of these triplets is not interrupted by stop codons. [MeSH, 1991] 

ORFs contain no internal stop codons and so can be read continuously by the ribosome. Broader term: reading frame; Narrower term: URF; Related term: Omes & omics glossary: ORFeome
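
The working definition quoted from SGD (an ATG start, one of three stop codons, and a minimum length such as 100 amino acids) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the parameter name and the test sequence are hypothetical, only the three forward frames of the given strand are scanned, and real gene-prediction tools apply many more criteria.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 100):
    """Yield (start, end) index pairs of ORFs on the given strand.

    start is the index of the A in the ATG start codon; end is one past
    the last base of the stop codon. min_codons counts the codons that
    encode amino acids (ATG included, stop codon excluded).
    """
    dna = dna.upper()
    for frame in range(3):                     # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":             # first start codon opens an ORF
                    start = i
            elif codon in STOP_CODONS:         # in-frame stop closes it
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3)
                start = None


# Hypothetical 16-base sequence with a tiny ORF (ATG AAA TTT TAG) in frame 2.
short_orfs = list(find_orfs("CCATGAAATTTTAGGG", min_codons=3))
```

With the default threshold of 100 codons the same sequence yields nothing, which reflects why a minimum size is imposed: very short ATG-to-stop stretches occur by chance.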

open reading frame: See ORF

operator regions (genetics): Regulatory elements of an operon to which activators or repressors bind to effect the transcription of genes in the operon. [MeSH, 1986]

precursor RNA: RNA glossary

primary (initial, unprocessed) transcript: Includes 5' clipped region (5' clip), 5' untranslated region (5' UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3' UTR), and 3' clipped region (3' clip). [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

promoter: Region on a DNA molecule involved in RNA polymerase binding to initiate transcription.  [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html  

Promoters are DNA sequences on the 5' side of the gene on which the RNA polymerase fastens when transcription begins. In all groups of organisms alternative promoters have been shown for many genes. These alternative promoters have been classified into six classes by Ueli Schibler and Filipe Sierra [121] (Fig. 3). Certain types of alternative promoters make it possible for transcription to start from different points of the gene in different cases, and for the transcripts to have initiation codons at different positions of the chromosome. Thus it is possible for a single gene in this case too to produce more than one type of messenger RNA molecules, encoding more than one polypeptide. This is again against the basic conceptual framework of the neoclassical view of the gene. ... According to whether the unit of transcription is controlled by one or several promoters, simple and complex transcription units are distinguished.   [Petter Portin in "The Origin, Development and Present Status of the Concept of the Gene: A Short Historical Account of the Discoveries" Univ. of Turku, Finland, 2000]   http://www.bentham.org/cg1-1/portin/P.Protin.htm  
Related terms: cis- acting, enhancer, promoter regions; Omes & omics : promoterome

promoter regions: The DNA region, usually upstream to the coding sequence of a gene or operon, which binds and directs RNA polymerase to the correct transcriptional start site and thus permits the initiation of transcription. [IUPAC Biotech]

DNA sequences which are recognized (directly or indirectly) and bound by a DNA- dependent RNA polymerase during the initiation of transcription. Highly conserved sequences within the promoter include the Pribnow box in bacteria and the TATA BOX in eukaryotes. [MeSH, 1985] Related term: enhancer.

protein: Proteins glossary 
protein coding,  protein coding regions: See coding regions.
protein editing: See under protein splicing. 
protein expression: Expression glossary
protein rearrangements: See under protein splicing.

protein splicing: Excision of in- frame internal protein sequences (inteins) of a precursor protein, coupled with ligation of the flanking sequences (exteins). Protein splicing is an autocatalytic reaction and results in the production of two proteins from a single primary translation product: the intein and the mature protein. MeSH, 1997

Protein splicing is defined as the excision of an intervening protein sequence (the INTEIN) from a protein precursor and the concomitant ligation of the flanking protein fragments (the EXTEINS) to form a mature extein host protein and the free intein (Perler 1994). Protein splicing results in a native peptide bond between the ligated exteins (Cooper 1993). Extein ligation differentiates protein splicing from other forms of autoproteolysis. Conserved intein motifs differentiate inteins from other types of in- frame sequences present in one homolog and absent in another homolog or from other types of protein rearrangements.

Please Note: The term 'Protein Splicing' has been associated with inteins since 1994 (Perler 1994). Recent papers have described protein rearrangements that are not intein-mediated. The mechanism of these rearrangements is currently unknown, but preliminary evidence suggests that they are mediated by various cellular enzymes. For clarity, we suggest calling these non-intein mediated events either protein rearrangements or Protein Editing.  New England BioLabs, Inc., InBase, The New England Biolabs Intein Database   http://www.neb.com/inteins/int_id.html 
Related terms: exteins, inteins

protein synthesis: See translation, transcription
RNA RiboNucleic Acid: RNA glossary
RNAi, RNA interference: Genetic Manipulation & Disruption
RNA polymerase RNAP, RNA precursors: RNA glossary
RNA secondary structure prediction: In silico & Molecular modeling glossary
RNA silencing: Genetic Manipulation & Disruption See also RNAi 
RNA splice sites, RNA splicing: RNA glossary

reading frames: The sequence of codons by which translation may occur. A segment of mRNA 5' AUCCGA3' could be translated in three reading frames, 5' AUC.. or 5' UCC.. or 5' CCG.., depending on the location of the start codon. [MeSH, 1991]  Narrower term: ORF Open Reading Frames
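
The MeSH example above can be reproduced with a short Python sketch that splits a sequence into codons at each of the three possible offsets. The function name is hypothetical; incomplete trailing codons are simply dropped.

```python
def reading_frames(mrna: str) -> list:
    """Return the codons of the three forward reading frames of an mRNA."""
    frames = []
    for offset in range(3):
        codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
        frames.append(codons)
    return frames


# The segment used in the definition above.
frames = reading_frames("AUCCGA")
```

For 5' AUCCGA 3' the three frames begin with AUC, UCC and CCG respectively, matching the definition; only the first frame contains a second complete codon (CGA).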

reference sequences: The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products, for major research organisms.

RefSeq standards serve as the basis for medical, functional, and diversity studies; they provide a stable reference for gene identification and characterization, mutation analysis, expression studies, polymorphism discovery, and comparative analyses. RefSeqs are used as a reagent for the functional annotation of some genome sequencing projects, including those of human and mouse. NCBI Reference Sequences database  http://www.ncbi.nlm.nih.gov/RefSeq/ 

regulatory sequences, repetitive sequences: DNA glossary

response elements: Nucleotide sequences, usually upstream, which are recognized by specific regulatory transcription factors, thereby causing gene response to various regulatory agents. These elements may be found in both promoter and enhancer regions. [MeSH, 1998]

retroelements: Elements that are transcribed into RNA, reverse- transcribed into DNA and then inserted into a new site in the genome. Long terminal repeats (LTRs) similar to those from retroviruses are contained in retrotransposons and retrovirus- like elements. Retroposons, such as LONG INTERSPERSED NUCLEOTIDE ELEMENTS and SHORT INTERSPERSED NUCLEOTIDE ELEMENTS do not contain LTRs. [MeSH, 1999]

retrotransposon: DNA fragments copied from viral RNA with reverse transcriptase that insert in the host chromosomes. [Life Sciences]

reverse transcriptases: Gene amplification & PCR  Related terms: non- coding, retrotransposons.

reverse transcription: Reverse transcription is used naturally by retroviruses to insert themselves into an organism's genome. Artificially induced reverse transcription is a useful technique for translating unstable mRNA molecules into stable cDNA. [J Buhler, Washington Univ.] http://www.cs.washington.edu/homes/jbuhler/research/array/glossary.html  Related terms reverse transcriptases; Gene definitions cDNA

ribonucleic acid: See RNA
ribosomal frameshifting: RNA glossary

SINEs Short Interspersed Nuclear Elements or Short INterspersed Elements: Short interspersed nuclear elements. Families of short (150 to 300 bp), moderately repetitive elements of eukaryotes, occurring about 100,000 times in a genome. SINES appear to be DNA copies of certain tRNA molecules, created presumably by the unintended action of reverse transcriptase during retroviral infection. [FAO Glossary] 

Highly repeated sequences, 100- 300 bases long, which contain RNA polymerase III promoters. The primate Alu (ALU ELEMENTS) and the rodent B1 SINEs are derived from 7SL RNA, the RNA component of the signal recognition particle. Most other SINEs are derived from tRNAs including the MIRs (mammalian- wide interspersed repeats). [MeSH, 1999]

sequence: The order of neighbouring amino acids in a protein or the purine and pyrimidine bases [A,C,T,G, uracil] in RNA and DNA. [IUPAC Bioinorganic] Narrower terms: sequence data-  molecular;  Proteins amino acid sequence Related terms: Sequencing draft sequence - human, published sequence - human, working draft sequence - human Glycosciences glossary carbohydrate sequence

sequence data- molecular:  Descriptions of specific amino acid, carbohydrate or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GenBank, EMBL, NBRF or other sequence repositories [databases] [MeSH, 1988]

silencer elements transcriptional: Nucleic acid sequences that are involved in the negative regulation of TRANSCRIPTION by CHROMATIN SILENCING. MeSH 2003

splice sites: Boundaries between exons and introns. There are two varieties: the border going from exon to intron is called a donor site or a 5' site; the border separating intron from exon is called an acceptor site or a 3' site. [TP Speed, S. Cawley, "Locating splice sites"  Statistics 260 Statistics in Genetics, Univ. of California- Berkeley, 1998]  http://www.stat.berkeley.edu/users/terry/Classes/s260.1998/Week12/week12/node14.html

Location in the DNA sequence where RNA removes the noncoding areas to form a continuous gene transcript for translation into a protein. [DOE]

splice junctions:  While it is well accepted that the consensus sequences of exon- intron boundaries in mRNA precursors are important for specifying splice sites, the signals that govern the excision of introns are not well understood yet, because actual splice site sequences are more or less different from the consensus sequences. So far several statistical methods for predicting actual splice site sequences (splice junctions) in pre- mRNAs of mammalian genes have been proposed. ... However, while the statistical methods proposed so far have some ability for predicting the splice site sequences in statistical tests, it seems to be far from sufficient when applied to actual problems. ["Comparison of Statistical Algorithms for Predicting Splice Junctions in mRNA Precursors of Mammalian Genes" Yukiyasu Ogawa, Tomomasa Nagashima, Sirajuddin Khawaja,  Genome Informatics Workshop,  GenomeNet, Yokohama,  Japan, Dec. 11- 12, 1995]  http://www.genome.ad.jp/manuscripts/GIW95/Poster/GIW95P08.html

Junctions between exons and introns. 

splice variants: The HGNC [Human Genome Nomenclature Committee] has no authority over protein nomenclature; however, we are frequently asked how to designate splice variants so we suggest the following: Proteins should be designated using the same symbol as the gene, printed in non- italicized letters. When referring to splice variants, the symbol can be followed by an underscore and the lower case letter "v" then a consecutive number to denote which variant is which. Human Genome Nomenclature Committee "Guidelines for Human Gene Nomenclature"  Genomics 79(4):464-470 (2002)  http://www.genenames.org/guidelines.html    
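
The suggested designation (gene symbol, underscore, lower case "v", consecutive number) can be written as a one-line formatter. In this Python sketch the function name and the gene symbol "ABC1" are hypothetical placeholders, and numbering from 1 is an assumption, since the guideline as quoted only says "a consecutive number".

```python
def splice_variant_symbol(gene_symbol: str, variant_number: int) -> str:
    """Format a splice variant designation as SYMBOL_vN.

    Numbering from 1 is an assumption; the quoted guideline does not
    state where the consecutive numbering starts.
    """
    if variant_number < 1:
        raise ValueError("variant_number is assumed to start at 1")
    return f"{gene_symbol}_v{variant_number}"


# "ABC1" is a placeholder symbol used only for illustration.
names = [splice_variant_symbol("ABC1", n) for n in (1, 2, 3)]
```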

splicing: 1. Of RNA: the procedure by which introns are removed from eukaryotic precursor mRNA molecules and adjacent exon sequences are joined together (spliced). 2. Of DNA: manipulation for joining together double stranded DNA fragments with protruding single stranded "sticky ends" by means of ligases. [IUPAC Biotech, IUPAC Compendium] Narrower terms: cis- splicing, protein splicing, pre- mRNA splicing, RNA splicing, trans- splicing; Gene Definitions  alternative splicing, cDNA; Related terms Cell biology glossary spliceosomes
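
Sense 1 above (introns removed, adjacent exons joined) can be sketched in Python as joining exon segments of a precursor sequence. The function name, the sequences and the exon coordinates are all invented for illustration; in practice exon boundaries come from annotation or experimental evidence, not from the sequence alone.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join exon segments of a precursor mRNA, discarding everything else.

    exons is a list of (start, end) index pairs, 0-based with end exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical precursor: two exons separated by one invented intron.
exon1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon2 = "GAUUAA"
pre_mrna = exon1 + intron + exon2

exon_coords = [
    (0, len(exon1)),                              # first exon
    (len(exon1) + len(intron), len(pre_mrna)),    # second exon
]
mature = splice(pre_mrna, exon_coords)
```

Supplying a different subset of exon coordinates to the same function gives a different mature transcript from the same precursor.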

start codon, stop codon: RNA glossary

template: Gene amplification & PCR Template appears in many biological and biochemical contexts.  Do meanings vary?

terminal repeat sequences: Nucleotide sequences repeated on both the 5' and 3' ends of a sequence under consideration. For example, the hallmarks of a transposon are that it is flanked by inverted repeats on each end and the inverted repeats are flanked by direct repeats. The Delta element of Ty retrotransposons and LTRs (long terminal repeats) are examples of this concept. [MeSH, 1999]
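
The difference between inverted and direct terminal repeats can be checked mechanically on a single DNA strand. This is a minimal sketch, assuming an inverted repeat means the 3' end is the reverse complement of the 5' end and a direct repeat means both ends are identical in the same orientation; function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def has_inverted_terminal_repeats(dna: str, length: int) -> bool:
    """True if the first `length` bases equal the reverse complement of the last."""
    if length <= 0 or 2 * length > len(dna):
        return False
    return dna[:length] == reverse_complement(dna[-length:])


def has_direct_terminal_repeats(dna: str, length: int) -> bool:
    """True if the first and last `length` bases are identical."""
    if length <= 0 or 2 * length > len(dna):
        return False
    return dna[:length] == dna[-length:]


element = "GGCCAT" + "AAAAAAAA" + "ATGGCC"  # made-up sequence
print(has_inverted_terminal_repeats(element, 6))
print(has_direct_terminal_repeats(element, 6))
```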

terminator: A sequence of DNA lying beyond the 3’ end of the coding segment of a gene which is recognized by RNA polymerase as a signal to stop synthesizing mRNA. [IUPAC Biotech]

Sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription. [DDBJ/ EMBL/ GenBank Feature Table]   http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

terminator codon: RNA glossary

terminator regions (genetics): DNA sequences which signal the termination of transcription. [MeSH, 1991]

thymine (T): DNA glossary
Tn: See under transposons
trans-acting factors: Expression glossary
trans-acting proteins: Protein categories
trans-acting RNA: RNA glossary
 
transcript: Expression glossary  Related terms 3' UTR, 5' UTR, primary transcript, terminator

transcription: The process by which the genetic information encoded in a linear sequence of nucleotides in one strand of DNA is copied into an exactly complementary sequence of RNA. [IUPAC Biotech]

The synthesis of an RNA copy from a sequence of DNA (a gene); the first step in gene expression. Compare translation (the process in which the genetic code carried by mRNA directs the synthesis of proteins from amino acids). [DOE]
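
The "exactly complementary" copying described above amounts to base pairing with uracil in place of thymine. A minimal sketch, assuming the DNA template strand is supplied in the 3' to 5' direction so that the RNA comes out 5' to 3'; the function name and sequence are illustrative.

```python
# DNA template base -> complementary RNA base (A pairs with U in RNA).
PAIRING = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand read 3'->5'."""
    return template_3_to_5.upper().translate(PAIRING)


print(transcribe("TACCGGATT"))
```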

transcription, genetic: The transfer of genetic information from DNA to messenger RNA by DNA- directed RNA polymerase. It includes reverse transcription and transcription of early and late genes expressed early in an organism's life cycle or during later development.  MeSH, 1973  Related terms: translation,  attenuator, reverse transcriptases, transcription machinery; Narrower terms:  Gene amplification & PCR reverse transcription; Microarrays Northern blotting

transcription factors: Expression glossary  

transcription machinery: Consists of the RNA polymerase II holoenzyme plus two additional “general transcription factors,” which are protein complexes and a histone acetyltransferase that theoretically exerts its transcriptional activity by modifying chromatin. ... [Several] studies provide evidence for the function of components of the general transcription machinery, in terms of their role in regulation of the transcription of specific sets of genes … Apparently, the specific transcription regulatory activities of components of the general transcription machinery provide a layer of regulation in addition to that provided by the gene- specific regulators … Knowledge gained concerning the coordinate regulation of genes, how gene- specific transcription factors (which are the targets of many existing drugs, e.g., steroids, selective estrogen response modifiers, thiazolidinediones) interact with general transcription factors, and how signal transduction pathways regulate gene transcription is expected to be important for genomics based identification of targets that are components of transcriptional regulation and signal transduction networks.  [CHI Functional Genomics report]  Related terms: terminator

transcriptional silencer elements: See silencer elements transcriptional

translation: The unidirectional process that takes place on the ribosomes whereby the genetic information present in an mRNA is converted into a corresponding sequence of amino acids in a protein. [IUPAC Bioinorganic]

The conversion of the genetic instructions for a protein from nucleotides of messenger RNA into amino acids. [NIGMS]

translation, genetic: Formation of peptides on ribosomes, directed by messenger RNA. [MeSH, 1973]
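
Reading an mRNA three nucleotides at a time and looking each codon up in the standard genetic code gives a simple picture of translation. The sketch below is a simplification under stated assumptions: it starts at the first AUG it finds, stops at the first in-frame stop codon, and accepts only the letters A, C, G and U; the function name and sequence are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")  # assumption: first AUG is the start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGGCCUUUUAAGG"))
```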

transposons:  A mobile genetic element that can replicate itself and insert itself into the genome, where it can interrupt genes and disrupt their function; an insertional mutagen.  [CHI Functional Genomics report] 

One of a class of genes that are capable of moving spontaneously from one chromosome to another, or from one position to another in the same chromosome; also known as jumping genes or transposable elements. [Glick]

DNA elements carrying genes for transposition and other genetic functions.  In many cases the latter genes enable bacteria to live in extreme environments. Transposons are much longer than IS (Insertion) elements. Abbreviated Tn. [Schlindwein]

First recognized in the 1940’s by Dr. Barbara McClintock in studies of peculiar inheritance patterns found in the colors of Indian corn. Also known as  “jumping DNA”, referring to the fact that some stretches of DNA are unstable and “transposable” i.e. they can move around – on and between chromosomes.  This theory was confirmed in the 1980’s when scientists observed jumping DNA in other genomes. [HGMIS Oak Ridge National Lab, US]  http://www.ornl.gov/hgmis/faq/faqs1.html  Related term: DNA transposable elements How are these two terms different?

URF: Unidentified Reading Frame

UTR: The parts of the messenger RNA sequence that do not code for product, i.e. the 5' UNTRANSLATED REGIONS and 3' UNTRANSLATED REGIONS. [MeSH, 1999]

UnTranslated Region: Critical for many aspects of gene regulation and expression. Narrower terms: 3' UTR, 5' UTR. 
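
Given the position of the coding sequence, the two untranslated regions are simply what lies before the start codon and after the stop codon. A minimal sketch, assuming 0-based coordinates in which `cds_start` is the first base of the start codon and `cds_end` is the position just past the stop codon; the function name and coordinates are hypothetical.

```python
from typing import Tuple


def split_utrs(mrna: str, cds_start: int, cds_end: int) -> Tuple[str, str, str]:
    """Return (5' UTR, coding sequence including stop codon, 3' UTR)."""
    if not 0 <= cds_start <= cds_end <= len(mrna):
        raise ValueError("coding sequence coordinates fall outside the mRNA")
    return mrna[:cds_start], mrna[cds_start:cds_end], mrna[cds_end:]


five_prime_utr, cds, three_prime_utr = split_utrs("GGAUGGCCUUUUAAGG", 2, 14)
print(five_prime_utr, cds, three_prime_utr)
```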

upstream: Identifies sequences located in a direction opposite to that of expression; for example, the bacterial promoter is upstream of the initiation codon. In an mRNA molecule, upstream means toward the 5' end of the molecule. Occasionally used to refer to a region of a polypeptide chain which is located toward the amino terminus of the molecule. [Lemon] 

Bibliography
DDBJ/ EMBL/ GenBank Feature Table, 2001, 100 + definitions. http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Mouse Genome Informatics Glossary, Jackson Lab, US, 2006  http://www.informatics.jax.org/mgihome/other/glossary.shtml
SGD Glossary Terms, Stanford Univ., US http://genome-www.stanford.edu/Saccharomyces/help/glossary.html


IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.
