
SNPs & genomic & proteomic variations glossary & taxonomy
Evolving terminologies for emerging technologies

Comments? Questions? Revisions?
Mary Chitty MSLS
mchitty@healthtech.com
Last revised January 07, 2020 



Our knowledge of genetic variations has been so profoundly influenced by Mendelian genetics that it is difficult to speculate about the ways in which our thinking will need to change with further insights into genomics. We have far to go in teasing apart the multiple variables of complex traits and diseases, the relationships between hereditary, somatic and environmental factors, and in making the transition from focusing on monogenic diseases with high penetrance to polygenic conditions with greatly varying degrees of penetrance. In addition, some fairly common words (allele, polymorphism, wild-type) may carry an explicit (or more frequently implicit) connotation of "normal" and/or functional, dating from the early days of genetics when only mutant phenotypes revealed the presence of genetic variations.

Currently identified disease-related genetic variations are relatively rare. Gene expression studies are giving some insight into clusters of alleles which may be linkable to diseases and phenotypes. Recent work on SNPs seems promising, but powerful new methods for integrating data and detecting variants undetectable by current technologies are still needed.

Biology & Chemistry term index   Related Glossaries include Functional Genomics   Pharmacogenomics   Biomarkers   Informatics Algorithms Bioinformatics  Technologies Chromatography & electrophoresis, Gene amplification & PCR, Microarrays Sequencing   Biology Chemistry & biology Expression, Gene definitions, Maps- genomic & genetic, Sequences DNA & beyond

allele-specific hybridization (ASH): A method of SNP detection. ASH technologies use oligonucleotides that differ at a single base position corresponding to the SNP to be detected. In some instances, two oligos are provided, one for a "wild- type" or normal allele and the other for the SNP. In other instances, four oligos, corresponding to each of the four possible bases at the SNP position, are provided. ASH technology shows up in several microarray products. Related terms: Gene amplification & PCR
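The two-oligo and four-oligo probe sets described above can be sketched in a few lines. This is a minimal illustration only, not any vendor's design method: the function name `allele_specific_oligos` and the flanking sequences are hypothetical, and real probe design would also have to consider strand, probe length and melting temperature.

```python
BASES = "ACGT"


def allele_specific_oligos(flank5: str, flank3: str) -> dict[str, str]:
    """Build one oligo per possible base at the SNP position.

    flank5 and flank3 are the (hypothetical) sequences immediately
    5' and 3' of the SNP, so the oligos differ only at the SNP base.
    """
    return {base: flank5 + base + flank3 for base in BASES}


# Four-oligo design: one probe for each possible base at the SNP.
probes = allele_specific_oligos("GATTAC", "CCGTA")

# Two-oligo design: keep only the "wild-type" base and the variant base
# (here assumed to be A and G purely for illustration).
two_probe_set = {base: probes[base] for base in ("A", "G")}
```

Every pair of oligos in `probes` differs at exactly one position, which is the property the hybridization step relies on.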

alleles: One of several alternate forms of a gene which occur at the same locus on homologous chromosomes and which become separated during meiosis and can be recombined following fusion of gametes. [IUPAC Biotech, IUPAC Compendium]

Mutually exclusive forms of the same gene, occupying the same locus on homologous chromosomes, and governing the same biochemical and developmental process. MeSH, 1968

A related individual or strain contains stable, alternative forms of the same gene which differs from the presented sequence at this location (and perhaps others). superceded by 'Variation'. Allele will become illegal from April 15th, 2000 DDBJ/ EMBL/ GenBank Feature Table  http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html     
From allelomorph, which is from the Greek meaning one another form, used by Bateson & Saunders 1902 [OED]. The word gene didn't come along until 1911, coined by W. Johannsen. Related terms: allelic variants, copy number variations, mutation, polymorphisms, SNP, bi-allelic, di- allelic, multiple- allelism, tri- allelic. Narrower term: slightly deleterious allele. Broader term: variants

allelic imbalance: A situation where one member (allele) of a gene pair is lost (LOSS OF HETEROZYGOSITY) or amplified. MeSH, 1991

alternative splicing: Gene definitions

anonymous SNPs: SNPs that have no known effect on gene function.  Thought to be the most common type of  SNPs and possibly valuable as markers for linkage disequilibrium studies, when they are relatively close to the gene being sought.   Related term: intron SNPs

association studies: Looking at particular genes or variations in two groups (e.g., affected patients and controls or responders and nonresponders) to establish an association with a phenotype by finding significant variations in the two groups.

In human genetics, such studies frequently involve the comparison of allele frequencies for a marker locus between a disease population and a control population. When statistically significant differences in the frequency of an allele(s) are found between a disease and control population, the disease and allele(s) are said to be in association. [NHLBI]
Narrower term: random genome-wide association studies. Related term: linkage
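The comparison of allele frequencies between a disease group and a control group can be illustrated with a Pearson chi-square statistic on a 2x2 table of allele counts. This is a minimal sketch, assuming a bi-allelic marker and hypothetical counts; the function name is invented for illustration, and a real study would also handle multiple testing, population stratification and small-sample corrections.

```python
def allele_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table.

    Rows are groups (cases, controls); columns are counts of the two alleles.
    Assumes every row and column total is greater than zero.
    """
    table = [[case_a, case_b], [ctrl_a, ctrl_b]]
    total = case_a + case_b + ctrl_a + ctrl_b
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + ctrl_a, case_b + ctrl_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Critical value of chi-square with 1 degree of freedom at alpha = 0.05.
CRITICAL_005 = 3.841

# Hypothetical allele counts: 200 alleles sampled from each group.
statistic = allele_chi_square(case_a=120, case_b=80, ctrl_a=90, ctrl_b=110)
in_association = statistic > CRITICAL_005
```

With these invented counts the statistic exceeds the critical value, so the allele and the disease would be said to be in association at the 5% level.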

bi-allelic: In principle, SNPs could be bi-, tri-, or tetra-allelic polymorphisms.  However, in humans, tri- allelic and tetra- allelic SNPs are rare almost to the point of non- existence, and so SNPs are sometimes simply referred to as bi- allelic markers (or di- allelic to be etymologically correct).  This is somewhat misleading because SNPs are only a subset of all possible bi- allelic polymorphisms (e.g., multiple base variations). Anthony Brooks "The essence of SNPs" Gene 234: 177-186, 1999. Variant of di-allelic.  Related terms: allele, SNPs

biological variation: Studies of genetic architecture have historically focused on associations of genotype and phenotype (e.g., between DNA markers and a disease). However, an organism is a unique consequence of both genes and environment and is created by complex interactions of multiple events and forces. How genes are expressed depends on their cellular, developmental, physiological, and environmental context. Genetic Architecture, Biological Variation and Complex Phenotypes, PA-02-110, May 29, 2002- June 5, 2005 http://grants1.nih.gov/grants/guide/pa-files/PA-02-110.html

cSNPs: When SNPs are present in the actual gene-coding region of a chromosome, they are called cSNPs and have a higher probability of influencing propensity to disease or drug response than SNPs found outside gene regions. There are an estimated 200,000 cSNPs present in the human genome. CHA Cambridge Healthtech Advisors, Clinical Genomics: The Impact of Genomics on Clinical Trials and Medical Practice report, 2004 Related/ equivalent? term: exon SNPs Broader term: SNP Narrower terms: non- synonymous SNPs, synonymous SNPs

candidate gene studies:  These studies, in contrast to genomewide scans, focus on particular SNPs thought to have a functional effect or to be involved in specific conditions. They are generally considered more practical than genomewide scans.  Related term:  Gene categories candidate gene

candidate SNP:  Particular SNPs thought to have a functional effect.

CEPH Centre d’Etude du Polymorphisme Humain: Paris, France. Collects pedigrees appropriate for reference genetic mapping.  These are characterized by the availability of a large number of offspring (average 8.5) and both sets of paternal and maternal grandparents.  The structure of these pedigrees renders them “linkage phase known”. NHLBI

chimerism: The occurrence in an individual of two or more cell populations of different chromosomal constitutions, derived from different individuals. This contrasts with MOSAICISM in which the different cell populations are derived from a single individual.  MeSH Year introduced: 2005

coding SNPs cSNPs: See under cSNPs.

common variants: When a given SNP at a particular location on a chromosome occurs in at least 1% of the population, it is considered to be a common variant. CHA Cambridge Healthtech Advisors, Clinical Genomics: The Impact of Genomics on Clinical Trials and Medical Practice report, 2004
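The 1% cutoff can be applied mechanically once allele counts are available. The sketch below is an illustration under stated assumptions: it interprets "occurs in at least 1% of the population" as a minor allele frequency of at least 0.01, and the function names and counts are hypothetical.

```python
COMMON_THRESHOLD = 0.01  # the 1% cutoff from the definition above


def minor_allele_frequency(allele_counts: dict[str, int]) -> float:
    """Frequency of the second most common allele at a site."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    if len(allele_counts) < 2:
        return 0.0  # monomorphic site: no minor allele
    frequencies = sorted(count / total for count in allele_counts.values())
    return frequencies[-2]


def is_common_variant(allele_counts: dict[str, int]) -> bool:
    """True when the minor allele reaches the 1% threshold."""
    return minor_allele_frequency(allele_counts) >= COMMON_THRESHOLD


# Hypothetical counts from 1,000 sampled chromosomes.
common_site = {"A": 950, "G": 50}  # minor allele frequency 5%
rare_site = {"C": 995, "T": 5}     # minor allele frequency 0.5%
```

Variants below the threshold, such as `rare_site`, correspond to what this glossary lists under idiomorphism.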

complex trait: Complex traits are defined as those where inheritance does not follow a Mendelian (that is, simple) pattern. The best definition of a complex trait is: where “one or more genes acting alone or in concert increase or reduce the risk of that trait”. This definition allows for all of the above possibilities (oligogenic, polygenic, and multifactorial), as well as non-Mendelian single gene disease, and has three important details. Firstly, genetic variations (mutations or polymorphisms) result in differences in the risk of disease; the disease causing mutations (DCMs) or alleles do not by themselves confer disease, they simply increase or reduce the likelihood that a trait will be expressed in a given individual. The second detail concerns the term “trait” as opposed to “disease”. Most diseases are heterogeneous syndromes and the clinical phenotypes are often extremely variable. Using the term “trait” allows for this variation in the disease phenotype and the prospect that DCMs not only determine which diseases we are more likely to develop but also may determine the severity of the clinical syndrome that follows.  Previously complex traits were called “polygenic” (involving more than one gene), multifactorial (depending on the interaction of the host genome and one or more environmental factors), or oligogenic (whereby individual mutations in several different genes in one or more common pathways lead to the same clinical syndrome but each patient with the disease may possess a single disease causing mutation only).  P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732  http://gut.bmj.com/content/53/4/599

Copy Number Polymorphisms CNPs: Will become a valuable tool for defining relative phenotypes in a population.  

Copy-number polymorphisms (CNPs) represent a greatly underestimated aspect of human genetic variation. Recently, two landmark studies reported genome-wide analyses of CNPs in normal individuals and represent the beginning of an understanding of this type of large-scale variation.   Patrick G. Buckley*, Kiran K. Mantripragada*, Arkadiusz Piotrowski, Teresita Diaz de Ståhl and Jan P. Dumanski Copy-number polymorphisms: mining the tip of an iceberg, Trends in Genetics 21 (6): 315- 317, June 2005  https://www.ncbi.nlm.nih.gov/pubmed/15922827

Another term for CNV 

copy number variations CNVs: large-scale structural changes in DNA that vary from individual to individual. These include insertions, deletions, duplications and complex multi-site variants that range from kilobases to megabases in size. CNV can influence gene expression, phenotypic variation and alter gene dosage, and in certain instances may be associated with developmental disorders, cause disease or confer susceptibility to complex disease traits. Commonly used Genome Terms, NCBI https://www.ncbi.nlm.nih.gov/projects/genome/glossary.shtml 

We defined a CNV as a DNA segment that is 1kb or larger and present at variable copy number in comparison with a reference genome. A CNV can be simple in structure, such as tandem duplication, or may involve complex gains or losses of homologous sequences at multiple sites in the genome. Richard Redon et al., Global variation in copy number in the human genome, Nature 2006 Nov 23; 444 (7118): 444- 454     Related terms: copy number polymorphisms,  SNP

correlation studies:  For SNPs, these will likely be attempts to correlate phenotype or drug response with candidate SNPs.  They may or may not be carried out with population validation studies.

crossing over: See under genetic recombination.

DNA fingerprinting: A procedure in which multilocus band patterns of a DNA sample are generated by digestion of the DNA with restriction enzymes followed by electrophoresis and visualization by hybridization with  probes specific for repetitive sequences. In forensic medicine the probes used are "core" sequences specific for simple tandem repetitive sequences (MINISATELLITE REPEATS or VNTRs). The multilocus band patterns, known as DNA fingerprints, are evaluated for similarities with DNA from an individual. MeSH, 1991 Related term: Genomics forensic applications

DNA footprinting: A method for determining the sequence specificity of DNA- binding proteins. DNA footprinting utilizes a DNA damaging agent which cleaves DNA at every base pair; DNA cleavage is inhibited where the ligand binds to DNA. MeSH, 1996 (Rieger et al., Glossary of Genetics: Classical and Molecular, 5th ed)

DNA ligase enzymes: Can link two adjacent oligonucleotide probes that are hybridized to a template.  A SNP detection technology.  

deletions:  A genetic rearrangement through loss of segments of DNA or RNA, bringing sequences which are normally separated into close proximity.  This deletion may be detected using cytogenetic techniques and can also be inferred from the phenotype, indicating a deletion at one specific locus. MeSH ‘gene deletion’, 1993 

A type of mutation caused by loss of one or more nucleotides from a DNA segment. Deletions can be very large, encompassing many genes and megabases of DNA, to the point of producing a visible cytological abnormality in a chromosome. Small deletions within a gene can alter the reading frame, and thus the amino acid sequence of the encoded protein  Mouse Genome Informatics, Jackson Lab  Related terms: indels; Functional genomics Cre-lox 

di-allelic: See under bi-allelic.

direct approach: See candidate gene approach. Alternative to shotgun sequencing. Sequencing

duplication: A particular kind of mutation: production of one or more copies of any piece of DNA,  including a gene or even an entire chromosome. [NHGRI]

An additional copy of a DNA segment present in the genome. Gene duplication is the source of paralogous genes.  [Mouse Genome Informatics]  Narrower term: whole genome duplication

EST Expressed Sequence Tags: Gene definitions

endonucleases: A high throughput fragment analysis SNP scanning technology M Phillips CHI Nucleic Acid Technologies conference, June 2000

epiallele:  Wiktionary http://en.wiktionary.org/wiki/epiallele  See related epigenetics

epigenetic: Descriptive term for processes that change the phenotype without altering the genotype. IUPAC Biotech

epigenetics: Epigenetics refers to the study of heritable changes in gene expression that occur without a change in DNA sequence. Epigenetic mechanisms are multifaceted and complex, and they provide an additional layer of transcriptional control to regulate how genes are expressed. Covers a broad range of effects, and several are discussed in this special issue. But how did epigenetic regulation arise? For RNA- mediated silencing and DNA methylation there is evidence that they have evolved as part of a host defense mechanism against viruses and parasitic DNA. The substrate - double- stranded RNA (dsRNA) - for both posttranscriptional gene silencing (PTGS) [or RNA interference (RNAi)] and transcriptional gene silencing (TGS) seen in plants is a common intermediate in the life cycle of many viruses and transposons. Guy Riddihough and Elizabeth Pennisi, "The Evolution of Epigenetics" Science 293 (5532): Aug. 20, 2001

Given that there are several existing definitions of epigenetics, it might be felt that another is the last thing we need. Conversely, there might be a place for a view of epigenetics that keeps the sense of the prevailing usages but avoids the constraints imposed by stringently requiring heritability. The following could be a unifying definition of epigenetic events: the structural adaptation of chromosomal regions so as to register, signal or perpetuate altered activity states. This definition is inclusive of chromosomal marks, because transient modifications associated with both DNA repair or cell-cycle phases and stable changes maintained across multiple cell generations qualify. It focuses on chromosomes and genes, implicitly excluding potential three-dimensional architectural templating of membrane systems and prions, except when these impinge on chromosome function. Also included is the exciting possibility that epigenetic processes are buffers of genetic variation, pending an epigenetic (or mutational) change of state that leads an identical combination of genes to produce a different developmental outcome.  An implicit feature of this proposed definition is that it portrays epigenetic marks as responsive, not proactive. In other words, epigenetic systems of this kind would not, under normal circumstances, initiate a change of state at a particular locus but would register a change already imposed by other events.  Perceptions of epigenetics - Code Biology www.codebiology.org/database/Epigenetic%20Code/Bir07.pdf

Epigenetics: a web supplement Science Aug. 10, 2001 http://www.sciencemag.org/content/vol293/issue5532/#specialintro
Epigenetics 101, Guardian http://www.theguardian.com/science/occams-corner/2014/apr/25/epigenetics-beginners-guide-to-everything 

International Human Epigenetic Consortium http://ihec-epigenomes.org/ 

epigenome: A set of  what may be hundreds of genes whose function is determined by [genetic] imprinting. [Post Gazette News Bar Harbor, Maine genetics seminar,  July 26, 2000]  http://www.post-gazette.com/healthscience/20000726heredity1.asp

epigenomics: The Common Fund's Epigenomics Program includes a series of complementary initiatives aimed at generating new research tools, technologies, datasets, and infrastructure to accelerate our understanding of the role of epigenetics - the study of how chemical "marks" on DNA regulate gene activity and expression without altering the DNA sequence itself - in human health and disease. Initiatives include  Reference Epigenome Mapping Centers , Epigenomics Data Analysis and Coordination Center , Technology Development in Epigenetics , Discovery of Novel Epigenetic Marks in Mammalian Cells, Epigenomics of Human Health and Disease. Epigenomics,  NIH Common Fund  http://commonfund.nih.gov/epigenomics/

A whole genome approach to epigenesis and epigenetics. “An approach that views these [imprinting, metabolic networks, genetic hierarchies in embryonic development, and epigenetic mechanisms of gene activation in cancer] and other complex phenotypes from the genomic level down, rather than from the genetic level up, can provide powerful insights into the functional interrelationships of genes in health and disease.” S Beck, A Olek and J Walter “From genomics to epigenomics” Nature Biotechnology 17 (12):1144 Dec 1999

Takes a whole-genome approach to studying environmental or developmental epigenetic effects, primarily DNA methylation, on gene function. Thus, epigenomics focuses on those genes whose function is determined by external factors. Brush up on your 'omics, Chemical & Engineering News, 81(49): 20, Dec. 2003  http://pubs.acs.org/cen/coverstory/8149/8149genomics1.html     Related terms: Expression; Gene definitions epigenetics

epigenotype: Patients with disorders involving imprinted genes such as Angelman syndrome (AS) and Prader- Willi syndrome (PWS) can have a mutation in the imprinting mechanism. Previously, we identified an imprinting center (IC) within chromosome 15q11-q13 and proposed that IC mutations block resetting of the imprint, fixing on that chromosome the parental imprint (epigenotype) on which the mutation arose. S Saitoh, "Minimal definition of the imprinting center and fixation of chromosome 15q11-q13 epigenotype by imprinting mutations" Proceedings of the National Academy of Sciences U S A PNAS  93 (15) :7...  CH Waddington

epistasis: Two or more genes interacting with one another in a multiplicative (the effects of alleles at loci which together contribute to a phenotype when their combination is not equal to the sum of the individual contribution of each allele by itself) fashion. [NHLBI]

eSNPs expression SNPs: We present the first empiric study to systematically characterize the set of single nucleotide polymorphisms associated with expression (eSNPs) in liver, subcutaneous fat, and omental fat tissues, demonstrating these eSNPs are significantly more enriched for SNPs that associate with type 2 diabetes (T2D) in three large-scale GWAS than a matched set of randomly selected SNPs. Zhong H, Beaulaurier J, Lum PY, Molony C, Yang X, et al. (2010) Liver and Adipose Expression Associated SNPs Are Enriched for Association to Type 2 Diabetes. PLoS Genet 6(5): e1000932. doi:10.1371/journal.pgen.1000932 http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000932 

exon skipping: As many as half of the disease- associated single- nucleotide mutations in the coding regions of genes do not alter specific amino acids in the protein, but rather affect RNA splicing; for example, by inactivating or creating a splice site. And more often than not, these mutations result in the exclusion of exons from mRNA, a process known as exon skipping. So finding a way to correct these mutations and reinstate these exons into the transcript could be an effective route to treating the underlying cause of a wide range of diseases. Simon Frantz The Essence of Correction, Nature Reviews Drug Discovery 2: 170 April 2003

One of the major forms of alternative splicing, which generates multiple mRNA isoforms differing in the precise combinations of their exon sequences, is exon skipping. While in constitutive splicing all exons are included, in the skipped pattern(s) one or more exons are skipped. The regulation of this process is still not well understood; so far, cis- regulatory elements (such as exonic splicing enhancers) were identified in individual cases. We therefore set to investigate the possibility that exon skipping is controlled by sequences in the adjacent introns. Conserved sequence elements associated with exon skipping E. Miriami, H. Margalit, R. Sperling Nucleic Acids Research 31 (7) : 1974- 1983, Apr 1, 2003

Differential exon use is a hallmark of alternative splicing, a prevalent mechanism for generating protein isoform diversity. Many disease- associated mutations also affect pre- mRNA splicing, usually causing inappropriate exon skipping. Correction of disease- associated exon skipping by synthetic exon- specific activators L. Cartegni, AR Krainer Nat Structural Biology Feb. 2003 10 (2) :120 -125

exon SNPs, exonic SNPs: Are these the same as coding SNPs cSNPs?

founder populations: Those in which an identifiable current population derives from a relatively small group of founders. The small size of the founder group suggests that one will find fewer genes and fewer alleles involved in susceptibility to a given disease within the population. 

frame-shift mutation: A type of mutation in which a number of nucleotides not divisible by three is deleted from or inserted into a coding sequence, thereby causing an alteration in the reading frame of the entire sequence downstream of the mutation. These mutations may be induced by certain types of mutagens or may occur spontaneously. MeSH, 1991. Related term: reading frames.  Sequences, DNA & beyond.
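The "not divisible by three" rule reduces to a one-line check on the net length change. This is a minimal sketch, assuming the variant lies inside a coding sequence and is described by hypothetical reference and alternate strings; the function name is invented for illustration.

```python
def is_frameshift(ref: str, alt: str) -> bool:
    """True when the net inserted or deleted length is not a multiple of three.

    ref and alt are the reference and variant sequences of a change that is
    assumed to fall within a protein-coding region.
    """
    return (len(alt) - len(ref)) % 3 != 0


one_base_insertion = is_frameshift("ATG", "ATGA")      # +1 base
three_base_deletion = is_frameshift("ATGAAA", "ATG")   # -3 bases
substitution = is_frameshift("A", "G")                 # no length change
```

Python's `%` returns a non-negative remainder for a positive divisor, so deletions (negative length changes) are handled by the same expression as insertions.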

functional polymorphisms:  To understand the mechanistic basis by which a polymorphism is associated with a particular phenotype or behavioural outcome, it is necessary to know whether that polymorphism is functional (i.e., whether it alters the function of a gene or set of genes). In most cases, the function of an associated polymorphism is not defined and must be surmised or extrapolated as an effect on the gene that contains this polymorphism. In rare cases, a polymorphism may be a nonsynonymous coding region variation that alters the gene product protein structure. What is a functional genetic polymorphism? Defining classes of functionality. J Psychiatry Neurosci. 2011;36(6):363-5. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3201989/

Assessing the functionality of a given polymorphism is not a straightforward process. Most genes have several polymorphic sites, many of which may impact on gene expression; SNPs are inherited as haplotypes and the potential for post-translational modification of expressed genes means that simply assessing gene transcription is not evidence of a biological effect. Even if an SNP is associated with a “functional” effect, there is no guarantee that the associated function will have a biological consequence. Most biological systems have a degree of built-in redundancy to cope with relative differences in key components.  P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732.

functional variants: Genetic variation in the general population (not just patients and controls) that can potentially affect gene function directly.  ... We would not need to know the disease status of the individuals sampled.  This rich catalogue of genetic changes ... would include SNPs that alter amino acids in proteins, and possibly gene-splicing or expression levels. Most would be rarer than those pursued by the HapMap.    Richard Gibbs, "Deeper into the genome" Nature 7063:1233- 1234, 27 Oct. 2005

gene-based SNPs: Not just coding SNPs but also intron SNPs and promoter SNPs

gene-disease associations: The discovery of gene- disease associations requires scanning these same hundreds of thousands of SNPs in as many as 1,000 individuals, possibly using microarray technology. Validation of gene- disease association studies requires measuring very few SNPs in thousands of individuals. Related term: association studies

gene duplication: See duplication

gene mapping: describes the methods used to identify the locus of a gene and the distances between genes.  Wikipedia accessed 2018 Nov 8  https://en.wikipedia.org/wiki/Gene_mapping

genetic drift: Random fluctuations in gene frequencies, particularly in small populations. 

genetic mapping: also called linkage mapping - can offer firm evidence that a disease transmitted from parent to child is linked to one or more genes. Mapping also provides clues about which chromosome contains the gene and precisely where the gene lies on that chromosome.  Genetic maps have been used successfully to find the gene responsible for relatively rare, single-gene inherited disorders such as cystic fibrosis and Duchenne muscular dystrophy. Genetic maps are also useful in guiding scientists to the many genes that are believed to play a role in the development of more common disorders such as asthma, heart disease, diabetes, cancer, and psychiatric conditions.  NHGRI Genetic Mapping https://www.genome.gov/10000715/

genetic variants: Narrower terms: alleles, mutations, polymorphisms, SNPs and even narrower terms:    deletions, duplications, enhancements, frame- shift mutations, idiomorphisms, indels, insertions, leaky mutations, loss of function mutations, null mutations, point mutations, reassortments, VNTRs. Variants seems to be replacing these terms in sequencing databases.

genetic variation - detection technologies: Includes microarrays, dHPLC, DNA fingerprinting, DNA footprinting, DNA ligase enzymes, endonucleases, ribotyping, SSCP, SSEP. Related terms: protein variations; Sequencing  genotype, haplotype, scanning, scoring.

genome mapping: The essence of all genome mapping is to place a collection of molecular markers onto their respective positions on the genome. Molecular markers come in all forms. Genes can be viewed as one special type of genetic markers in the construction of genome maps, and mapped the same way as any other markers.  Wikipedia accessed 2018 Nov 8  https://en.wikipedia.org/wiki/Gene_mapping  

There are two distinctive types of "Maps" used in the field of genome mapping: genetic maps and physical maps. While both maps are a collection of genetic markers and gene loci, genetic maps' distances are based on the genetic linkage information, while physical maps use actual physical distances usually measured in number of base pairs. While the physical map could be a more "accurate" representation of the genome, genetic maps often offer insights into the nature of different regions of the chromosome, e.g. the genetic distance to physical distance ratio varies greatly at different genomic regions which reflects different recombination rates, and such rate is often indicative of euchromatic (usually gene-rich) vs heterochromatic (usually gene poor) regions of the genome. …Genome sequencing is sometimes mistakenly referred to as "genome mapping" by non-biologists. The process of "shotgun sequencing" resembles the process of physical mapping: it shatters the genome into small fragments, characterizes each fragment, then puts them back together (more recent sequencing technologies are drastically different). While the scope, purpose and process are totally different, a genome assembly can be viewed as the "ultimate" form of physical map, in that it provides in a much better way all the information that a traditional physical map can offer. Wikipedia accessed 2018 Nov 8  https://en.wikipedia.org/wiki/Gene_mapping#Genetic_mapping_vs_physical_mapping

genotype: Sequencing 

GWAS Central provides a centralized compilation of summary level findings from genetic association studies, both large and small. We actively gather datasets from public domain projects. https://www.gwascentral.org/

haploinsufficiency: The situation in which an individual who is heterozygous for a certain gene mutation or hemizygous at a particular locus, often due to a deletion of the corresponding allele, is clinically affected because a single copy of the normal gene is incapable of providing sufficient protein production as to assure normal function.  Genetics Home Reference, National Library of Medicine, NIH   http://ghr.nlm.nih.gov/ghr/glossary/haploinsufficiency

Wikipedia https://en.wikipedia.org/wiki/Haploinsufficiency

haplotype, haplotyping, haplotyping technologies: Sequencing

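Hardy-Weinberg law: A principle of population genetics that predicts genetic equilibrium in large populations: in the absence of selection, mutation, migration and non-random mating, allele and genotype frequencies remain constant from generation to generation.  GH Hardy was a British mathematician and Wilhelm Weinberg a German physician.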
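At Hardy-Weinberg equilibrium, a bi-allelic locus with allele frequencies p and q = 1 - p has expected genotype frequencies p², 2pq and q². The sketch below estimates p from observed genotype counts and returns the expected counts; the function name and the counts are hypothetical and serve only as an illustration.

```python
def hardy_weinberg_expected(n_AA: int, n_Aa: int, n_aa: int) -> tuple[float, dict[str, float]]:
    """Return the estimated frequency of allele A and expected genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotypes observed")
    # Each AA individual carries two A alleles, each Aa individual one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    expected = {
        "AA": p * p * n,      # p^2
        "Aa": 2 * p * q * n,  # 2pq
        "aa": q * q * n,      # q^2
    }
    return p, expected


# Hypothetical sample of 1,000 individuals.
p, expected = hardy_weinberg_expected(n_AA=360, n_Aa=480, n_aa=160)
```

Comparing the expected counts with the observed ones (for example with a chi-square test) is a common way to check whether a sample departs from equilibrium.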

heritability: Before we attempt to identify disease genes, we need to have some estimate of the heritable component of the disease so that we can determine the best and most pragmatic means of investigating it. Simple measures of heritability include: concordance in monozygotic and dizygotic twins, the degree of familial aggregation, and calculations such as sibling relative risk.  P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732
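Sibling relative risk is commonly calculated as the disease risk in siblings of affected individuals divided by the prevalence in the general population. The sketch below shows that ratio only; the function name and the input figures are hypothetical, and the calculation says nothing about whether familial clustering is genetic or environmental in origin.

```python
def sibling_relative_risk(sibling_risk: float, population_prevalence: float) -> float:
    """Ratio of disease risk in siblings of affected individuals to population prevalence."""
    if population_prevalence <= 0:
        raise ValueError("population prevalence must be positive")
    return sibling_risk / population_prevalence


# Hypothetical figures: 5% risk in siblings, 0.5% prevalence in the population.
lambda_s = sibling_relative_risk(sibling_risk=0.05, population_prevalence=0.005)
```

A ratio well above 1 indicates familial aggregation, which is one of the simple measures listed in the entry above.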

HGVbase now GWAS Central

Human Genome Variation Society http://www.hgvs.org/

idiomorphism:  A polymorphism or any type of variation in DNA sequence occurring with less than 1% frequency.  Nicholas Schork to Malorye Branca, personal communication Sept. 1999  Compare with SNP.

indel: HGVS: confusing term, do not use  Sometimes: a sequence change where, compared to a reference sequence, one or more nucleotides are replaced by one or more other nucleotides
Sometimes: a variant which is a deletion or an insertion.  MESH

a length difference between two alleles where it is unknowable if the difference was originally caused by a sequence insertion or a sequence deletion. Human Genome Variation Society, Sequence Variant Glossary http://varnomen.hgvs.org/bg-material/glossary/  See also insertion

INDEL Mutation: A mutation named with the blend of insertion and deletion. It refers to a length difference between two ALLELES where it is unknowable if the difference was originally caused by a SEQUENCE INSERTION or by a SEQUENCE DELETION. If the number of nucleotides in the insertion/deletion is not divisible by three, and it occurs in a protein coding region, it is also a FRAMESHIFT MUTATION. MeSH 2008

indirect approach: See random genome-wide association studies.

informative polymorphisms: Within candidate genes, the number of common polymorphisms is finite, but direct assay of all existing common polymorphism is inefficient, because genotypes at many of these sites are strongly correlated. Thus, it is not necessary to assay all common variants if the patterns of allelic association between common variants can be described. We have developed an algorithm to select the maximally informative set of common single-nucleotide polymorphisms (tagSNPs) to assay in candidate-gene association studies, such that all known common polymorphisms either are directly assayed or exceed a threshold level of association with a tagSNP. Am J Hum Genet. 2004 Jan;74(1):106-20. Epub 2003 Dec 15. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA. http://www.ncbi.nlm.nih.gov/pubmed/14681826
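
The selection idea described above can be sketched as a simple greedy procedure: repeatedly pick the SNP that exceeds the association threshold with the largest number of not-yet-covered SNPs. This is a simplified illustration, not the authors' published implementation; the SNP names, the pairwise r-squared values and the 0.8 threshold are all assumptions made for the example.

```python
def select_tag_snps(snps, r2, threshold=0.8):
    """Greedy tagSNP selection from pairwise r^2 values.

    snps: list of SNP identifiers
    r2: dict mapping frozenset({snp1, snp2}) to their r^2 (missing pairs count as 0)
    """
    def tagged_by(snp):
        # A SNP covers itself plus every SNP associated with it above the threshold
        return {other for other in snps
                if other == snp or r2.get(frozenset((snp, other)), 0.0) >= threshold}

    uncovered = set(snps)
    tags = []
    while uncovered:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(uncovered), key=lambda s: len(tagged_by(s) & uncovered))
        tags.append(best)
        uncovered -= tagged_by(best)
    return tags


# Hypothetical data: four SNPs in one candidate gene
snps = ["snpA", "snpB", "snpC", "snpD"]
r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpC", "snpD")): 0.20,
}
tags = select_tag_snps(snps, r2)
print(tags)
```

Here snpB stands in for snpA and snpC, while snpD is correlated with nothing and has to be assayed directly.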

insertion: Several nucleotides can be added to a sequence, resulting in an insertion.  Effects of an insertion are variable.  Insertions and deletions can be hard to tell apart and are sometimes referred to collectively as “indels”.

intra-genic sequence variants: HGBASE is not designed to include gene `mutations', but instead is a catalog of intra-genic (promoter to transcription end point) sequence variants found in `normal' individuals. Although the distinction between `mutation' and `variation' can be somewhat blurred, the general idea is that the content of HGBASE concerns frequently occurring `normal polymorphisms', whether or not they are suspected to increase the risk of developing a particular phenotype. This is in contrast to `mutant sequences' which are known to cause genetic disease. Despite its name, HGBASE contains all types of intra-genic variation and is not limited to bi-allelic polymorphisms (though these do represent most of the database content). Both functional polymorphisms (e.g. promoter and non-silent codon changes) and non-functional polymorphisms (e.g. intron sequence differences) are included. This is for two reasons. Firstly, it is often difficult to be certain about the functional consequence of a variation. Secondly, regardless of functional relevance, any intra-genic polymorphism can usually be employed as an effective surrogate marker for an unknown functional variant in an association study, due to close proximity and linkage disequilibrium.  Human Genetic Bi-allelic Sequences, HGBASE, a Database of Intra-genic Polymorphisms  1998  http://www.bioline.org.br/request?oc98131     http://www.scielo.br/pdf/mioc/v93n5/12m.pdf  Now HGVBase http://www.hgvbaseg2p.org/index     

intron SNPs, intronic SNPs: Are these the same as anonymous SNPs?  See also under gene based SNPs

leaky mutation: Some function remains, but not at the level of the wild type allele. Philip McClean, Intermediate Genetics PLSC 431, North Dakota State University https://www.ndsu.edu/pubweb/~mcclean/plsc431/mutation/mutation4.htm

linkage: Two loci are physically connected to one another on the same chromosome at a distance that is measured at less than 50% recombination.  Two traits are linked when they fail to be transmitted to offspring independently from one another.  The more closely linked two loci are to one another, the greater the chance that both loci will be transmitted to offspring together. The tendency for genes that are located close to each other on the same chromosome to be inherited together. [NHLBI]

The proximity of two or more markers (e.g. genes, RFLP markers) on a chromosome; the closer together the markers are, the lower the probability that they will be separated during DNA repair or replication processes (binary fission in prokaryotes, mitosis or meiosis in eukaryotes), and hence the greater the probability that they will be inherited together. DOE Related terms: linkage analysis, linkage disequilibrium.

linkage analysis:  Early gene discovery methods centered on linkage analysis in families, wherein clusters of patients with a particular disease were identified and studies were expanded in large pedigrees. The association of microsatellite markers with disease pointed the way to genome loci likely to contain a disease gene. While this family linkage approach is adequate for high- penetrance variants, it is not very useful for polygenic diseases, which would require impractically large family clusters to yield meaningful data. Related terms linkage, linkage disequilibrium.

linkage disequilibrium: When alleles at two distinctive loci occur in gametes more frequently than expected given the known allele frequencies and recombination fraction between the two loci, the alleles are said to be in linkage disequilibrium.  Evidence for linkage disequilibrium can be helpful in mapping disease genes since it suggests that the two may be very close to one another. [NHLBI] 

The tendency of closely spaced alleles to be inherited together. Linkage disequilibrium reduces the number of polymorphic markers that must be studied when using random markers to detect association between a gene and a trait. In the absence of linkage disequilibrium, only a causative polymorphism shows any appreciable difference between the case and the control group. However, in the presence of linkage disequilibrium, polymorphisms that are physically near a causal polymorphism will also show a difference in frequency between cases and controls, and an enhanced association with the trait in question. Related terms: haplotype, haplotyping technologies, linkage, linkage analysis.
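
Two commonly used summary measures of linkage disequilibrium are D, the difference between the observed frequency of a two-allele combination in gametes and the frequency expected if the alleles were independent, and r-squared, a normalized version of D. The sketch below computes both for one pair of alleles; the function name and the example frequencies are illustrative assumptions.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for allele A at one locus and allele B at another.

    p_ab: observed frequency of gametes (haplotypes) carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # excess over the frequency expected under independence
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    return d, d * d / denominator


# Illustrative frequencies: A and B occur together more often than 0.5 * 0.5 = 0.25
d, r_squared = linkage_disequilibrium(p_ab=0.40, p_a=0.50, p_b=0.50)
print(f"D = {d:.2f}, r^2 = {r_squared:.2f}")
```

A D of zero means the alleles are in linkage equilibrium; r-squared close to 1 means one marker is nearly a perfect proxy for the other.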

localize, locus, loci (plural): Gene definitions

loss of function mutation: Wild type alleles typically encode a product necessary for a specific biological function. If a mutation occurs in that allele, the function for which it encodes is also lost. The general term for these mutations is loss- of- function mutations. Philip McClean, Intermediate Genetics PLSC 431, North Dakota State University, 2000 https://www.ndsu.edu/pubweb/~mcclean/plsc431/mutation/mutation4.htm

markers: 1. (DNA) A fragment of known size used as reference for analytical purposes. 2. (genetic) A gene with known phenotype and mapped position. 3. (chromatography) A reference substance co- chromatographed with the sample to assist in identifying the components. IUPAC Compendium

A segment of DNA with an identifiable physical location on a chromosome whose inheritance can be followed. A marker can be a gene, or it can be some section of DNA with no known function. Because DNA segments that lie near each other on a chromosome tend to be inherited together, markers are often used as indirect ways of tracking the inheritance patterns of genes that have not yet been identified, but whose approximate locations are known. [NHGRI]

Types of genetic maps are differentiated largely by the type of marker used.  Narrower terms: polymorphisms, DNA fingerprints, microsatellite markers, microsatellite repeats, microsatellites,  minisatellite repeats, RFLPs, restriction enzymes, SSRs, STRs, STSs, satellites Biomarkers biological markers, biomarkers, DNA markers, genetic markers, surrogate markers; DNA: ESTs; Related terms:  Maps- genomic & genetic

microsatellites: Consist of tandem repeats, which contain repetitive runs of the same short base sequence (e.g., GTA, GTA, GTA…). Among individuals, these sections of DNA may vary in the number of repeats they contain and can serve as markers and signs of genetic variation. 
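
To make the idea of repeat-number variation concrete, the sketch below counts the longest uninterrupted run of a short motif in a sequence. The two sequences are made-up examples standing in for the same locus in two individuals.

```python
import re


def longest_repeat_run(sequence, motif):
    """Number of motif copies in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical alleles at one microsatellite locus
allele_1 = "CC" + "GTA" * 5 + "TT"
allele_2 = "CC" + "GTA" * 8 + "TT"
print(longest_repeat_run(allele_1, "GTA"), longest_repeat_run(allele_2, "GTA"))
```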

minisatellite repeats: Tandem arrays of moderately repetitive (5- 50 repeats) short (10- 60 bases) DNA sequences found dispersed throughout the genome and clustered near telomeres. Their degree of repetition is two to several hundred at each locus. Loci number in the thousands but each locus shows a distinctive repeat unit. Minisatellite repeats are often called variable number of tandem repeats [VNTRs].  MeSH, 1997   Also known as SSRs Related term/broader term?: tandem repeats. Broader term polymorphisms

missense mutation:  Wikipedia http://en.wikipedia.org/wiki/Missense_mutations

molecular variants: Narrower terms: alleles, mutations, polymorphisms, SNPs. Broader term: variants.

mosaicism: The occurrence in an individual of two or more cell populations of different chromosomal constitutions, derived from a single ZYGOTE, as opposed to CHIMERISM in which the different cell populations are derived from more than one zygote. MeSH Year introduced: 1967(1964)

multifactorial: As polygenic but requires environmental input for disease genesis. P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732

multifactorial diseases: Diseases that result from a complex interplay of multiple genes and the environment. CHA Cambridge Healthtech Advisors, Clinical Genomics: The Impact of Genomics on Clinical Trials and Medical Practice report, 2004 Compare multigenic, polygenic, oligogenic.

multigenic: Traits controlled by more than one allele. Compare multifactorial, oligogenic, polygenic

multiple alleles: The state of having three or more alleles, or distinctive forms of a gene.  Wiktionary accessed Jan 12 2011  http://en.wiktionary.org/wiki/multiple_allelism

ABO blood groups are a good example of multiple allelism.  Individuals have only two alleles, but there are a number of different combinations and phenotypes.

multiplicative: See under epistasis.

mutation:  Genetic variation present in less than 1% of the population. P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732

A heritable change in the nucleotide sequence of genomic DNA (or RNA in RNA viruses), or in the number of genes or chromosomes in a cell, which may occur spontaneously or be brought about by chemical mutagens or by radiation (induced mutation). [IUPAC Bioinorganic]

Any detectable and heritable change in the genetic material not caused by genetic segregation or recombination, which is transmitted to daughter cells and to succeeding generations, providing it is not a dominant lethal factor. MeSH, 1964

A change, deletion, or rearrangement in the DNA sequence that may lead to the synthesis of an altered inactive protein or the loss of the ability to produce the protein. If a mutation occurs in a germ cell, then it is a heritable change in that it can be transmitted from generation to generation. Mutations may also be in somatic cells and are not heritable in the traditional sense of the word, but are transmitted to all daughter cells. [NHLBI]  

Any type of change (including insertions, deletions, point mutations, and rearrangements in the nucleotide sequence of DNA) which leads to variations in the population. Genetic changes that have been associated with disease risk or were caused by damage inflicted by external agents (such as radiation) are particularly described as mutations. Related terms: alleles, polymorphisms, SNPs. Narrower terms: deletions,  duplications, frame shift mutations, leaky mutations, loss of function mutations, null mutations, neutral mutations, point mutations, suppressor mutations; Functional genomics  targeted mutation.    Mutation databases see Databases & software directory under 'variations'.

mutation detection: See genetic variation detection technologies

mutation rate: The frequency with which a mutation occurs within an organism or gene. In general, rates of spontaneous mutation vary between one in 10^4 and one in 10^8 per gene per generation, and can be considerably increased by mutagens. [IUPAC Biotech]

negative selection: Many mutations are deleterious to the fitness of an organism. These will be selected against and eventually lost from the population.  S. Sunyaev “SNP frequencies in human genes” Trends in Genetics 16(8): 335-337 August 2000

neutral mutation: Substitutions that have no selective advantage or disadvantage. S. Sunyaev  “SNP frequencies in human genes” Trends in Genetics 16(8): 335-337 August 2000

nonsense mutation: (also called STOP mutation) Any change in DNA that causes a (termination) codon to replace a codon representing an amino acid. W Fangman, Definitions of course terms, Genetics 372, Winter 2000   http://depts.washington.edu/genetics/courses/genet372/w2000Terms.html 

non- synonymous SNPs, nsSNPs: When the altered code doesn't correspond to the same amino acid as the "wild- type" sequence. Substitutions in coding regions that result in a different amino acid. Broader term: cSNP Related term: synonymous SNP

nucleotide diversity: The number of base differences between two genomes, divided by the number of base pairs compared.  A sensitive indicator of biological and historical factors that have affected the human genome. A. Chakravarti "...to a future of genetic medicine" Nature 409: 822-823, 15 Feb. 2001 Related terms: population genetics, population genomics
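
The calculation in this definition is a simple ratio. A minimal sketch, assuming two already-aligned sequences of equal length with no gaps (the sequences shown are made up):

```python
def pairwise_difference(seq_a, seq_b):
    """Base differences between two aligned sequences per base pair compared."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical 10-base alignment with one mismatch at the last position
diversity = pairwise_difference("ACGTACGTAC", "ACGTACGTAT")
print(diversity)
```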

null mutation: A mutation where function is totally lost.

oligogenic: Single mutations in each case or family but several different genes make up the disease. P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732.

A trait is considered to be oligogenic when two or more genes work together to produce the phenotype. Implies that ‘few’ genes are involved and should be contrasted with a polygenic trait, which implies that many genes are involved in phenotype expression. [NHLBI] Compare multifactorial, multigenic, polygenic

penetrance: Genomics
phene: See Databases & software directory under OMIA Online Mendelian Inheritance in Animals
phenotype: Genomics  Related term: genetic architecture

phenocopy: A phenotype that is not genetically controlled but looks like a genetically controlled phenotype. An  environmentally induced phenotype that resembles the phenotype produced by a mutation. [Edinburgh]

pleiotropism or pleiotropy: Single genes produce multiple, seemingly unrelated phenotypic effects.

point mutations:  A mutation caused by the substitution of one nucleotide for another. This results in the DNA molecule having a change in a single base pair. MeSH, 1993

In gene mutation, one allele of a gene changes into a different allele. Because such a change takes place within a single gene and maps to one chromosomal locus (“point”), a gene mutation is sometimes called a point mutation. This terminology originated before the advent of DNA sequencing and therefore before it was routinely possible to discover the molecular basis for a mutational event. Nowadays, point mutations typically refer to alterations of single base pairs of DNA or of a small number of adjacent base pairs  NCBI Resources, Molecular Basis of Mutation https://www.ncbi.nlm.nih.gov/books/NBK21322/

Point mutations affect only one or a few nucleotides within a gene   University of Evansville http://faculty.evansville.edu/de3/b10004/PDFs/13b_Sources_variation.pdf 

SNPs and point mutations are structurally identical, differing only in their frequency. Variations that occur in 1% or less of a population are considered point mutations, and those occurring in more than 1% are SNPs.  

polygenic: Involves multiple genes that interact to create a permissive gene pool for disease genesis. P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732  Related terms:  multifactorial, multigenic, oligogenic, epistasis

positional candidates: Functional genomics Not to be confused with the candidate gene strategy.

polymorphisms: A term formulated by population geneticists to describe loci at which there are two or more alleles that are each present at a frequency of at least 1% in a population of animals. The term has been co- opted for use in transmission genetics to describe any locus at which at least two alleles are available for use in breeding studies, irrespective of their actual frequencies in natural populations. [NHLBI]

A gene that exists in more than one version (allele), and where the rare allele can be found in more than 2% of the population. [NHGRI]   

Genetic variations, broadly encompassing any of the many types of variations in DNA sequence that are found within a given population. Specific subtypes of polymorphisms include mutations, point mutations, and SNPs.   Polymorphisms have  the sense of being more neutral than mutations.  Narrower terms: functional polymorphisms, idiomorphisms, non- functional polymorphisms; SNP- human, microsatellite repeat polymorphisms, RFLPs, SSRs simple sequence repeats, STRs, tandem repeats. Related terms: CEPH, International SNP Map Working Group, linkage analysis, SSCP, SSEP, alleles, association, linkage, mutations, population genetics, sequence variants, SNPs. Broader term: variants

population genetics: The study of the genetic composition of populations and of the effects of factors such as selection, population size, mutation, migration, and genetic drift on the frequencies of various genotypes and phenotypes. MeSH, 1966 Until recently, a small rather esoteric specialty.

population genomics: The study of the forces that determine patterns of neutral and adaptive variation in genomes. Michel Vieulle, Expression of interest for a network in population genomics, European Framework Programme 6  http://gdrevol.snv.jussieu.fr/pgmics.pdf 
Guidelines for referring to populations in publications and presentations, https://www.coriell.org/0/Sections/Support/NHGRI/NHGRI_Pop_Ref.aspx?PgId=688

promoter SNPs pSNPs:  If a cSNP or an rSNP leads to an altered amino acid, which in turn leads to altered protein function or expression and an observable change in the organism’s phenotype, the  SNP may be labeled a pSNP. 

protein polymorphisms: Polymorphisms in exons (protein coding DNA). cSNPs are a subset of protein polymorphisms. 

protein variants:  Variations in proteins have very large number of diverse effects affecting sequence, structure, stability, interactions, activity, abundance and other properties. Although protein-coding exons cover just over 1 % of the human genome they harbor a disproportionately large portion of disease-causing variants… Protein variants can be of genetic origin or emerge at protein level. Types and effects of protein variants Vihinen, M. Hum Genet (2015) 134: 405. https://doi.org/10.1007/s00439-015-1529-6 https://link.springer.com/article/10.1007/s00439-015-1529-6  Broader terms: genetic variants, variants

random genome-wide association studies: Looks at many random SNPs in the hope that some will be in linkage disequilibrium with a particular gene or genes. Broader term: association studies.

reassortment: The rearrangement of genes from two distinct strains to produce a novel viral strain. 

recombinant: The result of a crossover in a doubly heterozygous parent such that alleles at two loci that were present on opposite homologs are brought together on the same homolog.  The term is used to describe the chromosome as well as the animal in which it is present. [NHLBI]  Related  terms: genetic recombination; recombinant DNA technology, recombinant antibodies, recombinant DNA, recombinant proteins, recombination 

regional scanning:  Developed by Genset.  Used linkage studies to identify sequence regions from the public domain and their own studies to narrow the parts of the genome they would subsequently scan.  Then using high-density mapping, making a first pass using a map with SNPs every 30- 40 kb in the regions of interest and then increasing the density to every 5-10 kb as they closed in on the genes of interest.  Can only be used for studies of disease genes for which family linkage data is available. Related terms: Maps- genomic & genetic

regulatory SNPs rSNPs:  These SNPs affect regulatory regions that govern gene expression.  Thought to be relatively uncommon and potentially as valuable as cSNPs.  
rSNP guide
http://wwwmgs.bionet.nsc.ru/mgs/systems/rsnp/help.html

repeats: Narrower terms:  include microsatellite repeats, minisatellite repeats, short tandem repeats, satellites, tandem repeats, VNTRs

restriction analysis: Uses naturally occurring enzymes that cut DNA at precise sites. The enzymes cut the DNA from different individuals in different places when individuals differ in the sequence of a site where the enzyme cuts. The cut fragments will therefore be different sizes, and the pattern of sizes will form a DNA pattern, or "DNA fingerprint" for the individual. CHA Cambridge Healthtech Advisors, Clinical Genomics: The Impact of Genomics on Clinical Trials and Medical Practice report, 2004
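
The way a single sequence difference changes the fragment pattern can be sketched in a few lines. The example uses the EcoRI recognition site GAATTC, which is cut after the first G, on a single strand of linear DNA; the two sequences are made up, and the second differs from the first by one base that abolishes a site.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths after cutting linear DNA at every occurrence of site.

    cut_offset is the position of the cut within the recognition site.
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical sequences from two individuals; one base differs in the second site
individual_1 = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
individual_2 = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

pattern_1 = digest(individual_1, "GAATTC", 1)
pattern_2 = digest(individual_2, "GAATTC", 1)
print(pattern_1, pattern_2)
```

The first individual yields three fragments and the second only two, which is the size difference that electrophoresis would reveal.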

restriction enzymes: Endonucleases which recognize specific base sequences within a DNA helix, creating a double- strand break of DNA. Type I restriction enzymes bind to these recognition sites but subsequently cut the DNA at different sites. Type II restriction enzymes both bind and cut within their recognition or target sites. [IUPAC Biotech]  

A group of enzymes isolated from bacteria that cut DNA molecules at specific sites characterized by certain nucleotide sequences. [NHLBI] Broader term: markers.

Restriction Fragment Length Polymorphism RFLP:  Variation occurring within a species in the presence or length of DNA fragment generated by a specific endonuclease at a specific site in the genome. Such variations are generated by mutations that create or abolish recognition sites for these enzymes or change the length of the fragment. MeSH, 1995

Restriction fragment length polymorphism can be used in SNP genotyping. This method relies on enzymatic cleavage of DNA followed by electrophoresis. Broader term: marker 

ribotyping: RESTRICTION FRAGMENT LENGTH POLYMORPHISM analysis of rRNA genes that is used for differentiating between species or strains. MeSH, 2001

satellite: Many tandem repeats (identical or related) of a short basic repeating unit;  many have a base composition or other property different from the genome average  that allows them to be separated from the bulk (main band) genomic DNA. DDBJ/ EMBL/ GenBank Feature Table http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html  Narrower terms: microsatellite, minisatellite

scanning: Technology used to discover new or unknown SNPs. M Phillips CHI Nucleic Acid Detection Technologies conference, June 7- 9, 2000 Related term: genome scan (broader? narrower?).

scoring: Determining, by comparison, the base pairs (genotypes) at the locus for many individuals for particular SNPs that have already been discovered. Related terms: Sequencing genotyping, scoring methods

segregation:  The principle that the two partners of a chromosome pair are separated during meiosis and distributed randomly to the germ cells.  Each germ cell has an equal chance of receiving either chromosome.  NHLBI.  Related terms: aggregate, Mendelian.

sequence variant: Following recommendations of the Human Genome Variation Society, “sequence variant” is a more inclusive term than “polymorphism.” What is a functional genetic polymorphism? Defining classes of functionality. J Psychiatry Neurosci. 2011;36(6):363-5. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3201989/

Single Amino acid Polymorphisms SAAPs: Structurally expressed SNPs and mutations. Andrew CR Martin, Single Amino Acid Polymorphism Database http://www.bioinf.org.uk/saap/ 

Polymorphisms, which differ by a single amino acid. 

‘slightly’ deleterious alleles: An allelic variant subject to negative selection, but the selection coefficient is relatively low. The frequency of a slightly deleterious allele in a population is subject both to the stochastic fluctuations of genetic drift, depending on population size, and to a very weak negative selection.  Unlike strongly deleterious alleles, which are quickly eliminated by selection, slightly deleterious alleles can be kept in a population for a long time owing to drift. Due to selective pressure, they are predominantly observed at low frequencies (in comparison to purely neutral alleles). S. Sunyaev  “SNP frequencies in human genes” Trends in Genetics 16(8): 335-337 August 2000

SNP Single nucleotide polymorphism: The most common form of DNA variation, alterations to a single base. If the SNP is in a gene, it can disrupt the gene's function. Most SNPs do not occur in genes, but can be associated with other types of DNA variation and so are used effectively as markers. CHA Cambridge Healthtech Advisors, Clinical Genomics: The Impact of Genomics on Clinical Trials and Medical Practice report, 2004  

SNPs are single base pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population(s), wherein the least frequent allele has an abundance of 1% or greater.  Thus single base insertion/ deletion variants (indels) would not formally be considered to be SNPs. ... In practice, the term SNP is typically used more loosely than required by the above definition. ... Complications with the above definition also exist. Specifically, some people might not want to consider disease predisposing single base variants to be SNPs - but the above definition would encompass such things as recessively acting, low penetrance, dominant, quantitative trait loci, or risk associated alleles, since all of these will occur in some normal (non- diseased) individual.  Also the 'some population' component of the definition is limited by practical challenges of attaining and surveying representative global population samples. Consequently, claims of non- polymorphic sequences should always be accompanied by statements of the actual populations and the numbers of chromosomes tested. Overall, it is therefore apparent that the term 'SNP' is being widely and imprecisely used as a catch- all label for many different types of subtle sequence variation. Anthony Brooks "The essence of SNPs" Gene 234: 177-186, 1999

This involves exchange of one nucleotide base (A, C, T or G) for another. P T Donaldson,  Genetics of liver disease: immunogenetics and disease pathogenesis, Gut. 2004 April; 53(4): 599–608  doi: 10.1136/gut.2003.031732.

A SNP is a position in the genome where some individuals have one DNA base (e.g., A), and others have a different base (e.g., C). SNPs and point mutations are structurally identical, differing only in their frequency. Variations that occur in 1% or less of a population are considered point mutations, and those occurring in more than 1% are SNPs. This distinction is pragmatic and reflects the fact that low- frequency mutations cannot be used effectively in genetic studies as genetic markers, while more common ones can.  
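
The 1% convention stated above amounts to a threshold test on the frequency of the less common allele. A minimal sketch, with a hypothetical function name and made-up counts; as Brooks notes, the result is only meaningful alongside the population and the number of chromosomes tested.

```python
def classify_single_base_variant(minor_allele_count, chromosomes_tested):
    """Label a single-base variant using the 1% frequency convention.

    More than 1% is labeled 'SNP'; 1% or less is labeled 'point mutation'.
    """
    if chromosomes_tested <= 0 or not 0 <= minor_allele_count <= chromosomes_tested:
        raise ValueError("counts must satisfy 0 <= minor_allele_count <= chromosomes_tested")
    # Integer comparison avoids floating-point rounding at exactly 1%
    if minor_allele_count * 100 > chromosomes_tested:
        return "SNP"
    return "point mutation"


print(classify_single_base_variant(50, 1000))  # 5% of chromosomes
print(classify_single_base_variant(3, 1000))   # 0.3% of chromosomes
```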

SNPs can occur in coding regions of the genome (cSNPs), in regulatory regions (rSNPs), or, most commonly, in "junk DNA" regions, in which case they are referred to as anonymous SNPs. Narrower terms: SNPs- human, single amino acid polymorphisms SAPS;  anonymous SNPs, cSNPs, candidate SNP, exonic SNPs, intron SNPs, pSNP, promoter SNPs, rSNP, SNP haplotypes, synonymous SNP. Related terms: idiomorphism, protein polymorphisms, SNP Consortium, SNP discovery, SNP scans

SNP Consortium: Became International HapMap Project https://en.wikipedia.org/wiki/International_HapMap_Project 

SNP haplotypes: Multilocus analysis of single-nucleotide polymorphism (SNP) haplotypes may provide evidence of association with disease, even when the individual loci themselves do not. Haplotype-based methods are expected to outperform single-SNP analyses because (i) common genetic variation can be structured into haplotypes within blocks of strong linkage disequilibrium and (ii) the functional properties of a protein are determined by the linear sequence of amino acids corresponding to DNA variation on a haplotype. Andrew P Morris, A Flexible Bayesian Framework for Modeling Haplotype Association with Disease, Allowing for Dominance Effects of the Underlying Causative Variants, American Journal of Human Genetics, 79: 679-694, 2006 DOI: 10.1086/508264 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1592560/

SNP - human: A single site in a nucleotide sequence that contains two to four allelic variations within a population at relatively high frequencies (1.0% by convention). S. Sunyaev "SNP frequencies in human genes" Trends in Genetics 16(8): 335-337 August 2000

About 30 million SNPs are thought to exist, making them much better markers than alternative markers, such as micro- satellite repeats or short tandem repeats. But it has been the discovery that some SNPs are linked to particular diseases that has fueled the rising interest in this field. ... to determine whether and how particular SNPs correlate to specific conditions, one may need to study hundreds of thousands of SNPs in thousands of patients. At this point in time, this remains a costly prospect. Also, the software tools are only now emerging to deal with the analysis challenges. 

SSCP  Single Strand Conformational Polymorphism: High throughput fragment analysis, a technique for screening for SNPs. 

SSEP Single Strand Electrophoretic Polymorphism: A technique for screening for SNPs. 

STRs Short Tandem Repeats:  Small regions of repeated bases throughout the human genome, which vary widely in length within the population.  These regions vary at much higher rates than SNPs, and in fact change too rapidly to be useful in identifying disease genes or propensity to disease.  However, they have proved very useful in identifying individuals by constructing a DNA fingerprint of the length of a set of STRs in the person's genome. CHA Cambridge Healthtech Advisors, Clinical Genomics: The Impact of Genomics on Clinical Trials and Medical Practice report, 2004

structural variants: No two genomes are alike; instead, each displays structural variability in the form of single-nucleotide polymorphisms (SNPs), deletions or insertions of various sizes, which are collectively called copy number variants (CNVs), and inversions, which are copy number neutral structural variants.  Nicole Rusk, Finding copy-number variants, Nature Methods 5(11):917 Nov 2008  

STSs Sequence-Tagged Sites:  Unique short sequence of DNA, typically less than 400 bases long.  Detected by PCR.  Allows identification of where in the genome the studied sequence is localized.  ESTs are STSs derived from cDNA.  Useful for orienting the physical mapping and sequence data reported from different laboratories.  [DOE]

A short DNA segment that occurs only once in the human genome and whose exact location and order of bases are known. Because each is unique, STSs are helpful for chromosome placement of mapping and sequencing data from many different laboratories. STSs serve as landmarks on the physical map of the human genome. [NHGRI]  Related terms: cloning, vectors Cell biology  positional cloning Functional genomics.
STS databases see Databases & software directory

suppression: Second mutation at a site distinct from the first mutation reverses, at least partially, the phenotypic expression of the first mutation. 

synonymous SNP: Substitutions in coding regions that result in the same amino acid. S. Sunyaev  “SNP frequencies in human genes” Trends in Genetics 16(8): 335-337 August 2000

When the altered code still corresponds to the same amino acid as the "wild- type" sequence  Narrower term: SAP single amino acid polymorphism Broader term: cSNP  Related term: non- synonymous SNP
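
Whether a coding-region substitution is synonymous can be decided by translating the codon before and after the change. The sketch below builds the standard genetic code table and compares the two amino acids; the function name is hypothetical, and codons are assumed to be given as DNA on the coding strand.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Compare the amino acids encoded by the reference and altered codons."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "non-synonymous"


print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```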

tag SNPs: The HapMap project has identified a further subset of 'tag' SNPs that are most useful in genetic association studies, because they link common haplotypes. Richard Gibbs, "Deeper into the genome" Nature 7063:1233- 1234, 27 Oct. 2005 

tandem repeats:  Multiple copies of the same base sequence.

targeted mutation: Functional genomics

tri-allelic:  Blood groups (A, B, O) are an example of tri-allelism. Related terms: allele, bi-allelic.

VNTRs Variable Number of Tandem Repeats: See minisatellite repeats.

whole genome duplication: Gene duplication is an important source of evolutionary novelty. Most duplications are of just a single gene, but Ohno proposed that whole- genome duplication (polyploidy) is an important evolutionary mechanism. KH Wolfe, DC Shields "Molecular evidence for an ancient duplication of the entire yeast genome" Nature 387(6634): 708- 713, June 12, 1997

wild-type: The most frequently encountered genotype in natural breeding populations. IUPAC Biotech

The term "wild-type" was fixed in the lexicon in the early days of fruit-fly genetics, when one could go out and catch one; now it means the original line of normally functioning individuals. HF Judson, Eighth Day of Creation Cold Spring Harbor Laboratory Press 1996 p. 276

To what extent is wild-type a theoretical concept?  Is it as slippery and elusive as the concept of  "normal" in many cases in clinical medicine? Related terms:  Gene categories suppressor genes  Protein categories: wild- type proteins

Genetic variations resources
Background on Comparative Genomic Analysis, NHGRI, 2012 http://www.genome.gov/10005835
DDBJ/ EMBL/ GenBank Feature Table, Version 6.7, 2016 http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Den Dunnen et al. (2016) HGVS recommendations for the description of sequence variants: 2016 update. Hum. Mutat. 37: 564-569. Although the examples on these pages are mainly for human (Homo sapiens), the recommendations can be applied to all species.
Genetic Epidemiology Glossary, M Tevfik Dorak 2017 http://www.dorak.info/epi/glosge.html 
Human Genome Variation Society, Sequence Variant Glossary http://varnomen.hgvs.org/bg-material/glossary/
Human Genome Variation Society, Sequence Variant Nomenclature http://varnomen.hgvs.org/
Phillip McClean, Genes and Mutations, North Dakota State University, 1999 https://www.ndsu.edu/pubweb/~mcclean/plsc431/mutation/mutation1.htm
NCBI Commonly used Genome Terms https://www.ncbi.nlm.nih.gov/projects/genome/glossary.shtml
NCBI Variations https://www.ncbi.nlm.nih.gov/guide/variation/
NHGRI (National Human Genome Research Institute), Talking Glossary of Genetic Terms, 100+ definitions.  https://www.genome.gov/genetics-glossary Includes extended audio definitions.

NIH, Understanding Human Genetic Variation; Biological Sciences Curriculum Study. Bethesda (MD): National Institutes of Health (US); 2007. https://www.ncbi.nlm.nih.gov/books/NBK20363/

IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.
