The dividing line between this glossary and the Algorithms
& data analysis glossary is very fuzzy. In general, this one focuses on unstructured data
(or a combination of structured and unstructured data), while Algorithms
& data analysis centers on structured data.
Informatics includes: Bioinformatics; Computers & computing; In silico & Molecular Modeling; Ontologies
Technologies: Microarrays & protein chips; Sequencing
Advances
in biology and new high-throughput technologies are generating massive amounts
of data that overwhelm the current information technology infrastructure. The
challenge is to build a common capability that enables a more efficient
translation of data into knowledge that leads to new and effective treatments.
caBIG™ and Molecular Medicine, NCI, NIH http://cabig.cancer.gov/molecular/overview.asp
Google = "data analysis" about
1,420,000 as of July 23, 2002; about 4,480,000 as of Sept. 23,
2004; "data interpretation" about 58,200 July 23, 2002;
about 147,000 as of Sept. 23, 2004
3D technologies:
Visual
communications are pervasive in information technology and are a key enabler of
most new emerging media. In this context, the NRC Institute for Information
Technology (NRC-IIT) performs research, development and technology transfer
activities to enable access to 3D information of the real world. Research in the 3D Technologies program focuses on three main areas: Virtualizing Reality
and Visualization, Collaborative Virtual Environments, and 3D
Data Mining and Management. [Institute for Information Technology, National
Research Council, Canada, 3D Technologies]
artificial intelligence: Algorithms
& data analysis glossary
Google = about 1,120,000 July 19,
2002; about 3,040,000 Oct. 22, 2004
BIRN Biomedical Informatics
Research Network: http://www.nbirn.net/
bias:
One of the two components of
measurement error (the other one being variance). Bias is a systematic
error that causes the measurement to differ from the correct value. Since bias
is systematic, it affects all experiment replicas the same way.
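The two components of measurement error can be separated numerically. The following is a minimal sketch, assuming a known true value and a handful of made-up replicate readings; `bias_and_variance` is a hypothetical helper, not something from the source.

```python
from statistics import mean, pvariance

def bias_and_variance(replicates, true_value):
    """Return (bias, variance) for repeated measurements of one quantity."""
    avg = mean(replicates)
    bias = avg - true_value            # systematic offset shared by every replica
    variance = pvariance(replicates)   # spread of the replicas around their own mean
    return bias, variance

# Illustrative numbers only (assumed, not from the glossary).
true_value = 10.0
replicates = [10.4, 10.6, 10.5, 10.3, 10.7]

bias, variance = bias_and_variance(replicates, true_value)
print(f"bias={bias:.2f} variance={variance:.3f}")
```

Averaging more replicas shrinks the effect of variance on the mean but leaves the bias untouched, which is why a systematic error cannot be removed by repetition alone.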
bibliomining:
The combination of data mining, bibliometrics, statistics, and reporting tools
used to extract patterns of behavior-based artifacts from library systems.
Scott Nicholson, Bibliomining: Data Mining for Libraries, Syracuse Univ. US http://www.bibliomining.com/
bioinformatics
visualization: Bioinformatics
glossary
biomedical computing: Computers
& computing glossary
Google = about 11,800 July 19, 2002;
about 20,900 Oct. 22, 2004
biomedical
informatics:
Google = about 66,600 Oct. 22, 2004
biomedical ontologies: Open
Biomedical Ontologies is an umbrella web address for well-structured controlled
vocabularies for shared use across different biological and medical domains.
http://obo.sourceforge.net/
Google = about 102 Jan. 8, 2003;
about 294 Oct. 1, 2003; about 490 Oct 22, 2004; about 488 May 2, 2005
Biomedical
Ontologies: Overview
BIONLP.org: Bioinformatics
glossary
biopharmaceutical informatics:
Drug
companies go through a very arduous and regulated discovery, applied research,
and development process, typically spanning five years of laboratory research and
ten years of clinical studies ... multinational clinical studies, which need to
be done with tremendous precision over a very long period of time. The study
parameters must be identical for every patient (many times numbering 10,000
patients, followed for five or more years), and all the participating hospitals
essentially have to behave in exactly the same way for the trial to be valid. ...
The
life science industry is conservative by nature, and therefore it is a
late-adopting industry. It is very sensitive to standards because of the legacy
according to which these companies have to maintain data and information. Major
pharmaceutical companies typically adopt a 100-year minimum document retention
policy, ...each of the industry's four industrial sectors - the pharmaceutical,
the biotech, the medical device, and the diagnostics sector - has a different
set of needs and desires, as well as its own requirements for unique IT
solutions. ...
Life science companies are dealing with very large computational data sets. Some
are now approaching half-terabyte sizes and upward. Life science companies also
immensely concern themselves with security, because their data represent their
crown jewels. Other major concerns expressed by this industry include the
stability, scalability, and security of an operating environment. Life science
companies and regulatory bodies such as the FDA are more concerned than ever
with operating environments that decay with use: When under computational
stress, these fragile operating systems have a habit of crashing, and when these
systems crash, they tend to corrupt data. ...
Post-genomic,
proteomic, chemical information, and other data sets have created a major
appetite for solutions to deal with this tremendous amount of data. Scientists
are now asking their IT professionals for the ability to better conceptualize
and interpret the meaning of this vast information. To do this, scientists need
tools for 3D visualization with a tremendous degree of high definition and
accuracy. The next step is to take disparate data sets, render them into 3D
values, see the DNA and RNA interface, watch protein folds, and then put a
therapeutic small molecule in there and see how it relates within a virus that
environmentally influences a different process. "Scientists Are
Demanding Solutions for Dealing with the Post-Genomic, Proteomic, and Chemical
Data Deluge: An Interview with Howard Asher, Director, Global Life Sciences
Group, Sun Microsystems," CHI GenomeLink 30 http://www.healthtech.com/newsarticles/issue30_1.asp
Biosemantics
Group:
http://www.biosemantics.org/
Addresses concept identification and disambiguation algorithms, meta-analysis
and visualization techniques, and biological applications [interconnect genes
and proteins, semi-automated annotations of protein functions.] Medical
Informatics department of the ErasmusMC
University Medical Center of Rotterdam and the Center
for Human and Clinical Genetics of the Leiden
University Medical Center.
blog:
Wikipedia http://en.wikipedia.org/wiki/Blog
Related
terms: blogging, blogosphere, microcontent, nanopublishing, weblog
blogging:
In
the beginning - say 1994 - the phenomenon now called blogging was little more
than the sometimes nutty, sometimes inspired writing of online diaries. These
days, there are tech blogs and sex blogs and drug blogs and onanistic teenage
blogs. But there are also news blogs and commentary blogs, sites packed with
links and quips and ideas and arguments that only months ago were the
near-monopoly of established news outlets. Poised between media, blogs can be as
nuanced and well-sourced
as traditional journalism, but they have the immediacy of talk radio.
Andrew Sullivan, "The blogging revolution" Wired Magazine, May 2002 http://www.wired.com/wired/archive/10.05/mustread.html?pg=2
bottom-up ontologies: Are flexible through the use of implicit and, hence, parsimonious
part-whole and
subconcept-superconcept relations. The bottom-up method complements current practice, where, as a rule, ontologies are built
top-down. The design method is illustrated by an example involving ontologies of pure substances at several levels of detail. It is not claimed that
bottom-up construction is a generally valid recipe; indeed, such recipes are deemed
uninformative or impossible. Rather, the approach is intended to enrich the ontology developer's
toolkit. [Paul E. van der Vet, Nicolaas J.I. Mars, Bottom-Up Construction of Ontologies,
IEEE Transactions on Knowledge and Data Engineering, July-Aug. 1998 10(4): 513-526] http://www.computer.org/tkde/tk1998/k0513abs.htm
Google = "bottom-up ontologies"
about 10; bottom-up ontologies about 2,250 July 19, 2002
bottom-up taxonomies:
Faceted classification is a hallmark of the bottom-up approach and suggests
yet another reason why the phrase "build the taxonomy" is
ill-conceived. ... The bottom-up approach suggests a very different way to
classify content. When populating a top-down taxonomy, the central question is
"where do I put this?" but at the heart of the bottom-up approach is
the question "how do I describe this?" By asking this subtly different
question, you’ll wind up in a dramatically different destination. Peter
Morville, "Bottoms up: Designing complex, adaptive systems," Faceted
Classification, New Architect, 2002 http://www.newarchitectmag.com/documents/s=7733/na1202b/index3.html
Can mean from
specific to general, but it can also mean content-oriented. [Jean Graef
"Top down or bottom up" Montague Institute Review, 2001]
http://www.montague.com/review/topdown.html
CML Chemical Markup Language:
Chemoinformatics glossary
classification:
Involves the development and use of a scheme for the systematic organization of knowledge. (Taylor p 576) Arlene Taylor identified three approaches to
classification: enumerative, hierarchical, and analytico-synthetic. Enumerative classification attempts to assign headings for every subject and
alphabetically enumerates them. Hierarchical classification uses a more philosophical approach based on the inherent organization of the
subject being classified, and establishes logical rules for dividing topics into classes, divisions, and subdivisions.
Analytico-synthetic classification assigns terms to individual concepts and provides rules for the local cataloger to use in constructing headings for composite
subjects. Traditional classification systems in this country are basically enumerative, though many contain some elements of hierarchy and
faceting. (Taylor pp 319-321) Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE"
Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995 http://theme.music.indiana.edu/tech_s/mla/facacc.rev
Indexing in the
library and information management sense, but also see Algorithms
& data analysis glossary classification, classifiers
collaborative filtering:
Tools that leverage user preferences, patterns, and purchasing behavior to customize organization and navigation systems. [Peter Morville "Software for Information Architects" Argus Center for Information Architecture, 2000]
http://argus-acia.com/strange_connections/current_article.html
Amazon's recommendations, based on what other buyers of a specific title are
buying, are a familiar example of collaborative filtering.
Google = about 21,600
July 19, 2002; about 49,300 Oct. 22, 2004
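The "buyers of this title also bought" idea can be sketched with simple co-occurrence counts. This is a minimal illustration, assuming a made-up purchase history; `also_bought`, the user names, and the titles are hypothetical, and production recommenders use far more elaborate models.

```python
from collections import Counter

def also_bought(purchases, title, top_n=3):
    """Rank other titles by how many buyers of `title` also bought them."""
    counts = Counter()
    for basket in purchases.values():
        if title in basket:
            counts.update(item for item in basket if item != title)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ranked[:top_n]

# Hypothetical purchase history: user -> set of titles bought.
purchases = {
    "ann": {"Genomics Primer", "Data Mining", "Ontologies"},
    "bob": {"Genomics Primer", "Data Mining"},
    "cat": {"Genomics Primer", "Statistics"},
    "dan": {"Data Mining", "Statistics"},
}

recommendations = also_bought(purchases, "Genomics Primer")
print(recommendations)
```

No description of the titles themselves is needed; the ranking comes entirely from the behavior of other users, which is what distinguishes collaborative filtering from content-based approaches.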
collaborative
metadata:
A robust increase in both the amount and
quality of metadata is integral to realizing the Semantic Web. The research
reported on in this article addresses this topic of inquiry by investigating the
most effective means for harnessing resource authors' and metadata experts'
knowledge and skills for generating metadata. Jane Greenberg, W. Davenport
Robertson, Semantic web construction: An Inquiry of Authors' Views on
Collaborative Metadata Generation, International Conference DC 2002, Metadata
for e-Communities, Oct. 13-17, 2002, Florence, Italy
http://dois.mimas.ac.uk/DoIS/data/Papers/dcmdcflorp:5.html
http://www.bncf.net/dc2002/program/ft/paper5.pdf
Google = about 116
Apr. 24, 2003; about 377 Oct. 22, 2004
common ontology:
Defines the vocabulary with which queries and assertions are exchanged among agents. ... The agents sharing a vocabulary need not share a knowledge base; each knows things the other does not, and an agent that commits to an ontology is not required to answer all queries that can be formulated in the shared vocabulary. In short, a commitment to a common ontology is a guarantee of consistency, but not completeness, with respect to queries and assertions using the vocabulary defined in the ontology. [Tom Gruber, "What is an ontology?" Knowledge Systems Lab, Stanford Univ. 2001]
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
Google = about 1,190 July 19,
2002, about 4,130 Oct. 22, 2004
Related
terms: ontological commitment, reusable ontologies, shared ontologies
communications standards: Pharmacogenomics
glossary
communities of practice:
Alliances glossary
competitive
intelligence: Business of
biopharmaceuticals glossary
computational linguistics:
Computational Linguistics, or Natural Language Processing (NLP), is not a new field. As early as 1946, attempts have been undertaken to use
computers to process natural language. These attempts concentrated mainly on Machine Translation
... the limited performance of these systems made it clear that the underlying
theoretical difficulties of the task had been grossly underestimated, and in the following years and decades much effort was spent on basic
research in formal linguistics. Today, a number of Machine Translation systems are available commercially although there still is no system that
produces fully automatic high-quality translations (and probably there will not be for some time). Human intervention in the form of pre- and/or
post-editing is still required in all cases. Another application that has become
commercially viable in the last years is the analysis and synthesis of spoken language, i.e. speech
understanding and speech generation. ... An application that will become at least as important as those already mentioned is the creation, administration, and presentation of texts by
computer. Even reliable access to written texts is a major bottleneck in science and commerce. The amount of textual information is enormous
(and growing incessantly), and the traditional, word-based, information retrieval methods are getting increasingly insufficient as either precision
or recall is always low (i.e. you get either a large number of irrelevant documents together with the relevant ones, or else you fail to get a large
number of the relevant ones in the collection). Linguistically based retrieval methods, taking into account the meaning of sentences as encoded
in the syntactic structure of natural language, promise to be a way out of this quandary.
[Computational Linguistics FAQ, Univ. of Zurich, Switzerland, 2001] http://www.ifi.unizh.ch/groups/CL/CL_FAQ.html
Google = about 97,100 July 19,
2002, about 283,000 Oct. 22, 2004
Linguistics, natural language, and
computational linguistics Meta-Index, Stanford Univ. US
http://www-nlp.stanford.edu/links/linguistics.html
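The precision/recall trade-off described in the entry above can be made concrete. The following is a minimal sketch using the standard definitions (precision = relevant retrieved / retrieved; recall = relevant retrieved / relevant); the document IDs and the two queries are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query's result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical collection: four documents are actually relevant.
relevant = {"d1", "d2", "d3", "d4"}

# A broad word-based query finds everything relevant, plus noise.
broad_query = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
# A narrow query returns only relevant results, but misses most of them.
narrow_query = ["d1"]

broad = precision_recall(broad_query, relevant)    # (0.5, 1.0)
narrow = precision_recall(narrow_query, relevant)  # (1.0, 0.25)
print(broad, narrow)
```

The broad query corresponds to "a large number of irrelevant documents together with the relevant ones", and the narrow query to failing "to get a large number of the relevant ones".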
configurable:
Many out-of-the-box solutions claim to be easy to
"customize," when in fact they are referring to configuration options,
not true customizability. Manufacturers have distinct challenges, some of
which can be addressed out of the box, but many of which cannot. Manufacturers
also need the ability to capitalize on changing dynamics in the marketplace
before their competitors do. That's why it's imperative to understand the
differences between configuration and customization and the value of selecting a
CRM system that offers the flexibility to adapt and model specific manufacturing
business processes. "Why you need to know the difference between
Customizable and Configurable CRM," CDC Software podcast, Intelligent
Enterprise, 2006 http://whitepaper.intelligententerprise.com/cmpintelligententerprise/search/viewabstract/86931/index.jsp
contextual
data: While proteomic studies
initially focused largely on expression and protein identification, progress in
these areas drove the demand for more detailed types of proteomic data. Now
researchers want information about where specific proteins are expressed, both
in terms of tissues and localization within the cell. Information relating
proteins to function requires additional details of post-translational
modification, and studies of protein interactions have moved beyond just looking
at binary interactions to studies of protein complexes.
For both genomics and proteomics, this
shift can be characterized as an interest in more contextual data. Enhanced
insight into biological context is essential for obtaining a better
understanding of how biology actually works, and thus there is now an emphasis
to move from genomic and proteomic snapshots to time series data of expression.
Such context is of particular value if biological studies are to be translated
into medical advances, because of the importance of being able to predict the
impact of potential treatments. The integration of genomic and proteomic data
with medical conditions, treatment and outcomes becomes another critical type of
contextual information. Christina Lingham, Beyond Genome: Thinking Globally,
Cambridge Healthtech http://www.beyondgenome.com/download/editorial.pdf
controlled vocabulary:
Robin Cover's XML Cover Pages is described as "a collection of references on matters of Subject Classification, Taxonomies, Ontologies, Indexing, Metadata, Metadata Registries, Controlled Vocabularies, Terminology, Thesauri, Business Semantics",
2003 http://xml.coverpages.org/classification.html
A limited number of words or phrases used in an indexing system (subject headings) or database, to ensure reliable, consistent retrieval. Long used to enhance retrievability and consistency, ontologies and/or taxonomies certainly sound sexier than "controlled vocabularies" but continue to have a good deal in common.
Taxonomies add hierarchies, while ontologies make information "machine-understandable" as well as
machine-readable.
Google = about 39,700 July
19, 2002; about 85,300 Oct. 22, 2004
Broader terms: ontology, taxonomy Related terms: RDF, semantic web
Thesauri and controlled vocabulary definitions,
National Library of Canada, 2002, http://www.tbs-sct.gc.ca/its-nit/standards/tbits39/crit392_e.asp
customizable:
Quite labor intensive and can be very expensive. Compare configurable.
DAML DARPA Agent Markup Language:
The goal of the DAML effort is to develop a language and tools to facilitate the concept of the semantic web.
http://www.daml.org/
Related term: OIL
DAML + OIL http://www.w3.org/TR/daml+oil-walkthru/
data cleaning, data integration: Algorithms
& data analysis glossary
Google = "data cleaning"
about 12,200; about 22,500 July 3, 2003
"data integration" about 175,000 July 19,
2002; about 306,000 July 3, 2003; about 817,000 Mar. 22, 2004; about 2,940,000
June 22, 2007
data
conversion: Originally
data conversion was primarily a matter of moving text and database files from
one medium to another, one hardware platform to another, one operating system
environment to another. But as text and database representations became more
sophisticated it became apparent that application interoperability was going to
be the overriding issue of concern. Company History, Data Conversion Lab http://www.dclab.com/company_history.asp
Glossary,
DCL Labs http://www.dclab.com/glossary.asp
30+ definitions
data
management methods: The Algorithms & data analysis glossary
covers automated methods; methods in this glossary generally
combine human and automated methods.
data
management vocabulary: A third type of
taxonomy that is valuable in a business setting is the data management
vocabulary. This taxonomy is a short list of authorized terms without any
hierarchical structure that is used to support business transactions. For
example, with a large sales force, it is most efficient if salespeople report
their work using the same list of activities. They may count their contacts with
companies according to a simple list of contact types (managers,
decision-makers, and so on), and they may categorize the businesses they work
with according to different controlled descriptors that have to do with the
business's size or market. In this case, a shared taxonomy will help to support
reporting needs of management and other salespeople trying to mine the
information in the future. Without a shared taxonomy, a company risks developing
islands of data that cannot be shared or easily utilized by the rest of the
organization. Susan Conway and Char
Sligar, "What is a taxonomy" Unlocking Knowledge Assets, Chapter 6, Building Taxonomies, Microsoft Press,
2002 http://www.microsoft.com/mspress/books/sampchap/5516a.asp
Google =
about 49 July 9, 2007
Related
terms: descriptive taxonomies, navigational taxonomies
data mart, data mining, data pipelining,
data reduction methods, data warehouse: Algorithms
& data analysis glossary
data visualization: The
classical definition of visualization is as follows: the formation of mental
visual images, the act or process of interpreting in visual terms or of putting
into visual form. A new definition is a tool or method for interpreting image
data fed into a computer and for generating images from complex
multi-dimensional data sets (1987). Definitions and
Rationale for Visualisation, D. Scott
Brown, SIGGRAPH, 1999 http://www.siggraph.org/education/materials/HyperVis/visgoals/visgoal2.htm
includes information on data visualization.
Related term: information visualization;
Broader term: visualization
databases: Bioinformatics
glossary; Databases & software directory
deep web:
Most of the Web's information is buried far down on dynamically generated sites,
and standard search engines never find it. The deep Web is qualitatively different from the surface
Web. Deep Web sources store their content in searchable databases that only
produce results dynamically in response to a direct request. But a direct query
is a "one at a time" laborious way to search. [Michael K.
Bergman "The deep web: surfacing hidden value" White Paper,
BrightPlanet, 2000-2002] http://www.brightplanet.com/deepcontent/tutorials/DeepWeb/index.asp
Another version at http://www.press.umich.edu/jep/07-01/bergman.html
Google = about 10,200 Aug. 17, 2002;
about 42,900 Oct. 22, 2004
Related term: invisible web
description logic:
Has
existed as a field for a few decades yet only somewhat recently has appeared to
transform from an area of academic interest to an area of broad interest. This
paper provides a brief historical perspective of description logic developments
that have impacted DL usability to include communities beyond universities and
research labs. Deborah L.
McGuinness, "Description Logics Emerge from Ivory Towers," Stanford
Knowledge Systems Laboratory Technical Report KSL-01-08, 2001. In the Proceedings
of the International Workshop on Description Logics, Stanford, CA, August 2001. http://www.ksl.stanford.edu/people/dlm/papers/dls-emerge-abstract.html
The main effort of the research in knowledge
representation is providing theories and systems for expressing structured
knowledge and for accessing and reasoning with it in a principled way. Description
Logics are considered the most important knowledge representation formalism
unifying and giving a logical basis to the well known traditions of Frame-based
systems, Semantic Networks and KL-ONE-like languages, Object-Oriented
representations, Semantic data models, and Type systems. [Description Logic
Knowledge Representation] http://dl.kr.org/
Description
Logics Home Page,
Patrick Lambrix,
Linköping Univ. Sweden http://www.ida.liu.se/labs/iislab/people/patla/DL/index.html
descriptive ontology: A
descriptive ontology would try to explain how things are, whereas a normative
ontology would try to tell us how things ought to be. [Robert Kent "Ballot
comment", Standard Upper Ontology [SUO] E-mail archive, IEEE, 2001] http://suo.ieee.org/email/msg05921.html
Google = about 121 July 19, 2002;
about 343 Oct. 22, 2004
descriptive taxonomies:
Supports information retrieval through searching. By developing and maintaining a core set of controlled vocabularies, a company can consistently label or tag its content with descriptive metadata selected from these authorized vocabularies. In addition, vocabularies can capture knowledge worker terminology and map it to a company’s preferred terms. ... Active mining of new terms and phrases from emerging content and from search query logs will help keep a descriptive taxonomy relevant to the users of that information. A taxonomy built on the thesaurus model (designating a preferred or authorized term with entry terms or variants) helps to link these different terms together. At search time, the term that the knowledge worker uses is associated with the preferred (or key) term for more precise searching, or the knowledge worker’s term is expanded to include the variant forms of the term as well as
the authorized term for a broader search. Taxonomies built on the thesaurus model do not force all work groups to use a common set of terminology.
Susan Conway and Char
Sligar, "What is a taxonomy" Unlocking Knowledge Assets, Chapter 6, Building Taxonomies, Microsoft Press,
2002 http://www.microsoft.com/mspress/books/sampchap/5516a.asp
Google = about 119 July 19, 2002;
about 201 Oct. 22, 2004; about 456 July 9, 2007
Related terms: bottom-up taxonomies,
data management vocabulary, navigational taxonomies, shared taxonomies
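The thesaurus model described above (a preferred term with entry terms or variants) can be sketched as a small lookup. This is an illustrative sketch only: the `THESAURUS` contents and the helper names are assumptions, not taken from Conway and Sligar.

```python
# Hypothetical thesaurus: preferred (authorized) term -> variant entry terms.
THESAURUS = {
    "myocardial infarction": ["heart attack", "MI"],
    "neoplasm": ["tumor", "tumour"],
}

def build_variant_index(thesaurus):
    """Map every preferred term and variant (lowercased) to its preferred term."""
    index = {}
    for preferred, variants in thesaurus.items():
        index[preferred.lower()] = preferred
        for variant in variants:
            index[variant.lower()] = preferred
    return index

def to_preferred(term, index):
    """Precise search: replace the user's term with the preferred term."""
    return index.get(term.lower(), term)

def expand(term, thesaurus, index):
    """Broad search: preferred term plus all of its variant forms."""
    preferred = to_preferred(term, index)
    return [preferred] + thesaurus.get(preferred, [])

INDEX = build_variant_index(THESAURUS)
print(to_preferred("Tumour", INDEX))
print(expand("heart attack", THESAURUS, INDEX))
```

Because users' own terms are mapped rather than rejected, work groups can keep their terminology while content is still tagged consistently.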
digital libraries:
International digital libraries research is intended to contribute to the fundamental knowledge required to create information systems that can operate in multiple languages, formats, media, and social and organizational contexts. International collaborative research can bring complementary approaches, resources and perspectives to bear on common needs and information technology research challenges. International digital libraries applications testbeds are intended to build operational prototypes for globally distributed, internet-based resources, and to implement these in a variety of applications contexts. The testbeds are expected to advance technologies across the digital libraries lifecycle, focus collective work on organizing domain-specific content, and engage researchers, scholars, students and teachers in enhancing research and knowledge resources in a variety of subject domains. [National Science Foundation, International Digital Libraries Collaborative Research & Applications Testbeds program solicitation, 2002]
http://www.nsf.gov/pubs/2002/nsf02085/nsf02085.html
Google = about 197,000 July
19, 2002; about 1,480,000 Oct. 22, 2004
Directed Acyclic Graph
DAG:
A directed graph where no path starts and ends at the same vertex. See also directed graph, acyclic graph, cycle. Note: Also called a DAG or acyclic digraph.
Also called an oriented acyclic graph. [Paul E. Black, NIST, Dictionary of Algorithms, Data Structures and Problems, 2001]
http://www.nist.gov/dads/HTML/directAcycGraph.html
The difference between a DAG and a hierarchy is that in the latter each child can only have one parent; a DAG allows a child to have more than one parent. A child term may be an "instance" of its parent term (is-a relationship) or a component of its parent term (part-of relationship). A child term may have more than one parent term and may have a different class of relationship with its different parents. [Gene Ontology Consortium, "General Documentation" 2001]
http://www.geneontology.org/doc/GO.doc.html
Google = about 18,300
July 19, 2002; about 35,000 Oct. 2, 2004
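A DAG with typed parent links can be sketched as a mapping from each child term to its parents. The terms and relations below are illustrative assumptions rather than an extract from the Gene Ontology, and the helper names are hypothetical. The cycle check relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

# child term -> {parent term: relationship type}; illustrative terms only.
PARENTS = {
    "cell": {},
    "membrane": {"cell": "part_of"},
    "organelle": {"cell": "part_of"},
    # Two parents, each with a different class of relationship.
    "organelle membrane": {"membrane": "is_a", "organelle": "part_of"},
}

def is_acyclic(parents):
    """True if no path starts and ends at the same term."""
    try:
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

def ancestors(term, parents):
    """All terms reachable by following parent links upward."""
    found = set()
    stack = list(parents.get(term, {}))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents.get(node, {}))
    return found

print(is_acyclic(PARENTS))
print(sorted(ancestors("organelle membrane", PARENTS)))
```

In a strict hierarchy each inner dictionary would hold at most one parent; allowing several is the only structural change needed to move from a tree to a DAG.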
disambiguate:
Make less ambiguous, clarify,
elucidate.
Google = about 33,100 July 19,
2002; about 65,300 Oct. 22, 2004
domain expertise:
Google = about 25,500 Dec. 18, 2002;
about 68,500 Oct. 22, 2004; about 785,000 June 22, 2007
domain ontology:
Ontologies glossary
domain taxonomies:
The first step is
to define the taxonomy of entities in the domain. This consists of firstly
defining the basic classes, then defining the sub-types of these classes.
[Mick O'Donnell, "Defining domain taxonomies" Domain Acquisition in Ilex
3.0, 1993-1996] http://www.hcrc.ed.ac.uk/ilex/Manual/extending/Domain-Acquisition/domacq/node4.html#S0....
Google = about 166 July 19, 2002;
about 276 Oct. 22, 2004
drug
discovery informatics:
drug ontology: Drug
discovery & Development
Dublin Core Metadata Initiative:
An open forum engaged in the development of interoperable online
metadata standards that support a broad range of purposes and business models. The original workshop for the Initiative was held in Dublin, Ohio [OCLC] in 1995.
http://dublincore.org/
dynamic ontology:
Ontologies glossary
dynamic taxonomies:
Developed as a way of sifting through large amounts of data. At its base it uses a domain specific taxonomic hierarchy, consisting of concepts connected by
is-a relationships. Examples from the medical domain include UMLS and SNOMED. Concepts from the hierarchy are used to classify chunks of guidelines text. The hierarchy is then used as an augmented index for guidelines chunk retrieval. Navigation is done via the operations of browsing and zooming. [Dennis Wollersheim, Implementation of dynamic taxonomies for clinical guidelines retrieval, La Trobe Univ., Australia, c. 2001]
http://homepage.cs.latrobe.edu.au/lewisba/SPIRT/dw2001c.pdf
Google = about 119 July 19, 2002;
about 369 Oct. 22, 2004
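The retrieval idea behind dynamic taxonomies can be sketched as follows: text chunks are tagged with concepts from an is-a hierarchy, and zooming on a concept returns every chunk tagged with it or with any of its descendants. This is a simplified reading, not Wollersheim's implementation; the tiny hierarchy, the chunk IDs, and the function names are assumptions, and real systems would draw concepts from sources such as UMLS or SNOMED.

```python
# parent concept -> child concepts (is-a links); hypothetical mini-hierarchy.
CHILDREN = {
    "disease": ["cardiovascular disease", "infection"],
    "cardiovascular disease": ["hypertension"],
    "infection": [],
    "hypertension": [],
}

# guideline chunk -> concepts it was classified under (made-up data).
CHUNKS = {
    "chunk-1": {"hypertension"},
    "chunk-2": {"infection"},
    "chunk-3": {"cardiovascular disease"},
}

def descendants(concept, children):
    """The concept itself plus everything below it in the hierarchy."""
    result = {concept}
    for child in children.get(concept, []):
        result |= descendants(child, children)
    return result

def zoom(concept, children, chunks):
    """Chunks classified under the concept or any of its descendants."""
    scope = descendants(concept, children)
    return sorted(cid for cid, tags in chunks.items() if tags & scope)

print(zoom("cardiovascular disease", CHILDREN, CHUNKS))
print(zoom("disease", CHILDREN, CHUNKS))
```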
evolvability:
Tim Berners-Lee's definition: http://www.w3.org/Talks/1998/0415-Evolvability/slide3-1.htm
Google = evolvability about 8,210
July 19, 2002; about 21,400 Oct. 22, 2004
See
also under interoperability
facet: Ranganathan was the first to introduce the word "facet" into library and information science, and the first to consistently develop the theory of
facet analysis. A facet is, simply put, a category. Taylor defines facets as "clearly defined, mutually exclusive, and collectively exhaustive
aspects, properties, or characteristics of a class or specific subject." Ranganathan demonstrated that analysis, which is the process of breaking
down subjects into their elemental concepts, and synthesis, the process of recombining those concepts into subject strings, could be applied
to all subjects, and demonstrated that this process could be systematized. (Taylor pp
320-321; Foskett p 390). The phrase
"analytico-synthetic classification" derives from these two processes: analysis and synthesis.
Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE" Working Group on Faceted Access to
Music, Music Library Association Annual Meeting, 10 February 1995 http://www.musiclibraryassoc.org/BCC/BCC-Historical/BCC95/95WGFAM2.html
faceted classification: One of the most
powerful, yet least understood, methods of organizing information. Most folks,
when thinking about organizing objects or information, immediately think of a
hierarchical, or taxonomic, organization; a top-down structure, where
you start with a number of broad categories that get ever more detailed, until
you arrive at the object. In such structures, each object has a single home, and
typically, one path to get there -- this is how things are organized in
"the real world", where each item can only be in one place.
Oftentimes, when thinking of organizing information, a hierarchy is where people
begin (think Yahoo!). Faceted classification, on the other hand, is a
bottom-up scheme. Here, each object is tagged with a certain set of
attributes and values (these are the facets), and the organization of these
objects emerges from this classification, and how a user chooses to access them.
... Faceted classification allows for exploration directed by the user, where a
large dataset is progressively filtered through the user's various choices,
until arriving at a manageable set that meets the user's basic criteria. Instead
of sifting through a pre-determined hierarchy, the items are organized on-the-fly,
based on their inherent qualities. [Peter Merholz "Innovation in
classification" Sept. 23, 2001] http://www.peterme.com/archives/00000063.html
The use of facets in information retrieval did not originate with Ranganathan. In the 18th century, a Frenchman named Condorcet devised what
we would now call a faceted classification scheme for organizing information about objects or facts. (Whitrow) The Dewey Decimal
Classification, first published in 1876, contained elements of facet analysis. Dewey recognized four facets common to all basic classes:
bibliographic form, time, place, and general subjects (such as statistics or research) that at times are related to other subjects. (Foskett pp 176-7)
Dewey provided for "number building" to combine two or more facets to express a complex subject. (Taylor p 320) The Universal Decimal
Classification, based on the Dewey Decimal Classification and first published in 1905, was intended to be an international classification scheme.
It also had elements of a faceted structure, and partly influenced Ranganathan's thinking. (Foskett p 349; Vickery pp 12-14)
Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE"
Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995
http://www.musiclibraryassoc.org/BCC/BCC-Historical/BCC95/95WGFAM2.html
faceted metadata:
Composed of orthogonal [mutually independent] sets of categories. For example, in the domain of architectural images, some possible facets might be Materials (concrete, brick, wood, etc.), Styles (Baroque, Gothic, Ming,
etc.), and so on. [Jennifer English et al., "Flexible search and navigation using faceted metadata", 2002]
http://bailando.sims.berkeley.edu/papers/chi02_short_paper.pdf
Google = about 360 July 19, 2002;
about 2,530 Oct. 22, 2004
fractal nature of the web: http://www.w3.org/DesignIssues/Fractal.html
Tim Berners-Lee, Commentary on architecture, Fractal nature of the web, first
draft
Society
has to be fractal - people want to be involved on a lot of different levels. The
need for things that are local and special will create enclaves. And those will
give us the diversity of ideas we need to survive. Tim Berners-Lee, in "The
father of the web", Evan Schwartz, Wired Mar. 1997 http://www.wired.com/wired/archive/5.03/ff_father_pr.html
GIS Geographic Information Systems:
Maps have traditionally been used to explore the Earth and to exploit its
resources. GIS technology is an expansion of cartographic science. Geographic
information systems (GIS) technology can be used for scientific investigations,
resource management, and development planning. It has enhanced the efficiency
and analytic power of traditional mapping. GIS technology is becoming an
essential tool in the effort to understand the process of global change.
[Is GIS in your future? Boston Chapter, Special Libraries Association
meeting, Mar. 12, 2002] http://www.sla.org/chapter/cbos/meetings/fy02/sci_tech.htm
Good Informatics Practices Guidance Document (GIP): A newly drafted comprehensive body of information of
regulatory requirements in the form of existing (GLP, GMP, GCP and Part 11) and
currently used standards compiled in one reference guide for an IT system of a
life science or healthcare environment. http://www.lsit.org/initiatives/gip.php
GUI Graphical User Interface: Computers
& computing glossary
granularity:
<jargon, parallel> The size of the units of code under consideration in some
context. The term generally refers to the level of detail at which code is considered, e.g. "You can specify the granularity for this profiling tool". The most common computing use is in parallelism where "fine grain parallelism" means individual tasks are relatively small in terms of code size and execution time, "coarse grain" is the opposite. You talk about the "granularity" of the parallelism. The smaller the granularity, the greater the potential for parallelism and hence speed-up but the greater the overheads of synchronisation and communication. [FOLDOC 1997]
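A rough Python sketch may make the fine grain versus coarse grain trade-off concrete. It assumes a hypothetical parameter grain_size that sets how many elements go into each task; the function names are illustrative only. It shows how grain size changes the number of tasks to schedule, not actual speed-up (threads in CPython do not generally speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(data, grain_size):
    """Cut the data into tasks of at most grain_size elements each."""
    return [data[i:i + grain_size] for i in range(0, len(data), grain_size)]


def parallel_sum(data, grain_size, workers=4):
    """Sum the data by handing each task to a worker, then combining results."""
    tasks = split_into_tasks(data, grain_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, tasks))
    return sum(partial_sums), len(tasks)


numbers = list(range(1000))
# Fine grain: many small tasks, more scheduling and combining overhead.
fine_total, fine_tasks = parallel_sum(numbers, grain_size=10)
# Coarse grain: few large tasks, less overhead but less potential parallelism.
coarse_total, coarse_tasks = parallel_sum(numbers, grain_size=250)
```

Both runs produce the same total; only the number of units that must be scheduled, synchronised and combined differs.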
The extent to which a system contains separate components (like granules). The more components in a system - or the greater the granularity - the more flexible it is. [Webopedia]
http://www.webopedia.com/TERM/g/granularity.html
Choosing different levels of granularity, i.e., imposing different quality criteria on models built by homology from representative, experimentally determined [protein] structures, leads to different numbers of family representatives as targets. [NIGMS Structural Genomics Targets Workshop February 11-12, 1999]
http://www.nigms.nih.gov/news/meetings/structural_genomics_targets.html
Concept of granularity, ISWorld Mailing List, Michael Chilton, 2001
http://www.isworld.org/isworldarchives/research.asp#
Level of detail seems to be the essence of granularity.
Google = about 250,000 July 19, 2002;
about 454,000 Oct. 22, 2004
health information data: Includes
Clinical data captured during the process of diagnosis and treatment.
Epidemiological databases that aggregate data about a population. Demographic
data used to identify and communicate with and about an individual. Financial
data derived from the care process or aggregated for an organization or
population. Research data gathered as a part of care and used for research or
gathered for specific research purposes in clinical trials. Reference data that
interacts with the care of the individual or with the healthcare delivery
systems, like a formulary, protocol, care plan, clinical alerts or reminders,
etc. Coded data that is translated into a standard nomenclature or
classification so that it may be aggregated, analyzed, and compared. [Health
Information Management; Professional definitions, Committees on Professional
Development, American Health Information Management Association, 1999, 2000] http://www.ahima.org/infocenter/definitions/HIMprofessionaldefinition.htm
health information management:
Health
information management improves the quality of healthcare by ensuring that the
best information is available to make any healthcare decision. Health
information management professionals manage healthcare data and information
resources. The profession encompasses services in planning, collecting,
aggregating, analyzing, and disseminating individual patient and aggregate
clinical data. It serves the healthcare industry including: patient care
organizations, payers, research and policy agencies, and other healthcare-
related industries. [Health Information Management; Professional
definitions, Committees on Professional Development, American Health Information
Management Association, 1999, 2000] http://www.ahima.org/infocenter/definitions/HIMprofessionaldefinition.htm
Google = about 56,700 Jan. 2, 2003;
about 145,000 Oct. 22, 2004
heavyweight ontologies:
Heavyweight ontologies, by contrast [to lightweight], contain class
hierarchies, constraints, and inference rules. It takes a long time and many
resources to develop and maintain them and it is uncertain if there will be a
benefit from this extra effort. Resource Description Framework (RDF)
and Web Ontology Language (OWL) of the World-Wide Web
Consortium (W3C) are technologies designed to model
heavyweight ontologies. Topic Maps are Emerging: Why Should I Care? H.
Holger Rath, http://www.idealliance.org/papers/dx_xmle04/papers/03-01-03/03-01-03.html
Google = about 21 July 19, 2002;
about 60 Oct. 22, 2004; about 70 May 2, 2005
heavyweight taxonomies, heavyweight taxonomy = 0 [except for this glossary]
heterogeneous data:
informatics:
The study of the application of computer
and statistical techniques to the management of information. In genome
projects, informatics includes the development of methods to search databases
quickly, to analyse DNA
sequence information, and to predict protein
sequence and structure
from DNA sequence data. ORD Office of Rare Diseases, NIH glossary http://ord.aspensys.com/asp/resources/glossary_a-e.asp#A
Narrower terms: bioinformatics;
cheminformatics;
Computers &
computing glossary clinical
informatics, molecular informatics, Biomaterials
matinformatics research
informatics; Drug
discovery & development life sciences informatics, Intellectual
property & legal glossary; patinformatics; Molecular
imaging image informatics; pharmacoinformatics,
pharmainformatics Proteomics
protein informatics
information -- how
much? How Much Information 2003,
School of Information Science and Systems, Univ. of California, Berkeley, 2003 http://www.sims.berkeley.edu/research/projects/how-much-info-2003/index.htm
information architecture: "Involves the design of organization, labeling, navigation, and searching systems to help people find and manage information more successfully."
[Lou Rosenfeld, Peter Morville interview quoted in Mark Hurst, "About
Information Architecture", Apr. 3, 2000] http://www.goodexperience.com/columns/040300infoarch.html
Google = about 132,000 July 19, 2002;
about 258,000 July 3, 2003; about 622,000 Oct. 22, 2004
Information architecture glossary,
Kat Hagedorn, Argus Associates, 2000, 60 + definitions http://argus-acia.com/white_papers/iaglossary.html
information ecology:
CSTB is
contemplating a major initiative that would examine the rise of new forms of
content, changes in media use patterns and their implications, changes in the
supply of different kinds of content or media and their implications (e.g., for
access, use, and the evolution of specific industries or institutions), and such
ramifications as growing potential for manipulation of digital information,
coping with data overload (data mining, visualization, and other data-intensive
applications), and the internationalization of content production, ownership,
and use. "Under Development" Computer Science and Telecommunications
Board, US National Academies, http://www7.nationalacademies.org/cstb/projects_under_development.html
Google =
about 11,100 Oct. 22, 2004
information extraction:
Computers & computing glossary
information harvesting: See under
Knowledge Discovery in Databases KDD
Google = about 871 July 19, 2002;
about 1,230 July 3, 2003; about 1,730 Oct. 22, 2004; about 1,140,000 June 22,
2007
information integration: Our research group is developing
intelligent techniques to enable rapid and efficient information integration.
The focus of our research has been on the technologies required for constructing
distributed, integrated applications from online sources. This research
includes: Information
Extraction: Machine learning techniques for extracting information from
online sources; Source
Modeling: Constructing a semantic model of wrapped sources so that they can
be automatically integrated with other sources; Record
Linkage: Learning how to align records across sources; Data
Integration: Generating plans to automatically integrate data across
sources; Plan Execution:
Representing, defining, and efficiently executing integration plans in the Web
environment; Constraint-based
Integration: Interactive constraint-based planning and integration for
the Web environment. Information Integration Research Group, Intelligent Systems
Division, Information Sciences Institute (ISI), University of Southern
California http://www.isi.edu/integration/
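Of the research areas listed above, record linkage is perhaps the easiest to illustrate. The sketch below is not the ISI group's method (which the entry describes as learned); it simply uses the string similarity ratio from Python's standard difflib module, with a hypothetical threshold and made-up institution names, to show what aligning records across two sources means.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(source_a, source_b, threshold=0.8):
    """Pair each record in source_a with its best match in source_b, if close enough."""
    links = []
    for rec_a in source_a:
        best = max(source_b, key=lambda rec_b: similarity(rec_a, rec_b))
        if similarity(rec_a, best) >= threshold:
            links.append((rec_a, best))
    return links


# Hypothetical records naming the same institutions in different ways.
names_a = ["Univ. of Southern California", "Stanford University"]
names_b = ["University of Southern California", "Stanford Univ.", "MIT"]
links = link_records(names_a, names_b, threshold=0.6)
```

The threshold is the main design choice: set too low it links unrelated records, set too high it misses abbreviations such as "Univ." for "University".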
Google =
about 4,430,000 July 3, 2003; about 1,080,000 June 22, 2007
information management:
Information services of various kinds are fundamental to the discovery,
development and use of medicines. Within the pharmaceutical industry, often
regarded as the epitome of the 'information intensive' industry, research
information units provide both external and internal information provision and
management to discovery and development programmes, while medical information
units provide in-depth information on the company's products to external
doctors, pharmacists, etc., and commercial information units handle information
on competitors, marketing data, etc. Additionally, information personnel are
involved in activities such as records management and archiving, regulatory
affairs, data administration, IT support, and many more. Within the NHS
[National Health Service, UK] , Drug Information Pharmacists provide information
services on effective use of medicines to all healthcare professions, and are
also involved in databases compilation, records management, current awareness
etc. The move towards evidence-based medicine, with consequent need for
evaluation and presentation of information, is of obvious importance to this
group. Other sectors with a heavy reliance on the handling of pharmaceutical
information and knowledge include publishing, database production, software
services, and consultancy of varied kinds. [MSc in Pharmaceutical
Information Management, City Univ. London, UK, Dept of Information
Science, Introduction, 2002] http://www.soi.city.ac.uk/organisation/is/teaching/pim/
Narrower term: health information
management
Google = about 1,470,000 Jan. 2, 2003;
about 4,200,000 Oct. 22, 2004
information overload:
Biomedicine is in the middle of revolutionary advances. Genome projects, microassay methods like DNA chips, advanced radiation sources for crystallography and other instrumentation, as well as new imaging methods, have exceeded all expectations, and in the process have generated a dramatic information overload that requires new resources for handling, analyzing and interpreting data. Delays in the exploitation of the discoveries will be costly in terms of health benefits for individuals and will adversely affect the economic edge of the country. [Opportunities in Molecular Biomedicine in the Era of Teraflop Computing: March 3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign]
http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html
Many
of today's problems stem from information overload and there is a desperate need
for innovative software that can wade through the morass of information and
present visually what we know. The development of such tools will depend
critically on further interactions between the computer scientists and the
biologists so that the tools address the right questions, but are designed in a
flexible and computationally efficient manner. It is my hope that we will
see these solutions published in the biological or computational literature.
Richard J. Roberts, The early days of bioinformatics publishing, Bioinformatics
16 (1): 2-4, 2000
"Information overload" is not an overstatement these days. One of the biggest challenges is to deal with the tidal wave of data, filter out extraneous
noise and poor quality data, and assimilate and integrate information on a
previously unimagined scale.
Google = about 118,000
July 19, 2002; about 249,000 Oct. 22, 2004
Where's
my stuff? Ways to help with information overload, Mary Chitty, SLA
presentation June 10, 2002, Los Angeles CA
information retrieval:
information theory: Algorithms
& data analysis glossary
information visualization: The direct
visualization of a representation of selected features or elements of complex
multi-dimensional data. Data that can be used to create a visualization
includes text, image data, sound, voice, video - and of course, all kinds of
numerical data. Our visual analysis systems also provide the tools to interact
with the data that has been visualized so that users can explore, discover and
learn. Users do not look at static images, but can subset the data, run queries,
do time sequence studies and create categories and correlations of data type. [Pacific
Northwest National Lab, About Visualization at PNNL, 1999] http://www.pnl.gov/infoviz/
Google = about 28,100 July 19, 2002;
about 94,200 Oct. 22, 2004
Information visualization resources on
the web, 2002 http://graphics.stanford.edu/courses/cs348c-96-fall/resources.html
Related term: data visualization; Broader
term: visualization
informational repositories:
A new strategy that allows universities to apply serious,
systematic leverage to accelerate changes taking place in scholarship and
scholarly communication, both moving beyond their historic relatively passive
role of supporting established publishers in modernizing scholarly publishing
through the licensing of digital content, and also scaling up beyond ad-hoc
alliances, partnerships, and support arrangements with a few select faculty
pioneers exploring more transformative new uses of the digital medium. Clifford
Lynch, Institutional Repositories: Essential Infrastructure for Scholarship in
the Digital Age, ARL Bimonthly Report 226, Feb. 2003 http://www.arl.org/newsltr/226/ir.html
DSpace,
MIT http://www.dspace.org/
integrated taxonomy: We developed
a comprehensive help taxonomy by combining both user interface and help system
attributes, ranging from help access interface, presentation, and supporting
knowledge structure, to implementation. The taxonomy systematically identifies
independent axes along which help can be categorized which in turn encloses a
space of help categories in which to place currently existing help research, and
identifies distinct help software architectural features which contrast pros and
cons in different approaches to implement help systems. The taxonomy projects a
vision of what help can be like if it is on a par with advances in user
interface technology, and desirable design features of help system architectures
which are in the progressive direction along with the user interface software
tools. [Piyawadee "Noi" Sukaviriya, An Integrated Taxonomy of
Online Help Based on User Interface View, GVU, Georgia Institute of Technology,
GIT-GVU-91-20] http://www.cc.gatech.edu/gvu/reports/1991/abstracts/91-20.html
Google = about 85 July 19, 2002;
about 353 Oct. 22, 2004
integrated view definitions:
Related
terms: data mediation, knowledge based mediation
integration: Bioinformatics glossary
interoperability:
The ability of two or more systems or components to exchange information and to use the information that has been exchanged. [Institute of Electrical and Electronics Engineers. IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries. New York, NY: 1990]
http://www.sei.cmu.edu/str/indexes/glossary/interoperability.html
Enabling heterogeneous databases to function in an integrated way; sometimes refers to cross-platform functionality and operability across relational, object-oriented, and non-standard types of databases.
Google = about 1,080,000 July 19, 2002;
about 2,380,000 Oct. 22, 2004
Related terms: metadata, ontology, taxonomies;
Narrower terms: ontology interoperability, semantic interoperability, software interoperability
invisible web:
For this study, we have avoided the term "invisible Web" because it
is inaccurate. The only thing "invisible" about searchable databases
is that they are not indexable nor able to be queried by conventional search
engines. http://www.brightplanet.com/deepcontent/tutorials/DeepWeb/index.asp
Those parts of the web which are inaccessible to current search engines. A straightforward example
was PubMed/Medline
(until Google started indexing it). You still can't usually access proprietary (fee-based) databases
such as Thomson Dialog or Lexis-Nexis except directly. Until recently PDF documents and PowerPoint slides were inaccessible to search engines.
Google = about 17,300 July 19, 2002;
about 278,000 Oct. 22, 2004
Direct Search,
Gary Price, George Washington Univ. US
gary@freepint.com
Invisible Web: Database contents rarely found in Search Engines, Univ. of California- Berkeley, Spring 2001
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html
Related terms: deep web, semantic web
just in time information:
90,200 websites were found with this phrase by Google on
May 23, 2007. An increasing need as we are deluged with information and data -- and still need time to reflect, discuss and think about
what all these mean.
Google = about
2,900 March 14, 2002, about 3,400 July 19, 2002; about 51,600 Feb. 21, 2006; about 88,400 May 7, 2007
Just-In-Time Information Retrieval. Bradley J. Rhodes. Ph.D. Dissertation, MIT Media Lab, May 2000. Just in time retrieval agents Bradley J. Rhodes
http://www.research.ibm.com/journal/sj/393/part2/rhodes.html
Related terms: information overload, remembrance agents;
Bioinformatics modularity
Knowledge Discovery in Databases (KDD): Algorithms
& data analysis glossary
knowledge integration:
Related terms: ontologies,
semantics
knowledge management:
An organization's collective knowledge - and the ability to access it - comprises a key corporate asset. Smart organizations know that to maintain competitive advantage, they need to manage their data, information, and knowledge effectively and systematically. Knowledge management involves much more than compiling data and retrieving information. It should be seen as an overarching concept that combines a management philosophy with
data warehousing, workflow strategies, database management, and knowledge distribution in a network computing environment. [William A. Woods "Knowledge Management Needs Effective Search Technology" Sun Journal]
http://www.sun.com/dot-com/sunjournal/V2N1/03_feat2a.html
Google = about 826,000 July 19, 2002;
about 3,520,000 Oct. 22, 2004
Knowledge
Management, FDA, 2004 http://www.fda.gov/cdrh/strategic/km.html
Virtual Library: Knowledge Management, May 2000
http://www.brint.com/km/ Definition, articles, white papers, interviews, business and technology library, periodicals and publications, “out of box thinking”, “movers and shakers”, “think tank”, calendar of events, emerging topics.
Knowledge Management definitions,
Charlie Matthews, VisualInterconnections, 2002 http://www.visualinterconnections.com/CEM/definitions.htm
KM
Glossary, GOTCHA, Univ. of California
Berkeley, 1999 About 50 terms. http://sims.berkeley.edu/courses/is213/s99/Projects/P9/web_site/glossary.htm
Related terms: ontologies, paraphrase problem, taxonomies
knowledge risk: Business
of biopharmaceuticals glossary
laboratory informatics:
The specialized application of information technology to
maximize laboratory operations. Laboratory informatics encompasses data
acquisition, data processing, laboratory information management system (LIMS),
laboratory automation, scientific data management (including data analysis and
long-term archiving), and electronic laboratory notebooks. Focus is on the
application of this technology in analytical, production, and R&D
laboratories. Graduate Programs: Laboratory Informatics, Indiana Univ.
School of Informatics, US http://www.informatics.iupui.edu/Academics/graduate/laboratory_informatics/index.php
Related term: Drug
discovery & development LIMS
Laboratory
Informatics Primer, Waters Corp http://www.waters.com/WatersDivision/ContentD.asp?watersit=EGOO-6M3TVN
Google = about 1,250 Dec.
31, 2002; about 3,000 Oct. 22, 2004
lexical semantics:
http://en.wikipedia.org/wiki/Lexical_semantics
lexicon:
A machine-readable dictionary that may contain a good deal of additional
information about the properties of the words, notated in a form that parsers
can utilize. [Bob Futrelle, A brief introduction to NLP, BIONLP.org, Computer Science,
Northeastern Univ., US, 2002] http://www.ccs.neu.edu/home/futrelle/bionlp/intro.html
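As a minimal sketch of what "machine-readable" means here, a lexicon can be held as a mapping from words to properties that a program looks up. The entries, the property names (pos, plural, past) and the tag_words helper below are hypothetical and are not drawn from the cited introduction.

```python
# Hypothetical lexicon entries: each word maps to properties a parser could use.
LEXICON = {
    "gene": {"pos": "noun", "plural": "genes"},
    "express": {"pos": "verb", "past": "expressed"},
    "the": {"pos": "determiner"},
    "protein": {"pos": "noun", "plural": "proteins"},
}


def tag_words(sentence, lexicon):
    """Attach the part of speech to each word, or 'unknown' if it is not listed."""
    tagged = []
    for word in sentence.lower().split():
        entry = lexicon.get(word, {})
        tagged.append((word, entry.get("pos", "unknown")))
    return tagged


tags = tag_words("The gene encodes the protein", LEXICON)
```

Words missing from the lexicon ("encodes" in this sentence) come back as unknown, which is one reason real lexicons carry inflection and morphology information as well.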
A linguistics term (words and their definitions), an
artificial intelligence term. Sometimes a synonym for glossary or dictionary.
Google = about 768,000 July 19, 2002;
about 1,960,000 Oct. 22, 2004
life sciences informatics:
Informatics are
essential at every step of genomics-based drug discovery and development. The
commercial landscape of life sciences information technology has changed
dramatically in the last few years. Bioinformatics,
in particular, has gone through a dramatic boom/bust. While IT companies are
looking to the drug discovery and development arena as a new market opportunity,
pharmaceutical companies are faced with rising pressure to reduce (or at
least control) costs, and have a growing need for new informatics tools to help
manage the influx of data from genomics, and turn that data into tomorrow's
drugs. Key IT tools, such as high-performance computing, Web services, and
grids, are being used to improve the speed and efficiency of drug discovery and
development. True breakthroughs are still lacking, particularly in key areas
such as gene prediction, data mining, protein structure modeling and prediction,
and modeling of complex biological systems. However, most experts agree that IT
and bioinformatics are essential to reaching the improved productivity the
pharmaceutical industry craves.
lightweight ontologies:
Topic maps are seen as lightweight ontologies because they are able to
model knowledge in a very ‘shallow’ way (e.g. just topics, their classes,
occurrences, and associations, but no class hierarchies, constraints, or
inference rules). Even ‘shallow’ topic maps are already very useful without
having put large investments in their creation. Topic Maps are Emerging: Why
Should I Care? H. Holger Rath, http://www.idealliance.org/papers/dx_xmle04/papers/03-01-03/03-01-03.html
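The "shallow" model described above (topics, their classes, occurrences and associations, with no hierarchies, constraints or inference rules) can be sketched as plain data structures. This is an illustration only, not an implementation of the Topic Maps standard; the class names, fields and example topics are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject in the map, with an optional class and pointers to resources."""
    name: str
    topic_class: str = ""
    occurrences: list = field(default_factory=list)  # e.g. URLs or document IDs


@dataclass
class Association:
    """A named link between two topics; no constraints or rules are attached."""
    kind: str
    source: str
    target: str


# Hypothetical content for illustration.
topics = {
    "aspirin": Topic("aspirin", "drug", ["doc-17"]),
    "headache": Topic("headache", "symptom", ["doc-17", "doc-42"]),
}
associations = [Association("treats", "aspirin", "headache")]


def related(topic_name, associations):
    """List (kind, other topic) pairs for every association touching a topic."""
    out = []
    for a in associations:
        if a.source == topic_name:
            out.append((a.kind, a.target))
        elif a.target == topic_name:
            out.append((a.kind, a.source))
    return out
```

Nothing here can infer new facts or reject an inconsistent entry, which is the sense in which such a model is lightweight.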
Google = about 154 July 19, 2002;
about 287 Oct. 22, 2004; about 274 May 2, 2005
Compare: heavyweight ontologies
lightweight taxonomies: Existing ontologies vary in a continuum from
lightweight taxonomies (thesauri or conceptual vocabularies) to rigorous formalizations.
[Manuela Viezzer, Ontologies and conceptual modeling, 2000-08-31] http://www.cs.bham.ac.uk/~mxv/publications/onto_engineering/node1.html
Google = about 5 July 19, 2002;
about 4 Oct. 22, 2004
logic based ontologies:
Very expressive, model is a set of theories, well defined semantics, Automatic derived classification taxonomies, Concepts are defined and primitive. [Robert Stevens' slides, Univ. of Manchester, UK at Synopsis of the
Bio-Ontologies Workshop at the EBI for MGED, Dec. 5, 2001]
http://www.cbil.upenn.edu/Ontology/EBI_Bioontologies_Workshop.html Some powerpoints still on web.
Google = about 23 July 19, 2002;
about 71 July 14, 2004
lower ontologies: See under middle
ontologies
Google = "lower ontologies"
about 62 "lower level ontologies" about 134 Aug. 8, 2002
machine-readable: See under
metadata
Google= about 303,000 July 19, 2002;
about 535,000 Oct. 22, 2004
machine-understandable: See under
metadata
Google= about
3,730 July 19, 2002; about 8,950 July 14, 2004
markup languages: Computers
& computing glossary
Google = about 639,000 Aug. 9, 2002;
about 170,000 Oct. 22, 2004
mash-up
http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)
Google
= about 22,100,000 Oct. 27, 2006
Medbiquitous Consortium: Technology standards
based on XML and web services. http://www.medbiq.org/index.html
medical informatics:
The field of information science concerned with the analysis and dissemination of medical data through the application of computers to various aspects of health care and medicine. [MeSH, 1987]
Medical
informatics has to do with all aspects of understanding and promoting the
effective organization, analysis, management, and use of information in health
care. While the field of medical informatics shares the general scope of these
interests with some other health care specialties and disciplines, medical
informatics has developed its own areas of emphasis and approaches that have set
it apart from other disciplines and specialties. For one, a common thread
through medical informatics has been the emphasis on technology as an integral
tool to help organize, analyze, manage, and use information. In addition, as
professionals involved at the intersection of information and technology and
health care, those in medical informatics have historically tended to be engaged
in the research, development, and evaluation side of things, and in studying and
teaching the theoretical and methodological underpinnings of data applications
in health care. However, today medical informatics also counts among its
profession many whose activities are focused on dimensions that include the
administration and everyday collection and use of information in health care.
What is Medical Informatics? History of Medical Informatics, AMIA American
Medical Informatics Association http://www.amia.org/history/what.html
medical Informatics:
Consisting of required course work concerning computer applications in medicine,
computer- assisted medical decision making, biomedical imaging, and
bioinformatics. Mark Musen, Design and Use of Clinical Ontologies: Curricular
Goals for the Education of Health Telematics Professionals, Stanford Medical
Informatics, 1999 http://smi-web.stanford.edu/pubs/SMI_Reports/SMI-1999-0767.pdf
Google = about 163,000 July 19, 2002;
about 479,000 Oct. 22, 2004, about 6,960,000 Oct. 3, 2005
metadata:
Could elevate the status of the web from machine-readable to something we might call machine-understandable. Metadata is "data about data" or specifically in our current context "data describing web resources." The distinction between "data" and "metadata" is not an absolute one; it is a distinction created primarily by a particular application ("one application's metadata is another application's data"). [W3C, "Introduction to RDF Metadata" 1997]
http://www.w3.org/TR/NOTE-rdf-simple-intro
Metadata is machine understandable
information for the web. The W3C
Metadata Activity addressed the combined needs of several groups for a
common framework to express assertions about information on the Web, and was
superseded by the W3C Semantic Web Activity.
[W3C, Metadata and Resource Description, W3C Technology and Society Domain,
2001] http://www.w3.org/Metadata/
Information about data that enables intelligent, efficient access and management of data. … metadata is always less than the data. [Robyne M. Sumpter “Whitepaper on Data Management” Lawrence Livermore National Laboratory, February 10, 1994]
http://www.llnl.gov/liv_comp/metadata/papers/whitepaper-draft.html
more on metadata Ontologies glossary
Google = about 1,640,000 July 19, 2002;
about 4,850,000 Oct. 22, 2004; about 25,600,000 May 9, 2005; about
62,700,000 May 7, 2007
Narrower
terms: Dublin Core Metadata Initiative, faceted metadata Related terms: interoperability, RDF, semantic web
micro-theories:
An ontology about a specific domain, that fits within, and for the most part
is consistent with, an ontology with a broader scope. For example, structural biology fits within the larger context of biology. Structural biology will have its own terminology and specific algorithms that apply within the specific domain, but may not be useful or identical to, for example, the genome community. [Lawrence Berkeley Lab "Advanced Computational Structural Genomics" Glossary]
Google = about 953 July 19, 2002;
about 8,670 Oct. 22, 2004
modularity:
Bioinformatics glossary
molecular informatics:
The effective use of information derived from genomics and proteomics is of central importance and the ability to identify the most important data, to assess its accuracy and to be aware of any assumptions and limitations of hypotheses and predictive models is absolutely essential. Whereas the development of predictive models based on analogy has been very successful in chemistry and cheminformatics, the complex nature of biomolecular systems limits similar transference within bioinformatics. Without a critical analysis,
in-silico discovery will be unable to be effectively integrated in the field of molecular informatics. The following themes will be covered: knowledge discovery and data mining, rational drug design, prediction of small molecule bioavailability (ADME Tox) properties, protein structure and function determination, new methods of drug-target modeling, cellular metabolism, and the use of high-throughput methods (biochips) for acquiring gene expression and protein binding information. [Beilstein-Institut, Molecular Informatics: Confronting Complexity, International Workshop, May 13-16, 2002]
http://www.beilstein-institut.de/pdf_files/bozen_02_scientific_program.pdf
Unilever is investing over £13M to establish a new world-leading research group within the Department of Chemistry [Univ. of Cambridge, UK] in the emerging field of Molecular Informatics.
... New methods will be devised for creating, manipulating and storing molecular data to deepen our understanding of molecules and their properties and to allow novel
in-silico experimentation.
Inter-disciplinary research is a fundamental goal of the centre, integrating chemical, biological and materials sciences through molecular informatics. [Cambridge Univ. Chemical Laboratory, UK, 2000-2001]
http://www-ucc.ch.cam.ac.uk/
Google = about 2,580 July 19, 2002;
about 4,410 Oct. 22, 2004
molecular information theory: Algorithms
& data analysis glossary
molecular taxonomy: Cancer
genomics glossary
"molecular taxonomy" Google = about 1,650 July 19, 2002;
about 5,260 Oct. 22, 2004
"molecular taxonomies" Google = about 11 July 19, 2002;
about 106, Oct. 22, 2004
Broader term: taxonomy
nanopublishing:
A term coined by Jeff Jarvis, head of content,
technology, and strategic development for Advance. This is part of the Newhouse
media group that owns Conde Nast, among other things. In the past, Jarvis
started Entertainment Weekly. Now, he's a committed blogger and his company has
put its money where his mouth is, that is, in Pyra, the company behind Blogger.
Jim McClellan, New biz on the blog, Guardian Jan. 30, 2003 http://www.guardian.co.uk/online/story/0,3605,884658,00.html
National
Center for Biomedical Ontology: http://www.bioontology.org/index.html
natural language ontologies:
Hand-crafted, flexible but difficult to evolve, maintain and keep consistent, with weak semantics. Example: Gene Ontology [Robert Stevens' slides, Univ. of Manchester, UK at Synopsis of the Bio-Ontologies Workshop at the EBI for MGED, Dec. 5, 2001]
http://www.cbil.upenn.edu/Ontology/EBI_Bioontologies_Workshop.html
Google = about 69 July 19, 2002;
about 96 Oct. 22, 2004
Natural Language Processing NLP:
<artificial intelligence> (NLP) Computer understanding, analysis, manipulation, and/or generation of natural language. This can refer to anything from fairly simple string-manipulation tasks like stemming, or building concordances of natural language texts, to higher-level AI [artificial intelligence]-like tasks like processing user queries in natural language. [FOLDOC]
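As a minimal sketch of the "fairly simple string-manipulation tasks" mentioned above, the following Python builds a concordance (each word mapped to the token positions where it occurs) after applying a deliberately crude suffix-stripping stemmer. The names `crude_stem` and `build_concordance` and the suffix list are illustrative assumptions, not part of FOLDOC or of any NLP library; a real system would use an established stemming algorithm.

```python
import re
from collections import defaultdict

# Hypothetical suffix list for illustration only; not a real stemming algorithm.
SUFFIXES = ("ing", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_concordance(text):
    """Map each stemmed, lowercased word to the token positions where it occurs."""
    concordance = defaultdict(list)
    tokens = re.findall(r"[a-z]+", text.lower())
    for position, token in enumerate(tokens):
        concordance[crude_stem(token)].append(position)
    return dict(concordance)

index = build_concordance("Genes encode proteins; a gene encodes a protein.")
print(index["gene"])     # [0, 4]
print(index["protein"])  # [2, 7]
```

Because "genes" and "gene" reduce to the same stem, both occurrences are listed under one concordance entry.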
The newly emergent interest in natural
language processing for biology has been christened "Information
Extraction". But work in this area has been going on for many decades under
different names and this site includes a good deal of information about past and
current work in NLP and in information extraction for biology in particular.
[BIONLP.org, Bob Futrelle, Computer Science,
Northeastern Univ., US, 2002] http://www.ccs.neu.edu/home/futrelle/bionlp/
Google = about 166,000 July 19, 2002;
about 471,000 Oct. 22, 2004
START
Natural Language Question Answering System,
InfoLab Group, Computer Science and Artificial Intelligence Lab, MIT http://www.ai.mit.edu/projects/infolab/start-system.html
navigational taxonomies:
Aimed at discovering information through browsing. Once again the taxonomy provides a controlled vocabulary, but rather than using it in the background for manipulating queries, you can display this taxonomy to knowledge workers to help them find the information they need. The navigational taxonomy consists of labels applied to categories of content based on knowledge workers’ mental models of how the information is organized. ... A navigational taxonomy is based on user behavior and not on content. As a result, the category labels may be organized differently from the concept-based descriptive taxonomy, and they also may contain words or phrases that would not meet the standards of a descriptive taxonomy. ... navigational taxonomies are often specialized and unique to an instance of information presentation (a portal, a site, an intranet), and multiple content management systems do not typically reuse them as they would a descriptive taxonomy. Navigational taxonomies are therefore not governed by the same rules about which taxonomy terms can be changed.
Susan Conway and Char Sligar, "What is a taxonomy" Unlocking Knowledge Assets, Chapter 6, Building Taxonomies, Microsoft Press,
2002
http://www.microsoft.com/mspress/books/sampchap/5516a.asp
Google = about 21 July 19, 2002;
about 27 Oct. 22, 2004; about 83 July 9, 2007
OIL Ontology Inference Layer:
A proposal for a web-based representation and inference layer for ontologies,
which combines the widely used modelling primitives from frame-based languages
with the formal semantics and reasoning services provided by description logics.
It is compatible with RDF Schema
(RDFS), and includes a precise semantics
for describing term meanings (and thus also for describing implied information).
http://www.ontoknowledge.org/oil/
object based ontologies:
Computers
& computing glossary
Google = about 17,500 July 19, 2002
ontological commitment:
An agreement to use a vocabulary (i.e., ask queries and make assertions) in a way that is consistent (but not complete) with respect to the theory specified by an ontology. We build agents that commit to ontologies. We design ontologies so we can share knowledge with and among these agents. [Tom Gruber, "What is an ontology?" Knowledge Systems Lab, Stanford Univ. 2001]
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
Google = about 2,370 July 19, 2002; about 5,980 Oct. 22, 2004
ontology, ontologies: A formal explicit specification of a shared conceptualization. In this context conceptualization refers to an abstract model of some phenomenon in the world that identifies that phenomenon's relevant concepts. Explicit means that the type of concepts used and the constraints on their use are explicitly defined, and formal means that the ontology should be machine understandable. ... Shared reflects the notion that an ontology captures consensual knowledge - that is, it is not restricted to some individual but is accepted by a group.
Dieter Fensel et al. "OIL: An Ontology Infrastructure for the Semantic Web" IEEE Intelligent Systems, Mar/Apr. 2001 www.cs.vu.nl/~frankh/postscript/IEEE-IS01.pdf
The word "ontology" seems to generate a lot of controversy in discussions about AI [artificial intelligence]. It has a long history in philosophy, in which it refers to the subject of existence. ... In the context of knowledge sharing, I use the term ontology to mean a specification of a conceptualization. That is, an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as set-of-concept-definitions, but more general. And it is certainly a different sense of the word than its use in philosophy. What is important is what an ontology is for. My colleagues and I have been designing ontologies for the purpose of enabling knowledge sharing and reuse. In that context, an ontology is a specification used for making ontological commitments. ... Notes: 1) Ontologies are often equated with taxonomic hierarchies of classes, class definitions, and the subsumption relation, but ontologies need not be limited to these forms.
Tom Gruber, Stanford Univ. "What is an ontology?", 2001
http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
more in Ontologies glossary
1.1 What is an ontology? [W3C, Requirements for a web ontology language, work in progress] http://www.w3.org/TR/webont-req/#onto-def
Similar to a dictionary or glossary, but with greater detail and structure that enables computers to process its content. An ontology consists of a set of concepts, axioms, and relationships that describe a domain of interest. An upper ontology is limited to concepts that are meta, generic, abstract and philosophical, and therefore are general enough to address (at a high level) a broad range of domain areas. [IEEE, Standard Upper Ontology (SUO) Working Group, 2002]
http://suo.ieee.org/
Long used in philosophy and artificial intelligence. The major difference from controlled vocabularies or taxonomies is the idea of making information "machine-understandable" as well as machine-readable, and amenable to logic (particularly by agreeing upon one specific meaning for a term).
Terminology of methods and techniques for defining, sharing, and merging ontologies, John F. Sowa, 2001. 18 definitions, including formal ontology, mixed ontology, prototype type ontology, terminological ontology. http://users.bestweb.net/~sowa/ontology/gloss.htm
Human
Ontology Resources, SOFG Standards
and Ontologies for Functional Genomics, http://www.sofg.org/resources/human.html#cbil
Google = ontology about 336,000 July 19, 2002;
about 1,140,000 Oct. 1, 2003; about 1,250,000 Oct. 22, 2004
Narrower terms: bottom-up ontologies, biomedical ontologies, common ontology, descriptive ontology, domain ontology, dynamic ontology, heavyweight ontologies, lightweight ontologies, logic based ontologies, micro-theories, middle ontologies, mixed ontologies, taxonomies, natural language ontologies, navigational ontology, object based ontologies, orthogonal ontologies, pure ontologies, reusable ontologies, shared ontologies, simple ontologies, structured ontology, top-down ontology, upper ontologies;
Functional genomics glossary Gene
OntologyTM GO;
Related terms: interoperability, metadata,
OIL Ontology Inference Layer, ontological commitment, ontology annotation tools, ontology editors,
ontology evolution, ontology interoperability, RDF, semantic web, web ontology
language; Microarrays glossary Ontology Working
Group
ontology annotation tools:
Link unstructured and semistructured information sources with ontologies. [Dieter Fensel et al. "OIL: An Ontology Infrastructure for the Semantic Web" IEEE Intelligent Systems, Mar/Apr. 2001]
www.cs.vu.nl/~frankh/postscript/IEEE-IS01.pdf
ontology editors:
Help human knowledge engineers build ontologies - they support the definition of concept hierarchies, the definition of attributes for concepts, and the definition of axioms and constraints. They must provide graphical interfaces and conform to existing standards in Web-based software development. They enable the inspecting, browsing, codifying, and modifying of ontologies, and they support ontology development and maintenance tasks. [Dieter Fensel et al. "OIL: An Ontology Infrastructure for the Semantic Web" IEEE Intelligent Systems, Mar/Apr. 2001]
www.cs.vu.nl/~frankh/postscript/IEEE-IS01.pdf
Google = about 314 July 19, 2002;
about 873 Oct. 22, 2004
Related term: Computers
& computing glossary GUI Graphical User Interface
ontology evolution:
3.2 Ontology evolution, [W3C, Requirements for a web ontology language, work in progress] http://www.w3.org/TR/webont-req/#goal-evolution
Google = about 234 July 19, 2002;
about 886 Oct. 22, 2004
ontology interoperability:
3.3 Ontology interoperability,
W3C,
Requirements for a web ontology language, work in progress http://www.w3.org/TR/webont-req/#goal-interoperability
Google = about 89 July 19, 2002;
about 276 Oct. 1, 2003; about 284 Oct. 22, 2004
Broader term: interoperability
ontology
language: An ontology must be encoded in some
language. If one is using a simple ontology, few issues arise. However, if one
is considering a more complex ontology, expressive power of a representation and
reasoning language needs to be considered. As with any problem where a language
is being chosen, it must be epistemologically adequate -- the language must be
able to express the concepts in the domain. Deborah L. McGuinness,
"Ontologies Come of Age". In Dieter Fensel, Jim Hendler, Henry
Lieberman, and Wolfgang Wahlster, editors. Spinning the Semantic Web: Bringing
the World Wide Web to Its Full Potential. MIT Press, 2002. http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-mit-press-
Open
Biomedical Ontologies OBO: A collaborative experiment involving developers
of science-based ontologies who are establishing a set of principles for
ontology development with the goal of creating a suite of orthogonal
interoperable reference ontologies in the biomedical domain. http://www.obofoundry.org/
organizational
informatics: A field which studies the development and
use of computerized information systems and communication systems in
organizations. It includes social studies of their conception, design, effective
implementation within organizations, maintenance, use, organizational value,
conditions that foster risks of failures, and their effects for people and an
organization's clients. It is an intellectually rich and practical research area.
"Social Informatics" Indiana Univ, School of Library & Information
Science http://www.slis.indiana.edu/SI/oi1.html
Narrower
term: social informatics
orthogonal ontologies: Independent,
same basis for classification at all levels. Bernd G. Wenzel, Integration of
industrial data: Overview, NeuroSTEP and Shell, 1996-1997 http://www.tc184-sc4.org/SC4_Open/SC4_and_Working_Groups/WG10/N-DOCS/Files/wg10n116.pdf
Google = about 6 July 19, 2002;
about 72 Oct. 22, 2004
Related term: pure ontologies. Compare
mixed ontologies
orthogonal taxonomies:
Independent taxonomies, disjoint, with no overlap
parallel processing: The processing of program instructions by dividing them among multiple processors with the objective of running a program in less time. [whatis.com]
http://whatis.techtarget.com/definition/0,289893,sid9_gci212747,00.html
Google = about 24 July 19, 2002;
about 45 Oct. 22, 2004
paraphrase problem:
The situation that arises when the terminology used in the request is different from that used by the author. [William A. Woods, Sun Microsystems Research]
http://research.sun.com/people/wwoods/ Conceptual Indexing for Precision Content Retrieval http://research.sun.com/knowledge/
Google = about 153 July 19, 2002;
about 211 Oct. 22, 2004
Related term:
knowledge management
pattern,
pattern language: Patterns, discussion FAQ http://g.oswego.edu/dl/pd-FAQ/pd-FAQ.html
phylogenetic
taxonomy: Phylogenomics glossary
Google = about 929 July 19, 2002;
about 1,900 Oct. 22, 2004
portal:
An entry or starting point on the web, with a mixture of content and services, usually capable of personalization.
Narrower term: web portal
precision:
Percentage of unrelated material excluded by a specific query or search statement.
Related
terms: Genetic testing
analytical specificity, clinical specificity
Compare recall
pure ontologies:
The
basis for classification is the same throughout the classification hierarchy.
Such ontologies can be expected to be orthogonal. Here orthogonal will mean that
classes at a level will be mutually exclusive. On the other hand an object can
be a member of a class in more than one ontology. [Matthew West, Integration of
Industrial Data for Exchange, Access and Sharing (IIDEAS), NIST, ISO
TC184/SC4/WG10 N71, 1996]
http://www.nist.gov/sc4/wg_qc/wg10/current/n071/wg10n071.htm
Google = about 13 July 19,
2002; about 17 Oct. 22, 2004
Related term: orthogonal
ontologies
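The orthogonality condition quoted above (classes at one level are mutually exclusive, while an object may still belong to classes in more than one ontology) can be checked mechanically. The sketch below is an illustration under assumed data: the function name `is_orthogonal_level` and the example classifications are hypothetical and do not come from the cited ISO document.

```python
from itertools import combinations

def is_orthogonal_level(classes):
    """Return True if no object belongs to more than one class at this level."""
    return all(a.isdisjoint(b) for a, b in combinations(classes.values(), 2))

# Hypothetical example data: two independent classifications of the same objects.
by_material = {"steel": {"bolt_1", "pipe_7"}, "plastic": {"cap_3"}}
by_function = {"fastener": {"bolt_1"}, "conduit": {"pipe_7"}, "closure": {"cap_3"}}

# A level that mixes two bases for classification (material and function).
mixed = {"steel": {"bolt_1", "pipe_7"}, "fastener": {"bolt_1"}}

print(is_orthogonal_level(by_material))  # True
print(is_orthogonal_level(by_function))  # True
print(is_orthogonal_level(mixed))        # False: bolt_1 falls in two classes
```

Note that `bolt_1` appears in both `by_material` and `by_function` without violating the condition, because those are separate ontologies; the overlap only becomes a problem when both bases are mixed at one level.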
query contraction:
Needed when a search engine retrieves thousands of citations. May consist of additional terms (Boolean AND) or different terms (Boolean OR).
Google = about 26 July 19, 2002;
about 130 Oct. 22, 2004
query expansion:
Adding new and/or different terms to a search statement (particularly when a search engine or
database retrieves no hits). Often uses Boolean OR.
Google = about 7,500 July 19, 2002;
about 21,300 Oct. 22, 2004
Related terms: ontologies, taxonomies
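A minimal sketch of query expansion and query contraction as string operations on a Boolean query. The query syntax (parentheses, `AND`, `OR`) is generic rather than that of any particular search engine, and the function names and synonym table are hypothetical; in practice the synonyms might be drawn from a taxonomy or ontology.

```python
def expand_query(terms, synonyms):
    """Broaden a query: OR each term with its known synonyms."""
    groups = []
    for term in terms:
        alternatives = [term] + synonyms.get(term, [])
        groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)

def contract_query(query, extra_terms):
    """Narrow a query: AND additional required terms onto it."""
    return " AND ".join([query] + list(extra_terms))

synonyms = {"cancer": ["neoplasm", "tumor"]}  # hypothetical synonym table

broad = expand_query(["cancer", "taxonomy"], synonyms)
print(broad)   # (cancer OR neoplasm OR tumor) AND (taxonomy)

narrow = contract_query(broad, ["microarray"])
print(narrow)  # (cancer OR neoplasm OR tumor) AND (taxonomy) AND microarray
```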
RDF Resource Description Framework:
Integrates a variety of web-based metadata activities including sitemaps, content ratings, stream channel definitions, search engine data collection (web crawling), digital library collections, and distributed authoring, using XML as an interchange syntax. The RDF specifications provide a lightweight ontology system to support the exchange of knowledge on the Web. [W3C, Semantic Web Activity: Resource Description Framework (RDF) Mar. 2001]
http://www.w3.org/RDF/
Related term: knowledge management
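RDF metadata is expressed as statements of the form subject, predicate, object. The sketch below models such statements as plain Python tuples to show the idea; it does not use an RDF library or the XML interchange syntax. The `example.org` URIs, the literal values and the helper names are placeholders chosen for illustration; the Dublin Core element namespace is used for the predicates.

```python
# Dublin Core element namespace; the example.org URIs below are placeholders.
DC = "http://purl.org/dc/elements/1.1/"

# Each statement is a (subject, predicate, object) triple.
triples = [
    ("http://example.org/glossary", DC + "title", "Example glossary"),
    ("http://example.org/glossary", DC + "language", "en"),
    ("http://example.org/faq", DC + "title", "Example FAQ"),
]

def objects(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def subjects_with(triples, predicate):
    """Return the distinct subjects that carry the given predicate, sorted."""
    return sorted({s for s, p, _ in triples if p == predicate})

print(objects(triples, "http://example.org/glossary", DC + "title"))
print(subjects_with(triples, DC + "title"))
```

Because every statement has the same three-part shape, metadata from independent sources can be pooled into one list and queried uniformly.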
RSS [Really Simple Syndication] feeds: A Web content syndication format based on
XML. Cathleen Moore, Search engines target weblogs, InfoWorld, Mar. 17, 2003 http://www.infoworld.com/article/03/03/17/HNblogs_1.html
Newsreaders
http://directory.google.com/Top/Reference/Libraries/Library_and_Informat...e
RSS
2.0 specifications, Dave Winer http://blogs.law.harvard.edu/tech/rss/
recall:
The percentage of applicable material retrieved by a specific query or search statement.
Compare precision. Related term:
Genetic testing glossary sensitivity
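For comparison, the conventional information-retrieval formulation measures precision as the share of retrieved items that are relevant and recall as the share of relevant items that were retrieved. The sketch below computes both on sets of document identifiers; the identifiers and function names are hypothetical, and the formulas are the conventional ones rather than a restatement of this glossary's wording.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant (0.0 if nothing was retrieved)."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved (0.0 if nothing is relevant)."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)

retrieved = {"doc1", "doc2", "doc3", "doc4"}  # what the search returned
relevant = {"doc2", "doc4", "doc5"}           # what the searcher actually wanted

print(precision(retrieved, relevant))  # 0.5 (2 of 4 retrieved are relevant)
print(recall(retrieved, relevant))     # 2 of 3 relevant items were retrieved
```

Broadening a query tends to raise recall at the cost of precision, and narrowing it tends to do the reverse.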
regulated
information systems: Drug approvals
glossary
relevance:
Percentage of truly related material retrieved by a specific query or search statement.
Related terms: precision
Genetic testing glossary
analytical specificity, clinical specificity. Compare recall
remembrance agents:
A set of applications that watch over a user's shoulder and suggest information relevant to the current situation. While query-based memory aids help with direct
recall, remembrance agents are an augmented associative memory. [Bradley Rhodes, Remembrance Agents Because serendipity is too important to be left to chance..., 2001]
http://rhodes.www.media.mit.edu/people/rhodes/RA/
Google = about 673 July 19, 2002;
about 549 Oct. 22, 2004
Related
terms: collaborative filtering, just in time information
research informatics:
Research glossary
resourceome:
-Omes & -Omics glossary
reusable ontologies: A key enabler for electronic Commerce, Richard Fikes, Knowledge
Systems Lab, Stanford Univ. http://ksl-web.stanford.edu/Reusable-ontol/index.html
Google = about 597 July 19, 2002;
about 1,330 Oct. 1, 2003; about 778 Oct. 22, 2004
Related term: shared ontologies
reusable taxonomies:
Metadata, Taxonomies and Content
Reusabilities, Marcia Morante http://adlcommunity.net/file.php/11/Documents/Eedo_Knowledgeware_Metadata_Taxonomies_and_Content_Reusability.pdf
Google = about 5 July 19, 2002;
about 8 Oct. 1, 2003; about 8 Oct. 22, 2004; about 8 June 22, 2007
Rosetta:
A systems-level design
language developed to address requirements specification for systems-on-chip
designs. Rosetta specifically addresses problems associated with heterogeneity
and complexity in current systems. Specifically, Rosetta allows designers to
develop and integrate specifications written in multiple semantic models to
provide language and semantic support for concurrent engineering of electronic
systems. Accellera Rosetta Standards Committee Homepage, EDA Industry
Working Groups, 2002
http://www.eda.org/slds-rosetta/
SOAP Simple Object Access Protocol:
A
lightweight protocol for exchange of information in a decentralized, distributed
environment. [SOAP, W3C 1.1, work in progress] http://www.w3.org/TR/SOAP/
semantic data integration:
Semantic data integration requires a shared understanding of the meaning of mathematical data. Until recently, math protocols provided no
support for shared semantics beyond the meaning of the primitive data types and simply assumed that the communicating partners "knew"
each other. An important task of the Computer Algebra community is to close this semantic gap. Several initiatives addressing this problem are
underway (MP, OpenMath, MathBus) and we hope that more experience and a careful evaluation of the proposals will lead to a unifying
solution. Olaf Bachmann, Hans Schönemann "A Proposal for
Syntactic Data Integration for Math Protocols" Centre for Computer Algebra,
Dept. of Mathematics, Univ. of Kaiserslautern, Germany http://www.mathematik.uni-kl.de/~zca/Reports_on_ca/10/paper_html/node1.html
Google = about 214 July 19, 2002;
about 1,530 Oct. 22, 2004; about 23,900 June 22, 2007
semantic grid: As the Semantic
Web is to the Web, so is the Semantic Grid to the Grid. Rather than orthogonal
activities, we see the emerging semantic web infrastructure as an infrastructure
for grid computing applications. http://www.semanticgrid.org/
Google = about 190 July 19, 2002;
about 5,470 Oct. 22, 2004; about 182,000 June 22, 2007
Related term: Computers
& computing grid computing
semantic heterogeneity:
Semantic heterogeneity in document encoding systems is a serious obstacle to the interoperability required to create a critical mass of content for the electronic publishing industry.
This is a problem which persists even after a common syntax (e.g. XML) has been adopted, and sometimes even when common vocabularies are used.
[Scholarly Technology Group, Brown Univ., US, Jan. 2002] http://www.stg.brown.edu/news/2002/nist_report.html
Different databases use different controlled vocabularies, thesauri, taxonomies and/or free text.
Google = about 2,820 July 19, 2002;
about 6,080 Oct. 22, 2004; about 78,700 June 22, 2007
Contrast with: structural heterogeneity Related terms: Natural Language Processing NLP;
Bioinformatics glossary databases, federated databases, integrated databases
semantic
interoperability:
Jeff Heflin, James Hendler, Semantic
interoperability on the web, Extreme Markup Languages, 2000 http://www.cs.umd.edu/projects/plus/SHOE/pubs/extreme2000.pdf
Google = about 7,280 Apr. 24, 2003;
about 18,300 Oct. 22, 2004; about 330,000 June 22, 2007
semantic relationships: Denote concepts such as water, sea, and river, that are by definition permanent relationships; they arise from the definition of the subjects involved, and are not dependent on any particular document content. ... Foskett described three groups of semantic relationships: equivalence, hierarchical, and affinitive/associative. In equivalence relationships, more than one term denotes the same concept. These relationships are shown through cross-references in an alphabetical tool, and through juxtaposition in a classified tool. Hierarchical relationships are of two kinds: genus/species and whole/part. These relationships are shown through hierarchies in classified tools and with Broader and Narrower Term codes in alphabetical tools. Foskett described several kinds of affinitive/associative relationships; these relationships are denoted by Related Term codes. (Foskett pp. 72-78)
Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE"
Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995
http://www.musiclibraryassoc.org/BCC/BCC-Historical/BCC95/95WGFAM2.html
Related term: syntactic
relationships
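The three groups of semantic relationships described above (equivalence, hierarchical, associative) map naturally onto simple lookup tables. The following sketch is an illustration only: the table names, helper functions and term assignments are hypothetical, loosely reusing the water, sea and river example from the entry.

```python
# Hypothetical thesaurus data for illustration.
USE = {"ocean": "sea"}                          # equivalence: non-preferred -> preferred term
BROADER = {"sea": "water", "river": "water"}    # hierarchical: term -> its Broader Term
RELATED = {"river": ["flood"]}                  # associative: term -> Related Terms

def preferred(term):
    """Resolve an equivalence cross-reference to the preferred term."""
    return USE.get(term, term)

def narrower(term):
    """List the Narrower Terms of a term, derived from the Broader Term table."""
    return sorted(t for t, b in BROADER.items() if b == term)

def broader_chain(term):
    """Follow Broader Term links upward from the preferred form of a term."""
    chain = []
    term = preferred(term)
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain

print(preferred("ocean"))       # sea
print(narrower("water"))        # ['river', 'sea']
print(broader_chain("ocean"))   # ['water']
print(RELATED.get("river", [])) # ['flood']
```

These relationships hold regardless of any particular document, which is what distinguishes them from the ad hoc syntactic relationships recorded at indexing time.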
semantic
transparency: Within the context of
interoperable XML-based information processing, "semantic
transparency" means that machines and humans are presented with
information that is both unambiguous (having a precise, predictably
interpreted meaning) and meaningfully correct (simultaneously satisfying a
number of integrity constraints). Computer agents, in particular, must
exchange well-defined data in order to calculate and pass along "the
correct answer." Semantic transparency first requires that small
information objects as well as large information objects built from
smaller ones are formally specified at a detailed level in terms of their
fundamental characteristics, relationships, and natural integrity
constraints, such that validation tools can apply heuristics to test
information correctness. Given unambiguous semantic specification, both
computing agents and humans can verify that XML-encoded information is
meaningful and trustworthy. Managing Names and Ontologies: An XML Registry
and Repository, Robin Cover (OASIS)
http://www.sun.com/981201/xml/
semantic web:
The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications. In order to make this vision a reality for the Web, supporting standards, technologies and policies must be designed to enable machines to make more sense of the Web, with the result of making the Web more useful for humans.
Facilities and technologies to put machine-understandable data on the Web are rapidly becoming a high priority for many communities. For the Web to scale, programs must be able to share and process data even when these programs have been designed totally independently. The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people. [W3C, Semantic Web Activity Statement, Apr. 2001]
http://www.w3.org/2001/sw/Activity
The first layer of the semantic Web consists of ontologies and taxonomies, like "A machine bolt is a type of screw." "A huge amount of this is being done very desperately in the realm of biotech, for the human genome and new drug development. When you look at a Web services description, you realize that it's really just a very small ontology." Tim Berners-Lee, August 30, 2001 keynote at Software Development East in Boston. [Alexandra Weber Morales "Web founder seeks simplicity" Show Daily Online, 2001]
http://www.sdgnews.com/sd2001es_006/sd2001es_006.htm
Google = about 71,600 July 19, 2002;
about 967,000 Oct. 22, 2004; about 19,800,000 June 22, 2007
Semantic Web Business Special Interest Group:
http://business.semanticweb.org/
Semantic web challenge: http://challenge.semanticweb.org/
Semantic Web Community Portal
http://www.semanticweb.org/
Semantic Web HCLS Health Care and Life Sciences Interest Group http://www.w3.org/2001/sw/hcls/
Broader term: web Related terms: metadata, ontology,
RDF, taxonomies, XML. Compare: syntax
semantics:
How the information [in a data file] should be interpreted by others. [Russ Altman "Challenges for Biomedical Informatics and Pharmacogenomics", Stanford Medical Informatics, c.2001] http://www-smi.stanford.edu/pubs/SMI_Reports/SMI-2001-0898.pdf
shared ontologies: 3.1 Shared
ontologies, W3C, Requirements for a web ontology language, work in progress http://www.w3.org/TR/webont-req/#goal-shared-ontologies
Designing
Shared Ontologies, JOHO the Blog, 2004 http://www.hyperorg.com/blogger/mtarchive/003057.html
Google = about 1,090 July 19, 2002;
about 2,450 Oct. 1, 2003; about 2,520 Oct. 22, 2004
Related term: reusable ontologies
shared taxonomies:
Shared
Taxonomies, LouisRosenfeld.com, 2004 http://www.louisrosenfeld.com/home/bloug_archive/000276.html
Google = about 12 July 19, 2002;
about 70 Oct. 22, 2004; about 86 May 2, 2005; about 217 June 22, 2007
social
informatics: Refers to the body of research and study that examines social
aspects of computerization -- including the roles of information technology in
social and organizational change and the ways that the social organization of
information technologies are influenced by social forces and social practices.
SI includes studies and other analyses that are labeled as social impacts of
computing, social analysis of computing, studies of computer-mediated
communication (CMC), information policy, "computers and society," organizational
informatics, interpretive informatics, and so on. http://www.slis.indiana.edu/SI/concepts.html
The term "Social Informatics"
emerged from a series of lively conversations in February and March 1996 among
scholars with an interest in advancing critical scholarship about the social
aspects of computerization, including Phil Agre, Jacques Berleur, Brenda Dervin,
Andrew Dillon, Rob Kling, Mark Poster, Karen Ruhleder, Ben Shneiderman, Leigh
Star and Barry Wellman. As the conversation developed, it became clear that
labels that could energize scholars in one sub-community could readily turn off
participants in other communities. Various participants preferred different
labels; a sufficient consensus emerged around "Social Informatics"
that it can serve as a working label. ["Conceptions of social
informatics" Indiana Univ., School of Library and Information Science,
2002] http://www.slis.indiana.edu/SI/concepts.html
A serviceable working conception of "social informatics" is that it identifies a body of research that examines the social aspects of computerization. A more formal definition is "the interdisciplinary study of the design, uses and consequences of information technologies that takes into account their interaction with institutional and cultural contexts."
... Social informatics has been a subject of systematic analytical and critical research for the last 25 years. Unfortunately, social informatics studies are scattered in the journals of several different fields, including computer science, information systems, information science and some social sciences. Each of these fields uses somewhat different nomenclature. This diversity of communication outlets and specialized terminologies makes it hard for many
non-specialists (and even specialists) to locate important studies. [Rob Kling,
What is social informatics and why does it matter? D-Lib 5(1): Jan. 1999] http://www.dlib.org/dlib/january99/kling/01kling.html
Social informatics HomePage http://www.slis.indiana.edu/SI/
Red Rock Eater News Service, Phil Agre, UCLA,
US http://polaris.gseis.ucla.edu/pagre/rre.html
structural heterogeneity:
Different databases use different fields, fieldnames and relationships between elements. This
can also be a term in structural biology
Google = about 2,210 July 19, 2002;
about 9,340 Oct. 22, 2004
Compare semantic heterogeneity Related term: metadata
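Structural heterogeneity of the kind described above (different field names for the same information) is commonly handled by mapping each source's fields onto a common schema. The sketch below assumes two made-up sources, `db_a` and `db_b`, with hypothetical field names and records; it is an illustration, not a description of any real database.

```python
# Hypothetical records from two databases describing the same gene.
record_a = {"gene_symbol": "TP53", "organism": "Homo sapiens"}
record_b = {"symbol": "TP53", "species": "Homo sapiens"}

# Per-source mapping from local field names to a common schema.
FIELD_MAPS = {
    "db_a": {"gene_symbol": "symbol", "organism": "species"},
    "db_b": {"symbol": "symbol", "species": "species"},
}

def to_common(record, source):
    """Rename a record's fields to the common schema, dropping unmapped fields."""
    mapping = FIELD_MAPS[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}

print(to_common(record_a, "db_a"))
print(to_common(record_a, "db_a") == to_common(record_b, "db_b"))  # True
```

Renaming fields resolves only the structural mismatch. If one source recorded the species as "human" and the other as "Homo sapiens", the records would still disagree, which is the semantic heterogeneity problem.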
structure:
In a biological or
anatomical context, the term structure is associated with two distinct concepts
(meanings): 1. a material object generated as a result of coordinated gene
expression, which necessarily consists of parts (e.g., hemoglobin molecule,
cell, heart, human body); and 2. the manner of organization or interrelation of
the parts that constitute a structure specified by the first definition (i.e.,
the structure of a structure). Both definitions emphasize the critical need for
declaring the principles according to which units of organization can be defined
in order to be able to state what is 'whole' and what is 'part'. Specifying the
manner in which parts interrelate must satisfy two requirements: 1. to determine
the kinds of parts of which various structures may be constituted; and 2. to
state the manner of spatial organization of parts by describing their
boundaries, continuities and attachments, as well as their location, orientation
and spatial adjacencies in terms of qualitative coordinates (in addition to the
quantitative geometric coordinates, which are embedded in the Visible Human data
sets). [Cornelius Rosse, et al., Visible Human, Know Thyself: The Digital
Anatomist Dynamic Structural Abstraction, National Library of Medicine, US] http://www.nlm.nih.gov/research/visible/vhpconf2000/AUTHORS/ROSSE/TEXTINDX.HTM
Related terms: Cell
biology glossary, Expression glossary Compare
unstructured.
subsumption:
http://ai.eecs.umich.edu/cogarch0/subsump/
Google = about 30,800 July 19, 2002;
about 80,500 Oct. 22, 2004; about 159,000 May 2, 2005
syntactic heterogeneity:
Fausto
Giunchiglia, Pavel Shvaiko, rewritten by Stefano Zanobini, Semantic Matching,
2002 http://www.science.unitn.it/~tomasi/think/pdf/zanobini.pdf
Google = about 114 July 19, 2002;
about 243 Oct. 1, 2003; about 201 Oct. 22, 2004; about 227 May 9, 2005
syntactic relationships:
Denote otherwise unrelated concepts that are brought together as composite
subjects in the documents being indexed. These relationships are not permanent, but rather ad hoc.
... Syntactic relationships are displayed according to the syntax of a normal sentence, either through the syntax of the subject string (in
precoordinate indexing), or through devices such as facet indicators (in postcoordinate indexing). The result of not providing for the display of
syntactic relationships in postcoordinate systems results in users not being able to distinguish between different contexts for the same term.
... recent research in information retrieval also supports the use of syntactic as well as
semantic relationships. Amanda Maple, "FACETED ACCESS: A REVIEW OF THE LITERATURE"
Working Group on Faceted Access to Music, Music Library Association Annual Meeting, 10 February 1995
http://theme.music.indiana.edu/tech_s/mla/facacc.rev
Related term: semantic relationships
syntax:
How information is structured in a data file. [Russ Altman "Challenges for Biomedical Informatics and Pharmacogenomics", Stanford Medical Informatics, c.2001]
http://www-smi.stanford.edu/pubs/SMI_Reports/SMI-2001-0898.pdf
Compare
semantics
taxonomies,
taxonomy: Taxonomies define a world- view because they specify which characteristics that compose each item count as important and then they lay out the relationships that exist between those characteristics. Taxonomies are political, value- laden instruments of organization that have a wide- array of assumptions embedded within them. Along more formal lines, a taxonomy is a structured vocabulary that identifies a single key term to represent a concept that could be described using several words. [Katherine C. Adams "Immersed in Structure: The Meaning and Function of Taxonomies" Internetworking Aug. 2000] http://www.internettg.org/newsletter/aug00/article_structure.html
Frustrations with search engines and information retrieval (and information overload) have led to increased interest in specialized taxonomies. A taxonomy is a form of controlled vocabulary with hierarchical relationships (broader terms, narrower terms) that provide additional suggestions for browsing, as do lateral relationships (related terms) and preferred terms. While there is theoretical interest in natural language processing, a very small percentage of web search engine queries actually use natural language processing successfully.
Directories such as Yahoo or the Open Directory Project are sometimes called taxonomies. In biology, taxonomies are so closely associated with Linnaeus, and bioinformatics is so dependent upon computers, that ontology is almost always the preferred term in this context.
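The broader, narrower, related, and preferred term relationships described above can be sketched as a small data structure. This is only an illustration: the class names, the method names, and the sample terms are assumptions made up for this example, not part of any taxonomy standard or product.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term in a toy controlled vocabulary (hypothetical model)."""
    name: str
    broader: set = field(default_factory=set)   # more general terms
    narrower: set = field(default_factory=set)  # more specific terms
    related: set = field(default_factory=set)   # lateral relationships
    use_for: set = field(default_factory=set)   # non-preferred synonyms


class Taxonomy:
    def __init__(self):
        self.terms = {}     # preferred term name -> Term
        self.synonyms = {}  # lower-cased synonym -> preferred term name

    def add(self, name, broader=(), related=(), use_for=()):
        term = self.terms.setdefault(name, Term(name))
        for b in broader:
            term.broader.add(b)
            # A broader link implies the reverse narrower link.
            self.terms.setdefault(b, Term(b)).narrower.add(name)
        for r in related:
            term.related.add(r)
            # Related-term links are treated as symmetric.
            self.terms.setdefault(r, Term(r)).related.add(name)
        for s in use_for:
            term.use_for.add(s)
            self.synonyms[s.lower()] = name
        return term

    def preferred(self, word):
        """Map a synonym to its preferred term; other words pass through."""
        return self.synonyms.get(word.lower(), word)

    def suggestions(self, word):
        """Browsing suggestions for a query word, or {} if it is unknown."""
        name = self.preferred(word)
        term = self.terms.get(name)
        if term is None:
            return {}
        return {
            "preferred": name,
            "broader": sorted(term.broader),
            "narrower": sorted(term.narrower),
            "related": sorted(term.related),
        }


# Sample terms are invented for illustration.
tax = Taxonomy()
tax.add("neoplasms", broader=["diseases"], related=["oncogenes"],
        use_for=["tumors", "tumours"])
tax.add("carcinoma", broader=["neoplasms"])
print(tax.suggestions("Tumours"))
```

A search front end could use `suggestions` to offer broader or narrower terms when a query returns too few or too many results.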
Google taxonomy = about 617,000 July 19, 2002,
about 3,270,000 Oct. 1, 2003, about 3,190,000 Oct. 22, 2004
Narrower terms:
bottom-up taxonomies, controlled vocabularies, descriptive taxonomies, domain taxonomies, dynamic taxonomies, integrated taxonomy, lightweight taxonomies, morphological taxonomies, navigational taxonomies, orthogonal taxonomies, shared taxonomies, top-down taxonomy; Cancer genomics glossary molecular taxonomies; Phylogenomics glossary molecular taxonomy, phylogenetic taxonomy
Related terms: classifiers, query expansion; Broader term: ontologies
See also FAQ question #4, which has more about taxonomies.
term mining:
Term Mining in Biomedicine, Sophia Ananiadou - University of Manchester,
2007 http://talks.cam.ac.uk/talk/index/6769
Google = about 1,990
June 16, 2003; about 2,980 Oct. 22, 2004; about 40,100 June 22, 2007
text categorisation: See Algorithms
& data analysis glossary under support vector machines
Google = "text categorisation" about 902; "text categorization" about 9,220 July 19, 2002; about 27,100 Oct. 22, 2004
text mining:
Usually
data mining technologies mine knowledge from data with well-formed schemes such
as relational tables. But, text data don't have such scheme, and information is
described freely in the documents. Therefore, we focus on Natural Language
Processing (NLP) technologies to extract such information. Using NLP
technologies, documents are transformed into a collection of concepts, described
using terms discovered in the text.
Usually, "text
mining" is used to indicate a text search technique. But, we think of text
mining as having more functions. Text mining technologies extract more
information than just picking up keywords from texts: facts, author's
intentions, their expectations, and their claims. Tokyo Research Lab, IBM,
Text Mining http://www.trl.ibm.com/projects/textmining/index_e.htm
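As a rough illustration of the first step described above, turning a free-text document into a collection of terms, here is a minimal sketch. It covers only tokenizing and counting; it does not attempt the fact or intention extraction the quoted definition has in mind. The function name, the stopword list, and the sample sentence are assumptions made up for this example.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "in", "of", "with", "and", "to", "is"}


def extract_terms(text, top_n=5):
    """Return the top_n most frequent non-stopword tokens with their counts."""
    # Tokens start with a letter and may contain digits or hyphens (e.g. "brca1").
    tokens = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


# Invented sample sentence, used only to show the call.
sample = ("The BRCA1 gene interacts with BRCA2. "
          "Mutations in the BRCA1 gene raise cancer risk.")
print(extract_terms(sample, top_n=2))
```

A real system would replace the regular expression and frequency count with NLP components such as part-of-speech tagging and named entity recognition.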
Using data mining on unstructured data, such as the
biomedical literature.
Competition in the
pharmaceutical industry has increasingly become based upon better recognition
and analysis of information, much of which is available as published text.
Breakthrough
Strategies for Text Mining in Pharmaceutical R&D, May 25, 2006, Philadelphia
PA
Text Mining
Glossary, ComputerWorld, 2004 http://www.computerworld.com/databasetopics/businessintelligence/story/0,10801,93967,00.html
Includes Categorization, clustering, extraction, keyword search, natural
language processing, taxonomy, and visualization.
Related terms: natural language processing; Algorithms
& data analysis glossary support vector machines
Google = about 20,600 July 19, 2002
about 39,300 July 3, 2003; about 113,000 Oct. 22, 2004; about 1,110,000 June
22, 2007
thesaurus, thesauri: See under controlled vocabulary
Google = thesaurus about
2,760,000 thesauri about 448,000 July 19, 2002; thesaurus about
6,270,000 Oct. 22, 2004
NISO Z39.19 Standard for Structure and
Organization of Information Retrieval Thesauri http://www.niso.org/standards/resources/Z39-19.html
top-down ontology: We spent the first six months attempting to design a top-down ontology of engineering. We accomplished very little until we selected a concrete system and example applications as contexts for our work. [Jay M. Tenenbaum, "Lessons from PACT and SHADE", Enterprise Integration Technologies Corporation and Stanford University, 1995] http://tools.org/EI/ICEIMT/archive/abstracts/PACT-SHADE.abstract
Google = about 10 July 19, 2002;
about 19 Oct. 22, 2004
top-down taxonomy:
Goes from the
general to the specific. Can also mean user oriented. Jean Graef "Top down
or bottom up" Montague Institute Review, 2001
Google = about 16 July 19, 2002
about 90 June 17, 2003; about 79 Oct. 22, 2004
topic maps:
This specification provides a model and grammar for representing the structure of information resources used to define topics, and the associations (relationships) between topics. Names, resources, and relationships are said to be characteristics of abstract subjects, which are called topics. Topics have their characteristics within scopes: i.e. the limited contexts within which the names and resources are regarded as their name, resource, and relationship characteristics. One or more interrelated documents employing this grammar is called a “topic map.”
http://www.topicmaps.org/xtm/1.0/
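The idea that a topic carries name, resource, and relationship characteristics, each valid within a scope, can be sketched in memory as follows. This is not XTM syntax and not an API of any topic map engine; the class names, the `in_scope` method, the sample topic, and the example.org URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Characteristic:
    kind: str                      # "name", "resource", or "relationship"
    value: str
    scope: frozenset = frozenset() # empty scope = valid in every context


@dataclass
class Topic:
    topic_id: str
    characteristics: list = field(default_factory=list)

    def add(self, kind, value, scope=()):
        self.characteristics.append(Characteristic(kind, value, frozenset(scope)))

    def in_scope(self, kind, context=None):
        """Values of one kind that are unscoped or valid in the given context."""
        return [
            c.value
            for c in self.characteristics
            if c.kind == kind and (not c.scope or context in c.scope)
        ]


# Invented example topic.
p53 = Topic("p53")
p53.add("name", "p53")
p53.add("name", "TP53", scope=["gene-symbol"])
p53.add("name", "tumor protein p53", scope=["english"])
p53.add("resource", "http://example.org/p53")
p53.add("relationship", "regulates: apoptosis")

print(p53.in_scope("name", "gene-symbol"))
print(p53.in_scope("name"))
```

Scoping lets the same topic show a different name depending on the context in which it is viewed, without duplicating the topic itself.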
(XML) Topic Maps, XML Cover Pages, Robin Cover, 2002 http://xml.coverpages.org/topicMaps.html
Google = about 23,400 July 19, 2002
UDDI:
Business of biopharmaceuticals glossary
UMLS Unified Medical Language System
In 1986, the National Library of Medicine (NLM) began a
long term research and development project to build a Unified Medical Language
System® (UMLS®).
The purpose of the UMLS is to aid the development of systems that help health
professionals and researchers retrieve and integrate electronic biomedical
information from a variety of sources and to make it easy for users to link
disparate information systems, including computer-based patient records,
bibliographic databases, factual databases, and expert systems. The UMLS project
develops "Knowledge Sources" that can be used by a wide variety of
applications programs to overcome retrieval problems caused by differences in
terminology and the scattering of relevant information across many databases.
[UMLS FactSheet, National Library of Medicine, NIH, US, 2002] http://www.nlm.nih.gov/pubs/factsheets/umls.html
unstructured data:
Today, transforming
unstructured data into a structured form is primarily a manual process; it is
time consuming and costly. However, all leading software applications must
leverage structured data to be effective. [About Mohomine] http://www.mohomine.com/about/index.asp
Generally free text, natural language.
Related term: natural language
processing. Compare structured.
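To make the contrast with structured data concrete, here is a minimal sketch that pulls a few fields out of one free-text sentence with a regular expression. It works only for text that happens to follow the assumed wording, which hints at why the transformation is costly in general. The pattern, the field names, and the sample note are assumptions invented for this example.

```python
import re

# Assumed sentence shape: "... age <n> ... diagnosed with <condition> on <YYYY-MM-DD> ..."
NOTE_PATTERN = re.compile(
    r"age (?P<age>\d+).*?diagnosed with (?P<diagnosis>[a-z ]+?) "
    r"on (?P<date>\d{4}-\d{2}-\d{2})",
    re.IGNORECASE,
)


def structure_note(note):
    """Return a dict of extracted fields, or None if the note does not match."""
    match = NOTE_PATTERN.search(note)
    if match is None:
        return None
    record = match.groupdict()
    record["age"] = int(record["age"])  # store the age as a number, not text
    return record


# Invented free-text note.
note = "Patient seen today, age 54, diagnosed with breast cancer on 2003-05-12 after biopsy."
print(structure_note(note))
```

Any note phrased differently returns `None`, so a production system would need many such rules or a natural language processing pipeline.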
Google = about 21,200 July 19, 2002
upper ontology: An upper ontology is limited to concepts that are meta, generic, abstract and philosophical, and therefore are general enough to address (at a high level) a broad range of domain areas.
[Upper Ontology, IEEE Standard Upper Ontology Working Group] http://ontology.teknowledge.com/
Google = about 11,000 Oct. 22, 2004
variance: One of the two components of
measurement error (the other one being bias). Variance results from
uncontrolled (or uncontrollable) variation that occurs in biological samples,
experimental procedures, and arrays themselves.
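The split of measurement error into bias and variance can be checked numerically: for repeated measurements of a known true value, the mean squared error equals the squared bias plus the variance. The sketch below uses made-up measurement values and a made-up true value; the function name is also an assumption.

```python
import math
from statistics import fmean, pvariance


def error_components(measurements, true_value):
    """Return (bias, variance, mean squared error) for repeated measurements."""
    mean = fmean(measurements)
    bias = mean - true_value                       # systematic offset
    variance = pvariance(measurements, mu=mean)    # spread around the mean
    mse = fmean((m - true_value) ** 2 for m in measurements)
    return bias, variance, mse


# Hypothetical replicate measurements of a quantity whose true value is 10.0.
bias, variance, mse = error_components([10.2, 9.8, 10.6, 10.4, 10.0], 10.0)
print(bias, variance, mse)

# The decomposition holds up to floating-point rounding.
assert math.isclose(mse, bias ** 2 + variance, abs_tol=1e-12)
```

Averaging more replicates reduces the contribution of variance to the error of the mean but leaves the bias unchanged, which is why bias has to be addressed by calibration rather than repetition.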
visualization:
A method of computing by which the enormous bandwidth and
processing power of the human visual (eye-brain) system becomes an integral
part of extracting knowledge from complex data. It utilizes graphics and
imaging techniques as well as knowledge of both data management and the human
visual system. [Lloyd Trenish, Visualization for Deep Thunder, IBM
Research, 2002] http://www.research.ibm.com/weather/vis/w_vis.htm
Use of computer-generated graphics to make information more accessible and interactive. Related term: data mining. Narrower terms: data visualization, information visualization; Algorithms & data analysis glossary dendrogram, heat map, profile chart
visualisation:
As
the quantity of data produced by simulations grows, so does the difficulty of
extracting useful information. It is now clear that in many applications visual
methods are the only practical way of extracting information from the data.
Computer graphics and scientific visualisation techniques have become more
important in the last few years with the increased availability of computing
resource and of visualisation tools. Visualisation is becoming one of the
key tools for problem solving both in traditional areas such as visualisation of
complex flow and in new applications areas like the planning of surgical
operations using 3-D reconstruction of anatomical sites using diagnostic images
or the development of highly-realistic aeroplane simulators for pilot
training. DIRECT Development of an Interdisciplinary Roundtable for
Emerging Computer Technologies, Edinburgh University, Scotland http://www.epcc.ed.ac.uk/DIRECT/vect.html
Definitions and
Rationale for Visualisation, D. Scott
Brown, SIGGRAPH, 1999 http://www.siggraph.org/education/materials/HyperVis/visgoals/visgoal2.htm
W3C World Wide Web Consortium: Develops
interoperable technologies (specifications, guidelines, software, and tools) to
lead the Web to its full potential. W3C is a forum for information, commerce,
communication, and collective understanding. http://www.w3.org/
web:
The genome community was an early adopter of the Web, finding in it a way to publish
its vast accumulation of data, and to express the rich interconnectedness of biological information. The Web is the home of primary data, of
genome maps, of expression data, of DNA and
protein sequences, of X-ray crystallographic structures, and of the genome project's huge outpouring of publications. ... However the Web is much more than a static repository of information. The Web is increasingly being used as a front end for sophisticated analytic software. Sequence similarity search engines, protein structural motif finders, exon identifiers, and even mapping programs have all been integrated into the Web. Java applets are adding rapidly to Web browsers' capabilities, enabling pages to be far more interactive than the original click-fetch-click interface. [Lincoln D. Stein "Introduction to Human Genome Computing via the World Wide Web", Cold Spring Harbor Lab, 1998]
Related terms: fractal nature of the web, weblike. Narrower terms: semantic web, web portals, web services
web harvesting: A Web site is usually viewed as a collection of individual pages interconnected by simple URL links. This is the common
basis for Web harvesting engines, where these pages are harvested, indexed, and the search results made available to
end-users. As Web sites become increasingly large and sophisticated, it is worthwhile to see how prevalent simple linking is, or
if other Web page navigation techniques are replacing the simple linking model.
[Web Characterization Project, OCLC, 2001] http://wcp.oclc.org/pubs/rn2-navigation.html
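The simple linking model described above, pages connected by URL links that a harvester follows, can be sketched with the Python standard library. This example only extracts links from an HTML string; it does not fetch pages, index them, or respect robots rules. The class and function names and the example.com and example.org URLs are assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def harvest_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Invented page content.
page = ('<p><a href="/pubs/a.html">A</a> '
        '<a href="http://example.org/b">B</a> '
        '<a name="x">anchor without a link</a></p>')
print(harvest_links(page, "http://example.com/index.html"))
```

Navigation built with scripts, forms, or image maps would be invisible to a collector like this, which is the concern the quoted passage raises about the simple linking model.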
Google = about 536 July 19, 2002;
about 3,000 Oct. 22, 2004
weblogs:
Wikipedia http://en.wikipedia.org/wiki/Weblogs
A
history and a perspective http://www.rebeccablood.net/essays/weblog_history.html
Bob's
Weblog Backgrounder,
Bob Stepno http://radio.weblogs.com/0106327/stories/2002/12/14/bobsWeblogBackgrounder.html
Related
terms: blog, blogging, blogosphere, microcontent, nanopublishing
web ontology language: Requirements for a Web Ontology Language, working draft http://www.w3.org/TR/2002/WD-webont-req-20020307/
Google = about 736 July 19, 2002;
about 19,600 Oct. 22, 2004; about 326,000 Nov 17, 2006
web portals: 2.1 Web
Portals, W3C, Requirements for a web ontology
language, work in progress http://www.w3.org/TR/webont-req/#usecase-portal
Google = about 74,600 ("web portal" about
738,000) July 19, 2002
Web search
glossary, Google http://www.google.com/support/bin/answer.py?answer=50187
60 definitions
web service interoperability: Web services
technology has the promise to provide a new level of interoperability between
software applications. It should be no wonder then that there is a rush by
platform providers, software developers, and utility providers to enable their
software with SOAP, WSDL, and UDDI capabilities. http://www-106.ibm.com/developerworks/webservices/library/ws-inter.html
Google = "web service
interoperability" about 412 "web services interoperability"
about 9,620 July 19, 2002; about 283,000 Nov 17, 2006
web services: The goal of the Web Services Activity
is to develop a set of technologies in order to bring Web services to their full
potential. W3C "Web Services Activity", 2002 http://www.w3.org/2002/ws/
Google = about 2,110,000 July 19, 2002;
about 122,000,000 Nov 17, 2006
Web services
glossary, W3C, http://www.w3.org/TR/ws-gloss/
webizing: "Webizing Existing
Systems" Tim Berners-Lee, last updated 2001 http://www.w3.org/DesignIssues/Webize
weblike:
[Tim Berners-Lee, Ralph Swick, Semantic web, Amsterdam, 2000 May 16] http://www.w3.org/2000/Talks/0516-sweb-tbl/slide3-1.html
Tim Berners-Lee writes in Weaving the Web, his account of coming up with the idea of the web, about "learning to think in a weblike way". I don't know that I can claim to approach this yet, but the more that I write and research this glossary on and for the web, the more insight I'm getting into what he might mean. Metaphors like "shooting at a moving target" and Wayne Gretzky's "skating to where the puck is going to be" are helpful images.
Google = about 3,020
July 19, 2002; about 5,510 Oct. 22, 2004; about 75,700 Nov 17, 2006
"web like" about 788,000,000 Nov 17, 2006
Wiki
collaborative software:
Allows users to post and edit content remotely. An
exciting (and free) way to build and manage content. Wiki Web sites allow
all users to add and edit content. While it might sound like a free-for-all, the
authors suggest such Web sites have been used successfully in research,
business, and education to document project designs, for brainstorming, and for
otherwise creating content in a collaborative fashion. Bo Leuf, Ward
Cunningham, The
Wiki Way: Collaboration and sharing on the internet, 2001
wild
cards and Google
http://www.google.com/support/bin/answer.py?answer=3178&ctx=sibling
Yes you can.
XML: Computers
& computing glossary
Bibliography
Barnes, Ken et al., Microsoft Lexicon or Microspeak made easier, 1995-1998, 150+ terms. http://www.cinepad.com/mslex.htm
FOLDOC Free On-line Dictionary of Computing, Denis Howe, 2007.
14,400+ terms. http://foldoc.org/
Glossary of Ontology Terms, Stanford Univ., 2001, 24 terms.
http://www-ksl-svc.stanford.edu:5915/doc/frame-editor/glossary-of-terms.html
Information Resource Management
Glossary, Government of British Columbia, Canada, 2001 http://www.cio.gov.bc.ca/other/daf/IRM_Glossary.htm
Lycos Tech
Glossary 2002 http://webopedia.lycos.com/
Schneider, Tom and
Karen Lewis, Glossary for Molecular Information Theory and the Delila System, Lab of Computational and Experimental Biology, NCI
Frederick, US, 2004. 100+ definitions. http://www.lecb.ncifcrf.gov/~toms/glossary.html
W3C Glossary and
Dictionary http://www.w3.org/2003/glossary/
Web search glossary, Google http://www.google.com/support/bin/answer.py?answer=50187
60 definitions
Web services
glossary, W3C, http://www.w3.org/TR/ws-gloss/
Webopedia http://www.webopedia.com/
whatis.com Information Technology encyclopedia. About 3,000+ definitions.
http://whatis.techtarget.com/
XML
Glossary http://www.softwareag.com/xml/about/glossary.htm
IUPAC definitions are reprinted with the permission of
the International Union of Pure and Applied Chemistry.