ITcon Vol. 5, pg. 25-36, http://www.itcon.org/2000/2

Use of Keyphrase Extraction Software for Creation of an AEC/FM Thesaurus

submitted: November 1999
revised: February 2000
published: February 2000
editor(s): B-C. Bjoerk
authors: Branka Kosovac, PhD candidate,
University of British Columbia;
email: branka@civil.ubc.ca

Dana J. Vanier, PhD,
National Research Council Canada;
email: Dana.Vanier@nrc.ca

Thomas M. Froese, PhD,
University of British Columbia;
email: tfroese@civil.ubc.ca
summary: The paper describes a method for collecting the terms needed to develop a thesaurus for the roofing domain. This work is part of a larger effort to investigate the potential of thesauri as an aid in product modeling and as a tool for information management in model-based systems. Extractor, a software module that extracts keyphrases from documents, was used to collect candidate thesaurus terms from Internet sources. The principal advantage of the Internet as a source of candidate terms is that it reflects the language actually used in communications concerning buildings and covers the widest range of views on the domain. The advantage of using Extractor or similar software is that it makes it possible to process the huge text corpora available on the Internet while eliminating irrelevant terms. The methodology was found to be highly useful, but it was not sufficient by itself for constructing a thesaurus for the architecture, engineering, construction and facilities management (AEC/FM) industries, because considerable human intervention was required. Some possibilities for customizing the software and for partially automating the thesaurus construction process are suggested.
keywords: thesauri, Internet, automatic indexing software, thesaurus construction
citation: Kosovac B, Vanier DJ and Froese TM (2000). Use of Keyphrase Extraction Software for Creation of an AEC/FM Thesaurus, ITcon Vol. 5, pg. 25-36, http://www.itcon.org/2000/2