Advisor
Co-Advisor
Abstract
The enormous growth of the worldwide flood of information makes it increasingly important to use effective tools to extract and condense key information. Research on this is ongoing in the field of Natural Language Processing (NLP). Information Extraction (IE) is a subfield of NLP that extracts information from text to fill a database. However, IE has limitations: IE systems need to be specialised for a specific domain and can therefore only handle text from that domain. IE systems consist of several components, and one important component may be composed of terminologies, ontologies, and vocabularies.
The UMLS combines a huge variety of source vocabularies, terminologies, and ontologies into the SPECIALIST lexicon, the Metathesaurus, and the Semantic Network. The UMLS is a gigantic knowledge base that covers numerous topics in medicine. Due to the large size of the UMLS, it is difficult to extract information from it. Matching concepts to phrases is not an easy task either. With the help of the MetaMap Transfer (MMTx) tool, the matching problem can be delegated. To break down the complex data structure of UMLS and MMTx, a simpler and more easily accessible data structure was introduced, which is part of the UMLSint package. The UMLSint package was developed to simplify access to the UMLS data, to extract the attributes of interest, and to analyse the input data in order to find the corresponding concepts in the knowledge base. The UMLSint package takes a sentence of medical text as input and returns the attributes of interest from the UMLS for the phrase in question. The information consists of factual knowledge from the Metathesaurus and information generated by MMTx. MMTx is used to create logical elements and to gather information about the lexical and morphological structure. For each logical element, various information is then accessible, such as semantic type, term type, Part-Of-Speech tag, Metathesaurus concept ID, and many more. This information can be used by both NLP and IE systems for further analysis of the text. The subject of this thesis is to give IE systems that process medical text easier access to the knowledge base named Unified Medical Language System (UMLS).
Keywords
Natural Language Processing
Year of Publication
2007
Secondary Title
Institute of Visual Computing and Human-Centered Technology
Paper
Number of Pages
88
reposiTUm Handle
20.500.12708/14604
Publisher
TU Wien
Place Published
Vienna
M. Kohler, “UMLS for information extraction”, Institute of Visual Computing and Human-Centered Technology. TU Wien, Vienna, p. 88, 2007.
Master Thesis
AC05034616
Advisor
Co-Advisor
Abstract
Medical information is often stored in narrative form, which makes automated processing a difficult and time-consuming task.
Authors of medical documents do not write with further processing by automated systems in mind, so the information stored in medical writings is not directly usable for processing by computers. For this reason, efforts have been made to transfer these narrative documents into a format that computers can process more easily. This also applies to clinical practice guidelines (CPGs). Like many medical documents, CPGs are written in narrative language, without regard to computer-assisted processing. For the implementation of CPGs in medical facilities, automated processing is therefore desirable. Importantly, much of the information in CPGs is provided in negated form, expressing that certain circumstances in patients or treatments are not present, existing, or applicable. Although negated, this information is nevertheless very useful, since it can express the absence of certain conditions or diseases in patients. Moreover, negations can describe which treatment options should not be taken into account for a given patient, helping a practising physician or nurse in the decision process for selecting a proper treatment. Thus, proper Negation Detection in CPGs is an important task for the automated processing of this type of medical document. It helps to accelerate the decision-making process and can support medical staff in their care for patients. We developed algorithms capable of Negation Detection in CPGs. We use syntactical methods provided by the English language to achieve precise detection of occurring negations. According to our results, we are convinced that the use of syntactical methods can improve Negation Detection, not only in medical writings but also in arbitrary narrative texts.
Year of Publication
2008
Secondary Title
Institute of Visual Computing and Human-Centered Technology
Paper
Number of Pages
55
reposiTUm Handle
20.500.12708/10716
Publisher
TU Wien
Place Published
Vienna
URL
https://web.archive.org/web/20210417125958/http://ieg.ifs.tuwien.ac.at/projects/neghunter/
S. Gindl, “Negation detection in medical documents using syntactical methods”, Institute of Visual Computing and Human-Centered Technology. TU Wien, Vienna, p. 55, 2008.
Master Thesis
AC05037174