iUMLS – A Program to Enlarge and Modify Output of MetaMap


The aim of this project was the development of a program

  • to map UMLS concepts [2] to the text of clinical practice guidelines by means of the MMTx program [1],
  • to provide methods for modifying these mappings on both a syntactic and a semantic level,
  • to develop algorithms that improve the (often ambiguous) mappings of MMTx,
  • to provide the documents, together with all the information gathered from MMTx, in a reusable form for further processing.

The MMTx program [1] tokenizes free text of a medical document into phrases and maps UMLS concepts [2] to the medical concepts identified within these phrases. Due to the complexity of this task, it is still necessary to correct parts of the tokenization as well as the mapping of UMLS concepts.

The iUMLS program first processes a text or a document with the MMTx program. It then provides the logic for modifying the output on both a syntactic and a semantic level. Syntactic modification means correcting the tokenization. Semantic modification means correcting a UMLS concept mapping or, when MMTx proposes several candidate concepts, choosing the appropriate one among them.
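The semantic modification described above can be illustrated with a minimal Java sketch. It is not the iUMLS implementation: the classes ConceptCandidate and MappedChunk, their methods, and the placeholder identifiers are assumptions made for this example only, and the ambiguous word "cold" is an illustrative case rather than one taken from the guideline documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical data model for illustration; these names are not taken from iUMLS or MMTx.
final class ConceptCandidate {
    final String cui;          // concept identifier (placeholder values are used below)
    final String name;
    final String semanticType;

    ConceptCandidate(String cui, String name, String semanticType) {
        this.cui = cui;
        this.name = name;
        this.semanticType = semanticType;
    }
}

final class MappedChunk {
    final String text;
    private final List<ConceptCandidate> candidates;
    private ConceptCandidate chosen; // null until a concept has been selected

    MappedChunk(String text, List<ConceptCandidate> candidates) {
        this.text = text;
        this.candidates = new ArrayList<>(candidates);
        // A single candidate needs no manual decision.
        if (this.candidates.size() == 1) {
            this.chosen = this.candidates.get(0);
        }
    }

    boolean isAmbiguous() {
        return chosen == null && candidates.size() > 1;
    }

    // Semantic modification, case 1: pick one of the proposed candidates.
    void choose(int candidateIndex) {
        chosen = candidates.get(candidateIndex);
    }

    // Semantic modification, case 2: override the mapping with a corrected concept.
    void correct(ConceptCandidate replacement) {
        chosen = replacement;
    }

    ConceptCandidate chosen() {
        return chosen;
    }
}

final class CandidateSelectionDemo {
    public static void main(String[] args) {
        MappedChunk chunk = new MappedChunk("cold", Arrays.asList(
                new ConceptCandidate("CUI-PLACEHOLDER-1", "Common Cold", "Disease or Syndrome"),
                new ConceptCandidate("CUI-PLACEHOLDER-2", "Cold Temperature", "Natural Phenomenon or Process")));
        System.out.println(chunk.text + " ambiguous? " + chunk.isAmbiguous());
        chunk.choose(0); // resolve the ambiguity by hand
        System.out.println("chosen: " + chunk.chosen().name);
    }
}
```

Keeping the full candidate list next to the chosen concept means that a later correction can still fall back on the other proposals.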

Because manual correction and modification is still a laborious task, we developed algorithms that improve the mapping of UMLS concepts and their semantic types to text chunks. These algorithms are tailored to clinical practice guideline documents.

The iUMLS program can also save the documents together with the information assigned to them (received from MMTx or another program).

The iUMLS program is command-line based. To provide a graphical interface for it, we developed the MapFace Editor. The iUMLS program is an easily extendable Java application.

An arbitrary sentence of a guideline could be:

The MMTx program tokenizes the sentence into phrase chunks and maps the text to medical concepts available in the UMLS Metathesaurus:

Corrections that need to be made (by means of iUMLS):

  • Splitting the phrase chunk "with mild asthma inhaled steroids".
  • Choosing the correct UMLS concept in case of an ambiguous mapping.
  • Merging the concept chunks "five" and "years" into a single chunk and assigning a UMLS concept with the text "age".
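The two syntactic corrections in this list, splitting and merging chunks, can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the iUMLS code: the class ChunkEdits and its methods are hypothetical, chunks are reduced to plain strings, and the split position (after "with mild asthma") is assumed for the example. Assigning a concept to the merged chunk is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of iUMLS or MMTx.
final class ChunkEdits {

    // Splits the chunk at `index` into two chunks after its first `wordCount` words.
    static List<String> splitAfterWord(List<String> chunks, int index, int wordCount) {
        String[] words = chunks.get(index).trim().split("\\s+");
        if (wordCount <= 0 || wordCount >= words.length) {
            throw new IllegalArgumentException("split position must lie inside the chunk");
        }
        String left = String.join(" ", Arrays.copyOfRange(words, 0, wordCount));
        String right = String.join(" ", Arrays.copyOfRange(words, wordCount, words.length));
        List<String> result = new ArrayList<>(chunks); // leave the input list untouched
        result.set(index, left);
        result.add(index + 1, right);
        return result;
    }

    // Merges the chunk at `index` with the chunk that follows it.
    static List<String> mergeWithNext(List<String> chunks, int index) {
        if (index < 0 || index + 1 >= chunks.size()) {
            throw new IllegalArgumentException("no following chunk to merge with");
        }
        List<String> result = new ArrayList<>(chunks);
        String next = result.remove(index + 1);
        result.set(index, result.get(index) + " " + next);
        return result;
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("with mild asthma inhaled steroids", "five", "years");
        // Assumed split position: after the third word.
        List<String> split = splitAfterWord(chunks, 0, 3);
        // "five" and "years" are now at positions 2 and 3.
        List<String> merged = mergeWithNext(split, 2);
        System.out.println(merged);
    }
}
```

Both operations return a new list instead of changing the input, so the original MMTx tokenization stays available for comparison.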



References

[1] Alan R. Aronson (2001): Effective Mapping of Biomedical Text to the UMLS Metathesaurus: The MetaMap Program. In Proc. of the Annual AMIA Symposium 2001, 17-21.
[2] Donald A. Lindberg, Betsy L. Humphreys, Alexa T. McCray (1993): The Unified Medical Language System. Methods of Information in Medicine, 32(4):281-291.



This work has been supported by “Fonds zur Förderung der wissenschaftlichen Forschung - FWF” (Austrian Science Fund), grant L290-N04.