Ontology-driven information extraction
Thesis
Author | |
Advisor | |
Reviewer | |
Abstract |
Since Berners-Lee proposed and started to endorse ontologies as the backbone of the Semantic Web in the nineties, a whole research field has evolved around the fundamental engineering aspects of ontologies, such as their generation, evaluation and management. However, many researchers were curious about the usability of ontologies within Information Systems in 'ordinary' settings, performing 'ordinary' information processing tasks. For use within Information Extraction Systems (IESs), we consider ontologies a knowledge source that can represent the task specification and parts of the domain knowledge in a formal and unambiguous way. In general, IESs use several knowledge sources (e.g., lexicons and parsers) to achieve good performance. Some also require humans to generate the extraction rules that represent the task specification of the system. However, these resources are often not at hand, or the dependency on them leads to compromises regarding the scalability and performance of an IES. Therefore, we wondered whether ontologies formulated in a standard ontology representation language, such as OWL, are suitable to represent the task specification and, to some extent, the domain knowledge, so that the IES can utilise them as its only knowledge resource. Our aim is to identify the limits of such an approach, so that we can conclude that results can only improve from that point onwards when other resources are used whenever available. In this thesis, we propose an extraction method that utilises the content and predefined semantics of an ontology to perform the extraction task without any human intervention or dependency on other knowledge resources. We also analyse the requirements on ontologies when used in IESs and propose the usage of additional semantic knowledge to reconcile them. Further, we propose a method to detect out-of-date constructs in the ontology in order to suggest changes to the user of the IES.
We report the results of our experiments, which we conducted using an ontology from the domain of digital cameras and a document set of digital camera reviews. After performing the experiments with a different task specification using a larger ontology, we conclude that the use of ontologies in conjunction with IESs can indeed yield feasible results and contribute to better scalability and portability of the system. |
Keywords | |
Year of Publication |
2007
|
Academic Department |
Institute of Visual Computing and Human-Centered Technology
|
Number of Pages |
109
|
University |
TU Wien
|
City |
Vienna
|